// © 2016 and later: Unicode, Inc. and others.
// License & terms of use: http://www.unicode.org/copyright.html
/*
***************************************************************************
* Copyright (C) 2008-2016, International Business Machines Corporation
* and others. All Rights Reserved.
***************************************************************************
*   file name:  uspoof.h
*   encoding:   UTF-8
*   tab size:   8 (not used)
*   indentation:4
*
*   created on: 2008Feb13
*   created by: Andy Heninger
*
*   Unicode Spoof Detection
*/

#ifndef USPOOF_H
#define USPOOF_H

#include "unicode/ubidi.h"
#include "unicode/utypes.h"
#include "unicode/uset.h"
#include "unicode/parseerr.h"

#if !UCONFIG_NO_NORMALIZATION

#if U_SHOW_CPLUSPLUS_API
#include "unicode/localpointer.h"
#include "unicode/unistr.h"
#include "unicode/uniset.h"
#endif


/**
 * \file
 * \brief C API: Unicode Security and Spoofing Detection
 *
 * <p>
 * This class, based on <a href="http://unicode.org/reports/tr36">Unicode Technical Report #36</a> and
 * <a href="http://unicode.org/reports/tr39">Unicode Technical Standard #39</a>, has two main functions:
 *
 * <ol>
 * <li>Checking whether two strings are visually <em>confusable</em> with each other, such as "Harvest" and
 * "Ηarvest", where the second string starts with the Greek capital letter Eta.</li>
 * <li>Checking whether an individual string is likely to be an attempt at confusing the reader (<em>spoof
 * detection</em>), such as "paypal" with some Latin characters substituted with Cyrillic look-alikes.</li>
 * </ol>
 *
 * <p>
 * Although originally designed as a method for flagging suspicious identifier strings such as URLs,
 * <code>USpoofChecker</code> has a number of other practical use cases, such as preventing attempts to evade bad-word
 * content filters.
 *
 * <p>
 * The functions of this class are exposed as C API, with a handful of syntactical conveniences for C++.
 *
 * <h2>Confusables</h2>
 *
 * <p>
 * The following example shows how to use <code>USpoofChecker</code> to check for confusability between two strings:
 *
 * \code{.c}
 * UErrorCode status = U_ZERO_ERROR;
 * UChar* str1 = (UChar*) u"Harvest";
 * UChar* str2 = (UChar*) u"\u0397arvest";  // with U+0397 GREEK CAPITAL LETTER ETA
 *
 * USpoofChecker* sc = uspoof_open(&status);
 * uspoof_setChecks(sc, USPOOF_CONFUSABLE, &status);
 *
 * int32_t bitmask = uspoof_areConfusable(sc, str1, -1, str2, -1, &status);
 * UBool result = bitmask != 0;
 * // areConfusable: 1 (status: U_ZERO_ERROR)
 * printf("areConfusable: %d (status: %s)\n", result, u_errorName(status));
 * uspoof_close(sc);
 * \endcode
 *
 * <p>
 * The call to {@link uspoof_open} creates a <code>USpoofChecker</code> object; the call to {@link uspoof_setChecks}
 * enables confusable checking and disables all other checks; the call to {@link uspoof_areConfusable} performs the
 * confusability test; and the following line extracts the result out of the return value. For best performance,
 * the instance should be created once (e.g., upon application startup), and the efficient
 * {@link uspoof_areConfusable} method can be used at runtime.
 *
 * If the paragraph direction used to display the strings is known, the bidi function should be used instead:
 *
 * \code{.c}
 * UErrorCode status = U_ZERO_ERROR;
 * // These strings look identical when rendered in a left-to-right context.
 * // They look distinct in a right-to-left context.
 * UChar* str1 = (UChar*) u"A1\u05D0";  // A1א
 * UChar* str2 = (UChar*) u"A\u05D01";  // Aא1
 *
 * USpoofChecker* sc = uspoof_open(&status);
 * uspoof_setChecks(sc, USPOOF_CONFUSABLE, &status);
 *
 * int32_t bitmask = uspoof_areBidiConfusable(sc, UBIDI_LTR, str1, -1, str2, -1, &status);
 * UBool result = bitmask != 0;
 * // areBidiConfusable: 1 (status: U_ZERO_ERROR)
 * printf("areBidiConfusable: %d (status: %s)\n", result, u_errorName(status));
 * uspoof_close(sc);
 * \endcode
 *
 * <p>
 * The type {@link LocalUSpoofCheckerPointer} is exposed for C++ programmers.  It will automatically call
 * {@link uspoof_close} when the object goes out of scope:
 *
 * \code{.cpp}
 * UErrorCode status = U_ZERO_ERROR;
 * LocalUSpoofCheckerPointer sc(uspoof_open(&status));
 * uspoof_setChecks(sc.getAlias(), USPOOF_CONFUSABLE, &status);
 * // ...
 * \endcode
 *
 * UTS 39 defines two strings to be <em>confusable</em> if they map to the same <em>skeleton string</em>. A skeleton can
 * be thought of as a "hash code". {@link uspoof_getSkeleton} computes the skeleton for a particular string, so
 * the following snippet is equivalent to the example above:
 *
 * \code{.c}
 * UErrorCode status = U_ZERO_ERROR;
 * UChar* str1 = (UChar*) u"Harvest";
 * UChar* str2 = (UChar*) u"\u0397arvest";  // with U+0397 GREEK CAPITAL LETTER ETA
 *
 * USpoofChecker* sc = uspoof_open(&status);
 * uspoof_setChecks(sc, USPOOF_CONFUSABLE, &status);
 *
 * // Get skeleton 1
 * int32_t skel1Len = uspoof_getSkeleton(sc, 0, str1, -1, NULL, 0, &status);
 * UChar* skel1 = (UChar*) malloc(++skel1Len * sizeof(UChar));
 * status = U_ZERO_ERROR;
 * uspoof_getSkeleton(sc, 0, str1, -1, skel1, skel1Len, &status);
 *
 * // Get skeleton 2
 * int32_t skel2Len = uspoof_getSkeleton(sc, 0, str2, -1, NULL, 0, &status);
 * UChar* skel2 = (UChar*) malloc(++skel2Len * sizeof(UChar));
 * status = U_ZERO_ERROR;
 * uspoof_getSkeleton(sc, 0, str2, -1, skel2, skel2Len, &status);
 *
 * // Are the skeletons the same?
 * UBool result = u_strcmp(skel1, skel2) == 0;
 * // areConfusable: 1 (status: U_ZERO_ERROR)
 * printf("areConfusable: %d (status: %s)\n", result, u_errorName(status));
 * uspoof_close(sc);
 * free(skel1);
 * free(skel2);
 * \endcode
 *
 * If you need to check if a string is confusable with any string in a dictionary of many strings, rather than calling
 * {@link uspoof_areConfusable} many times in a loop, {@link uspoof_getSkeleton} can be used instead, as shown below:
 *
 * \code{.c}
 * UErrorCode status = U_ZERO_ERROR;
 * #define DICTIONARY_LENGTH 2
 * UChar* dictionary[DICTIONARY_LENGTH] = { (UChar*) u"lorem", (UChar*) u"ipsum" };
 * UChar* skeletons[DICTIONARY_LENGTH];
 * UChar* str = (UChar*) u"1orern";
 *
 * // Setup:
 * USpoofChecker* sc = uspoof_open(&status);
 * uspoof_setChecks(sc, USPOOF_CONFUSABLE, &status);
 * for (size_t i=0; i<DICTIONARY_LENGTH; i++) {
 *     UChar* word = dictionary[i];
 *     int32_t len = uspoof_getSkeleton(sc, 0, word, -1, NULL, 0, &status);
 *     skeletons[i] = (UChar*) malloc(++len * sizeof(UChar));
 *     status = U_ZERO_ERROR;
 *     uspoof_getSkeleton(sc, 0, word, -1, skeletons[i], len, &status);
 * }
 *
 * // Live Check:
 * {
 *     int32_t len = uspoof_getSkeleton(sc, 0, str, -1, NULL, 0, &status);
 *     UChar* skel = (UChar*) malloc(++len * sizeof(UChar));
 *     status = U_ZERO_ERROR;
 *     uspoof_getSkeleton(sc, 0, str, -1, skel, len, &status);
 *     UBool result = false;
 *     for (size_t i=0; i<DICTIONARY_LENGTH; i++) {
 *         result = u_strcmp(skel, skeletons[i]) == 0;
 *         if (result == true) { break; }
 *     }
 *     // Has confusable in dictionary: 1 (status: U_ZERO_ERROR)
 *     printf("Has confusable in dictionary: %d (status: %s)\n", result, u_errorName(status));
 *     free(skel);
 * }
 *
 * for (size_t i=0; i<DICTIONARY_LENGTH; i++) {
 *     free(skeletons[i]);
 * }
 * uspoof_close(sc);
 * \endcode
 *
 * <b>Note:</b> Since the Unicode confusables mapping table is frequently updated, confusable skeletons are <em>not</em>
 * guaranteed to be the same between ICU releases. We therefore recommend that you always compute confusable skeletons
 * at runtime and do not rely on creating a permanent, or difficult to update, database of skeletons.
 *
 * <h2>Spoof Detection</h2>
 *
 * The following snippet shows a minimal example of using <code>USpoofChecker</code> to perform spoof detection on a
 * string:
 *
 * \code{.c}
 * UErrorCode status = U_ZERO_ERROR;
 * UChar* str = (UChar*) u"p\u0430ypal";  // with U+0430 CYRILLIC SMALL LETTER A
 *
 * // Get the default set of allowable characters:
 * USet* allowed = uset_openEmpty();
 * uset_addAll(allowed, uspoof_getRecommendedSet(&status));
 * uset_addAll(allowed, uspoof_getInclusionSet(&status));
 *
 * USpoofChecker* sc = uspoof_open(&status);
 * uspoof_setAllowedChars(sc, allowed, &status);
 * uspoof_setRestrictionLevel(sc, USPOOF_MODERATELY_RESTRICTIVE);
 *
 * int32_t bitmask = uspoof_check(sc, str, -1, NULL, &status);
 * UBool result = bitmask != 0;
 * // fails checks: 1 (status: U_ZERO_ERROR)
 * printf("fails checks: %d (status: %s)\n", result, u_errorName(status));
 * uspoof_close(sc);
 * uset_close(allowed);
 * \endcode
 *
 * As in the case for confusability checking, it is good practice to create one <code>USpoofChecker</code> instance at
 * startup, and call the cheaper {@link uspoof_check} online. We specify the set of
 * allowed characters to be those with type RECOMMENDED or INCLUSION, according to the recommendation in UTS 39.
 *
 * In addition to {@link uspoof_check}, the function {@link uspoof_checkUTF8} is exposed for UTF8-encoded char* strings,
 * and {@link uspoof_checkUnicodeString} is exposed for C++ programmers.
 *
 * If the {@link USPOOF_AUX_INFO} check is enabled, a limited amount of information on why a string failed the checks
 * is available in the returned bitmask.  For complete information, use the {@link uspoof_check2} class of functions
 * with a {@link USpoofCheckResult} parameter:
 *
 * \code{.c}
 * UErrorCode status = U_ZERO_ERROR;
 * UChar* str = (UChar*) u"p\u0430ypal";  // with U+0430 CYRILLIC SMALL LETTER A
 *
 * // Get the default set of allowable characters:
 * USet* allowed = uset_openEmpty();
 * uset_addAll(allowed, uspoof_getRecommendedSet(&status));
 * uset_addAll(allowed, uspoof_getInclusionSet(&status));
 *
 * USpoofChecker* sc = uspoof_open(&status);
 * uspoof_setAllowedChars(sc, allowed, &status);
 * uspoof_setRestrictionLevel(sc, USPOOF_MODERATELY_RESTRICTIVE);
 *
 * USpoofCheckResult* checkResult = uspoof_openCheckResult(&status);
 * int32_t bitmask = uspoof_check2(sc, str, -1, checkResult, &status);
 *
 * int32_t failures1 = bitmask;
 * int32_t failures2 = uspoof_getCheckResultChecks(checkResult, &status);
 * assert(failures1 == failures2);
 * // checks that failed: 0x00000010 (status: U_ZERO_ERROR)
 * printf("checks that failed: %#010x (status: %s)\n", failures1, u_errorName(status));
 *
 * // Cleanup:
 * uspoof_close(sc);
 * uset_close(allowed);
 * uspoof_closeCheckResult(checkResult);
 * \endcode
 *
 * C++ users can take advantage of a few syntactical conveniences.  The following snippet is functionally
 * equivalent to the one above:
 *
 * \code{.cpp}
 * UErrorCode status = U_ZERO_ERROR;
 * UnicodeString str((UChar*) u"p\u0430ypal");  // with U+0430 CYRILLIC SMALL LETTER A
 *
 * // Get the default set of allowable characters:
 * UnicodeSet allowed;
 * allowed.addAll(*uspoof_getRecommendedUnicodeSet(&status));
 * allowed.addAll(*uspoof_getInclusionUnicodeSet(&status));
 *
 * LocalUSpoofCheckerPointer sc(uspoof_open(&status));
 * uspoof_setAllowedChars(sc.getAlias(), allowed.toUSet(), &status);
 * uspoof_setRestrictionLevel(sc.getAlias(), USPOOF_MODERATELY_RESTRICTIVE);
 *
 * LocalUSpoofCheckResultPointer checkResult(uspoof_openCheckResult(&status));
 * int32_t bitmask = uspoof_check2UnicodeString(sc.getAlias(), str, checkResult.getAlias(), &status);
 *
 * int32_t failures1 = bitmask;
 * int32_t failures2 = uspoof_getCheckResultChecks(checkResult.getAlias(), &status);
 * assert(failures1 == failures2);
 * // checks that failed: 0x00000010 (status: U_ZERO_ERROR)
 * printf("checks that failed: %#010x (status: %s)\n", failures1, u_errorName(status));
 *
 * // Explicit cleanup not necessary.
 * \endcode
 *
 * The return value is a bitmask of the checks that failed. In this case, there was one check that failed:
 * {@link USPOOF_RESTRICTION_LEVEL}, corresponding to the fifth bit (16). The possible checks are:
 *
 * <ul>
 * <li><code>RESTRICTION_LEVEL</code>: flags strings that violate the
 * <a href="http://unicode.org/reports/tr39/#Restriction_Level_Detection">Restriction Level</a> test as specified in UTS
 * 39; in most cases, this means flagging strings that contain characters from multiple different scripts.</li>
 * <li><code>INVISIBLE</code>: flags strings that contain invisible characters, such as zero-width spaces, or character
 * sequences that are likely not to display, such as multiple occurrences of the same non-spacing mark.</li>
 * <li><code>CHAR_LIMIT</code>: flags strings that contain characters outside of a specified set of acceptable
 * characters. See {@link uspoof_setAllowedChars} and {@link uspoof_setAllowedLocales}.</li>
 * <li><code>MIXED_NUMBERS</code>: flags strings that contain digits from multiple different numbering systems.</li>
 * </ul>
 *
 * <p>
 * These checks can be enabled independently of each other. For example, if you were interested in checking for only the
 * INVISIBLE and MIXED_NUMBERS conditions, you could do:
 *
 * \code{.c}
 * UErrorCode status = U_ZERO_ERROR;
 * UChar* str = (UChar*) u"8\u09EA";  // 8 mixed with U+09EA BENGALI DIGIT FOUR
 *
 * USpoofChecker* sc = uspoof_open(&status);
 * uspoof_setChecks(sc, USPOOF_INVISIBLE | USPOOF_MIXED_NUMBERS, &status);
 *
 * int32_t bitmask = uspoof_check2(sc, str, -1, NULL, &status);
 * UBool result = bitmask != 0;
 * // fails checks: 1 (status: U_ZERO_ERROR)
 * printf("fails checks: %d (status: %s)\n", result, u_errorName(status));
 * uspoof_close(sc);
 * \endcode
 *
 * Here is an example in C++ showing how to compute the restriction level of a string:
 *
 * \code{.cpp}
 * UErrorCode status = U_ZERO_ERROR;
 * UnicodeString str((UChar*) u"p\u0430ypal");  // with U+0430 CYRILLIC SMALL LETTER A
 *
 * // Get the default set of allowable characters:
 * UnicodeSet allowed;
 * allowed.addAll(*uspoof_getRecommendedUnicodeSet(&status));
 * allowed.addAll(*uspoof_getInclusionUnicodeSet(&status));
 *
 * LocalUSpoofCheckerPointer sc(uspoof_open(&status));
 * uspoof_setAllowedChars(sc.getAlias(), allowed.toUSet(), &status);
 * uspoof_setRestrictionLevel(sc.getAlias(), USPOOF_MODERATELY_RESTRICTIVE);
 * uspoof_setChecks(sc.getAlias(), USPOOF_RESTRICTION_LEVEL | USPOOF_AUX_INFO, &status);
 *
 * LocalUSpoofCheckResultPointer checkResult(uspoof_openCheckResult(&status));
 * int32_t bitmask = uspoof_check2UnicodeString(sc.getAlias(), str, checkResult.getAlias(), &status);
 *
 * URestrictionLevel restrictionLevel = uspoof_getCheckResultRestrictionLevel(checkResult.getAlias(), &status);
 * // Since USPOOF_AUX_INFO was enabled, the restriction level is also available in the upper bits of the bitmask:
 * assert((restrictionLevel & bitmask) == restrictionLevel);
 * // Restriction level: 0x50000000 (status: U_ZERO_ERROR)
 * printf("Restriction level: %#010x (status: %s)\n", restrictionLevel, u_errorName(status));
 * \endcode
 *
 * The code '0x50000000' corresponds to the restriction level USPOOF_MINIMALLY_RESTRICTIVE.  Since
 * USPOOF_MINIMALLY_RESTRICTIVE is weaker than USPOOF_MODERATELY_RESTRICTIVE, the string fails the check.
 *
 * <b>Note:</b> The Restriction Level is the most powerful of the checks. The full logic is documented in
 * <a href="http://unicode.org/reports/tr39/#Restriction_Level_Detection">UTS 39</a>, but the basic idea is that strings
 * are restricted to contain characters from only a single script, <em>except</em> that most scripts are allowed to have
 * Latin characters interspersed. Although the default restriction level is <code>HIGHLY_RESTRICTIVE</code>, it is
 * recommended that users set their restriction level to <code>MODERATELY_RESTRICTIVE</code>, which allows Latin mixed
 * with all other scripts except Cyrillic, Greek, and Cherokee, with which it is often confusable. For more details on
 * the levels, see UTS 39 or {@link URestrictionLevel}. The Restriction Level test is aware of the set of
 * allowed characters set in {@link uspoof_setAllowedChars}. Note that characters which have script code
 * COMMON or INHERITED, such as numbers and punctuation, are ignored when computing whether a string has multiple
 * scripts.
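 *
 * The way the failed checks and the restriction level share one bitmask can be sketched in plain C. This is an
 * illustration only: the <code>SKETCH_*</code> macros and the three helper functions are hypothetical names, and
 * the numeric values are assumed to mirror the <code>USpoofChecks</code> and <code>URestrictionLevel</code>
 * enums. Real code should use the ICU constants rather than these copies.
 *
```c
#include <stdint.h>

// Assumed copies of ICU enum values, repeated only to keep the sketch self-contained.
#define SKETCH_ALL_CHECKS             0x0000FFFF  // USPOOF_ALL_CHECKS
#define SKETCH_RESTRICTION_LEVEL_MASK 0x7F000000  // USPOOF_RESTRICTION_LEVEL_MASK
#define SKETCH_MODERATELY_RESTRICTIVE 0x40000000  // USPOOF_MODERATELY_RESTRICTIVE

// The low bits of the bitmask identify the checks that failed.
int32_t failedChecks(int32_t bitmask) {
    return bitmask & SKETCH_ALL_CHECKS;
}

// With USPOOF_AUX_INFO enabled, the upper bits carry the computed restriction level.
int32_t restrictionLevelOf(int32_t bitmask) {
    return bitmask & SKETCH_RESTRICTION_LEVEL_MASK;
}

// Levels are ordered numerically: a larger value is a weaker (less restrictive) level.
int exceedsLevel(int32_t bitmask, int32_t configuredLevel) {
    return restrictionLevelOf(bitmask) > configuredLevel;
}
```
 *
 * For a combined bitmask of <code>0x50000010</code>, <code>failedChecks</code> yields <code>0x10</code>
 * (RESTRICTION_LEVEL) and <code>restrictionLevelOf</code> yields <code>0x50000000</code>, which is weaker than
 * the configured MODERATELY_RESTRICTIVE level.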
 *
 * <h2>Advanced bidirectional usage</h2>
 * If the paragraph direction with which the identifiers will be displayed is not known, there are
 * multiple options for confusable detection depending on the circumstances.
 *
 * <p>
 * In some circumstances, the only concern is confusion between identifiers displayed with the same
 * paragraph direction.
 *
 * <p>
 * An example is the case where identifiers are usernames prefixed with the @ symbol.
 * That symbol will appear to the left in a left-to-right context, and to the right in a
 * right-to-left context, so that an identifier displayed in a left-to-right context can never be
 * confused with an identifier displayed in a right-to-left context:
 * <ul>
 * <li>
 * The usernames "A1א" (A one aleph) and "Aא1" (A aleph 1)
 * would be considered confusable, since they both appear as \@A1א in a left-to-right context, and the
 * usernames "אA_1" (aleph A underscore one) and "א1_A" (aleph one underscore A) would be considered
 * confusable, since they both appear as A_1א@ in a right-to-left context.
 * </li>
 * <li>
 * The username "Mark_" would not be considered confusable with the username "_Mark",
 * even though the latter would appear as Mark_@ in a right-to-left context, and the
 * former as \@Mark_ in a left-to-right context.
 * </li>
 * </ul>
 * <p>
 * In that case, the caller should check for both LTR-confusability and RTL-confusability:
 *
 * \code{.cpp}
 * bool confusableInEitherDirection =
 *     uspoof_areBidiConfusableUnicodeString(sc, UBIDI_LTR, id1, id2, &status) ||
 *     uspoof_areBidiConfusableUnicodeString(sc, UBIDI_RTL, id1, id2, &status);
 * \endcode
 *
 * If the bidiSkeleton is used, the LTR and RTL skeleta should be kept separately and compared, LTR
 * with LTR and RTL with RTL.
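 *
 * A minimal sketch of that skeleton-based comparison, assuming <code>sc</code>, <code>id1</code>,
 * <code>id2</code> and <code>status</code> are set up as for the confusability check above, and assuming
 * the <code>uspoof_getBidiSkeletonUnicodeString</code> overload that writes into a destination
 * <code>icu::UnicodeString</code>. The four skeleton variable names are illustrative only:
 *
 * \code{.cpp}
 * icu::UnicodeString ltr1, rtl1, ltr2, rtl2;
 * uspoof_getBidiSkeletonUnicodeString(sc, UBIDI_LTR, id1, ltr1, &status);
 * uspoof_getBidiSkeletonUnicodeString(sc, UBIDI_RTL, id1, rtl1, &status);
 * uspoof_getBidiSkeletonUnicodeString(sc, UBIDI_LTR, id2, ltr2, &status);
 * uspoof_getBidiSkeletonUnicodeString(sc, UBIDI_RTL, id2, rtl2, &status);
 * // Compare like with like only: LTR against LTR, RTL against RTL.
 * bool confusableInEitherDirection =
 *     U_SUCCESS(status) && (ltr1 == ltr2 || rtl1 == rtl2);
 * \endcode
 *
 * Keeping both skeleta per identifier can pay off when many identifiers are compared, since each
 * skeleton is then computed once rather than once per pair.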
 *
 * <p>
 * In cases where confusability between the visual appearances of an identifier displayed in a
 * left-to-right context with another identifier displayed in a right-to-left context is a concern,
 * the LTR skeleton of one can be compared with the RTL skeleton of the other. However, this
 * very broad definition of confusability may have unexpected results; for instance, it treats the
 * ASCII identifiers "Mark_" and "_Mark" as confusable.
 *
 * <h2>Additional Information</h2>
 *
 * A <code>USpoofChecker</code> instance may be used repeatedly to perform checks on any number of identifiers.
 *
 * <b>Thread Safety:</b> The test functions for checking a single identifier, or for testing whether
 * two identifiers are possibly confusable, are thread safe. They may be called concurrently, from multiple threads,
 * using the same USpoofChecker instance.
 *
 * More generally, the standard ICU thread safety rules apply: functions that take a const USpoofChecker parameter are
 * thread safe. Those that take a non-const USpoofChecker are not thread safe.
 *
 * @stable ICU 4.6
 */

U_CDECL_BEGIN

struct USpoofChecker;
/**
 * @stable ICU 4.2
 */
typedef struct USpoofChecker USpoofChecker; /**< typedef for C of USpoofChecker */

struct USpoofCheckResult;
/**
 * @see uspoof_openCheckResult
 * @stable ICU 58
 */
typedef struct USpoofCheckResult USpoofCheckResult;

/**
 * Enum for the kinds of checks that USpoofChecker can perform.
 * These enum values are used both to select the set of checks that
 * will be performed, and to report results from the check function.
 *
 * @stable ICU 4.2
 */
typedef enum USpoofChecks {
    /**
     * When performing the two-string {@link uspoof_areConfusable} test, this flag in the return value indicates
     * that the two strings are visually confusable and that they are from the same script, according to UTS 39 section
     * 4.
     *
     * @see uspoof_areConfusable
     * @stable ICU 4.2
     */
    USPOOF_SINGLE_SCRIPT_CONFUSABLE = 1,

    /**
     * When performing the two-string {@link uspoof_areConfusable} test, this flag in the return value indicates
     * that the two strings are visually confusable and that they are <b>not</b> from the same script, according to UTS
     * 39 section 4.
     *
     * @see uspoof_areConfusable
     * @stable ICU 4.2
     */
    USPOOF_MIXED_SCRIPT_CONFUSABLE = 2,

    /**
     * When performing the two-string {@link uspoof_areConfusable} test, this flag in the return value indicates
     * that the two strings are visually confusable and that they are not from the same script but both of them are
     * single-script strings, according to UTS 39 section 4.
     *
     * @see uspoof_areConfusable
     * @stable ICU 4.2
     */
    USPOOF_WHOLE_SCRIPT_CONFUSABLE = 4,

    /**
     * Enable this flag in {@link uspoof_setChecks} to turn on all types of confusables. You may set
     * the checks to some subset of SINGLE_SCRIPT_CONFUSABLE, MIXED_SCRIPT_CONFUSABLE, or WHOLE_SCRIPT_CONFUSABLE to
     * make {@link uspoof_areConfusable} return only those types of confusables.
     *
     * @see uspoof_areConfusable
     * @see uspoof_getSkeleton
     * @stable ICU 58
     */
    USPOOF_CONFUSABLE = USPOOF_SINGLE_SCRIPT_CONFUSABLE | USPOOF_MIXED_SCRIPT_CONFUSABLE | USPOOF_WHOLE_SCRIPT_CONFUSABLE,

#ifndef U_HIDE_DEPRECATED_API
    /**
     * This flag is deprecated and no longer affects the behavior of SpoofChecker.
     *
     * @deprecated ICU 58 Any case confusable mappings were removed from UTS 39; the corresponding ICU API was deprecated.
     */
    USPOOF_ANY_CASE = 8,
#endif  /* U_HIDE_DEPRECATED_API */

    /**
     * Check that an identifier is no looser than the specified RestrictionLevel.
     * The default if {@link uspoof_setRestrictionLevel} is not called is HIGHLY_RESTRICTIVE.
     *
     * If USPOOF_AUX_INFO is enabled the actual restriction level of the
     * identifier being tested will also be returned by uspoof_check().
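     *
     * A minimal sketch of reading that level back, assuming <code>sc</code> is an open USpoofChecker and
     * <code>id</code> / <code>idLength</code> describe a UTF-16 identifier (all three names are
     * illustrative, not part of the API):
     *
     * \code{.cpp}
     * UErrorCode status = U_ZERO_ERROR;
     * uspoof_setChecks(sc, USPOOF_RESTRICTION_LEVEL | USPOOF_AUX_INFO, &status);
     * int32_t result = uspoof_check(sc, id, idLength, NULL, &status);
     * if (U_SUCCESS(status)) {
     *     // Low bits: which enabled checks failed. High bits: the identifier's actual level.
     *     UBool failed = (result & USPOOF_ALL_CHECKS) != 0;
     *     URestrictionLevel level = (URestrictionLevel)(result & USPOOF_RESTRICTION_LEVEL_MASK);
     * }
     * \endcode
     *
     * Masking matters here because, with USPOOF_AUX_INFO enabled, the raw result can be nonzero even for
     * an identifier that passes every check.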
     *
     * @see URestrictionLevel
     * @see uspoof_setRestrictionLevel
     * @see USPOOF_AUX_INFO
     *
     * @stable ICU 51
     */
    USPOOF_RESTRICTION_LEVEL = 16,

#ifndef U_HIDE_DEPRECATED_API
    /** Check that an identifier contains only characters from a
     * single script (plus chars from the common and inherited scripts.)
     * Applies to single-identifier checks only.
     * @deprecated ICU 51 Use RESTRICTION_LEVEL instead.
     */
    USPOOF_SINGLE_SCRIPT = USPOOF_RESTRICTION_LEVEL,
#endif  /* U_HIDE_DEPRECATED_API */

    /** Check an identifier for the presence of invisible characters,
     * such as zero-width spaces, or character sequences that are
     * likely not to display, such as multiple occurrences of the same
     * non-spacing mark. This check does not test the input string as a whole
     * for conformance to any particular syntax for identifiers.
     */
    USPOOF_INVISIBLE = 32,

    /** Check that an identifier contains only characters from a specified set
     * of acceptable characters. See {@link uspoof_setAllowedChars} and
     * {@link uspoof_setAllowedLocales}. Note that a string that fails this check
     * will also fail the {@link USPOOF_RESTRICTION_LEVEL} check.
     */
    USPOOF_CHAR_LIMIT = 64,

    /**
     * Check that an identifier does not mix numbers from different numbering systems.
     * For more information, see UTS 39 section 5.3.
     *
     * @stable ICU 51
     */
    USPOOF_MIXED_NUMBERS = 128,

    /**
     * Check that an identifier does not have a combining character following a character in which that
     * combining character would be hidden; for example 'i' followed by a U+0307 combining dot.
     *
     * More specifically, the following characters are forbidden from preceding a U+0307:
     * <ul>
     * <li>Those with the Soft_Dotted Unicode property (which includes 'i' and 'j')</li>
     * <li>Latin lowercase letter 'l'</li>
     * <li>Dotless 'i' and 'j' ('ı' and 'ȷ', U+0131 and U+0237)</li>
     * <li>Any character whose confusable prototype ends with such a character
     * (Soft_Dotted, 'l', 'ı', or 'ȷ')</li>
     * </ul>
     * In addition, combining characters are allowed between the above characters and U+0307 except those
     * with combining class 0 or combining class "Above" (230, same class as U+0307).
     *
     * This list and the number of combining characters considered by this check may grow over time.
     *
     * @stable ICU 62
     */
    USPOOF_HIDDEN_OVERLAY = 256,

    /**
     * Enable all spoof checks.
     *
     * @stable ICU 4.6
     */
    USPOOF_ALL_CHECKS = 0xFFFF,

    /**
     * Enable the return of auxiliary (non-error) information in the
     * upper bits of the check results value.
     *
     * If this "check" is not enabled, the results of {@link uspoof_check} will be
     * zero when an identifier passes all of the enabled checks.
     *
     * If this "check" is enabled, (uspoof_check() & {@link USPOOF_ALL_CHECKS}) will
     * be zero when an identifier passes all checks.
     *
     * @stable ICU 51
     */
    USPOOF_AUX_INFO = 0x40000000

} USpoofChecks;


/**
 * Constants from UTS #39 for use in {@link uspoof_setRestrictionLevel}, and
 * for returned identifier restriction levels in check results.
 *
 * @stable ICU 51
 *
 * @see uspoof_setRestrictionLevel
 * @see uspoof_check
 */
typedef enum URestrictionLevel {
    /**
     * All characters in the string are in the identifier profile and all characters in the string are in the
     * ASCII range.
     *
     * @stable ICU 51
     */
    USPOOF_ASCII = 0x10000000,
    /**
     * The string classifies as ASCII-Only, or all characters in the string are in the identifier profile and
     * the string is single-script, according to the definition in UTS 39 section 5.1.
     *
     * @stable ICU 53
     */
    USPOOF_SINGLE_SCRIPT_RESTRICTIVE = 0x20000000,
    /**
     * The string classifies as Single Script, or all characters in the string are in the identifier profile and
     * the string is covered by any of the following sets of scripts, according to the definition in UTS 39
     * section 5.1:
     * <ul>
     *   <li>Latin + Han + Bopomofo (or equivalently: Latn + Hanb)</li>
     *   <li>Latin + Han + Hiragana + Katakana (or equivalently: Latn + Jpan)</li>
     *   <li>Latin + Han + Hangul (or equivalently: Latn + Kore)</li>
     * </ul>
     * This is the default restriction in ICU.
     *
     * @stable ICU 51
     */
    USPOOF_HIGHLY_RESTRICTIVE = 0x30000000,
    /**
     * The string classifies as Highly Restrictive, or all characters in the string are in the identifier profile
     * and the string is covered by Latin and any one other Recommended or Aspirational script, except Cyrillic,
     * Greek, and Cherokee.
     *
     * @stable ICU 51
     */
    USPOOF_MODERATELY_RESTRICTIVE = 0x40000000,
    /**
     * All characters in the string are in the identifier profile. Allow arbitrary mixtures of scripts.
     *
     * @stable ICU 51
     */
    USPOOF_MINIMALLY_RESTRICTIVE = 0x50000000,
    /**
     * Any valid identifiers, including characters outside of the Identifier Profile.
     *
     * @stable ICU 51
     */
    USPOOF_UNRESTRICTIVE = 0x60000000,
    /**
     * Mask for selecting the Restriction Level bits from the return value of {@link uspoof_check}.
     *
     * @stable ICU 53
     */
    USPOOF_RESTRICTION_LEVEL_MASK = 0x7F000000,
#ifndef U_HIDE_INTERNAL_API
    /**
     * An undefined restriction level.
     * @internal
     */
    USPOOF_UNDEFINED_RESTRICTIVE = -1
#endif  /* U_HIDE_INTERNAL_API */
} URestrictionLevel;

/**
 * Create a Unicode Spoof Checker, configured to perform all
 * checks except for USPOOF_LOCALE_LIMIT and USPOOF_CHAR_LIMIT.
 * Note that additional checks may be added in the future,
 * resulting in changes to the default checking behavior.
 *
 * @param status  The error code, set if this function encounters a problem.
 * @return        the newly created Spoof Checker
 * @stable ICU 4.2
 */
U_CAPI USpoofChecker * U_EXPORT2
uspoof_open(UErrorCode *status);


/**
 * Open a Spoof checker from its serialized form, stored in 32-bit-aligned memory.
 * Inverse of uspoof_serialize().
 * The memory containing the serialized data must remain valid and unchanged
 * as long as the spoof checker, or any cloned copies of the spoof checker,
 * are in use. Ownership of the memory remains with the caller.
 * The spoof checker (and any clones) must be closed prior to deleting the
 * serialized data.
 *
 * @param data a pointer to 32-bit-aligned memory containing the serialized form of spoof data
 * @param length the number of bytes available at data;
 *               can be more than necessary
 * @param pActualLength receives the actual number of bytes at data taken up by the data;
 *                      can be NULL
 * @param pErrorCode ICU error code
 * @return the spoof checker.
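 *
 * A minimal sketch of a serialize and reopen round trip, assuming <code>sc</code> is an open
 * USpoofChecker, that <code>&lt;stdlib.h&gt;</code> is included for <code>malloc</code> and
 * <code>free</code>, and that calling uspoof_serialize() with a zero capacity reports the required
 * size together with U_BUFFER_OVERFLOW_ERROR. The names <code>buf</code>, <code>size</code> and
 * <code>sc2</code> are illustrative only:
 *
 * \code{.cpp}
 * UErrorCode status = U_ZERO_ERROR;
 * int32_t size = uspoof_serialize(sc, NULL, 0, &status);   // preflight for the size
 * if (status == U_BUFFER_OVERFLOW_ERROR) {
 *     status = U_ZERO_ERROR;
 *     void *buf = malloc(size);                            // malloc memory is suitably aligned
 *     if (buf != NULL) {
 *         uspoof_serialize(sc, buf, size, &status);
 *         USpoofChecker *sc2 = uspoof_openFromSerialized(buf, size, NULL, &status);
 *         // ... use sc2 while buf stays valid and unchanged ...
 *         uspoof_close(sc2);                               // close first,
 *         free(buf);                                       // then release the data
 *     }
 * }
 * \endcode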
 *
 * @see uspoof_open
 * @see uspoof_serialize
 * @stable ICU 4.2
 */
U_CAPI USpoofChecker * U_EXPORT2
uspoof_openFromSerialized(const void *data, int32_t length, int32_t *pActualLength,
                          UErrorCode *pErrorCode);

/**
 * Open a Spoof Checker from the source form of the spoof data.
 * The input corresponds to the Unicode data file confusables.txt
 * as described in Unicode Technical Standard #39. The syntax of the source data
 * is as described in UTS #39 for this file, and the content of
 * this file is acceptable input.
 *
 * The character encoding of the (char *) input text is UTF-8.
 *
 * @param confusables a pointer to the confusable characters definitions,
 *                    as found in file confusables.txt from unicode.org.
 * @param confusablesLen The length of the confusables text, or -1 if the
 *                    input string is zero terminated.
 * @param confusablesWholeScript
 *                    Deprecated in ICU 58. No longer used.
 * @param confusablesWholeScriptLen
 *                    Deprecated in ICU 58. No longer used.
 * @param errType     In the event of an error in the input, indicates
 *                    which of the input files contains the error.
 *                    The value is one of USPOOF_SINGLE_SCRIPT_CONFUSABLE or
 *                    USPOOF_WHOLE_SCRIPT_CONFUSABLE, or
 *                    zero if no errors are found.
 * @param pe          In the event of an error in the input, receives the position
 *                    in the input text (line, offset) of the error.
 * @param status      an in/out ICU UErrorCode. Among the possible errors is
 *                    U_PARSE_ERROR, which is used to report syntax errors
 *                    in the input.
 * @return            A spoof checker that uses the rules from the input files.
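 *
 * A minimal sketch of a call, assuming <code>confusables</code> points to the NUL-terminated UTF-8
 * text of confusables.txt and assuming the two deprecated whole-script arguments may be passed as
 * NULL and 0 because they are no longer used. The variable names are illustrative only:
 *
 * \code{.cpp}
 * UErrorCode status = U_ZERO_ERROR;
 * UParseError pe;
 * int32_t errType = 0;
 * USpoofChecker *sc = uspoof_openFromSource(confusables, -1, NULL, 0, &errType, &pe, &status);
 * if (U_FAILURE(status)) {
 *     // For U_PARSE_ERROR, pe.line and pe.offset locate the problem in the input.
 * } else {
 *     // ... use sc ...
 *     uspoof_close(sc);
 * }
 * \endcode
 *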
 * @stable ICU 4.2
 */
U_CAPI USpoofChecker * U_EXPORT2
uspoof_openFromSource(const char *confusables, int32_t confusablesLen,
                      const char *confusablesWholeScript, int32_t confusablesWholeScriptLen,
                      int32_t *errType, UParseError *pe, UErrorCode *status);


/**
 * Close a Spoof Checker, freeing any memory that was being held by
 * its implementation.
 * @stable ICU 4.2
 */
U_CAPI void U_EXPORT2
uspoof_close(USpoofChecker *sc);

/**
 * Clone a Spoof Checker. The clone will be set to perform the same checks
 * as the original source.
 *
 * @param sc       The source USpoofChecker
 * @param status   The error code, set if this function encounters a problem.
 * @return         The newly created clone of the source Spoof Checker.
 * @stable ICU 4.2
 */
U_CAPI USpoofChecker * U_EXPORT2
uspoof_clone(const USpoofChecker *sc, UErrorCode *status);


/**
 * Specify the bitmask of checks that will be performed by {@link uspoof_check}. Calling this method
 * overwrites any checks that may have already been enabled. By default, all checks are enabled.
 *
 * To enable specific checks and disable all others,
 * OR together only the bit constants for the desired checks.
 * For example, to fail strings containing characters outside of
 * the set specified by {@link uspoof_setAllowedChars} and
 * also strings that contain digits from mixed numbering systems:
 *
 * <pre>
 * {@code
 * uspoof_setChecks(sc, USPOOF_CHAR_LIMIT | USPOOF_MIXED_NUMBERS, &status);
 * }
 * </pre>
 *
 * To disable specific checks and enable all others,
 * start with ALL_CHECKS and "AND away" the not-desired checks.
 * For example, if you are not planning to use the {@link uspoof_areConfusable} functionality,
 * it is good practice to disable the CONFUSABLE check:
 *
 * <pre>
 * {@code
 * uspoof_setChecks(sc, USPOOF_ALL_CHECKS & ~USPOOF_CONFUSABLE, &status);
 * }
 * </pre>
 *
 * Note that methods such as {@link uspoof_setAllowedChars}, {@link uspoof_setAllowedLocales}, and
 * {@link uspoof_setRestrictionLevel} will enable certain checks when called. Those methods will OR the check they
 * enable onto the existing bitmask specified by this method. For more details, see the documentation of those
 * methods.
 *
 * @param sc       The USpoofChecker
 * @param checks   The set of checks that this spoof checker will perform.
 *                 The value is a bit set, obtained by OR-ing together
 *                 values from enum USpoofChecks.
 * @param status   The error code, set if this function encounters a problem.
 * @stable ICU 4.2
 *
 */
U_CAPI void U_EXPORT2
uspoof_setChecks(USpoofChecker *sc, int32_t checks, UErrorCode *status);

/**
 * Get the set of checks that this Spoof Checker has been configured to perform.
 *
 * @param sc       The USpoofChecker
 * @param status   The error code, set if this function encounters a problem.
 * @return         The set of checks that this spoof checker will perform.
 *                 The value is a bit set, obtained by OR-ing together
 *                 values from enum USpoofChecks.
 * @stable ICU 4.2
 *
 */
U_CAPI int32_t U_EXPORT2
uspoof_getChecks(const USpoofChecker *sc, UErrorCode *status);

/**
 * Set the loosest restriction level allowed for strings. The default if this is not called is
 * {@link USPOOF_HIGHLY_RESTRICTIVE}. Calling this method enables the {@link USPOOF_RESTRICTION_LEVEL} and
 * {@link USPOOF_MIXED_NUMBERS} checks, corresponding to Sections 5.1 and 5.2 of UTS 39.
 * To customize which checks are to be performed by {@link uspoof_check}, see {@link uspoof_setChecks}.
 *
 * @param sc       The USpoofChecker
 * @param restrictionLevel The loosest restriction level allowed.
 * @see URestrictionLevel
 * @stable ICU 51
 */
U_CAPI void U_EXPORT2
uspoof_setRestrictionLevel(USpoofChecker *sc, URestrictionLevel restrictionLevel);


/**
 * Get the Restriction Level that will be tested if the checks include {@link USPOOF_RESTRICTION_LEVEL}.
 *
 * @param sc       The USpoofChecker
 * @return The restriction level
 * @see URestrictionLevel
 * @stable ICU 51
 */
U_CAPI URestrictionLevel U_EXPORT2
uspoof_getRestrictionLevel(const USpoofChecker *sc);

/**
 * Limit characters that are acceptable in identifiers being checked to those
 * normally used with the languages associated with the specified locales.
 * Any previously specified list of locales is replaced by the new settings.
 *
 * A set of languages is determined from the locale(s), and
 * from those a set of acceptable Unicode scripts is determined.
 * Characters from this set of scripts, along with characters from
 * the "common" and "inherited" Unicode Script categories
 * will be permitted.
 *
 * Supplying an empty string removes all restrictions;
 * characters from any script will be allowed.
 *
 * The {@link USPOOF_CHAR_LIMIT} test is automatically enabled for this
 * USpoofChecker when calling this function with a non-empty list
 * of locales.
 *
 * The Unicode Set of characters that will be allowed is accessible
 * via the uspoof_getAllowedChars() function.  uspoof_setAllowedLocales()
 * will <i>replace</i> any previously applied set of allowed characters.
 *
 * Adjustments, such as additions or deletions of certain classes of characters,
 * can be made to the result of uspoof_setAllowedLocales() by
 * fetching the resulting set with uspoof_getAllowedChars(),
 * manipulating it with the Unicode Set API, then resetting the
 * spoof detector's limits with uspoof_setAllowedChars().
 *
 * @param sc           The USpoofChecker
 * @param localesList  A list of locales, from which the language
 *                     and associated script are extracted.  The locales
 *                     are comma-separated if there is more than one.
 *                     White space may not appear within an individual locale,
 *                     but is ignored otherwise.
 *                     The locales are syntactically like those from the
 *                     HTTP Accept-Language header.
 *                     If the localesList is empty, no restrictions will be placed on
 *                     the allowed characters.
 *
 * @param status       The error code, set if this function encounters a problem.
 * @stable ICU 4.2
 */
U_CAPI void U_EXPORT2
uspoof_setAllowedLocales(USpoofChecker *sc, const char *localesList, UErrorCode *status);

/**
 * Get a list of locales for the scripts that are acceptable in strings
 * to be checked.  If no limitations on scripts have been specified,
 * an empty string will be returned.
 *
 * uspoof_setAllowedChars() will reset the list of allowed locales to be empty.
 *
 * The format of the returned list is the same as that supplied to
 * uspoof_setAllowedLocales(), but the returned list may not be identical
 * to the originally specified string; the string may be reformatted,
 * and information other than languages from
 * the originally specified locales may be omitted.
 *
 * @param sc       The USpoofChecker
 * @param status   The error code, set if this function encounters a problem.
 * @return         A string containing a list of locales corresponding
 *                 to the acceptable scripts, formatted like an
 *                 HTTP Accept-Language value.
 *
 * @stable ICU 4.2
 */
U_CAPI const char * U_EXPORT2
uspoof_getAllowedLocales(USpoofChecker *sc, UErrorCode *status);


/**
 * Limit the acceptable characters to those specified by a Unicode Set.
 * Any previously specified character limit is
 * replaced by the new settings.  This includes limits on
 * characters that were set with the uspoof_setAllowedLocales() function.
 *
 * The USPOOF_CHAR_LIMIT test is automatically enabled for this
 * USpoofChecker by this function.
 *
 * @param sc       The USpoofChecker
 * @param chars    A Unicode Set containing the list of
 *                 characters that are permitted.  Ownership of the set
 *                 remains with the caller.  The incoming set is cloned by
 *                 this function, so there are no restrictions on modifying
 *                 or deleting the USet after calling this function.
 * @param status   The error code, set if this function encounters a problem.
 * @stable ICU 4.2
 */
U_CAPI void U_EXPORT2
uspoof_setAllowedChars(USpoofChecker *sc, const USet *chars, UErrorCode *status);


/**
 * Get a USet for the characters permitted in an identifier.
 * This corresponds to the limits imposed by uspoof_setAllowedChars()
 * and uspoof_setAllowedLocales().  Limitations imposed by other checks
 * will not be reflected in the set returned by this function.
 *
 * The returned set will be frozen, meaning that it cannot be modified
 * by the caller.
 *
 * Ownership of the returned set remains with the Spoof Detector.  The
 * returned set will become invalid if the spoof detector is closed,
 * or if a new set of allowed characters is specified.
 *
 * @param sc       The USpoofChecker
 * @param status   The error code, set if this function encounters a problem.
 * @return         A USet containing the characters that are permitted by
 *                 the USPOOF_CHAR_LIMIT test.
 * @stable ICU 4.2
 */
U_CAPI const USet * U_EXPORT2
uspoof_getAllowedChars(const USpoofChecker *sc, UErrorCode *status);


/**
 * Check the specified string for possible security issues.
 * The text to be checked will typically be an identifier of some sort.
 * The set of checks to be performed is specified with uspoof_setChecks().
 *
 * \note
 *   Consider using the newer API, {@link uspoof_check2}, instead.
 *   The newer API exposes additional information from the check procedure
 *   and is otherwise identical to this method.
 *
 * @param sc      The USpoofChecker
 * @param id      The identifier to be checked for possible security issues,
 *                in UTF-16 format.
 * @param length  the length of the string to be checked, expressed in
 *                16 bit UTF-16 code units, or -1 if the string is
 *                zero terminated.
 * @param position  Deprecated in ICU 51.  Always returns zero.
 *                Originally, an out parameter for the index of the first
 *                string position that failed a check.
 *                This parameter may be NULL.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Spoofing or security issues detected with the input string are
 *                not reported here, but through the function's return value.
 * @return        An integer value with bits set for any potential security
 *                or spoofing issues detected.  The bits are defined by
 *                enum USpoofChecks.  (returned_value & USPOOF_ALL_CHECKS)
 *                will be zero if the input string passes all of the
 *                enabled checks.
 * @see uspoof_check2
 * @stable ICU 4.2
 */
U_CAPI int32_t U_EXPORT2
uspoof_check(const USpoofChecker *sc,
             const UChar *id, int32_t length,
             int32_t *position,
             UErrorCode *status);


/**
 * Check the specified string for possible security issues.
 * The text to be checked will typically be an identifier of some sort.
 * The set of checks to be performed is specified with uspoof_setChecks().
 *
 * \note
 *   Consider using the newer API, {@link uspoof_check2UTF8}, instead.
 *   The newer API exposes additional information from the check procedure
 *   and is otherwise identical to this method.
 *
 * @param sc      The USpoofChecker
 * @param id      An identifier to be checked for possible security issues, in UTF-8 format.
 * @param length  the length of the string to be checked, or -1 if the string is
 *                zero terminated.
 * @param position  Deprecated in ICU 51.  Always returns zero.
 *                Originally, an out parameter for the index of the first
 *                string position that failed a check.
 *                This parameter may be NULL.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Spoofing or security issues detected with the input string are
 *                not reported here, but through the function's return value.
 *                If the input contains invalid UTF-8 sequences,
 *                a status of U_INVALID_CHAR_FOUND will be returned.
 * @return        An integer value with bits set for any potential security
 *                or spoofing issues detected.  The bits are defined by
 *                enum USpoofChecks.  (returned_value & USPOOF_ALL_CHECKS)
 *                will be zero if the input string passes all of the
 *                enabled checks.
 * @see uspoof_check2UTF8
 * @stable ICU 4.2
 */
U_CAPI int32_t U_EXPORT2
uspoof_checkUTF8(const USpoofChecker *sc,
                 const char *id, int32_t length,
                 int32_t *position,
                 UErrorCode *status);


/**
 * Check the specified string for possible security issues.
 * The text to be checked will typically be an identifier of some sort.
 * The set of checks to be performed is specified with uspoof_setChecks().
 *
 * @param sc      The USpoofChecker
 * @param id      The identifier to be checked for possible security issues,
 *                in UTF-16 format.
 * @param length  the length of the string to be checked, or -1 if the string is
 *                zero terminated.
 * @param checkResult  An instance of USpoofCheckResult to be filled with
 *                details about the identifier.  Can be NULL.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Spoofing or security issues detected with the input string are
 *                not reported here, but through the function's return value.
 * @return        An integer value with bits set for any potential security
 *                or spoofing issues detected.  The bits are defined by
 *                enum USpoofChecks.  (returned_value & USPOOF_ALL_CHECKS)
 *                will be zero if the input string passes all of the
 *                enabled checks.  Any information in this bitmask will be
 *                consistent with the information saved in the optional
 *                checkResult parameter.
 * @see uspoof_openCheckResult
 * @see uspoof_check2UTF8
 * @see uspoof_check2UnicodeString
 * @stable ICU 58
 */
U_CAPI int32_t U_EXPORT2
uspoof_check2(const USpoofChecker *sc,
    const UChar* id, int32_t length,
    USpoofCheckResult* checkResult,
    UErrorCode *status);

/**
 * Check the specified string for possible security issues.
 * The text to be checked will typically be an identifier of some sort.
 * The set of checks to be performed is specified with uspoof_setChecks().
 *
 * This version of {@link uspoof_check} accepts a USpoofCheckResult, which
 * returns additional information about the identifier.  For more
 * information, see {@link uspoof_openCheckResult}.
 *
 * @param sc      The USpoofChecker
 * @param id      An identifier to be checked for possible security issues, in UTF-8 format.
 * @param length  the length of the string to be checked, or -1 if the string is
 *                zero terminated.
 * @param checkResult  An instance of USpoofCheckResult to be filled with
 *                details about the identifier.  Can be NULL.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Spoofing or security issues detected with the input string are
 *                not reported here, but through the function's return value.
 * @return        An integer value with bits set for any potential security
 *                or spoofing issues detected.  The bits are defined by
 *                enum USpoofChecks.  (returned_value & USPOOF_ALL_CHECKS)
 *                will be zero if the input string passes all of the
 *                enabled checks.  Any information in this bitmask will be
 *                consistent with the information saved in the optional
 *                checkResult parameter.
 * @see uspoof_openCheckResult
 * @see uspoof_check2
 * @see uspoof_check2UnicodeString
 * @stable ICU 58
 */
U_CAPI int32_t U_EXPORT2
uspoof_check2UTF8(const USpoofChecker *sc,
    const char *id, int32_t length,
    USpoofCheckResult* checkResult,
    UErrorCode *status);

/**
 * Create a USpoofCheckResult, used by the {@link uspoof_check2} class of functions to return
 * information about the identifier.  Information includes:
 * <ul>
 *   <li>A bitmask of the checks that failed</li>
 *   <li>The identifier's restriction level (UTS 39 section 5.2)</li>
 *   <li>The set of numerics in the string (UTS 39 section 5.3)</li>
 * </ul>
 * The data held in a USpoofCheckResult is cleared whenever it is passed into a new call
 * of {@link uspoof_check2}.
 *
 * @param status  The error code, set if this function encounters a problem.
 * @return        the newly created USpoofCheckResult
 * @see uspoof_check2
 * @see uspoof_check2UTF8
 * @see uspoof_check2UnicodeString
 * @stable ICU 58
 */
U_CAPI USpoofCheckResult* U_EXPORT2
uspoof_openCheckResult(UErrorCode *status);

/**
 * Close a USpoofCheckResult, freeing any memory that was being held by
 * its implementation.
 *
 * @param checkResult  The instance of USpoofCheckResult to close
 * @stable ICU 58
 */
U_CAPI void U_EXPORT2
uspoof_closeCheckResult(USpoofCheckResult *checkResult);

/**
 * Indicates which of the spoof check(s) have failed. The value is a bitwise OR of the constants for the tests
 * in question: USPOOF_RESTRICTION_LEVEL, USPOOF_CHAR_LIMIT, and so on.
 *
 * @param checkResult  The instance of USpoofCheckResult created by {@link uspoof_openCheckResult}
 * @param status       The error code, set if an error occurred.
 * @return       An integer value with bits set for any potential security
 *               or spoofing issues detected.  The bits are defined by
 *               enum USpoofChecks.  (returned_value & USPOOF_ALL_CHECKS)
 *               will be zero if the input string passes all of the
 *               enabled checks.
 * @see uspoof_setChecks
 * @stable ICU 58
 */
U_CAPI int32_t U_EXPORT2
uspoof_getCheckResultChecks(const USpoofCheckResult *checkResult, UErrorCode *status);

/**
 * Gets the restriction level that the text meets, if the USPOOF_RESTRICTION_LEVEL check
 * was enabled; otherwise, undefined.
 *
 * @param checkResult  The instance of USpoofCheckResult created by {@link uspoof_openCheckResult}
 * @param status       The error code, set if an error occurred.
 * @return             The restriction level contained in the USpoofCheckResult
 * @see uspoof_setRestrictionLevel
 * @stable ICU 58
 */
U_CAPI URestrictionLevel U_EXPORT2
uspoof_getCheckResultRestrictionLevel(const USpoofCheckResult *checkResult, UErrorCode *status);

/**
 * Gets the set of numerics found in the string, if the USPOOF_MIXED_NUMBERS check was enabled;
 * otherwise, undefined.  The set will contain the zero digit from each decimal number system found
 * in the input string.  Ownership of the returned USet remains with the USpoofCheckResult.
 * The USet will be freed when {@link uspoof_closeCheckResult} is called.
 *
 * @param checkResult  The instance of USpoofCheckResult created by {@link uspoof_openCheckResult}
 * @param status       The error code, set if an error occurred.
 * @return             The set of numerics contained in the USpoofCheckResult
 * @stable ICU 58
 */
U_CAPI const USet* U_EXPORT2
uspoof_getCheckResultNumerics(const USpoofCheckResult *checkResult, UErrorCode *status);


/**
 * Check whether two specified strings are visually confusable.
 *
 * If the strings are confusable, the return value will be nonzero, as long as
 * {@link USPOOF_CONFUSABLE} was enabled in uspoof_setChecks().
 *
 * The bits in the return value correspond to flags for each of the classes of
 * confusables applicable to the two input strings.  According to UTS 39
 * section 4, the possible flags are:
 *
 * <ul>
 *   <li>{@link USPOOF_SINGLE_SCRIPT_CONFUSABLE}</li>
 *   <li>{@link USPOOF_MIXED_SCRIPT_CONFUSABLE}</li>
 *   <li>{@link USPOOF_WHOLE_SCRIPT_CONFUSABLE}</li>
 * </ul>
 *
 * If one or more of the above flags were not listed in uspoof_setChecks(), this
 * function will never report that class of confusable.
 * The check {@link USPOOF_CONFUSABLE} enables all three flags.
 *
 *
 * @param sc      The USpoofChecker
 * @param id1     The first of the two identifiers to be compared for
 *                confusability.  The strings are in UTF-16 format.
 * @param length1 The length of the first identifier, expressed in
 *                16 bit UTF-16 code units, or -1 if the string is
 *                nul terminated.
 * @param id2     The second of the two identifiers to be compared for
 *                confusability.  The identifiers are in UTF-16 format.
 * @param length2 The length of the second identifier, expressed in
 *                16 bit UTF-16 code units, or -1 if the string is
 *                nul terminated.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Confusability of the identifiers is not reported here,
 *                but through this function's return value.
 * @return        An integer value with bit(s) set corresponding to
 *                the type of confusability found, as defined by
 *                enum USpoofChecks.
 *                Zero is returned if the identifiers
 *                are not confusable.
 *
 * @stable ICU 4.2
 */
U_CAPI int32_t U_EXPORT2
uspoof_areConfusable(const USpoofChecker *sc,
                     const UChar *id1, int32_t length1,
                     const UChar *id2, int32_t length2,
                     UErrorCode *status);

#ifndef U_HIDE_DRAFT_API
/**
 * Check whether two specified strings are visually confusable when
 * displayed in a context with the given paragraph direction.
 *
 * If the strings are confusable, the return value will be nonzero, as long as
 * {@link USPOOF_CONFUSABLE} was enabled in uspoof_setChecks().
 *
 * The bits in the return value correspond to flags for each of the classes of
 * confusables applicable to the two input strings.
 * According to UTS 39 section 4, the possible flags are:
 *
 * <ul>
 *   <li>{@link USPOOF_SINGLE_SCRIPT_CONFUSABLE}</li>
 *   <li>{@link USPOOF_MIXED_SCRIPT_CONFUSABLE}</li>
 *   <li>{@link USPOOF_WHOLE_SCRIPT_CONFUSABLE}</li>
 * </ul>
 *
 * If one or more of the above flags were not listed in uspoof_setChecks(), this
 * function will never report that class of confusable.  The check
 * {@link USPOOF_CONFUSABLE} enables all three flags.
 *
 *
 * @param sc      The USpoofChecker
 * @param direction The paragraph direction with which the identifiers are
 *                displayed.  Must be either UBIDI_LTR or UBIDI_RTL.
 * @param id1     The first of the two identifiers to be compared for
 *                confusability.  The strings are in UTF-16 format.
 * @param length1 the length of the first identifier, expressed in
 *                16 bit UTF-16 code units, or -1 if the string is
 *                nul terminated.
 * @param id2     The second of the two identifiers to be compared for
 *                confusability.
 *                The identifiers are in UTF-16 format.
 * @param length2 The length of the second identifier, expressed in
 *                16 bit UTF-16 code units, or -1 if the string is
 *                nul terminated.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Confusability of the identifiers is not reported here,
 *                but through this function's return value.
 * @return        An integer value with bit(s) set corresponding to
 *                the type of confusability found, as defined by
 *                enum USpoofChecks.  Zero is returned if the identifiers
 *                are not confusable.
 *
 * @draft ICU 74
 */
U_CAPI uint32_t U_EXPORT2 uspoof_areBidiConfusable(const USpoofChecker *sc, UBiDiDirection direction,
                                                  const UChar *id1, int32_t length1,
                                                  const UChar *id2, int32_t length2,
                                                  UErrorCode *status);
#endif /* U_HIDE_DRAFT_API */

/**
 * A version of {@link uspoof_areConfusable} accepting strings in UTF-8 format.
 *
 * @param sc      The USpoofChecker
 * @param id1     The first of the two identifiers to be compared for
 *                confusability.  The strings are in UTF-8 format.
 * @param length1 The length of the first identifier, in bytes, or -1
 *                if the string is nul terminated.
 * @param id2     The second of the two identifiers to be compared for
 *                confusability.  The strings are in UTF-8 format.
 * @param length2 The length of the second string in bytes, or -1
 *                if the string is nul terminated.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Confusability of the strings is not reported here,
 *                but through this function's return value.
 * @return        An integer value with bit(s) set corresponding to
 *                the type of confusability found, as defined by
 *                enum USpoofChecks.  Zero is returned if the strings
 *                are not confusable.
 *
 * @stable ICU 4.2
 *
 * @see uspoof_areConfusable
 */
U_CAPI int32_t U_EXPORT2
uspoof_areConfusableUTF8(const USpoofChecker *sc,
                         const char *id1, int32_t length1,
                         const char *id2, int32_t length2,
                         UErrorCode *status);

#ifndef U_HIDE_DRAFT_API
/**
 * A version of {@link uspoof_areBidiConfusable} accepting strings in UTF-8 format.
 *
 * @param sc      The USpoofChecker
 * @param direction The paragraph direction with which the identifiers are
 *                displayed.  Must be either UBIDI_LTR or UBIDI_RTL.
 * @param id1     The first of the two identifiers to be compared for
 *                confusability.  The strings are in UTF-8 format.
 * @param length1 The length of the first identifier, in bytes, or -1
 *                if the string is nul terminated.
 * @param id2     The second of the two identifiers to be compared for
 *                confusability.  The strings are in UTF-8 format.
 * @param length2 The length of the second string in bytes, or -1
 *                if the string is nul terminated.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Confusability of the strings is not reported here,
 *                but through this function's return value.
 * @return        An integer value with bit(s) set corresponding to
 *                the type of confusability found, as defined by
 *                enum USpoofChecks.  Zero is returned if the strings
 *                are not confusable.
 *
 * @draft ICU 74
 *
 * @see uspoof_areBidiConfusable
 */
U_CAPI uint32_t U_EXPORT2 uspoof_areBidiConfusableUTF8(const USpoofChecker *sc, UBiDiDirection direction,
                                                      const char *id1, int32_t length1,
                                                      const char *id2, int32_t length2,
                                                      UErrorCode *status);
#endif /* U_HIDE_DRAFT_API */

/**
 * Get the "skeleton" for an identifier.
 * Skeletons are a transformation of the input identifier;
 * two identifiers are confusable if their skeletons are identical.
 * See Unicode Technical Standard #39 for additional information.
 *
 * Using skeletons directly makes it possible to quickly check
 * whether an identifier is confusable with any of some large
 * set of existing identifiers, by creating an efficiently
 * searchable collection of the skeletons.
 *
 * @param sc      The USpoofChecker
 * @param type    Deprecated in ICU 58.  You may pass any number.
 *                Originally, controlled which of the Unicode confusable data
 *                tables to use.
 * @param id      The input identifier whose skeleton will be computed.
 * @param length  The length of the input identifier, expressed in 16 bit
 *                UTF-16 code units, or -1 if the string is zero terminated.
 * @param dest    The output buffer, to receive the skeleton string.
 * @param destCapacity  The length of the output buffer, in 16 bit units.
 *                The destCapacity may be zero, in which case the function will
 *                return the actual length of the skeleton.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 * @return        The length of the skeleton string.  The returned length
 *                is always that of the complete skeleton, even when the
 *                supplied buffer is too small (or of zero length).
 *
 * @stable ICU 4.2
 * @see uspoof_areConfusable
 */
U_CAPI int32_t U_EXPORT2
uspoof_getSkeleton(const USpoofChecker *sc,
                   uint32_t type,
                   const UChar *id,  int32_t length,
                   UChar *dest, int32_t destCapacity,
                   UErrorCode *status);

#ifndef U_HIDE_DRAFT_API
/**
 * Get the "bidiSkeleton" for an identifier and a direction.
 * Skeletons are a transformation of the input identifier;
 * two identifiers are LTR-confusable if their LTR bidiSkeletons are identical;
 * they are RTL-confusable if their RTL bidiSkeletons are identical.
 * See Unicode Technical Standard #39 for additional information:
 * https://www.unicode.org/reports/tr39/#Confusable_Detection.
 *
 * Using skeletons directly makes it possible to quickly check
 * whether an identifier is confusable with any of some large
 * set of existing identifiers, by creating an efficiently
 * searchable collection of the skeletons.
 *
 * @param sc      The USpoofChecker.
 * @param direction The context direction with which the identifier will be
 *                displayed.  Must be either UBIDI_LTR or UBIDI_RTL.
 * @param id      The input identifier whose skeleton will be computed.
 * @param length  The length of the input identifier, expressed in 16 bit
 *                UTF-16 code units, or -1 if the string is zero terminated.
 * @param dest    The output buffer, to receive the skeleton string.
 * @param destCapacity  The length of the output buffer, in 16 bit units.
 *                The destCapacity may be zero, in which case the function will
 *                return the actual length of the skeleton.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 * @return        The length of the skeleton string.  The returned length
 *                is always that of the complete skeleton, even when the
 *                supplied buffer is too small (or of zero length).
 *
 * @draft ICU 74
 * @see uspoof_areBidiConfusable
 */
U_CAPI int32_t U_EXPORT2 uspoof_getBidiSkeleton(const USpoofChecker *sc,
                                                UBiDiDirection direction,
                                                const UChar *id, int32_t length,
                                                UChar *dest, int32_t destCapacity, UErrorCode *status);
#endif /* U_HIDE_DRAFT_API */

/**
 * Get the "skeleton" for an identifier.
 * Skeletons are a transformation of the input identifier;
 * two identifiers are confusable if their skeletons are identical.
 * See Unicode Technical Standard #39 for additional information.
 *
 * Using skeletons directly makes it possible to quickly check
 * whether an identifier is confusable with any of some large
 * set of existing identifiers, by creating an efficiently
 * searchable collection of the skeletons.
 *
 * @param sc      The USpoofChecker
 * @param type    Deprecated in ICU 58.  You may pass any number.
 *                Originally, controlled which of the Unicode confusable data
 *                tables to use.
 * @param id      The UTF-8 format identifier whose skeleton will be computed.
 * @param length  The length of the input string, in bytes,
 *                or -1 if the string is zero terminated.
 * @param dest    The output buffer, to receive the skeleton string.
 * @param destCapacity  The length of the output buffer, in bytes.
 *                The destCapacity may be zero, in which case the function will
 *                return the actual length of the skeleton.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.  Possible errors include U_INVALID_CHAR_FOUND
 *                for invalid UTF-8 sequences, and
 *                U_BUFFER_OVERFLOW_ERROR if the destination buffer is too small
 *                to hold the complete skeleton.
 * @return        The length of the skeleton string, in bytes.
 *                The returned length
 *                is always that of the complete skeleton, even when the
 *                supplied buffer is too small (or of zero length).
 *
 * @stable ICU 4.2
 */
U_CAPI int32_t U_EXPORT2
uspoof_getSkeletonUTF8(const USpoofChecker *sc,
                       uint32_t type,
                       const char *id,  int32_t length,
                       char *dest, int32_t destCapacity,
                       UErrorCode *status);

#ifndef U_HIDE_DRAFT_API
/**
 * Get the "bidiSkeleton" for an identifier and a direction.
 * Skeletons are a transformation of the input identifier;
 * two identifiers are LTR-confusable if their LTR bidiSkeletons are identical;
 * they are RTL-confusable if their RTL bidiSkeletons are identical.
 * See Unicode Technical Standard #39 for additional information:
 * https://www.unicode.org/reports/tr39/#Confusable_Detection.
 *
 * Using skeletons directly makes it possible to quickly check
 * whether an identifier is confusable with any of some large
 * set of existing identifiers, by creating an efficiently
 * searchable collection of the skeletons.
 *
 * @param sc      The USpoofChecker
 * @param direction The context direction with which the identifier will be
 *                displayed.  Must be either UBIDI_LTR or UBIDI_RTL.
 * @param id      The UTF-8 format identifier whose skeleton will be computed.
 * @param length  The length of the input string, in bytes,
 *                or -1 if the string is zero terminated.
 * @param dest    The output buffer, to receive the skeleton string.
 * @param destCapacity  The length of the output buffer, in bytes.
 *                The destCapacity may be zero, in which case the function will
 *                return the actual length of the skeleton.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Possible errors include U_INVALID_CHAR_FOUND
 *                for invalid UTF-8 sequences, and
 *                U_BUFFER_OVERFLOW_ERROR if the destination buffer is too small
 *                to hold the complete skeleton.
 * @return        The length of the skeleton string, in bytes.  The returned length
 *                is always that of the complete skeleton, even when the
 *                supplied buffer is too small (or of zero length).
 *
 * @draft ICU 74
 */
U_CAPI int32_t U_EXPORT2 uspoof_getBidiSkeletonUTF8(const USpoofChecker *sc, UBiDiDirection direction,
                                                    const char *id, int32_t length, char *dest,
                                                    int32_t destCapacity, UErrorCode *status);
#endif /* U_HIDE_DRAFT_API */

/**
 * Get the set of Candidate Characters for Inclusion in Identifiers, as defined
 * in http://unicode.org/Public/security/latest/xidmodifications.txt
 * and documented in http://www.unicode.org/reports/tr39/, Unicode Security Mechanisms.
 *
 * The returned set is frozen. Ownership of the set remains with the ICU library; it must not
 * be deleted by the caller.
 *
 * @param status The error code, set if a problem occurs while creating the set.
 *
 * @stable ICU 51
 */
U_CAPI const USet * U_EXPORT2
uspoof_getInclusionSet(UErrorCode *status);

/**
 * Get the set of characters from Recommended Scripts for Inclusion in Identifiers, as defined
 * in http://unicode.org/Public/security/latest/xidmodifications.txt
 * and documented in http://www.unicode.org/reports/tr39/, Unicode Security Mechanisms.
 *
 * The returned set is frozen. Ownership of the set remains with the ICU library; it must not
 * be deleted by the caller.
 *
 * @param status The error code, set if a problem occurs while creating the set.
 *
 * @stable ICU 51
 */
U_CAPI const USet * U_EXPORT2
uspoof_getRecommendedSet(UErrorCode *status);

/**
 * Serialize the data for a spoof detector into a chunk of memory.
 * The flattened spoof detection tables can later be used to efficiently
 * instantiate a new Spoof Detector.
 *
 * The serialized spoof checker includes only the data compiled from the
 * Unicode data tables by uspoof_openFromSource(); it does not include
 * any other state or configuration that may have been set.
 *
 * @param sc   the Spoof Detector whose data is to be serialized.
 * @param data a pointer to 32-bit-aligned memory to be filled with the data,
 *             can be NULL if capacity==0
 * @param capacity the number of bytes available at data,
 *                 or 0 for preflighting
 * @param status an in/out ICU UErrorCode; possible errors include:
 * - U_BUFFER_OVERFLOW_ERROR if the data storage block is too small for serialization
 * - U_ILLEGAL_ARGUMENT_ERROR  the data or capacity parameters are bad
 * @return the number of bytes written or needed for the spoof data
 *
 * @see utrie2_openFromSerialized()
 * @stable ICU 4.2
 */
U_CAPI int32_t U_EXPORT2
uspoof_serialize(USpoofChecker *sc,
                 void *data, int32_t capacity,
                 UErrorCode *status);

U_CDECL_END

#if U_SHOW_CPLUSPLUS_API

U_NAMESPACE_BEGIN

/**
 * \class LocalUSpoofCheckerPointer
 * "Smart pointer" class, closes a USpoofChecker via uspoof_close().
 * For most methods see the LocalPointerBase base class.
 *
 * @see LocalPointerBase
 * @see LocalPointer
 * @stable ICU 4.4
 */
/**
 * \cond
 * Note: Doxygen is giving a bogus warning on this U_DEFINE_LOCAL_OPEN_POINTER.
 *       For now, suppress with a Doxygen cond
 */
U_DEFINE_LOCAL_OPEN_POINTER(LocalUSpoofCheckerPointer, USpoofChecker, uspoof_close);
/** \endcond */

/**
 * \class LocalUSpoofCheckResultPointer
 * "Smart pointer" class, closes a USpoofCheckResult via `uspoof_closeCheckResult()`.
 * For most methods see the LocalPointerBase base class.
 *
 * @see LocalPointerBase
 * @see LocalPointer
 * @stable ICU 58
 */

/**
 * \cond
 * Note: Doxygen is giving a bogus warning on this U_DEFINE_LOCAL_OPEN_POINTER.
 *       For now, suppress with a Doxygen cond
 */
U_DEFINE_LOCAL_OPEN_POINTER(LocalUSpoofCheckResultPointer, USpoofCheckResult, uspoof_closeCheckResult);
/** \endcond */

U_NAMESPACE_END

/**
 * Limit the acceptable characters to those specified by a Unicode Set.
 *   Any previously specified character limit is
 *   replaced by the new settings.  This includes limits on
 *   characters that were set with the uspoof_setAllowedLocales() function.
 *
 * The USPOOF_CHAR_LIMIT test is automatically enabled for this
 * USpoofChecker by this function.
 *
 * @param sc       The USpoofChecker
 * @param chars    A Unicode Set containing the list of
 *                 characters that are permitted.  Ownership of the set
 *                 remains with the caller.  The incoming set is cloned by
 *                 this function, so there are no restrictions on modifying
 *                 or deleting the UnicodeSet after calling this function.
 * @param status   The error code, set if this function encounters a problem.
 * @stable ICU 4.2
 */
U_CAPI void U_EXPORT2
uspoof_setAllowedUnicodeSet(USpoofChecker *sc, const icu::UnicodeSet *chars, UErrorCode *status);


/**
 * Get a UnicodeSet for the characters permitted in an identifier.
 * This corresponds to the limits imposed by the Set Allowed Characters /
 * UnicodeSet functions. Limitations imposed by other checks will not be
 * reflected in the set returned by this function.
 *
 * The returned set will be frozen, meaning that it cannot be modified
 * by the caller.
 *
 * Ownership of the returned set remains with the Spoof Detector.  The
 * returned set will become invalid if the spoof detector is closed,
 * or if a new set of allowed characters is specified.
 *
 *
 * @param sc       The USpoofChecker
 * @param status   The error code, set if this function encounters a problem.
 * @return         A UnicodeSet containing the characters that are permitted by
 *                 the USPOOF_CHAR_LIMIT test.
 * @stable ICU 4.2
 */
U_CAPI const icu::UnicodeSet * U_EXPORT2
uspoof_getAllowedUnicodeSet(const USpoofChecker *sc, UErrorCode *status);

/**
 * Check the specified string for possible security issues.
 * The text to be checked will typically be an identifier of some sort.
 * The set of checks to be performed is specified with uspoof_setChecks().
 *
 * \note
 *   Consider using the newer API, {@link uspoof_check2UnicodeString}, instead.
 *   The newer API exposes additional information from the check procedure
 *   and is otherwise identical to this method.
 *
 * @param sc       The USpoofChecker
 * @param id       An identifier to be checked for possible security issues.
 * @param position Deprecated in ICU 51.  Always returns zero.
 *                 Originally, an out parameter for the index of the first
 *                 string position that failed a check.
 *                 This parameter may be nullptr.
 * @param status   The error code, set if an error occurred while attempting to
 *                 perform the check.
 *                 Spoofing or security issues detected with the input string are
 *                 not reported here, but through the function's return value.
 * @return         An integer value with bits set for any potential security
 *                 or spoofing issues detected.  The bits are defined by
 *                 enum USpoofChecks.  (returned_value & USPOOF_ALL_CHECKS)
 *                 will be zero if the input string passes all of the
 *                 enabled checks.
 * @see uspoof_check2UnicodeString
 * @stable ICU 4.2
 */
U_CAPI int32_t U_EXPORT2
uspoof_checkUnicodeString(const USpoofChecker *sc,
                          const icu::UnicodeString &id,
                          int32_t *position,
                          UErrorCode *status);

/**
 * Check the specified string for possible security issues.
 * The text to be checked will typically be an identifier of some sort.
 * The set of checks to be performed is specified with uspoof_setChecks().
 *
 * @param sc       The USpoofChecker
 * @param id       An identifier to be checked for possible security issues.
 * @param checkResult  An instance of USpoofCheckResult to be filled with
 *                 details about the identifier.  Can be nullptr.
 * @param status   The error code, set if an error occurred while attempting to
 *                 perform the check.
 *                 Spoofing or security issues detected with the input string are
 *                 not reported here, but through the function's return value.
 * @return         An integer value with bits set for any potential security
 *                 or spoofing issues detected.  The bits are defined by
 *                 enum USpoofChecks.  (returned_value & USPOOF_ALL_CHECKS)
 *                 will be zero if the input string passes all of the
 *                 enabled checks.  Any information in this bitmask will be
 *                 consistent with the information saved in the optional
 *                 checkResult parameter.
 * @see uspoof_openCheckResult
 * @see uspoof_check2
 * @see uspoof_check2UTF8
 * @stable ICU 58
 */
U_CAPI int32_t U_EXPORT2
uspoof_check2UnicodeString(const USpoofChecker *sc,
    const icu::UnicodeString &id,
    USpoofCheckResult* checkResult,
    UErrorCode *status);

/**
 * A version of {@link uspoof_areConfusable} accepting UnicodeStrings.
 *
 * @param sc       The USpoofChecker
 * @param s1       The first of the two identifiers to be compared for
 *                 confusability.
 * @param s2       The second of the two identifiers to be compared for
 *                 confusability.
 * @param status   The error code, set if an error occurred while attempting to
 *                 perform the check.
 *                 Confusability of the identifiers is not reported here,
 *                 but through this function's return value.
 * @return         An integer value with bit(s) set corresponding to
 *                 the type of confusability found, as defined by
 *                 enum USpoofChecks.  Zero is returned if the identifiers
 *                 are not confusable.
 *
 * @stable ICU 4.2
 *
 * @see uspoof_areConfusable
 */
U_CAPI int32_t U_EXPORT2
uspoof_areConfusableUnicodeString(const USpoofChecker *sc,
                                  const icu::UnicodeString &s1,
                                  const icu::UnicodeString &s2,
                                  UErrorCode *status);

#ifndef U_HIDE_DRAFT_API
/**
 * A version of {@link uspoof_areBidiConfusable} accepting UnicodeStrings.
 *
 * @param sc       The USpoofChecker
 * @param direction The paragraph direction with which the identifiers are
 *                 displayed.  Must be either UBIDI_LTR or UBIDI_RTL.
 * @param s1       The first of the two identifiers to be compared for
 *                 confusability.
 * @param s2       The second of the two identifiers to be compared for
 *                 confusability.
 * @param status   The error code, set if an error occurred while attempting to
 *                 perform the check.
 *                 Confusability of the identifiers is not reported here,
 *                 but through this function's return value.
 * @return         An integer value with bit(s) set corresponding to
 *                 the type of confusability found, as defined by
 *                 enum USpoofChecks.  Zero is returned if the identifiers
 *                 are not confusable.
 *
 * @draft ICU 74
 *
 * @see uspoof_areBidiConfusable
 */
U_CAPI uint32_t U_EXPORT2 uspoof_areBidiConfusableUnicodeString(const USpoofChecker *sc,
                                                                 UBiDiDirection direction,
                                                                 const icu::UnicodeString &s1,
                                                                 const icu::UnicodeString &s2,
                                                                 UErrorCode *status);
#endif /* U_HIDE_DRAFT_API */

/**
 * Get the "skeleton" for an identifier.
 * Skeletons are a transformation of the input identifier;
 * two identifiers are confusable if their skeletons are identical.
 * See Unicode Technical Standard #39 for additional information.
 *
 * Using skeletons directly makes it possible to quickly check
 * whether an identifier is confusable with any of some large
 * set of existing identifiers, by creating an efficiently
 * searchable collection of the skeletons.
 *
 * @param sc      The USpoofChecker.
 * @param type    Deprecated in ICU 58.  You may pass any number.
 *                Originally, controlled which of the Unicode confusable data
 *                tables to use.
 * @param id      The input identifier whose skeleton will be computed.
 * @param dest    The output identifier, to receive the skeleton string.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 * @return        A reference to the destination (skeleton) string.
 *
 * @stable ICU 4.2
 */
U_I18N_API icu::UnicodeString & U_EXPORT2
uspoof_getSkeletonUnicodeString(const USpoofChecker *sc,
                                uint32_t type,
                                const icu::UnicodeString &id,
                                icu::UnicodeString &dest,
                                UErrorCode *status);

#ifndef U_HIDE_DRAFT_API
/**
 * Get the "bidiSkeleton" for an identifier and a direction.
 * Skeletons are a transformation of the input identifier;
 * two identifiers are LTR-confusable if their LTR bidiSkeletons are identical;
 * they are RTL-confusable if their RTL bidiSkeletons are identical.
 * See Unicode Technical Standard #39 for additional information:
 * https://www.unicode.org/reports/tr39/#Confusable_Detection
 *
 * Using skeletons directly makes it possible to quickly check
 * whether an identifier is confusable with any of some large
 * set of existing identifiers, by creating an efficiently
 * searchable collection of the skeletons.
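 *
 * The "searchable collection of the skeletons" idea can be sketched roughly
 * as below. This is only an illustration under stated assumptions, not ICU
 * code: toySkeleton() is a hypothetical stand-in that folds just three ASCII
 * look-alikes, and SkeletonIndex is a made-up helper name. Real code would
 * compute the key with uspoof_getSkeletonUnicodeString() or
 * uspoof_getBidiSkeletonUnicodeString() instead of toySkeleton().
 *
```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a real skeleton function; NOT an ICU API.
// It folds a few ASCII look-alikes onto one prototype character each.
std::string toySkeleton(const std::string &id) {
    std::string out;
    out.reserve(id.size());
    for (char c : id) {
        switch (c) {
            case '0': out += 'O'; break;  // digit zero resembles capital O
            case '1': out += 'l'; break;  // digit one resembles small L
            case 'I': out += 'l'; break;  // capital i resembles small L
            default:  out += c;   break;
        }
    }
    return out;
}

// Illustrative index of registered identifiers, keyed by their skeleton.
class SkeletonIndex {
public:
    // Registers id. Returns false, and stores nothing, if an already
    // registered identifier has the same skeleton.
    bool add(const std::string &id) {
        return bySkeleton_.emplace(toySkeleton(id), id).second;
    }

    // Returns the registered identifier whose skeleton equals that of id
    // (which may be id itself), or nullptr if there is none.
    const std::string *findConfusable(const std::string &id) const {
        auto it = bySkeleton_.find(toySkeleton(id));
        return it == bySkeleton_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, std::string> bySkeleton_;
};
```
 *
 * With such an index, each query needs one skeleton computation and one hash
 * lookup, rather than a pairwise comparison against every stored identifier.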
 *
 * @param sc      The USpoofChecker.
 * @param direction The context direction with which the identifier will be
 *                displayed.  Must be either UBIDI_LTR or UBIDI_RTL.
 * @param id      The input identifier whose bidiSkeleton will be computed.
 * @param dest    The output identifier, to receive the skeleton string.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 * @return        A reference to the destination (skeleton) string.
 *
 * @draft ICU 74
 */
U_I18N_API icu::UnicodeString &U_EXPORT2 uspoof_getBidiSkeletonUnicodeString(
    const USpoofChecker *sc, UBiDiDirection direction, const icu::UnicodeString &id,
    icu::UnicodeString &dest, UErrorCode *status);
#endif /* U_HIDE_DRAFT_API */

/**
 * Get the set of Candidate Characters for Inclusion in Identifiers, as defined
 * in http://unicode.org/Public/security/latest/xidmodifications.txt
 * and documented in http://www.unicode.org/reports/tr39/, Unicode Security Mechanisms.
 *
 * The returned set is frozen. Ownership of the set remains with the ICU library; it must not
 * be deleted by the caller.
 *
 * @param status The error code, set if a problem occurs while creating the set.
 *
 * @stable ICU 51
 */
U_CAPI const icu::UnicodeSet * U_EXPORT2
uspoof_getInclusionUnicodeSet(UErrorCode *status);

/**
 * Get the set of characters from Recommended Scripts for Inclusion in Identifiers, as defined
 * in http://unicode.org/Public/security/latest/xidmodifications.txt
 * and documented in http://www.unicode.org/reports/tr39/, Unicode Security Mechanisms.
 *
 * The returned set is frozen. Ownership of the set remains with the ICU library; it must not
 * be deleted by the caller.
 *
 * @param status The error code, set if a problem occurs while creating the set.
 *
 * @stable ICU 51
 */
U_CAPI const icu::UnicodeSet * U_EXPORT2
uspoof_getRecommendedUnicodeSet(UErrorCode *status);

#endif /* U_SHOW_CPLUSPLUS_API */

#endif /* UCONFIG_NO_NORMALIZATION */

#endif   /* USPOOF_H */