// © 2016 and later: Unicode, Inc. and others.
// License & terms of use: http://www.unicode.org/copyright.html
/*
**********************************************************************
*   Copyright (C) 2001-2011,2014 IBM and others. All rights reserved.
**********************************************************************
*   Date        Name        Description
*  06/28/2001  synwee      Creation.
**********************************************************************
*/
#ifndef USEARCH_H
#define USEARCH_H

#include "unicode/utypes.h"

#if !UCONFIG_NO_COLLATION && !UCONFIG_NO_BREAK_ITERATION

#include "unicode/ucol.h"
#include "unicode/ucoleitr.h"
#include "unicode/ubrk.h"

#if U_SHOW_CPLUSPLUS_API
#include "unicode/localpointer.h"
#endif   // U_SHOW_CPLUSPLUS_API

/**
 * \file
 * \brief C API: StringSearch
 *
 * C APIs for an engine that provides language-sensitive text searching based
 * on the comparison rules defined in a <code>UCollator</code> data struct,
 * see <code>ucol.h</code>. This ensures that language-specific peculiarities
 * are handled, e.g. for the German collator, characters ß and SS will be matched
 * if case is chosen to be ignored.
 * See the <a href="https://htmlpreview.github.io/?https://github.com/unicode-org/icu-docs/blob/main/design/collation/ICU_collation_design.htm">
 * "ICU Collation Design Document"</a> for more information.
 * <p>
 * As of ICU4C 4.0 / ICU4J 53, the implementation uses a linear search. In previous versions,
 * a modified form of the Boyer-Moore searching algorithm was used. For more information
 * on the modified Boyer-Moore algorithm see
 * <a href="http://icu-project.org/docs/papers/efficient_text_searching_in_java.html">
 * "Efficient Text Searching in Java"</a>, published in <i>Java Report</i>
 * in February, 1999.
 * <p>
 * Two match options are available:<br>
 * Let S' be the sub-string of a text string S between the offsets start and
 * end <start, end>.
 * <br>
 * A pattern string P matches a text string S at the offsets <start, end>
 * if
 * <pre>
 * option 1. Some canonical equivalent of P matches some canonical equivalent
 *           of S'
 * option 2. P matches S' and if P starts or ends with a combining mark,
 *           there exists no non-ignorable combining mark before or after S'
 *           in S respectively.
 * </pre>
 * Option 2 is the default.
 * <p>
 * This search has APIs similar to those of other text iteration mechanisms
 * such as the break iterators in <code>ubrk.h</code>. Using these
 * APIs, it is easy to scan through text looking for all occurrences of
 * a given pattern. This search iterator allows changing of direction by
 * calling a <code>reset</code> followed by a <code>next</code> or <code>previous</code>.
 * Though a direction change can occur without calling <code>reset</code> first,
 * this operation comes with some speed penalty.
 * Generally, the matches found in the forward direction are the same as the
 * matches found in the backward direction, in reverse order.
 * <p>
 * <code>usearch.h</code> provides APIs to specify the starting position
 * within the text string to be searched, e.g. <code>usearch_setOffset</code>,
 * <code>usearch_preceding</code> and <code>usearch_following</code>. Since the
 * starting position is set exactly as specified, without validation, note
 * that searching from the following positions may produce incorrect
 * results:
 * <ul>
 * <li> The midst of a substring that requires normalization.
 * <li> If the following match is to be found, the position should not be the
 *      second character of a pair that must be swapped with the preceding
 *      character. Conversely, if the preceding match is to be found, the
 *      position to search from should not be the first character of a pair
 *      that must be swapped with the next character. E.g. certain Thai and
 *      Lao characters require swapping.
 * <li> If a following pattern match is to be found, any position within a
 *      contracting sequence except the first will fail. Conversely, if a
 *      preceding pattern match is to be found, an invalid starting point
 *      would be any character within a contracting sequence except the last.
 * </ul>
 * <p>
 * A break iterator can be used if only matches at logical breaks are desired.
 * Using a break iterator will only give you results that exactly match the
 * boundaries given by the break iterator. For instance, the pattern "e" will
 * not be found in the string "\u00e9" if a character break iterator is used.
 * <p>
 * Options are provided to handle overlapping matches.
 * E.g. in English, overlapping matches produce the results 0 and 2
 * for the pattern "abab" in the text "ababab", whereas mutually
 * exclusive matches only produce the result 0.
 * <p>
 * Options are also provided to implement "asymmetric search" as described in
 * <a href="http://www.unicode.org/reports/tr10/#Asymmetric_Search">
 * UTS #10 Unicode Collation Algorithm</a>, specifically the USearchAttribute
 * USEARCH_ELEMENT_COMPARISON and its values.
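 * <p>
 * As a rough illustration of the overlapping-match option described above,
 * the following plain C sketch collects match offsets with and without
 * overlap. It is not ICU code: <code>find_all</code> is a hypothetical helper
 * that compares raw bytes with <code>memcmp</code> in place of collation
 * elements, so it only models how the next search position is chosen.
 *
```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper for illustration only; not part of the ICU API.
// Stores the start offsets of the occurrences of pat in text into out
// (at most max_out entries) and returns how many were stored.
// Assumption: raw byte comparison stands in for collation-based matching.
static int find_all(const char *text, const char *pat, int overlap,
                    int *out, int max_out)
{
    size_t tlen = strlen(text);
    size_t plen = strlen(pat);
    size_t i = 0;
    int count = 0;

    if (plen == 0 || plen > tlen) {
        return 0;
    }
    while (i + plen <= tlen && count < max_out) {
        if (memcmp(text + i, pat, plen) == 0) {
            out[count++] = (int)i;
            // Overlapping search resumes one position after the match start;
            // mutually exclusive search resumes at the end of the match.
            i += overlap ? 1 : plen;
        } else {
            i++;
        }
    }
    return count;
}
```
 *
 * Under these assumptions, searching "ababab" for "abab" stores the offsets
 * 0 and 2 when <code>overlap</code> is nonzero, and only the offset 0 when it
 * is zero.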
 * <p>
 * Though collator attributes will be taken into consideration while
 * performing matches, there are no APIs here for setting and getting the
 * attributes. These attributes can be set by getting the collator
 * from <code>usearch_getCollator</code> and using the APIs in <code>ucol.h</code>.
 * Lastly, to update String Search to the new collator attributes,
 * usearch_reset() has to be called.
 * <p>
 * Restriction: <br>
 * Currently there are no composite characters that consist of a
 * character with combining class > 0 before a character with combining
 * class == 0. However, if such a character exists in the future, the
 * search mechanism does not guarantee the results for option 1.
 *
 * <p>
 * Example of use:<br>
 * <pre><code>
 * const char *tgtstr = "The quick brown fox jumped over the lazy fox";
 * const char *patstr = "fox";
 * UChar target[64];
 * UChar pattern[16];
 * UErrorCode status = U_ZERO_ERROR;
 * u_uastrcpy(target, tgtstr);
 * u_uastrcpy(pattern, patstr);
 *
 * UStringSearch *search = usearch_open(pattern, -1, target, -1, "en_US",
 *                                      NULL, &status);
 * if (U_SUCCESS(status)) {
 *     for (int pos = usearch_first(search, &status);
 *          pos != USEARCH_DONE;
 *          pos = usearch_next(search, &status))
 *     {
 *         printf("Found match at %d pos, length is %d\n", pos,
 *                usearch_getMatchedLength(search));
 *     }
 * }
 *
 * usearch_close(search);
 * </code></pre>
 * @stable ICU 2.4
 */

/**
 * DONE is returned by previous() and next() after all valid matches have
 * been returned, and by first() and last() if there are no matches at all.
 * @stable ICU 2.4
 */
#define USEARCH_DONE -1

/**
 * Data structure for searching
 * @stable ICU 2.4
 */
struct UStringSearch;
/**
 * Data structure for searching
 * @stable ICU 2.4
 */
typedef struct UStringSearch UStringSearch;

/**
 * @stable ICU 2.4
 */
typedef enum {
    /**
     * Option for overlapping matches
     * @stable ICU 2.4
     */
    USEARCH_OVERLAP = 0,
#ifndef U_HIDE_DEPRECATED_API
    /**
     * Option for canonical matches; option 1 in header documentation.
     * The default value will be USEARCH_OFF.
     * Note: Setting this option to USEARCH_ON currently has no effect on
     * search behavior, and this option is deprecated. Instead, to control
     * canonical match behavior, you must set UCOL_NORMALIZATION_MODE
     * appropriately (to UCOL_OFF or UCOL_ON) in the UCollator used by
     * the UStringSearch object.
     * @see usearch_openFromCollator
     * @see usearch_getCollator
     * @see usearch_setCollator
     * @see ucol_getAttribute
     * @deprecated ICU 53
     */
    USEARCH_CANONICAL_MATCH = 1,
#endif  /* U_HIDE_DEPRECATED_API */
    /**
     * Option to control how collation elements are compared.
     * The default value will be USEARCH_STANDARD_ELEMENT_COMPARISON.
     * @stable ICU 4.4
     */
    USEARCH_ELEMENT_COMPARISON = 2,

#ifndef U_HIDE_DEPRECATED_API
    /**
     * One more than the highest normal USearchAttribute value.
     * @deprecated ICU 58 The numeric value may change over time, see ICU ticket #12420.
     */
    USEARCH_ATTRIBUTE_COUNT = 3
#endif  /* U_HIDE_DEPRECATED_API */
} USearchAttribute;

/**
 * @stable ICU 2.4
 */
typedef enum {
    /**
     * Default value for any USearchAttribute
     * @stable ICU 2.4
     */
    USEARCH_DEFAULT = -1,
    /**
     * Value for USEARCH_OVERLAP and USEARCH_CANONICAL_MATCH
     * @stable ICU 2.4
     */
    USEARCH_OFF,
    /**
     * Value for USEARCH_OVERLAP and USEARCH_CANONICAL_MATCH
     * @stable ICU 2.4
     */
    USEARCH_ON,
    /**
     * Value (default) for USEARCH_ELEMENT_COMPARISON;
     * standard collation element comparison at the specified collator
     * strength.
     * @stable ICU 4.4
     */
    USEARCH_STANDARD_ELEMENT_COMPARISON,
    /**
     * Value for USEARCH_ELEMENT_COMPARISON;
     * collation element comparison is modified to effectively provide
     * behavior between the specified strength and strength - 1. Collation
     * elements in the pattern that have the base weight for the specified
     * strength are treated as "wildcards" that match an element with any
     * other weight at that collation level in the searched text. For
     * example, with a secondary-strength English collator, a plain 'e' in
     * the pattern will match a plain e or an e with any diacritic in the
     * searched text, but an e with diacritic in the pattern will only
     * match an e with the same diacritic in the searched text.
     *
     * This supports "asymmetric search" as described in
     * <a href="http://www.unicode.org/reports/tr10/#Asymmetric_Search">
     * UTS #10 Unicode Collation Algorithm</a>.
     *
     * @stable ICU 4.4
     */
    USEARCH_PATTERN_BASE_WEIGHT_IS_WILDCARD,
    /**
     * Value for USEARCH_ELEMENT_COMPARISON;
     * collation element comparison is modified to effectively provide
     * behavior between the specified strength and strength - 1. Collation
     * elements in either the pattern or the searched text that have the
     * base weight for the specified strength are treated as "wildcards"
     * that match an element with any other weight at that collation level.
     * For example, with a secondary-strength English collator, a plain 'e'
     * in the pattern will match a plain e or an e with any diacritic in the
     * searched text, but an e with diacritic in the pattern will only
     * match an e with the same diacritic or a plain e in the searched text.
     *
     * This option is similar to "asymmetric search" as described in
     * [UTS #10 Unicode Collation Algorithm](http://www.unicode.org/reports/tr10/#Asymmetric_Search),
     * but also allows unmarked characters in the searched text to match
     * marked or unmarked versions of that character in the pattern.
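     *
     * To make the difference between the three USEARCH_ELEMENT_COMPARISON
     * values concrete, here is a deliberately simplified C model. It is a
     * sketch under stated assumptions, not how ICU compares collation
     * elements internally: <code>SketchCE</code>, <code>SketchMode</code> and
     * <code>sketch_ce_match</code> are hypothetical names, and each element is
     * reduced to a primary weight (the letter) plus one secondary weight
     * (the diacritic), with 0 standing for the base (unmarked) weight.
     *
```c
#include <stdbool.h>

// Hypothetical stand-ins for the three ICU attribute values; not ICU types.
typedef enum {
    SKETCH_STANDARD,                 // like USEARCH_STANDARD_ELEMENT_COMPARISON
    SKETCH_PATTERN_BASE_IS_WILDCARD, // like USEARCH_PATTERN_BASE_WEIGHT_IS_WILDCARD
    SKETCH_ANY_BASE_IS_WILDCARD      // like USEARCH_ANY_BASE_WEIGHT_IS_WILDCARD
} SketchMode;

// Toy collation element: primary weight identifies the letter, secondary
// weight identifies the diacritic. SKETCH_BASE marks an unmarked letter.
#define SKETCH_BASE 0

typedef struct {
    int primary;
    int secondary;
} SketchCE;

// Returns true if a pattern element matches a text element at secondary
// strength under the given comparison mode (assumed model, see lead-in).
static bool sketch_ce_match(SketchCE pat, SketchCE txt, SketchMode mode)
{
    if (pat.primary != txt.primary) {
        return false;                // different letters never match
    }
    if (pat.secondary == txt.secondary) {
        return true;                 // identical weights match in every mode
    }
    switch (mode) {
    case SKETCH_PATTERN_BASE_IS_WILDCARD:
        // Only an unmarked pattern element acts as a wildcard.
        return pat.secondary == SKETCH_BASE;
    case SKETCH_ANY_BASE_IS_WILDCARD:
        // An unmarked element on either side acts as a wildcard.
        return pat.secondary == SKETCH_BASE || txt.secondary == SKETCH_BASE;
    case SKETCH_STANDARD:
    default:
        return false;
    }
}
```
     *
     * In this model a plain e in the pattern matches an accented e in the
     * text under both wildcard modes, while an accented e in the pattern
     * matches a plain e in the text only under the "any base weight" mode.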
     *
     * @stable ICU 4.4
     */
    USEARCH_ANY_BASE_WEIGHT_IS_WILDCARD,

#ifndef U_HIDE_DEPRECATED_API
    /**
     * One more than the highest normal USearchAttributeValue value.
     * @deprecated ICU 58 The numeric value may change over time, see ICU ticket #12420.
     */
    USEARCH_ATTRIBUTE_VALUE_COUNT
#endif  /* U_HIDE_DEPRECATED_API */
} USearchAttributeValue;

/* open and close ------------------------------------------------------ */

/**
 * Creates a String Search iterator data struct using the argument locale language
 * rule set. A collator will be created in the process, which will be owned by
 * this String Search and will be deleted in <code>usearch_close</code>.
 *
 * The UStringSearch retains a pointer to both the pattern and text strings.
 * The caller must not modify or delete them while using the UStringSearch.
 *
 * @param pattern for matching
 * @param patternlength length of the pattern, -1 for null-termination
 * @param text text string
 * @param textlength length of the text string, -1 for null-termination
 * @param locale name of locale for the rules to be used
 * @param breakiter A BreakIterator that will be used to restrict the points
 *                  at which matches are detected. If a match is found, but
 *                  the match's start or end index is not a boundary as
 *                  determined by the <code>BreakIterator</code>, the match will
 *                  be rejected and another will be searched for.
 *                  If this parameter is <code>NULL</code>, no break detection is
 *                  attempted.
 * @param status receives the error code if an error occurs. If pattern or
 *               text is NULL, or if patternlength or textlength is 0, then
 *               a U_ILLEGAL_ARGUMENT_ERROR is returned.
 * @return search iterator data structure, or NULL if there is an error.
 * @stable ICU 2.4
 */
U_CAPI UStringSearch * U_EXPORT2 usearch_open(const UChar          *pattern,
                                              int32_t         patternlength,
                                        const UChar          *text,
                                              int32_t         textlength,
                                        const char           *locale,
                                              UBreakIterator *breakiter,
                                              UErrorCode     *status);

/**
 * Creates a String Search iterator data struct using the argument collator language
 * rule set. Note, user retains the ownership of this collator, thus the
 * responsibility of deletion lies with the user.
 *
 * NOTE: String Search cannot be instantiated from a collator that has
 * collate digits as numbers (CODAN) turned on (UCOL_NUMERIC_COLLATION).
 *
 * The UStringSearch retains a pointer to both the pattern and text strings.
 * The caller must not modify or delete them while using the UStringSearch.
 *
 * @param pattern for matching
 * @param patternlength length of the pattern, -1 for null-termination
 * @param text text string
 * @param textlength length of the text string, -1 for null-termination
 * @param collator used for the language rules
 * @param breakiter A BreakIterator that will be used to restrict the points
 *                  at which matches are detected. If a match is found, but
 *                  the match's start or end index is not a boundary as
 *                  determined by the <code>BreakIterator</code>, the match will
 *                  be rejected and another will be searched for.
 *                  If this parameter is <code>NULL</code>, no break detection is
 *                  attempted.
 * @param status receives the error code if an error occurs. If collator,
 *               pattern or text is NULL, or if patternlength or textlength
 *               is 0, then a U_ILLEGAL_ARGUMENT_ERROR is returned.
 * @return search iterator data structure, or NULL if there is an error.
 * @stable ICU 2.4
 */
U_CAPI UStringSearch * U_EXPORT2 usearch_openFromCollator(
                                         const UChar          *pattern,
                                               int32_t         patternlength,
                                         const UChar          *text,
                                               int32_t         textlength,
                                         const UCollator      *collator,
                                               UBreakIterator *breakiter,
                                               UErrorCode     *status);

/**
 * Destroys and cleans up the String Search iterator data struct.
 * If a collator was created in <code>usearch_open</code>, then it will be destroyed here.
 * @param searchiter The UStringSearch to clean up
 * @stable ICU 2.4
 */
U_CAPI void U_EXPORT2 usearch_close(UStringSearch *searchiter);

#if U_SHOW_CPLUSPLUS_API

U_NAMESPACE_BEGIN

/**
 * \class LocalUStringSearchPointer
 * "Smart pointer" class, closes a UStringSearch via usearch_close().
 * For most methods see the LocalPointerBase base class.
371*0e209d39SAndroid Build Coastguard Worker * 372*0e209d39SAndroid Build Coastguard Worker * @see LocalPointerBase 373*0e209d39SAndroid Build Coastguard Worker * @see LocalPointer 374*0e209d39SAndroid Build Coastguard Worker * @stable ICU 4.4 375*0e209d39SAndroid Build Coastguard Worker */ 376*0e209d39SAndroid Build Coastguard Worker U_DEFINE_LOCAL_OPEN_POINTER(LocalUStringSearchPointer, UStringSearch, usearch_close); 377*0e209d39SAndroid Build Coastguard Worker 378*0e209d39SAndroid Build Coastguard Worker U_NAMESPACE_END 379*0e209d39SAndroid Build Coastguard Worker 380*0e209d39SAndroid Build Coastguard Worker #endif 381*0e209d39SAndroid Build Coastguard Worker 382*0e209d39SAndroid Build Coastguard Worker /* get and set methods -------------------------------------------------- */ 383*0e209d39SAndroid Build Coastguard Worker 384*0e209d39SAndroid Build Coastguard Worker /** 385*0e209d39SAndroid Build Coastguard Worker * Sets the current position in the text string which the next search will 386*0e209d39SAndroid Build Coastguard Worker * start from. Clears previous states. 387*0e209d39SAndroid Build Coastguard Worker * This method takes the argument index and sets the position in the text 388*0e209d39SAndroid Build Coastguard Worker * string accordingly without checking if the index is pointing to a 389*0e209d39SAndroid Build Coastguard Worker * valid starting point to begin searching. 390*0e209d39SAndroid Build Coastguard Worker * Search positions that may render incorrect results are highlighted in the 391*0e209d39SAndroid Build Coastguard Worker * header comments 392*0e209d39SAndroid Build Coastguard Worker * @param strsrch search iterator data struct 393*0e209d39SAndroid Build Coastguard Worker * @param position position to start next search from. 
 *                 If position is outside
 *                 the text range for searching,
 *                 a U_INDEX_OUTOFBOUNDS_ERROR will be returned
 * @param status error status if any.
 * @stable ICU 2.4
 */
U_CAPI void U_EXPORT2 usearch_setOffset(UStringSearch *strsrch,
                                        int32_t        position,
                                        UErrorCode    *status);

/**
 * Return the current index in the string text being searched.
 * If the iteration has gone past the end of the text (or past the beginning
 * for a backwards search), <code>USEARCH_DONE</code> is returned.
 * @param strsrch search iterator data struct
 * @see #USEARCH_DONE
 * @stable ICU 2.4
 */
U_CAPI int32_t U_EXPORT2 usearch_getOffset(const UStringSearch *strsrch);

/**
 * Sets the text searching attributes located in the enum USearchAttribute
 * with values from the enum USearchAttributeValue.
 * <code>USEARCH_DEFAULT</code> can be used for all attributes for resetting.
 * @param strsrch search iterator data struct
 * @param attribute text attribute to be set
 * @param value text attribute value
 * @param status for errors if it occurs
 * @see #usearch_getAttribute
 * @stable ICU 2.4
 */
U_CAPI void U_EXPORT2 usearch_setAttribute(UStringSearch         *strsrch,
                                           USearchAttribute       attribute,
                                           USearchAttributeValue  value,
                                           UErrorCode            *status);

/**
 * Gets the text searching attributes.
 * @param strsrch search iterator data struct
 * @param attribute text attribute to be retrieved
 * @return text attribute value
 * @see #usearch_setAttribute
 * @stable ICU 2.4
 */
U_CAPI USearchAttributeValue U_EXPORT2 usearch_getAttribute(
                                         const UStringSearch    *strsrch,
                                               USearchAttribute  attribute);

/**
 * Returns the index to the match in the text string that was searched.
 * This call returns a valid result only after a successful call to
 * <code>usearch_first</code>, <code>usearch_next</code>, <code>usearch_previous</code>,
 * or <code>usearch_last</code>.
 * Just after construction, or after a searching method returns
 * <code>USEARCH_DONE</code>, this method will return <code>USEARCH_DONE</code>.
 * <p>
 * Use <code>usearch_getMatchedLength</code> to get the matched string length.
 * @param strsrch search iterator data struct
 * @return index to a substring within the text string that is being
 *         searched.
 * @see #usearch_first
 * @see #usearch_next
 * @see #usearch_previous
 * @see #usearch_last
 * @see #USEARCH_DONE
 * @stable ICU 2.4
 */
U_CAPI int32_t U_EXPORT2 usearch_getMatchedStart(
                                               const UStringSearch *strsrch);

/**
 * Returns the length of text in the string which matches the search pattern.
 * This call returns a valid result only after a successful call to
 * <code>usearch_first</code>, <code>usearch_next</code>, <code>usearch_previous</code>,
 * or <code>usearch_last</code>.
 * Just after construction, or after a searching method returns
 * <code>USEARCH_DONE</code>, this method will return 0.
 * @param strsrch search iterator data struct
 * @return The length of the match in the string text, or 0 if there is no
 *         match currently.
 * @see #usearch_first
 * @see #usearch_next
 * @see #usearch_previous
 * @see #usearch_last
 * @see #USEARCH_DONE
 * @stable ICU 2.4
 */
U_CAPI int32_t U_EXPORT2 usearch_getMatchedLength(
                                               const UStringSearch *strsrch);

/**
 * Returns the text that was matched by the most recent call to
 * <code>usearch_first</code>, <code>usearch_next</code>, <code>usearch_previous</code>,
 * or <code>usearch_last</code>.
 * If the iterator is not pointing at a valid match (e.g.
 * just after
 * construction or after <code>USEARCH_DONE</code> has been returned), returns
 * an empty string. If result is not large enough to store the matched text,
 * result will be filled with the partial text and a U_BUFFER_OVERFLOW_ERROR
 * will be returned in status. result will be null-terminated whenever
 * possible. If the buffer fits the matched text exactly, null-termination
 * is not possible, and a U_STRING_NOT_TERMINATED_ERROR is set in status.
 * Pre-flighting can be done either with length = 0 or with the API
 * <code>usearch_getMatchedLength</code>.
 * @param strsrch search iterator data struct
 * @param result UChar buffer to store the matched string
 * @param resultCapacity length of the result buffer
 * @param status error returned if result is not large enough
 * @return exact length of the matched text, not counting the null-termination
 * @see #usearch_first
 * @see #usearch_next
 * @see #usearch_previous
 * @see #usearch_last
 * @see #USEARCH_DONE
 * @stable ICU 2.4
 */
U_CAPI int32_t U_EXPORT2 usearch_getMatchedText(const UStringSearch
                                                      *strsrch,
                                                UChar *result,
                                                int32_t resultCapacity,
                                                UErrorCode *status);

#if !UCONFIG_NO_BREAK_ITERATION

/**
 * Set the BreakIterator that will be used to restrict the points at which
 * matches are detected.
 * @param strsrch search iterator data struct
 * @param breakiter A BreakIterator that will be used to restrict the points
 *                  at which matches are detected. If a match is found, but
 *                  the match's start or end index is not a boundary as
 *                  determined by the <code>BreakIterator</code>, the match will
 *                  be rejected and another will be searched for.
 *                  If this parameter is <code>NULL</code>, no break detection is
 *                  attempted.
 * @param status for errors if it occurs
 * @see #usearch_getBreakIterator
 * @stable ICU 2.4
 */
U_CAPI void U_EXPORT2 usearch_setBreakIterator(UStringSearch  *strsrch,
                                               UBreakIterator *breakiter,
                                               UErrorCode     *status);

/**
 * Returns the BreakIterator that is used to restrict the points at which
 * matches are detected. This will be the same object that was passed to the
 * constructor or to <code>usearch_setBreakIterator</code>. Note that
 * <code>NULL</code>
 * is a legal value; it means that break detection should not be attempted.
 * @param strsrch search iterator data struct
 * @return break iterator used
 * @see #usearch_setBreakIterator
 * @stable ICU 2.4
 */
U_CAPI const UBreakIterator * U_EXPORT2 usearch_getBreakIterator(
                                              const UStringSearch *strsrch);

#endif

/**
 * Set the string text to be searched.
 * Text iteration will hence begin at the
 * start of the text string. This method is useful if you want to re-use an
 * iterator to search for the same pattern within a different body of text.
 *
 * The UStringSearch retains a pointer to the text string. The caller must not
 * modify or delete the string while using the UStringSearch.
 *
 * @param strsrch search iterator data struct
 * @param text new string to look for match
 * @param textlength length of the new string, -1 for null-termination
 * @param status for errors if it occurs. If text is NULL, or textlength is 0
 *               then a U_ILLEGAL_ARGUMENT_ERROR is returned with no change
 *               done to strsrch.
 * @see #usearch_getText
 * @stable ICU 2.4
 */
U_CAPI void U_EXPORT2 usearch_setText(      UStringSearch *strsrch,
                                      const UChar         *text,
                                            int32_t        textlength,
                                            UErrorCode    *status);

/**
 * Return the string text to be searched.
 * @param strsrch search iterator data struct
 * @param length returned string text length
 * @return string text
 * @see #usearch_setText
 * @stable ICU 2.4
 */
U_CAPI const UChar * U_EXPORT2 usearch_getText(const UStringSearch *strsrch,
                                               int32_t             *length);

/**
 * Gets the collator used for the language rules.
 * <p>
 * Deleting the returned <code>UCollator</code> before calling
 * <code>usearch_close</code> would cause the string search to fail.
 * <code>usearch_close</code> will delete the collator if this search owns it.
 * @param strsrch search iterator data struct
 * @return collator
 * @stable ICU 2.4
 */
U_CAPI UCollator * U_EXPORT2 usearch_getCollator(
                                         const UStringSearch *strsrch);

/**
 * Sets the collator used for the language rules. User retains the ownership
 * of this collator, thus the responsibility of deletion lies with the user.
 * This method causes internal data such as the pattern collation elements
 * and shift tables to be recalculated, but the iterator's position is unchanged.
 * @param strsrch search iterator data struct
 * @param collator to be used
 * @param status for errors if it occurs
 * @stable ICU 2.4
 */
U_CAPI void U_EXPORT2 usearch_setCollator(      UStringSearch *strsrch,
                                          const UCollator     *collator,
                                                UErrorCode    *status);

/**
 * Sets the pattern used for matching.
 * Internal data like the pattern collation elements will be recalculated, but the
 * iterator's position is unchanged.
 *
 * The UStringSearch retains a pointer to the pattern string. The caller must not
 * modify or delete the string while using the UStringSearch.
 *
 * @param strsrch search iterator data struct
 * @param pattern string
 * @param patternlength pattern length, -1 for null-terminated string
 * @param status for errors if it occurs.
 *               If pattern is NULL, or patternlength is 0
 *               then a U_ILLEGAL_ARGUMENT_ERROR is returned with no change
 *               done to strsrch.
 * @stable ICU 2.4
 */
U_CAPI void U_EXPORT2 usearch_setPattern(      UStringSearch *strsrch,
                                         const UChar         *pattern,
                                               int32_t        patternlength,
                                               UErrorCode    *status);

/**
 * Gets the search pattern.
 * @param strsrch search iterator data struct
 * @param length return length of the pattern, -1 indicates that the pattern
 *               is null-terminated
 * @return pattern string
 * @stable ICU 2.4
 */
U_CAPI const UChar * U_EXPORT2 usearch_getPattern(
                                               const UStringSearch *strsrch,
                                                     int32_t       *length);

/* methods ------------------------------------------------------------- */

/**
 * Returns the first index at which the string text matches the search
 * pattern.
 * The iterator is adjusted so that its current index (as returned by
 * <code>usearch_getOffset</code>) is the match position if one was found.
 * If a match is not found, <code>USEARCH_DONE</code> will be returned and
 * the iterator will be adjusted to the index <code>USEARCH_DONE</code>.
 * @param strsrch search iterator data struct
 * @param status for errors if it occurs
 * @return The character index of the first match, or
 * <code>USEARCH_DONE</code> if there are no matches.
 * @see #usearch_getOffset
 * @see #USEARCH_DONE
 * @stable ICU 2.4
 */
U_CAPI int32_t U_EXPORT2 usearch_first(UStringSearch *strsrch,
                                       UErrorCode    *status);

/**
 * Returns the first index equal to or greater than <code>position</code> at which
 * the string text
 * matches the search pattern. The iterator is adjusted so that its current
 * index (as returned by <code>usearch_getOffset</code>) is the match position if
 * one was found.
 * If a match is not found, <code>USEARCH_DONE</code> will be returned and
 * the iterator will be adjusted to the index <code>USEARCH_DONE</code>.
 * <p>
 * Search positions that may render incorrect results are highlighted in the
 * header comments. If position is outside the text range
 * for searching, a U_INDEX_OUTOFBOUNDS_ERROR will be returned.
 * @param strsrch search iterator data struct
 * @param position to start the search at
 * @param status for errors if it occurs
 * @return The character index of the first match following <code>position</code>,
 *         or <code>USEARCH_DONE</code> if there are no matches.
 * @see #usearch_getOffset
 * @see #USEARCH_DONE
 * @stable ICU 2.4
 */
U_CAPI int32_t U_EXPORT2 usearch_following(UStringSearch *strsrch,
                                           int32_t        position,
                                           UErrorCode    *status);

/**
 * Returns the last index in the target text at which it matches the search
 * pattern.
 * The iterator is adjusted so that its current
 * index (as returned by <code>usearch_getOffset</code>) is the match position if
 * one was found.
 * If a match is not found, <code>USEARCH_DONE</code> will be returned and
 * the iterator will be adjusted to the index <code>USEARCH_DONE</code>.
 * @param strsrch search iterator data struct
 * @param status for errors if it occurs
 * @return The index of the last match, or <code>USEARCH_DONE</code> if there
 *         are no matches.
 * @see #usearch_getOffset
 * @see #USEARCH_DONE
 * @stable ICU 2.4
 */
U_CAPI int32_t U_EXPORT2 usearch_last(UStringSearch *strsrch,
                                      UErrorCode    *status);

/**
 * Returns the first index less than <code>position</code> at which the string text
 * matches the search pattern. The iterator is adjusted so that its current
 * index (as returned by <code>usearch_getOffset</code>) is the match position if
 * one was found.
 * If a match is not found, <code>USEARCH_DONE</code> will be returned and
 * the iterator will be adjusted to the index <code>USEARCH_DONE</code>.
 * <p>
 * Search positions that may render incorrect results are highlighted in the
 * header comments. If position is outside the text range
 * for searching, a U_INDEX_OUTOFBOUNDS_ERROR will be returned.
 * <p>
 * When the <code>USEARCH_OVERLAP</code> option is off, the last index of the
 * result match is always less than <code>position</code>.
 * When <code>USEARCH_OVERLAP</code> is on, the result match may span across
 * <code>position</code>.
 * @param strsrch search iterator data struct
 * @param position index position the search is to begin at
 * @param status for errors if it occurs
 * @return The character index of the first match preceding <code>position</code>,
 *         or <code>USEARCH_DONE</code> if there are no matches.
 * @see #usearch_getOffset
 * @see #USEARCH_DONE
 * @stable ICU 2.4
 */
U_CAPI int32_t U_EXPORT2 usearch_preceding(UStringSearch *strsrch,
                                           int32_t        position,
                                           UErrorCode    *status);

/**
 * Returns the index of the next point at which the string text matches the
 * search pattern, starting from the current position.
 * The iterator is adjusted so that its current
 * index (as returned by <code>usearch_getOffset</code>) is the match position if
 * one was found.
 * If a match is not found, <code>USEARCH_DONE</code> will be returned and
 * the iterator will be adjusted to the index <code>USEARCH_DONE</code>.
 * @param strsrch search iterator data struct
 * @param status for errors if it occurs
 * @return The index of the next match after the current position, or
 *         <code>USEARCH_DONE</code> if there are no more matches.
 * @see #usearch_first
 * @see #usearch_getOffset
 * @see #USEARCH_DONE
 * @stable ICU 2.4
 */
U_CAPI int32_t U_EXPORT2 usearch_next(UStringSearch *strsrch,
                                      UErrorCode    *status);

/**
 * Returns the index of the previous point at which the string text matches
 * the search pattern, starting at the current position.
 * The iterator is adjusted so that its current
 * index (as returned by <code>usearch_getOffset</code>) is the match position if
 * one was found.
 * If a match is not found, <code>USEARCH_DONE</code> will be returned and
 * the iterator will be adjusted to the index <code>USEARCH_DONE</code>.
 * @param strsrch search iterator data struct
 * @param status for errors if it occurs
 * @return The index of the previous match before the current position,
 *         or <code>USEARCH_DONE</code> if there are no more matches.
 * @see #usearch_last
 * @see #usearch_getOffset
 * @see #USEARCH_DONE
 * @stable ICU 2.4
 */
U_CAPI int32_t U_EXPORT2 usearch_previous(UStringSearch *strsrch,
                                          UErrorCode    *status);

/**
 * Reset the iteration.
 * Search will begin at the start of the text string if a forward iteration
 * is initiated before a backwards iteration. Otherwise if a backwards
 * iteration is initiated before a forwards iteration, the search will begin
 * at the end of the text string.
 * @param strsrch search iterator data struct
 * @see #usearch_first
 * @stable ICU 2.4
 */
U_CAPI void U_EXPORT2 usearch_reset(UStringSearch *strsrch);

#ifndef U_HIDE_INTERNAL_API
/**
 * Simple forward search for the pattern, starting at a specified index,
 * and using a default set of search options.
*
* This is an experimental function, and is not an official part of the
* ICU API.
*
* The collator options, such as UCOL_STRENGTH and UCOL_NORMALIZATION_MODE, are honored.
*
* The UStringSearch options USEARCH_CANONICAL_MATCH, USEARCH_OVERLAP and
* any Break Iterator are ignored.
*
* Matches obey the following constraints:
*
* Characters at the start or end positions of a match that are ignorable
* for collation are not included as part of the match, unless they
* are part of a combining sequence, as described below.
*
* A match will not include a partial combining sequence. Combining
* character sequences are considered to be inseparable units,
* and either match the pattern completely, or are considered to not match
* at all. Thus, for example, an A followed by a combining accent mark will
* not be found when searching for a plain (unaccented) A (unless
* the collation strength has been set to ignore all accents).
*
* When beginning a search, the initial starting position, startIdx,
* is assumed to be an acceptable match boundary with respect to
* combining characters. A combining sequence that spans across the
* starting point will not suppress a match beginning at startIdx.
*
* Characters that expand to multiple collation elements
* (German sharp-S becoming 'ss', or the composed forms of accented
* characters, for example) also must match completely.
* Searching for a single 's' in a string containing only a sharp-s will
* find no match.
*
*
* @param strsrch the UStringSearch struct, which references both
*                the text to be searched and the pattern being sought.
* @param startIdx The index into the text to begin the search.
* @param matchStart An out parameter, the starting index of the matched text.
*                This parameter may be NULL.
*                A value of -1 will be returned if no match was found.
* @param matchLimit Out parameter, the index of the first position following the matched text.
*                The matchLimit will be at a suitable position for beginning a subsequent search
*                in the input text.
*                This parameter may be NULL.
*                A value of -1 will be returned if no match was found.
*
* @param status Report any errors. Note that no match found is not an error.
* @return true if a match was found, false otherwise.
*
* @internal
*/
U_CAPI UBool U_EXPORT2 usearch_search(UStringSearch *strsrch,
                                      int32_t startIdx,
                                      int32_t *matchStart,
                                      int32_t *matchLimit,
                                      UErrorCode *status);

/**
* Simple backwards search for the pattern, starting at a specified index,
* and using a default set of search options.
*
* This is an experimental function, and is not an official part of the
* ICU API.
*
* The collator options, such as UCOL_STRENGTH and UCOL_NORMALIZATION_MODE, are honored.
*
* The UStringSearch options USEARCH_CANONICAL_MATCH, USEARCH_OVERLAP and
* any Break Iterator are ignored.
*
* Matches obey the following constraints:
*
* Characters at the start or end positions of a match that are ignorable
* for collation are not included as part of the match, unless they
* are part of a combining sequence, as described below.
*
* A match will not include a partial combining sequence. Combining
* character sequences are considered to be inseparable units,
* and either match the pattern completely, or are considered to not match
* at all. Thus, for example, an A followed by a combining accent mark will
* not be found when searching for a plain (unaccented) A (unless
* the collation strength has been set to ignore all accents).
*
* When beginning a search, the initial starting position, startIdx,
* is assumed to be an acceptable match boundary with respect to
* combining characters. A combining sequence that spans across the
* starting point will not suppress a match beginning at startIdx.
*
* Characters that expand to multiple collation elements
* (German sharp-S becoming 'ss', or the composed forms of accented
* characters, for example) also must match completely.
* Searching for a single 's' in a string containing only a sharp-s will
* find no match.
*
*
* @param strsrch the UStringSearch struct, which references both
*                the text to be searched and the pattern being sought.
* @param startIdx The index into the text to begin the search.
* @param matchStart An out parameter, the starting index of the matched text.
*                This parameter may be NULL.
*                A value of -1 will be returned if no match was found.
* @param matchLimit Out parameter, the index of the first position following the matched text.
*                The matchLimit will be at a suitable position for beginning a subsequent search
*                in the input text.
*                This parameter may be NULL.
*                A value of -1 will be returned if no match was found.
*
* @param status Report any errors. Note that no match found is not an error.
* @return true if a match was found, false otherwise.
*
* @internal
*/
U_CAPI UBool U_EXPORT2 usearch_searchBackwards(UStringSearch *strsrch,
                                               int32_t startIdx,
                                               int32_t *matchStart,
                                               int32_t *matchLimit,
                                               UErrorCode *status);
#endif  /* U_HIDE_INTERNAL_API */

#endif /* #if !UCONFIG_NO_COLLATION && !UCONFIG_NO_BREAK_ITERATION */

#endif