<html>
<head>
<title>pcre2syntax specification</title>
</head>
<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
<h1>pcre2syntax man page</h1>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
</p>
<p>
This page is part of the PCRE2 HTML documentation. It was generated
automatically from the original man page. If there is any nonsense in it,
please consult the man page, in case the conversion went wrong.
<br>
<ul>
<li><a name="TOC1" href="#SEC1">PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY</a>
<li><a name="TOC2" href="#SEC2">QUOTING</a>
<li><a name="TOC3" href="#SEC3">BRACED ITEMS</a>
<li><a name="TOC4" href="#SEC4">ESCAPED CHARACTERS</a>
<li><a name="TOC5" href="#SEC5">CHARACTER TYPES</a>
<li><a name="TOC6" href="#SEC6">GENERAL CATEGORY PROPERTIES FOR \p and \P</a>
<li><a name="TOC7" href="#SEC7">PCRE2 SPECIAL CATEGORY PROPERTIES FOR \p and \P</a>
<li><a name="TOC8" href="#SEC8">BINARY PROPERTIES FOR \p AND \P</a>
<li><a name="TOC9" href="#SEC9">SCRIPT MATCHING WITH \p AND \P</a>
<li><a name="TOC10" href="#SEC10">THE BIDI_CLASS PROPERTY FOR \p AND \P</a>
<li><a name="TOC11" href="#SEC11">CHARACTER CLASSES</a>
<li><a name="TOC12" href="#SEC12">QUANTIFIERS</a>
<li><a name="TOC13" href="#SEC13">ANCHORS AND SIMPLE
ASSERTIONS</a>
<li><a name="TOC14" href="#SEC14">REPORTED MATCH POINT SETTING</a>
<li><a name="TOC15" href="#SEC15">ALTERNATION</a>
<li><a name="TOC16" href="#SEC16">CAPTURING</a>
<li><a name="TOC17" href="#SEC17">ATOMIC GROUPS</a>
<li><a name="TOC18" href="#SEC18">COMMENT</a>
<li><a name="TOC19" href="#SEC19">OPTION SETTING</a>
<li><a name="TOC20" href="#SEC20">NEWLINE CONVENTION</a>
<li><a name="TOC21" href="#SEC21">WHAT \R MATCHES</a>
<li><a name="TOC22" href="#SEC22">LOOKAHEAD AND LOOKBEHIND ASSERTIONS</a>
<li><a name="TOC23" href="#SEC23">NON-ATOMIC LOOKAROUND ASSERTIONS</a>
<li><a name="TOC24" href="#SEC24">SCRIPT RUNS</a>
<li><a name="TOC25" href="#SEC25">BACKREFERENCES</a>
<li><a name="TOC26" href="#SEC26">SUBROUTINE REFERENCES (POSSIBLY RECURSIVE)</a>
<li><a name="TOC27" href="#SEC27">CONDITIONAL PATTERNS</a>
<li><a name="TOC28" href="#SEC28">BACKTRACKING CONTROL</a>
<li><a name="TOC29" href="#SEC29">CALLOUTS</a>
<li><a name="TOC30" href="#SEC30">SEE ALSO</a>
<li><a name="TOC31" href="#SEC31">AUTHOR</a>
<li><a name="TOC32" href="#SEC32">REVISION</a>
</ul>
<br><a name="SEC1" href="#TOC1">PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY</a><br>
<P>
The full syntax and semantics of the regular expressions that are supported by
PCRE2 are described in the
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
documentation. This document contains a quick-reference summary of the syntax.
</P>
<br><a name="SEC2" href="#TOC1">QUOTING</a><br>
<P>
<pre>
  \x         where x is non-alphanumeric is a literal x
  \Q...\E    treat enclosed characters as literal
</pre>
Note that white space inside \Q...\E is always treated as literal, even if
PCRE2_EXTENDED is set, causing most other white space to be ignored.
</P>
<br><a name="SEC3" href="#TOC1">BRACED ITEMS</a><br>
<P>
With one exception, wherever brace characters { and } are required to enclose
data for constructions such as \g{2} or \k{name}, space and/or horizontal tab
characters that follow { or precede } are allowed and are ignored. In the case
of quantifiers, they may also appear before or after the comma. The exception
is \u{...} which is not Perl-compatible and is recognized only when
PCRE2_EXTRA_ALT_BSUX is set. This is an ECMAScript compatibility feature, and
follows ECMAScript's behaviour.
</P>
<br><a name="SEC4" href="#TOC1">ESCAPED CHARACTERS</a><br>
<P>
This table applies to ASCII and Unicode environments. An unrecognized escape
sequence causes an error.
<pre>
  \a         alarm, that is, the BEL character (hex 07)
  \cx        "control-x", where x is a non-control ASCII character
  \e         escape (hex 1B)
  \f         form feed (hex 0C)
  \n         newline (hex 0A)
  \r         carriage return (hex 0D)
  \t         tab (hex 09)
  \0dd       character with octal code 0dd
  \ddd       character with octal code ddd, or backreference
  \o{ddd..}  character with octal code ddd..
  \N{U+hh..} character with Unicode code point hh.. (Unicode mode only)
  \xhh       character with hex code hh
  \x{hh..}   character with hex code hh..
</pre>
If PCRE2_ALT_BSUX or PCRE2_EXTRA_ALT_BSUX is set ("ALT_BSUX mode"), the
following are also recognized:
<pre>
  \U         the character "U"
  \uhhhh     character with hex code hhhh
  \u{hh..}   character with hex code hh.. but only for EXTRA_ALT_BSUX
</pre>
When \x is not followed by {, from zero to two hexadecimal digits are read,
but in ALT_BSUX mode \x must be followed by two hexadecimal digits to be
recognized as a hexadecimal escape; otherwise it matches a literal "x".
Likewise, if \u (in ALT_BSUX mode) is not followed by four hexadecimal digits
or (in EXTRA_ALT_BSUX mode) a sequence of hex digits in curly brackets, it
matches a literal "u".
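</P>
<P>
The rule for \x without a following { can be sketched in code. The Python below
is a minimal illustration only, not PCRE2's implementation: the function name
read_backslash_x and its return shape are hypothetical, and it assumes that
zero digits after \x in the default mode yields the character with code 0.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def read_backslash_x(rest, alt_bsux=False):
    """Interpret the pattern text that follows a backslash-x escape.

    rest     -- the text immediately after the escape (no brace form)
    alt_bsux -- True to mimic ALT_BSUX mode as described above
    Returns (character, number of characters consumed from rest).
    """
    # Collect up to two leading hexadecimal digits.
    digits = ""
    for ch in rest[:2]:
        if ch not in HEX_DIGITS:
            break
        digits += ch

    if alt_bsux:
        # ALT_BSUX mode: exactly two hex digits are required,
        # otherwise the escape is just a literal "x".
        if len(digits) == 2:
            return chr(int(digits, 16)), 2
        return "x", 0

    # Default mode: zero to two digits are read.
    # Assumption: zero digits gives the character with code 0.
    value = int(digits, 16) if digits else 0
    return chr(value), len(digits)
```

For instance, read_backslash_x("4z") consumes one digit in the default mode,
whereas read_backslash_x("4z", alt_bsux=True) consumes nothing and returns a
literal "x".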
</P>
<P>
Note that \0dd is always an octal code. The treatment of backslash followed by
a non-zero digit is complicated; for details see the section
<a href="pcre2pattern.html#digitsafterbackslash">"Non-printing characters"</a>
in the
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
documentation, where details of escape processing in EBCDIC environments are
also given. \N{U+hh..} is synonymous with \x{hh..} in PCRE2 but is not
supported in EBCDIC environments. Note that \N not followed by an opening
curly bracket has a different meaning (see below).
</P>
<br><a name="SEC5" href="#TOC1">CHARACTER TYPES</a><br>
<P>
<pre>
  .          any character except newline;
               in dotall mode, any character whatsoever
  \C         one code unit, even in UTF mode (best avoided)
  \d         a decimal digit
  \D         a character that is not a decimal digit
  \h         a horizontal white space character
  \H         a character that is not a horizontal white space character
  \N         a character that is not a newline
  \p{<i>xx</i>}     a character with the <i>xx</i> property
  \P{<i>xx</i>}     a character without the <i>xx</i> property
  \R         a newline sequence
  \s         a white space character
  \S         a character that is not a white space character
  \v         a vertical white space character
  \V         a character that is not a vertical white space character
  \w         a "word" character
  \W         a "non-word" character
  \X         a Unicode extended grapheme cluster
</pre>
\C is dangerous because it may leave the current matching point in the middle
of a UTF-8 or UTF-16 character. The application can lock out the use of \C by
setting the PCRE2_NEVER_BACKSLASH_C option. It is also possible to build PCRE2
with the use of \C permanently disabled.
</P>
<P>
By default, \d, \s, and \w match only ASCII characters, even in UTF-8 mode
or in the 16-bit and 32-bit libraries.
However, if locale-specific matching is
happening, \s and \w may also match characters with code points in the range
128-255. If the PCRE2_UCP option is set, the behaviour of these escape
sequences is changed to use Unicode properties and they match many more
characters, but there are some option settings that can restrict individual
sequences to matching only ASCII characters.
</P>
<P>
Property descriptions in \p and \P are matched caselessly; hyphens,
underscores, and white space are ignored, in accordance with Unicode's "loose
matching" rules.
</P>
<br><a name="SEC6" href="#TOC1">GENERAL CATEGORY PROPERTIES FOR \p and \P</a><br>
<P>
<pre>
  C          Other
  Cc         Control
  Cf         Format
  Cn         Unassigned
  Co         Private use
  Cs         Surrogate

  L          Letter
  Ll         Lower case letter
  Lm         Modifier letter
  Lo         Other letter
  Lt         Title case letter
  Lu         Upper case letter
  Lc         Ll, Lu, or Lt
  L&amp;         Ll, Lu, or Lt

  M          Mark
  Mc         Spacing mark
  Me         Enclosing mark
  Mn         Non-spacing mark

  N          Number
  Nd         Decimal number
  Nl         Letter number
  No         Other number

  P          Punctuation
  Pc         Connector punctuation
  Pd         Dash punctuation
  Pe         Close punctuation
  Pf         Final punctuation
  Pi         Initial punctuation
  Po         Other punctuation
  Ps         Open punctuation

  S          Symbol
  Sc         Currency symbol
  Sk         Modifier symbol
  Sm         Mathematical symbol
  So         Other symbol

  Z          Separator
  Zl         Line separator
  Zp         Paragraph separator
  Zs         Space separator
</PRE>
</P>
<br><a name="SEC7" href="#TOC1">PCRE2 SPECIAL CATEGORY PROPERTIES FOR \p and \P</a><br>
<P>
<pre>
  Xan        Alphanumeric: union of properties L and N
  Xps        POSIX space: property Z or tab, NL, VT, FF, CR
  Xsp        Perl space: property Z or tab, NL, VT, FF, CR
  Xuc        Universally-named character: one that can be
               represented by a Universal Character Name
  Xwd        Perl word: property Xan or underscore
</pre>
Perl and POSIX space are now the same. Perl added VT to its space character set
at release 5.18.
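</P>
<P>
The definitions of Xan, Xsp/Xps, and Xwd in the table above can be expressed in
terms of general categories. The Python below is a rough sketch using the
standard unicodedata module; the function names are hypothetical, and Python's
Unicode tables may correspond to a different Unicode version from the one a
given PCRE2 build uses, so results for recently assigned characters may differ.

```python
import unicodedata


def is_xan(ch):
    """Xan: alphanumeric, the union of general categories L and N."""
    return unicodedata.category(ch)[0] in ("L", "N")


def is_xsp(ch):
    """Xsp and Xps: property Z, or tab, NL, VT, FF, CR."""
    return unicodedata.category(ch)[0] == "Z" or ch in "\t\n\x0b\x0c\r"


def is_xwd(ch):
    """Xwd: property Xan or underscore."""
    return is_xan(ch) or ch == "_"
```

Under these definitions an underscore is a word character (Xwd) without being
alphanumeric (Xan), and VT counts as space for both the Perl and POSIX
variants.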
</P>
<br><a name="SEC8" href="#TOC1">BINARY PROPERTIES FOR \p AND \P</a><br>
<P>
Unicode defines a number of binary properties, that is, properties whose only
values are true or false. You can obtain a list of those that are recognized by
\p and \P, along with their abbreviations, by running this command:
<pre>
  pcre2test -LP
</PRE>
</P>
<br><a name="SEC9" href="#TOC1">SCRIPT MATCHING WITH \p AND \P</a><br>
<P>
Many script names and their 4-letter abbreviations are recognized in
\p{sc:...} or \p{scx:...} items, or on their own with \p (and also \P of
course). You can obtain a list of these scripts by running this command:
<pre>
  pcre2test -LS
</PRE>
</P>
<br><a name="SEC10" href="#TOC1">THE BIDI_CLASS PROPERTY FOR \p AND \P</a><br>
<P>
<pre>
  \p{Bidi_Class:&lt;class&gt;}   matches a character with the given class
  \p{BC:&lt;class&gt;}           matches a character with the given class
</pre>
The recognized classes are:
<pre>
  AL          Arabic letter
  AN          Arabic number
  B           paragraph separator
  BN          boundary neutral
  CS          common separator
  EN          European number
  ES          European separator
  ET          European terminator
  FSI         first strong isolate
  L           left-to-right
  LRE         left-to-right embedding
  LRI         left-to-right isolate
  LRO         left-to-right override
  NSM         non-spacing mark
  ON          other neutral
  PDF         pop directional format
  PDI         pop directional isolate
  R           right-to-left
  RLE         right-to-left embedding
  RLI         right-to-left isolate
  RLO         right-to-left override
  S           segment separator
  WS          white space
</PRE>
</P>
<br><a name="SEC11" href="#TOC1">CHARACTER CLASSES</a><br>
<P>
<pre>
  [...]       positive character class
  [^...]      negative character class
  [x-y]       range (can be used for hex characters)
  [[:xxx:]]   positive POSIX named set
  [[:^xxx:]]  negative POSIX named set

  alnum       alphanumeric
  alpha       alphabetic
  ascii       0-127
  blank       space or tab
  cntrl       control character
  digit       decimal digit
  graph       printing, excluding space
  lower       lower case letter
  print       printing, including space
  punct       printing, excluding alphanumeric
  space       white space
  upper       upper case letter
  word        same as \w
  xdigit      hexadecimal digit
</pre>
In PCRE2, POSIX character set names recognize only ASCII characters by default,
but some of them use Unicode properties if PCRE2_UCP is set. You can use
\Q...\E inside a character class.
</P>
<br><a name="SEC12" href="#TOC1">QUANTIFIERS</a><br>
<P>
<pre>
  ?           0 or 1, greedy
  ?+          0 or 1, possessive
  ??          0 or 1, lazy
  *           0 or more, greedy
  *+          0 or more, possessive
  *?          0 or more, lazy
  +           1 or more, greedy
  ++          1 or more, possessive
  +?          1 or more, lazy
  {n}         exactly n
  {n,m}       at least n, no more than m, greedy
  {n,m}+      at least n, no more than m, possessive
  {n,m}?      at least n, no more than m, lazy
  {n,}        n or more, greedy
  {n,}+       n or more, possessive
  {n,}?       n or more, lazy
  {,m}        zero up to m, greedy
  {,m}+       zero up to m, possessive
  {,m}?       zero up to m, lazy
</PRE>
</P>
<br><a name="SEC13" href="#TOC1">ANCHORS AND SIMPLE ASSERTIONS</a><br>
<P>
<pre>
  \b          word boundary
  \B          not a word boundary
  ^           start of subject
                also after an internal newline in multiline mode
                (after any newline if PCRE2_ALT_CIRCUMFLEX is set)
  \A          start of subject
  $           end of subject
                also before newline at end of subject
                also before internal newline in multiline mode
  \Z          end of subject
                also before newline at end of subject
  \z          end of subject
  \G          first matching position in subject
</PRE>
</P>
<br><a name="SEC14" href="#TOC1">REPORTED MATCH POINT SETTING</a><br>
<P>
<pre>
  \K          set reported start of match
</pre>
From release 10.38 \K is not permitted by default in lookaround assertions,
for compatibility with Perl. However, if the PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
option is set, the previous behaviour is re-enabled. When this option is set,
\K is honoured in positive assertions, but ignored in negative ones.
</P>
<br><a name="SEC15" href="#TOC1">ALTERNATION</a><br>
<P>
<pre>
  expr|expr|expr...
</PRE>
</P>
<br><a name="SEC16" href="#TOC1">CAPTURING</a><br>
<P>
<pre>
  (...)           capture group
  (?&lt;name&gt;...)    named capture group (Perl)
  (?'name'...)    named capture group (Perl)
  (?P&lt;name&gt;...)   named capture group (Python)
  (?:...)         non-capture group
  (?|...)         non-capture group; reset group numbers for
                   capture groups in each alternative
</pre>
In non-UTF modes, names may contain underscores and ASCII letters and digits;
in UTF modes, any Unicode letters and Unicode decimal digits are permitted. In
both cases, a name must not start with a digit.
</P>
<br><a name="SEC17" href="#TOC1">ATOMIC GROUPS</a><br>
<P>
<pre>
  (?&gt;...)         atomic non-capture group
  (*atomic:...)   atomic non-capture group
</PRE>
</P>
<br><a name="SEC18" href="#TOC1">COMMENT</a><br>
<P>
<pre>
  (?#....)        comment (not nestable)
</PRE>
</P>
<br><a name="SEC19" href="#TOC1">OPTION SETTING</a><br>
<P>
Changes of these options within a group are automatically cancelled at the end
of the group.
<pre>
  (?a)            all ASCII options
  (?aD)           restrict \d to ASCII in UCP mode
  (?aS)           restrict \s to ASCII in UCP mode
  (?aW)           restrict \w to ASCII in UCP mode
  (?aP)           restrict all POSIX classes to ASCII in UCP mode
  (?aT)           restrict POSIX digit classes to ASCII in UCP mode
  (?i)            caseless
  (?J)            allow duplicate named groups
  (?m)            multiline
  (?n)            no auto capture
  (?r)            restrict caseless to either ASCII or non-ASCII
  (?s)            single line (dotall)
  (?U)            default ungreedy (lazy)
  (?x)            ignore white space except in classes or \Q...\E
  (?xx)           as (?x) but also ignore space and tab in classes
  (?-...)         unset the given option(s)
  (?^)            unset imnrsx options
</pre>
(?aP) implies (?aT) as well, though this has no additional effect.
However, it
means that (?-aP) is really (?-PT) which disables all ASCII restrictions for
POSIX classes.
</P>
<P>
Unsetting x or xx unsets both. Several options may be set at once, and a
mixture of setting and unsetting such as (?i-x) is allowed, but there may be
only one hyphen. Setting (but no unsetting) is allowed after (?^ for example
(?^in). An option setting may appear at the start of a non-capture group, for
example (?i:...).
</P>
<P>
The following are recognized only at the very start of a pattern or after one
of the newline or \R options with similar syntax. More than one of them may
appear. For the first three, d is a decimal number.
<pre>
  (*LIMIT_DEPTH=d)     set the backtracking limit to d
  (*LIMIT_HEAP=d)      set the heap size limit to d * 1024 bytes
  (*LIMIT_MATCH=d)     set the match limit to d
  (*NOTEMPTY)          set PCRE2_NOTEMPTY when matching
  (*NOTEMPTY_ATSTART)  set PCRE2_NOTEMPTY_ATSTART when matching
  (*NO_AUTO_POSSESS)   no auto-possessification (PCRE2_NO_AUTO_POSSESS)
  (*NO_DOTSTAR_ANCHOR) no .* anchoring (PCRE2_NO_DOTSTAR_ANCHOR)
  (*NO_JIT)            disable JIT optimization
  (*NO_START_OPT)      no start-match optimization (PCRE2_NO_START_OPTIMIZE)
  (*UTF)               set appropriate UTF mode for the library in use
  (*UCP)               set PCRE2_UCP (use Unicode properties for \d etc)
</pre>
Note that LIMIT_DEPTH, LIMIT_HEAP, and LIMIT_MATCH can only reduce the value of
the limits set by the caller of <b>pcre2_match()</b> or <b>pcre2_dfa_match()</b>,
not increase them. LIMIT_RECURSION is an obsolete synonym for LIMIT_DEPTH. The
application can lock out the use of (*UTF) and (*UCP) by setting the
PCRE2_NEVER_UTF or PCRE2_NEVER_UCP options, respectively, at compile time.
</P>
<br><a name="SEC20" href="#TOC1">NEWLINE CONVENTION</a><br>
<P>
These are recognized only at the very start of the pattern or after option
settings with a similar syntax.
449*22dc650dSSadaf Ebrahimi<pre> 450*22dc650dSSadaf Ebrahimi (*CR) carriage return only 451*22dc650dSSadaf Ebrahimi (*LF) linefeed only 452*22dc650dSSadaf Ebrahimi (*CRLF) carriage return followed by linefeed 453*22dc650dSSadaf Ebrahimi (*ANYCRLF) all three of the above 454*22dc650dSSadaf Ebrahimi (*ANY) any Unicode newline sequence 455*22dc650dSSadaf Ebrahimi (*NUL) the NUL character (binary zero) 456*22dc650dSSadaf Ebrahimi</PRE> 457*22dc650dSSadaf Ebrahimi</P> 458*22dc650dSSadaf Ebrahimi<br><a name="SEC21" href="#TOC1">WHAT \R MATCHES</a><br> 459*22dc650dSSadaf Ebrahimi<P> 460*22dc650dSSadaf EbrahimiThese are recognized only at the very start of the pattern or after option 461*22dc650dSSadaf Ebrahimisetting with a similar syntax. 462*22dc650dSSadaf Ebrahimi<pre> 463*22dc650dSSadaf Ebrahimi (*BSR_ANYCRLF) CR, LF, or CRLF 464*22dc650dSSadaf Ebrahimi (*BSR_UNICODE) any Unicode newline sequence 465*22dc650dSSadaf Ebrahimi</PRE> 466*22dc650dSSadaf Ebrahimi</P> 467*22dc650dSSadaf Ebrahimi<br><a name="SEC22" href="#TOC1">LOOKAHEAD AND LOOKBEHIND ASSERTIONS</a><br> 468*22dc650dSSadaf Ebrahimi<P> 469*22dc650dSSadaf Ebrahimi<pre> 470*22dc650dSSadaf Ebrahimi (?=...) ) 471*22dc650dSSadaf Ebrahimi (*pla:...) ) positive lookahead 472*22dc650dSSadaf Ebrahimi (*positive_lookahead:...) ) 473*22dc650dSSadaf Ebrahimi 474*22dc650dSSadaf Ebrahimi (?!...) ) 475*22dc650dSSadaf Ebrahimi (*nla:...) ) negative lookahead 476*22dc650dSSadaf Ebrahimi (*negative_lookahead:...) ) 477*22dc650dSSadaf Ebrahimi 478*22dc650dSSadaf Ebrahimi (?<=...) ) 479*22dc650dSSadaf Ebrahimi (*plb:...) ) positive lookbehind 480*22dc650dSSadaf Ebrahimi (*positive_lookbehind:...) ) 481*22dc650dSSadaf Ebrahimi 482*22dc650dSSadaf Ebrahimi (?<!...) ) 483*22dc650dSSadaf Ebrahimi (*nlb:...) ) negative lookbehind 484*22dc650dSSadaf Ebrahimi (*negative_lookbehind:...) 
)
</pre>
Each top-level branch of a lookbehind must have a limit for the number of
characters it matches. If any branch can match a variable number of characters,
the maximum for each branch is limited to a value set by the caller of
<b>pcre2_compile()</b> or defaulted. The default is set when PCRE2 is built
(ultimate default 255). If every branch matches a fixed number of characters,
the limit for each branch is 65535 characters.
</P>
<br><a name="SEC23" href="#TOC1">NON-ATOMIC LOOKAROUND ASSERTIONS</a><br>
<P>
These assertions are specific to PCRE2 and are not Perl-compatible.
<pre>
  (?*...)                                )
  (*napla:...)                           ) synonyms
  (*non_atomic_positive_lookahead:...)   )

  (?<*...)                               )
  (*naplb:...)                           ) synonyms
  (*non_atomic_positive_lookbehind:...)  )
</PRE>
</P>
<br><a name="SEC24" href="#TOC1">SCRIPT RUNS</a><br>
<P>
<pre>
  (*script_run:...)           ) script run, can be backtracked into
  (*sr:...)                   )

  (*atomic_script_run:...)    ) atomic script run
  (*asr:...)                  )
</PRE>
</P>
<br><a name="SEC25" href="#TOC1">BACKREFERENCES</a><br>
<P>
<pre>
  \n              reference by number (can be ambiguous)
  \gn             reference by number
  \g{n}           reference by number
  \g+n            relative reference by number (PCRE2 extension)
  \g-n            relative reference by number
  \g{+n}          relative reference by number (PCRE2 extension)
  \g{-n}          relative reference by number
  \k<name>        reference by name (Perl)
  \k'name'        reference by name (Perl)
  \g{name}        reference by name (Perl)
  \k{name}        reference by name (.NET)
  (?P=name)       reference by name (Python)
</PRE>
</P>
<br><a name="SEC26" href="#TOC1">SUBROUTINE REFERENCES (POSSIBLY RECURSIVE)</a><br>
<P>
<pre>
  (?R)            recurse whole pattern
  (?n)            call subroutine by absolute number
  (?+n)           call subroutine by relative number
  (?-n)           call subroutine by relative number
  (?&name)        call subroutine by name (Perl)
  (?P>name)       call subroutine by name (Python)
  \g<name>        call subroutine by name (Oniguruma)
  \g'name'        call subroutine by name (Oniguruma)
  \g<n>           call subroutine by absolute number (Oniguruma)
  \g'n'           call subroutine by absolute number (Oniguruma)
  \g<+n>          call subroutine by relative number (PCRE2 extension)
  \g'+n'          call subroutine by relative number (PCRE2 extension)
  \g<-n>          call subroutine by relative number (PCRE2 extension)
  \g'-n'          call subroutine by relative number (PCRE2 extension)
</PRE>
</P>
<br><a name="SEC27" href="#TOC1">CONDITIONAL PATTERNS</a><br>
<P>
<pre>
  (?(condition)yes-pattern)
  (?(condition)yes-pattern|no-pattern)

  (?(n)               absolute reference condition
  (?(+n)              relative reference condition (PCRE2 extension)
  (?(-n)              relative reference condition (PCRE2 extension)
  (?(<name>)          named reference condition (Perl)
  (?('name')          named reference condition (Perl)
  (?(name)            named reference condition (PCRE2, deprecated)
  (?(R)               overall recursion condition
  (?(Rn)              specific numbered group recursion condition
  (?(R&name)          specific named group recursion condition
  (?(DEFINE)          define groups for reference
  (?(VERSION[>]=n.m)  test PCRE2 version
  (?(assert)          assertion condition
</pre>
Note the ambiguity of (?(R) and (?(Rn) which might be named reference
conditions or recursion tests. Such a condition is interpreted as a reference
condition if the relevant named group exists.
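As a rough illustration of the absolute reference condition (?(n) listed above, the sketch below uses Python's built-in re module rather than PCRE2 itself. Python's re accepts the same (?(n)yes-pattern) form for numbered groups; it should not be assumed to support the relative, recursion, DEFINE, VERSION or assertion conditions. The pattern and the function name is_balanced_number are hypothetical and chosen only for this example.

```python
import re

# Group 1 optionally captures an opening parenthesis.
# The conditional (?(1)\)) requires a closing parenthesis only when
# group 1 took part in the match; otherwise it matches the empty string.
BALANCED_NUMBER = re.compile(r"^(\()?\d+(?(1)\))$")


def is_balanced_number(text):
    """Return True for digits that are either bare or fully parenthesized."""
    return BALANCED_NUMBER.match(text) is not None


if __name__ == "__main__":
    for sample in ["(42)", "42", "(42", "42)"]:
        print(sample, is_balanced_number(sample))
```

With this pattern, "(42)" and "42" are accepted, while "(42" and "42)" are rejected, because the closing parenthesis is demanded exactly when the opening one was captured.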
</P>
<br><a name="SEC28" href="#TOC1">BACKTRACKING CONTROL</a><br>
<P>
All backtracking control verbs may be in the form (*VERB:NAME). For (*MARK) the
name is mandatory, for the others it is optional. (*SKIP) changes its behaviour
if :NAME is present. The others just set a name for passing back to the caller,
but this is not a name that (*SKIP) can see. The following act immediately they
are reached:
<pre>
  (*ACCEPT)       force successful match
  (*FAIL)         force backtrack; synonym (*F)
  (*MARK:NAME)    set name to be passed back; synonym (*:NAME)
</pre>
The following act only when a subsequent match failure causes a backtrack to
reach them. They all force a match failure, but they differ in what happens
afterwards. Those that advance the start-of-match point do so only if the
pattern is not anchored.
<pre>
  (*COMMIT)       overall failure, no advance of starting point
  (*PRUNE)        advance to next starting character
  (*SKIP)         advance to current matching position
  (*SKIP:NAME)    advance to position corresponding to an earlier
                  (*MARK:NAME); if not found, the (*SKIP) is ignored
  (*THEN)         local failure, backtrack to next alternation
</pre>
The effect of one of these verbs in a group called as a subroutine is confined
to the subroutine call.
</P>
<br><a name="SEC29" href="#TOC1">CALLOUTS</a><br>
<P>
<pre>
  (?C)            callout (assumed number 0)
  (?Cn)           callout with numerical data n
  (?C"text")      callout with string data
</pre>
The allowed string delimiters are ` ' " ^ % # $ (which are the same for the
start and the end), and the starting delimiter { matched with the ending
delimiter }. To encode the ending delimiter within the string, double it.
</P>
<br><a name="SEC30" href="#TOC1">SEE ALSO</a><br>
<P>
<b>pcre2pattern</b>(3), <b>pcre2api</b>(3), <b>pcre2callout</b>(3),
<b>pcre2matching</b>(3), <b>pcre2</b>(3).
</P>
<br><a name="SEC31" href="#TOC1">AUTHOR</a><br>
<P>
Philip Hazel
<br>
Retired from University Computing Service
<br>
Cambridge, England.
<br>
</P>
<br><a name="SEC32" href="#TOC1">REVISION</a><br>
<P>
Last updated: 12 October 2023
<br>
Copyright © 1997-2023 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
</p>