RE2 regular expression syntax reference
---------------------------------------

Single characters:
.             any character, possibly including newline (s=true)
[xyz]         character class
[^xyz]        negated character class
\d            Perl character class
\D            negated Perl character class
[[:alpha:]]   ASCII character class
[[:^alpha:]]  negated ASCII character class
\pN           Unicode character class (one-letter name)
\p{Greek}     Unicode character class
\PN           negated Unicode character class (one-letter name)
\P{Greek}     negated Unicode character class

Composites:
xy   «x» followed by «y»
x|y  «x» or «y» (prefer «x»)

Repetitions:
x*       zero or more «x», prefer more
x+       one or more «x», prefer more
x?       zero or one «x», prefer one
x{n,m}   «n» or «n»+1 or ... or «m» «x», prefer more
x{n,}    «n» or more «x», prefer more
x{n}     exactly «n» «x»
x*?      zero or more «x», prefer fewer
x+?      one or more «x», prefer fewer
x??      zero or one «x», prefer zero
x{n,m}?  «n» or «n»+1 or ... or «m» «x», prefer fewer
x{n,}?   «n» or more «x», prefer fewer
x{n}?    exactly «n» «x»
x{}      (== x*) NOT SUPPORTED vim
x{-}     (== x*?) NOT SUPPORTED vim
x{-n}    (== x{n}?) NOT SUPPORTED vim
x=       (== x?) NOT SUPPORTED vim

Implementation restriction: The counting forms «x{n,m}», «x{n,}», and «x{n}»
reject forms that create a minimum or maximum repetition count above 1000.
Unlimited repetitions are not subject to this restriction.

Possessive repetitions:
x*+      zero or more «x», possessive NOT SUPPORTED
x++      one or more «x», possessive NOT SUPPORTED
x?+      zero or one «x», possessive NOT SUPPORTED
x{n,m}+  «n» or ... or «m» «x», possessive NOT SUPPORTED
x{n,}+   «n» or more «x», possessive NOT SUPPORTED
x{n}+    exactly «n» «x», possessive NOT SUPPORTED

Grouping:
(re)          numbered capturing group (submatch)
(?P<name>re)  named & numbered capturing group (submatch)
(?<name>re)   named & numbered capturing group (submatch) NOT SUPPORTED
(?'name're)   named & numbered capturing group (submatch) NOT SUPPORTED
(?:re)        non-capturing group
(?flags)      set flags within current group; non-capturing
(?flags:re)   set flags during re; non-capturing
(?#text)      comment NOT SUPPORTED
(?|x|y|z)     branch numbering reset NOT SUPPORTED
(?>re)        possessive match of «re» NOT SUPPORTED
re@>          possessive match of «re» NOT SUPPORTED vim
%(re)         non-capturing group NOT SUPPORTED vim

Flags:
i  case-insensitive (default false)
m  multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false)
s  let «.» match «\n» (default false)
U  ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false)
Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»).

Empty strings:
^        at beginning of text or line («m»=true)
$        at end of text (like «\z» not «\Z») or line («m»=true)
\A       at beginning of text
\b       at ASCII word boundary («\w» on one side and «\W», «\A», or «\z» on the other)
\B       not at ASCII word boundary
\G       at beginning of subtext being searched NOT SUPPORTED pcre
\G       at end of last match NOT SUPPORTED perl
\Z       at end of text, or before newline at end of text NOT SUPPORTED
\z       at end of text
(?=re)   before text matching «re» NOT SUPPORTED
(?!re)   before text not matching «re» NOT SUPPORTED
(?<=re)  after text matching «re» NOT SUPPORTED
(?<!re)  after text not matching «re» NOT SUPPORTED
re&      before text matching «re» NOT SUPPORTED vim
re@=     before text matching «re» NOT SUPPORTED vim
re@!     before text not matching «re» NOT SUPPORTED vim
re@<=    after text matching «re» NOT SUPPORTED vim
re@<!    after text not matching «re» NOT SUPPORTED vim
\zs      sets start of match (= \K) NOT SUPPORTED vim
\ze      sets end of match NOT SUPPORTED vim
\%^      beginning of file NOT SUPPORTED vim
\%$      end of file NOT SUPPORTED vim
\%V      on screen NOT SUPPORTED vim
\%#      cursor position NOT SUPPORTED vim
\%'m     mark «m» position NOT SUPPORTED vim
\%23l    in line 23 NOT SUPPORTED vim
\%23c    in column 23 NOT SUPPORTED vim
\%23v    in virtual column 23 NOT SUPPORTED vim

Escape sequences:
\a           bell (== \007)
\f           form feed (== \014)
\t           horizontal tab (== \011)
\n           newline (== \012)
\r           carriage return (== \015)
\v           vertical tab character (== \013)
\*           literal «*», for any punctuation character «*»
\123         octal character code (up to three digits)
\x7F         hex character code (exactly two digits)
\x{10FFFF}   hex character code
\C           match a single byte even in UTF-8 mode
\Q...\E      literal text «...» even if «...» has punctuation

\1           backreference NOT SUPPORTED
\b           backspace NOT SUPPORTED (use «\010»)
\cK          control char ^K NOT SUPPORTED (use «\001» etc)
\e           escape NOT SUPPORTED (use «\033»)
\g1          backreference NOT SUPPORTED
\g{1}        backreference NOT SUPPORTED
\g{+1}       backreference NOT SUPPORTED
\g{-1}       backreference NOT SUPPORTED
\g{name}     named backreference NOT SUPPORTED
\g<name>     subroutine call NOT SUPPORTED
\g'name'     subroutine call NOT SUPPORTED
\k<name>     named backreference NOT SUPPORTED
\k'name'     named backreference NOT SUPPORTED
\lX          lowercase «X» NOT SUPPORTED
\ux          uppercase «x» NOT SUPPORTED
\L...\E      lowercase text «...» NOT SUPPORTED
\K           reset beginning of «$0» NOT SUPPORTED
\N{name}     named Unicode character NOT SUPPORTED
\R           line break NOT SUPPORTED
\U...\E      upper case text «...» NOT SUPPORTED
\X           extended Unicode sequence NOT SUPPORTED

\%d123       decimal character 123 NOT SUPPORTED vim
\%xFF        hex character FF NOT SUPPORTED vim
\%o123       octal character 123 NOT SUPPORTED vim
\%u1234      Unicode character 0x1234 NOT SUPPORTED vim
\%U12345678  Unicode character 0x12345678 NOT SUPPORTED vim

Character class elements:
x        single character
A-Z      character range (inclusive)
\d       Perl character class
[:foo:]  ASCII character class «foo»
\p{Foo}  Unicode character class «Foo»
\pF      Unicode character class «F» (one-letter name)

Named character classes as character class elements:
[\d]         digits (== \d)
[^\d]        not digits (== \D)
[\D]         not digits (== \D)
[^\D]        not not digits (== \d)
[[:name:]]   named ASCII class inside character class (== [:name:])
[^[:name:]]  named ASCII class inside negated character class (== [:^name:])
[\p{Name}]   named Unicode property inside character class (== \p{Name})
[^\p{Name}]  named Unicode property inside negated character class (== \P{Name})

Perl character classes (all ASCII-only):
\d  digits (== [0-9])
\D  not digits (== [^0-9])
\s  whitespace (== [\t\n\f\r ])
\S  not whitespace (== [^\t\n\f\r ])
\w  word characters (== [0-9A-Za-z_])
\W  not word characters (== [^0-9A-Za-z_])

\h  horizontal space NOT SUPPORTED
\H  not horizontal space NOT SUPPORTED
\v  vertical space NOT SUPPORTED
\V  not vertical space NOT SUPPORTED

ASCII character classes:
[[:alnum:]]   alphanumeric (== [0-9A-Za-z])
[[:alpha:]]   alphabetic (== [A-Za-z])
[[:ascii:]]   ASCII (== [\x00-\x7F])
[[:blank:]]   blank (== [\t ])
[[:cntrl:]]   control (== [\x00-\x1F\x7F])
[[:digit:]]   digits (== [0-9])
[[:graph:]]   graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
[[:lower:]]   lower case (== [a-z])
[[:print:]]   printable (== [ -~] == [ [:graph:]])
[[:punct:]]   punctuation (== [!-/:-@[-`{-~])
[[:space:]]   whitespace (== [\t\n\v\f\r ])
[[:upper:]]   upper case (== [A-Z])
[[:word:]]    word characters (== [0-9A-Za-z_])
[[:xdigit:]]  hex digit (== [0-9A-Fa-f])

Unicode character class names--general category:
C   other
Cc  control
Cf  format
Cn  unassigned code points NOT SUPPORTED
Co  private use
Cs  surrogate
L   letter
LC  cased letter NOT SUPPORTED
L&  cased letter NOT SUPPORTED
Ll  lowercase letter
Lm  modifier letter
Lo  other letter
Lt  titlecase letter
Lu  uppercase letter
M   mark
Mc  spacing mark
Me  enclosing mark
Mn  non-spacing mark
N   number
Nd  decimal number
Nl  letter number
No  other number
P   punctuation
Pc  connector punctuation
Pd  dash punctuation
Pe  close punctuation
Pf  final punctuation
Pi  initial punctuation
Po  other punctuation
Ps  open punctuation
S   symbol
Sc  currency symbol
Sk  modifier symbol
Sm  math symbol
So  other symbol
Z   separator
Zl  line separator
Zp  paragraph separator
Zs  space separator

Unicode character class names--scripts:
Adlam
Ahom
Anatolian_Hieroglyphs
Arabic
Armenian
Avestan
Balinese
Bamum
Bassa_Vah
Batak
Bengali
Bhaiksuki
Bopomofo
Brahmi
Braille
Buginese
Buhid
Canadian_Aboriginal
Carian
Caucasian_Albanian
Chakma
Cham
Cherokee
Common
Coptic
Cuneiform
Cypriot
Cyrillic
Deseret
Devanagari
Dogra
Duployan
Egyptian_Hieroglyphs
Elbasan
Ethiopic
Georgian
Glagolitic
Gothic
Grantha
Greek
Gujarati
Gunjala_Gondi
Gurmukhi
Han
Hangul
Hanifi_Rohingya
Hanunoo
Hatran
Hebrew
Hiragana
Imperial_Aramaic
Inherited
Inscriptional_Pahlavi
Inscriptional_Parthian
Javanese
Kaithi
Kannada
Katakana
Kayah_Li
Kharoshthi
Khmer
Khojki
Khudawadi
Lao
Latin
Lepcha
Limbu
Linear_A
Linear_B
Lisu
Lycian
Lydian
Mahajani
Makasar
Malayalam
Mandaic
Manichaean
Marchen
Masaram_Gondi
Medefaidrin
Meetei_Mayek
Mende_Kikakui
Meroitic_Cursive
Meroitic_Hieroglyphs
Miao
Modi
Mongolian
Mro
Multani
Myanmar
Nabataean
New_Tai_Lue
Newa
Nko
Nushu
Ogham
Ol_Chiki
Old_Hungarian
Old_Italic
Old_North_Arabian
Old_Permic
Old_Persian
Old_Sogdian
Old_South_Arabian
Old_Turkic
Oriya
Osage
Osmanya
Pahawh_Hmong
Palmyrene
Pau_Cin_Hau
Phags_Pa
Phoenician
Psalter_Pahlavi
Rejang
Runic
Samaritan
Saurashtra
Sharada
Shavian
Siddham
SignWriting
Sinhala
Sogdian
Sora_Sompeng
Soyombo
Sundanese
Syloti_Nagri
Syriac
Tagalog
Tagbanwa
Tai_Le
Tai_Tham
Tai_Viet
Takri
Tamil
Tangut
Telugu
Thaana
Thai
Tibetan
Tifinagh
Tirhuta
Ugaritic
Vai
Warang_Citi
Yi
Zanabazar_Square

Vim character classes:
\i  identifier character NOT SUPPORTED vim
\I  «\i» except digits NOT SUPPORTED vim
\k  keyword character NOT SUPPORTED vim
\K  «\k» except digits NOT SUPPORTED vim
\f  file name character NOT SUPPORTED vim
\F  «\f» except digits NOT SUPPORTED vim
\p  printable character NOT SUPPORTED vim
\P  «\p» except digits NOT SUPPORTED vim
\s  whitespace character (== [ \t]) NOT SUPPORTED vim
\S  non-white space character (== [^ \t]) NOT SUPPORTED vim
\d  digits (== [0-9]) vim
\D  not «\d» vim
\x  hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim
\X  not «\x» NOT SUPPORTED vim
\o  octal digits (== [0-7]) NOT SUPPORTED vim
\O  not «\o» NOT SUPPORTED vim
\w  word character vim
\W  not «\w» vim
\h  head of word character NOT SUPPORTED vim
\H  not «\h» NOT SUPPORTED vim
\a  alphabetic NOT SUPPORTED vim
\A  not «\a» NOT SUPPORTED vim
\l  lowercase NOT SUPPORTED vim
\L  not lowercase NOT SUPPORTED vim
\u  uppercase NOT SUPPORTED vim
\U  not uppercase NOT SUPPORTED vim
\_x «\x» plus newline, for any «x» NOT SUPPORTED vim

Vim flags:
\c  ignore case NOT SUPPORTED vim
\C  match case NOT SUPPORTED vim
\m  magic NOT SUPPORTED vim
\M  nomagic NOT SUPPORTED vim
\v  verymagic NOT SUPPORTED vim
\V  verynomagic NOT SUPPORTED vim
\Z  ignore differences in Unicode combining characters NOT SUPPORTED vim

Magic:
(?{code})            arbitrary Perl code NOT SUPPORTED perl
(??{code})           postponed arbitrary Perl code NOT SUPPORTED perl
(?n)                 recursive call to regexp capturing group «n» NOT SUPPORTED
(?+n)                recursive call to relative group «+n» NOT SUPPORTED
(?-n)                recursive call to relative group «-n» NOT SUPPORTED
(?C)                 PCRE callout NOT SUPPORTED pcre
(?R)                 recursive call to entire regexp (== (?0)) NOT SUPPORTED
(?&name)             recursive call to named group NOT SUPPORTED
(?P=name)            named backreference NOT SUPPORTED
(?P>name)            recursive call to named group NOT SUPPORTED
(?(cond)true|false)  conditional branch NOT SUPPORTED
(?(cond)true)        conditional branch NOT SUPPORTED
(*ACCEPT)            make regexps more like Prolog NOT SUPPORTED
(*COMMIT)            NOT SUPPORTED
(*F)                 NOT SUPPORTED
(*FAIL)              NOT SUPPORTED
(*MARK)              NOT SUPPORTED
(*PRUNE)             NOT SUPPORTED
(*SKIP)              NOT SUPPORTED
(*THEN)              NOT SUPPORTED
(*ANY)               set newline convention NOT SUPPORTED
(*ANYCRLF)           NOT SUPPORTED
(*CR)                NOT SUPPORTED
(*CRLF)              NOT SUPPORTED
(*LF)                NOT SUPPORTED
(*BSR_ANYCRLF)       set \R convention NOT SUPPORTED pcre
(*BSR_UNICODE)       NOT SUPPORTED pcre
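
Worked example: checking the ASCII character class equivalences

The "ASCII character classes" and "Perl character classes" tables in this
reference give each class as an explicit range expression, and some rows give
two forms joined by «==». The sketch below checks a few of those identities by
expanding the ranges into plain sets of characters. It is written in Python
purely for illustration and does not call RE2 or any other regular expression
engine. The names «expand», «ASCII_CLASSES», «PERL_SPACE», and
«check_equivalences» are hypothetical helpers introduced here, and the ranges
were transcribed by hand from the tables, so treat it as a sanity check under
those assumptions rather than as a description of how RE2 implements classes.

```python
import string


def expand(*ranges):
    """Expand inclusive two-character ranges such as "AZ" into a set of characters."""
    return {chr(c) for low, high in ranges for c in range(ord(low), ord(high) + 1)}


# Hypothetical table: range forms transcribed by hand from the
# "ASCII character classes" section of the reference.
ALNUM = expand("09", "AZ", "az")
ASCII_CLASSES = {
    "alnum": ALNUM,
    "alpha": expand("AZ", "az"),
    "ascii": expand("\x00\x7f"),
    "blank": set("\t "),
    "cntrl": expand("\x00\x1f") | {"\x7f"},
    "digit": expand("09"),
    "graph": expand("!~"),
    "lower": expand("az"),
    "print": expand(" ~"),
    "punct": expand("!/", ":@", "[`", "{~"),
    "space": set("\t\n\v\f\r "),
    "upper": expand("AZ"),
    "word": ALNUM | {"_"},
    "xdigit": expand("09", "AF", "af"),
}

# Perl \s as listed in the reference: [\t\n\f\r ] (no vertical tab).
PERL_SPACE = set("\t\n\f\r ")


def check_equivalences():
    """Return a few set identities implied by the tables, each as a boolean."""
    c = ASCII_CLASSES
    return {
        # [!-~] should equal the long explicit list: letters, digits, punctuation.
        "graph == alnum + punct": c["graph"] == c["alnum"] | c["punct"],
        # [ -~] should equal a space plus [:graph:].
        "print == space char + graph": c["print"] == {" "} | c["graph"],
        # The four punct ranges should cover exactly the ASCII punctuation.
        "punct == string.punctuation": c["punct"] == set(string.punctuation),
        # [[:space:]] should differ from Perl \s only by the vertical tab.
        "[[:space:]] adds only \\v to \\s": c["space"] - PERL_SPACE == {"\v"},
    }


if __name__ == "__main__":
    for name, ok in check_equivalences().items():
        print(f"{name}: {ok}")
```

The last check reflects a detail that is easy to miss when reading the two
tables side by side: «[[:space:]]» includes «\v», while Perl «\s» as listed
here does not.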