RE2 regular expression syntax reference
---------------------------------------

Single characters:
.	any character, possibly including newline (s=true)
[xyz]	character class
[^xyz]	negated character class
\d	Perl character class
\D	negated Perl character class
[[:alpha:]]	ASCII character class
[[:^alpha:]]	negated ASCII character class
\pN	Unicode character class (one-letter name)
\p{Greek}	Unicode character class
\PN	negated Unicode character class (one-letter name)
\P{Greek}	negated Unicode character class

Composites:
xy	«x» followed by «y»
x|y	«x» or «y» (prefer «x»)

Repetitions:
x*	zero or more «x», prefer more
x+	one or more «x», prefer more
x?	zero or one «x», prefer one
x{n,m}	«n» or «n»+1 or ... or «m» «x», prefer more
x{n,}	«n» or more «x», prefer more
x{n}	exactly «n» «x»
x*?	zero or more «x», prefer fewer
x+?	one or more «x», prefer fewer
x??	zero or one «x», prefer zero
x{n,m}?	«n» or «n»+1 or ... or «m» «x», prefer fewer
x{n,}?	«n» or more «x», prefer fewer
x{n}?	exactly «n» «x»
x{}	(== x*) NOT SUPPORTED vim
x{-}	(== x*?) NOT SUPPORTED vim
x{-n}	(== x{n}?) NOT SUPPORTED vim
x=	(== x?) NOT SUPPORTED vim

Implementation restriction: The counting forms «x{n,m}», «x{n,}», and «x{n}»
reject forms that create a minimum or maximum repetition count above 1000.
Unlimited repetitions are not subject to this restriction.
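
The "prefer more" and "prefer fewer" forms and the 1000-count restriction can
be checked with a short program. This is only a sketch: it uses Go's standard
regexp package as a stand-in, on the assumption that it accepts the syntax
listed here (its documentation says it does, apart from «\C»). The helper
names firstMatch and compiles are made up for illustration, and RE2's own C++
API is not shown.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text ("" if none).
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// compiles reports whether the pattern is accepted by the parser.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	text := "aaaa"
	fmt.Println(firstMatch(`a+`, text))      // prefer more:  aaaa
	fmt.Println(firstMatch(`a+?`, text))     // prefer fewer: a
	fmt.Println(firstMatch(`a{2,3}`, text))  // prefer more:  aaa
	fmt.Println(firstMatch(`a{2,3}?`, text)) // prefer fewer: aa

	// Counts above 1000 are rejected; 1000 itself is allowed.
	fmt.Println(compiles(`a{1000}`)) // true
	fmt.Println(compiles(`a{1001}`)) // false
}
```

Checking the compile error rather than using MustCompile is the safer choice
whenever the pattern or its repetition counts come from user input.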

Possessive repetitions:
x*+	zero or more «x», possessive NOT SUPPORTED
x++	one or more «x», possessive NOT SUPPORTED
x?+	zero or one «x», possessive NOT SUPPORTED
x{n,m}+	«n» or ... or «m» «x», possessive NOT SUPPORTED
x{n,}+	«n» or more «x», possessive NOT SUPPORTED
x{n}+	exactly «n» «x», possessive NOT SUPPORTED

Grouping:
(re)	numbered capturing group (submatch)
(?P<name>re)	named & numbered capturing group (submatch)
(?<name>re)	named & numbered capturing group (submatch)
(?'name're)	named & numbered capturing group (submatch) NOT SUPPORTED
(?:re)	non-capturing group
(?flags)	set flags within current group; non-capturing
(?flags:re)	set flags during re; non-capturing
(?#text)	comment NOT SUPPORTED
(?|x|y|z)	branch numbering reset NOT SUPPORTED
(?>re)	possessive match of «re» NOT SUPPORTED
re@>	possessive match of «re» NOT SUPPORTED vim
%(re)	non-capturing group NOT SUPPORTED vim
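
As an illustration of named, numbered, and non-capturing groups, the sketch
below extracts submatches by name. It again assumes Go's standard regexp
package as a stand-in for RE2; namedGroups and groupCount are hypothetical
helpers, not part of any library. Only the «(?P<name>re)» spelling is used,
because support for «(?<name>re)» depends on the library version.

```go
package main

import (
	"fmt"
	"regexp"
)

// namedGroups returns the text captured by each named group of pattern in
// the first match found in text, or nil if there is no match.
func namedGroups(pattern, text string) map[string]string {
	re := regexp.MustCompile(pattern)
	m := re.FindStringSubmatch(text)
	if m == nil {
		return nil
	}
	out := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if name != "" { // index 0 and unnamed groups have an empty name
			out[name] = m[i]
		}
	}
	return out
}

// groupCount returns the number of capturing groups in pattern.
func groupCount(pattern string) int {
	return regexp.MustCompile(pattern).NumSubexp()
}

func main() {
	// Two named groups and one non-capturing group: only the capturing
	// groups are numbered.
	pattern := `(?P<year>\d{4})-(?:\d{2})-(?P<day>\d{2})`
	fmt.Println(groupCount(pattern)) // 2

	g := namedGroups(pattern, "released 2024-05-17")
	fmt.Println(g["year"], g["day"]) // 2024 17
}
```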

Flags:
i	case-insensitive (default false)
m	multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false)
s	let «.» match «\n» (default false)
U	ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false)
Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»).
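
The effect of each flag, and of the set/clear syntax, can be seen in a small
experiment. The sketch assumes Go's standard regexp package as a stand-in for
RE2; matches and firstMatch are illustrative helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// firstMatch returns the leftmost match of pattern in text ("" if none).
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	// i: case-insensitive matching.
	fmt.Println(matches(`hello`, "HELLO"), matches(`(?i)hello`, "HELLO")) // false true

	// m: ^ and $ also match at line boundaries.
	fmt.Println(matches(`^b$`, "a\nb\nc"), matches(`(?m)^b$`, "a\nb\nc")) // false true

	// s: . also matches \n.
	fmt.Println(matches(`a.b`, "a\nb"), matches(`(?s)a.b`, "a\nb")) // false true

	// U: swaps the greedy and non-greedy forms.
	fmt.Println(firstMatch(`(?U)a+`, "aaa"), firstMatch(`(?U)a+?`, "aaa")) // a aaa

	// (?flags:re) limits the flags to «re»: only the «a» is case-insensitive.
	fmt.Println(matches(`(?i:a)b`, "Ab"), matches(`(?i:a)b`, "AB")) // true false

	// «i-s» sets i and clears s inside the group, overriding the outer (?s).
	fmt.Println(matches(`(?s)(?i-s:a.b)`, "AxB"), matches(`(?s)(?i-s:a.b)`, "A\nB")) // true false
}
```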

Empty strings:
^	at beginning of text or line («m»=true)
$	at end of text (like «\z» not «\Z») or line («m»=true)
\A	at beginning of text
\b	at ASCII word boundary («\w» on one side and «\W», «\A», or «\z» on the other)
\B	not at ASCII word boundary
\G	at beginning of subtext being searched NOT SUPPORTED pcre
\G	at end of last match NOT SUPPORTED perl
\Z	at end of text, or before newline at end of text NOT SUPPORTED
\z	at end of text
(?=re)	before text matching «re» NOT SUPPORTED
(?!re)	before text not matching «re» NOT SUPPORTED
(?<=re)	after text matching «re» NOT SUPPORTED
(?<!re)	after text not matching «re» NOT SUPPORTED
re&	before text matching «re» NOT SUPPORTED vim
re@=	before text matching «re» NOT SUPPORTED vim
re@!	before text not matching «re» NOT SUPPORTED vim
re@<=	after text matching «re» NOT SUPPORTED vim
re@<!	after text not matching «re» NOT SUPPORTED vim
\zs	sets start of match (= \K) NOT SUPPORTED vim
\ze	sets end of match NOT SUPPORTED vim
\%^	beginning of file NOT SUPPORTED vim
\%$	end of file NOT SUPPORTED vim
\%V	on screen NOT SUPPORTED vim
\%#	cursor position NOT SUPPORTED vim
\%'m	mark «m» position NOT SUPPORTED vim
\%23l	in line 23 NOT SUPPORTED vim
\%23c	in column 23 NOT SUPPORTED vim
\%23v	in virtual column 23 NOT SUPPORTED vim
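
Two of these rules are easy to get wrong: «$» without the «m» flag does not
match before a trailing newline, and «\b» considers only ASCII word
characters. The sketch below checks both, plus the rejection of «\Z» and the
lookaround forms. It assumes Go's standard regexp package as a stand-in for
RE2; the helper names are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// compiles reports whether the pattern is accepted by the parser.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Without m, $ behaves like \z: it matches only at the very end.
	fmt.Println(matches(`foo$`, "foo\n"))         // false
	fmt.Println(matches(`foo\z`, "foo"))          // true
	fmt.Println(matches(`(?m)foo$`, "foo\nbar"))  // true

	// \b is ASCII-only: «é» is not a «\w» character, so there is a word
	// boundary between «f» and «é».
	fmt.Println(matches(`\bcaf\b`, "café")) // true

	// \Z and lookaround assertions are rejected at compile time.
	fmt.Println(compiles(`foo\Z`))       // false
	fmt.Println(compiles(`foo(?=bar)`))  // false
	fmt.Println(compiles(`(?<!foo)bar`)) // false
}
```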

Escape sequences:
\a	bell (== \007)
\f	form feed (== \014)
\t	horizontal tab (== \011)
\n	newline (== \012)
\r	carriage return (== \015)
\v	vertical tab character (== \013)
\*	literal «*», for any punctuation character «*»
\123	octal character code (up to three digits)
\x7F	hex character code (exactly two digits)
\x{10FFFF}	hex character code
\C	match a single byte even in UTF-8 mode
\Q...\E	literal text «...» even if «...» has punctuation

\1	backreference NOT SUPPORTED
\b	backspace NOT SUPPORTED (use «\010»)
\cK	control char ^K NOT SUPPORTED (use «\001» etc)
\e	escape NOT SUPPORTED (use «\033»)
\g1	backreference NOT SUPPORTED
\g{1}	backreference NOT SUPPORTED
\g{+1}	backreference NOT SUPPORTED
\g{-1}	backreference NOT SUPPORTED
\g{name}	named backreference NOT SUPPORTED
\g<name>	subroutine call NOT SUPPORTED
\g'name'	subroutine call NOT SUPPORTED
\k<name>	named backreference NOT SUPPORTED
\k'name'	named backreference NOT SUPPORTED
\lX	lowercase «X» NOT SUPPORTED
\ux	uppercase «x» NOT SUPPORTED
\L...\E	lowercase text «...» NOT SUPPORTED
\K	reset beginning of «$0» NOT SUPPORTED
\N{name}	named Unicode character NOT SUPPORTED
\R	line break NOT SUPPORTED
\U...\E	upper case text «...» NOT SUPPORTED
\X	extended Unicode sequence NOT SUPPORTED

\%d123	decimal character 123 NOT SUPPORTED vim
\%xFF	hex character FF NOT SUPPORTED vim
\%o123	octal character 123 NOT SUPPORTED vim
\%u1234	Unicode character 0x1234 NOT SUPPORTED vim
\%U12345678	Unicode character 0x12345678 NOT SUPPORTED vim
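
A few of the supported escapes, and two of the unsupported ones, are
exercised below. This is a sketch under the assumption that Go's standard
regexp package can stand in for RE2 here; «\C» is left out because that
package does not accept it. The helper names are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// compiles reports whether the pattern is accepted by the parser.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Two-digit hex, octal, and braced hex codes: "A", "A", Greek alpha.
	fmt.Println(matches(`^\x41\101\x{3B1}$`, "AAα")) // true

	// \Q...\E makes «+» literal; without it «+» is a repetition operator.
	fmt.Println(matches(`\Q1+1=2\E`, "1+1=2")) // true
	fmt.Println(matches(`1+1=2`, "1+1=2"))     // false

	// Backslash before punctuation gives the literal character.
	fmt.Println(matches(`^\*$`, "*")) // true

	// Backreferences and \e are rejected; the octal form works for escape.
	fmt.Println(compiles(`(a)\1`))          // false
	fmt.Println(compiles(`\e`))             // false
	fmt.Println(matches(`^\033$`, "\x1b"))  // true
}
```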

Character class elements:
x	single character
A-Z	character range (inclusive)
\d	Perl character class
[:foo:]	ASCII character class «foo»
\p{Foo}	Unicode character class «Foo»
\pF	Unicode character class «F» (one-letter name)

Named character classes as character class elements:
[\d]	digits (== \d)
[^\d]	not digits (== \D)
[\D]	not digits (== \D)
[^\D]	not not digits (== \d)
[[:name:]]	named ASCII class inside character class (== [:name:])
[^[:name:]]	named ASCII class inside negated character class (== [:^name:])
[\p{Name}]	named Unicode property inside character class (== \p{Name})
[^\p{Name}]	named Unicode property inside negated character class (== \P{Name})

Perl character classes (all ASCII-only):
\d	digits (== [0-9])
\D	not digits (== [^0-9])
\s	whitespace (== [\t\n\f\r ])
\S	not whitespace (== [^\t\n\f\r ])
\w	word characters (== [0-9A-Za-z_])
\W	not word characters (== [^0-9A-Za-z_])

\h	horizontal space NOT SUPPORTED
\H	not horizontal space NOT SUPPORTED
\v	vertical space NOT SUPPORTED
\V	not vertical space NOT SUPPORTED
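
Because the Perl classes are ASCII-only, non-ASCII digits and letters need
the Unicode classes instead. The sketch below contrasts the two, assuming
Go's standard regexp package as a stand-in for RE2; the helper names are
illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// compiles reports whether the pattern is accepted by the parser.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	arabicThree := "\u0663" // ARABIC-INDIC DIGIT THREE, category Nd

	fmt.Println(matches(`^\d$`, "3"))              // true
	fmt.Println(matches(`^\d$`, arabicThree))      // false: \d is [0-9] only
	fmt.Println(matches(`^\p{Nd}$`, arabicThree))  // true

	fmt.Println(matches(`^\w+$`, "café")) // false: «é» is not in [0-9A-Za-z_]
	fmt.Println(matches(`^\pL+$`, "café")) // true

	// \s does not include vertical tab.
	fmt.Println(matches(`\s`, "\v")) // false

	// \h is not supported.
	fmt.Println(compiles(`\h`)) // false
}
```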

ASCII character classes:
[[:alnum:]]	alphanumeric (== [0-9A-Za-z])
[[:alpha:]]	alphabetic (== [A-Za-z])
[[:ascii:]]	ASCII (== [\x00-\x7F])
[[:blank:]]	blank (== [\t ])
[[:cntrl:]]	control (== [\x00-\x1F\x7F])
[[:digit:]]	digits (== [0-9])
[[:graph:]]	graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
[[:lower:]]	lower case (== [a-z])
[[:print:]]	printable (== [ -~] == [ [:graph:]])
[[:punct:]]	punctuation (== [!-/:-@[-`{-~])
[[:space:]]	whitespace (== [\t\n\v\f\r ])
[[:upper:]]	upper case (== [A-Z])
[[:word:]]	word characters (== [0-9A-Za-z_])
[[:xdigit:]]	hex digit (== [0-9A-Fa-f])
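
The ASCII classes and their negated forms can be tried out as follows. The
sketch assumes Go's standard regexp package as a stand-in for RE2; matches
and firstMatch are illustrative helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// firstMatch returns the leftmost match of pattern in text ("" if none).
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	fmt.Println(matches(`^[[:xdigit:]]+$`, "DEADbeef01")) // true
	fmt.Println(matches(`^[[:xdigit:]]+$`, "xyz"))        // false

	// Two spellings of "not alphabetic".
	fmt.Println(firstMatch(`[[:^alpha:]]`, "abc1")) // 1
	fmt.Println(firstMatch(`[^[:alpha:]]`, "abc1")) // 1

	// A double negation: [^\D] is the same as \d.
	fmt.Println(firstMatch(`[^\D]+`, "ab123cd")) // 123

	// [[:space:]] includes vertical tab; [[:word:]] includes underscore.
	fmt.Println(matches(`[[:space:]]`, "\v"))            // true
	fmt.Println(firstMatch(`[[:word:]]+`, "foo_bar-baz")) // foo_bar

	fmt.Println(matches(`[[:punct:]]`, "!"), matches(`[[:punct:]]`, "a1 ")) // true false
}
```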

Unicode character class names--general category:
C	other
Cc	control
Cf	format
Cn	unassigned code points NOT SUPPORTED
Co	private use
Cs	surrogate
L	letter
LC	cased letter NOT SUPPORTED
L&	cased letter NOT SUPPORTED
Ll	lowercase letter
Lm	modifier letter
Lo	other letter
Lt	titlecase letter
Lu	uppercase letter
M	mark
Mc	spacing mark
Me	enclosing mark
Mn	non-spacing mark
N	number
Nd	decimal number
Nl	letter number
No	other number
P	punctuation
Pc	connector punctuation
Pd	dash punctuation
Pe	close punctuation
Pf	final punctuation
Pi	initial punctuation
Po	other punctuation
Ps	open punctuation
S	symbol
Sc	currency symbol
Sk	modifier symbol
Sm	math symbol
So	other symbol
Z	separator
Zl	line separator
Zp	paragraph separator
Zs	space separator

Unicode character class names--scripts:
Adlam
Ahom
Anatolian_Hieroglyphs
Arabic
Armenian
Avestan
Balinese
Bamum
Bassa_Vah
Batak
Bengali
Bhaiksuki
Bopomofo
Brahmi
Braille
Buginese
Buhid
Canadian_Aboriginal
Carian
Caucasian_Albanian
Chakma
Cham
Cherokee
Chorasmian
Common
Coptic
Cuneiform
Cypriot
Cypro_Minoan
Cyrillic
Deseret
Devanagari
Dives_Akuru
Dogra
Duployan
Egyptian_Hieroglyphs
Elbasan
Elymaic
Ethiopic
Georgian
Glagolitic
Gothic
Grantha
Greek
Gujarati
Gunjala_Gondi
Gurmukhi
Han
Hangul
Hanifi_Rohingya
Hanunoo
Hatran
Hebrew
Hiragana
Imperial_Aramaic
Inherited
Inscriptional_Pahlavi
Inscriptional_Parthian
Javanese
Kaithi
Kannada
Katakana
Kawi
Kayah_Li
Kharoshthi
Khitan_Small_Script
Khmer
Khojki
Khudawadi
Lao
Latin
Lepcha
Limbu
Linear_A
Linear_B
Lisu
Lycian
Lydian
Mahajani
Makasar
Malayalam
Mandaic
Manichaean
Marchen
Masaram_Gondi
Medefaidrin
Meetei_Mayek
Mende_Kikakui
Meroitic_Cursive
Meroitic_Hieroglyphs
Miao
Modi
Mongolian
Mro
Multani
Myanmar
Nabataean
Nag_Mundari
Nandinagari
New_Tai_Lue
Newa
Nko
Nushu
Nyiakeng_Puachue_Hmong
Ogham
Ol_Chiki
Old_Hungarian
Old_Italic
Old_North_Arabian
Old_Permic
Old_Persian
Old_Sogdian
Old_South_Arabian
Old_Turkic
Old_Uyghur
Oriya
Osage
Osmanya
Pahawh_Hmong
Palmyrene
Pau_Cin_Hau
Phags_Pa
Phoenician
Psalter_Pahlavi
Rejang
Runic
Samaritan
Saurashtra
Sharada
Shavian
Siddham
SignWriting
Sinhala
Sogdian
Sora_Sompeng
Soyombo
Sundanese
Syloti_Nagri
Syriac
Tagalog
Tagbanwa
Tai_Le
Tai_Tham
Tai_Viet
Takri
Tamil
Tangsa
Tangut
Telugu
Thaana
Thai
Tibetan
Tifinagh
Tirhuta
Toto
Ugaritic
Vai
Vithkuqi
Wancho
Warang_Citi
Yezidi
Yi
Zanabazar_Square
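
General category and script names are used with «\p{...}», «\P{...}», or
inside a bracketed class. The sketch below shows a few combinations. It
assumes Go's standard regexp package as a stand-in for RE2, and the exact set
of script names available depends on the Unicode version the library was
built with, so only long-established scripts are used. The helper names are
illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// firstMatch returns the leftmost match of pattern in text ("" if none).
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	// A script name: the first run of Greek letters.
	fmt.Println(firstMatch(`\p{Greek}+`, "abc αβγ def")) // αβγ

	// General categories: an uppercase letter followed by lowercase letters.
	// "\u00c9" is the precomposed letter É.
	fmt.Println(matches(`^\p{Lu}\p{Ll}+$`, "\u00c9lan")) // true

	// One-letter category name: N (number).
	fmt.Println(firstMatch(`\pN+`, "tel 42")) // 42

	// Scripts as character class elements.
	fmt.Println(matches(`^[\p{Han}\p{Hiragana}]+$`, "漢字かな")) // true
	fmt.Println(matches(`^[\p{Han}\p{Hiragana}]+$`, "漢字カナ")) // false: Katakana
}
```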

Vim character classes:
\i	identifier character NOT SUPPORTED vim
\I	«\i» except digits NOT SUPPORTED vim
\k	keyword character NOT SUPPORTED vim
\K	«\k» except digits NOT SUPPORTED vim
\f	file name character NOT SUPPORTED vim
\F	«\f» except digits NOT SUPPORTED vim
\p	printable character NOT SUPPORTED vim
\P	«\p» except digits NOT SUPPORTED vim
\s	whitespace character (== [ \t]) NOT SUPPORTED vim
\S	non-white space character (== [^ \t]) NOT SUPPORTED vim
\d	digits (== [0-9]) vim
\D	not «\d» vim
\x	hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim
\X	not «\x» NOT SUPPORTED vim
\o	octal digits (== [0-7]) NOT SUPPORTED vim
\O	not «\o» NOT SUPPORTED vim
\w	word character vim
\W	not «\w» vim
\h	head of word character NOT SUPPORTED vim
\H	not «\h» NOT SUPPORTED vim
\a	alphabetic NOT SUPPORTED vim
\A	not «\a» NOT SUPPORTED vim
\l	lowercase NOT SUPPORTED vim
\L	not lowercase NOT SUPPORTED vim
\u	uppercase NOT SUPPORTED vim
\U	not uppercase NOT SUPPORTED vim
\_x	«\x» plus newline, for any «x» NOT SUPPORTED vim

Vim flags:
\c	ignore case NOT SUPPORTED vim
\C	match case NOT SUPPORTED vim
\m	magic NOT SUPPORTED vim
\M	nomagic NOT SUPPORTED vim
\v	verymagic NOT SUPPORTED vim
\V	verynomagic NOT SUPPORTED vim
\Z	ignore differences in Unicode combining characters NOT SUPPORTED vim

Magic:
(?{code})	arbitrary Perl code NOT SUPPORTED perl
(??{code})	postponed arbitrary Perl code NOT SUPPORTED perl
(?n)	recursive call to regexp capturing group «n» NOT SUPPORTED
(?+n)	recursive call to relative group «+n» NOT SUPPORTED
(?-n)	recursive call to relative group «-n» NOT SUPPORTED
(?C)	PCRE callout NOT SUPPORTED pcre
(?R)	recursive call to entire regexp (== (?0)) NOT SUPPORTED
(?&name)	recursive call to named group NOT SUPPORTED
(?P=name)	named backreference NOT SUPPORTED
(?P>name)	recursive call to named group NOT SUPPORTED
(?(cond)true|false)	conditional branch NOT SUPPORTED
(?(cond)true)	conditional branch NOT SUPPORTED
(*ACCEPT)	make regexps more like Prolog NOT SUPPORTED
(*COMMIT)	NOT SUPPORTED
(*F)	NOT SUPPORTED
(*FAIL)	NOT SUPPORTED
(*MARK)	NOT SUPPORTED
(*PRUNE)	NOT SUPPORTED
(*SKIP)	NOT SUPPORTED
(*THEN)	NOT SUPPORTED
(*ANY)	set newline convention NOT SUPPORTED
(*ANYCRLF)	NOT SUPPORTED
(*CR)	NOT SUPPORTED
(*CRLF)	NOT SUPPORTED
(*LF)	NOT SUPPORTED
(*BSR_ANYCRLF)	set \R convention NOT SUPPORTED pcre
(*BSR_UNICODE)	NOT SUPPORTED pcre