PCRE2GREP(1)                General Commands Manual               PCRE2GREP(1)


NAME
       pcre2grep - a grep with Perl-compatible regular expressions.


SYNOPSIS
       pcre2grep [options] [long options] [pattern] [path1 path2 ...]


DESCRIPTION

       pcre2grep searches files for character patterns, in the same way as
       other grep commands do, but it uses the PCRE2 regular expression
       library to support patterns that are compatible with the regular
       expressions of Perl 5. See pcre2syntax(3) for a quick-reference summary
       of pattern syntax, or pcre2pattern(3) for a full description of the
       syntax and semantics of the regular expressions that PCRE2 supports.

       Patterns, whether supplied on the command line or in a separate file,
       are given without delimiters. For example:

         pcre2grep Thursday /etc/motd

       If you attempt to use delimiters (for example, by surrounding a pattern
       with slashes, as is common in Perl scripts), they are interpreted as
       part of the pattern. Quotes can of course be used to delimit patterns
       on the command line because they are interpreted by the shell, and
       indeed quotes are required if a pattern contains white space or shell
       metacharacters.

       The first argument that follows any option settings is treated as the
       single pattern to be matched when neither -e nor -f is present.
       Conversely, when one or both of these options are used to specify
       patterns, all arguments are treated as path names. At least one of -e,
       -f, or an argument pattern must be provided.
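
       The rule for separating the argument pattern from the path names can
       be sketched in Python as follows. This is an illustrative model only:
       the function name and its parameters are assumptions made for this
       example and are not part of pcre2grep.

```python
def split_arguments(positional, e_patterns=(), f_files=()):
    """Return (argument_pattern, paths) following the rule described above.

    positional -- the arguments left after option processing
    e_patterns -- patterns given with -e (hypothetical parameter)
    f_files    -- pattern files given with -f (hypothetical parameter)
    """
    if e_patterns or f_files:
        # -e and/or -f supply the patterns, so every argument is a path.
        return None, list(positional)
    if not positional:
        raise ValueError("one of -e, -f, or an argument pattern is required")
    # Otherwise the first argument is the single pattern to be matched.
    return positional[0], list(positional[1:])
```

       For instance, split_arguments(["Thursday", "/etc/motd"]) yields the
       pattern "Thursday" and the single path "/etc/motd".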

       If no files are specified, pcre2grep reads the standard input. The
       standard input can also be referenced by a name consisting of a single
       hyphen. For example:

         pcre2grep some-pattern file1 - file3

       By default, input files are searched line by line, so pattern
       assertions about the beginning and end of a subject string (^, $, \A,
       \Z, and \z) match at the beginning and end of each line. When a line
       matches a pattern, it is copied to the standard output, and if there is
       more than one file, the file name is output at the start of each line,
       followed by a colon. However, there are options that can change how
       pcre2grep behaves. For example, the -M option makes it possible to
       search for strings that span line boundaries. What defines a line
       boundary is controlled by the -N (--newline) option. The -h and -H
       options control whether or not file names are shown, and the -Z option
       changes the file name terminator to a zero byte.

       The amount of memory used for buffering files that are being scanned is
       controlled by parameters that can be set by the --buffer-size and
       --max-buffer-size options. The first of these sets the size of buffer
       that is obtained at the start of processing. If an input file contains
       very long lines, a larger buffer may be needed; this is handled by
       automatically extending the buffer, up to the limit specified by
       --max-buffer-size. The default values for these parameters can be set
       when pcre2grep is built; if nothing is specified, the defaults are set
       to 20KiB and 1MiB respectively. An error occurs if a line is too long
       and the buffer can no longer be expanded.

       The block of memory that is actually used is three times the "buffer
       size", to allow for buffering "before" and "after" lines. If the buffer
       size is too small, fewer than requested "before" and "after" lines may
       be output.

       When matching with a multiline pattern, the size of the buffer must be
       at least half of the maximum match expected or the pattern might fail
       to match.
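
       The buffer arithmetic above can be worked through in a few lines of
       Python. The constants are the documented defaults, which a particular
       build may override; the function names are assumptions made for this
       example.

```python
import math

KIB = 1024
MIB = 1024 * KIB

DEFAULT_BUFFER_SIZE = 20 * KIB      # documented default for --buffer-size
DEFAULT_MAX_BUFFER_SIZE = 1 * MIB   # documented default for --max-buffer-size


def memory_block_bytes(buffer_size):
    """The block actually used is three times the "buffer size"."""
    return 3 * buffer_size


def min_buffer_for_multiline(max_match_bytes):
    """Smallest buffer size that is at least half the longest expected match."""
    return math.ceil(max_match_bytes / 2)


print(memory_block_bytes(DEFAULT_BUFFER_SIZE))      # 61440 (60 KiB)
print(memory_block_bytes(DEFAULT_MAX_BUFFER_SIZE))  # 3145728 (3 MiB)
print(min_buffer_for_multiline(100 * KIB))          # 51200
```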

       Patterns can be no longer than 8KiB or BUFSIZ bytes, whichever is the
       greater. BUFSIZ is defined in <stdio.h>. When there is more than one
       pattern (specified by the use of -e and/or -f), each pattern is applied
       to each line in the order in which they are defined, except that all
       the -e patterns are tried before the -f patterns.

       By default, as soon as one pattern matches a line, no further patterns
       are considered. However, if --colour (or --color) is used to colour the
       matching substrings, or if --only-matching, --file-offsets,
       --line-offsets, or --output is used to output only the part of the line
       that matched (either shown literally, or as an offset), the behaviour
       is different. In this situation, all the patterns are applied to the
       line. If there is more than one match, the one that begins nearest to
       the start of the subject is processed; if there is more than one match
       at that position, the one with the longest matching substring is
       processed; if the matching substrings are equal, the first match found
       is processed.

       Scanning with all the patterns resumes immediately following the match,
       so that later matches on the same line can be found. Note, however,
       that an overlapping match that starts in the middle of another match
       will not be processed.
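
       The selection rule (earliest start, then longest, then first found)
       and the resumption of scanning after each match can be modelled with
       the sketch below. It is not the pcre2grep implementation: Python's re
       module stands in for PCRE2, the function name is an assumption, and
       the sketch assumes that no pattern can match an empty string.

```python
import re


def select_matches(patterns, line):
    """Yield the substrings chosen by the earliest/longest/first-found rule."""
    compiled = [re.compile(p) for p in patterns]
    pos = 0
    while pos <= len(line):
        candidates = []
        for order, regex in enumerate(compiled):
            m = regex.search(line, pos)
            if m and m.end() > m.start():
                # Sort key: earliest start, then longest, then first found.
                length = m.end() - m.start()
                candidates.append((m.start(), -length, order, m))
        if not candidates:
            break
        best = min(candidates, key=lambda c: c[:3])[3]
        yield best.group(0)
        pos = best.end()  # resume scanning immediately after the match
```

       With the patterns "abc" and "cd" applied to the line "abcd", only
       "abc" is produced, because the "cd" match starts in the middle of the
       match that was already processed.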

       The above behaviour was changed at release 10.41 to be more compatible
       with GNU grep. In earlier releases, pcre2grep did not recognize matches
       from later patterns that were earlier in the subject.

       Patterns that can match an empty string are accepted, but empty string
       matches are never recognized. An example is the pattern
       "(super)?(man)?", in which all components are optional. This pattern
       finds all occurrences of both "super" and "man"; the output differs
       from matching with "super|man" when only the matching substrings are
       being shown.

       If the LC_ALL or LC_CTYPE environment variable is set, pcre2grep uses
       the value to set a locale when calling the PCRE2 library. The --locale
       option can be used to override this.


SUPPORT FOR COMPRESSED FILES

       Compile-time options for pcre2grep can set it up to use libz or libbz2
       for reading compressed files whose names end in .gz or .bz2,
       respectively. You can find out whether your pcre2grep binary has
       support for one or both of these file types by running it with the
       --help option. If the appropriate support is not present, all files are
       treated as plain text. The standard input is always so treated. If a
       file with a .gz or .bz2 extension is not in fact compressed, it is read
       as a plain text file. When input is from a compressed .gz or .bz2 file,
       the --line-buffered option is ignored.


BINARY FILES

       By default, a file that contains a binary zero byte within the first
       1024 bytes is identified as a binary file, and is processed specially.
       However, if the newline type is specified as NUL, that is, the line
       terminator is a binary zero, the test for a binary file is not applied.
       See the --binary-files option for a means of changing the way binary
       files are handled.
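
       The default binary-file test amounts to the following check, shown
       here as a minimal Python sketch. The function and parameter names are
       assumptions made for this example.

```python
def looks_binary(data: bytes, newline_is_nul: bool = False) -> bool:
    """Apply the default test: a zero byte within the first 1024 bytes."""
    if newline_is_nul:
        # With NUL as the line terminator the test is not applied.
        return False
    return b"\x00" in data[:1024]
```

       A zero byte that first appears after offset 1023 does not cause the
       file to be identified as binary under this test.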


BINARY ZEROS IN PATTERNS

       Patterns passed from the command line are strings that are terminated
       by a binary zero, so cannot contain internal zeros. However, patterns
       that are read from a file via the -f option may contain binary zeros.


OPTIONS

       The order in which some of the options appear can affect the output.
       For example, both the -H and -l options affect the printing of file
       names. Whichever comes later in the command line will be the one that
       takes effect. Similarly, except where noted below, if an option is
       given twice, the later setting is used. Numerical values for options
       may be followed by K or M, to signify multiplication by 1024 or
       1024*1024 respectively.
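
       The K and M multipliers can be illustrated with a short Python
       sketch. The function name is an assumption; only the upper-case
       suffixes named above are handled, and any other validation is left to
       Python's int().

```python
def parse_number(text: str) -> int:
    """Parse an option value such as "20K" or "1M"."""
    multipliers = {"K": 1024, "M": 1024 * 1024}
    if text and text[-1] in multipliers:
        return int(text[:-1]) * multipliers[text[-1]]
    return int(text)


print(parse_number("20K"))  # 20480
print(parse_number("1M"))   # 1048576
```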

       --        This terminates the list of options. It is useful if the next
                 item on the command line starts with a hyphen but is not an
                 option. This allows for the processing of patterns and file
                 names that start with hyphens.

       -A number, --after-context=number
                 Output up to number lines of context after each matching
                 line. Fewer lines are output if the next match or the end of
                 the file is reached, or if the processing buffer size has
                 been set too small. If file names and/or line numbers are
                 being output, a hyphen separator is used instead of a colon
                 for the context lines (the -Z option can be used to change
                 the file name terminator to a zero byte). A line containing
                 "--" is output between each group of lines, unless they are
                 in fact contiguous in the input file. The value of number is
                 expected to be relatively small. When -c is used, -A is
                 ignored.

       -a, --text
                 Treat binary files as text. This is equivalent to
                 --binary-files=text.

       --allow-lookaround-bsk
                 PCRE2 now forbids the use of \K in lookarounds by default, in
                 line with Perl. This option causes pcre2grep to set the
                 PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK option, which enables this
                 somewhat dangerous usage.

       -B number, --before-context=number
                 Output up to number lines of context before each matching
                 line. Fewer lines are output if the previous match or the
                 start of the file is within number lines, or if the
                 processing buffer size has been set too small. If file names
                 and/or line numbers are being output, a hyphen separator is
                 used instead of a colon for the context lines (the -Z option
                 can be used to change the file name terminator to a zero
                 byte). A line containing "--" is output between each group of
                 lines, unless they are in fact contiguous in the input file.
                 The value of number is expected to be relatively small. When
                 -c is used, -B is ignored.

       --binary-files=word
                 Specify how binary files are to be processed. If the word is
                 "binary" (the default), pattern matching is performed on
                 binary files, but the only output is "Binary file <name>
                 matches" when a match succeeds. If the word is "text", which
                 is equivalent to the -a or --text option, binary files are
                 processed in the same way as any other file. In this case,
                 when a match succeeds, the output may be binary garbage,
                 which can have nasty effects if sent to a terminal. If the
                 word is "without-match", which is equivalent to the -I
                 option, binary files are not processed at all; they are
                 assumed not to be of interest and are skipped without causing
                 any output or affecting the return code.

       --buffer-size=number
                 Set the parameter that controls how much memory is obtained
                 at the start of processing for buffering files that are being
                 scanned. See also --max-buffer-size below.

       -C number, --context=number
                 Output number lines of context both before and after each
                 matching line. This is equivalent to setting both -A and -B
                 to the same value.
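
                 How the -A, -B, and -C settings turn matching lines into
                 groups of output lines can be modelled as below. This is a
                 sketch under stated assumptions, not the pcre2grep
                 implementation: the function name is invented, line numbers
                 are 1-based, and buffer-size limits are ignored. A "--"
                 line would be output between the groups it returns.

```python
def context_groups(match_lines, total_lines, before=0, after=0):
    """Return (first, last) line ranges, 1-based and inclusive.

    Ranges that overlap or are contiguous are merged into one group.
    """
    groups = []
    for n in sorted(match_lines):
        first = max(1, n - before)
        last = min(total_lines, n + after)
        if groups and first <= groups[-1][1] + 1:
            # Overlapping or contiguous with the previous group: extend it.
            groups[-1][1] = max(groups[-1][1], last)
        else:
            groups.append([first, last])
    return [tuple(g) for g in groups]
```

                 For matches on lines 3 and 10 of a 12-line file with one
                 line of context each way, the groups are lines 2 to 4 and
                 lines 9 to 11.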

       -c, --count
                 Do not output lines from the files that are being scanned;
                 instead output the number of lines that would have been
                 shown, either because they matched, or, if -v is set, because
                 they failed to match. By default, this count is exactly the
                 same as the number of lines that would have been output, but
                 if the -M (multiline) option is used (without -v), there may
                 be more suppressed lines than the count (that is, the number
                 of matches).

                 If no lines are selected, the number zero is output. If
                 several files are being scanned, a count is output for each
                 of them and the -t option can be used to cause a total to be
                 output at the end. However, if the --files-with-matches
                 option is also used, only those files whose counts are
                 greater than zero are listed. When -c is used, the -A, -B,
                 and -C options are ignored.

       --colour, --color
                 If this option is given without any data, it is equivalent to
                 "--colour=auto". If data is required, it must be given in
                 the same shell item, separated by an equals sign.

       --colour=value, --color=value
                 This option specifies under what circumstances the parts of a
                 line that matched a pattern should be coloured in the output.
                 It is ignored if --file-offsets, --line-offsets, or --output
                 is set. By default, output is not coloured. The value for the
                 --colour option (which is optional, see above) may be
                 "never", "always", or "auto". In the latter case, colouring
                 happens only if the standard output is connected to a
                 terminal. More resources are used when colouring is enabled,
                 because pcre2grep has to search for all possible matches in a
                 line, not just one, in order to colour them all.

                 The colour that is used can be specified by setting one of
                 the environment variables PCRE2GREP_COLOUR, PCRE2GREP_COLOR,
                 PCREGREP_COLOUR, or PCREGREP_COLOR, which are checked in that
                 order. If none of these are set, pcre2grep looks for
                 GREP_COLORS or GREP_COLOR (in that order). The value of the
                 variable should be a string of two numbers, separated by a
                 semicolon, except in the case of GREP_COLORS, which must
                 start with "ms=" or "mt=" followed by two semicolon-separated
                 colours, terminated by the end of the string or by a colon.
                 If GREP_COLORS does not start with "ms=" or "mt=" it is
                 ignored, and GREP_COLOR is checked.

                 If the string obtained from one of the above variables
                 contains any characters other than semicolon or digits, the
                 setting is ignored and the default colour is used. The string
                 is copied directly into the control string for setting colour
                 on a terminal, so it is your responsibility to ensure that
                 the values make sense. If no relevant environment variable is
                 set, the default is "1;31", which gives red.
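
                 The lookup order for the colour string can be sketched in
                 Python as follows. This is a model of the description above
                 rather than the actual code: the function name is an
                 assumption, "set" is taken to mean present in the
                 environment mapping, and an empty value is passed through
                 unchanged because the text does not say how it is treated.

```python
import re

DEFAULT_COLOUR = "1;31"  # red, the documented default
OWN_VARIABLES = ("PCRE2GREP_COLOUR", "PCRE2GREP_COLOR",
                 "PCREGREP_COLOUR", "PCREGREP_COLOR")


def resolve_colour(env):
    """Pick the colour string from a mapping of environment variables."""
    value = None
    for name in OWN_VARIABLES:
        if name in env:
            value = env[name]
            break
    if value is None:
        grep_colors = env.get("GREP_COLORS")
        if grep_colors is not None and grep_colors.startswith(("ms=", "mt=")):
            # Keep the text after "ms="/"mt=" up to the first colon, if any.
            value = grep_colors[3:].split(":", 1)[0]
        else:
            # GREP_COLORS is unset or unusable, so GREP_COLOR is checked.
            value = env.get("GREP_COLOR")
    if value is None or not re.fullmatch(r"[0-9;]*", value):
        # Unset, or contains something other than digits and semicolons.
        return DEFAULT_COLOUR
    return value
```

                 Passing os.environ as the mapping applies the same lookup
                 to the current process environment.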

       -D action, --devices=action
                 If an input path is not a regular file or a directory,
                 "action" specifies how it is to be processed. Valid values
                 are "read" (the default) or "skip" (silently skip the path).

       -d action, --directories=action
                 If an input path is a directory, "action" specifies how it is
                 to be processed. Valid values are "read" (the default in
                 non-Windows environments, for compatibility with GNU grep),
                 "recurse" (equivalent to the -r option), or "skip" (silently
                 skip the path, the default in Windows environments). In the
                 "read" case, directories are read as if they were ordinary
                 files. In some operating systems the effect of reading a
                 directory like this is an immediate end-of-file; in others it
                 may provoke an error.

       --depth-limit=number
                 See --match-limit below.

       -E, --case-restrict
                 When case distinctions are being ignored in Unicode mode, two
                 ASCII letters (K and S) will by default match Unicode
                 characters U+212A (Kelvin sign) and U+017F (long S)
                 respectively, as well as their lower case ASCII counterparts.
                 When this option is set, case equivalences are restricted
                 such that no ASCII character matches a non-ASCII character,
                 and vice versa.
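
                 The two special cases come from Unicode's own case
                 mappings, which the Python sketch below makes visible. It
                 models single characters only, uses Python's str.casefold
                 in place of PCRE2's caseless matching, and the function
                 name is an assumption made for this example.

```python
KELVIN_SIGN = "\u212a"  # U+212A, case-equivalent to K and k
LONG_S = "\u017f"       # U+017F, case-equivalent to S and s


def caseless_equal(a, b, case_restrict=False):
    """Compare two single characters caselessly.

    With case_restrict set, an ASCII character never equals a non-ASCII one.
    """
    if case_restrict and a.isascii() != b.isascii():
        return False
    return a.casefold() == b.casefold()


print(caseless_equal("K", KELVIN_SIGN))                      # True
print(caseless_equal("K", KELVIN_SIGN, case_restrict=True))  # False
print(caseless_equal("S", "s", case_restrict=True))          # True
```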

       -e pattern, --regex=pattern, --regexp=pattern
                 Specify a pattern to be matched. This option can be used
                 multiple times in order to specify several patterns. It can
                 also be used as a way of specifying a single pattern that
                 starts with a hyphen. When -e is used, no argument pattern is
                 taken from the command line; all arguments are treated as
                 file names. There is no limit to the number of patterns. They
                 are applied to each line in the order in which they are
                 defined.

                 If -f is used with -e, the command line patterns are matched
                 first, followed by the patterns from the file(s), independent
                 of the order in which these options are specified.

       --exclude=pattern
                 Files (but not directories) whose names match the pattern are
                 skipped without being processed. This applies to all files,
                 whether listed on the command line, obtained from
                 --file-list, or by scanning a directory. The pattern is a
                 PCRE2 regular expression, and is matched against the final
                 component of the file name, not the entire path. The -F, -w,
                 and -x options do not apply to this pattern. The option may
                 be given any number of times in order to specify multiple
                 patterns. If a file name matches both an --include and an
                 --exclude pattern, it is excluded. There is no short form for
                 this option.
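
                 Matching against the final component rather than the whole
                 path can be illustrated as follows. This sketch uses
                 Python's re module in place of PCRE2 and assumes an
                 unanchored search; the function name is invented for the
                 example.

```python
import os.path
import re


def is_excluded(path, exclude_patterns):
    """True if the final component of path matches any exclude pattern."""
    name = os.path.basename(path)
    return any(re.search(p, name) for p in exclude_patterns)


print(is_excluded("src/build/main.o", [r"\.o$"]))  # True
print(is_excluded("build.o/main.c", [r"\.o$"]))    # False
```

                 The second call returns False because only "main.c" is
                 tested; the directory name "build.o" plays no part.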
329*22dc650dSSadaf Ebrahimi
330*22dc650dSSadaf Ebrahimi       --exclude-from=filename
331*22dc650dSSadaf Ebrahimi                 Treat  each  non-empty  line  of  the file as the data for an
332*22dc650dSSadaf Ebrahimi                 --exclude option. What constitutes a newline when reading the
333*22dc650dSSadaf Ebrahimi                 file is the operating system's default. The --newline  option
334*22dc650dSSadaf Ebrahimi                 has  no  effect on this option. This option may be given more
335*22dc650dSSadaf Ebrahimi                 than once in order to specify a number of files to read.
336*22dc650dSSadaf Ebrahimi
337*22dc650dSSadaf Ebrahimi       --exclude-dir=pattern
338*22dc650dSSadaf Ebrahimi                 Directories whose names match the pattern are skipped without
339*22dc650dSSadaf Ebrahimi                 being processed, whatever the setting of the --recursive  op-
340*22dc650dSSadaf Ebrahimi                 tion.  This applies to all directories, whether listed on the
341*22dc650dSSadaf Ebrahimi                 command line, obtained from --file-list,  or  by  scanning  a
342*22dc650dSSadaf Ebrahimi                 parent  directory. The pattern is a PCRE2 regular expression,
343*22dc650dSSadaf Ebrahimi                 and is matched against the final component of  the  directory
344*22dc650dSSadaf Ebrahimi                 name,  not the entire path. The -F, -w, and -x options do not
345*22dc650dSSadaf Ebrahimi                 apply to this pattern. The option may be given any number  of
346*22dc650dSSadaf Ebrahimi                 times  in order to specify more than one pattern. If a direc-
347*22dc650dSSadaf Ebrahimi                 tory matches both --include-dir and --exclude-dir, it is  ex-
348*22dc650dSSadaf Ebrahimi                 cluded. There is no short form for this option.
349*22dc650dSSadaf Ebrahimi
350*22dc650dSSadaf Ebrahimi       -F, --fixed-strings
351*22dc650dSSadaf Ebrahimi                 Interpret  each  data-matching  pattern  as  a  list of fixed
352*22dc650dSSadaf Ebrahimi                 strings, separated by newlines, instead of as a  regular  ex-
353*22dc650dSSadaf Ebrahimi                 pression. What constitutes a newline for this purpose is con-
354*22dc650dSSadaf Ebrahimi                 trolled by the --newline option. The -w (match as a word) and
355*22dc650dSSadaf Ebrahimi                 -x  (match whole line) options can be used with -F.  They ap-
356*22dc650dSSadaf Ebrahimi                 ply to each of the fixed strings. A line is selected  if  any
357*22dc650dSSadaf Ebrahimi                 of the fixed strings are found in it (subject to -w or -x, if
358*22dc650dSSadaf Ebrahimi                 present).  This  option applies only to the patterns that are
359*22dc650dSSadaf Ebrahimi                 matched against the contents of files; it does not  apply  to
360*22dc650dSSadaf Ebrahimi                 patterns  specified  by any of the --include or --exclude op-
361*22dc650dSSadaf Ebrahimi                 tions.
362*22dc650dSSadaf Ebrahimi
363*22dc650dSSadaf Ebrahimi       -f filename, --file=filename
                 Read patterns from the file, one per line.  As  is  the  case
                 with  patterns  on  the command line, no delimiters should be
                 used. What constitutes a newline when reading the file is the
                 operating system's default interpretation of \n.  The  --new-
                 line  option  has  no  effect  on this option. Trailing white
                 space is removed from each line, and blank lines are ignored.
                 An empty file contains  no  patterns  and  therefore  matches
                 nothing.  Patterns  read  from a file in this way may contain
                 binary zeros, which are treated as ordinary data characters.

                 If this option is given more than  once,  all  the  specified
                 files  are read. A data line is output if any of the patterns
                 match it. A file name can be given as "-"  to  refer  to  the
                 standard  input.  When  -f is used, patterns specified on the
                 command line using -e may also be present; they  are  matched
                 before the file's patterns. However, no pattern is taken from
                 the non-option arguments on the command line; all of them are
                 treated as the names of paths to be searched.

       --file-list=filename
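
                 As a minimal sketch of the rules above (the scratch directory
                 and file names are hypothetical, and a POSIX shell with
                 mktemp is assumed), the following combines patterns read from
                 a file with one supplied by -e:

```shell
# Hypothetical scratch files, created only for this illustration.
dir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\ndelta\n' > "$dir/data.txt"
# Line 2 is blank (ignored); line 3 has trailing spaces (removed).
printf 'alp\n\ngam   \n' > "$dir/pats.txt"

# Patterns come from -f and -e; the only non-option argument is a path.
out=$(pcre2grep -f "$dir/pats.txt" -e '^del' "$dir/data.txt")
printf '%s\n' "$out"    # prints alpha, gamma, and delta, one per line
rm -r "$dir"
```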
                 Read a list of  files  and/or  directories  that  are  to  be
                 scanned from the given file, one per line. What constitutes a
                 newline  when  reading the file is the operating system's de-
                 fault. Trailing white space is removed from  each  line,  and
                 blank lines are ignored. These paths are processed before any
                 that  are  listed  on  the command line. The file name can be
                 given as "-" to refer to the standard input.  If  --file  and
                 --file-list  are  both  specified  as  "-", patterns are read
                 first. This is useful only when the standard input is a  ter-
                 minal,  from  which  further lines (the list of files) can be
                 read after an end-of-file indication. If this option is given
                 more than once, all the specified files are read.

       --file-offsets
                 Instead of showing lines or parts of lines that  match,  show
                 each  match  as  an  offset  from the start of the file and a
                 length, separated by a comma. In this mode, --colour  has  no
                 effect,  and no context is shown. That is, the -A, -B, and -C
                 options are ignored. If there is more than  one  match  in  a
                 line,  each of them is shown separately. This option is mutu-
                 ally exclusive with  --output,  --line-offsets,  and  --only-
                 matching.

       --group-separator=text
                 Output this text string instead of two hyphens between groups
                 of  lines  when -A, -B, or -C is in use. See also --no-group-
                 separator.

       -H, --with-filename
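
                 For illustration only, a small sketch with made-up input read
                 from the standard input; offsets are counted from the start
                 of the input, so the newline ending the first line is
                 included:

```shell
# "one two\n" occupies bytes 0-7, so the second line starts at offset 8.
out=$(printf 'one two\nthree two\n' | pcre2grep --file-offsets 'two')
printf '%s\n' "$out"    # prints 4,3 and then 14,3
```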
                 Force the inclusion of the file name at the start  of  output
                 lines when searching a single file. The file name is not nor-
                 mally  shown  in  this case.  By default, for matching lines,
                 the file name is followed by a colon; for  context  lines,  a
                 hyphen separator is used. The -Z option can be used to change
                 the terminator to a zero byte. If a line number is also being
                 output, it follows the file name. When the -M option causes a
                 pattern  to  match more than one line, only the first is pre-
                 ceded by the file name. This option  overrides  any  previous
                 -h, -l, or -L options.

       -h, --no-filename
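
                 A brief sketch of the resulting layout when -H is combined
                 with -n (the file name one.txt is hypothetical; a POSIX shell
                 with mktemp is assumed):

```shell
dir=$(mktemp -d)
printf 'hay\nneedle\n' > "$dir/one.txt"
# With a single file the name is normally omitted; -H forces it, and the
# line number requested by -n follows the file name.
out=$(cd "$dir" && pcre2grep -H -n 'needle' one.txt)
printf '%s\n' "$out"    # prints one.txt:2:needle
rm -r "$dir"
```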
                 Suppress the output of file names when searching multiple
                 files. File names are normally shown when multiple files are
                 searched. By default, for matching lines, the file name is
                 followed by a colon; for context lines, a hyphen separator is
                 used. The -Z option can be used to change the terminator to a
                 zero byte. If a line number is also being output, it follows
                 the file name. This option overrides any previous -H, -L, or
                 -l options.

       --heap-limit=number
                 See --match-limit below.

       --help    Output a help message, giving brief details  of  the  command
                 options  and  file type support, and then exit. Anything else
                 on the command line is ignored.

       -I        Ignore  binary  files.  This  is  equivalent   to   --binary-
                 files=without-match.

       -i, --ignore-case
                 Ignore  upper/lower  case distinctions when pattern matching.
                 This applies when matching path names for inclusion or exclu-
                 sion as well as when matching lines in files.

       --include=pattern
                 If any --include patterns are specified, the only files  that
                 are processed are those whose names match one of the patterns
                 and  do  not match an --exclude pattern. This option does not
                 affect directories, but it  applies  to  all  files,  whether
                 listed  on the command line, obtained from --file-list, or by
                 scanning a directory. The pattern is a PCRE2 regular  expres-
                 sion,  and is matched against the final component of the file
                 name, not the entire path. The -F, -w, and -x options do  not
                 apply  to this pattern. The option may be given any number of
                 times. If a file name matches both an --include and an  --ex-
                 clude  pattern,  it  is excluded.  There is no short form for
                 this option.

       --include-from=filename
                 Treat each non-empty line of the file  as  the  data  for  an
                 --include option. What constitutes a newline for this purpose
                 is  the  operating system's default. The --newline option has
                 no effect on this option. This option may be given any number
                 of times; all the files are read.

       --include-dir=pattern
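
                 Because the pattern is a regular expression rather than a
                 shell glob, a file suffix is selected with something like
                 \.txt$ instead of *.txt. A minimal sketch, assuming a build
                 with directory recursion (-r) available and using
                 hypothetical file names:

```shell
dir=$(mktemp -d)
mkdir "$dir/tree"
printf 'hit\n' > "$dir/tree/a.txt"
printf 'hit\n' > "$dir/tree/b.log"
# Only files whose final path component matches \.txt$ are searched.
out=$(cd "$dir" && pcre2grep -r --include='\.txt$' 'hit' tree)
printf '%s\n' "$out"    # prints tree/a.txt:hit
rm -r "$dir"
```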
                 If any --include-dir patterns are specified, the only  direc-
                 tories  that are processed are those whose names match one of
                 the patterns and do not match an --exclude-dir pattern.  This
                 applies  to  all  directories,  whether listed on the command
                 line, obtained from --file-list, or by scanning a parent  di-
                 rectory.  The  pattern  is a PCRE2 regular expression, and is
                 matched against the final component of  the  directory  name,
                 not  the entire path. The -F, -w, and -x options do not apply
                 to this pattern. The option may be given any number of times.
                 If a directory matches both --include-dir and  --exclude-dir,
                 it is excluded. There is no short form for this option.

       -L, --files-without-match
                 Instead  of  outputting lines from the files, just output the
                 names of the files that do not contain any lines  that  would
                 have  been  output. Each file name is output once, on a sepa-
                 rate line by default, but if the -Z option is set,  they  are
                 separated  by  zero  bytes  instead  of newlines. This option
                 overrides any previous -H, -h, or -l options.

       -l, --files-with-matches
                 Instead of outputting lines from the files, just  output  the
                 names of the files containing lines that would have been out-
                 put.  Each  file name is output once, on a separate line, but
                 if the -Z option is set, they are separated by zero bytes in-
                 stead of newlines. Searching normally  stops  as  soon  as  a
                 matching  line is found in a file. However, if the -c (count)
                 option is also used, matching continues in  order  to  obtain
                 the  correct  count,  and  those files that have at least one
                 match are listed along with their counts. Using  this  option
                 with  -c is a way of suppressing the listing of files with no
                 matches that occurs with -c on its own. This option overrides
                 any previous -H, -h, or -L options.

       --label=name
                 This option supplies a name to be used for the standard input
                 when file names are being output. If not supplied, "(standard
                 input)" is used. There is no short form for this option.

       --line-buffered
                 When this option is given, non-compressed input is  read  and
                 processed  line by line, and the output is flushed after each
                 write. By default, input is  read  in  large  chunks,  unless
                 pcre2grep  can  determine that it is reading from a terminal,
                 which is currently possible only in Unix-like environments or
                 Windows. Output to a terminal is normally flushed
                 automatically by the operating system. This option can be
                 useful when the input or output is attached to a pipe and you
                 do not want pcre2grep to buffer up large amounts of data.
                 However, its use will affect performance, and the -M
                 (multiline) option ceases to work. When input is from a
                 compressed .gz or .bz2 file, --line-buffered is ignored.

       --line-offsets
                 Instead  of  showing lines or parts of lines that match, show
                 each match as a line number, the offset from the start of the
                 line, and a length. The line number is terminated by a  colon
                 (as  usual; see the -n option), and the offset and length are
                 separated by a comma. In this mode, --colour has  no  effect,
                 and  no context is shown. That is, the -A, -B, and -C options
                 are ignored. If there is more than one match in a line,  each
                 of  them  is shown separately. This option is mutually exclu-
                 sive with --output, --file-offsets, and --only-matching.

       --locale=locale-name
                 This option specifies a locale to be used for pattern  match-
                 ing.  It  overrides the value in the LC_ALL or LC_CTYPE envi-
                 ronment variables. If no locale is specified, the  PCRE2  li-
                 brary's default (usually the "C" locale) is used. There is no
                 short form for this option.

       -M, --multiline
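
                 A short sketch with made-up input; each match is reported as
                 line:offset,length, with the offset counted from the start of
                 its own line:

```shell
# Line 2 contains two matches, so it produces two output lines.
out=$(printf 'one two\nthree two two\n' | pcre2grep --line-offsets 'two')
printf '%s\n' "$out"    # prints 1:4,3 then 2:6,3 then 2:10,3
```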
                 Allow  patterns to match more than one line. When this option
                 is set, the PCRE2 library is called in "multiline" mode,  and
                 a  match  is  allowed to continue past the end of the initial
                 line and onto one or more subsequent lines.

                 Patterns used with -M may usefully  contain  literal  newline
                 characters  and  internal  occurrences of ^ and $ characters,
                 because in multiline mode these can match  at  internal  new-
                 lines.  Because  pcre2grep is scanning multiple lines, the \Z
                 and \z assertions match only at the end of the last  line  in
                 the file.  The \A assertion matches at the start of the first
                 line  of a match. This can be any line in the file; it is not
                 anchored to the first line.

                 The output for a successful match may consist  of  more  than
                 one  line.  The  first  line  is  the line in which the match
                 started, and the last line is the line  in  which  the  match
                 ended.  If  the  matched string ends with a newline sequence,
                 the output ends at the end of that line. If -v is  set,  none
                 of  the  lines in a multi-line match are output. Once a match
                 has been handled, scanning restarts at the beginning  of  the
                 line after the one in which the match ended.

                 The  newline  sequence  that separates multiple lines must be
                 matched as part of the pattern.  For  example,  to  find  the
                 phrase  "regular  expression" in a file where "regular" might
                 be at the end of a line and "expression" at the start of  the
                 next line, you could use this command:

                   pcre2grep -M 'regular\s+expression' <file>

                 The \s escape sequence matches any white space character, in-
                 cluding  newlines, and is followed by + so as to match trail-
                 ing white space on the first line as well  as  possibly  han-
                 dling a two-character newline sequence.

                 There  is a limit to the number of lines that can be matched,
                 imposed by the way that pcre2grep buffers the input  file  as
                 it  scans  it.  With  a sufficiently large processing buffer,
                 this should not be a problem.

                 The -M option does not work when input is read line  by  line
                 (see --line-buffered).

       -m number, --max-count=number
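
                 As an illustrative sketch with made-up input, the command
                 from the example above can be combined with -n to show that
                 both lines of a multiline match are output but only the first
                 is numbered:

```shell
# The phrase is split across lines 1 and 2; line 3 is not part of the match.
out=$(printf 'a regular\nexpression here\nunrelated\n' |
      pcre2grep -M -n 'regular\s+expression')
printf '%s\n' "$out"    # prints "1:a regular" followed by "expression here"
```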
                 Stop  processing after finding number matching lines, or non-
                 matching lines if -v is also set. Any trailing context  lines
                 are  output  after  the  final match. In multiline mode, each
                 multiline match counts as just one line for this purpose.  If
                 this  limit is reached when reading the standard input from a
                 regular file, the file is left positioned just after the last
                 matching line.  If -c is also set, the count that  is  output
                 is  never  greater  than number. This option has no effect if
                 used with -L, -l, or -q, or when just checking for a match in
                 a binary file.

       --match-limit=number
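
                 A minimal sketch with made-up input, assuming a pcre2grep
                 release recent enough to provide -m:

```shell
# Three lines match 'a', but processing stops after the second one.
out=$(printf 'a1\nb\na2\na3\n' | pcre2grep -m 2 'a')
printf '%s\n' "$out"    # prints a1 and then a2
```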
                 Processing some regular expression patterns may take  a  very
                 long time to search for all possible matching strings. Others
                 may  require  a  very large amount of memory. There are three
                 options that set resource limits for matching.

                 The --match-limit option provides a means of limiting comput-
                 ing resource usage when processing patterns that are not  go-
                 ing to match, but which have a very large number of possibil-
                 ities in their search trees. The classic example is a pattern
                 that  uses  nested unlimited repeats. Internally, PCRE2 has a
                 counter that is incremented each time around  its  main  pro-
                 cessing  loop.  If the value set by --match-limit is reached,
                 an error occurs.

                 The --heap-limit option specifies, as a number  of  kibibytes
                 (units of 1024 bytes), the maximum amount of heap memory that
                 may be used for matching.

                 The  --depth-limit  option  limits  the depth of nested back-
                 tracking points, which indirectly limits the amount of memory
                 that is used. The amount of memory needed for each backtrack-
                 ing point depends on the number of capturing  parentheses  in
                 the pattern, so the amount of memory that is used before this
                 limit  acts  varies from pattern to pattern. This limit is of
                 use only if it is set smaller than --match-limit.

                 There are no short forms for these options. The default  lim-
                 its  can  be  set when the PCRE2 library is compiled; if they
                 are not specified, the defaults are very large and so  effec-
                 tively unlimited.

       --max-buffer-size=number
                 This  limits  the  expansion  of the processing buffer, whose
                 initial size can be set by --buffer-size. The maximum  buffer
                 size  is  silently  forced to be no smaller than the starting
                 buffer size.

       -N newline-type, --newline=newline-type
                 Six different conventions for indicating the ends of lines in
                 scanned files are supported. For example:

                   pcre2grep -N CRLF 'some pattern' <file>

                 The newline type may be specified in upper, lower,  or  mixed
                 case.  If the newline type is NUL, lines are separated by bi-
                 nary zero characters. The other types are the  single-charac-
                 ter  sequences  CR  (carriage  return) and LF (linefeed), the
                 two-character sequence CRLF, an "anycrlf" type, which  recog-
                 nizes  any  of  the preceding three types, and an "any" type,
                 for which any Unicode line ending sequence is assumed to  end
                 a  line.  The Unicode sequences are the three just mentioned,
                 plus VT (vertical tab, U+000B), FF (form feed,  U+000C),  NEL
                 (next  line,  U+0085),  LS  (line  separator, U+2028), and PS
                 (paragraph separator, U+2029).

                 When the PCRE2 library is built, a  default  line-ending  se-
                 quence  is specified.  This is normally the standard sequence
                 for the operating system. Unless otherwise specified by  this
                 option, pcre2grep uses the library's default.

                 This  option makes it possible to use pcre2grep to scan files
                 that have come from other environments without having to mod-
                 ify their line endings. If the data  that  is  being  scanned
                 does  not  agree  with  the  convention  set  by this option,
                 pcre2grep may behave in strange ways. Note that  this  option
                 does  not apply to files specified by the -f, --exclude-from,
                 or --include-from options, which are expected to use the  op-
                 erating system's standard newline sequence.

       -n, --line-number
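
                 A small sketch with made-up CRLF-terminated input. The
                 newline type is written in lower case to show that case does
                 not matter, and -c is used so that no carriage returns reach
                 the output:

```shell
# With CRLF declared as the line ending, $ matches just before "\r\n".
out=$(printf 'foo\r\nbar\r\n' | pcre2grep -N crlf -c '^bar$')
printf '%s\n' "$out"    # prints 1
```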
                 Precede each output line by its line number in the file, fol-
                 lowed  by  a colon for matching lines or a hyphen for context
                 lines. If the file name is also being output, it precedes the
                 line number. When the -M option causes  a  pattern  to  match
                 more  than  one  line, only the first is preceded by its line
                 number. This option is forced if --line-offsets is used.

       --no-group-separator
                 Do not output a separator between groups of  lines  when  -A,
                 -B, or -C is in use. The default is to output a line contain-
                 ing two hyphens. See also --group-separator.

       --no-jit  If  the  PCRE2 library is built with support for just-in-time
                 compiling (which speeds up matching), pcre2grep automatically
                 makes use of this, unless it was explicitly disabled at build
                 time. This option can be used to disable the use  of  JIT  at
                 run time. It is provided for testing and working around prob-
                 lems.  It should never be needed in normal use.

       -O text, --output=text
                 When  there  is  a match, instead of outputting the line that
                 matched, output just the text specified in this option,  fol-
                 lowed  by an operating-system standard newline. In this mode,
                 --colour has no effect, and no context is  shown.   That  is,
                 the  -A, -B, and -C options are ignored. The --newline option
                 has no effect on this option,  which  is  mutually  exclusive
                 with  --only-matching,  --file-offsets,  and  --line-offsets.
                 However, like --only-matching, if  there  is  more  than  one
                 match in a line, each of them causes a line of output.

                 Escape sequences starting with a dollar character may be used
                 to insert the contents of the matched part of the line and/or
                 captured substrings into the text.

                 $<digits>  or  ${<digits>}  is  replaced by the captured sub-
                 string of the given  decimal  number;  zero  substitutes  the
                 whole match. If the number is greater than the number of cap-
                 turing  substrings,  or if the capture is unset, the replace-
                 ment is empty.

                 $a is replaced by bell; $b by backspace; $e by escape; $f  by
                 form  feed;  $n by newline; $r by carriage return; $t by tab;
                 $v by vertical tab.

                 $o<digits> or $o{<digits>} is replaced by the character whose
                 code point is the given octal number. In the first  form,  up
                 to  three  octal  digits are processed.  When more digits are
                 needed in Unicode mode to specify a wide character, the  sec-
                 ond form must be used.

                 $x<digits>  or $x{<digits>} is replaced by the character rep-
                 resented by the given hexadecimal number. In the first  form,
                 up  to two hexadecimal digits are processed. When more digits
                 are needed in Unicode mode to specify a wide  character,  the
                 second form must be used.

                 Any  other character is substituted by itself. In particular,
                 $$ is replaced by a single dollar.

       -o, --only-matching
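
                 As a hedged sketch with made-up input, the following
                 rearranges two captured substrings and appends the whole
                 match. The output text is single-quoted so that the shell
                 does not expand the dollar sequences itself:

```shell
# $1 and $2 are the captured substrings; $0 is the whole match.
out=$(printf 'host=example\nport=8080\n' |
      pcre2grep -O '$1 => $2 [$0]' '(\w+)=(\w+)')
printf '%s\n' "$out"
# prints:
#   host => example [host=example]
#   port => 8080 [port=8080]
```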
                 Show only the part of the line that matched a pattern instead
                 of the whole line. In this mode, no context  is  shown.  That
                 is,  the -A, -B, and -C options are ignored. If there is more
                 than one match in a line, each of them is  shown  separately,
                 on  a separate line of output. If -o is combined with -v (in-
                 vert the sense of the match to find non-matching  lines),  no
                 output  is  generated,  but  the return code is set appropri-
                 ately. If the matched portion of the line is  empty,  nothing
                 is  output  unless  the  file  name  or line number are being
                 printed, in which case they are shown on an  otherwise  empty
                 line.  This  option  is  mutually  exclusive  with  --output,
                 --file-offsets and --line-offsets.

       -onumber, --only-matching=number
744*22dc650dSSadaf Ebrahimi                 Show only the part of the line  that  matched  the  capturing
745*22dc650dSSadaf Ebrahimi                 parentheses of the given number. Up to 50 capturing parenthe-
746*22dc650dSSadaf Ebrahimi                 ses  are  supported by default. This limit can be changed via
747*22dc650dSSadaf Ebrahimi                 the --om-capture option. A pattern may contain any number  of
748*22dc650dSSadaf Ebrahimi                 capturing  parentheses, but only those whose number is within
749*22dc650dSSadaf Ebrahimi                 the limit can be accessed by -o. An error occurs if the  num-
750*22dc650dSSadaf Ebrahimi                 ber specified by -o is greater than the limit.
751*22dc650dSSadaf Ebrahimi
752*22dc650dSSadaf Ebrahimi                 -o0 is the same as -o without a number. Because these options
753*22dc650dSSadaf Ebrahimi                 can  be given without an argument (see above), if an argument
754*22dc650dSSadaf Ebrahimi                 is present, it must be given in the same shell item, for  ex-
755*22dc650dSSadaf Ebrahimi                 ample,  -o3  or --only-matching=2. The comments given for the
756*22dc650dSSadaf Ebrahimi                 non-argument case above also apply to  this  option.  If  the
757*22dc650dSSadaf Ebrahimi                 specified  capturing parentheses do not exist in the pattern,
758*22dc650dSSadaf Ebrahimi                 or were not set in the match, nothing is  output  unless  the
759*22dc650dSSadaf Ebrahimi                 file name or line number are being output.
760*22dc650dSSadaf Ebrahimi
761*22dc650dSSadaf Ebrahimi                 If  this  option is given multiple times, multiple substrings
762*22dc650dSSadaf Ebrahimi                 are output for each match,  in  the  order  the  options  are
763*22dc650dSSadaf Ebrahimi                 given,  and  all on one line. For example, -o3 -o1 -o3 causes
764*22dc650dSSadaf Ebrahimi                 the substrings matched by capturing parentheses 3 and  1  and
765*22dc650dSSadaf Ebrahimi                 then  3 again to be output. By default, there is no separator
766*22dc650dSSadaf Ebrahimi                 (but see the next but one option).
767*22dc650dSSadaf Ebrahimi
768*22dc650dSSadaf Ebrahimi       --om-capture=number
769*22dc650dSSadaf Ebrahimi                 Set the number of capturing parentheses that can be  accessed
770*22dc650dSSadaf Ebrahimi                 by -o. The default is 50.
771*22dc650dSSadaf Ebrahimi
772*22dc650dSSadaf Ebrahimi       --om-separator=text
773*22dc650dSSadaf Ebrahimi                 Specify  a  separating string for multiple occurrences of -o.
774*22dc650dSSadaf Ebrahimi                 The default is an empty string. Separating strings are  never
775*22dc650dSSadaf Ebrahimi                 coloured.

       -P, --no-ucp
                 Starting  from release 10.43, when UTF/Unicode mode is speci-
                 fied with -u or -U, the PCRE2_UCP option is used by  default.
                 This means that the POSIX classes in patterns match more than
                 just  ASCII  characters.  For  example, [:digit:] matches any
                 Unicode  decimal  digit.  The  --no-ucp   option   suppresses
                 PCRE2_UCP,  thus restricting the POSIX classes to ASCII char-
                 acters, as was the case in earlier releases. Note that  there
                 are  now  more  fine-grained  option settings within patterns
                 that affect individual classes.  For  example,  when  in  UCP
                 mode, the sequence (?aP) restricts [:word:] to ASCII letters,
                 while allowing \w to match Unicode letters and digits.

       -q, --quiet
                 Work quietly, that is, display nothing except error messages.
                 The  exit  status  indicates  whether or not any matches were
                 found.

       -r, --recursive
                 If any given path is a directory, recursively scan the  files
                 it  contains, taking note of any --include and --exclude set-
                 tings. By default, a directory is read as a normal  file;  in
                 some  operating  systems this gives an immediate end-of-file.
                 This option is a shorthand for setting the -d option to  "re-
                 curse".

       --recursion-limit=number
                 This  is  an obsolete synonym for --depth-limit. See --match-
                 limit above for details.

       -s, --no-messages
                 Suppress error  messages  about  non-existent  or  unreadable
                 files.  Such  files  are quietly skipped. However, the return
                 code is still 2, even if matches were found in other files.

       -t, --total-count
                 This option is useful when scanning more than  one  file.  If
                 used  on its own, -t suppresses all output except for a grand
                 total number of matching lines (or non-matching lines  if  -v
                 is used) in all the files. If -t is used with -c, a grand to-
                 tal  is  output  except  when the previous output is just one
                 line. In other words, it is not output when just  one  file's
                 count  is  listed.  If file names are being output, the grand
                 total is preceded by "TOTAL:". Otherwise, it appears as  just
                 another  number.  The  -t option is ignored when used with -L
                 (list files without matches), because the grand  total  would
                 always be zero.

       -u, --utf Operate in UTF/Unicode mode. This option is available only if
                 PCRE2 has been compiled with UTF-8 support. All patterns (in-
                 cluding  those  for  any --exclude and --include options) and
                 all lines that are scanned must be  valid  strings  of  UTF-8
                 characters. If an invalid UTF-8 string is encountered, an er-
                 ror occurs.

       -U, --utf-allow-invalid
                 As  --utf,  but in addition subject lines may contain invalid
                 UTF-8 code unit sequences. These can never form part  of  any
                 pattern  match.  Patterns  themselves, however, must still be
                 valid UTF-8 strings. This facility allows valid UTF-8 strings
                 to be sought within arbitrary byte sequences in executable or
                 other binary files. For more details about matching  in  non-
                 valid UTF-8 strings, see the pcre2unicode(3) documentation.

       -V, --version
                 Write  the version numbers of pcre2grep and the PCRE2 library
                 to the standard output and then exit. Anything  else  on  the
                 command line is ignored.

       -v, --invert-match
                 Invert  the  sense  of  the match, so that lines which do not
                 match any of the patterns are the ones that are  found.  When
                 this  option  is  set,  options  such  as --only-matching and
                 --output, which specify parts of a match that are to be  out-
                 put, are ignored.

       -w, --word-regex, --word-regexp
                 Force the patterns only to match "words". That is, there must
                 be  a  word  boundary  at  the  start and end of each matched
                 string. This is equivalent to having "\b(?:" at the start  of
                 each  pattern, and ")\b" at the end. This option applies only
                 to the patterns that are  matched  against  the  contents  of
                 files;  it does not apply to patterns specified by any of the
                 --include or --exclude options.

       -x, --line-regex, --line-regexp
                 Force the patterns to start matching only at  the  beginnings
                 of  lines,  and  in  addition,  require  them to match entire
                 lines. In multiline mode the match may be more than one line.
                 This is equivalent to having "^(?:" at the start of each pat-
                 tern and ")$" at the end. This option  applies  only  to  the
                 patterns  that  are matched against the contents of files; it
                 does not apply to patterns specified by any of the  --include
                 or --exclude options.

       -Z, --null
                 Terminate  file  names in the regular output with a zero byte
                 (the NUL character) instead of what  would  normally  appear.
                 This  is  useful  when  file names contain unusual characters
                 such as colons, hyphens, or even newlines.  The  option  does
                 not apply to file names in error messages.


ENVIRONMENT VARIABLES

       The environment variables LC_ALL and LC_CTYPE are examined, in that or-
       der, for a locale. The first one that is set is used. This can be over-
       ridden by the --locale option. If no locale is set, the PCRE2 library's
       default (usually the "C" locale) is used.


NEWLINES

       The  -N  (--newline) option allows pcre2grep to scan files with newline
       conventions that differ from the default. This option affects only  the
       way  scanned files are processed. It does not affect the interpretation
       of files specified by the -f,  --file-list,  --exclude-from,  or  --in-
       clude-from options.

       Any  parts  of the scanned input files that are written to the standard
       output are copied with whatever newline sequences they have in the  in-
       put.  However,  if  the final line of a file is output, and it does not
       end with a newline sequence, a newline sequence is added. If  the  new-
       line  setting  is  CR, LF, CRLF or NUL, that line ending is output; for
       the other settings (ANYCRLF or ANY) a single NL is used.

       The newline setting does not affect the way in which  pcre2grep  writes
       newlines  in  informational  messages  to the standard output and error
       streams.  Under Windows, the standard output is set to  be  binary,  so
       that  "\r\n" at the ends of output lines that are copied from the input
       is not converted to "\r\r\n" by the C I/O library. This means that  any
       messages  written  to the standard output must end with "\r\n". For all
       other operating systems, and for all messages  to  the  standard  error
       stream, "\n" is used.


OPTIONS COMPATIBILITY WITH GNU GREP

       Many of the short and long forms of pcre2grep's options are the same as
       in  the GNU grep program. Any long option of the form --xxx-regexp (GNU
       terminology) is also  available  as  --xxx-regex  (PCRE2  terminology).
       However,  the  --case-restrict, --depth-limit, -E, --file-list, --file-
       offsets,   --heap-limit,   --include-dir,   --line-offsets,   --locale,
       --match-limit,  -M,  --multiline, -N, --newline, --no-ucp, --om-separa-
       tor, --output, -P, -u, --utf, -U, and --utf-allow-invalid  options  are
       specific to pcre2grep, as is the use of the --only-matching option with
       a capturing parentheses number.

       Although  most  of the common options work the same way, a few are dif-
       ferent in pcre2grep. For example, the --include option's argument is  a
       glob for GNU grep, but in pcre2grep it is a regular expression to which
       the  -i  option  applies.  If both the -c and -l options are given, GNU
       grep lists only file names, without counts,  but  pcre2grep  gives  the
       counts as well.


OPTIONS WITH DATA

       There are four different ways in which an option with data can be spec-
       ified.   If  a  short  form option is used, the data may follow immedi-
       ately, or (with one exception) in the next command line item. For exam-
       ple:

         -f/some/file
         -f /some/file

       The exception is the -o option, which may appear with or without  data.
       Because  of this, if data is present, it must follow immediately in the
       same item, for example -o3.

       If a long form option is used, the data may appear in the same  command
       line  item,  separated by an equals character, or (with two exceptions)
       it may appear in the next command line item. For example:

         --file=/some/file
         --file /some/file

       Note, however, that if you want to supply a file name beginning with  ~
       as  data  in a shell command, and have the shell expand ~ to a home di-
       rectory, you must separate the file name from the option,  because  the
       shell does not treat ~ specially unless it is at the start of an item.

       The  exceptions  to the above are the --colour (or --color) and --only-
       matching options, for which the data is optional. If one of  these  op-
       tions  does  have  data,  it  must be given in the first form, using an
       equals character. Otherwise pcre2grep will assume that it has no data.


USING PCRE2'S CALLOUT FACILITY

       pcre2grep has, by default, support for  calling  external  programs  or
       scripts  or  echoing  specific strings during matching by making use of
       PCRE2's callout facility. However, this support can  be  completely  or
       partially  disabled  when  pcre2grep is built. You can find out whether
       your binary has support for callouts by running it with the --help  op-
       tion.  If  callout support is completely disabled, all callouts in pat-
       terns are ignored by pcre2grep.  If the facility is partially disabled,
       calling external programs is not supported, and callouts  that  request
       it are ignored.

       A  callout  in a PCRE2 pattern is of the form (?C<arg>) where the argu-
       ment is either a number or a quoted string (see the pcre2callout  docu-
       mentation  for  details).  Numbered  callouts are ignored by pcre2grep;
       only callouts with string arguments are useful.

   Echoing a specific string

       Starting the callout string with a pipe character  invokes  an  echoing
       facility that avoids calling an external program or script. This facil-
       ity  is  always  available,  provided that callouts were not completely
       disabled when pcre2grep was built. The rest of the  callout  string  is
       processed  as  a zero-terminated string, which means it should not con-
       tain any internal binary zeros. It is written  to  the  output,  having
       first  been  passed through the same escape processing as text from the
       --output (-O) option (see above). However, $0 cannot be used to  insert
       a  matched  substring  because the match is still in progress. Instead,
       the single character '0' is inserted. Any syntax error  in  the  string
       (for  example,  a  dollar not followed by another character) causes the
       callout to be ignored. No terminator is added to the output string,  so
       if  you want a newline, you must include it explicitly using the escape
       $n. For example:

         pcre2grep '(.)(..(.))(?C"|[$1] [$2] [$3]$n")' <some file>

       Matching continues normally after the string is output. If you want  to
       see  only  the  callout output but not any output from an actual match,
       you should end the pattern with (*FAIL).

   Calling external programs or scripts

       This facility can be independently disabled when pcre2grep is built. It
       is supported for Windows, where a call to _spawnvp() is used, for  VMS,
       where  lib$spawn()  is  used,  and  for any Unix-like environment where
       fork() and execv() are available.

       If the callout string does not start with a pipe (vertical bar) charac-
       ter, it is parsed into a list of substrings separated by  pipe  charac-
       ters.  The first substring must be an executable name, with the follow-
       ing substrings specifying arguments:

         executable_name|arg1|arg2|...

       Any substring (including the executable name) may  contain  escape  se-
       quences  started  by  a dollar character. These are the same as for the
       --output (-O) option documented above, except that $0 cannot insert the
       matched string because the match is still  in  progress.  Instead,  the
       character '0' is inserted. If you need a literal dollar or pipe charac-
       ter in any substring, use $$ or $| respectively. Here is an example:

         echo -e "abcde\n12345" | pcre2grep \
           '(?x)(.)(..(.))
           (?C"/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4)")()' -

         Output:

           Arg1: [a] [bcd] [d] Arg2: |a| ()
           abcde
           Arg1: [1] [234] [4] Arg2: |1| ()
           12345

       The  parameters  for the system call that is used to run the program or
       script are zero-terminated strings. This means that binary zero charac-
       ters in the callout argument will cause premature termination of  their
       substrings,  and  therefore should not be present. Any syntax error  in
       the string (for example, a dollar not followed  by  another  character)
       causes the callout to be ignored.  If running the program fails for any
       reason  (including the non-existence of the executable), a local match-
       ing failure occurs and the matcher backtracks in the normal way.


MATCHING ERRORS

       It is possible to supply a regular expression that takes  a  very  long
       time  to  fail  to  match certain lines. Such patterns normally involve
       nested indefinite repeats, for example: (a+)*\d when matched against  a
       line  of a's with no final digit. The PCRE2 matching function has a re-
       source limit that causes it to abort in these  circumstances.  If  this
       happens,  pcre2grep  outputs  an error message and the line that caused
       the problem to the standard error stream. If there  are  more  than  20
       such errors, pcre2grep gives up.

       The  --match-limit  option  of pcre2grep can be used to set the overall
       resource limit. There are also other limits that affect the  amount  of
       memory  used  during  matching;  see the discussion of --heap-limit and
       --depth-limit above.


DIAGNOSTICS

       Exit status is 0 if any matches were found, 1 if no matches were found,
       and 2 for syntax errors, overlong lines, non-existent  or  inaccessible
       files  (even if matches were found in other files) or too many matching
       errors. Using the -s option to suppress error messages about inaccessi-
       ble files does not affect the return code.

       When  run  under  VMS,  the  return  code  is  placed  in  the   symbol
       PCRE2GREP_RC  because  VMS  does  not  distinguish  between exit(0) and
       exit(1).


SEE ALSO

       pcre2pattern(3), pcre2syntax(3), pcre2callout(3), pcre2unicode(3).


AUTHOR

       Philip Hazel
       Retired from University Computing Service
       Cambridge, England.


REVISION

       Last updated: 22 December 2023
       Copyright (c) 1997-2023 University of Cambridge.


PCRE2 10.43                    22 December 2023                   PCRE2GREP(1)