Lines Matching +full:file +full:- +full:lines

6 pcre2grep - a grep with Perl-compatible regular expressions.
16 other grep commands do, but it uses the PCRE2 regular expression li-
17 brary to support patterns that are compatible with the regular expres-
18 sions of Perl 5. See pcre2syntax(3) for a quick-reference summary of
22 Patterns, whether supplied on the command line or in a separate file,
30 on the command line because they are interpreted by the shell, and in-
35 single pattern to be matched when neither -e nor -f is present. Con-
36 versely, when one or both of these options are used to specify pat-
37 terns, all arguments are treated as path names. At least one of -e, -f,
44 pcre2grep some-pattern file1 - file3
46 By default, input files are searched line by line, so pattern asser-
50 more than one file, the file name is output at the start of each line,
52 pcre2grep behaves. For example, the -M option makes it possible to
54 boundary is controlled by the -N (--newline) option. The -h and -H op-
55 tions control whether or not file names are shown, and the -Z option
56 changes the file name terminator to a zero byte.
59 controlled by parameters that can be set by the --buffer-size and
60 --max-buffer-size options. The first of these sets the size of buffer
61 that is obtained at the start of processing. If an input file contains
62 very long lines, a larger buffer may be needed; this is handled by au-
63 tomatically extending the buffer, up to the limit specified by --max-
64 buffer-size. The default values for these parameters can be set when
70 size", to allow for buffering "before" and "after" lines. If the buffer
71 size is too small, fewer than requested "before" and "after" lines may
80 pattern (specified by the use of -e and/or -f), each pattern is applied
82 the -e patterns are tried before the -f patterns.
85 are considered. However, if --colour (or --color) is used to colour the
86 matching substrings, or if --only-matching, --file-offsets, --line-off-
87 sets, or --output is used to output only the part of the line that
106 matches are never recognized. An example is the pattern "(su-
113 the value to set a locale when calling the PCRE2 library. The --locale
119 Compile-time options for pcre2grep can set it up to use libz or libbz2
120 for reading compressed files whose names end in .gz or .bz2, respec-
122 one or both of these file types by running it with the --help option.
124 plain text. The standard input is always so treated. If a file with a
126 text file. When input is from a compressed .gz or .bz2 file, the
127 --line-buffered option is ignored.
132 By default, a file that contains a binary zero byte within the first
133 1024 bytes is identified as a binary file, and is processed specially.
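
The rule stated in the two lines above can be sketched in shell. This is only an illustration of the documented heuristic (a zero byte within the first 1024 bytes), not pcre2grep's own code. The function name looks_binary is hypothetical, and the sketch assumes that head -c is available.

```shell
# Hypothetical helper, not part of pcre2grep: succeeds (exit status 0)
# when the first 1024 bytes of the file contain at least one zero byte.
looks_binary() {
    # Count the bytes in the first 1024 bytes of the file.
    total=$(head -c 1024 "$1" | wc -c | tr -d ' ')
    # Count them again after deleting every zero byte.
    kept=$(head -c 1024 "$1" | tr -d '\000' | wc -c | tr -d ' ')
    # If deleting zero bytes shrank the data, a zero byte was present.
    [ "$kept" -lt "$total" ]
}
```

A call such as "looks_binary some.file && echo binary" would then report files that this heuristic flags. A zero byte that first appears after byte 1024 is not detected by this sketch.
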
135 terminator is a binary zero, the test for a binary file is not applied.
136 See the --binary-files option for a means of changing the way binary
144 that are read from a file via the -f option may contain binary zeros.
150 For example, both the -H and -l options affect the printing of file
157 -- This terminates the list of options. It is useful if the next
159 option. This allows for the processing of patterns and file
162 -A number, --after-context=number
163 Output up to number lines of context after each matching
164 line. Fewer lines are output if the next match or the end of
165 the file is reached, or if the processing buffer size has
166 been set too small. If file names and/or line numbers are be-
168 the context lines (the -Z option can be used to change the
169 file name terminator to a zero byte). A line containing "--"
170 is output between each group of lines, unless they are in
171 fact contiguous in the input file. The value of number is ex-
172 pected to be relatively small. When -c is used, -A is ig-
175 -a, --text
176 Treat binary files as text. This is equivalent to --binary-
179 --allow-lookaround-bsk
185 -B number, --before-context=number
186 Output up to number lines of context before each matching
187 line. Fewer lines are output if the previous match or the
188 start of the file is within number lines, or if the process-
189 ing buffer size has been set too small. If file names and/or
190 line numbers are being output, a hyphen separator is used in-
191 stead of a colon for the context lines (the -Z option can be
192 used to change the file name terminator to a zero byte). A
193 line containing "--" is output between each group of lines,
194 unless they are in fact contiguous in the input file. The
195 value of number is expected to be relatively small. When -c
196 is used, -B is ignored.
198 --binary-files=word
200 "binary" (the default), pattern matching is performed on bi-
201 nary files, but the only output is "Binary file <name>
203 is equivalent to the -a or --text option, binary files are
204 processed in the same way as any other file. In this case,
207 word is "without-match", which is equivalent to the -I op-
212 --buffer-size=number
215 scanned. See also --max-buffer-size below.
217 -C number, --context=number
218 Output number lines of context both before and after each
219 matching line. This is equivalent to setting both -A and -B
222 -c, --count
223 Do not output lines from the files that are being scanned;
224 instead output the number of lines that would have been
225 shown, either because they matched, or, if -v is set, because
227 same as the number of lines that would have been output, but
228 if the -M (multiline) option is used (without -v), there may
229 be more suppressed lines than the count (that is, the number
232 If no lines are selected, the number zero is output. If sev-
234 them and the -t option can be used to cause a total to be
235 output at the end. However, if the --files-with-matches op-
237 than zero are listed. When -c is used, the -A, -B, and -C op-
240 --colour, --color
242 "--colour=auto". If data is required, it must be given in
245 --colour=value, --color=value
248 It is ignored if --file-offsets, --line-offsets, or --output
250 --colour option (which is optional, see above) may be
252 happens only if the standard output is connected to a termi-
253 nal. More resources are used when colouring is enabled, be-
264 start with "ms=" or "mt=" followed by two semicolon-separated
266 If GREP_COLORS does not start with "ms=" or "mt=" it is ig-
269 If the string obtained from one of the above variables con-
270 tains any characters other than semicolon or digits, the set-
277 -D action, --devices=action
278 If an input path is not a regular file or a directory, "ac-
282 -d action, --directories=action
285 non-Windows environments, for compatibility with GNU grep),
286 "recurse" (equivalent to the -r option), or "skip" (silently
289 files. In some operating systems the effect of reading a di-
290 rectory like this is an immediate end-of-file; in others it
293 --depth-limit=number
294 See --match-limit below.
296 -E, --case-restrict
298 ASCII letters (K and S) will by default match Unicode charac-
300 as well as their lower case ASCII counterparts. When this op-
302 ASCII character matches a non-ASCII character, and vice
305 -e pattern, --regex=pattern, --regexp=pattern
306 Specify a pattern to be matched. This option can be used mul-
309 with a hyphen. When -e is used, no argument pattern is taken
310 from the command line; all arguments are treated as file
314 If -f is used with -e, the command line patterns are matched
315 first, followed by the patterns from the file(s), independent
318 --exclude=pattern
321 whether listed on the command line, obtained from --file-
322 list, or by scanning a directory. The pattern is a PCRE2 reg-
324 of the file name, not the entire path. The -F, -w, and -x op-
327 a file name matches both an --include and an --exclude pat-
330 --exclude-from=filename
331 Treat each non-empty line of the file as the data for an
332 --exclude option. What constitutes a newline when reading the
333 file is the operating system's default. The --newline option
337 --exclude-dir=pattern
339 being processed, whatever the setting of the --recursive op-
341 command line, obtained from --file-list, or by scanning a
344 name, not the entire path. The -F, -w, and -x options do not
346 times in order to specify more than one pattern. If a direc-
347 tory matches both --include-dir and --exclude-dir, it is ex-
350 -F, --fixed-strings
351 Interpret each data-matching pattern as a list of fixed
352 strings, separated by newlines, instead of as a regular ex-
353 pression. What constitutes a newline for this purpose is con-
354 trolled by the --newline option. The -w (match as a word) and
355 -x (match whole line) options can be used with -F. They ap-
357 of the fixed strings are found in it (subject to -w or -x, if
360 patterns specified by any of the --include or --exclude op-
363 -f filename, --file=filename
364 Read patterns from the file, one per line. As is the case
366 used. What constitutes a newline when reading the file is the
367 operating system's default interpretation of \n. The --new-
369 space is removed from each line, and blank lines are ignored.
370 An empty file contains no patterns and therefore matches
371 nothing. Patterns read from a file in this way may contain
376 match it. A file name can be given as "-" to refer to the
377 standard input. When -f is used, patterns specified on the
378 command line using -e may also be present; they are matched
379 before the file's patterns. However, no pattern is taken from
383 --file-list=filename
385 scanned from the given file, one per line. What constitutes a
386 newline when reading the file is the operating system's de-
388 blank lines are ignored. These paths are processed before any
389 that are listed on the command line. The file name can be
390 given as "-" to refer to the standard input. If --file and
391 --file-list are both specified as "-", patterns are read
392 first. This is useful only when the standard input is a ter-
393 minal, from which further lines (the list of files) can be
394 read after an end-of-file indication. If this option is given
397 --file-offsets
398 Instead of showing lines or parts of lines that match, show
399 each match as an offset from the start of the file and a
400 length, separated by a comma. In this mode, --colour has no
401 effect, and no context is shown. That is, the -A, -B, and -C
403 line, each of them is shown separately. This option is mutu-
404 ally exclusive with --output, --line-offsets, and --only-
407 --group-separator=text
409 of lines when -A, -B, or -C is in use. See also --no-group-
412 -H, --with-filename
413 Force the inclusion of the file name at the start of output
414 lines when searching a single file. The file name is not nor-
415 mally shown in this case. By default, for matching lines,
416 the file name is followed by a colon; for context lines, a
417 hyphen separator is used. The -Z option can be used to change
419 output, it follows the file name. When the -M option causes a
420 pattern to match more than one line, only the first is pre-
421 ceded by the file name. This option overrides any previous
422 -h, -l, or -L options.
424 -h, --no-filename
425 Suppress the output of file names when searching multiple files.
426 File names are normally shown when multiple files are
427 searched. By default, for matching lines, the file name is
428 followed by a colon; for context lines, a hyphen separator is
429 used. The -Z option can be used to change the terminator to a
431 the file name. This option overrides any previous -H, -L, or
432 -l options.
434 --heap-limit=number
435 See --match-limit below.
437 --help Output a help message, giving brief details of the command
438 options and file type support, and then exit. Anything else
441 -I Ignore binary files. This is equivalent to --binary-
442 files=without-match.
444 -i, --ignore-case
446 This applies when matching path names for inclusion or exclu-
447 sion as well as when matching lines in files.
449 --include=pattern
450 If any --include patterns are specified, the only files that
452 and do not match an --exclude pattern. This option does not
454 listed on the command line, obtained from --file-list, or by
455 scanning a directory. The pattern is a PCRE2 regular expres-
456 sion, and is matched against the final component of the file
457 name, not the entire path. The -F, -w, and -x options do not
459 times. If a file name matches both an --include and an --ex-
463 --include-from=filename
464 Treat each non-empty line of the file as the data for an
465 --include option. What constitutes a newline for this purpose
466 is the operating system's default. The --newline option has
470 --include-dir=pattern
471 If any --include-dir patterns are specified, the only direc-
473 the patterns and do not match an --exclude-dir pattern. This
475 line, obtained from --file-list, or by scanning a parent di-
478 not the entire path. The -F, -w, and -x options do not apply
480 If a directory matches both --include-dir and --exclude-dir,
483 -L, --files-without-match
484 Instead of outputting lines from the files, just output the
485 names of the files that do not contain any lines that would
486 have been output. Each file name is output once, on a sepa-
487 rate line by default, but if the -Z option is set, they are
489 overrides any previous -H, -h, or -l options.
491 -l, --files-with-matches
492 Instead of outputting lines from the files, just output the
493 names of the files containing lines that would have been out-
494 put. Each file name is output once, on a separate line, but
495 if the -Z option is set, they are separated by zero bytes in-
497 matching line is found in a file. However, if the -c (count)
501 with -c is a way of suppressing the listing of files with no
502 matches that occurs with -c on its own. This option overrides
503 any previous -H, -h, or -L options.
505 --label=name
507 when file names are being output. If not supplied, "(standard
510 --line-buffered
511 When this option is given, non-compressed input is read and
515 which is currently possible only in Unix-like environments or
520 use will affect performance, and the -M (multiline) option
522 file, --line-buffered is ignored.
524 --line-offsets
525 Instead of showing lines or parts of lines that match, show
528 (as usual; see the -n option), and the offset and length are
529 separated by a comma. In this mode, --colour has no effect,
530 and no context is shown. That is, the -A, -B, and -C options
532 of them is shown separately. This option is mutually exclu-
533 sive with --output, --file-offsets, and --only-matching.
535 --locale=locale-name
536 This option specifies a locale to be used for pattern match-
537 ing. It overrides the value in the LC_ALL or LC_CTYPE envi-
538 ronment variables. If no locale is specified, the PCRE2 li-
542 -M, --multiline
546 line and onto one or more subsequent lines.
548 Patterns used with -M may usefully contain literal newline
550 because in multiline mode these can match at internal new-
551 lines. Because pcre2grep is scanning multiple lines, the \Z
553 the file. The \A assertion matches at the start of the first
554 line of a match. This can be any line in the file; it is not
561 the output ends at the end of that line. If -v is set, none
562 of the lines in a multi-line match are output. Once a match
566 The newline sequence that separates multiple lines must be
568 phrase "regular expression" in a file where "regular" might
572 pcre2grep -M 'regular\s+expression' <file>
574 The \s escape sequence matches any white space character, in-
575 cluding newlines, and is followed by + so as to match trail-
576 ing white space on the first line as well as possibly han-
577 dling a two-character newline sequence.
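
The -M example above can be tried on a small sample file. The following is a minimal sketch, assuming pcre2grep is on the PATH; the sample text and the variable names sample and out are made up for illustration.

```shell
# Sample input: the phrase is split across a line boundary.
sample=$(mktemp)
printf 'This is a regular\nexpression, split over two lines.\nAn unrelated line.\n' > "$sample"

if command -v pcre2grep >/dev/null 2>&1; then
    # -M allows the match to start on one line and finish on the next.
    out=$(pcre2grep -M 'regular\s+expression' "$sample")
    printf '%s\n' "$out"
else
    out=""
    echo "pcre2grep not found; skipping the demonstration" >&2
fi
rm -f "$sample"
```

Without -M the same pattern finds nothing in this sample, because by default each line is searched on its own.
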
579 There is a limit to the number of lines that can be matched,
580 imposed by the way that pcre2grep buffers the input file as
584 The -M option does not work when input is read line by line
585 (see --line-buffered).
587 -m number, --max-count=number
588 Stop processing after finding number matching lines, or non-
589 matching lines if -v is also set. Any trailing context lines
593 regular file, the file is left positioned just after the last
594 matching line. If -c is also set, the count that is output
596 used with -L, -l, or -q, or when just checking for a match in
597 a binary file.
599 --match-limit=number
605 The --match-limit option provides a means of limiting comput-
606 ing resource usage when processing patterns that are not go-
607 ing to match, but which have a very large number of possibil-
610 counter that is incremented each time around its main pro-
611 cessing loop. If the value set by --match-limit is reached,
614 The --heap-limit option specifies, as a number of kibibytes
618 The --depth-limit option limits the depth of nested back-
620 that is used. The amount of memory needed for each backtrack-
624 use only if it is set smaller than --match-limit.
626 There are no short forms for these options. The default lim-
628 are not specified, the defaults are very large and so effec-
631 --max-buffer-size=number
633 initial size can be set by --buffer-size. The maximum buffer
637 -N newline-type, --newline=newline-type
638 Six different conventions for indicating the ends of lines in
641 pcre2grep -N CRLF 'some pattern' <file>
644 case. If the newline type is NUL, lines are separated by bi-
645 nary zero characters. The other types are the single-charac-
647 two-character sequence CRLF, an "anycrlf" type, which recog-
655 When the PCRE2 library is built, a default line-ending se-
661 that have come from other environments without having to mod-
665 does not apply to files specified by the -f, --exclude-from,
666 or --include-from options, which are expected to use the op-
669 -n, --line-number
670 Precede each output line by its line number in the file, fol-
671 lowed by a colon for matching lines or a hyphen for context
672 lines. If the file name is also being output, it precedes the
673 line number. When the -M option causes a pattern to match
675 number. This option is forced if --line-offsets is used.
677 --no-group-separator
678 Do not output a separator between groups of lines when -A,
679 -B, or -C is in use. The default is to output a line contain-
680 ing two hyphens. See also --group-separator.
682 --no-jit If the PCRE2 library is built with support for just-in-time
686 run time. It is provided for testing and working around prob-
689 -O text, --output=text
691 matched, output just the text specified in this option, fol-
692 lowed by an operating-system standard newline. In this mode,
693 --colour has no effect, and no context is shown. That is,
694 the -A, -B, and -C options are ignored. The --newline option
696 with --only-matching, --file-offsets, and --line-offsets.
697 However, like --only-matching, if there is more than one
704 $<digits> or ${<digits>} is replaced by the captured sub-
706 whole match. If the number is greater than the number of cap-
707 turing substrings, or if the capture is unset, the replace-
717 needed in Unicode mode to specify a wide character, the sec-
720 $x<digits> or $x{<digits>} is replaced by the character rep-
729 -o, --only-matching
732 is, the -A, -B, and -C options are ignored. If there is more
734 on a separate line of output. If -o is combined with -v (in-
735 vert the sense of the match to find non-matching lines), no
736 output is generated, but the return code is set appropri-
738 is output unless the file name or line number is being
740 line. This option is mutually exclusive with --output,
741 --file-offsets and --line-offsets.
743 -onumber, --only-matching=number
745 parentheses of the given number. Up to 50 capturing parenthe-
747 the --om-capture option. A pattern may contain any number of
749 the limit can be accessed by -o. An error occurs if the num-
750 ber specified by -o is greater than the limit.
752 -o0 is the same as -o without a number. Because these options
754 is present, it must be given in the same shell item, for ex-
755 ample, -o3 or --only-matching=2. The comments given for the
756 non-argument case above also apply to this option. If the
759 file name or line number is being output.
763 given, and all on one line. For example, -o3 -o1 -o3 causes
768 --om-capture=number
770 by -o. The default is 50.
772 --om-separator=text
773 Specify a separating string for multiple occurrences of -o.
777 -P, --no-ucp
778 Starting from release 10.43, when UTF/Unicode mode is speci-
779 fied with -u or -U, the PCRE2_UCP option is used by default.
782 Unicode decimal digit. The --no-ucp option suppresses
783 PCRE2_UCP, thus restricting the POSIX classes to ASCII char-
785 are now more fine-grained option settings within patterns
790 -q, --quiet
795 -r, --recursive
797 it contains, taking note of any --include and --exclude set-
798 tings. By default, a directory is read as a normal file; in
799 some operating systems this gives an immediate end-of-file.
800 This option is a shorthand for setting the -d option to "re-
803 --recursion-limit=number
804 This is an obsolete synonym for --depth-limit. See --match-
807 -s, --no-messages
808 Suppress error messages about non-existent or unreadable
812 -t, --total-count
813 This option is useful when scanning more than one file. If
814 used on its own, -t suppresses all output except for a grand
815 total number of matching lines (or non-matching lines if -v
816 is used) in all the files. If -t is used with -c, a grand to-
818 line. In other words, it is not output when just one file's
819 count is listed. If file names are being output, the grand
821 another number. The -t option is ignored when used with -L
825 -u, --utf Operate in UTF/Unicode mode. This option is available only if
826 PCRE2 has been compiled with UTF-8 support. All patterns (in-
827 cluding those for any --exclude and --include options) and
828 all lines that are scanned must be valid strings of UTF-8
829 characters. If an invalid UTF-8 string is encountered, an er-
832 -U, --utf-allow-invalid
833 As --utf, but in addition subject lines may contain invalid
834 UTF-8 code unit sequences. These can never form part of any
836 valid UTF-8 strings. This facility allows valid UTF-8 strings
838 other binary files. For more details about matching in non-
839 valid UTF-8 strings, see the pcre2unicode(3) documentation.
841 -V, --version
846 -v, --invert-match
847 Invert the sense of the match, so that lines which do not
849 this option is set, options such as --only-matching and
850 --output, which specify parts of a match that are to be out-
853 -w, --word-regex, --word-regexp
860 --include or --exclude options.
862 -x, --line-regex, --line-regexp
864 of lines, and in addition, require them to match entire
865 lines. In multiline mode the match may be more than one line.
866 This is equivalent to having "^(?:" at the start of each pat-
869 does not apply to patterns specified by any of the --include
870 or --exclude options.
872 -Z, --null
875 This is useful when file names contain unusual characters
877 not apply to file names in error messages.
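
One common use of -Z is to pass file names safely to another program. The following is a minimal sketch, assuming a pcre2grep build that supports -Z (checked here by looking for --null in the --help output) and an xargs that accepts -0; the directory, file names, and the variable listed are made up for illustration.

```shell
# Work in a scratch directory with one awkwardly named file.
dir=$(mktemp -d)
printf 'needle\n' > "$dir/name with spaces.txt"
printf 'hay\n' > "$dir/plain.txt"

# Run the demonstration only if this pcre2grep advertises --null (-Z).
if pcre2grep --help 2>/dev/null | grep -q -e '--null'; then
    # -l lists matching files; -Z ends each name with a zero byte, so
    # xargs -0 receives "name with spaces.txt" as a single argument.
    listed=$(pcre2grep -lZ 'needle' "$dir"/*.txt | xargs -0 -n 1 basename)
    printf '%s\n' "$listed"
else
    listed=""
    echo "pcre2grep with -Z not found; skipping the demonstration" >&2
fi
rm -rf "$dir"
```

With newline-terminated names, a file name containing spaces or newlines could be split into several arguments by the receiving program; the zero-byte terminator avoids that.
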
882 The environment variables LC_ALL and LC_CTYPE are examined, in that or-
883 der, for a locale. The first one that is set is used. This can be over-
884 ridden by the --locale option. If no locale is set, the PCRE2 library's
890 The -N (--newline) option allows pcre2grep to scan files with newline
893 of files specified by the -f, --file-list, --exclude-from, or --in-
894 clude-from options.
897 output are copied with whatever newline sequences they have in the in-
898 put. However, if the final line of a file is output, and it does not
899 end with a newline sequence, a newline sequence is added. If the new-
906 that "\r\n" at the ends of output lines that are copied from the input
916 in the GNU grep program. Any long option of the form --xxx-regexp (GNU
917 terminology) is also available as --xxx-regex (PCRE2 terminology).
918 However, the --case-restrict, --depth-limit, -E, --file-list, --file-
919 offsets, --heap-limit, --include-dir, --line-offsets, --locale,
920 --match-limit, -M, --multiline, -N, --newline, --no-ucp, --om-separa-
921 tor, --output, -P, -u, --utf, -U, and --utf-allow-invalid options are
922 specific to pcre2grep, as is the use of the --only-matching option with
925 Although most of the common options work the same way, a few are dif-
926 ferent in pcre2grep. For example, the --include option's argument is a
928 the -i option applies. If both the -c and -l options are given, GNU
929 grep lists only file names, without counts, but pcre2grep gives the
935 There are four different ways in which an option with data can be spec-
936 ified. If a short form option is used, the data may follow immedi-
937 ately, or (with one exception) in the next command line item. For exam-
940 -f/some/file
941 -f /some/file
943 The exception is the -o option, which may appear with or without data.
945 same item, for example -o3.
951 --file=/some/file
952 --file /some/file
954 Note, however, that if you want to supply a file name beginning with ~
955 as data in a shell command, and have the shell expand ~ to a home di-
956 rectory, you must separate the file name from the option, because the
959 The exceptions to the above are the --colour (or --color) and --only-
960 matching options, for which the data is optional. If one of these op-
971 your binary has support for callouts by running it with the --help op-
972 tion. If callout support is completely disabled, all callouts in pat-
977 A callout in a PCRE2 pattern is of the form (?C<arg>) where the argu-
978 ment is either a number or a quoted string (see the pcre2callout docu-
985 facility that avoids calling an external program or script. This facil-
988 processed as a zero-terminated string, which means it should not con-
991 --output (-O) option (see above). However, $0 cannot be used to insert
999 pcre2grep '(.)(..(.))(?C"|[$1] [$2] [$3]$n")' <some file>
1009 where lib$spawn() is used, and for any Unix-like environment where
1012 If the callout string does not start with a pipe (vertical bar) charac-
1013 ter, it is parsed into a list of substrings separated by pipe charac-
1014 ters. The first substring must be an executable name, with the follow-
1019 Any substring (including the executable name) may contain escape se-
1021 --output (-O) option documented above, except that $0 cannot insert the
1023 character '0' is inserted. If you need a literal dollar or pipe charac-
1026 echo -e "abcde\n12345" | pcre2grep \
1028 (?C"/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4)")()' -
1038 script are zero-terminated strings. This means that binary zero charac-
1043 reason (including the non-existence of the executable), a local match-
1050 time to fail to match certain lines. Such patterns normally involve
1052 line of a's with no final digit. The PCRE2 matching function has a re-
1058 The --match-limit option of pcre2grep can be used to set the overall
1060 memory used during matching; see the discussion of --heap-limit and
1061 --depth-limit above.
1067 and 2 for syntax errors, overlong lines, non-existent or inaccessible
1069 errors. Using the -s option to suppress error messages about inaccessi-
1092 Copyright (c) 1997-2023 University of Cambridge.