<html>
<head>
<title>pcre2build specification</title>
</head>
<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
<h1>pcre2build man page</h1>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
</p>
<p>
This page is part of the PCRE2 HTML documentation. It was generated
automatically from the original man page. If there is any nonsense in it,
please consult the man page, in case the conversion went wrong.
<br>
<ul>
<li><a name="TOC1" href="#SEC1">BUILDING PCRE2</a>
<li><a name="TOC2" href="#SEC2">PCRE2 BUILD-TIME OPTIONS</a>
<li><a name="TOC3" href="#SEC3">BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a>
<li><a name="TOC4" href="#SEC4">BUILDING SHARED AND STATIC LIBRARIES</a>
<li><a name="TOC5" href="#SEC5">UNICODE AND UTF SUPPORT</a>
<li><a name="TOC6" href="#SEC6">DISABLING THE USE OF \C</a>
<li><a name="TOC7" href="#SEC7">JUST-IN-TIME COMPILER SUPPORT</a>
<li><a name="TOC8" href="#SEC8">NEWLINE RECOGNITION</a>
<li><a name="TOC9" href="#SEC9">WHAT \R MATCHES</a>
<li><a name="TOC10" href="#SEC10">HANDLING VERY LARGE PATTERNS</a>
<li><a name="TOC11" href="#SEC11">LIMITING PCRE2 RESOURCE USAGE</a>
<li><a name="TOC12" href="#SEC12">LIMITING VARIABLE-LENGTH LOOKBEHIND ASSERTIONS</a>
<li><a name="TOC13" href="#SEC13">CREATING CHARACTER TABLES AT BUILD TIME</a>
<li><a name="TOC14" href="#SEC14">USING EBCDIC CODE</a>
<li><a name="TOC15" href="#SEC15">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a>
<li><a name="TOC16" href="#SEC16">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a>
<li><a name="TOC17" href="#SEC17">PCRE2GREP BUFFER SIZE</a>
<li><a name="TOC18" href="#SEC18">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a>
<li><a name="TOC19" href="#SEC19">INCLUDING DEBUGGING CODE</a>
<li><a name="TOC20" href="#SEC20">DEBUGGING WITH VALGRIND SUPPORT</a>
<li><a name="TOC21" href="#SEC21">CODE COVERAGE REPORTING</a>
<li><a name="TOC22" href="#SEC22">DISABLING THE Z AND T FORMATTING MODIFIERS</a>
<li><a name="TOC23" href="#SEC23">SUPPORT FOR FUZZERS</a>
<li><a name="TOC24" href="#SEC24">OBSOLETE OPTION</a>
<li><a name="TOC25" href="#SEC25">SEE ALSO</a>
<li><a name="TOC26" href="#SEC26">AUTHOR</a>
<li><a name="TOC27" href="#SEC27">REVISION</a>
</ul>
<br><a name="SEC1" href="#TOC1">BUILDING PCRE2</a><br>
<P>
PCRE2 is distributed with a <b>configure</b> script that can be used to build
the library in Unix-like environments using the applications known as
Autotools. Also in the distribution are files to support building using
<b>CMake</b> instead of <b>configure</b>. The text file
<a href="README.txt"><b>README</b></a>
contains general information about building with Autotools (some of which is
repeated below), and also has some comments about building on various operating
systems. The files in the <b>vms</b> directory support building under OpenVMS.
There is a lot more information about building PCRE2 without using
Autotools (including information about using <b>CMake</b> and building "by
hand") in the text file called
<a href="NON-AUTOTOOLS-BUILD.txt"><b>NON-AUTOTOOLS-BUILD</b>.</a>
You should consult this file as well as the
<a href="README.txt"><b>README</b></a>
file if you are building in a non-Unix-like environment.
</P>
<br><a name="SEC2" href="#TOC1">PCRE2 BUILD-TIME OPTIONS</a><br>
<P>
The rest of this document describes the optional features of PCRE2 that can be
selected when the library is compiled. It assumes use of the <b>configure</b>
script, where the optional features are selected or deselected by providing
options to <b>configure</b> before running the <b>make</b> command. However, the
same options can be selected in both Unix-like and non-Unix-like environments
if you are using <b>CMake</b> instead of <b>configure</b> to build PCRE2.
</P>
<P>
If you are not using Autotools or <b>CMake</b>, option selection can be done by
editing the <b>config.h</b> file, or by passing parameter settings to the
compiler, as described in
<a href="NON-AUTOTOOLS-BUILD.txt"><b>NON-AUTOTOOLS-BUILD</b>.</a>
</P>
<P>
The complete list of options for <b>configure</b> (which includes the standard
ones such as the selection of the installation directory) can be obtained by
running
<pre>
  ./configure --help
</pre>
The following sections include descriptions of "on/off" options whose names
begin with --enable or --disable. Because of the way that <b>configure</b>
works, --enable and --disable always come in pairs, so the complementary option
always exists as well, but as it specifies the default, it is not described.
Options that specify values have names that start with --with. At the end of a
<b>configure</b> run, a summary of the configuration is output.
</P>
<br><a name="SEC3" href="#TOC1">BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a><br>
<P>
By default, a library called <b>libpcre2-8</b> is built, containing functions
that take string arguments contained in arrays of bytes, interpreted either as
single-byte characters, or UTF-8 strings. You can also build two other
libraries, called <b>libpcre2-16</b> and <b>libpcre2-32</b>, which process
strings that are contained in arrays of 16-bit and 32-bit code units,
respectively. These can be interpreted either as single-unit characters or
UTF-16/UTF-32 strings. To build these additional libraries, add one or both of
the following to the <b>configure</b> command:
<pre>
  --enable-pcre2-16
  --enable-pcre2-32
</pre>
If you do not want the 8-bit library, add
<pre>
  --disable-pcre2-8
</pre>
as well. At least one of the three libraries must be built. Note that the POSIX
wrapper is for the 8-bit library only, and that <b>pcre2grep</b> is an 8-bit
program. Neither of these is built if you select only the 16-bit or 32-bit
libraries.
</P>
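<P>
As a rough illustration of how the width options combine, the following shell
sketch assembles the flags for a chosen set of libraries. The helper name
<b>width_flags</b> and its yes/no arguments are hypothetical and exist only for
this example; the --enable and --disable options themselves are the ones
described above.
</P>

```shell
#!/bin/sh
# Hypothetical helper: prints the configure flags for a chosen set of libraries.
# Arguments: "yes" or "no" for the 8-bit, 16-bit and 32-bit library, in that order.
width_flags() {
  # At least one of the three libraries must be built.
  if [ "$1" = no ] && [ "$2" != yes ] && [ "$3" != yes ]; then
    echo "at least one of the three libraries must be built" >&2
    return 1
  fi
  flags=""
  if [ "$1" = no ]; then flags="$flags --disable-pcre2-8"; fi
  if [ "$2" = yes ]; then flags="$flags --enable-pcre2-16"; fi
  if [ "$3" = yes ]; then flags="$flags --enable-pcre2-32"; fi
  # Strip the leading space left by the concatenation above.
  echo "${flags# }"
}

# Example: 16-bit library only. The POSIX wrapper and pcre2grep are not built
# in this configuration because both depend on the 8-bit library.
echo "./configure $(width_flags no yes no)"
```

<P>
The sketch only prints the command line, so it can be checked before anything
is configured.
</P>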
<br><a name="SEC4" href="#TOC1">BUILDING SHARED AND STATIC LIBRARIES</a><br>
<P>
The Autotools PCRE2 building process uses <b>libtool</b> to build both shared
and static libraries by default. You can suppress an unwanted library by adding
one of
<pre>
  --disable-shared
  --disable-static
</pre>
to the <b>configure</b> command. Setting --disable-shared ensures that PCRE2
libraries are built as static libraries. The binaries that are then created as
part of the build process (for example, <b>pcre2test</b> and <b>pcre2grep</b>)
are linked statically with one or more PCRE2 libraries, but may also be
dynamically linked with other libraries such as <b>libc</b>. If you want these
binaries to be fully statically linked, you can set LDFLAGS like this:
<pre>
  LDFLAGS=--static ./configure --disable-shared
</pre>
Note the two hyphens in --static. Of course, this works only if static versions
of all the relevant libraries are available for linking.
</P>
<br><a name="SEC5" href="#TOC1">UNICODE AND UTF SUPPORT</a><br>
<P>
By default, PCRE2 is built with support for Unicode and UTF character strings.
To build it without Unicode support, add
<pre>
  --disable-unicode
</pre>
to the <b>configure</b> command. This setting applies to all three libraries. It
is not possible to build one library with Unicode support and another without
in the same configuration.
</P>
<P>
Of itself, Unicode support does not make PCRE2 treat strings as UTF-8, UTF-16
or UTF-32. To do that, applications that use the library can set the PCRE2_UTF
option when they call <b>pcre2_compile()</b> to compile a pattern.
Alternatively, patterns may be started with (*UTF) unless the application has
locked this out by setting PCRE2_NEVER_UTF.
</P>
<P>
UTF support allows the libraries to process character code points up to
0x10ffff in the strings that they handle. Unicode support also gives access to
the Unicode properties of characters, using pattern escapes such as \P, \p,
and \X. Only the general category properties such as <i>Lu</i> and <i>Nd</i>,
script names, and some bi-directional properties are supported. Details are
given in the
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
documentation.
</P>
<P>
Pattern escapes such as \d and \w do not by default make use of Unicode
properties. The application can request that they do by setting the PCRE2_UCP
option. Unless the application has set PCRE2_NEVER_UCP, a pattern may also
request this by starting with (*UCP).
</P>
<br><a name="SEC6" href="#TOC1">DISABLING THE USE OF \C</a><br>
<P>
The \C escape sequence, which matches a single code unit, even in a UTF mode,
can cause unpredictable behaviour because it may leave the current matching
point in the middle of a multi-code-unit character. The application can lock it
out by setting the PCRE2_NEVER_BACKSLASH_C option when calling
<b>pcre2_compile()</b>. There is also a build-time option
<pre>
  --enable-never-backslash-C
</pre>
(note the upper case C) which locks out the use of \C entirely.
</P>
<br><a name="SEC7" href="#TOC1">JUST-IN-TIME COMPILER SUPPORT</a><br>
<P>
Just-in-time (JIT) compiler support is included in the build by specifying
<pre>
  --enable-jit
</pre>
This support is available only for certain hardware architectures. If this
option is set for an unsupported architecture, a build error occurs.
If in doubt, use
<pre>
  --enable-jit=auto
</pre>
which enables JIT only if the current hardware is supported. You can check
whether JIT is enabled in the configuration summary that is output at the end
of a <b>configure</b> run. If you are enabling JIT under SELinux you may also
want to add
<pre>
  --enable-jit-sealloc
</pre>
which enables the use of an execmem allocator in JIT that is compatible with
SELinux. This has no effect if JIT is not enabled. See the
<a href="pcre2jit.html"><b>pcre2jit</b></a>
documentation for a discussion of JIT usage. When JIT support is enabled,
<b>pcre2grep</b> automatically makes use of it, unless you add
<pre>
  --disable-pcre2grep-jit
</pre>
to the <b>configure</b> command.
</P>
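<P>
Besides reading the configuration summary, it may be convenient to ask a
finished build whether JIT was compiled in. The sketch below assumes that
<b>pcre2test</b> has been built in the current directory and that its
<b>-C jit</b> option prints 1 or 0, as its own documentation describes; please
check that documentation for your release. The helper name <b>describe_jit</b>
is hypothetical.
</P>

```shell
#!/bin/sh
# Hypothetical helper: turns the value printed by "pcre2test -C jit" into a message.
describe_jit() {
  case "$1" in
    1) echo "JIT support is compiled in" ;;
    0) echo "JIT support is not compiled in" ;;
    *) echo "JIT status unknown" ;;
  esac
}

# Assumption: pcre2test was built in the current directory.
# The printed value is captured rather than the exit status, because pcre2test
# is documented to mirror the printed value in its exit code, which is the
# reverse of the usual shell success convention.
if [ -x ./pcre2test ]; then
  describe_jit "$(./pcre2test -C jit)"
else
  describe_jit ""
fi
```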
<br><a name="SEC8" href="#TOC1">NEWLINE RECOGNITION</a><br>
<P>
By default, PCRE2 interprets the linefeed (LF) character as indicating the end
of a line. This is the normal newline character on Unix-like systems. You can
compile PCRE2 to use carriage return (CR) instead, by adding
<pre>
  --enable-newline-is-cr
</pre>
to the <b>configure</b> command. There is also an --enable-newline-is-lf option,
which explicitly specifies linefeed as the newline character.
</P>
<P>
Alternatively, you can specify that line endings are to be indicated by the
two-character sequence CRLF (CR immediately followed by LF). If you want this,
add
<pre>
  --enable-newline-is-crlf
</pre>
to the <b>configure</b> command. There is a fourth option, specified by
<pre>
  --enable-newline-is-anycrlf
</pre>
which causes PCRE2 to recognize any of the three sequences CR, LF, or CRLF as
indicating a line ending. A fifth option, specified by
<pre>
  --enable-newline-is-any
</pre>
causes PCRE2 to recognize any Unicode newline sequence. The Unicode newline
sequences are the three just mentioned, plus the single characters VT (vertical
tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
separator, U+2028), and PS (paragraph separator, U+2029). The final option is
<pre>
  --enable-newline-is-nul
</pre>
which causes NUL (binary zero) to be set as the default line-ending character.
</P>
<P>
Whatever default line ending convention is selected when PCRE2 is built can be
overridden by applications that use the library. At build time it is
recommended to use the standard for your operating system.
</P>
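<P>
The six newline options above share one naming pattern, which the following
shell sketch makes explicit. The helper name <b>newline_flag</b> is
hypothetical; it accepts only the six conventions listed in this section and
rejects anything else.
</P>

```shell
#!/bin/sh
# Hypothetical helper: maps a newline convention name to its configure option.
newline_flag() {
  case "$1" in
    cr|lf|crlf|anycrlf|any|nul)
      echo "--enable-newline-is-$1" ;;
    *)
      echo "unknown newline convention: $1" >&2
      return 1 ;;
  esac
}

# Example: accept CR, LF, or CRLF as a line ending by default.
echo "./configure $(newline_flag anycrlf)"
```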
<br><a name="SEC9" href="#TOC1">WHAT \R MATCHES</a><br>
<P>
By default, the sequence \R in a pattern matches any Unicode newline sequence,
independently of what has been selected as the line ending sequence. If you
specify
<pre>
  --enable-bsr-anycrlf
</pre>
the default is changed so that \R matches only CR, LF, or CRLF. Whatever is
selected when PCRE2 is built can be overridden by applications that use the
library.
</P>
<br><a name="SEC10" href="#TOC1">HANDLING VERY LARGE PATTERNS</a><br>
<P>
Within a compiled pattern, offset values are used to point from one part to
another (for example, from an opening parenthesis to an alternation
metacharacter). By default, in the 8-bit and 16-bit libraries, two-byte values
are used for these offsets, leading to a maximum size for a compiled pattern of
around 64 thousand code units. This is sufficient to handle all but the most
gigantic patterns. Nevertheless, some people do want to process truly enormous
patterns, so it is possible to compile PCRE2 to use three-byte or four-byte
offsets by adding a setting such as
<pre>
  --with-link-size=3
</pre>
to the <b>configure</b> command. The value given must be 2, 3, or 4. For the
16-bit library, a value of 3 is rounded up to 4. In these libraries, using
longer offsets slows down the operation of PCRE2 because it has to load
additional data when handling them. For the 32-bit library the value is always
4 and cannot be overridden; the value of --with-link-size is ignored.
</P>
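<P>
The rules for how --with-link-size is interpreted by each library can be
restated as a small shell sketch. The helper name <b>effective_link_size</b> is
hypothetical; it simply encodes the three rules given above and is not part of
the PCRE2 build system.
</P>

```shell
#!/bin/sh
# Hypothetical helper: prints the link size a library ends up using.
# $1 is the code unit width (8, 16 or 32); $2 is the --with-link-size value.
effective_link_size() {
  case "$2" in
    2|3|4) ;;
    *) echo "link size must be 2, 3, or 4" >&2; return 1 ;;
  esac
  case "$1" in
    8)  echo "$2" ;;                                             # used as given
    16) if [ "$2" -eq 3 ]; then echo 4; else echo "$2"; fi ;;    # 3 is rounded up to 4
    32) echo 4 ;;                                                # always 4, setting ignored
    *)  echo "unknown code unit width: $1" >&2; return 1 ;;
  esac
}

# Effect of --with-link-size=3 on each of the three libraries.
for width in 8 16 32; do
  echo "libpcre2-$width: $(effective_link_size "$width" 3)"
done
```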
<br><a name="SEC11" href="#TOC1">LIMITING PCRE2 RESOURCE USAGE</a><br>
<P>
The <b>pcre2_match()</b> function increments a counter each time it goes round
its main loop. Putting a limit on this counter controls the amount of computing
resource used by a single call to <b>pcre2_match()</b>. The limit can be changed
at run time, as described in the
<a href="pcre2api.html"><b>pcre2api</b></a>
documentation. The default is 10 million, but this can be changed by adding a
setting such as
<pre>
  --with-match-limit=500000
</pre>
to the <b>configure</b> command. This setting also applies to the
<b>pcre2_dfa_match()</b> matching function, and to JIT matching (though the
counting is done differently).
</P>
<P>
The <b>pcre2_match()</b> function uses heap memory to record backtracking
points. The more nested backtracking points there are (that is, the deeper the
search tree), the more memory is needed. There is an upper limit, specified in
kibibytes (units of 1024 bytes). This limit can be changed at run time, as
described in the
<a href="pcre2api.html"><b>pcre2api</b></a>
documentation. The default limit (in effect unlimited) is 20 million kibibytes.
You can change this by a setting such as
<pre>
  --with-heap-limit=500
</pre>
which limits the amount of heap to 500 KiB. This limit applies only to
interpretive matching in <b>pcre2_match()</b> and <b>pcre2_dfa_match()</b>, which
may also use the heap for internal workspace when processing complicated
patterns. This limit does not apply when JIT (which has its own memory
arrangements) is used.
</P>
<P>
You can also explicitly limit the depth of nested backtracking in the
<b>pcre2_match()</b> interpreter. This limit defaults to the value that is set
for --with-match-limit. You can set a lower default limit by adding, for
example,
<pre>
  --with-match-limit-depth=10000
</pre>
to the <b>configure</b> command. This value can be overridden at run time. This
depth limit indirectly limits the amount of heap memory that is used, but
because the size of each backtracking "frame" depends on the number of
capturing parentheses in a pattern, the amount of heap that is used before the
limit is reached varies from pattern to pattern. This limit was more useful in
versions before 10.30, where function recursion was used for backtracking.
</P>
<P>
As well as applying to <b>pcre2_match()</b>, the depth limit also controls
the depth of recursive function calls in <b>pcre2_dfa_match()</b>. These are
used for lookaround assertions, atomic groups, and recursion within patterns.
The limit does not apply to JIT matching.
</P>
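<P>
The three limits are independent settings that are often chosen together. The
shell sketch below combines the example values used in this section and
converts the heap limit to bytes as a sanity check on the units. The variable
names are hypothetical and the values are illustrative, not recommendations.
</P>

```shell
#!/bin/sh
# Example values taken from the text above; adjust to suit the application.
MATCH_LIMIT=500000      # main-loop counter limit
HEAP_LIMIT_KIB=500      # heap limit, in kibibytes (units of 1024 bytes)
DEPTH_LIMIT=10000       # nested backtracking depth limit

# The heap limit is expressed in KiB, so multiply by 1024 to get bytes.
HEAP_LIMIT_BYTES=$((HEAP_LIMIT_KIB * 1024))
echo "heap limit: $HEAP_LIMIT_KIB KiB = $HEAP_LIMIT_BYTES bytes"

# Assemble the options and show the resulting command line.
LIMIT_OPTS="--with-match-limit=$MATCH_LIMIT --with-heap-limit=$HEAP_LIMIT_KIB --with-match-limit-depth=$DEPTH_LIMIT"
echo "./configure $LIMIT_OPTS"
```

<P>
All three values remain defaults only: an application can still change them at
run time as described in the <b>pcre2api</b> documentation.
</P>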
<br><a name="SEC12" href="#TOC1">LIMITING VARIABLE-LENGTH LOOKBEHIND ASSERTIONS</a><br>
<P>
Lookbehind assertions in which one or more branches can match a variable number
of characters are supported only if there is a maximum matching length for each
top-level branch. There is a limit to this maximum that defaults to 255
characters. You can alter this default by a setting such as
<pre>
  --with-max-varlookbehind=100
</pre>
The limit can be changed at runtime by calling
<b>pcre2_set_max_varlookbehind()</b>. Lookbehind assertions in which every
branch matches a fixed number of characters (not necessarily all the same) are
not constrained by this limit.
<a name="createtables"></a></P>
<br><a name="SEC13" href="#TOC1">CREATING CHARACTER TABLES AT BUILD TIME</a><br>
<P>
PCRE2 uses fixed tables for processing characters whose code points are less
than 256. By default, PCRE2 is built with a set of tables that are distributed
in the file <i>src/pcre2_chartables.c.dist</i>. These tables are for ASCII codes
only. If you add
<pre>
  --enable-rebuild-chartables
</pre>
to the <b>configure</b> command, the distributed tables are no longer used.
Instead, a program called <b>pcre2_dftables</b> is compiled and run. This
outputs the source for a new set of tables, created in the default locale of
your C run-time system. This method of replacing the tables does not work if
you are cross compiling, because <b>pcre2_dftables</b> needs to be run on the
local host and therefore must not be compiled with the cross compiler.
</P>
<P>
If you need to create alternative tables when cross compiling, you will have to
do so "by hand". There may also be other reasons for creating tables manually.
To cause <b>pcre2_dftables</b> to be built on the local host, run a normal
compiling command, and then run the program with the output file as its
argument, for example:
<pre>
  cc src/pcre2_dftables.c -o pcre2_dftables
  ./pcre2_dftables src/pcre2_chartables.c
</pre>
This builds the tables in the default locale of the local host. If you want to
specify a locale, you must use the -L option:
<pre>
  LC_ALL=fr_FR ./pcre2_dftables -L src/pcre2_chartables.c
</pre>
You can also specify -b (with or without -L). This causes the tables to be
written in binary instead of as source code. A set of binary tables can be
loaded into memory by an application and passed to <b>pcre2_compile()</b> in the
same way as tables created by calling <b>pcre2_maketables()</b>. The tables are
just a string of bytes, independent of hardware characteristics such as
endianness. This means they can be bundled with an application that runs in
different environments, to ensure consistent behaviour.
</P>
<br><a name="SEC14" href="#TOC1">USING EBCDIC CODE</a><br>
<P>
PCRE2 assumes by default that it will run in an environment where the character
code is ASCII or Unicode, which is a superset of ASCII. This is the case for
most computer operating systems. PCRE2 can, however, be compiled to run in an
8-bit EBCDIC environment by adding
<pre>
  --enable-ebcdic --disable-unicode
</pre>
to the <b>configure</b> command. This setting implies
--enable-rebuild-chartables. You should use it only if you know that you are in
an EBCDIC environment (for example, an IBM mainframe operating system).
</P>
<P>
It is not possible to support both EBCDIC and UTF-8 codes in the same version
of the library. Consequently, --enable-unicode and --enable-ebcdic are mutually
exclusive.
</P>
<P>
The EBCDIC character that corresponds to an ASCII LF is assumed to have the
value 0x15 by default. However, in some EBCDIC environments, 0x25 is used. In
such an environment you should use
<pre>
  --enable-ebcdic-nl25
</pre>
as well as, or instead of, --enable-ebcdic. The EBCDIC character for CR has the
same value as in ASCII, namely, 0x0d. Whichever of 0x15 and 0x25 is <i>not</i>
chosen as LF is made to correspond to the Unicode NEL character (which, in
Unicode, is 0x85).
</P>
<P>
The options that select newline behaviour, such as --enable-newline-is-cr,
and equivalent run-time options, refer to these character values in an EBCDIC
environment.
</P>
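<P>
The way LF and NEL swap values between the two EBCDIC variants is easy to get
backwards, so the following shell sketch tabulates it. The helper name
<b>ebcdic_newlines</b> and its argument values are hypothetical; the byte
values are the ones stated in this section.
</P>

```shell
#!/bin/sh
# Hypothetical helper: prints the EBCDIC byte values used for LF, NEL and CR.
# $1 is "default" for --enable-ebcdic alone, or "nl25" for --enable-ebcdic-nl25.
ebcdic_newlines() {
  case "$1" in
    default) echo "LF=0x15 NEL=0x25 CR=0x0d" ;;
    nl25)    echo "LF=0x25 NEL=0x15 CR=0x0d" ;;
    *)
      echo "expected 'default' or 'nl25'" >&2
      return 1 ;;
  esac
}

# Show both variants side by side.
for variant in default nl25; do
  echo "$variant: $(ebcdic_newlines "$variant")"
done
```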
<br><a name="SEC15" href="#TOC1">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a><br>
<P>
By default <b>pcre2grep</b> supports the use of callouts with string arguments
within the patterns it is matching. There are two kinds: one that generates
output using local code, and another that calls an external program or script.
If --disable-pcre2grep-callout-fork is added to the <b>configure</b> command,
only the first kind of callout is supported; if --disable-pcre2grep-callout is
used, all callouts are completely ignored. For more details of <b>pcre2grep</b>
callouts, see the
<a href="pcre2grep.html"><b>pcre2grep</b></a>
documentation.
</P>
<br><a name="SEC16" href="#TOC1">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br>
<P>
By default, <b>pcre2grep</b> reads all files as plain text. You can build it so
that it recognizes files whose names end in <b>.gz</b> or <b>.bz2</b>, and reads
them with <b>libz</b> or <b>libbz2</b>, respectively, by adding one or both of
<pre>
  --enable-pcre2grep-libz
  --enable-pcre2grep-libbz2
</pre>
to the <b>configure</b> command. These options naturally require that the
relevant libraries are installed on your system. Configuration will fail if
they are not.
</P>
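<P>
As a rough illustration of selecting a reader by file-name suffix, the
following C sketch classifies a name as plain, gzip or bzip2. It is not the
<b>pcre2grep</b> source: the names <b>file_kind</b>, <b>has_suffix()</b> and
<b>classify_by_name()</b> are invented for this example, and the two
<i>have_</i> arguments stand for whether the corresponding configure option
was enabled.
</P>

```c
#include <string.h>

/* Hypothetical classification used only in this sketch. */
typedef enum { FILE_PLAIN, FILE_GZIP, FILE_BZIP2 } file_kind;

/* Returns non-zero if name ends with suffix. */
int has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name);
    size_t s = strlen(suffix);
    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/* have_libz / have_libbz2 model --enable-pcre2grep-libz and
   --enable-pcre2grep-libbz2 (assumption). Without them, every file
   is treated as plain text, whatever its name. */
file_kind classify_by_name(const char *name, int have_libz, int have_libbz2)
{
    if (have_libz && has_suffix(name, ".gz")) return FILE_GZIP;
    if (have_libbz2 && has_suffix(name, ".bz2")) return FILE_BZIP2;
    return FILE_PLAIN;
}
```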
<br><a name="SEC17" href="#TOC1">PCRE2GREP BUFFER SIZE</a><br>
<P>
<b>pcre2grep</b> uses an internal buffer to hold a "window" on the file it is
scanning, in order to be able to output "before" and "after" lines when it
finds a match. The default starting size of the buffer is 20KiB. The buffer
itself is three times this size, but because of the way it is used for holding
"before" lines, the longest line that is guaranteed to be processable is the
notional buffer size. If a longer line is encountered, <b>pcre2grep</b>
automatically expands the buffer, up to a specified maximum size, whose default
is 1MiB or the starting size, whichever is the larger. You can change the
default parameter values by adding, for example,
<pre>
  --with-pcre2grep-bufsize=51200
  --with-pcre2grep-max-bufsize=2097152
</pre>
to the <b>configure</b> command. The caller of <b>pcre2grep</b> can override
these values by using --buffer-size and --max-buffer-size on the command line.
</P>
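<P>
The arithmetic behind these sizes can be sketched in C. The function and macro
names below are hypothetical and exist only for this example; they simply
restate the rules given above (the real buffer is three times the notional
size, and the default maximum is the larger of 1MiB and the starting size).
</P>

```c
#include <stddef.h>

#define SKETCH_DEFAULT_BUFSIZE ((size_t)20 * 1024)    /* 20KiB starting size */
#define SKETCH_ONE_MIB         ((size_t)1024 * 1024)  /* 1MiB */

/* Memory actually allocated for a given notional buffer size. */
size_t allocated_size(size_t bufsize)
{
    return 3 * bufsize;
}

/* Default maximum when only the starting size is specified:
   1MiB or the starting size, whichever is the larger. */
size_t default_max_bufsize(size_t bufsize)
{
    return bufsize > SKETCH_ONE_MIB ? bufsize : SKETCH_ONE_MIB;
}
```

<P>
With the example values shown earlier, 51200 is 50KiB, giving an allocation of
153600 bytes, and 2097152 is 2MiB.
</P>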
<br><a name="SEC18" href="#TOC1">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a><br>
<P>
If you add one of
<pre>
  --enable-pcre2test-libreadline
  --enable-pcre2test-libedit
</pre>
to the <b>configure</b> command, <b>pcre2test</b> is linked with the
<b>libreadline</b> or <b>libedit</b> library, respectively, and when its input is
from a terminal, it reads it using the <b>readline()</b> function. This provides
line-editing and history facilities. Note that <b>libreadline</b> is
GPL-licensed, so if you distribute a binary of <b>pcre2test</b> linked in this
way, there may be licensing issues. These can be avoided by linking instead
with <b>libedit</b>, which has a BSD licence.
</P>
<P>
Setting --enable-pcre2test-libreadline causes the <b>-lreadline</b> option to be
added to the <b>pcre2test</b> build. In many operating environments with a
system-installed readline library this is sufficient. However, in some
environments (e.g. if an unmodified distribution version of readline is in
use), some extra configuration may be necessary. The INSTALL file for
<b>libreadline</b> says this:
<pre>
  "Readline uses the termcap functions, but does not link with
  the termcap or curses library itself, allowing applications
  which link with readline to choose an appropriate library."
</pre>
If your environment has not been set up so that an appropriate library is
automatically included, you may need to add something like
<pre>
  LIBS="-lncurses"
</pre>
immediately before the <b>configure</b> command.
</P>
<br><a name="SEC19" href="#TOC1">INCLUDING DEBUGGING CODE</a><br>
<P>
If you add
<pre>
  --enable-debug
</pre>
to the <b>configure</b> command, additional debugging code is included in the
build. This feature is intended for use by the PCRE2 maintainers.
</P>
<br><a name="SEC20" href="#TOC1">DEBUGGING WITH VALGRIND SUPPORT</a><br>
<P>
If you add
<pre>
  --enable-valgrind
</pre>
to the <b>configure</b> command, PCRE2 will use valgrind annotations to mark
certain memory regions as unaddressable. This allows it to detect invalid
memory accesses, and is mostly useful for debugging PCRE2 itself.
</P>
<br><a name="SEC21" href="#TOC1">CODE COVERAGE REPORTING</a><br>
<P>
If your C compiler is gcc, you can build a version of PCRE2 that can generate a
code coverage report for its test suite. To enable this, you must install
<b>lcov</b> version 1.6 or above. Then specify
<pre>
  --enable-coverage
</pre>
to the <b>configure</b> command and build PCRE2 in the usual way.
</P>
<P>
Note that using <b>ccache</b> (a caching C compiler) is incompatible with code
coverage reporting. If you have configured <b>ccache</b> to run automatically
on your system, you must set the environment variable
<pre>
  CCACHE_DISABLE=1
</pre>
before running <b>make</b> to build PCRE2, so that <b>ccache</b> is not used.
</P>
<P>
When --enable-coverage is used, the following additional targets are added to
the <i>Makefile</i>:
<pre>
  make coverage
</pre>
This creates a fresh coverage report for the PCRE2 test suite. It is equivalent
to running "make coverage-reset", "make coverage-baseline", "make check", and
then "make coverage-report".
<pre>
  make coverage-reset
</pre>
This zeroes the coverage counters, but does nothing else.
<pre>
  make coverage-baseline
</pre>
This captures baseline coverage information.
<pre>
  make coverage-report
</pre>
This creates the coverage report.
<pre>
  make coverage-clean-report
</pre>
This removes the generated coverage report without cleaning the coverage data
itself.
<pre>
  make coverage-clean-data
</pre>
This removes the captured coverage data without removing the coverage files
created at compile time (*.gcno).
<pre>
  make coverage-clean
</pre>
This cleans all coverage data including the generated coverage report. For more
information about code coverage, see the <b>gcov</b> and <b>lcov</b>
documentation.
</P>
<br><a name="SEC22" href="#TOC1">DISABLING THE Z AND T FORMATTING MODIFIERS</a><br>
<P>
The C99 standard defines formatting modifiers z and t for size_t and
ptrdiff_t values, respectively. By default, PCRE2 uses these modifiers in
environments other than old versions of Microsoft Visual Studio when
__STDC_VERSION__ is defined and has a value greater than or equal to 199901L
(indicating support for C99).
However, there is at least one environment that claims to be C99 but does not
support these modifiers. If
<pre>
  --disable-percent-zt
</pre>
is specified, no use is made of the z or t modifiers. Instead of %td or %zu,
a suitable format is used depending on the size of long for the platform.
</P>
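<P>
The kind of conditional formatting involved might look like the C sketch
below. It is illustrative only and is not how PCRE2 itself is written: the
macros <b>SKETCH_DISABLE_PERCENT_ZT</b> and <b>SKETCH_USE_PERCENT_ZT</b> and
the two functions are hypothetical, and the fallback branch assumes that long
is at least as wide as size_t and ptrdiff_t, which is not true on every
platform.
</P>

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical switch: define SKETCH_DISABLE_PERCENT_ZT to model a build
   configured with --disable-percent-zt. */
#ifndef SKETCH_DISABLE_PERCENT_ZT
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define SKETCH_USE_PERCENT_ZT 1
#endif
#endif

/* Formats a size_t as decimal text; returns what snprintf() returns. */
int format_size(char *out, size_t outlen, size_t value)
{
#ifdef SKETCH_USE_PERCENT_ZT
    return snprintf(out, outlen, "%zu", value);
#else
    /* Assumption: unsigned long can hold every size_t value here. */
    return snprintf(out, outlen, "%lu", (unsigned long)value);
#endif
}

/* Formats a ptrdiff_t as decimal text; returns what snprintf() returns. */
int format_ptrdiff(char *out, size_t outlen, ptrdiff_t value)
{
#ifdef SKETCH_USE_PERCENT_ZT
    return snprintf(out, outlen, "%td", value);
#else
    /* Assumption: long can hold every ptrdiff_t value here. */
    return snprintf(out, outlen, "%ld", (long)value);
#endif
}
```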
<br><a name="SEC23" href="#TOC1">SUPPORT FOR FUZZERS</a><br>
<P>
There is a special option for use by people who want to run fuzzing tests on
PCRE2:
<pre>
  --enable-fuzz-support
</pre>
At present this applies only to the 8-bit library. If set, it causes an extra
library called libpcre2-fuzzsupport.a to be built, but not installed. This
contains a single function called LLVMFuzzerTestOneInput() whose arguments are
a pointer to a string and the length of the string. When called, this function
tries to compile the string as a pattern, and if that succeeds, to match it.
This is done both with no options and with some random option bits that are
generated from the string.
</P>
<P>
Setting --enable-fuzz-support also causes a binary called <b>pcre2fuzzcheck</b>
to be created. This is normally run under valgrind or used when PCRE2 is
compiled with address sanitizing enabled. It calls the fuzzing function and
outputs information about what it is doing. The input strings are specified by
arguments: if an argument starts with "=" the rest of it is a literal input
string. Otherwise, it is assumed to be a file name, and the contents of the
file are the test string.
</P>
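<P>
The argument convention just described can be sketched in C as follows. This
is not the <b>pcre2fuzzcheck</b> source. The callback type <b>fuzz_fn</b>, the
function <b>feed_argument()</b> and the stand-in target
<b>fake_fuzz_target()</b> are all hypothetical; a real driver would pass
LLVMFuzzerTestOneInput() from libpcre2-fuzzsupport.a instead of the stand-in.
</P>

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Shape of a fuzz target: pointer to the input and its length. */
typedef int (*fuzz_fn)(const unsigned char *data, size_t size);

/* Stand-in target for this sketch: only records the input length. */
size_t last_input_size;
int fake_fuzz_target(const unsigned char *data, size_t size)
{
    (void)data;
    last_input_size = size;
    return 0;
}

/* Feeds one command-line argument to the target. "=text" passes text
   literally; anything else is treated as a file name whose contents are
   the input. Returns 0 on success, -1 if the file cannot be read. */
int feed_argument(const char *arg, fuzz_fn target)
{
    FILE *f;
    unsigned char *buf, *bigger;
    size_t cap = 4096, len = 0, n;

    if (arg[0] == '=') {
        target((const unsigned char *)(arg + 1), strlen(arg + 1));
        return 0;
    }

    f = fopen(arg, "rb");
    if (f == NULL) return -1;
    buf = malloc(cap);
    if (buf == NULL) { fclose(f); return -1; }

    /* Read the whole file, doubling the buffer whenever it fills up. */
    while ((n = fread(buf + len, 1, cap - len, f)) > 0) {
        len += n;
        if (len == cap) {
            bigger = realloc(buf, cap * 2);
            if (bigger == NULL) { free(buf); fclose(f); return -1; }
            buf = bigger;
            cap *= 2;
        }
    }
    fclose(f);

    target(buf, len);
    free(buf);
    return 0;
}
```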
<br><a name="SEC24" href="#TOC1">OBSOLETE OPTION</a><br>
<P>
In versions of PCRE2 prior to 10.30, there were two ways of handling
backtracking in the <b>pcre2_match()</b> function. The default was to use the
system stack, but if
<pre>
  --disable-stack-for-recursion
</pre>
was set, memory on the heap was used. From release 10.30 onwards this has
changed (the stack is no longer used) and this option now does nothing except
give a warning.
</P>
<br><a name="SEC25" href="#TOC1">SEE ALSO</a><br>
<P>
<b>pcre2api</b>(3), <b>pcre2-config</b>(1).
</P>
<br><a name="SEC26" href="#TOC1">AUTHOR</a><br>
<P>
Philip Hazel
<br>
Retired from University Computing Service
<br>
Cambridge, England.
<br>
</P>
<br><a name="SEC27" href="#TOC1">REVISION</a><br>
<P>
Last updated: 15 April 2024
<br>
Copyright &copy; 1997-2024 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
</p>