# The One True Awk

This is the version of `awk` described in _The AWK Programming Language_,
Second Edition, by Al Aho, Brian Kernighan, and Peter Weinberger
(Addison-Wesley, 2024, ISBN-13 978-0138269722, ISBN-10 0138269726).

## What's New? ##

This version of Awk handles UTF-8 and comma-separated values (CSV) input.

### Strings ###

Functions that process strings now count Unicode code points, not bytes;
this affects `length`, `substr`, `index`, `match`, `split`,
`sub`, `gsub`, and others.  Note that code
points are not necessarily characters.

UTF-8 sequences may appear in literal strings and regular expressions.
Arbitrary characters may be included with `\u` followed by 1 to 8 hexadecimal digits.

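As a rough illustration of the byte versus code-point distinction, the sketch below compares a byte count taken outside awk with what `length` reports. It assumes that `awk` on your `PATH` is this version; the `AWK` variable and the three function names are hypothetical conveniences, not part of the distribution.

```shell
# Assumption: awk on PATH is this version; set AWK=./a.out to use a fresh build.
AWK=${AWK:-awk}

# Number of bytes in $1, counted without awk (wc -c counts bytes).
byte_len() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# Number of code points in $1, as reported by awk's length().
codepoint_len() {
    "$AWK" -v s="$1" 'BEGIN { print length(s) }'
}

# Build a string from a \u escape written inside the awk program itself.
u_escape_demo() {
    "$AWK" 'BEGIN { print "caf\u00e9" }'
}
```

For example, after `s=$(printf 'h\303\251llo')` (the UTF-8 bytes of `héllo`), `byte_len "$s"` prints 6, while `codepoint_len "$s"` should print 5 with this version. Older or different awks may report 6.
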
### Regular expressions ###

Regular expressions may include UTF-8 code points, including those written with `\u` escapes.

### CSV ###

The option `--csv` turns on CSV processing of input:
fields are separated by commas, fields may be quoted with
double-quote (`"`) characters, and quoted fields may contain embedded newlines.
A double-quote inside a field must be doubled, and the field itself must be quoted.
In CSV mode, `FS` is ignored.

If no explicit separator argument is provided,
field-splitting in `split` is determined by CSV mode.

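The quoting rules above can be tried with a sketch like the following. It assumes that `awk` on your `PATH` is this version; the `AWK` variable, the function names, and the sample records are all hypothetical.

```shell
# Assumption: awk on PATH is this version; set AWK=./a.out to use a fresh build.
AWK=${AWK:-awk}

# Print the field count, then the second field, of each CSV record on stdin.
csv_peek() {
    "$AWK" --csv '{ print NF; print $2 }'
}

# Split a CSV-formatted string with split() and no separator argument,
# so that the splitting follows CSV mode rather than FS.
csv_split_count() {
    "$AWK" --csv -v s="$1" 'BEGIN { print split(s, parts) }'
}
```

With this version, `printf '%s\n' 'id,"Smith, ""Al""",42' | csv_peek` should print `3` followed by `Smith, "Al"`: the quoted comma does not split the field, and the doubled quotes collapse to single ones. Likewise, `csv_split_count 'a,"b,c",d'` should print `3`.
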
## Copyright

Copyright (C) Lucent Technologies 1997<br/>
All Rights Reserved

Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both that the copyright notice and this
permission notice and warranty disclaimer appear in supporting
documentation, and that the name Lucent Technologies or any of
its entities not be used in advertising or publicity pertaining
to distribution of the software without specific, written prior
permission.

LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.

## Distribution and Reporting Problems

Changes, mostly bug fixes and occasional enhancements, are listed
in `FIXES`.  If you distribute this code further, please please please
distribute `FIXES` with it.

If you find errors, please report them
to the current maintainer, [email protected].
Please _also_ open an issue in the GitHub issue tracker, to make
it easy to track issues.
Thanks.

## Submitting Pull Requests

Pull requests are welcome. Some guidelines:

* Please do not use functions or facilities that are not standard (e.g.,
`strlcpy()`, `fpurge()`).

* Please run the test suite and make sure that your changes pass before
posting the pull request. To do so:

  1. Save the previous version of `awk` somewhere in your path. Call it `nawk` (for example).
  1. Run `oldawk=nawk make check > check.out 2>&1`.
  1. Search for `BAD` or `error` in the result. In general, look over it manually to make sure there are no errors.

* Please create the pull request with a request
to merge into the `staging` branch instead of into the `master` branch.
This allows us to do testing, and to make any additional edits or changes
after the merge but before merging to `master`.

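The test-suite steps in the guidelines above can be scripted roughly as follows. This is only a sketch: it assumes you are in the source directory with the previous awk saved as `nawk` on your `PATH`, and the function names `run_checks` and `scan_check_log` are hypothetical.

```shell
# Run the test suite against the saved nawk and keep the full log
# (the command is the one given in the guidelines).
run_checks() {
    oldawk=nawk make check > check.out 2>&1
}

# Hypothetical helper: list suspicious lines in a saved log (default check.out).
# Prints matching lines with their line numbers; exit status 1 means no match.
scan_check_log() {
    grep -nE 'BAD|error' "${1:-check.out}"
}
```

After `run_checks`, calling `scan_check_log` with no output suggests that nothing matched, but it is still worth reading `check.out` by hand as the guidelines recommend.
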
## Building

The program itself is created by

	make

which should produce a sequence of messages roughly like this:

	bison -d  awkgram.y
	awkgram.y: warning: 44 shift/reduce conflicts [-Wconflicts-sr]
	awkgram.y: warning: 85 reduce/reduce conflicts [-Wconflicts-rr]
	awkgram.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o awkgram.tab.o awkgram.tab.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o b.o b.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o main.o main.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o parse.o parse.c
	gcc -g -Wall -pedantic -Wcast-qual -O2 maketab.c -o maketab
	./maketab awkgram.tab.h >proctab.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o proctab.o proctab.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o tran.o tran.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o lib.o lib.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o run.o run.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o lex.o lex.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2 awkgram.tab.o b.o main.o parse.o proctab.o tran.o lib.o run.o lex.o   -lm

This produces an executable `a.out`; you will eventually want to
move this to some place like `/usr/bin/awk`.

If your system does not have `yacc` or `bison` (the GNU
equivalent), you need to install one of them first.
The default in the `makefile` is `bison`; you will have
to edit the `makefile` to use `yacc`.

NOTE: This version uses ISO/IEC C99, as you should also.  We have
compiled this without any changes using `gcc -Wall` and/or local C
compilers on a variety of systems, but new systems or compilers
may raise some new complaint; reports of difficulties are
welcome.

This compiles without change on Macintosh OS X using `gcc` and
the standard developer tools.

You can also use `make CC=g++` to build with the GNU C++ compiler,
should you choose to do so.

## A Note About Releases

We don't usually do releases.

## A Note About Maintenance

NOTICE! Maintenance of this program is on a ''best effort''
basis.  We try to get to issues and pull requests as quickly
as we can.  Unfortunately, however, keeping this program going
is not at the top of our priority list.

#### Last Updated

Mon 05 Feb 2024 08:46:55 IST