# The One True Awk

This is the version of `awk` described in _The AWK Programming Language_,
Second Edition, by Al Aho, Brian Kernighan, and Peter Weinberger
(Addison-Wesley, 2024, ISBN-13 978-0138269722, ISBN-10 0138269726).

## What's New? ##

This version of Awk handles UTF-8 and comma-separated values (CSV) input.

### Strings ###

Functions that process strings now count Unicode code points, not bytes;
this affects `length`, `substr`, `index`, `match`, `split`,
`sub`, `gsub`, and others. Note that code
points are not necessarily characters.

UTF-8 sequences may appear in literal strings and regular expressions.
Arbitrary characters may be included with `\u` followed by 1 to 8 hexadecimal digits.

### Regular expressions ###

Regular expressions may include UTF-8 code points, including `\u`.

### CSV ###

The option `--csv` turns on CSV processing of input:
fields are separated by commas, fields may be quoted with
double-quote (`"`) characters, quoted fields may contain embedded newlines.
Double-quotes in fields have to be doubled and enclosed in quoted fields.
In CSV mode, `FS` is ignored.
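
The quoting rules above can be sketched with a single record. This is an illustration only: the shell function name `csv_demo`, the `AWK` variable, and the sample record are assumptions made for the example, and `AWK` is assumed to name a binary built from this source.

```shell
# Assumption: AWK names an awk that supports --csv, such as the one
# built from this source tree. Override it if that is not `awk` on PATH.
AWK=${AWK:-awk}

# Hypothetical helper: feed one CSV record to awk and print the field
# count followed by fields 2 and 3.
csv_demo() {
    printf '%s\n' 'id,"Smith, Jane","said ""hi"""' |
        "$AWK" --csv '{ print NF; print $2; print $3 }'
}
```

Called as `csv_demo`, this should print `3`, then `Smith, Jane`, then `said "hi"`: the comma inside the quoted field does not split it, the enclosing quotes are removed, and each doubled quote becomes a single one.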

If no explicit separator argument is provided,
field-splitting in `split` is determined by CSV mode.

## Copyright

Copyright (C) Lucent Technologies 1997<br/>
All Rights Reserved

Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both that the copyright notice and this
permission notice and warranty disclaimer appear in supporting
documentation, and that the name Lucent Technologies or any of
its entities not be used in advertising or publicity pertaining
to distribution of the software without specific, written prior
permission.

LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.

## Distribution and Reporting Problems

Changes, mostly bug fixes and occasional enhancements, are listed
in `FIXES`. If you distribute this code further, please please please
distribute `FIXES` with it.

If you find errors, please report them
to the current maintainer, [email protected].
Please _also_ open an issue in the GitHub issue tracker, to make
it easy to track issues.
Thanks.

## Submitting Pull Requests

Pull requests are welcome. Some guidelines:

* Please do not use functions or facilities that are not standard (e.g.,
`strlcpy()`, `fpurge()`).

* Please run the test suite and make sure that your changes pass before
posting the pull request. To do so:

    1. Save the previous version of `awk` somewhere in your path. Call it `nawk` (for example).
    1. Run `oldawk=nawk make check > check.out 2>&1`.
    1. Search for `BAD` or `error` in the result. In general, look over it manually to make sure there are no errors.

* Please create the pull request with a request
to merge into the `staging` branch instead of into the `master` branch.
This allows us to do testing, and to make any additional edits or changes
after the merge but before merging to `master`.
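
The search step of the test-suite guideline above can be partly automated. The following is a minimal sketch, assuming the output was saved to `check.out` as in the second step; the function name `scan_check_output` is hypothetical, and a clean scan does not replace looking over the output manually.

```shell
# Hypothetical helper: list the lines of a saved `make check` log that
# mention BAD or error, prefixed with their line numbers.
# The exit status is 0 if any such line is found and 1 if none is.
scan_check_output() {
    grep -n -E 'BAD|error' "${1:-check.out}"
}
```

One possible use is `scan_check_output check.out || echo "no BAD or error lines found"`. The match is case-sensitive, mirroring the two strings named in the guideline.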

## Building

The program itself is created by

    make

which should produce a sequence of messages roughly like this:

    bison -d awkgram.y
    awkgram.y: warning: 44 shift/reduce conflicts [-Wconflicts-sr]
    awkgram.y: warning: 85 reduce/reduce conflicts [-Wconflicts-rr]
    awkgram.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o awkgram.tab.o awkgram.tab.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o b.o b.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o main.o main.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o parse.o parse.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 maketab.c -o maketab
    ./maketab awkgram.tab.h >proctab.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o proctab.o proctab.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o tran.o tran.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o lib.o lib.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o run.o run.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o lex.o lex.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 awkgram.tab.o b.o main.o parse.o proctab.o tran.o lib.o run.o lex.o -lm

This produces an executable `a.out`; you will eventually want to
move this to some place like `/usr/bin/awk`.

If your system does not have `yacc` or `bison` (the GNU
equivalent), you need to install one of them first.
The default in the `makefile` is `bison`; you will have
to edit the `makefile` to use `yacc`.

NOTE: This version uses ISO/IEC C99, as you should also. We have
compiled this without any changes using `gcc -Wall` and/or local C
compilers on a variety of systems, but new systems or compilers
may raise some new complaint; reports of difficulties are
welcome.

This compiles without change on Macintosh OS X using `gcc` and
the standard developer tools.

You can also use `make CC=g++` to build with the GNU C++ compiler,
should you choose to do so.

## A Note About Releases

We don't usually do releases.

## A Note About Maintenance

NOTICE! Maintenance of this program is on a ''best effort''
basis. We try to get to issues and pull requests as quickly
as we can. Unfortunately, however, keeping this program going
is not at the top of our priority list.

#### Last Updated

Mon 05 Feb 2024 08:46:55 IST