ANTLR v3.0.1 C Runtime
ANTLR 3.0.1
January 1, 2008

At the moment, the use of the C runtime engine for the parser is not generally
for the inexperienced C programmer. However, this is mainly because of the lack
of documentation on use, which will be corrected shortly. The C runtime
code itself is, however, well documented with doxygen style comments and a
reasonably experienced C programmer should be able to piece it together. You
can visit the documentation at: http://www.antlr.org/api/C/index.html

The general make up is that everything is implemented as a pseudo class/object
initialized with pointers to its 'member' functions and data. All objects are
(usually) created by factories, which auto manage the memory allocation and
release and generally make life easier. If you remember this rule, everything
should fall into place.

Jim Idle - Portland Oregon, Jan 2008
jimi     idle ws

===============================================================================

Terence Parr, parrt at cs usfca edu
ANTLR project lead and supreme dictator for life
University of San Francisco

INTRODUCTION

Welcome to ANTLR v3!  I've been working on this for nearly 4 years and it's
almost ready!  I plan no feature additions between this beta and the first
3.0 release.  I have lots of features to add later, but this will be
the first set.  Ultimately, I need to rewrite ANTLR v3 in itself (it's
written in 2.7.7 at the moment and also needs StringTemplate 3.0 or
later).

You should use v3 in conjunction with ANTLRWorks:

    http://www.antlr.org/works/index.html

WARNING: We have bits of documentation started, but nothing super-complete
yet.  The book will be printed May 2007:

http://www.pragmaticprogrammer.com/titles/tpantlr/index.html

but we should have a beta PDF available on that page in Feb 2007.

You also have the examples plus the source to guide you.

See the new wiki FAQ:

    http://www.antlr.org/wiki/display/ANTLR3/ANTLR+v3+FAQ

and general doc root:

    http://www.antlr.org/wiki/display/ANTLR3/ANTLR+3+Wiki+Home

Please help add/update FAQ entries.

I have made very little effort at this point to deal well with
erroneous input (e.g., bad syntax might make ANTLR crash).  I will clean
this up after I've rewritten v3 in v3.

Per the license in LICENSE.txt, this software is not guaranteed to
work and might even destroy all life on this planet:

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

EXAMPLES

ANTLR v3 sample grammars:

    http://www.antlr.org/download/examples-v3.tar.gz

contains the following examples: LL-star, cminus, dynamic-scope,
fuzzy, hoistedPredicates, island-grammar, java, python, scopes,
simplecTreeParser, treeparser, tweak, xmlLexer.

Also check out Mantra Programming Language for a prototype (work in
progress) using v3:

    http://www.linguamantra.org/

----------------------------------------------------------------------

What is ANTLR?

ANTLR stands for (AN)other (T)ool for (L)anguage (R)ecognition and was
originally known as PCCTS.  ANTLR is a language tool that provides a
framework for constructing recognizers, compilers, and translators
from grammatical descriptions containing actions.  Target language list:

http://www.antlr.org/wiki/display/ANTLR3/Code+Generation+Targets

----------------------------------------------------------------------

How is ANTLR v3 different than ANTLR v2?

See migration guide:
    http://www.antlr.org/wiki/display/ANTLR3/Migrating+from+ANTLR+2+to+ANTLR+3

ANTLR v3 has a far superior parsing algorithm called LL(*) that
handles many more grammars than v2 does.  In practice, it means you
can throw almost any grammar at ANTLR that is non-left-recursive and
unambiguous (a grammar is ambiguous when the same input can be matched
by multiple rules); the cost is perhaps a tiny bit of backtracking, but
with a DFA not a full parser.  You can manually set the max lookahead k
as an option for any decision though.  The LL(*) algorithm ramps up to
use more lookahead when it needs to and is much more efficient than
normal LL backtracking.  There is support for syntactic predicates
(full LL backtracking) when LL(*) fails.

Lexers are much easier due to the LL(*) algorithm as well.  Previously
these two lexer rules would cause trouble because ANTLR couldn't
distinguish between them with finite lookahead to see the decimal
point:

INT : ('0'..'9')+ ;
FLOAT : INT '.' INT ;
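
As a rough illustration of why these two rules need unbounded lookahead,
here is a minimal hand-written sketch in plain Java (no ANTLR
dependency).  The class NumberScan and its classify method are
hypothetical names invented for this example; this is not how ANTLR's
generated DFA is implemented, only the same decision written out by hand.

```java
// Hypothetical illustration; not code generated by ANTLR.
final class NumberScan {
    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Classifies the token at the start of input as "INT", "FLOAT", or "NONE". */
    static String classify(String input) {
        int i = 0;
        // Arbitrary lookahead: skip however many digits there are.
        while (i < input.length() && isDigit(input.charAt(i))) {
            i++;
        }
        if (i == 0) {
            return "NONE"; // no leading digit, so neither rule applies
        }
        // Only now can we see whether '.' and another digit follow.
        boolean fraction = i + 1 < input.length()
                && input.charAt(i) == '.'
                && isDigit(input.charAt(i + 1));
        return fraction ? "FLOAT" : "INT";
    }

    public static void main(String[] args) {
        System.out.println(classify("12345"));
        System.out.println(classify("12345.67"));
    }
}
```

No fixed k is enough here, because the number of digits before the
decimal point is not bounded.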

The syntax is almost identical for features in common, but you should
note that labels are always '=' not ':'.  So do id=ID not id:ID.

You can do combined lexer/parser grammars again (a la PCCTS): both lexer
and parser rules are defined in the same file.  See the examples.
Really nice.  You can reference strings and characters in the grammar
and ANTLR will generate the lexer for you.

The attribute structure has been enhanced.  Rules may have multiple
return values, for example.  Further, there are dynamically scoped
attributes whereby a rule may define a value usable by any rule it
invokes directly or indirectly without having to pass a parameter all
the way down.
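
The idea behind dynamically scoped attributes can be sketched in plain
Java as a stack that the outer rule pushes and nested rules read.  The
class and method names below (DynamicScopeDemo, function, block,
statement) are hypothetical, and this is only an assumption about the
general shape, not the code ANTLR generates.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of a dynamically scoped attribute.
final class DynamicScopeDemo {
    // The top entry belongs to the innermost active invocation of the outer rule.
    private final Deque<String> currentFunction = new ArrayDeque<>();

    /** Outer "rule": defines the scoped value, invokes nested rules, then pops. */
    String function(String name) {
        currentFunction.push(name);
        try {
            return block();
        } finally {
            currentFunction.pop(); // popped even if a nested rule throws
        }
    }

    /** Intermediate "rule": takes no parameter and passes none down. */
    private String block() {
        return statement();
    }

    /** Nested "rule": reads the value defined by its indirect caller. */
    private String statement() {
        return "statement inside " + currentFunction.peek();
    }

    public static void main(String[] args) {
        System.out.println(new DynamicScopeDemo().function("foo"));
    }
}
```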

ANTLR v3 tree construction is far superior--it provides tree rewrite
rules where the right hand side is simply the tree grammar fragment
describing the tree you want to build:

formalArgs
    :   typename declarator (',' typename declarator )*
        -> ^(ARG typename declarator)+
    ;

That builds tree sequences like:

^(ARG int v1) ^(ARG int v2)
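
To make the effect of the ( ... )+ rewrite concrete, the following plain
Java sketch pairs up the collected typename and declarator elements and
emits one ARG subtree per pair, printed in the notation used above.
ArgRewriteDemo and rewrite are hypothetical names, and the length check
is an assumption of this sketch rather than documented ANTLR behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; trees are rendered as strings for brevity.
final class ArgRewriteDemo {
    /** Builds one ^(ARG type name) subtree per (type, name) pair, in input order. */
    static List<String> rewrite(List<String> typenames, List<String> declarators) {
        if (typenames.size() != declarators.size()) {
            throw new IllegalArgumentException("element lists differ in length");
        }
        List<String> trees = new ArrayList<>();
        for (int i = 0; i < typenames.size(); i++) {
            trees.add("^(ARG " + typenames.get(i) + " " + declarators.get(i) + ")");
        }
        return trees;
    }

    public static void main(String[] args) {
        List<String> trees = rewrite(Arrays.asList("int", "int"),
                                     Arrays.asList("v1", "v2"));
        System.out.println(String.join(" ", trees));
    }
}
```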

ANTLR v3 also incorporates StringTemplate:

      http://www.stringtemplate.org

just like AST support.  It is useful for generating output.  For
example this rule creates a template called 'import' for each import
definition found in the input stream:

grammar Java;
options {
  output=template;
}
...
importDefinition
    :   'import' identifierStar SEMI
        -> import(name={$identifierStar.st},
                begin={$identifierStar.start},
                end={$identifierStar.stop})
    ;

The attributes are set via assignments in the argument list.  The
arguments are actions with arbitrary expressions in the target
language.  The .st label property is the result template from a rule
reference.  There is a nice shorthand in actions too:

    %foo(a={},b={},...) ctor
    %({name-expr})(a={},...) indirect template ctor reference
    %{string-expr} anonymous template from string expr
    %{expr}.y = z; set template attribute y of StringTemplate-typed expr to z
    %x.y = z; set template attribute y of x (always set never get attr)
              to z [languages like python without ';' must still use the
              ';' which the code generator is free to remove during code gen]
              Same as '(x).setAttribute("y", z);'

For ANTLR v3 I decided to make the most common tasks easy by default.
This means that some of the basic objects are heavier weight
than some speed demons would like, but they are free to pare it down
leaving most programmers the luxury of having it "just work."  For
example, reading in some input, tweaking it, and writing it back out
while preserving whitespace is easy in v3.

The ANTLR source code is much prettier.  You'll also note that the
run-time classes are conveniently encapsulated in the
org.antlr.runtime package.

----------------------------------------------------------------------

How do I install this damn thing?

Just untar and you'll get:

antlr-3.0b6/README.txt (this file)
antlr-3.0b6/LICENSE.txt
antlr-3.0b6/src/org/antlr/...
antlr-3.0b6/lib/stringtemplate-3.0.jar (3.0b6 needs 3.0)
antlr-3.0b6/lib/antlr-2.7.7.jar
antlr-3.0b6/lib/antlr-3.0b6.jar

Then you need to add all the jars in lib to your CLASSPATH.

----------------------------------------------------------------------

How do I use ANTLR v3?

[I am assuming you are only using the command-line (and not the
ANTLRWorks GUI)].

Running ANTLR with no parameters shows you:

ANTLR Parser Generator   Early Access Version 3.0b6 (Jan 31, 2007) 1989-2007
usage: java org.antlr.Tool [args] file.g [file2.g file3.g ...]
  -o outputDir          specify output directory where all output is generated
  -lib dir              specify location of token files
  -report               print out a report about the grammar(s) processed
  -print                print out the grammar without actions
  -debug                generate a parser that emits debugging events
  -profile              generate a parser that computes profiling information
  -nfa                  generate an NFA for each rule
  -dfa                  generate a DFA for each decision point
  -message-format name  specify output style for messages
  -X                    display extended argument list

For example, consider how to make the LL-star example from the examples
tarball you can get at http://www.antlr.org/download/examples-v3.tar.gz:

$ cd examples/java/LL-star
$ java org.antlr.Tool simplec.g
$ jikes *.java

For input:

char c;
int x;
void bar(int x);
int foo(int y, char d) {
  int i;
  for (i=0; i<3; i=i+1) {
    x=3;
    y=5;
  }
}

you will see output as follows:

$ java Main input
bar is a declaration
foo is a definition

What if I want to test my parser without generating code?  Easy.  Just
run ANTLR in interpreter mode.  It can't execute your actions, but it
can create a parse tree from your input to show you how it would be
matched.  Use the org.antlr.tool.Interp main class.  In the following,
I interpret simplec.g on t.c, which contains "int x;"

$ java org.antlr.tool.Interp simplec.g WS program t.c
( <grammar SimpleC>
  ( program
    ( declaration
      ( variable
        ( type [@0,0:2='int',<14>,1:0] )
        ( declarator [@2,4:4='x',<2>,1:4] )
        [@3,5:5=';',<5>,1:5]
      )
    )
  )
)

where I have formatted the output to make it more readable.  I have
told it to ignore all WS tokens.
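
The bracketed token entries in that tree appear to follow the pattern
[@index,start:stop='text',<type>,line:column]; that reading is an
assumption based on the values shown for "int x;" (token index 1 is the
skipped WS token).  A small plain Java sketch that reproduces the same
layout, with the hypothetical names TokenDisplay and format:

```java
// Hypothetical formatter; field meanings are inferred from the output above.
final class TokenDisplay {
    /** Renders a token as [@index,start:stop='text',<type>,line:column]. */
    static String format(int index, int start, int stop, String text,
                         int type, int line, int column) {
        return "[@" + index + "," + start + ":" + stop
                + "='" + text + "',<" + type + ">,"
                + line + ":" + column + "]";
    }

    public static void main(String[] args) {
        // Token 0 covers characters 0..2 on line 1, column 0.
        System.out.println(format(0, 0, 2, "int", 14, 1, 0));
        // Token 2 is the single character at offset 4.
        System.out.println(format(2, 4, 4, "x", 2, 1, 4));
    }
}
```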

----------------------------------------------------------------------

How do I rebuild ANTLR v3?

Make sure the following jars are in your CLASSPATH:

antlr-3.0b6/lib/stringtemplate-3.0.jar
antlr-3.0b6/lib/antlr-2.7.7.jar
junit.jar [if you want to build the test directories]

then jump into the antlr-3.0b6/src directory and type:

$ javac -d . org/antlr/Tool.java org/antlr/*/*.java org/antlr/*/*/*.java

Takes 9 seconds on my 1GHz laptop or 4 seconds with jikes.  Later I'll
have a real build mechanism, though I must admit the one-liner appeals
to me.  I use IntelliJ so I never actually type anything to build.

There is also an ANT build.xml file, but I know nothing of ANT; it was
contributed by others (I'm opposed to any tool with an XML interface for
Humans).

-----------------------------------------------------------------------
C# Target Notes

1. Auto-generated lexers do not inherit parent parser's @namespace
   {...} value.  Use @lexer::namespace{...}.

-----------------------------------------------------------------------

CHANGES

March 17, 2007

* Jonathan DeKlotz updated C# templates to be 3.0b6 current

March 14, 2007

* Manually-specified (...)=> forces backtracking eval of that predicate.
  backtracking=true mode does not, however.  Added unit test.

March 14, 2007

* Fixed bug in lexer where ~T didn't compute the set from rule T.

* Added -Xnoinlinedfa to make all DFA use tables; no inline prediction
  with IFs.

* Fixed http://www.antlr.org:8888/browse/ANTLR-80.
  Sem pred states didn't define lookahead vars.

* Fixed http://www.antlr.org:8888/browse/ANTLR-91.
  When forcing some acyclic DFA to be state tables, they broke.
  Forcing all DFA to be state tables should give same results.

March 12, 2007

* setTokenSource in CommonTokenStream didn't clear tokens list.
  setCharStream calls reset in Lexer.

* Altered -depend.  No longer printing grammar files for multiple input
  files with -depend.  Doesn't show T__.g temp file anymore. Added
  TLexer.tokens.  Added .h files if defined.

February 11, 2007

* Added -depend command-line option that, instead of processing files,
  shows you what files the input grammar(s) depend on and what files
  they generate. For combined grammar T.g:

  $ java org.antlr.Tool -depend T.g

  You get:

  TParser.java : T.g
  T.tokens : T.g
  T__.g : T.g

  Now, assuming U.g is a tree grammar that references T's tokens:

  $ java org.antlr.Tool -depend T.g U.g

  TParser.java : T.g
  T.tokens : T.g
  T__.g : T.g
  U.g: T.tokens
  U.java : U.g
  U.tokens : U.g

  Handles spaces by escaping them.  Pays attention to -o, -fo and -lib.
  Dir 'x y' is a valid dir in current dir.

  $ java org.antlr.Tool -depend -lib /usr/local/lib -o 'x y' T.g U.g
  x\ y/TParser.java : T.g
  x\ y/T.tokens : T.g
  x\ y/T__.g : T.g
  U.g: /usr/local/lib/T.tokens
  x\ y/U.java : U.g
  x\ y/U.tokens : U.g

  You have API access via org.antlr.tool.BuildDependencyGenerator class:
  getGeneratedFileList(), getDependenciesFileList().  You can also access
  the output template: getDependencies().  The file
  org/antlr/tool/templates/depend.stg contains the template.  You can
  modify as you want.  File objects go in so you can play with path etc...
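
  The space escaping shown in the -depend output above can be mimicked
  with a few lines of plain Java.  This is only a sketch under the
  assumption that each space becomes backslash-space; DependLine,
  escapeSpaces and line are hypothetical names and do not come from
  BuildDependencyGenerator.

```java
// Hypothetical illustration; not the BuildDependencyGenerator implementation.
final class DependLine {
    /** Escapes each space with a backslash, as in the "x\ y" output above. */
    static String escapeSpaces(String path) {
        return path.replace(" ", "\\ ");
    }

    /** Formats one "target : dependency" line, prefixing the output directory if any. */
    static String line(String outputDir, String target, String dependency) {
        String full = outputDir.isEmpty() ? target : outputDir + "/" + target;
        return escapeSpaces(full) + " : " + escapeSpaces(dependency);
    }

    public static void main(String[] args) {
        System.out.println(line("x y", "TParser.java", "T.g"));
        System.out.println(line("", "T.tokens", "T.g"));
    }
}
```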

February 10, 2007

* no more .gl files generated.  All .g all the time.

* changed @finally to be @after and added a finally clause to the
  exception stuff.  I also removed the superfluous "exception"
  keyword.  Here's what the new syntax looks like:

  a
  @after { System.out.println("ick"); }
    : 'a'
    ;
    catch[RecognitionException e] { System.out.println("foo"); }
    catch[IOException e] { System.out.println("io"); }
    finally { System.out.println("foobar"); }

  @after executes after bookkeeping to set $rule.stop, $rule.tree but
  before scopes pop and any memoization happens.  Dynamic scopes and
  memoization are still in generated finally block because they must
  exec even if error in rule.  The @after action and tree setting
  stuff can technically be skipped upon syntax error in rule.  [Later
  we might add something to finally to stick an ERROR token in the
  tree and set the return value.]  Sequence goes: set $stop, $tree (if
  any), @after (if any), pop scopes (if any), memoize (if needed),
  grammar finally clause.  Last 3 are in generated code's finally
  clause.
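
  The sequence described above can be sketched with an ordinary Java
  try/catch/finally.  AfterOrderDemo and invokeRule are hypothetical
  names, and the log strings merely stand in for the generated
  bookkeeping; this is an illustration of the ordering, not the
  generated code itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the ordering; not code generated by ANTLR.
final class AfterOrderDemo {
    /** Logs the steps of one rule invocation; syntaxError simulates a failed match. */
    static List<String> invokeRule(boolean syntaxError) {
        List<String> log = new ArrayList<>();
        try {
            log.add("match");
            if (syntaxError) {
                throw new IllegalStateException("simulated syntax error");
            }
            // Skipped when the rule body fails:
            log.add("set $stop/$tree");
            log.add("@after");
        } catch (IllegalStateException e) {
            log.add("catch");
        } finally {
            // Always executed, even after an error:
            log.add("pop scopes");
            log.add("memoize");
            log.add("grammar finally");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" > ", invokeRule(false)));
        System.out.println(String.join(" > ", invokeRule(true)));
    }
}
```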

3.0b6 - January 31, 2007

January 30, 2007

* Fixed bug in IntervalSet.and: it returned the same empty set all the time
  rather than new empty set.  Code altered the same empty set.

* Made analysis terminate faster upon a decision that takes too long;
  it seemed to keep doing work for a while.  Refactored some names
  and updated comments.  Also made it terminate when it realizes it's
  non-LL(*) due to recursion.  Just added terminate conditions to loop
  in convert().

* Sometimes fatal non-LL(*) messages didn't appear; instead you got
  "antlr couldn't analyze", which is actually untrue.  I had the
  order of some prints wrong in the DecisionProbe.

* The code generator incorrectly detected when it could use a fixed,
  acyclic inline DFA (i.e., using an IF).  Upon non-LL(*) decisions
  with predicates, analysis made cyclic DFA.  But this stops
  the computation detecting whether they are cyclic.  I just added
  a protection in front of the acyclic DFA generator to avoid it if
  non-LL(*).  Updated comments.
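
  The IntervalSet.and bug noted at the top of this list is an instance
  of a general aliasing problem: handing out one shared mutable "empty"
  object.  The sketch below reproduces that class of bug with
  java.util.TreeSet standing in for IntervalSet; EmptySetAliasing,
  andShared and andFresh are hypothetical names, and the real
  IntervalSet code is not shown here.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration using TreeSet as a stand-in for IntervalSet.
final class EmptySetAliasing {
    private static final Set<Integer> SHARED_EMPTY = new TreeSet<>();

    /** Buggy variant: every empty intersection returns the same mutable object. */
    static Set<Integer> andShared(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result.isEmpty() ? SHARED_EMPTY : result;
    }

    /** Fixed variant: every call returns a fresh set the caller may modify. */
    static Set<Integer> andFresh(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> x = new TreeSet<>(Arrays.asList(1));
        Set<Integer> y = new TreeSet<>(Arrays.asList(2));
        andShared(x, y).add(99);                       // caller alters the "empty" set
        System.out.println(andShared(x, y).isEmpty()); // later callers see the change
        andFresh(x, y).add(99);
        System.out.println(andFresh(x, y).isEmpty());  // unaffected by earlier callers
    }
}
```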

January 23, 2007

* Made tree node streams use adaptor to create navigation nodes.
  Thanks to Emond Papegaaij.

January 22, 2007

* Added lexer rule properties: start, stop

January 1, 2007

* analysis failsafe is back on; if a decision takes too long, it bails out
  and uses k=1

January 1, 2007

* += labels for rules only work for output option; previously elements
  of list were the return value structs, but are now either the tree or
  StringTemplate return value.  You can label different rules now
  x+=a x+=b.

December 30, 2006

* Allow \" to work correctly in "..." template.

December 28, 2006

* errors that are now warnings: missing AST label type in trees.
  Also "no start rule detected" is warning.

* tree grammars also can do rewrite=true for output=template.
  Only works for alts with single node or tree as alt elements.
  If you are going to use $text in a tree grammar or do rewrite=true
  for templates, you must use in your main:

  nodes.setTokenStream(tokens);

* You get a warning for tree grammars that do rewrite=true and
  output=template and have -> for alts that are not simple nodes
  or simple trees.  new unit tests in TestRewriteTemplates at end.

December 27, 2006

* Error message appears when you use -> in tree grammar with
  output=template and rewrite=true for alt that is not simple
  node or tree ref.

* no more $stop attribute for tree parsers; meaningless/useless.
  Removed from TreeRuleReturnScope also.

* rule text attribute in tree parser must pull from token buffer.
  Makes no sense otherwise.  added getTokenStream to TreeNodeStream
  so rule $text attr works.  CommonTreeNodeStream etc... now let
  you set the token stream so you can access later from tree parser.
  $text is not well-defined for rules like

     slist : stat+ ;

  because stat is not a single node nor rooted with a single node.
  $slist.text will get only first stat.  I need to add a warning about
  this...

* Fixed http://www.antlr.org:8888/browse/ANTLR-76 for Java.
  Enhanced TokenRewriteStream so it accepts any object; converts
  to string at last second.  Allows you to rewrite with StringTemplate
  templates now :)
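
The "converts to string at last second" idea can be pictured with a small
stand-alone sketch.  This is not the ANTLR class: LazyRewriteSketch and its
members are hypothetical names, tokens are plain strings, and overlapping
replace operations are not handled.  It only shows that the replacement is
kept as an Object and rendered when the stream is printed.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LazyRewriteSketch {
    private final List<String> tokens;
    // start index -> { stop index, replacement object } (hypothetical layout)
    private final Map<Integer, Object[]> ops = new HashMap<>();

    LazyRewriteSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Remember that tokens from..to (inclusive) are replaced by text. */
    void replace(int from, int to, Object text) {
        ops.put(from, new Object[] { to, text });
    }

    @Override
    public String toString() {
        StringBuilder buf = new StringBuilder();
        int i = 0;
        while (i < tokens.size()) {
            Object[] op = ops.get(i);
            if (op != null) {
                buf.append(op[1]);          // the object is stringified only here
                i = (Integer) op[0] + 1;    // skip the replaced token range
            } else {
                buf.append(tokens.get(i));
                i++;
            }
        }
        return buf.toString();
    }
}
```

Because the object is stored as-is, anything with a useful toString() can be
passed, and changes made to it before rendering show up in the output.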

* added rewrite option that makes -> template rewrites do replace ops for
  TokenRewriteStream input stream.  In output=template and rewrite=true mode
  same as before 'cept that the parser does

    ((TokenRewriteStream)input).replace(
          ((Token)retval.start).getTokenIndex(),
          input.LT(-1).getTokenIndex(),
          retval.st);

  after each rewrite so that the input stream is altered.  Later refs to
  $text will have rewrites.  Here's a sample test program for grammar Rew.

        FileReader groupFileR = new FileReader("Rew.stg");
        StringTemplateGroup templates = new StringTemplateGroup(groupFileR);
        ANTLRInputStream input = new ANTLRInputStream(System.in);
        RewLexer lexer = new RewLexer(input);
        TokenRewriteStream tokens = new TokenRewriteStream(lexer);
        RewParser parser = new RewParser(tokens);
        parser.setTemplateLib(templates);
        parser.program();
        System.out.println(tokens.toString());
        groupFileR.close();

December 26, 2006

* BaseTree.dupTree didn't dup recursively.

December 24, 2006

* Cleaned up some comments and removed field treeNode
  from MismatchedTreeNodeException class.  It is "node" in
  RecognitionException.

* Changed type from Object to BitSet for expecting fields in
  MismatchedSetException and MismatchedNotSetException

* Cleaned up error printing in lexers and the messages that it creates.

* Added this to TreeAdaptor:
    /** Return the token object from which this node was created.
     *  Currently used only for printing an error message.
     *  The error display routine in BaseRecognizer needs to
     *  display where in the input the error occurred. If your
     *  tree implementation does not store information that can
     *  lead you to the token, you can create a token filled with
     *  the appropriate information and pass that back.  See
     *  BaseRecognizer.getErrorMessage().
     */
    public Token getToken(Object t);

December 23, 2006

* made BaseRecognizer.displayRecognitionError nonstatic so people can
  override it. Not sure why it was static before.

* Removed state/decision message that comes out of no
  viable alternative exceptions, as that was too much.
  removed the decision number from the early exit exception
  also.  During development, you can simply override
  displayRecognitionError from BaseRecognizer to add the stuff
  back in if you want.

* made output go to an output method you can override: emitErrorMessage()

* general cleanup of the error emitting code in BaseRecognizer.  Lots
  more stuff you can override: getErrorHeader, getTokenErrorDisplay,
  emitErrorMessage, getErrorMessage.

December 22, 2006

* Altered TreeParser.matchAny() so that it skips entire trees if
  node has children otherwise skips one node.  Now this works to
  skip entire body of function if single-rooted subtree:
  ^(FUNC name=ID arg=ID .)
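
The skipping rule can be sketched on a flattened node stream.  This is a
stand-alone illustration, not the runtime code: MatchAnySketch, skipAny, and
the DOWN/UP marker values are assumptions made for the example, and nodes are
reduced to integer token types.

```java
import java.util.List;

final class MatchAnySketch {
    static final int DOWN = 2; // assumed marker: descend into children
    static final int UP = 3;   // assumed marker: return to parent

    /** Returns the index just past the node, or whole subtree, starting at i. */
    static int skipAny(List<Integer> stream, int i) {
        int next = i + 1;
        // Leaf: the node is not followed by DOWN, so skip exactly one node.
        if (next >= stream.size() || stream.get(next) != DOWN) {
            return next;
        }
        // Subtree: consume through the UP that balances the first DOWN.
        int depth = 0;
        for (int p = next; p < stream.size(); p++) {
            int t = stream.get(p);
            if (t == DOWN) {
                depth++;
            } else if (t == UP) {
                depth--;
            }
            if (depth == 0) {
                return p + 1;
            }
        }
        throw new IllegalStateException("unbalanced DOWN/UP in stream");
    }
}
```

For a stream encoding ^(FUNC ID ID ^(BLOCK A B)), calling skipAny at the BLOCK
node lands just past BLOCK's closing UP, while calling it at an ID advances by
one node.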

* Added "reverse index" from node to stream index.  Override
  fillReverseIndex() in CommonTreeNodeStream if you want to change.
  Use getNodeIndex(node) to find stream index for a specific tree node.
  See getNodeIndex(), reverseIndex(Set tokenTypes),
  reverseIndex(int tokenType), fillReverseIndex().  The indexing
  costs time and memory to fill, but pulling stuff out will be lots
  faster as it can jump from a node ptr straight to a stream index.
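
A minimal sketch of the trade-off, assuming a buffered list of nodes and an
identity-keyed map.  ReverseIndexSketch is a hypothetical stand-in, not
CommonTreeNodeStream; only the name getNodeIndex is borrowed from the entry
above.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class ReverseIndexSketch {
    private final List<Object> nodes = new ArrayList<>();
    // Keyed by object identity, so two equal-looking nodes stay distinct.
    private final Map<Object, Integer> nodeToIndex = new IdentityHashMap<>();

    /** Append a node to the buffer and record where it landed. */
    void add(Object node) {
        nodeToIndex.putIfAbsent(node, nodes.size()); // keep first occurrence
        nodes.add(node);
    }

    /** Constant-time lookup from a node reference to its stream index. */
    int getNodeIndex(Object node) {
        Integer i = nodeToIndex.get(node);
        return i == null ? -1 : i;
    }

    Object get(int index) {
        return nodes.get(index);
    }
}
```

The map costs one extra entry per node, which is the memory price the entry
mentions; in exchange a lookup no longer has to scan the buffer.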

* Added TreeNodeStream.get(index) to make it easier for interpreters to
  jump around in tree node stream.

* New CommonTreeNodeStream buffers all nodes in stream for fast jumping
  around.  It now has push/pop methods to invoke other locations in
  the stream for building interpreters.

* Moved CommonTreeNodeStream to UnBufferedTreeNodeStream and removed
  Iterator implementation.  moved toNodesOnlyString() to TestTreeNodeStream

* [BREAKS ANY TREE IMPLEMENTATION]
  made CommonTreeNodeStream work with any tree node type.  TreeAdaptor
  now declares isNil, so custom adaptors must add it; trivial, but does
  break backward compatibility.

December 17, 2006

* Added traceIn/Out methods to recognizers so that you can override them;
  previously they were in-line print statements. The message has also
  been slightly improved.

* Factored BuildParseTree into debug package; cleaned stuff up. Fixed
  unit tests.

December 15, 2006

* [BREAKS ANY TREE IMPLEMENTATION]
  org.antlr.runtime.tree.Tree; needed to add get/set for token start/stop
  index so CommonTreeAdaptor can assume Tree interface not CommonTree
  implementation.  Otherwise, no way to create your own nodes that satisfy
  Tree because CommonTreeAdaptor was doing

    public int getTokenStartIndex(Object t) {
        return ((CommonTree)t).startIndex;
    }

  Added to Tree:

    /**  What is the smallest token index (indexing from 0) for this node
     *   and its children?
     */
    int getTokenStartIndex();

    void setTokenStartIndex(int index);

    /**  What is the largest token index (indexing from 0) for this node
     *   and its children?
     */
    int getTokenStopIndex();

    void setTokenStopIndex(int index);

December 13, 2006

* Added org.antlr.runtime.tree.DOTTreeGenerator so you can generate DOT
  diagrams easily from trees.

    CharStream input = new ANTLRInputStream(System.in);
    TLexer lex = new TLexer(input);
    CommonTokenStream tokens = new CommonTokenStream(lex);
    TParser parser = new TParser(tokens);
    TParser.e_return r = parser.e();
    Tree t = (Tree)r.tree;
    System.out.println(t.toStringTree());
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT(t);
    System.out.println(st);

* Changed the way mark()/rewind() work in CommonTreeNodeStream to mirror
  more flexible solution in ANTLRStringStream.  Forgot to set lastMarker
  anyway.  Now you can rewind to non-most-recent marker.
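
One way to picture rewinding to a marker that is not the most recent one is a
list of saved positions.  This is a simplified sketch with hypothetical names
(MarkRewindSketch); the real streams save more state than a single position,
and here rewind also releases the marker, which is an assumption of the
sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class MarkRewindSketch {
    private int p = 0;                                   // current position
    private final List<Integer> markers = new ArrayList<>();

    void consume() {
        p++;
    }

    int index() {
        return p;
    }

    /** Save the current position; returns a 1-based marker id. */
    int mark() {
        markers.add(p);
        return markers.size();
    }

    /** Restore the position saved by marker, dropping it and any later markers. */
    void rewind(int marker) {
        p = markers.get(marker - 1);
        while (markers.size() >= marker) {
            markers.remove(markers.size() - 1);
        }
    }
}
```

With two outstanding markers, rewind(first) jumps straight back to the older
position instead of being limited to the latest mark.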

December 12, 2006

* Temp lexers now end in .gl (T__.gl, for example)

* TreeParser suffix no longer generated for tree grammars

* Defined reset for lexer, parser, tree parser; rewinds the input stream also

December 10, 2006

* Made Grammar.abortNFAToDFAConversion() abort in middle of a DFA.

December 9, 2006

* fixed bug in OrderedHashSet.add().  It didn't track elements correctly.

December 6, 2006

* updated build.xml for future Ant compatibility, thanks to Matt Benson.

* various tests in TestRewriteTemplate and TestSyntacticPredicateEvaluation
  were using the old 'channel' vs. new '$channel' notation.
  TestInterpretedParsing didn't pick up an earlier change to CommonToken.
  Reported by Matt Benson.

* fixed platform dependent test failures in TestTemplates, supplied by Matt
  Benson.

November 29, 2006

* optimized semantic predicate evaluation so that p||!p yields true.
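
The simplification itself is easy to show on predicate text.  The sketch below
is only an illustration with hypothetical names; it works on strings with a
leading "!" for negation, which is an assumption of the example and not how
the tool represents predicates internally.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class PredicateOrSketch {
    /** ORs predicates together, collapsing to "true" when both p and !p appear. */
    static String or(List<String> preds) {
        Set<String> seen = new LinkedHashSet<>(preds); // also drops duplicates
        for (String p : seen) {
            String negated = p.startsWith("!") ? p.substring(1) : "!" + p;
            if (seen.contains(negated)) {
                return "true";
            }
        }
        return String.join("||", seen);
    }
}
```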

November 22, 2006

* fixed bug that prevented var = $rule.some_retval from working in anything
  but the first alternative of a rule or subrule.

* attribute names containing digits were not allowed; this is now fixed,
  allowing attributes like 'name1' but not '1name1'.
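
The rule reads like an ordinary identifier pattern.  A sketch, assuming
letters or underscore as the first character and letters, digits, or
underscore afterwards; the exact character set ANTLR accepts is not stated
here, and AttributeNameSketch is a hypothetical name.

```java
import java.util.regex.Pattern;

final class AttributeNameSketch {
    // Assumed pattern: digits are fine anywhere except the first position.
    private static final Pattern NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    static boolean isValid(String name) {
        return NAME.matcher(name).matches();
    }
}
```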

November 19, 2006

* Removed LeftRecursionMessage and apparatus because it seems that I check
  for left recursion upfront before analysis and everything gets specified as
  recursion cycles at this point.

November 16, 2006

* TokenRewriteStream.replace was not passing programName to next method.

November 15, 2006

* updated DOT files for DFA generation to make smaller circles.

* made epsilon edges italics in the NFA diagrams.

3.0b5 - November 15, 2006

The biggest thing is that your grammar file names must match the grammar name
inside (your generated class names will also be different) and we use
$channel=HIDDEN now instead of channel=99 inside lexer actions.
Should be compatible other than that.   Please look at complete list of
changes.

November 14, 2006

* Force token index to be -1 for CommonToken in case not set.

November 11, 2006

* getUniqueID for TreeAdaptor now uses identityHashCode instead of hashCode.
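
A short sketch of why the distinction matters.  The Node class here is
hypothetical and overrides hashCode() by content, so two separate nodes with
the same text share a hashCode; System.identityHashCode ignores the override
and is derived from the object itself.

```java
final class UniqueIdSketch {
    /** Hypothetical node type whose hashCode depends only on its text. */
    static final class Node {
        final String text;

        Node(String text) {
            this.text = text;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Node && ((Node) o).text.equals(text);
        }

        @Override
        public int hashCode() {
            return text.hashCode();
        }
    }

    /** Content-based id: identical for any two nodes with equal text. */
    static int contentId(Object node) {
        return node.hashCode();
    }

    /** Identity-based id: unaffected by an overridden hashCode(). */
    static int identityId(Object node) {
        return System.identityHashCode(node);
    }
}
```

Identity hash codes are not guaranteed to be distinct for every pair of
objects, so this is an id for display and debugging rather than a strict key.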

November 10, 2006

* No grammar nondeterminism warning now when wildcard '.' is final alt.
  Examples:

    a : A | B | . ;

    A : 'a'
      | .
      ;

    SL_COMMENT
        : '//' (options {greedy=false;} : .)* '\r'? '\n'
        ;

    SL_COMMENT2
        : '//' (options {greedy=false;} : 'x'|.)* '\r'? '\n'
        ;

November 8, 2006

* Syntactic predicates did not get hoisted properly upon non-LL(*)
  decision.  Other hoisting issues fixed.  Cleaned up code.

* Removed failsafe that checks to see if I'm spending too much time on a
  single DFA; I don't think we need it anymore.

November 3, 2006

* $text, $line, etc... were not working in assignments. Fixed and added
  test case.

* $label.text translated to label.getText in lexer even if label was on a char

November 2, 2006

* Added error if you don't specify what the AST type is; actions in tree
  grammar won't work without it.

  $ cat x.g
  tree grammar x;
  a : ID {String s = $ID.text;} ;

  ANTLR Parser Generator   Early Access Version 3.0b5 (??, 2006)  1989-2006
  error: x.g:0:0: (152) tree grammar x has no ASTLabelType option

November 1, 2006

* $text, $line, etc... were not working properly within lexer rule.

October 31, 2006

* Finally actions now execute before dynamic scopes are popped in the
  rule. Previously it was not possible to access the rule's scoped variables
  in a finally action.

October 29, 2006

* Altered ActionTranslator to emit errors on setting read-only attributes
  such as $start, $stop, $text in a rule. Also forbid setting any attributes
  in rules/tokens referenced by a label or name.
  Setting dynamic scopes' attributes and your own parameter attributes
  is legal.

October 27, 2006

* Altered how ANTLR figures out what decision is associated with which
  block of grammar.  Makes ANTLRWorks correctly find DFA for a block.

October 26, 2006

* Fixed bug where EOT transitions led to no NFA configs in a DFA state,
  yielding an error in DFA table generation.

* renamed action.g to ActionTranslator.g.
  The ActionTranslator class is now called ActionTranslatorLexer, as ANTLR
  generates this classname now. Fixed rest of codebase accordingly.

* added rules recognizing setting of scopes' attributes to ActionTranslator.g.
  The Objective C target needed access to the right-hand side of the assignment
  in order to generate correct code.

* changed ANTLRCore.sti to reflect the new mandatory templates to support the
  above, namely: scopeSetAttributeRef, returnSetAttributeRef and the
  ruleSetPropertyRef_* templates, with the exception of
  ruleSetPropertyRef_text.  We cannot set this attribute.

October 19, 2006

* Fixed 2 bugs in DFA conversion that caused exceptions.
  Altered functionality of getMinElement so it ignores elements<0.

October 18, 2006

* moved resetStateNumbersToBeContiguous() to after issuing of warnings;
  an internal error in that routine should make more sense as issues
  with decision will appear first.

* fixed cut/paste bug I introduced when I fixed the EOF in min/max
  bug. Prevented C grammar from working briefly.

October 17, 2006

* Removed a failsafe that seems to be unnecessary that ensured DFA didn't
  get too big.  It was resulting in some failures in code generation that
  led me on quite a strange debugging trip.

October 16, 2006

* Use channel=HIDDEN not channel=99 to put tokens on hidden channel.

October 12, 2006

* ANTLR now has a customizable message format for errors and warnings,
  to make it easier to fulfill requirements by IDEs and such.
  The format to be used can be specified via the '-message-format name'
  command line switch. The default for name is 'antlr', also available
  at the moment is 'gnu'. This is done via StringTemplate, for details
  on the requirements look in org/antlr/tool/templates/messages/formats/

* line numbers for lexers in combined grammars are now reported correctly.

September 29, 2006

* ANTLRReaderStream improperly checked for end of input.

September 28, 2006

* For ANTLRStringStream, LA(-1) was off by one...gave you LA(-2).

3.0b4 - August 24, 2006

* error when no rules in grammar.  doesn't crash now.

* Token is now an interface.

* remove dependence on non runtime classes in runtime package.

* filename and grammar name must be same Foo in Foo.g.  Generates FooParser,
  FooLexer, ...  Combined grammar Foo generates Foo$Lexer.g which generates
  FooLexer.java.  tree grammars generate FooTreeParser.java

August 24, 2006

* added C# target to lib, codegen, templates

August 11, 2006

* added tree arg to navigation methods in treeadaptor

August 07, 2006

* fixed bug related to (a|)+ on end of lexer rules.  crashed instead
  of warning.

* added warning that interpreter doesn't do synpreds yet

* allow different source of classloader:

    ClassLoader cl = Thread.currentThread().getContextClassLoader();
    if ( cl==null ) {
        cl = this.getClass().getClassLoader();
    }

July 26, 2006

* compressed DFA edge tables significantly.  All edge tables are
  unique. The transition table can reuse arrays.  Look like this now:

     public static readonly short[] DFA30_transition0 =
         new short[] { 46, 46, -1, 46, 46, -1, -1, -1, -1, -1, -1, -1,...};
     public static readonly short[] DFA30_transition1 =
         new short[] { 21 };
     public static readonly short[][] DFA30_transition = {
         DFA30_transition0,
         DFA30_transition0,
         DFA30_transition1,
         ...
     };
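
The sharing shown above can be sketched as a de-duplication pass over the
rows.  This is an illustration of the idea only, with hypothetical names
(EdgeTableSketch, shareRows); the tool does this while generating code, not at
runtime as here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class EdgeTableSketch {
    /** Returns a table in which rows with equal contents are one shared array. */
    static short[][] shareRows(short[][] rows) {
        Map<List<Short>, short[]> unique = new HashMap<>();
        short[][] out = new short[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            // Arrays do not hash by content, so build a list to use as the key.
            List<Short> key = new ArrayList<>();
            for (short s : rows[i]) {
                key.add(s);
            }
            short[] canonical = unique.get(key);
            if (canonical == null) {
                canonical = rows[i];
                unique.put(key, canonical);
            }
            out[i] = canonical;
        }
        return out;
    }
}
```

States with identical outgoing edges then point at the same array, which is
where the space saving comes from.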

* If you defined both a label like EQ and '=', sometimes the '=' was
  used instead of the EQ label.

* made headerFile template have same arg list as outputFile for consistency

* outputFile, lexer, genericParser, parser, treeParser templates
  reference cyclicDFAs attribute which was no longer used after I
  started the new table-based DFA.  I made cyclicDFADescriptors
  argument to outputFile and headerFile (only).  I think this is
  correct as only OO languages will want the DFA in the recognizer.
  At the top level, C and friends can use it.  Changed name to use
  cyclicDFAs again as it's a better name probably.  Removed parameter
  from the lexer, ...  For example, my parser template says this now:

    <cyclicDFAs:cyclicDFA()> <! dump tables for all DFA !>

* made all token ref token types go thru code gen's
  getTokenTypeAsTargetLabel()

* no more computing DFA transition tables for acyclic DFA.

July 25, 2006

* fixed a place where I was adding syn predicates into rewrite stuff.

* turned off invalid token index warning in AW support; had a problem.

* bad location event generated with -debug for synpreds in autobacktrack mode.

July 24, 2006

* changed runtime.DFA so that it treats all chars and token types as
  char (unsigned 16 bit int).  -1 then becomes '\uFFFF', or 65535.
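
The wraparound follows from Java's narrowing conversion to char.  A tiny
sketch, with a hypothetical helper name:

```java
final class UnsignedCharSketch {
    /** Narrow an int to Java's unsigned 16-bit char, keeping the low 16 bits. */
    static char toTableEntry(int value) {
        return (char) value;
    }
}
```

Here toTableEntry(-1) is '\uFFFF', and widening that back to int gives 65535.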

* changed MAX_STATE_TRANSITIONS_FOR_TABLE to be 65534 by default
  now. This means that all states can use a table to do transitions.

* was not making synpreds on (C)* type loops with backtrack=true

* was copying tree stuff and actions into synpreds with backtrack=true

* was making synpreds on even single alt rules / blocks with backtrack=true

3.0b3 - July 21, 2006

* ANTLR fails to analyze complex decisions much less frequently.  It
  turns out that the set of decisions for which ANTLR fails (times
  out) is the same set (so far) of non-LL(*) decisions.  Moreover, I'm
  able to detect this situation quickly and report rather than timing
  out. Errors look like:

  java.g:468:23: [fatal] rule concreteDimensions has non-LL(*)
    decision due to recursive rule invocations in alts 1,2.  Resolve
    by left-factoring or using syntactic predicates with fixed k
    lookahead or use backtrack=true option.

  This message only appears when k=*.

* Shortened no viable alt messages to not include decision
  description:

  [compilationUnit, declaration]: line 8:8 decision=<<67:1: declaration
  : ( ( fieldDeclaration )=> fieldDeclaration | ( methodDeclaration )=>
  methodDeclaration | ( constructorDeclaration )=>
  constructorDeclaration | ( classDeclaration )=> classDeclaration | (
  interfaceDeclaration )=> interfaceDeclaration | ( blockDeclaration )=>
  blockDeclaration | emptyDeclaration );>> state 3 (decision=14) no
  viable alt; token=[@1,184:187='java',<122>,8:8]

  too long and hard to read.
984*16467b97STreehugger Robot
985*16467b97STreehugger RobotJuly 19, 2006
986*16467b97STreehugger Robot
987*16467b97STreehugger Robot* Code gen bug: states with no emanating edges were ignored by ST.
988*16467b97STreehugger Robot  Now an empty list is used.
989*16467b97STreehugger Robot
990*16467b97STreehugger Robot* Added grammar parameter to recognizer templates so they can access
991*16467b97STreehugger Robot  properties like getName(), ...
992*16467b97STreehugger Robot
993*16467b97STreehugger RobotJuly 10, 2006
994*16467b97STreehugger Robot
995*16467b97STreehugger Robot* Fixed the gated pred merged state bug.  Added unit test.
996*16467b97STreehugger Robot
997*16467b97STreehugger Robot* added new method to Target: getTokenTypeAsTargetLabel()
998*16467b97STreehugger Robot
999*16467b97STreehugger RobotJuly 7, 2006
1000*16467b97STreehugger Robot
1001*16467b97STreehugger Robot* I was doing an AND instead of OR in the gated predicate stuff.
1002*16467b97STreehugger Robot  Thanks to Stephen Kou!
1003*16467b97STreehugger Robot
1004*16467b97STreehugger Robot* Reduce op for combining predicates was insanely slow sometimes and
1005*16467b97STreehugger Robot  didn't actually work well.  Now it's fast and works.
1006*16467b97STreehugger Robot
1007*16467b97STreehugger Robot* There is a bug in merging of DFA stop states related to gated
1008*16467b97STreehugger Robot  preds...turned it off for now.

3.0b2 - July 5, 2006

July 5, 2006

* token emission not properly protected in lexer filter mode.

* EOT, EOT DFA state transition tables should be init'd to -1 (only
  was doing this for compressed tables).  Fixed.

* in trace mode, exit method not shown for memoized rules

* added -Xmaxdfaedges to allow you to increase number of edges allowed
  for a single DFA state before it becomes "special" and can't fit in
  a simple table.

* Bug in tables.  Shorts are signed so min/max tables for DFA are now
  char[].  Bizarre.
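The signedness problem behind that last fix can be seen in a few lines of
Java.  This is only an illustrative sketch; SignedTableDemo and its methods
are hypothetical names and not part of ANTLR.

```java
// Hypothetical demo, not ANTLR code: why a char[] table can hold
// 16-bit values that a short[] table cannot.
final class SignedTableDemo {
    // Store a 16-bit value in a short and read it back.
    static int viaShort(int value) {
        short s = (short) value;   // wraps to a negative number above 0x7FFF
        return s;
    }

    // Store the same value in a char, Java's unsigned 16-bit type.
    static int viaChar(int value) {
        char c = (char) value;     // stays in the range 0..0xFFFF
        return c;
    }

    public static void main(String[] args) {
        int max = 0xFFFF;          // e.g. the upper bound of a DFA edge range
        System.out.println("short: " + viaShort(max));
        System.out.println("char:  " + viaChar(max));
    }
}
```

Small values round-trip through either type; only values above 0x7FFF
come back negative from a short.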

July 3, 2006

* Added a method to reset the tool error state for current thread.
  See ErrorManager.java

* [Got this working properly today] backtrack mode that lets you type
  in any old crap and ANTLR will backtrack if it can't figure out what
  you meant.  No errors are reported by antlr during analysis.  It
  implicitly adds a syn pred in front of every production, using them
  only if static grammar LL(*) analysis fails.  Syn pred code is not
  generated if the pred is not used in a decision.

  This is essentially a rapid prototyping mode.

* Added backtracking report to the -report option

* Added NFA->DFA conversion early termination report to the -report option

* Added grammar level k and backtrack options to -report

* Added a dozen unit tests to test autobacktrack NFA construction.

* If you are using filter mode, you must manually use option
  memoize=true now.

July 2, 2006

* Added k=* option so you can set k=2, for example, on whole grammar,
  but an individual decision can be LL(*).

* memoize option for grammars, rules, blocks.  Remove -nomemo cmd-line option

* bug in DOT generator for DFA; fixed.

* runtime.DFA reported errors even when backtracking

July 1, 2006

* Added -X option list to help

* Syn preds were being hoisted into other rules, causing lots of extra
  backtracking.

June 29, 2006

* unnecessary files removed during build.

* Matt Benson updated build.xml

* Detecting use of synpreds in analysis now instead of codegen.  In
  this way, I can avoid analyzing decisions in synpreds for synpreds
  not used in a DFA for a real rule.  This is used to optimize things
  for backtrack option.

* Code gen must add _fragment or whatever to end of pred name in
  template synpredRule to avoid having ANTLR know anything about
  method names.

* Added -IdbgST option to emit ST delimiters at start/stop of all
  templates spit out.

June 28, 2006

* Tweaked message when ANTLR cannot handle analysis.

3.0b1 - June 27, 2006

June 24, 2006

* syn preds no longer generate little static classes; they also don't
  generate a whole bunch of extra crap in the rules built to test syn
  preds.  Removed GrammarFragmentPointer class from runtime.

June 23-24, 2006

* added output option to -report output.

* added profiling info:
  Number of rule invocations in "guessing" mode
  number of rule memoization cache hits
  number of rule memoization cache misses
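As a rough picture of what the hit and miss counters measure, here is a
minimal sketch of a rule memo table.  It is not the ANTLR profiler; the
class, the FAILED marker and the key layout are all assumptions made for
illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the ANTLR runtime: a rule memo table that
// counts cache hits and misses.
final class MemoProfileDemo {
    static final int FAILED = -2;   // assumed marker: rule failed at this start index
    final Map<Long, Integer> memo = new HashMap<>();
    int hits;
    int misses;

    // The key combines a rule index with the input position where the rule started.
    static long key(int ruleIndex, int startIndex) {
        return ((long) ruleIndex << 32) | (startIndex & 0xFFFFFFFFL);
    }

    // Returns the memoized stop index (or FAILED), or null if this rule
    // has not been tried at this position yet.
    Integer lookup(int ruleIndex, int startIndex) {
        Integer stop = memo.get(key(ruleIndex, startIndex));
        if (stop == null) misses++; else hits++;
        return stop;
    }

    void record(int ruleIndex, int startIndex, int stopIndexOrFailed) {
        memo.put(key(ruleIndex, startIndex), stopIndexOrFailed);
    }

    public static void main(String[] args) {
        MemoProfileDemo m = new MemoProfileDemo();
        m.lookup(3, 10);       // miss: first attempt at rule 3, token 10
        m.record(3, 10, 17);   // rule 3 matched tokens 10..17
        m.lookup(3, 10);       // hit: a parser could jump straight past token 17
        System.out.println("hits=" + m.hits + " misses=" + m.misses);
    }
}
```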

* made DFA DOT diagrams go left to right not top to bottom

* I try to handle recursive overflow states now by resolving these states
  with semantic/syntactic predicates if they exist.  The DFA is then
  deterministic rather than simply resolving by choosing first
  nondeterministic alt.  I used to generate errors:

~/tmp $ java org.antlr.Tool -dfa t.g
ANTLR Parser Generator   Early Access Version 3.0b2 (July 5, 2006)  1989-2006
t.g:2:5: Alternative 1: after matching input such as A A A A A decision cannot predict what comes next due to recursion overflow to b from b
t.g:2:5: Alternative 2: after matching input such as A A A A A decision cannot predict what comes next due to recursion overflow to b from b

  Now, I use predicates if available and emit no warnings.

* made sem preds share accept states.  Previously, multiple preds in a
  decision forked new accepts each time for each nondet state.

June 19, 2006

* Need parens around the prediction expressions in templates.

* Referencing $ID.text in an action forced bad code gen in lexer rule ID.

* Fixed a bug in how predicates are collected.  The definition of
  "last predicated alternative" was incorrect in the analysis.  Further,
  gated predicates incorrectly missed a case where an edge should become
  true (a tautology).

* Removed an unnecessary input.consume() reference in the runtime/DFA class.

June 14, 2006

* -> ($rulelabel)? didn't generate proper code for ASTs.

* bug in code gen (did not compile)
  a : ID -> ID
    | ID -> ID
    ;
  Problem is repeated ref to ID from left side.  Juergen pointed this out.

* use of tokenVocab with missing file yielded exception

* (A|B)=> foo yielded an exception as (A|B) is a set not a block. Fixed.

* Didn't set ID1= and INT1= for this alt:
  | ^(ID INT+ {System.out.print("^("+$ID+" "+$INT+")");})

* Fixed so repeated dangling state errors only occur once like:
t.g:4:17: the decision cannot distinguish between alternative(s) 2,1 for at least one input sequence

* tracking of rule elements was on (making list defs at start of
  method) with templates instead of just with ASTs.  Turned off.

* Doesn't crash when you give it a missing file now.

* -report: add output info: how many LL(1) decisions.

June 13, 2006

* ^(ROOT ID?) didn't work; nor did any other nullable child list such as
  ^(ROOT ID* INT?).  Now, I check to see if child list is nullable using
  Grammar.LOOK() and, if so, I generate an "IF lookahead is DOWN" gate
  around the child list so the whole thing is optional.

* Fixed a bug in LOOK that made it not look through nullable rules.

* Using AST suffixes or -> rewrite syntax now gives an error w/o a grammar
  output option.  Used to crash ;)

* References to EOF ended up with improper -1 refs instead of EOF in output.

* didn't warn of ambig ref to $expr in rewrite; fixed.
list
     :	'[' expr 'for' type ID 'in' expr ']'
	-> comprehension(expr={$expr.st},type={},list={},i={})
	;

June 12, 2006

* EOF works in the parser as a token name.

* Rule b:(A B?)*; didn't display properly in AW due to the way ANTLR
  generated NFA.

* "scope x;" in a rule for unknown x gives no error.  Fixed.  Added unit test.

* Label type for refs to start/stop in tree parser and other parsers were
  not used.  Lots of casting.  Ick. Fixed.

* couldn't refer to $tokenlabel in isolation; but need so we can test if
  something was matched.  Fixed.

* Lots of little bugs fixed in $x.y, %... translation due to new
  action translator.

* Improperly tracking block nesting level; result was that you couldn't
  see $ID in action of rule "a : A+ | ID {Token t = $ID;} | C ;"

* a : ID ID {$ID.text;} ; did not get a warning about ambiguous $ID ref.

* No error was found on $COMMENT.text:

COMMENT
    :   '/*' (options {greedy=false;} : . )* '*/'
        {System.out.println("found method "+$COMMENT.text);}
    ;

  $enclosinglexerrule scope does not exist.  Use text or setText() here.

June 11, 2006

* Single return values are initialized now to default or to your spec.

* cleaned up input stream stuff.  Added ANTLRReaderStream, ANTLRInputStream
  and refactored.  You can now specify encodings on ANTLRFileStream (and
  ANTLRInputStream).

* You can set text local var now in a lexer rule and token gets that text.
  start/stop indexes are still set for the token.

* Changed lexer slightly.  Calling a nonfragment rule from a
  nonfragment rule does not set the overall token.

June 10, 2006

* Fixed bug where unnecessary escapes yield char==0 like '\{'.

* Fixed analysis bug.  This grammar didn't report a recursion warning:
x   : y X
    | y Y
    ;
y   : L y R
    | B
    ;
  The DFAState.equals() method was messed up.

* Added @synpredgate {...} action so you can tell ANTLR how to gate actions
  in/out during syntactic predicate evaluation.

* Fuzzy parsing should be more efficient.  It should backtrack over a rule
  and then rewind and do it again "with feeling" to exec actions.  It was
  actually doing it 3x not 2x.
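The intended two-pass behaviour (one guessing pass, then one real pass
with actions enabled) can be sketched as follows.  This is a hand-written
illustration, not generated ANTLR code; TwoPassDemo, its fields and the
toy "number" rule are hypothetical.

```java
// Hypothetical sketch: guess first, then rewind and rerun with actions on.
final class TwoPassDemo {
    final String input;
    int pos = 0;
    int backtracking = 0;          // > 0 means "guessing": actions must not run
    int ruleInvocations = 0;       // lets us check the rule ran 2x, not 3x
    final StringBuilder actions = new StringBuilder();

    TwoPassDemo(String input) { this.input = input; }

    // Toy rule: one or more digits.  The action records the matched text.
    boolean number() {
        ruleInvocations++;
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
        if (pos == start) return false;
        if (backtracking == 0) actions.append(input, start, pos);  // gated action
        return true;
    }

    // Try the rule speculatively, rewind, and run it for real only if
    // the guess succeeded.
    boolean tryNumber() {
        int marker = pos;          // mark
        backtracking++;
        boolean ok = number();     // pass 1: guessing, no actions
        backtracking--;
        pos = marker;              // rewind
        return ok && number();     // pass 2: "with feeling"
    }

    public static void main(String[] args) {
        TwoPassDemo p = new TwoPassDemo("42x");
        System.out.println(p.tryNumber() + " actions=" + p.actions
                + " invocations=" + p.ruleInvocations);
    }
}
```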

June 9, 2006

* Gutted and rebuilt the action translator for $x.y, $x::y, ...
  Uses ANTLR v3 now for the first time inside v3 source. :)
  ActionTranslator.java

* Fixed a bug where referencing a return value on a rule didn't work
  because later a ref to that rule's predefined properties didn't
  properly force a return value struct to be built.  Added unit test.

June 6, 2006

* New DFA mechanisms.  Cyclic DFA are implemented as state tables,
  encoded via strings as Java cannot handle large static arrays :(
  States with edges emanating that have predicates are specially
  treated.  A method is generated to do these states.  The DFA
  simulation routine uses the "special" array to figure out if the
  state is special.  See March 25, 2006 entry for description:
  http://www.antlr.org/blog/antlr3/codegen.tml.  analysis.DFA now has
  all the state tables generated for code gen.  CyclicCodeGenerator.java
  disappeared as it's unneeded code. :)
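To make the string-encoded tables concrete, here is a minimal sketch of a
table-driven DFA whose rows are unpacked from strings at class-load time.
The (count, value) run-length encoding and all names are assumptions for
illustration; "special" predicate states are left out entirely, and this
is not the code ANTLR generates.

```java
// Hypothetical sketch, not the ANTLR runtime: DFA tables packed into strings.
final class PackedDfaDemo {
    // Assumed encoding: pairs of chars (count, value); 0xFFFF decodes to -1.
    static short[] unpack(String encoded) {
        int size = 0;
        for (int i = 0; i < encoded.length(); i += 2) size += encoded.charAt(i);
        short[] data = new short[size];
        int di = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            char count = encoded.charAt(i);
            char value = encoded.charAt(i + 1);
            for (int j = 0; j < count; j++) data[di++] = (short) value;
        }
        return data;
    }

    static final char MIN = 'a';                 // smallest input symbol
    static final short[][] TRANSITION = {
        unpack("\u0001\u0000\u0001\u0001"),      // state 0: a -> 0, b -> 1
        unpack("\u0002\uffff"),                  // state 1: no edges
    };
    // accept[s] is the predicted alternative, or -1 if s is not an accept state.
    static final short[] ACCEPT = unpack("\u0001\uffff\u0001\u0001");

    // Walk the tables; returns the predicted alternative or -1 if no alt is viable.
    static int predict(String input) {
        int s = 0;
        for (int i = 0; i < input.length() && ACCEPT[s] < 1; i++) {
            int c = input.charAt(i) - MIN;
            if (c < 0 || c >= TRANSITION[s].length) return -1;
            s = TRANSITION[s][c];
            if (s < 0) return -1;
        }
        return ACCEPT[s] >= 1 ? ACCEPT[s] : -1;
    }

    public static void main(String[] args) {
        System.out.println(predict("aab"));
        System.out.println(predict("aa"));
    }
}
```

Packing the rows into string literals keeps the tables in the class
file's constant pool, so the static initializer does not have to fill
large arrays one element at a time.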

* Internal general clean up of the DFA.states vs uniqueStates thing.
  Fixed lookahead decisions no longer fill uniqueStates.  Waste of
  time.  Also noted that when adding sem pred edges, I didn't check
  for state reuse.  Fixed.

June 4, 2006

* When resolving ambig DFA states with predicates, I did not add the new
  states to the list of unique DFA states.  No observable effect on output
  except that DFA state numbers were not always contiguous for predicated
  decisions.  I needed this fix for new DFA tables.

3.0ea10 - June 2, 2006

June 2, 2006

* Improved grammar stats and added syntactic pred tracking.

June 1, 2006

* Due to a type mismatch, the DebugParser.recoverFromMismatchedToken()
  method was not called.  Debug events for mismatched token error
  notification were not sent to ANTLRWorks probably

* Added getBacktrackingLevel() for any recognizer; needed for profiler.

* Only writes profiling data for antlr grammar analysis with -profile set

* Major update and bug fix to (runtime) Profiler.

May 27, 2006

* Added Lexer.skip() to force lexer to ignore current token and look for
  another; no token is created for the current rule and none is passed on
  to the parser (or other consumer of the lexer).
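The effect of skip() on the token loop can be pictured with a small
hand-written lexer.  This is a sketch under assumptions, not the ANTLR
Lexer class: SkipLexerDemo and its whitespace/word rules are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: skip() makes nextToken() drop the current match
// and keep scanning instead of handing a token to the consumer.
final class SkipLexerDemo {
    private final String input;
    private int pos = 0;
    private boolean skipped;

    SkipLexerDemo(String input) { this.input = input; }

    void skip() { skipped = true; }

    // Returns the text of the next non-skipped token, or null at end of input.
    String nextToken() {
        while (pos < input.length()) {
            skipped = false;
            int start = pos;
            if (Character.isWhitespace(input.charAt(pos))) {
                while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
                skip();            // whitespace rule: no token is created
            } else {
                while (pos < input.length() && !Character.isWhitespace(input.charAt(pos))) pos++;
            }
            if (!skipped) return input.substring(start, pos);
        }
        return null;
    }

    // Collects every token the consumer would actually see.
    static List<String> tokens(String text) {
        SkipLexerDemo lexer = new SkipLexerDemo(text);
        List<String> out = new ArrayList<>();
        for (String t = lexer.nextToken(); t != null; t = lexer.nextToken()) out.add(t);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a  bc d "));
    }
}
```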

* Parsers are much faster now.  I removed use of java.util.Stack for pushing
  follow sets and use a hardcoded array stack instead.  Dropped from
  5900ms to 3900ms for parse+lex time parsing entire java 1.4.2 source.  Lex
  time alone was about 1500ms.  Just looking at parse time, we get about 2x
  speed improvement. :)
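A minimal sketch of such an array-backed stack is shown below.  It is an
illustration only: FollowStackDemo is a hypothetical name, and a long[]
stands in for whatever bit set type the runtime really pushes.

```java
import java.util.Arrays;

// Hypothetical sketch: a growable array stack used in place of
// java.util.Stack (which is synchronized and Vector-based).
final class FollowStackDemo {
    private long[][] following = new long[4][];  // each entry stands in for one follow set
    private int sp = -1;                         // index of the top entry, -1 when empty

    void push(long[] followSet) {
        if (sp + 1 >= following.length) {
            // Double the backing array when it is full.
            following = Arrays.copyOf(following, following.length * 2);
        }
        following[++sp] = followSet;
    }

    long[] pop() {
        long[] top = following[sp];
        following[sp--] = null;                  // drop the reference so it can be collected
        return top;
    }

    int depth() { return sp + 1; }

    public static void main(String[] args) {
        FollowStackDemo stack = new FollowStackDemo();
        for (int i = 0; i < 10; i++) stack.push(new long[] { i });
        System.out.println("depth=" + stack.depth() + " top=" + stack.pop()[0]);
    }
}
```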

May 26, 2006

* Fixed NFA construction so it generates NFA for (A*)* such that ANTLRWorks
  can display it properly.

May 25, 2006

* added abort method to Grammar so AW can terminate the conversion if it's
  taking too long.

May 24, 2006

* added method to get left recursive rules from grammar without doing full
  grammar analysis.

* analysis, code gen not attempted if serious error (like
  left-recursion or missing rule definition) occurred while reading
  the grammar in and defining symbols.

* added amazing optimization; reduces analysis time by 90% for java
  grammar; simple IF statement addition!

3.0ea9 - May 20, 2006

* added global k value for grammar to limit lookahead for all decisions unless
  overridden in a particular decision.

* added failsafe so that any decision taking longer than 2 seconds to create
  the DFA will fall back on k=1.  Use -ImaxtimeforDFA n (in ms) to set the time.

* added an option (turned off for now) to use multiple threads to
  perform grammar analysis.  Not much help on a 2-CPU computer as
  garbage collection seems to peg the 2nd CPU already. :( Gotta wait for
  a 4 CPU box ;)

* switched from #src to // $ANTLR src directive.

* CommonTokenStream.getTokens() looked past end of buffer sometimes. fixed.

* unicode literals didn't really work in DOT output and generated code. fixed.

* fixed the unit test rig so it compiles nicely with Java 1.5

* Added ant build.xml file (reads build.properties file)

* predicates sometimes failed to compile/eval properly due to missing (...)
  in IF expressions.  Forced (..)

* (...)? with only one alt were not optimized.  Was:

        // t.g:4:7: ( B )?
        int alt1=2;
        int LA1_0 = input.LA(1);
        if ( LA1_0==B ) {
            alt1=1;
        }
        else if ( LA1_0==-1 ) {
            alt1=2;
        }
        else {
            NoViableAltException nvae =
                new NoViableAltException("4:7: ( B )?", 1, 0, input);
            throw nvae;
        }

is now:

        // t.g:4:7: ( B )?
        int alt1=2;
        int LA1_0 = input.LA(1);
        if ( LA1_0==B ) {
            alt1=1;
        }

  Smaller, faster and more readable.

* Allow manual init of return values now:
  functionHeader returns [int x=3*4, char (*f)()=null] : ... ;

* Added optimization for DFAs that fixed a codegen bug with rules in lexer:
   EQ			 : '=' ;
   ASSIGNOP		 : '=' | '+=' ;
  EQ is a subset of other rule.  It did not give an error, which is
  correct, but generated bad code.

* ANTLR was sending column not char position to ANTLRWorks.

* Bug fix: location 0, 0 emitted for synpreds and empty alts.

* debugging event handshake now sends grammar file name.  Added
  getGrammarFileName() to recognizers.  Java.stg generates it:

    public String getGrammarFileName() { return "<fileName>"; }

* tree parsers can do arbitrary lookahead now including backtracking.  I
  updated CommonTreeNodeStream.

* added events for debugging tree parsers:

	/** Input for a tree parser is an AST, but we know nothing for sure
	 *  about a node except its type and text (obtained from the adaptor).
	 *  This is the analog of the consumeToken method.  Again, the ID is
	 *  the hashCode usually of the node so it only works if hashCode is
	 *  not implemented.
	 */
	public void consumeNode(int ID, String text, int type);

	/** The tree parser looked ahead */
	public void LT(int i, int ID, String text, int type);

	/** The tree parser has popped back up from the child list to the
	 *  root node.
	 */
	public void goUp();

	/** The tree parser has descended to the first child of the current
	 *  root node.
	 */
	public void goDown();

* Added DebugTreeNodeStream and DebugTreeParser classes

* Added ctor because the debug tree node stream will need to ask
  questions about nodes and since nodes are just Object, it needs an
  adaptor to decode the nodes and get text/type info for the debugger.

	public CommonTreeNodeStream(TreeAdaptor adaptor, Tree tree);

* added getter to TreeNodeStream:
	public TreeAdaptor getTreeAdaptor();

* Implemented getText/getType in CommonTreeAdaptor.

* Added TraceDebugEventListener that can dump all events to stdout.

* I broke down and made Tree implement getText

* tree rewrites now gen location debug events.

* added AST debug events to listener; added blank listener for convenience

* updated debug events to send begin/end backtrack events for debugging

* with a : (b->b) ('+' b -> ^(PLUS $a b))* ; you get b[0] each time as
  there is no loop in rewrite rule itself.  Need to know context that
  the -> is inside the rule and hence b means last value of b not all
  values.

* Bug in TokenRewriteStream; ops at indexes < start index blocked proper op.

* Actions in ST rewrites "-> ({$op})()" were not translated

* Added new action name:

@rulecatch {
catch (RecognitionException re) {
    reportError(re);
    recover(input,re);
}
catch (Throwable t) {
    System.err.println(t);
}
}
Overrides rule catch stuff.

* Isolated $ refs caused exception

3.0ea8 - March 11, 2006

* added @finally {...} action like @init for rules.  Executes in
  finally block (java target) after all other stuff like rule memoization.
  No code changes needed; ST just refs a new action:
      <ruleDescriptor.actions.finally>

* hideous bug fixed: PLUS='+' didn't result in '+' rule in lexer

* TokenRewriteStream didn't do toString() right when no rewrites had been done.

* lexer errors in interpreter were not printed properly

* bitsets are dumped in hex not decimal now for FOLLOW sets

* /* epsilon */ is not printed now when printing out grammars with empty alts

* Fixed another bug in tree rewrite stuff where it was checking that elements
  had at least one element.  Strange...commented out for now to see if I can
  remember what's up.

* Tree rewrites had problems when you didn't have x+=FOO variables.  Rules
  like this work now:

  a : (x=ID)? y=ID -> ($x $y)?;

* filter=true for lexers turns on k=1 and backtracking for every token
  alternative.  Put the rules in priority order.

* added getLine() etc... to Tree to support better error reporting for
  trees.  Added MismatchedTreeNodeException.
1511*16467b97STreehugger Robot
* $templates::foo() is gone.  added % as special template symbol.
  %foo(a={},b={},...) ctor (even shorter than $templates::foo(...))
  %({name-expr})(a={},...) indirect template ctor reference

  The above are parsed by antlr.g and translated by codegen.g
  The following are parsed manually here:

  %{string-expr} anonymous template from string expr
  %{expr}.y = z; set template attribute y of StringTemplate-typed expr to z
  %x.y = z; set template attribute y of x (always set never get attr)
            to z [languages like python without ';' must still use the
            ';' which the code generator is free to remove during code gen]

* -> ({expr})(a={},...) notation for indirect template rewrite.
  expr is the name of the template.

* $x[i]::y and $x[-i]::y notation for accessing absolute scope stack
  indexes and relative negative scopes.  $x[-1]::y is the y attribute
  of the previous scope (stack top - 1).

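  A minimal sketch of how the two index forms could resolve against a scope
  stack.  ScopeStackDemo and scopeAt are hypothetical names; the code that
  ANTLR generates for $x[i]::y is not shown in these notes and may differ.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of $x[i] and $x[-i] lookups on a scope stack.
final class ScopeStackDemo {
    /**
     * Non-negative index: absolute position, 0 is the bottom of the stack.
     * Negative index: relative to the top, so -1 is (top - 1).
     */
    static <T> T scopeAt(List<T> stack, int index) {
        int top = stack.size() - 1;
        int resolved = index >= 0 ? index : top + index;
        if (resolved < 0 || resolved > top) {
            throw new IndexOutOfBoundsException("no scope at index " + index);
        }
        return stack.get(resolved);
    }

    public static void main(String[] args) {
        List<String> stack = Arrays.asList("outer", "middle", "inner");
        System.out.println(scopeAt(stack, 0));  // absolute: bottom of the stack
        System.out.println(scopeAt(stack, -1)); // relative: previous scope
    }
}
```
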
* filter=true mode for lexers; can do this now...upon mismatch, just
  consumes a char and tries again:

  lexer grammar FuzzyJava;
  options {filter=true;}

  FIELD
      :   TYPE WS? name=ID WS? (';'|'=')
          {System.out.println("found var "+$name.text);}
      ;

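  The "consume a char and try again" loop can be sketched in plain Java with
  java.util.regex.  This is only an analogy: the FIELD pattern below is a
  simplified stand-in (TYPE is assumed to be int, float or String), and
  FuzzyScanDemo is a hypothetical name, not generated lexer code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of filter-style scanning; not ANTLR's lexer.
final class FuzzyScanDemo {
    // Simplified stand-in for: TYPE WS? name=ID WS? (';'|'=')
    private static final Pattern FIELD =
        Pattern.compile("(?:int|float|String)\\s+([A-Za-z_]\\w*)\\s*[;=]");

    /** Collect the names of everything that looks like a field declaration. */
    static List<String> findFields(String input) {
        List<String> names = new ArrayList<>();
        Matcher m = FIELD.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (m.lookingAt()) {
                names.add(m.group(1)); // matched a token at this position
                pos = m.end();
            } else {
                pos++;                 // mismatch: consume one char and retry
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : findFields("class A { int count = 0; String name; }")) {
            System.out.println("found var " + name);
        }
    }
}
```
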
* refactored char streams so ANTLRFileStream is now a subclass of
  ANTLRStringStream.

* char streams for lexer now allow nested backtracking in lexer.

* added TokenLabelType for lexer/parser for all token labels

* line numbers for error messages were not updated properly in antlr.g
  for strings, char literals and <<...>>

* init action in lexer rules was before the type,start,line,... decls.

* Tree grammars can now specify output; I've only tested output=template
  though.

* You can reference EOF now in the parser and lexer.  It's just token type
  or char value -1.

* Bug fix: $ID refs in the *lexer* were all messed up.  Cleaned up the
  set of properties available...

* Bug fix: .st not found in rule ref when rule has scope:

  field
  scope {
      StringTemplate funcDef;
  }
      :   ...
      {$field::funcDef = $field.st;}
      ;

  it gets field_stack.st instead

* return in backtracking must return retval or null if return value.

* $property within a rule now works like $text, $st, ...

* AST/Template Rewrites were not gated by backtracking==0 so they
  executed even when guessing.  Auto AST construction is now gated also.

* CommonTokenStream was somehow returning tokens not text in toString()

* added useful methods to runtime.BitSet and also to CommonToken so you can
  update the text.  Added nice Token stream method:

  /** Given a start and stop index, return a List of all tokens in
   *  the token type BitSet.  Return null if no tokens were found.  This
   *  method looks at both on and off channel tokens.
   */
  public List getTokens(int start, int stop, BitSet types);

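  A self-contained sketch of the contract described in that comment.  It
  uses java.util.BitSet and a hypothetical Tok class in place of the runtime's
  own BitSet and token types, so it shows the behavior, not the runtime's
  implementation.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for the token stream method described above.
final class TokenFilterDemo {
    /** Minimal token: a type, a channel, and its text. */
    static final class Tok {
        final int type;
        final int channel;
        final String text;
        Tok(int type, int channel, String text) {
            this.type = type;
            this.channel = channel;
            this.text = text;
        }
    }

    /** Tokens in [start, stop] whose type is in types; null if none match. */
    static List<Tok> getTokens(List<Tok> tokens, int start, int stop, BitSet types) {
        List<Tok> found = new ArrayList<>();
        int last = Math.min(stop, tokens.size() - 1);
        for (int i = Math.max(start, 0); i <= last; i++) {
            Tok t = tokens.get(i);
            if (types.get(t.type)) {
                found.add(t); // channel is ignored on purpose: on and off channel
            }
        }
        return found.isEmpty() ? null : found;
    }

    public static void main(String[] args) {
        List<Tok> toks = new ArrayList<>();
        toks.add(new Tok(4, 0, "x"));
        toks.add(new Tok(5, 99, " ")); // off-channel whitespace
        toks.add(new Tok(4, 0, "y"));
        BitSet ids = new BitSet();
        ids.set(4);
        System.out.println(getTokens(toks, 0, 2, ids).size());
    }
}
```
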
* literals are now passed in the .tokens files so you can ref them in
  tree parses, for example.

* added basic exception handling; no labels, just general catches:

a : {;}A | B ;
        exception
                catch[RecognitionException re] {
                        System.out.println("recog error");
                }
                catch[Exception e] {
                        System.out.println("error");
                }

* Added method to TokenStream:
  public String toString(Token start, Token stop);

* antlr generates #src lines in lexer grammars generated from combined grammars
  so error messages refer to original file.

* lexers generated from combined grammars now use original formatting.

* predicates have $x.y stuff translated now.  Warning: predicates might be
  hoisted out of context.

* return values in return val structs are now public.

* output=template with return values on rules was broken.  I assume return
  values with ASTs were broken too.  Fixed.

3.0ea7 - December 14, 2005

* Added -print option to print out grammar w/o actions

* Renamed BaseParser to be BaseRecognizer and even made Lexer derive from
  this; nice as it now shares backtracking support code.

* Added syntactic predicates (...)=>.  See December 4, 2005 entry:

  http://www.antlr.org/blog/antlr3/lookahead.tml

  Note that we have a new option for turning off rule memoization during
  backtracking:

  -nomemo        when backtracking don't generate memoization code

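  To show what rule memoization buys during backtracking, here is a toy
  sketch.  MemoDemo, its constants and the digits "rule" are all hypothetical;
  the memoization code ANTLR actually generates is not shown in these notes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of memoizing rule results by (rule, start index).
final class MemoDemo {
    static final int UNKNOWN = -1; // nothing recorded for this rule/position
    static final int FAILED = -2;  // rule was tried here and did not match

    private final Map<Long, Integer> memo = new HashMap<>();
    int invocations = 0;           // counts real (non-cached) rule executions

    private static long key(int rule, int start) {
        return ((long) rule << 32) | (start & 0xFFFFFFFFL);
    }

    int alreadyParsed(int rule, int start) {
        Integer stop = memo.get(key(rule, start));
        return stop == null ? UNKNOWN : stop;
    }

    void memoize(int rule, int start, int stopOrFailed) {
        memo.put(key(rule, start), stopOrFailed);
    }

    /** Toy rule 0: one or more digits.  Returns the stop index or FAILED. */
    int digits(String input, int start) {
        int cached = alreadyParsed(0, start);
        if (cached != UNKNOWN) return cached; // skip re-parsing while guessing
        invocations++;
        int i = start;
        while (i < input.length() && Character.isDigit(input.charAt(i))) i++;
        int result = i > start ? i : FAILED;
        memoize(0, start, result);
        return result;
    }

    public static void main(String[] args) {
        MemoDemo p = new MemoDemo();
        p.digits("123a", 0);
        p.digits("123a", 0); // second attempt is answered from the memo table
        System.out.println(p.invocations);
    }
}
```

  Turning memoization off trades this table's memory for repeated work
  whenever the same rule is retried at the same input position.
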
* Predicates are now tested in order that you specify the alts.  If you
  leave the last alt "naked" (w/o pred), it will assume a true pred rather
  than union of other preds.

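  A small sketch of that ordering rule, assuming each alternative carries an
  optional predicate and a null entry stands for a "naked" alt.  The
  OrderedPredicatesDemo class is hypothetical and ignores syntax entirely; it
  only models the order in which predicates are consulted.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical model of testing predicates in alternative order.
final class OrderedPredicatesDemo {
    /**
     * Return the 1-based number of the first alt whose predicate holds.
     * A null predicate (naked alt) is treated as true.  Returns -1 if no
     * alternative is viable.
     */
    static int predict(List<BooleanSupplier> preds) {
        for (int i = 0; i < preds.size(); i++) {
            BooleanSupplier p = preds.get(i);
            if (p == null || p.getAsBoolean()) {
                return i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<BooleanSupplier> alts =
            Arrays.<BooleanSupplier>asList(() -> false, () -> false, null);
        System.out.println(predict(alts)); // falls through to the naked alt
    }
}
```
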
* Added gated predicates "{p}?=>" that literally turn off a production whereas
  disambiguating predicates are only hoisted into the predictor when syntax alone
  is not sufficient to uniquely predict alternatives.

  A : {p}?  => "a" ;
  B : {!p}? => ("a"|"b")+ ;

* bug fixed related to predicates in predictor

  lexer grammar w;
  A : {p}? "a" ;
  B : {!p}? ("a"|"b")+ ;

  DFA is correct.  A state splits for input "a" on the pred.
  Generated code though was hosed.  No pred tests in prediction code!
  I added testLexerPreds() and others in TestSemanticPredicateEvaluation.java

* added execAction template in case we want to do something in front of
  each action execution or something.

* left-recursive cycles from rules w/o decisions were not detected.

* undefined lexer rules were not announced! fixed.

* unreachable messages for Tokens rule now indicate rule name not alt. E.g.,

  Ruby.lexer.g:24:1: The following token definitions are unreachable: IVAR

* nondeterminism warnings improved for Tokens rule:

  Ruby.lexer.g:10:1: Multiple token rules can match input such as ""0".."9"": INT, FLOAT
  As a result, tokens(s) FLOAT were disabled for that input

* DOT diagrams didn't show escaped char properly.

* Char/string literals are now all 'abc' not "abc".

* action syntax changed "@scope::actionname {action}" where scope defaults
  to "parser" if parser grammar or combined grammar, "lexer" if lexer grammar,
  and "treeparser" if tree grammar.  The code generation targets decide
  what scopes are available.  Each "scope" yields a hashtable for use in
  the output templates.  The scopes full of actions are sent to all output
  file templates (currently headerFile and outputFile) as attribute actions.
  Then you can reference <actions.scope> to get the map of actions associated
  with scope and <actions.parser.header> to get the parser's header action
  for example.  This should be very flexible.  The target should only have
  to define which scopes are valid, but the action names should be variable
  so we don't have to recompile ANTLR to add actions to code gen templates.

  grammar T;
  options {language=Java;}
  @header { package foo; }
  @parser::stuff { int i; } // names within scope not checked; target dependent
  @members { int i; }
  @lexer::header {head}
  @lexer::members { int j; }
  @headerfile::blort {...} // error: this target doesn't have headerfile
  @treeparser::members {...} // error: this is not a tree parser
  a
  @init {int i;}
    : ID
    ;
  ID : 'a'..'z';

  For now, the Java target uses members and header as valid names.  Within a
  rule, the init action name is valid.

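  The "hashtable per scope" idea can be sketched as a map of maps.
  ActionScopesDemo and its method names are hypothetical; the sketch only
  models the defaulting rule and the <actions.scope.name> style lookup
  described above, not ANTLR's internal data structures.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "@scope::actionname {action}" storage.
final class ActionScopesDemo {
    private final Map<String, Map<String, String>> actions = new HashMap<>();
    private final String defaultScope;

    /** grammarType is assumed to be "lexer", "tree", "parser" or "combined". */
    ActionScopesDemo(String grammarType) {
        this.defaultScope = defaultScopeFor(grammarType);
    }

    static String defaultScopeFor(String grammarType) {
        if ("lexer".equals(grammarType)) return "lexer";
        if ("tree".equals(grammarType)) return "treeparser";
        return "parser"; // parser grammar or combined grammar
    }

    /** scope == null models an action written without a scope, e.g. @header. */
    void define(String scope, String name, String code) {
        String s = scope == null ? defaultScope : scope;
        actions.computeIfAbsent(s, k -> new HashMap<>()).put(name, code);
    }

    /** Models a template reference such as <actions.parser.header>. */
    String lookup(String scope, String name) {
        Map<String, String> byName = actions.get(scope);
        return byName == null ? null : byName.get(name);
    }

    public static void main(String[] args) {
        ActionScopesDemo a = new ActionScopesDemo("combined");
        a.define(null, "header", "package foo;");
        a.define("lexer", "members", "int j;");
        System.out.println(a.lookup("parser", "header"));
    }
}
```

  Because action names are plain map keys, a target can introduce a new
  action name in its templates without any change to the tool itself.
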
* changed $dynamicscope.value to $dynamicscope::value even if value is defined
  in same rule such as $function::name where rule function defines name.

* $dynamicscope gets you the stack

* rule scopes go like this now:

  rule
  scope {...}
  scope slist,Symbols;
      : ...
      ;

* Created RuleReturnScope as a generic rule return value.  Makes it easier
  to do this:
    RuleReturnScope r = parser.program();
    System.out.println(r.getTemplate().toString());

* $template, $tree, $start, etc...

* $r.x in current rule.  $r is ignored as fully-qualified name. $r.start works too

* added warning about $r referring to both return value of rule and dynamic scope of rule

* integrated StringTemplate in a very simple manner

  Syntax:
  -> template(arglist) "..."
  -> template(arglist) <<...>>
  -> namedTemplate(arglist)
  -> {free expression}
  -> // empty

  Predicate syntax:
  a : A B -> {p1}? foo(a={$A.text})
          -> {p2}? foo(a={$B.text})
          -> // return nothing

  An arg list is just a list of template attribute assignments to actions in
  curlies.

  There is a setTemplateLib() method for you to use with named template
  rewrites.

  Use a new option:

  grammar t;
  options {output=template;}
  ...

  This all should work for tree grammars too, but I'm still testing.

* fixed bugs where strings were improperly escaped in exceptions, comments,
  etc.  For example, newlines came out as newlines not the escaped version

3.0ea6 - November 13, 2005

* turned off -debug/-profile, which was on by default

* completely refactored the output templates; added some missing templates.

* dramatically improved infinite recursion error messages (actually
  left-recursion never even was printed out before).

* wasn't printing dangling state messages when it reanalyzes with k=1.

* fixed a nasty bug in the analysis engine dealing with infinite recursion.
  Spent all day thinking about it and cleaned up the code dramatically.
  Bug fixed and software is more powerful and I understand it better! :)

* improved verbose DFA nodes; organized by alt

* got much better random phrase generation.  For example:

 $ java org.antlr.tool.RandomPhrase simple.g program
 int Ktcdn ';' method wh '(' ')' '{' return 5 ';' '}'

* empty rules like "a : ;" generated code that didn't compile due to
  try/catch for RecognitionException.  Generated code couldn't possibly
  throw that exception.

* when printing out a grammar, such as in comments in generated code,
  ANTLR didn't print ast suffix stuff back out for literals.

* This never exited loop:
  DATA : (options {greedy=false;}: .* '\n' )* '\n' '.' ;
  and now it works due to new default nongreedy .*  Also this works:
  DATA : (options {greedy=false;}: .* '\n' )* '.' ;

* Dot star ".*" syntax didn't work; in lexer it is nongreedy by
  default.  In parser it is nongreedy but also k=1 by default.  Added
  unit tests.  Added blog entry to describe.

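  For readers unfamiliar with the greedy/nongreedy distinction, here is an
  analogy using java.util.regex, where ".*" is greedy and ".*?" is the
  nongreedy (reluctant) form.  This is not how ANTLR implements ".*"; the
  NongreedyDemo class and the comment-matching patterns are assumptions
  chosen only to make the difference visible.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Analogy only: regex quantifiers, not ANTLR's wildcard loop.
final class NongreedyDemo {
    /** Return the first substring of input matched by regex, or null. */
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "/* a */ x /* b */";
        // Greedy: runs to the last "*/" in the input.
        System.out.println(firstMatch("/\\*.*\\*/", input));
        // Nongreedy: stops at the first "*/" it can.
        System.out.println(firstMatch("/\\*.*?\\*/", input));
    }
}
```
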
* ~T where T is the only token yielded an empty set but no error

* Used to generate unreachable message here:

  parser grammar t;
  a : ID a
    | ID
    ;

  z.g:3:11: The following alternatives are unreachable: 2

  In fact it should really be an error; now it generates:

  no start rule in grammar t (no rule can obviously be followed by EOF)

  Per next change item, ANTLR cannot know that EOF follows rule 'a'.

* added error message indicating that ANTLR can't figure out what your
  start rule is.  Required to properly generate code in some cases.

* validating semantic predicates now work (if they are false, they
  throw a new FailedPredicateException)

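  A rough sketch of what a validating predicate check amounts to.  The
  exception class below is a hypothetical stand-in defined locally so the
  example compiles on its own; the runtime's FailedPredicateException has its
  own constructor and fields, which are not shown in these notes.

```java
// Hypothetical sketch; the local exception class is NOT the runtime's class.
final class ValidatingPredicateDemo {
    /** Local stand-in for the runtime's FailedPredicateException. */
    static final class FailedPredicateException extends Exception {
        FailedPredicateException(String rule, String predicateText) {
            super("rule " + rule + " failed predicate: {" + predicateText + "}?");
        }
    }

    /** Throw if a predicate guarding the rest of an alternative is false. */
    static void check(boolean result, String rule, String predicateText)
            throws FailedPredicateException {
        if (!result) {
            throw new FailedPredicateException(rule, predicateText);
        }
    }

    public static void main(String[] args) {
        int depth = 0;
        try {
            check(depth > 0, "block", "depth>0");
        } catch (FailedPredicateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```
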
* two hideous bug fixes in the IntervalSet, which made analysis go wrong
  in a few cases.  Thanks to Oliver Zeigermann for finding lots of bugs
  and making suggested fixes (including the next two items)!

* cyclic DFAs are now nonstatic and hence can access instance variables

* labels are now allowed on lexical elements (in the lexer)

* added some internal debugging options

* ~'a'* and ~('a')* were not working properly; refactored antlr.g grammar

3.0ea5 - July 5, 2005

* Using '\n' in a parser grammar resulted in a nonescaped version of '\n' in
  the token names table making compilation fail.  I fixed this by
  reorganizing/cleaning up the portion of ANTLR that deals with literals.
  See comment in org.antlr.codegen.Target.

* Target.getMaxCharValue() did not use the appropriate max value constant.

* ALLCHAR was a constant when it should use the Target max value def.  set
  complement for wildcard also didn't use the Target def.  Generally cleaned
  up the max char value stuff.

* Code gen didn't deal with ASTLabelType properly...I think even the 3.0ea7
  example tree parser was broken! :(

* Added a few more unit tests dealing with escaped literals

3.0ea4 - June 29, 2005

* tree parsers work; added CommonTreeNodeStream.  See simplecTreeParser
  example in examples-v3 tarball.

* added superClass and ASTLabelType options

* refactored Parser to have a BaseParser and added TreeParser

* bug fix: actions being dumped in description strings; compile errors
  resulted

3.0ea3 - June 23, 2005

Enhancements

* Automatic tree construction operators are in: ! ^ ^^

* Tree construction rewrite rules are in
    -> {pred1}? rewrite1
    -> {pred2}? rewrite2
    ...
    -> rewriteN

  The rewrite rules may be elements like ID, expr, $label, {node expr}
  and trees ^( <root> <children> ).  You can have (...)?, (...)*, (...)+
  subrules as well.

  You may have rewrites in subrules not just at outer level of rule, but
  any -> rewrite forces auto AST construction off for that alternative
  of that rule.

  To avoid cycles, copy semantics are used:

  r : INT -> INT INT ;

  means make two new nodes from the same INT token.

  Repeated references to a rule element imply a copy for at least one
  tree:

  a : atom -> ^(atom atom) ; // NOT CYCLE! (dup atom tree)

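  The reason copies are needed can be sketched with a tiny tree class.  Node,
  dupTree and rootWithCopy are hypothetical names for illustration; they are
  not the runtime's tree API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node used to illustrate copy semantics in rewrites.
final class CopySemanticsDemo {
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text) { this.text = text; }

        /** Deep copy of this node and everything below it. */
        Node dupTree() {
            Node copy = new Node(text);
            for (Node child : children) copy.children.add(child.dupTree());
            return copy;
        }

        String toStringTree() {
            if (children.isEmpty()) return text;
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node child : children) sb.append(' ').append(child.toStringTree());
            return sb.append(')').toString();
        }
    }

    /**
     * Build ^(atom atom) from one matched atom.  Both the root and the child
     * are fresh copies, so no node ever becomes its own descendant.
     */
    static Node rootWithCopy(Node atom) {
        Node root = atom.dupTree();
        root.children.add(atom.dupTree());
        return root;
    }

    public static void main(String[] args) {
        Node atom = new Node("x");
        System.out.println(rootWithCopy(atom).toStringTree());
    }
}
```

  Adding the original node as a child of itself would create a cycle, and
  printing or walking such a tree would never terminate.
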
* $ruleLabel.tree refers to tree created by matching the labeled element.

* A description of the blocks/alts is generated as a comment in output code

* A timestamp / signature is put at top of each generated code file

3.0ea2 - June 12, 2005

Bug fixes

* Some error messages were missing the stackTrace parameter

* Removed the file locking mechanism as it's not cross platform

* Some absolute vs relative path name problems with writing output
  files.  Rules are now more concrete.  -o option takes precedence
  // -o /tmp /var/lib/t.g => /tmp/T.java
  // -o subdir/output /usr/lib/t.g => subdir/output/T.java
  // -o . /usr/lib/t.g => ./T.java
  // -o /tmp subdir/t.g => /tmp/subdir/t.g
  // If they didn't specify a -o dir, just write to the location
  // where the grammar is, absolute or relative

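  One possible reading of those rules, sketched with java.nio.file.  The
  OutputDirDemo class is hypothetical and is not the Tool's actual code; it
  returns only the output directory (not the generated file name) and
  assumes Unix-style paths.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of the -o rules listed above; Unix-style paths assumed.
final class OutputDirDemo {
    /** outDir is null when no -o option was given. */
    static Path outputDir(String outDir, String grammarFile) {
        Path grammar = Paths.get(grammarFile);
        Path grammarDir = grammar.getParent(); // null for a bare file name
        if (outDir == null) {
            // No -o: write next to the grammar, absolute or relative.
            return grammarDir != null ? grammarDir : Paths.get(".");
        }
        Path out = Paths.get(outDir);
        if (grammar.isAbsolute() || grammarDir == null) {
            return out;                 // -o wins; the grammar's directory is dropped
        }
        return out.resolve(grammarDir); // relative grammar dir is kept under -o
    }

    public static void main(String[] args) {
        System.out.println(outputDir("/tmp", "/var/lib/t.g"));
        System.out.println(outputDir("/tmp", "subdir/t.g"));
        System.out.println(outputDir(null, "/usr/lib/t.g"));
    }
}
```
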
* does error checking on unknown option names now

* Using just language code not locale name for error message file.  I.e.,
  the default (and for any English speaking locale) is en.stg not en_US.stg
  anymore.

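  In terms of java.util.Locale, the change amounts to using getLanguage()
  rather than the full locale name.  MessageFileDemo is a hypothetical
  helper for illustration; how the tool actually locates the file is not
  shown in these notes.

```java
import java.util.Locale;

// Hypothetical helper showing language-code-only message file names.
final class MessageFileDemo {
    /** Derive the message file name from the language code alone. */
    static String messageFileName(Locale locale) {
        return locale.getLanguage() + ".stg";
    }

    public static void main(String[] args) {
        System.out.println(messageFileName(Locale.US)); // same file as Locale.UK
        System.out.println(messageFileName(Locale.UK));
        System.out.println(Locale.US);                  // the old, longer name
    }
}
```
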
* The error manager now asks the Tool to panic rather than simply doing
  a System.exit().

* Lots of refactoring concerning grammar, rule, subrule options.  Now
  detects invalid options.

3.0ea1 - June 1, 2005

Initial early access release
