ANTLR v3.0.1 C Runtime
ANTLR 3.0.1
January 1, 2008

At the moment, the use of the C runtime engine for the parser is not generally
for the inexperienced C programmer. However this is mainly because of the lack
of documentation on use, which will be corrected shortly. The C runtime
code itself is however well documented with doxygen style comments and a
reasonably experienced C programmer should be able to piece it together. You
can visit the documentation at: http://www.antlr.org/api/C/index.html

The general make up is that everything is implemented as a pseudo class/object
initialized with pointers to its 'member' functions and data. All objects are
(usually) created by factories, which auto manage the memory allocation and
release and generally make life easier. If you remember this rule, everything
should fall into place.

Jim Idle - Portland Oregon, Jan 2008
jimi idle ws

===============================================================================

Terence Parr, parrt at cs usfca edu
ANTLR project lead and supreme dictator for life
University of San Francisco

INTRODUCTION

Welcome to ANTLR v3!
I've been working on this for nearly 4 years and it's
almost ready! I plan no feature additions between this beta and first
3.0 release. I have lots of features to add later, but this will be
the first set. Ultimately, I need to rewrite ANTLR v3 in itself (it's
written in 2.7.7 at the moment and also needs StringTemplate 3.0 or
later).

You should use v3 in conjunction with ANTLRWorks:

    http://www.antlr.org/works/index.html

WARNING: We have bits of documentation started, but nothing super-complete
yet. The book will be printed May 2007:

http://www.pragmaticprogrammer.com/titles/tpantlr/index.html

but we should have a beta PDF available on that page in Feb 2007.

You also have the examples plus the source to guide you.

See the new wiki FAQ:

    http://www.antlr.org/wiki/display/ANTLR3/ANTLR+v3+FAQ

and general doc root:

    http://www.antlr.org/wiki/display/ANTLR3/ANTLR+3+Wiki+Home

Please help add/update FAQ entries.

I have made very little effort at this point to deal well with
erroneous input (e.g., bad syntax might make ANTLR crash).
I will clean this up after I've rewritten v3 in v3.

Per the license in LICENSE.txt, this software is not guaranteed to
work and might even destroy all life on this planet:

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

EXAMPLES

ANTLR v3 sample grammars:

    http://www.antlr.org/download/examples-v3.tar.gz

contains the following examples: LL-star, cminus, dynamic-scope,
fuzzy, hoistedPredicates, island-grammar, java, python, scopes,
simplecTreeParser, treeparser, tweak, xmlLexer.

Also check out Mantra Programming Language for a prototype (work in
progress) using v3:

    http://www.linguamantra.org/

----------------------------------------------------------------------

What is ANTLR?

ANTLR stands for (AN)other (T)ool for (L)anguage (R)ecognition and was
originally known as PCCTS. ANTLR is a language tool that provides a
framework for constructing recognizers, compilers, and translators
from grammatical descriptions containing actions. Target language list:

http://www.antlr.org/wiki/display/ANTLR3/Code+Generation+Targets

----------------------------------------------------------------------

How is ANTLR v3 different than ANTLR v2?

See migration guide:
    http://www.antlr.org/wiki/display/ANTLR3/Migrating+from+ANTLR+2+to+ANTLR+3

ANTLR v3 has a far superior parsing algorithm called LL(*) that
handles many more grammars than v2 does. In practice, it means you
can throw almost any grammar at ANTLR that is non-left-recursive and
unambiguous (ambiguous meaning the same input can be matched by
multiple rules); the cost is perhaps a tiny bit of backtracking, but
with a DFA not a full parser.
You can manually set the max lookahead k as an option for any decision
though. The LL(*) algorithm ramps up to use more lookahead when it
needs to and is much more efficient than normal LL backtracking. There
is support for syntactic predicates (full LL backtracking) when LL(*)
fails.

Lexers are much easier due to the LL(*) algorithm as well. Previously
these two lexer rules would cause trouble because ANTLR couldn't
distinguish between them with finite lookahead to see the decimal
point:

INT : ('0'..'9')+ ;
FLOAT : INT '.' INT ;

The syntax is almost identical for features in common, but you should
note that labels are always '=' not ':'. So do id=ID not id:ID.

You can do combined lexer/parser grammars again (a la PCCTS): both lexer
and parser rules are defined in the same file. See the examples.
Really nice. You can reference strings and characters in the grammar
and ANTLR will generate the lexer for you.

The attribute structure has been enhanced. Rules may have multiple
return values, for example. Further, there are dynamically scoped
attributes whereby a rule may define a value usable by any rule it
invokes directly or indirectly w/o having to pass a parameter all the
way down.
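
To make the dynamically scoped attribute idea concrete, here is a
minimal plain-Java sketch. It is not ANTLR-generated code, and every
name in it (DynamicScopeSketch, MethodScope, method, body, stat) is
hypothetical. Each call to the outer "rule" pushes a scope onto a
stack; any "rule" invoked beneath it reads the top of that stack
instead of receiving the value as a parameter.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class DynamicScopeSketch {
    /** Hypothetical scope record owned by the 'method' rule. */
    static final class MethodScope {
        final String name;
        MethodScope(String name) { this.name = name; }
    }

    // One stack per scoped rule; nested invocations see the innermost scope.
    private final Deque<MethodScope> methodStack = new ArrayDeque<>();
    private final List<String> log = new ArrayList<>();

    /** Outer rule: defines the scope, then invokes other rules. */
    void method(String name, String... assignments) {
        methodStack.push(new MethodScope(name));
        try {
            body(assignments);
        } finally {
            methodStack.pop(); // the scope disappears even if body() fails
        }
    }

    /** Intermediate rule: note that no method name is passed through it. */
    void body(String... assignments) {
        for (String target : assignments) {
            stat(target);
        }
    }

    /** Innermost rule: reads the enclosing method's name from the scope stack. */
    void stat(String target) {
        log.add(target + " assigned in " + methodStack.peek().name);
    }

    List<String> log() { return log; }

    public static void main(String[] args) {
        DynamicScopeSketch p = new DynamicScopeSketch();
        p.method("foo", "x", "y");
        p.method("bar", "z");
        p.log().forEach(System.out::println);
    }
}
```

The point of the sketch is that body() never mentions the method name,
yet stat() can still see it; the try/finally keeps the stack balanced
when a rule exits early.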

ANTLR v3 tree construction is far superior--it provides tree rewrite
rules where the right hand side is simply the tree grammar fragment
describing the tree you want to build:

formalArgs
    : typename declarator (',' typename declarator )*
      -> ^(ARG typename declarator)+
    ;

That builds tree sequences like:

^(ARG int v1) ^(ARG int v2)

ANTLR v3 also incorporates StringTemplate:

    http://www.stringtemplate.org

just like AST support. It is useful for generating output. For
example this rule creates a template called 'import' for each import
definition found in the input stream:

grammar Java;
options {
    output=template;
}
...
importDefinition
    : 'import' identifierStar SEMI
      -> import(name={$identifierStar.st},
                begin={$identifierStar.start},
                end={$identifierStar.stop})
    ;

The attributes are set via assignments in the argument list. The
arguments are actions with arbitrary expressions in the target
language.
The .st label property is the result template from a rule
reference. There is a nice shorthand in actions too:

    %foo(a={},b={},...)       ctor
    %({name-expr})(a={},...)  indirect template ctor reference
    %{string-expr}            anonymous template from string expr
    %{expr}.y = z;            set template attribute y of StringTemplate-typed
                              expr to z
    %x.y = z;                 set template attribute y of x (always set never
                              get attr) to z [languages like python without
                              ';' must still use the ';' which the code
                              generator is free to remove during code gen]
                              Same as '(x).setAttribute("y", z);'

For ANTLR v3 I decided to make the most common tasks easy by default.
This means that some of the basic objects are heavier weight
than some speed demons would like, but they are free to pare it down
leaving most programmers the luxury of having it "just work." For
example, to read in some input, tweak it, and write it back out
preserving whitespace, is easy in v3.

The ANTLR source code is much prettier. You'll also note that the
run-time classes are conveniently encapsulated in the
org.antlr.runtime package.

----------------------------------------------------------------------

How do I install this damn thing?

Just untar and you'll get:

antlr-3.0b6/README.txt (this file)
antlr-3.0b6/LICENSE.txt
antlr-3.0b6/src/org/antlr/...
antlr-3.0b6/lib/stringtemplate-3.0.jar (3.0b6 needs 3.0)
antlr-3.0b6/lib/antlr-2.7.7.jar
antlr-3.0b6/lib/antlr-3.0b6.jar

Then you need to add all the jars in lib to your CLASSPATH.

----------------------------------------------------------------------

How do I use ANTLR v3?

[I am assuming you are only using the command-line (and not the
ANTLRWorks GUI)].

Running ANTLR with no parameters shows you:

ANTLR Parser Generator Early Access Version 3.0b6 (Jan 31, 2007) 1989-2007
usage: java org.antlr.Tool [args] file.g [file2.g file3.g ...]
  -o outputDir          specify output directory where all output is generated
  -lib dir              specify location of token files
  -report               print out a report about the grammar(s) processed
  -print                print out the grammar without actions
  -debug                generate a parser that emits debugging events
  -profile              generate a parser that computes profiling information
  -nfa                  generate an NFA for each rule
  -dfa                  generate a DFA for each decision point
  -message-format name  specify output style for messages
  -X                    display extended argument list

For example, consider how to make the LL-star example from the examples
tarball you can get at http://www.antlr.org/download/examples-v3.tar.gz

$ cd examples/java/LL-star
$ java org.antlr.Tool simplec.g
$ jikes *.java

For input:

char c;
int x;
void bar(int x);
int foo(int y, char d) {
  int i;
  for (i=0; i<3; i=i+1) {
    x=3;
    y=5;
  }
}

you will see output as follows:

$ java Main input
bar is a declaration
foo is a definition

What if I want to test my parser without generating code? Easy. Just
run ANTLR in interpreter mode. It can't execute your actions, but it
can create a parse tree from your input to show you how it would be
matched. Use the org.antlr.tool.Interp main class. In the following,
I interpret simplec.g on t.c, which contains "int x;"

$ java org.antlr.tool.Interp simplec.g WS program t.c
( <grammar SimpleC>
  ( program
    ( declaration
      ( variable
        ( type [@0,0:2='int',<14>,1:0] )
        ( declarator [@2,4:4='x',<2>,1:4] )
        [@3,5:5=';',<5>,1:5]
      )
    )
  )
)

where I have formatted the output to make it more readable. I have
told it to ignore all WS tokens.

----------------------------------------------------------------------

How do I rebuild ANTLR v3?

Make sure the following jars are in your CLASSPATH:

antlr-3.0b6/lib/stringtemplate-3.0.jar
antlr-3.0b6/lib/antlr-2.7.7.jar
junit.jar [if you want to build the test directories]

then jump into antlr-3.0b6/src directory and then type:

$ javac -d . org/antlr/Tool.java org/antlr/*/*.java org/antlr/*/*/*.java

Takes 9 seconds on my 1GHz laptop or 4 seconds with jikes. Later I'll
have a real build mechanism, though I must admit the one-liner appeals
to me. I use Intellij so I never type anything actually to build.

There is also an ANT build.xml file, but I know nothing of ANT; contributed
by others (I'm opposed to any tool with an XML interface for Humans).

-----------------------------------------------------------------------
C# Target Notes

1. Auto-generated lexers do not inherit parent parser's @namespace
   {...} value. Use @lexer::namespace{...}.

-----------------------------------------------------------------------

CHANGES

March 17, 2007

* Jonathan DeKlotz updated C# templates to be 3.0b6 current

March 14, 2007

* Manually-specified (...)=> force backtracking eval of that predicate.
  backtracking=true mode does not however. Added unit test.

March 14, 2007

* Fixed bug in lexer where ~T didn't compute the set from rule T.

* Added -Xnoinlinedfa make all DFA with tables; no inline prediction with IFs

* Fixed http://www.antlr.org:8888/browse/ANTLR-80.
  Sem pred states didn't define lookahead vars.

* Fixed http://www.antlr.org:8888/browse/ANTLR-91.
  When forcing some acyclic DFA to be state tables, they broke.
  Forcing all DFA to be state tables should give same results.

March 12, 2007

* setTokenSource in CommonTokenStream didn't clear tokens list.
  setCharStream calls reset in Lexer.

* Altered -depend. No longer printing grammar files for multiple input
  files with -depend. Doesn't show T__.g temp file anymore.
  Added TLexer.tokens. Added .h files if defined.

February 11, 2007

* Added -depend command-line option that, instead of processing files,
  shows you what files the input grammar(s) depend on and what files
  they generate. For combined grammar T.g:

  $ java org.antlr.Tool -depend T.g

  You get:

  TParser.java : T.g
  T.tokens : T.g
  T__.g : T.g

  Now, assuming U.g is a tree grammar that references T's tokens:

  $ java org.antlr.Tool -depend T.g U.g

  TParser.java : T.g
  T.tokens : T.g
  T__.g : T.g
  U.g: T.tokens
  U.java : U.g
  U.tokens : U.g

  Handles spaces by escaping them. Pays attention to -o, -fo and -lib.
  Dir 'x y' is a valid dir in current dir.

  $ java org.antlr.Tool -depend -lib /usr/local/lib -o 'x y' T.g U.g
  x\ y/TParser.java : T.g
  x\ y/T.tokens : T.g
  x\ y/T__.g : T.g
  U.g: /usr/local/lib/T.tokens
  x\ y/U.java : U.g
  x\ y/U.tokens : U.g

  You have API access via org.antlr.tool.BuildDependencyGenerator class:
  getGeneratedFileList(), getDependenciesFileList(). You can also access
  the output template: getDependencies(). The file
  org/antlr/tool/templates/depend.stg contains the template. You can
  modify as you want. File objects go in so you can play with path etc...

February 10, 2007

* no more .gl files generated. All .g all the time.

* changed @finally to be @after and added a finally clause to the
  exception stuff. I also removed the superfluous "exception"
  keyword.
  Here's what the new syntax looks like:

  a
  @after { System.out.println("ick"); }
    : 'a'
    ;
    catch[RecognitionException e] { System.out.println("foo"); }
    catch[IOException e] { System.out.println("io"); }
    finally { System.out.println("foobar"); }

  @after executes after bookkeeping to set $rule.stop, $rule.tree but
  before scopes pop and any memoization happens. Dynamic scopes and
  memoization are still in generated finally block because they must
  exec even if error in rule. The @after action and tree setting
  stuff can technically be skipped upon syntax error in rule. [Later
  we might add something to finally to stick an ERROR token in the
  tree and set the return value.] Sequence goes: set $stop, $tree (if
  any), @after (if any), pop scopes (if any), memoize (if needed),
  grammar finally clause. Last 3 are in generated code's finally
  clause.

3.0b6 - January 31, 2007

January 30, 2007

* Fixed bug in IntervalSet.and: it returned the same empty set all the time
  rather than new empty set. Code altered the same empty set.

* Made analysis terminate faster upon a decision that takes too long;
  it seemed to keep doing work for a while.
  Refactored some names and updated comments. Also made it terminate
  when it realizes it's non-LL(*) due to recursion. just added
  terminate conditions to loop in convert().

* Sometimes fatal non-LL(*) messages didn't appear; instead you got
  "antlr couldn't analyze", which is actually untrue. I had the
  order of some prints wrong in the DecisionProbe.

* The code generator incorrectly detected when it could use a fixed,
  acyclic inline DFA (i.e., using an IF). Upon non-LL(*) decisions
  with predicates, analysis made cyclic DFA. But this stops
  the computation detecting whether they are cyclic. I just added
  a protection in front of the acyclic DFA generator to avoid if
  non-LL(*). Updated comments.

January 23, 2007

* Made tree node streams use adaptor to create navigation nodes.
  Thanks to Emond Papegaaij.

January 22, 2007

* Added lexer rule properties: start, stop

January 1, 2007

* analysis failsafe is back on; if a decision takes too long, it bails out
  and uses k=1

January 1, 2007

* += labels for rules only work for output option; previously elements
  of list were the return value structs, but are now either the tree or
  StringTemplate return value. You can label different rules now
  x+=a x+=b.

December 30, 2006

* Allow \" to work correctly in "..." template.

December 28, 2006

* errors that are now warnings: missing AST label type in trees.
  Also "no start rule detected" is warning.

* tree grammars also can do rewrite=true for output=template.
  Only works for alts with single node or tree as alt elements.
  If you are going to use $text in a tree grammar or do rewrite=true
  for templates, you must use in your main:

    nodes.setTokenStream(tokens);

* You get a warning for tree grammars that do rewrite=true and
  output=template and have -> for alts that are not simple nodes
  or simple trees. new unit tests in TestRewriteTemplates at end.

December 27, 2006

* Error message appears when you use -> in tree grammar with
  output=template and rewrite=true for alt that is not simple
  node or tree ref.

* no more $stop attribute for tree parsers; meaningless/useless.
  Removed from TreeRuleReturnScope also.

* rule text attribute in tree parser must pull from token buffer.
  Makes no sense otherwise. added getTokenStream to TreeNodeStream
  so rule $text attr works. CommonTreeNodeStream etc... now let
  you set the token stream so you can access later from tree parser.
  $text is not well-defined for rules like

    slist : stat+ ;

  because stat is not a single node nor rooted with a single node.
  $slist.text will get only first stat. I need to add a warning about
  this...

* Fixed http://www.antlr.org:8888/browse/ANTLR-76 for Java.
  Enhanced TokenRewriteStream so it accepts any object; converts
  to string at last second. Allows you to rewrite with StringTemplate
  templates now :)

* added rewrite option that makes -> template rewrites do replace ops for
  TokenRewriteStream input stream. In output=template and rewrite=true mode
  same as before 'cept that the parser does

    ((TokenRewriteStream)input).replace(
        ((Token)retval.start).getTokenIndex(),
        input.LT(-1).getTokenIndex(),
        retval.st);

  after each rewrite so that the input stream is altered. Later refs to
  $text will have rewrites. Here's a sample test program for grammar Rew.

      FileReader groupFileR = new FileReader("Rew.stg");
      StringTemplateGroup templates = new StringTemplateGroup(groupFileR);
      ANTLRInputStream input = new ANTLRInputStream(System.in);
      RewLexer lexer = new RewLexer(input);
      TokenRewriteStream tokens = new TokenRewriteStream(lexer);
      RewParser parser = new RewParser(tokens);
      parser.setTemplateLib(templates);
      parser.program();
      System.out.println(tokens.toString());
      groupFileR.close();

December 26, 2006

* BaseTree.dupTree didn't dup recursively.

December 24, 2006

* Cleaned up some comments and removed field treeNode
  from MismatchedTreeNodeException class. It is "node" in
  RecognitionException.

* Changed type from Object to BitSet for expecting fields in
  MismatchedSetException and MismatchedNotSetException

* Cleaned up error printing in lexers and the messages that it creates.

* Added this to TreeAdaptor:
    /** Return the token object from which this node was created.
     *  Currently used only for printing an error message.
     *  The error display routine in BaseRecognizer needs to
     *  display where in the input the error occurred. If your
     *  tree implementation does not store information that can
     *  lead you to the token, you can create a token filled with
     *  the appropriate information and pass that back. See
     *  BaseRecognizer.getErrorMessage().
     */
    public Token getToken(Object t);

December 23, 2006

* made BaseRecognizer.displayRecognitionError nonstatic so people can
  override it. Not sure why it was static before.

* Removed state/decision message that comes out of no
  viable alternative exceptions, as that was too much.
  removed the decision number from the early exit exception
  also. During development, you can simply override
  displayRecognitionError from BaseRecognizer to add the stuff
  back in if you want.

* made output go to an output method you can override: emitErrorMessage()

* general cleanup of the error emitting code in BaseRecognizer. Lots
  more stuff you can override: getErrorHeader, getTokenErrorDisplay,
  emitErrorMessage, getErrorMessage.
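
A minimal sketch of how the override points named in the December 23
entries fit together. RecognizerSketch is a hypothetical stand-in, not
ANTLR's BaseRecognizer, and its method signatures are simplified
assumptions; only the method names come from the entries above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a recognizer base class (simplified signatures). */
class RecognizerSketch {
    /** Builds the full message, then hands it to the overridable emitter. */
    void displayRecognitionError(int line, int col, String problem) {
        emitErrorMessage(getErrorHeader(line, col) + " " + getErrorMessage(problem));
    }
    String getErrorHeader(int line, int col) { return "line " + line + ":" + col; }
    String getErrorMessage(String problem) { return problem; }
    void emitErrorMessage(String msg) { System.err.println(msg); }
}

/** Collects messages instead of printing them and uses a file-style header. */
class CollectingRecognizerSketch extends RecognizerSketch {
    final List<String> errors = new ArrayList<>();

    @Override
    String getErrorHeader(int line, int col) {
        return "input.txt:" + line + ":" + col + ":";   // "input.txt" is a made-up name
    }

    @Override
    void emitErrorMessage(String msg) { errors.add(msg); }

    public static void main(String[] args) {
        CollectingRecognizerSketch r = new CollectingRecognizerSketch();
        r.displayRecognitionError(8, 8, "no viable alternative");
        System.out.println(r.errors);
    }
}
```

Because display, header, message, and emission are separate instance
methods, a subclass can change one piece without copying the others.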

December 22, 2006

* Altered TreeParser.matchAny() so that it skips entire trees if
  node has children, otherwise skips one node. Now this works to
  skip entire body of function if single-rooted subtree:
  ^(FUNC name=ID arg=ID .)

* Added "reverse index" from node to stream index. Override
  fillReverseIndex() in CommonTreeNodeStream if you want to change.
  Use getNodeIndex(node) to find stream index for a specific tree node.
  See getNodeIndex(), reverseIndex(Set tokenTypes),
  reverseIndex(int tokenType), fillReverseIndex(). The indexing
  costs time and memory to fill, but pulling stuff out will be lots
  faster as it can jump from a node ptr straight to a stream index.

* Added TreeNodeStream.get(index) to make it easier for interpreters to
  jump around in tree node stream.

* New CommonTreeNodeStream buffers all nodes in stream for fast jumping
  around. It now has push/pop methods to invoke other locations in
  the stream for building interpreters.

* Moved CommonTreeNodeStream to UnBufferedTreeNodeStream and removed
  Iterator implementation.
  moved toNodesOnlyString() to TestTreeNodeStream

* [BREAKS ANY TREE IMPLEMENTATION]
  made CommonTreeNodeStream work with any tree node type. TreeAdaptor
  now implements isNil so must add; trivial, but does break back
  compatibility.

December 17, 2006

* Added traceIn/Out methods to recognizers so that you can override them;
  previously they were in-line print statements. The message has also
  been slightly improved.

* Factored BuildParseTree into debug package; cleaned stuff up. Fixed
  unit tests.

December 15, 2006

* [BREAKS ANY TREE IMPLEMENTATION]
  org.antlr.runtime.tree.Tree; needed to add get/set for token start/stop
  index so CommonTreeAdaptor can assume Tree interface not CommonTree
  implementation. Otherwise, no way to create your own nodes that satisfy
  Tree because CommonTreeAdaptor was doing

    public int getTokenStartIndex(Object t) {
        return ((CommonTree)t).startIndex;
    }

  Added to Tree:

    /** What is the smallest token index (indexing from 0) for this node
     *  and its children?
     */
    int getTokenStartIndex();

    void setTokenStartIndex(int index);

    /** What is the largest token index (indexing from 0) for this node
     *  and its children?
     */
    int getTokenStopIndex();

    void setTokenStopIndex(int index);

December 13, 2006

* Added org.antlr.runtime.tree.DOTTreeGenerator so you can generate DOT
  diagrams easily from trees.

    CharStream input = new ANTLRInputStream(System.in);
    TLexer lex = new TLexer(input);
    CommonTokenStream tokens = new CommonTokenStream(lex);
    TParser parser = new TParser(tokens);
    TParser.e_return r = parser.e();
    Tree t = (Tree)r.tree;
    System.out.println(t.toStringTree());
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT(t);
    System.out.println(st);

* Changed the way mark()/rewind() work in CommonTreeNodeStream to mirror
  more flexible solution in ANTLRStringStream. Forgot to set lastMarker
  anyway. Now you can rewind to non-most-recent marker.
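
To illustrate what rewinding to a non-most-recent marker means, here is a
small sketch over a plain string. MarkableStreamSketch and its methods are
hypothetical names for illustration; this is not the code in
CommonTreeNodeStream or ANTLRStringStream.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical markable stream; markers are numbered from 1. */
class MarkableStreamSketch {
    private final String data;
    private int p = 0;                                        // index of next char
    private final List<Integer> markers = new ArrayList<>();  // saved positions

    MarkableStreamSketch(String data) { this.data = data; }

    int index() { return p; }

    /** Next char without consuming it, or -1 at end of input. */
    int LA1() { return p < data.length() ? data.charAt(p) : -1; }

    void consume() { if (p < data.length()) p++; }

    /** Remember the current position and return a handle for it. */
    int mark() {
        markers.add(p);
        return markers.size();
    }

    /** Restore the position saved by any live marker, not just the latest.
     *  That marker and every marker created after it are released. */
    void rewind(int marker) {
        p = markers.get(marker - 1);
        while (markers.size() >= marker) {
            markers.remove(markers.size() - 1);
        }
    }

    public static void main(String[] args) {
        MarkableStreamSketch s = new MarkableStreamSketch("abcdef");
        s.consume();
        int outer = s.mark();      // position 1
        s.consume();
        s.consume();
        s.mark();                  // position 3, never rewound to directly
        s.consume();
        s.rewind(outer);           // skip over the newer marker
        System.out.println(s.index());
    }
}
```

Keeping every saved position in a list, rather than a single last-marker
field, is what lets an outer speculative parse rewind past markers set by
inner ones.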

December 12, 2006

* Temp lexers now end in .gl (T__.gl, for example)

* TreeParser suffix no longer generated for tree grammars

* Defined reset for lexer, parser, tree parser; rewinds the input stream also

December 10, 2006

* Made Grammar.abortNFAToDFAConversion() abort in middle of a DFA.

December 9, 2006

* fixed bug in OrderedHashSet.add(). It didn't track elements correctly.

December 6, 2006

* updated build.xml for future Ant compatibility, thanks to Matt Benson.

* various tests in TestRewriteTemplate and TestSyntacticPredicateEvaluation
  were using the old 'channel' vs. new '$channel' notation.
  TestInterpretedParsing didn't pick up an earlier change to CommonToken.
  Reported by Matt Benson.

* fixed platform dependent test failures in TestTemplates, supplied by Matt
  Benson.

November 29, 2006

* optimized semantic predicate evaluation so that p||!p yields true.
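
The p||!p simplification can be sketched as follows. This is an
illustration of the idea only: predicates are modeled as plain strings
such as "p" and "!p", and PredicateOrSketch is a hypothetical name, not
part of ANTLR's analysis code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical simplifier for an OR of predicate literals. */
class PredicateOrSketch {
    /** Returns "true" if some literal appears together with its negation;
     *  otherwise returns the distinct literals joined with "||". */
    static String simplifyOr(String... literals) {
        Set<String> seen = new LinkedHashSet<>();
        for (String lit : literals) {
            String complement = lit.startsWith("!") ? lit.substring(1) : "!" + lit;
            if (seen.contains(complement)) {
                return "true";            // p || !p always holds
            }
            seen.add(lit);                // duplicates collapse in the set
        }
        return String.join("||", seen);
    }

    public static void main(String[] args) {
        System.out.println(simplifyOr("p", "!p"));
        System.out.println(simplifyOr("p", "q", "p"));
    }
}
```

When the whole disjunction reduces to true, no predicate needs to be
evaluated at parse time for that decision.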

November 22, 2006

* fixed bug that prevented var = $rule.some_retval from working in anything
  but the first alternative of a rule or subrule.

* attribute names containing digits were not allowed, this is now fixed,
  allowing attributes like 'name1' but not '1name1'.

November 19, 2006

* Removed LeftRecursionMessage and apparatus because it seems that I check
  for left recursion upfront before analysis and everything gets specified as
  recursion cycles at this point.

November 16, 2006

* TokenRewriteStream.replace was not passing programName to next method.

November 15, 2006

* updated DOT files for DFA generation to make smaller circles.

* made epsilon edges italics in the NFA diagrams.

3.0b5 - November 15, 2006

The biggest thing is that your grammar file names must match the grammar name
inside (your generated class names will also be different) and we use
$channel=HIDDEN now instead of channel=99 inside lexer actions.
Should be compatible other than that. Please look at complete list of
changes.

November 14, 2006

* Force token index to be -1 for CommonIndex in case not set.

November 11, 2006

* getUniqueID for TreeAdaptor now uses identityHashCode instead of hashCode.

November 10, 2006

* No grammar nondeterminism warning now when wildcard '.' is final alt.
  Examples:

    a : A | B | . ;

    A : 'a'
      | .
      ;

    SL_COMMENT
        : '//' (options {greedy=false;} : .)* '\r'? '\n'
        ;

    SL_COMMENT2
        : '//' (options {greedy=false;} : 'x'|.)* '\r'? '\n'
        ;

November 8, 2006

* Syntactic predicates did not get hoisted properly upon non-LL(*)
  decision. Other hoisting issues fixed. Cleaned up code.

* Removed failsafe that checked to see if I'm spending too much time on
  a single DFA; I don't think we need it anymore.

November 3, 2006

* $text, $line, etc... were not working in assignments. Fixed and added
  test case.
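
The November 11 getUniqueID change matters when a node class overrides
hashCode. The sketch below uses a hypothetical Node class (not an ANTLR
type) to show the difference. Note that System.identityHashCode ignores
the override and is typically distinct per object, but the JDK does not
guarantee uniqueness.

```java
/** Hypothetical node type whose hashCode depends only on its text. */
class UniqueIdSketch {
    static final class Node {
        final String text;
        Node(String text) { this.text = text; }
        @Override public int hashCode() { return text.hashCode(); }
        @Override public boolean equals(Object o) {
            return o instanceof Node && ((Node) o).text.equals(text);
        }
    }

    /** ID based on content: two separate nodes with equal text share it. */
    static int contentId(Object node) { return node.hashCode(); }

    /** ID based on object identity: unaffected by an overridden hashCode. */
    static int identityId(Object node) { return System.identityHashCode(node); }

    public static void main(String[] args) {
        Node a = new Node("ID");
        Node b = new Node("ID");
        System.out.println(contentId(a) == contentId(b));   // content IDs collide
        System.out.println(identityId(a) == identityId(a)); // stable for one object
    }
}
```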

* $label.text translated to label.getText in lexer even if label was on a char

November 2, 2006

* Added error if you don't specify what the AST type is; actions in tree
  grammar won't work without it.

    $ cat x.g
    tree grammar x;
    a : ID {String s = $ID.text;} ;

    ANTLR Parser Generator Early Access Version 3.0b5 (??, 2006) 1989-2006
    error: x.g:0:0: (152) tree grammar x has no ASTLabelType option

November 1, 2006

* $text, $line, etc... were not working properly within lexer rule.

October 31, 2006

* Finally actions now execute before dynamic scopes are popped in the
  rule. Previously it was not possible to access the rule's scoped
  variables in a finally action.

October 29, 2006

* Altered ActionTranslator to emit errors on setting read-only attributes
  such as $start, $stop, $text in a rule. Also forbid setting any attributes
  in rules/tokens referenced by a label or name.
  Setting dynamic scopes' attributes and your own parameter attributes
  is legal.

October 27, 2006

* Altered how ANTLR figures out what decision is associated with which
  block of grammar. Makes ANTLRWorks correctly find DFA for a block.

October 26, 2006

* Fixed bug where EOT transitions led to no NFA configs in a DFA state,
  yielding an error in DFA table generation.

* renamed action.g to ActionTranslator.g
  the ActionTranslator class is now called ActionTranslatorLexer, as ANTLR
  generates this classname now. Fixed rest of codebase accordingly.

* added rules recognizing setting of scopes' attributes to ActionTranslator.g
  the Objective C target needed access to the right-hand side of the assignment
  in order to generate correct code

* changed ANTLRCore.sti to reflect the new mandatory templates to support the above
  namely: scopeSetAttributeRef, returnSetAttributeRef and the ruleSetPropertyRef_*
  templates, with the exception of ruleSetPropertyRef_text. we cannot set this attribute

October 19, 2006

* Fixed 2 bugs in DFA conversion that caused exceptions.
  altered functionality of getMinElement so it ignores elements<0.

October 18, 2006

* moved resetStateNumbersToBeContiguous() to after issuing of warnings;
  an internal error in that routine should make more sense as issues
  with decision will appear first.

* fixed cut/paste bug I introduced when I fixed the EOF in min/max
  bug. Prevented C grammar from working briefly.

October 17, 2006

* Removed a failsafe that seems to be unnecessary; it ensured the DFA
  didn't get too big. It was resulting in some failures in code
  generation that led me on quite a strange debugging trip.

October 16, 2006

* Use channel=HIDDEN not channel=99 to put tokens on hidden channel.

October 12, 2006

* ANTLR now has a customizable message format for errors and warnings,
  to make it easier to fulfill requirements by IDEs and such.
  The format to be used can be specified via the '-message-format name'
  command line switch. The default for name is 'antlr', also available
  at the moment is 'gnu'. This is done via StringTemplate, for details
  on the requirements look in org/antlr/tool/templates/messages/formats/

* line numbers for lexers in combined grammars are now reported correctly.

September 29, 2006

* ANTLRReaderStream improperly checked for end of input.

September 28, 2006

* For ANTLRStringStream, LA(-1) was off by one...gave you LA(-2).

3.0b4 - August 24, 2006

* error when no rules in grammar. doesn't crash now.

* Token is now an interface.

* remove dependence on non runtime classes in runtime package.

* filename and grammar name must be same Foo in Foo.g. Generates FooParser,
  FooLexer, ... Combined grammar Foo generates Foo$Lexer.g which generates
  FooLexer.java. tree grammars generate FooTreeParser.java

August 24, 2006

* added C# target to lib, codegen, templates

August 11, 2006

* added tree arg to navigation methods in treeadaptor

August 07, 2006

* fixed bug related to (a|)+ on end of lexer rules. crashed instead
  of warning.
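
For the September 28 LA(-1) fix, the intended indexing can be sketched as
below. LookaheadSketch is a hypothetical stand-in written for this note,
not the ANTLRStringStream source. It assumes the usual convention that
LA(1) is the next unconsumed char and LA(-1) is the char just consumed.

```java
/** Hypothetical string stream showing positive and negative lookahead. */
class LookaheadSketch {
    static final int EOF = -1;
    private final String data;
    private int p = 0;                 // index of the next char to consume

    LookaheadSketch(String data) { this.data = data; }

    void consume() { if (p < data.length()) p++; }

    /** LA(1) is data[p]; LA(-1) is data[p-1]; LA(0) is undefined (returns 0). */
    int LA(int i) {
        if (i == 0) return 0;
        // There is no offset 0, so negative offsets need no extra -1 shift.
        int index = i > 0 ? p + i - 1 : p + i;
        if (index < 0 || index >= data.length()) return EOF;
        return data.charAt(index);
    }

    public static void main(String[] args) {
        LookaheadSketch s = new LookaheadSketch("abc");
        s.consume();
        s.consume();
        System.out.println((char) s.LA(1) + " " + (char) s.LA(-1));
    }
}
```

Applying the positive-offset formula (p + i - 1) to a negative i is one
way to get the off-by-one described above, since it would return the
char two positions back for LA(-1).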

* added warning that interpreter doesn't do synpreds yet

* allow different source of classloader:

    ClassLoader cl = Thread.currentThread().getContextClassLoader();
    if ( cl==null ) {
        cl = this.getClass().getClassLoader();
    }

July 26, 2006

* compressed DFA edge tables significantly. All edge tables are
  unique. The transition table can reuse arrays. Look like this now:

    public static readonly short[] DFA30_transition0 =
        new short[] { 46, 46, -1, 46, 46, -1, -1, -1, -1, -1, -1, -1,...};
    public static readonly short[] DFA30_transition1 =
        new short[] { 21 };
    public static readonly short[][] DFA30_transition = {
        DFA30_transition0,
        DFA30_transition0,
        DFA30_transition1,
        ...
    };

* If you defined both a label like EQ and '=', sometimes the '=' was
  used instead of the EQ label.
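
The row sharing in the July 26 DFA table entry can be sketched like this.
It shows one possible way to make identical edge tables share a single
array; SharedRowsSketch is a hypothetical name and ANTLR's code generator
may arrive at the same layout differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that makes equal rows of a table share one array. */
class SharedRowsSketch {
    static short[][] shareRows(short[][] table) {
        Map<List<Short>, short[]> unique = new HashMap<>();
        short[][] result = new short[table.length][];
        for (int state = 0; state < table.length; state++) {
            // Arrays compare by identity, so use a list of values as the key.
            List<Short> key = new ArrayList<>();
            for (short v : table[state]) key.add(v);
            short[] first = unique.putIfAbsent(key, table[state]);
            result[state] = first != null ? first : table[state];
        }
        return result;
    }

    public static void main(String[] args) {
        short[][] table = {
            { 46, 46, -1 },
            { 46, 46, -1 },
            { 21 },
        };
        short[][] shared = shareRows(table);
        System.out.println(shared[0] == shared[1]);   // same array object
    }
}
```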

* made headerFile template have same arg list as outputFile for consistency

* outputFile, lexer, genericParser, parser, treeParser templates
  reference cyclicDFAs attribute which was no longer used after I
  started the new table-based DFA. I made cyclicDFADescriptors
  argument to outputFile and headerFile (only). I think this is
  correct as only OO languages will want the DFA in the recognizer.
  At the top level, C and friends can use it. Changed name to use
  cyclicDFAs again as it's a better name probably. Removed parameter
  from the lexer, ... For example, my parser template says this now:

    <cyclicDFAs:cyclicDFA()> <! dump tables for all DFA !>

* made all token ref token types go thru code gen's
  getTokenTypeAsTargetLabel()

* no more computing DFA transition tables for acyclic DFA.

July 25, 2006

* fixed a place where I was adding syn predicates into rewrite stuff.

* turned off invalid token index warning in AW support; had a problem.

* bad location event generated with -debug for synpreds in autobacktrack mode.

July 24, 2006

* changed runtime.DFA so that it treats all chars and token types as
  char (unsigned 16 bit int). -1 becomes '\uFFFF' then or 65535.

* changed MAX_STATE_TRANSITIONS_FOR_TABLE to be 65534 by default
  now. This means that all states can use a table to do transitions.

* was not making synpreds on (C)* type loops with backtrack=true

* was copying tree stuff and actions into synpreds with backtrack=true

* was making synpreds on even single alt rules / blocks with backtrack=true

3.0b3 - July 21, 2006

* ANTLR fails to analyze complex decisions much less frequently. It
  turns out that the set of decisions for which ANTLR fails (times
  out) is the same set (so far) of non-LL(*) decisions. Moreover, I'm
  able to detect this situation quickly and report rather than timing
  out. Errors look like:

    java.g:468:23: [fatal] rule concreteDimensions has non-LL(*)
    decision due to recursive rule invocations in alts 1,2. Resolve
    by left-factoring or using syntactic predicates with fixed k
    lookahead or use backtrack=true option.

  This message only appears when k=*.
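
The July 24 note about treating chars and token types as char can be
checked with a few lines of plain Java. This is an illustration of the
language behavior only; UnsignedCharSketch and its methods are
hypothetical and are not taken from runtime.DFA.

```java
/** Hypothetical helpers showing why char works as an unsigned 16-bit entry. */
class UnsignedCharSketch {
    /** Narrow a token type or char code to char; -1 wraps to 65535. */
    static char toEntry(int type) { return (char) type; }

    /** Range test against char-typed min/max values. */
    static boolean inRange(int type, char min, char max) {
        char c = (char) type;
        return c >= min && c <= max;
    }

    public static void main(String[] args) {
        System.out.println((int) toEntry(-1));   // 65535, i.e. '\uFFFF'
        System.out.println((short) 40000);       // -25536: short is signed
    }
}
```

A short holding a value above 32767 turns negative, which breaks min/max
comparisons; a char never does.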

* Shortened no viable alt messages to not include decision
  description:

[compilationUnit, declaration]: line 8:8 decision=<<67:1: declaration
: ( ( fieldDeclaration )=> fieldDeclaration | ( methodDeclaration )=>
methodDeclaration | ( constructorDeclaration )=>
constructorDeclaration | ( classDeclaration )=> classDeclaration | (
interfaceDeclaration )=> interfaceDeclaration | ( blockDeclaration )=>
blockDeclaration | emptyDeclaration );>> state 3 (decision=14) no
viable alt; token=[@1,184:187='java',<122>,8:8]

  too long and hard to read.

July 19, 2006

* Code gen bug: states with no emanating edges were ignored by ST.
  Now an empty list is used.

* Added grammar parameter to recognizer templates so they can access
  properties like getName(), ...

July 10, 2006

* Fixed the gated pred merged state bug. Added unit test.

* added new method to Target: getTokenTypeAsTargetLabel()

July 7, 2006

* I was doing an AND instead of OR in the gated predicate stuff.
  Thanks to Stephen Kou!

* Reduce op for combining predicates was insanely slow sometimes and
  didn't actually work well. Now it's fast and works.

* There is a bug in merging of DFA stop states related to gated
  preds...turned it off for now.

3.0b2 - July 5, 2006

July 5, 2006

* token emission not properly protected in lexer filter mode.

* EOT, EOT DFA state transition tables should be init'd to -1 (only
  was doing this for compressed tables). Fixed.

* in trace mode, exit method not shown for memoized rules

* added -Xmaxdfaedges to allow you to increase number of edges allowed
  for a single DFA state before it becomes "special" and can't fit in
  a simple table.

* Bug in tables. Shorts are signed so min/max tables for DFA are now
  char[]. Bizarre.

July 3, 2006

* Added a method to reset the tool error state for current thread.
  See ErrorManager.java

* [Got this working properly today] backtrack mode that lets you type
  in any old crap and ANTLR will backtrack if it can't figure out what
  you meant.
  No errors are reported by ANTLR during analysis. It
  implicitly adds a syn pred in front of every production, using them
  only if static grammar LL(*) analysis fails. Syn pred code is not
  generated if the pred is not used in a decision.

  This is essentially a rapid prototyping mode.

* Added backtracking report to the -report option

* Added NFA->DFA conversion early termination report to the -report option

* Added grammar level k and backtrack options to -report

* Added a dozen unit tests to test autobacktrack NFA construction.

* If you are using filter mode, you must manually use option
  memoize=true now.

July 2, 2006

* Added k=* option so you can set k=2, for example, on whole grammar,
  but an individual decision can be LL(*).

* memoize option for grammars, rules, blocks. Remove -nomemo cmd-line option

* bug in DOT generator for DFA; fixed.

* runtime.DFA reported errors even when backtracking

July 1, 2006

* Added -X option list to help

* Syn preds were being hoisted into other rules, causing lots of extra
  backtracking.

June 29, 2006

* unnecessary files removed during build.

* Matt Benson updated build.xml

* Detecting use of synpreds in analysis now instead of codegen.  In
  this way, I can avoid analyzing decisions in synpreds for synpreds
  not used in a DFA for a real rule.  This is used to optimize things
  for backtrack option.

* Code gen must add _fragment or whatever to end of pred name in
  template synpredRule to avoid having ANTLR know anything about
  method names.

* Added -IdbgST option to emit ST delimiters at start/stop of all
  templates spit out.

June 28, 2006

* Tweaked message when ANTLR cannot handle analysis.

3.0b1 - June 27, 2006

June 24, 2006

* syn preds no longer generate little static classes; they also don't
  generate a whole bunch of extra crap in the rules built to test syn
  preds.  Removed GrammarFragmentPointer class from runtime.

June 23-24, 2006

* added output option to -report output.

* added profiling info:
  Number of rule invocations in "guessing" mode
  number of rule memoization cache hits
  number of rule memoization cache misses

* made DFA DOT diagrams go left to right not top to bottom

* I try to resolve recursive overflow states now by using
  semantic/syntactic predicates if they exist.  The DFA is then
  deterministic rather than simply resolving by choosing first
  nondeterministic alt.
  I used to generate errors:

~/tmp $ java org.antlr.Tool -dfa t.g
ANTLR Parser Generator Early Access Version 3.0b2 (July 5, 2006) 1989-2006
t.g:2:5: Alternative 1: after matching input such as A A A A A decision cannot predict what comes next due to recursion overflow to b from b
t.g:2:5: Alternative 2: after matching input such as A A A A A decision cannot predict what comes next due to recursion overflow to b from b

  Now, I use predicates if available and emit no warnings.

* made sem preds share accept states.  Previously, multiple preds in a
  decision forked new accepts each time for each nondet state.

June 19, 2006

* Need parens around the prediction expressions in templates.

* Referencing $ID.text in an action forced bad code gen in lexer rule ID.

* Fixed a bug in how predicates are collected.  The definition of
  "last predicated alternative" was incorrect in the analysis.  Further,
  gated predicates incorrectly missed a case where an edge should become
  true (a tautology).

* Removed an unnecessary input.consume() reference in the runtime/DFA class.

June 14, 2006

* -> ($rulelabel)? didn't generate proper code for ASTs.

* bug in code gen (did not compile)
a : ID -> ID
  | ID -> ID
  ;
Problem is repeated ref to ID from left side.  Juergen pointed this out.

* use of tokenVocab with missing file yielded exception

* (A|B)=> foo yielded an exception as (A|B) is a set not a block. Fixed.

* Didn't set ID1= and INT1= for this alt:
  | ^(ID INT+ {System.out.print(\"^(\"+$ID+\" \"+$INT+\")\");})

* Fixed so repeated dangling state errors only occur once like:
t.g:4:17: the decision cannot distinguish between alternative(s) 2,1 for at least one input sequence

* tracking of rule elements was on (making list defs at start of
  method) with templates instead of just with ASTs.  Turned off.

* Doesn't crash when you give it a missing file now.

* -report: add output info: how many LL(1) decisions.

June 13, 2006

* ^(ROOT ID?) Didn't work; nor did any other nullable child list such as
  ^(ROOT ID* INT?).  Now, I check to see if child list is nullable using
  Grammar.LOOK() and, if so, I generate an "IF lookahead is DOWN" gate
  around the child list so the whole thing is optional.

* Fixed a bug in LOOK that made it not look through nullable rules.

* Using AST suffixes or -> rewrite syntax now gives an error w/o a grammar
  output option.  Used to crash ;)

* References to EOF ended up with improper -1 refs instead of EOF in output.

* didn't warn of ambig ref to $expr in rewrite; fixed.
list
    :   '[' expr 'for' type ID 'in' expr ']'
        -> comprehension(expr={$expr.st},type={},list={},i={})
    ;

June 12, 2006

* EOF works in the parser as a token name.

* Rule b:(A B?)*; didn't display properly in AW due to the way ANTLR
  generated NFA.

* "scope x;" in a rule for unknown x gives no error.  Fixed.  Added unit test.

* Label type for refs to start/stop in tree parser and other parsers were
  not used.  Lots of casting.  Ick.  Fixed.

* couldn't refer to $tokenlabel in isolation; but need so we can test if
  something was matched.  Fixed.

* Lots of little bugs fixed in $x.y, %... translation due to new
  action translator.

* Improperly tracking block nesting level; result was that you couldn't
  see $ID in action of rule "a : A+ | ID {Token t = $ID;} | C ;"

* a : ID ID {$ID.text;} ; did not get a warning about ambiguous $ID ref.

* No error was found on $COMMENT.text:

COMMENT
    :   '/*' (options {greedy=false;} : . )* '*/'
        {System.out.println("found method "+$COMMENT.text);}
    ;

  $enclosinglexerrule scope does not exist.  Use text or setText() here.

June 11, 2006

* Single return values are initialized now to default or to your spec.

* cleaned up input stream stuff.  Added ANTLRReaderStream, ANTLRInputStream
  and refactored.  You can now specify encodings on ANTLRFileStream (and
  ANTLRInputStream).

* You can set text local var now in a lexer rule and token gets that text.
  start/stop indexes are still set for the token.

* Changed lexer slightly.  Calling a nonfragment rule from a
  nonfragment rule does not set the overall token.

June 10, 2006

* Fixed bug where unnecessary escapes yield char==0 like '\{'.

* Fixed analysis bug.  This grammar didn't report a recursion warning:
x   : y X
    | y Y
    ;
y   : L y R
    | B
    ;
  The DFAState.equals() method was messed up.

* Added @synpredgate {...} action so you can tell ANTLR how to gate actions
  in/out during syntactic predicate evaluation.

* Fuzzy parsing should be more efficient.  It should backtrack over a rule
  and then rewind and do it again "with feeling" to exec actions.  It was
  actually doing it 3x not 2x.

June 9, 2006

* Gutted and rebuilt the action translator for $x.y, $x::y, ...
  Uses ANTLR v3 now for the first time inside v3 source. :)
  ActionTranslator.java

* Fixed a bug where referencing a return value on a rule didn't work
  because later a ref to that rule's predefined properties didn't
  properly force a return value struct to be built.  Added unit test.

June 6, 2006

* New DFA mechanisms.
  Cyclic DFA are implemented as state tables,
  encoded via strings as java cannot handle large static arrays :(
  States with edges emanating that have predicates are specially
  treated.  A method is generated to do these states.  The DFA
  simulation routine uses the "special" array to figure out if the
  state is special.  See March 25, 2006 entry for description:
  http://www.antlr.org/blog/antlr3/codegen.tml.  analysis.DFA now has
  all the state tables generated for code gen.  CyclicCodeGenerator.java
  disappeared as it's unneeded code. :)

* Internal general clean up of the DFA.states vs uniqueStates thing.
  Fixed lookahead decisions no longer fill uniqueStates.  Waste of
  time.  Also noted that when adding sem pred edges, I didn't check
  for state reuse.  Fixed.

June 4, 2006

* When resolving ambig DFA states predicates, I did not add the new states
  to the list of unique DFA states.  No observable effect on output except
  that DFA state numbers were not always contiguous for predicated decisions.
  I needed this fix for new DFA tables.

3.0ea10 - June 2, 2006

June 2, 2006

* Improved grammar stats and added syntactic pred tracking.
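
The June 6 note about DFA state tables "encoded via strings" can be
illustrated with a small sketch.  The run-length scheme below (pairs of
count and value characters) is an assumption chosen for illustration, and
the class and method names are hypothetical; the note itself does not spell
out the encoding.

```java
/** Sketch: a DFA table packed into a String and expanded to short[] at load time. */
final class PackedTable {
    /** Each pair of chars is (repeat count, value); the pairs expand into a short[]. */
    static short[] unpack(String encoded) {
        // First pass: add up the repeat counts to size the table.
        int size = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            size += encoded.charAt(i);
        }
        short[] table = new short[size];
        int out = 0;
        // Second pass: write each value 'count' times.
        for (int i = 0; i < encoded.length(); i += 2) {
            char count = encoded.charAt(i);
            char value = encoded.charAt(i + 1);
            for (int j = 0; j < count; j++) {
                table[out++] = (short) value; // '\uffff' becomes -1 as a signed short
            }
        }
        return table;
    }
}
```

A string constant lives in the class file's constant pool, whereas a large
array initializer is compiled into code inside the static initializer, which
is where the size limit bites.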

June 1, 2006

* Due to a type mismatch, the DebugParser.recoverFromMismatchedToken()
  method was not called.  Debug events for mismatched token error
  notification were not sent to ANTLRWorks probably

* Added getBacktrackingLevel() for any recognizer; needed for profiler.

* Only writes profiling data for antlr grammar analysis with -profile set

* Major update and bug fix to (runtime) Profiler.

May 27, 2006

* Added Lexer.skip() to force lexer to ignore current token and look for
  another; no token is created for current rule and is not passed on to
  parser (or other consumer of the lexer).

* Parsers are much faster now.  I removed use of java.util.Stack for pushing
  follow sets and use a hardcoded array stack instead.  Dropped from
  5900ms to 3900ms for parse+lex time parsing entire java 1.4.2 source.  Lex
  time alone was about 1500ms.  Just looking at parse time, we get about 2x
  speed improvement. :)

May 26, 2006

* Fixed NFA construction so it generates NFA for (A*)* such that ANTLRWorks
  can display it properly.
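
The May 27 change from java.util.Stack to a hardcoded array stack can be
sketched roughly as follows.  This is an illustration only: the class name,
the initial capacity and the use of Object for the follow sets are
assumptions, not the runtime's actual code.

```java
/** Sketch: a plain array stack standing in for java.util.Stack. */
final class FollowStack {
    private Object[] items = new Object[100]; // hypothetical initial capacity
    private int top = -1;                     // index of the current top, -1 when empty

    void push(Object followSet) {
        if (top + 1 >= items.length) {
            // Grow by doubling; no synchronization, unlike java.util.Stack.
            items = java.util.Arrays.copyOf(items, items.length * 2);
        }
        items[++top] = followSet;
    }

    Object pop() {
        if (top < 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object result = items[top];
        items[top--] = null; // drop the reference so it can be collected
        return result;
    }

    int size() {
        return top + 1;
    }
}
```

java.util.Stack extends Vector, so every push and pop is synchronized; an
unsynchronized array avoids that cost on a path that runs for every rule
invocation.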

May 25, 2006

* added abort method to Grammar so AW can terminate the conversion if it's
  taking too long.

May 24, 2006

* added method to get left recursive rules from grammar without doing full
  grammar analysis.

* analysis, code gen not attempted if serious error (like
  left-recursion or missing rule definition) occurred while reading
  the grammar in and defining symbols.

* added amazing optimization; reduces analysis time by 90% for java
  grammar; simple IF statement addition!

3.0ea9 - May 20, 2006

* added global k value for grammar to limit lookahead for all decisions unless
  overridden in a particular decision.

* added failsafe so that any decision taking longer than 2 seconds to create
  the DFA will fall back on k=1.  Use -ImaxtimeforDFA n (in ms) to set the time.

* added an option (turned off for now) to use multiple threads to
  perform grammar analysis.  Not much help on a 2-CPU computer as
  garbage collection seems to peg the 2nd CPU already.
  :( Gotta wait for
  a 4 CPU box ;)

* switched from #src to // $ANTLR src directive.

* CommonTokenStream.getTokens() looked past end of buffer sometimes. fixed.

* unicode literals didn't really work in DOT output and generated code. fixed.

* fixed the unit test rig so it compiles nicely with Java 1.5

* Added ant build.xml file (reads build.properties file)

* predicates sometimes failed to compile/eval properly due to missing (...)
  in IF expressions.  Forced (..)

* (...)? with only one alt were not optimized.  Was:

        // t.g:4:7: ( B )?
        int alt1=2;
        int LA1_0 = input.LA(1);
        if ( LA1_0==B ) {
            alt1=1;
        }
        else if ( LA1_0==-1 ) {
            alt1=2;
        }
        else {
            NoViableAltException nvae =
                new NoViableAltException("4:7: ( B )?", 1, 0, input);
            throw nvae;
        }

is now:

        // t.g:4:7: ( B )?
        int alt1=2;
        int LA1_0 = input.LA(1);
        if ( LA1_0==B ) {
            alt1=1;
        }

  Smaller, faster and more readable.

* Allow manual init of return values now:
  functionHeader returns [int x=3*4, char (*f)()=null] : ... ;

* Added optimization for DFAs that fixed a codegen bug with rules in lexer:
   EQ       : '=' ;
   ASSIGNOP : '=' | '+=' ;
  EQ is a subset of other rule.  It did not give an error which is
  correct, but generated bad code.

* ANTLR was sending column not char position to ANTLRWorks.

* Bug fix: location 0, 0 emitted for synpreds and empty alts.

* debugging event handshake now sends grammar file name.  Added
  getGrammarFileName() to recognizers.  Java.stg generates it:

    public String getGrammarFileName() { return "<fileName>"; }

* tree parsers can do arbitrary lookahead now including backtracking.  I
  updated CommonTreeNodeStream.

* added events for debugging tree parsers:

    /** Input for a tree parser is an AST, but we know nothing for sure
     *  about a node except its type and text (obtained from the adaptor).
     *  This is the analog of the consumeToken method.  Again, the ID is
     *  the hashCode usually of the node so it only works if hashCode is
     *  not implemented.
     */
    public void consumeNode(int ID, String text, int type);

    /** The tree parser looked ahead */
    public void LT(int i, int ID, String text, int type);

    /** The tree parser has popped back up from the child list to the
     *  root node.
     */
    public void goUp();

    /** The tree parser has descended to the first child of the current
     *  root node.
     */
    public void goDown();

* Added DebugTreeNodeStream and DebugTreeParser classes

* Added ctor because the debug tree node stream will need to ask questions
  about nodes and since nodes are just Object, it needs an adaptor to
  decode the nodes and get text/type info for the debugger.

public CommonTreeNodeStream(TreeAdaptor adaptor, Tree tree);

* added getter to TreeNodeStream:
  public TreeAdaptor getTreeAdaptor();

* Implemented getText/getType in CommonTreeAdaptor.

* Added TraceDebugEventListener that can dump all events to stdout.
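
A tracing listener for the four tree-parser events listed above might look
roughly like this.  It is a sketch under stated assumptions: the interface
TreeParserEvents and the class TraceTreeListener are hypothetical stand-ins
that carry only those four methods, not the runtime's real listener
interface or TraceDebugEventListener itself.

```java
/** Hypothetical interface holding only the four tree-parser events listed above. */
interface TreeParserEvents {
    void consumeNode(int ID, String text, int type);
    void LT(int i, int ID, String text, int type);
    void goUp();
    void goDown();
}

/** Sketch of a tracing listener: writes one line of text per event. */
final class TraceTreeListener implements TreeParserEvents {
    private final java.io.PrintStream out;

    /** Pass System.out to dump to stdout, or any other stream for testing. */
    TraceTreeListener(java.io.PrintStream out) {
        this.out = out;
    }

    public void consumeNode(int ID, String text, int type) {
        out.print("consumeNode " + ID + " " + text + " " + type + "\n");
    }

    public void LT(int i, int ID, String text, int type) {
        out.print("LT " + i + " " + ID + " " + text + " " + type + "\n");
    }

    public void goUp() {
        out.print("goUp\n");
    }

    public void goDown() {
        out.print("goDown\n");
    }
}
```

Taking the stream as a constructor argument keeps the listener usable both
for console tracing (new TraceTreeListener(System.out)) and for capturing
the event sequence in a buffer.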

* I broke down and made Tree implement getText

* tree rewrites now gen location debug events.

* added AST debug events to listener; added blank listener for convenience

* updated debug events to send begin/end backtrack events for debugging

* with a : (b->b) ('+' b -> ^(PLUS $a b))* ; you get b[0] each time as
  there is no loop in rewrite rule itself.  Need to know context that
  the -> is inside the rule and hence b means last value of b not all
  values.

* Bug in TokenRewriteStream; ops at indexes < start index blocked proper op.

* Actions in ST rewrites "-> ({$op})()" were not translated

* Added new action name:

@rulecatch {
catch (RecognitionException re) {
    reportError(re);
    recover(input,re);
}
catch (Throwable t) {
    System.err.println(t);
}
}
Overrides rule catch stuff.

* Isolated $ refs caused exception

3.0ea8 - March 11, 2006

* added @finally {...} action like @init for rules.
  Executes in
  finally block (java target) after all other stuff like rule memoization.
  No code changes needed; ST just refs a new action:
      <ruleDescriptor.actions.finally>

* hideous bug fixed: PLUS='+' didn't result in '+' rule in lexer

* TokenRewriteStream didn't do toString() right when no rewrites had been done.

* lexer errors in interpreter were not printed properly

* bitsets are dumped in hex not decimal now for FOLLOW sets

* /* epsilon */ is not printed now when printing out grammars with empty alts

* Fixed another bug in tree rewrite stuff where it was checking that elements
  had at least one element.  Strange...commented out for now to see if I can
  remember what's up.

* Tree rewrites had problems when you didn't have x+=FOO variables.  Rules
  like this work now:

  a : (x=ID)? y=ID -> ($x $y)?;

* filter=true for lexers turns on k=1 and backtracking for every token
  alternative.  Put the rules in priority order.

* added getLine() etc... to Tree to support better error reporting for
  trees.  Added MismatchedTreeNodeException.

* $templates::foo() is gone.  added % as special template symbol.
  %foo(a={},b={},...) ctor (even shorter than $templates::foo(...))
  %({name-expr})(a={},...) indirect template ctor reference

  The above are parsed by antlr.g and translated by codegen.g
  The following are parsed manually here:

  %{string-expr} anonymous template from string expr
  %{expr}.y = z; template attribute y of StringTemplate-typed expr to z
  %x.y = z; set template attribute y of x (always set never get attr)
            to z [languages like python without ';' must still use the
            ';' which the code generator is free to remove during code gen]

* -> ({expr})(a={},...) notation for indirect template rewrite.
  expr is the name of the template.

* $x[i]::y and $x[-i]::y notation for accessing absolute scope stack
  indexes and relative negative scopes.  $x[-1]::y is the y attribute
  of the previous scope (stack top - 1).

* filter=true mode for lexers; can do this now...upon mismatch, just
  consumes a char and tries again:
lexer grammar FuzzyJava;
options {filter=true;}

FIELD
    :   TYPE WS? name=ID WS?
        (';'|'=')
        {System.out.println("found var "+$name.text);}
    ;

* refactored char streams so ANTLRFileStream is now a subclass of
  ANTLRStringStream.

* char streams for lexer now allow nested backtracking in lexer.

* added TokenLabelType for lexer/parser for all token labels

* line numbers for error messages were not updated properly in antlr.g
  for strings, char literals and <<...>>

* init action in lexer rules was before the type,start,line,... decls.

* Tree grammars can now specify output; I've only tested output=template
  though.

* You can reference EOF now in the parser and lexer.  It's just token type
  or char value -1.

* Bug fix: $ID refs in the *lexer* were all messed up.  Cleaned up the
  set of properties available...

* Bug fix: .st not found in rule ref when rule has scope:
field
scope {
    StringTemplate funcDef;
}
    :   ...
        {$field::funcDef = $field.st;}
    ;
it gets field_stack.st instead

* return in backtracking must return retval or null if return value.

* $property within a rule now works like $text, $st, ...

* AST/Template Rewrites were not gated by backtracking==0 so they
  executed even when guessing.  Auto AST construction is now gated also.

* CommonTokenStream was somehow returning tokens not text in toString()

* added useful methods to runtime.BitSet and also to CommonToken so you can
  update the text.  Added nice Token stream method:

  /** Given a start and stop index, return a List of all tokens in
   *  the token type BitSet.  Return null if no tokens were found.  This
   *  method looks at both on and off channel tokens.
   */
  public List getTokens(int start, int stop, BitSet types);

* literals are now passed in the .tokens files so you can ref them in
  tree parses, for example.
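
The contract described for getTokens(int start, int stop, BitSet types)
above can be sketched as below.  This is only an illustration of the
documented behavior: java.util.BitSet stands in for the runtime's own
BitSet, and the Tok class, the explicit buffer parameter and the clamping
of the index range are assumptions made so the example is self-contained.

```java
/** Sketch of the getTokens(start, stop, types) contract over a plain list. */
final class TokenTypeFilter {
    /** Hypothetical minimal token; a real runtime token carries much more. */
    static final class Tok {
        final int type;
        final String text;

        Tok(int type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    /** Returns tokens in [start, stop] whose type is in 'types', or null if none. */
    static java.util.List<Tok> getTokens(java.util.List<Tok> buffer,
                                         int start, int stop,
                                         java.util.BitSet types) {
        if (start < 0) start = 0;                            // clamp (assumption)
        if (stop >= buffer.size()) stop = buffer.size() - 1; // clamp (assumption)
        java.util.List<Tok> found = new java.util.ArrayList<Tok>();
        for (int i = start; i <= stop; i++) {
            Tok t = buffer.get(i);
            // Negative types such as EOF (-1) cannot be members of a java.util.BitSet.
            if (t.type >= 0 && types.get(t.type)) {
                found.add(t);
            }
        }
        return found.isEmpty() ? null : found;
    }
}
```

Because nothing here looks at channels, tokens are returned whether they
are on or off channel, matching the doc comment.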

* added basic exception handling; no labels, just general catches:

a : {;}A | B ;
        exception
                catch[RecognitionException re] {
                        System.out.println("recog error");
                }
                catch[Exception e] {
                        System.out.println("error");
                }

* Added method to TokenStream:
  public String toString(Token start, Token stop);

* antlr generates #src lines in lexer grammars generated from combined grammars
  so error messages refer to original file.

* lexers generated from combined grammars now use original formatting.

* predicates have $x.y stuff translated now.  Warning: predicates might be
  hoisted out of context.

* return values in return val structs are now public.

* output=template with return values on rules was broken.  I assume return values with ASTs was broken too.  Fixed.

3.0ea7 - December 14, 2005

* Added -print option to print out grammar w/o actions

* Renamed BaseParser to be BaseRecognizer and even made Lexer derive from
  this; nice as it now shares backtracking support code.

* Added syntactic predicates (...)=>.  See December 4, 2005 entry:

  http://www.antlr.org/blog/antlr3/lookahead.tml

  Note that we have a new option for turning off rule memoization during
  backtracking:

  -nomemo     when backtracking don't generate memoization code

* Predicates are now tested in order that you specify the alts.  If you
  leave the last alt "naked" (w/o pred), it will assume a true pred rather
  than union of other preds.

* Added gated predicates "{p}?=>" that literally turn off a production whereas
disambiguating predicates are only hoisted into the predictor when syntax alone
is not sufficient to uniquely predict alternatives.

A : {p}?  => "a" ;
B : {!p}? => ("a"|"b")+ ;

* bug fixed related to predicates in predictor
lexer grammar w;
A : {p}? "a" ;
B : {!p}? ("a"|"b")+ ;
DFA is correct.  A state splits for input "a" on the pred.
Generated code though was hosed.  No pred tests in prediction code!
I added testLexerPreds() and others in TestSemanticPredicateEvaluation.java

* added execAction template in case we want to do something in front of
  each action execution or something.

* left-recursive cycles from rules w/o decisions were not detected.

* undefined lexer rules were not announced! fixed.

* unreachable messages for Tokens rule now indicate rule name not alt. E.g.,

  Ruby.lexer.g:24:1: The following token definitions are unreachable: IVAR

* nondeterminism warnings improved for Tokens rule:

Ruby.lexer.g:10:1: Multiple token rules can match input such as ""0".."9"": INT, FLOAT
As a result, tokens(s) FLOAT were disabled for that input


* DOT diagrams didn't show escaped char properly.

* Char/string literals are now all 'abc' not "abc".

* action syntax changed "@scope::actionname {action}" where scope defaults
  to "parser" if parser grammar or combined grammar, "lexer" if lexer grammar,
  and "treeparser" if tree grammar.  The code generation targets decide
  what scopes are available.  Each "scope" yields a hashtable for use in
  the output templates.
  The scopes full of actions are sent to all output
  file templates (currently headerFile and outputFile) as attribute actions.
  Then you can reference <actions.scope> to get the map of actions associated
  with scope and <actions.parser.header> to get the parser's header action
  for example.  This should be very flexible.  The target should only have
  to define which scopes are valid, but the action names should be variable
  so we don't have to recompile ANTLR to add actions to code gen templates.

  grammar T;
  options {language=Java;}
  @header { package foo; }
  @parser::stuff { int i; } // names within scope not checked; target dependent
  @members { int i; }
  @lexer::header {head}
  @lexer::members { int j; }
  @headerfile::blort {...} // error: this target doesn't have headerfile
  @treeparser::members {...} // error: this is not a tree parser
  a
  @init {int i;}
    : ID
    ;
  ID : 'a'..'z';

  For now, the Java target accepts members and header as valid names.
  Within a rule, the init action name is valid.

* changed $dynamicscope.value to $dynamicscope::value even if value is defined
  in same rule such as $function::name where rule function defines name.
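
One way to picture what $function::name refers to is a stack with one
entry per active invocation of rule function, where the reference reads
the innermost entry.  The sketch below is only a mental model under that
assumption; DynamicScopeSketch, FunctionScope and the method names are
hypothetical and are not ANTLR's generated code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class DynamicScopeSketch {
    /** Hypothetical scope record for a rule called "function". */
    static final class FunctionScope {
        String name;
    }

    /** One stack per scoped rule; each invocation pushes a fresh scope. */
    private final Deque<FunctionScope> functionStack = new ArrayDeque<>();

    /** Stands in for a rule invoked below "function" that reads $function::name. */
    String statement() {
        return "in " + functionStack.peek().name;
    }

    /**
     * Stands in for rule "function".  If nestedName is non-null the rule
     * invokes itself once, which pushes a second scope on top of this one.
     */
    String function(String name, String nestedName) {
        functionStack.push(new FunctionScope());
        try {
            functionStack.peek().name = name; // like $function::name = ...
            String trace = "";
            if (nestedName != null) {
                trace = function(nestedName, null) + "; ";
            }
            // The nested scope is gone by now, so this sees our own name again.
            return trace + statement();
        } finally {
            functionStack.pop(); // the scope vanishes when the rule returns
        }
    }

    int depth() {
        return functionStack.size();
    }

    public static void main(String[] args) {
        System.out.println(new DynamicScopeSketch().function("outer", "inner"));
    }
}
```

The try/finally keeps the stack balanced even if the rule body throws,
which matters because any rule further down the call chain trusts the
top of the stack.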

* $dynamicscope gets you the stack

* rule scopes go like this now:

  rule
  scope {...}
  scope slist,Symbols;
        : ...
        ;

* Created RuleReturnScope as a generic rule return value.  Makes it easier
  to do this:
    RuleReturnScope r = parser.program();
    System.out.println(r.getTemplate().toString());

* $template, $tree, $start, etc...

* $r.x in current rule.  $r is ignored as fully-qualified name. $r.start works too

* added warning about $r referring to both return value of rule and dynamic scope of rule

* integrated StringTemplate in a very simple manner

Syntax:
-> template(arglist) "..."
-> template(arglist) <<...>>
-> namedTemplate(arglist)
-> {free expression}
-> // empty

Predicate syntax:
a : A B -> {p1}? foo(a={$A.text})
        -> {p2}? foo(a={$B.text})
        -> // return nothing

An arg list is just a list of template attribute assignments to actions in curlies.

There is a setTemplateLib() method for you to use with named template rewrites.

Use a new option:

grammar t;
options {output=template;}
...

This all should work for tree grammars too, but I'm still testing.

* fixed bugs where strings were improperly escaped in exceptions, comments, etc.  For example, newlines came out as newlines not the escaped version

3.0ea6 - November 13, 2005

* turned off -debug/-profile, which was on by default

* completely refactored the output templates; added some missing templates.

* dramatically improved infinite recursion error messages (actually
  left-recursion never even was printed out before).

* wasn't printing dangling state messages when it reanalyzes with k=1.

* fixed a nasty bug in the analysis engine dealing with infinite recursion.
  Spent all day thinking about it and cleaned up the code dramatically.
  Bug fixed and software is more powerful and I understand it better! :)

* improved verbose DFA nodes; organized by alt

* got much better random phrase generation.
  For example:

  $ java org.antlr.tool.RandomPhrase simple.g program
  int Ktcdn ';' method wh '(' ')' '{' return 5 ';' '}'

* empty rules like "a : ;" generated code that didn't compile due to
  try/catch for RecognitionException.  Generated code couldn't possibly
  throw that exception.

* when printing out a grammar, such as in comments in generated code,
  ANTLR didn't print ast suffix stuff back out for literals.

* This never exited loop:
  DATA : (options {greedy=false;}: .* '\n' )* '\n' '.' ;
  and now it works due to new default nongreedy .*  Also this works:
  DATA : (options {greedy=false;}: .* '\n' )* '.' ;

* Dot star ".*" syntax didn't work; in lexer it is nongreedy by
  default.  In parser it is greedy but also k=1 by default.  Added
  unit tests.  Added blog entry to describe.
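
For readers unsure what greedy versus nongreedy ".*" means in practice,
here is an analogy using java.util.regex.  This is a swapped-in
technique, not ANTLR: ANTLR decides when to leave the loop from
lookahead rather than by regex backtracking, and GreedySketch is a
hypothetical name.  Only the observable difference carries over.  A
greedy loop runs to the last possible terminator, a nongreedy one stops
at the first.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedySketch {
    /** Return the first match of regex in input, or null if there is none. */
    static String firstMatch(String regex, String input) {
        // DOTALL lets '.' match newlines too, like a lexer wildcard.
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "/* a */ x /* b */";
        // Greedy: .* swallows everything up to the LAST "*/".
        System.out.println(firstMatch("/\\*.*\\*/", input));  // prints "/* a */ x /* b */"
        // Nongreedy: .*? stops at the FIRST "*/".
        System.out.println(firstMatch("/\\*.*?\\*/", input)); // prints "/* a */"
    }
}
```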

* ~T where T is the only token yielded an empty set but no error

* Used to generate unreachable message here:

  parser grammar t;
  a : ID a
    | ID
    ;

  z.g:3:11: The following alternatives are unreachable: 2

  In fact it should really be an error; now it generates:

  no start rule in grammar t (no rule can obviously be followed by EOF)

  Per next change item, ANTLR cannot know that EOF follows rule 'a'.

* added error message indicating that ANTLR can't figure out what your
  start rule is.  Required to properly generate code in some cases.

* validating semantic predicates now work (if they are false, they
  throw a new FailedPredicateException)

* two hideous bug fixes in the IntervalSet, which made analysis go wrong
  in a few cases.  Thanks to Oliver Zeigermann for finding lots of bugs
  and making suggested fixes (including the next two items)!

* cyclic DFAs are now nonstatic and hence can access instance variables

* labels are now allowed on lexical elements (in the lexer)

* added some internal debugging options

* ~'a'* and ~('a')* were not working properly; refactored antlr.g grammar

3.0ea5 - July 5, 2005

* Using '\n' in a parser grammar resulted in a nonescaped version of '\n' in the token names table making compilation fail.  I fixed this by reorganizing/cleaning up portion of ANTLR that deals with literals.  See comment org.antlr.codegen.Target.

* Target.getMaxCharValue() did not use the appropriate max value constant.

* ALLCHAR was a constant when it should use the Target max value def.  set complement for wildcard also didn't use the Target def.  Generally cleaned up the max char value stuff.

* Code gen didn't deal with ASTLabelType properly...I think even the 3.0ea7 example tree parser was broken! :(

* Added a few more unit tests dealing with escaped literals

3.0ea4 - June 29, 2005

* tree parsers work; added CommonTreeNodeStream.  See simplecTreeParser
  example in examples-v3 tarball.

* added superClass and ASTLabelType options

* refactored Parser to have a BaseParser and added TreeParser

* bug fix: actions being dumped in description strings; compile errors
  resulted

3.0ea3 - June 23, 2005

Enhancements

* Automatic tree construction operators are in: ! ^ ^^

* Tree construction rewrite rules are in
        -> {pred1}? rewrite1
        -> {pred2}? rewrite2
        ...
        -> rewriteN

  The rewrite rules may be elements like ID, expr, $label, {node expr}
  and trees ^( <root> <children> ).  You can have (...)?, (...)*, (...)+
  subrules as well.

  You may have rewrites in subrules not just at outer level of rule, but
  any -> rewrite forces auto AST construction off for that alternative
  of that rule.

  To avoid cycles, copy semantics are used:

  r : INT -> INT INT ;

  means make two new nodes from the same INT token.

  Repeated references to a rule element imply a copy for at least one
  tree:

  a : atom -> ^(atom atom) ; // NOT CYCLE! (dup atom tree)

* $ruleLabel.tree refers to tree created by matching the labeled element.

* A description of the blocks/alts is generated as a comment in output code

* A timestamp / signature is put at top of each generated code file

3.0ea2 - June 12, 2005

Bug fixes

* Some error messages were missing the stackTrace parameter

* Removed the file locking mechanism as it's not cross platform

* Some absolute vs relative path name problems with writing output
  files.  Rules are now more concrete.  -o option takes precedence
  // -o /tmp /var/lib/t.g => /tmp/T.java
  // -o subdir/output /usr/lib/t.g => subdir/output/T.java
  // -o . /usr/lib/t.g => ./T.java
  // -o /tmp subdir/t.g => /tmp/subdir/T.java
  // If they didn't specify a -o dir, just write to location
  // where grammar is, absolute or relative

* does error checking on unknown option names now

* Using just language code not locale name for error message file.  I.e.,
  the default (and for any English speaking locale) is en.stg not en_US.stg
  anymore.

* The error manager now asks the Tool to panic rather than simply doing
  a System.exit().

* Lots of refactoring concerning grammar, rule, subrule options.  Now
  detects invalid options.

3.0ea1 - June 1, 2005

Initial early access release
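
For reference, the -o rules listed under 3.0ea2 can be restated as a
small function that picks the output directory.  This is a sketch of the
rules as written in that entry, not the Tool's actual code:
OutputDirSketch and outputDirectory are hypothetical names, Unix-style
paths are assumed, and falling back to "." for a bare grammar file name
with no -o is an assumption of the sketch.

```java
import java.io.File;

final class OutputDirSketch {
    /**
     * Choose the directory for generated files.
     *
     * @param outDir      the -o argument, or null when -o was not given
     * @param grammarPath path to the grammar file, absolute or relative
     */
    static String outputDirectory(String outDir, String grammarPath) {
        File grammar = new File(grammarPath);
        String grammarDir = grammar.getParent(); // null if no directory part
        if (outDir == null) {
            // No -o: write next to the grammar (assumed "." for a bare name).
            return grammarDir == null ? "." : grammarDir;
        }
        if (grammar.isAbsolute() || grammarDir == null) {
            // Absolute grammar path: -o wins outright.
            return outDir;
        }
        // Relative grammar path: keep its subdirectory under the -o dir.
        return new File(outDir, grammarDir).getPath();
    }

    public static void main(String[] args) {
        System.out.println(outputDirectory("/tmp", "/var/lib/t.g"));
        System.out.println(outputDirectory("/tmp", "subdir/t.g"));
        System.out.println(outputDirectory(null, "/usr/lib/t.g"));
    }
}
```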