Should I learn ANTLR 4 for Java or is there a parsing framework for Rust?

I have the syntax of my language fully specified: Overview - Jet Language Specification. It is a fusion of features from ActionScript 3 and ECMA-262 2018.

I have never used a parser generator, so I should probably stick with ANTLR 4 since it’s the most mature parsing tool, right? For Rust, I’ve looked at LALRPOP and it seems immature.

… Or is Racket viable?

Ah, to be honest I’m finding Racket’s syntax too complex… Guess I’ll end up handwriting the parser again…

If you are willing to try another, relatively new parser generator for Rust, I have been working on Rustemo for quite some time.

The docs are fairly complete and the test coverage is good. You can find a lot of usage examples.
There is a comprehensive tutorial in the Rustemo book.

There are also some options in Rust based on parser combinators, which might be a better fit. But for old-school compiler development I usually stick with (G)LR.


BTW, something is weird with the styling of the Jet docs. I see you are using mdBook but can’t figure out what the problem is. The docs’ visual contrast is weak, which makes them hard to read. The theme switcher in the upper left is not working either.

Good to know there’s another parser generator!

I’m wondering whether Rustemo is able to switch input goals, since I have 4 input goals (Div, RegExp, XMLTag and XMLContent (Jet for XML)). Also worth noting: white space is not ignored in XMLTag and XMLContent.

Another question is white space handling with regard to VirtualSemicolon (automatic semicolon), as it should propagate any LineTerminators contained in multi-line comments to the lexical scanner.
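To make the VirtualSemicolon concern concrete, here is a minimal hand-written sketch in plain Rust. It is not Rustemo code and not taken from the Jet spec: the names `Layout`, `skip_layout` and `has_line_terminator` are made up for illustration. It assumes ECMA-262-style rules, where a multi-line comment that contains a line terminator counts as a line terminator for automatic semicolon purposes.

```rust
/// Result of skipping layout (white space and comments) before the next token.
/// Hypothetical type, used only for this illustration.
#[derive(Debug, PartialEq)]
pub struct Layout {
    /// Byte offset of the first significant character after the layout.
    pub end: usize,
    /// True if a line terminator occurred anywhere in the layout,
    /// including inside a multi-line comment.
    pub has_line_terminator: bool,
}

/// Line terminators as defined by ECMA-262: LF, CR, LS and PS.
fn is_line_terminator(c: char) -> bool {
    matches!(c, '\n' | '\r' | '\u{2028}' | '\u{2029}')
}

/// Skips white space and comments starting at byte offset `start`
/// (assumed to lie on a character boundary).
pub fn skip_layout(src: &str, start: usize) -> Layout {
    let mut pos = start;
    let mut has_lt = false;
    loop {
        let rest = &src[pos..];
        if rest.starts_with("/*") {
            // Multi-line comment: consume up to and including the closing `*/`,
            // or to the end of input if the comment is unterminated.
            let len = rest[2..].find("*/").map(|i| i + 4).unwrap_or(rest.len());
            // Propagate any line terminator found inside the comment.
            has_lt |= rest[..len].chars().any(is_line_terminator);
            pos += len;
        } else if rest.starts_with("//") {
            // Single-line comment: stops just before the line terminator,
            // which is then handled by the white space branch below.
            pos += rest.find(is_line_terminator).unwrap_or(rest.len());
        } else if let Some(c) = rest.chars().next().filter(|c| c.is_whitespace()) {
            has_lt |= is_line_terminator(c);
            pos += c.len_utf8();
        } else {
            return Layout { end: pos, has_line_terminator: has_lt };
        }
    }
}
```

A parser could then consult `has_line_terminator` when deciding whether a VirtualSemicolon is permitted before the next token. Whether this maps cleanly onto a generated layout parser is something that would have to be tried.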

I do need to improve it, indeed… The default mdBook themes didn’t fit with the new default theme.

Not sure I fully understand what you are up to, but Rustemo’s architecture is pretty flexible. E.g. lexers are components called by the parser during parsing (aka context-aware lexing). A string lexer is offered out of the box, but you can provide your own lexer and do all sorts of lexing tricks (there is an explanation in the book, with an example).
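As a rough illustration of what context-aware lexing buys you for input goals, here is a small standalone sketch in plain Rust. It does not use Rustemo's lexer API; `Goal`, `Token` and `lex_slash` are hypothetical names, only the Div and RegExp goals are modelled, and regular expression scanning is simplified (character classes like `[/]` are not handled).

```rust
/// The input goal the parser currently expects (hypothetical, two of the four goals).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Goal {
    Div,
    RegExp,
}

#[derive(Debug, PartialEq)]
pub enum Token<'a> {
    /// `/`
    Div,
    /// `/=`
    DivAssign,
    /// A whole regular expression literal, including its flags.
    RegExp(&'a str),
}

/// Lexes a token starting with `/` at the beginning of `src`, interpreting it
/// according to the goal supplied by the parser.
/// Returns the token and the number of bytes consumed.
pub fn lex_slash(src: &str, goal: Goal) -> Option<(Token<'_>, usize)> {
    if !src.starts_with('/') {
        return None;
    }
    match goal {
        Goal::Div => {
            if src.starts_with("/=") {
                Some((Token::DivAssign, 2))
            } else {
                Some((Token::Div, 1))
            }
        }
        Goal::RegExp => {
            // Simplified scan for the first unescaped closing `/`.
            let mut escaped = false;
            let mut close = None;
            for (i, c) in src.char_indices().skip(1) {
                match c {
                    _ if escaped => escaped = false,
                    '\\' => escaped = true,
                    '/' => {
                        close = Some(i);
                        break;
                    }
                    // A literal may not span lines.
                    '\n' | '\r' => return None,
                    _ => {}
                }
            }
            let close = close?;
            // Flags are ASCII letters, so the char count equals the byte count.
            let flags_len = src[close + 1..]
                .chars()
                .take_while(|c| c.is_ascii_alphabetic())
                .count();
            let len = close + 1 + flags_len;
            Some((Token::RegExp(&src[..len]), len))
        }
    }
}
```

The same input `/ab+/g` yields a one-byte `Token::Div` under `Goal::Div` and a six-byte `Token::RegExp("/ab+/g")` under `Goal::RegExp`. The parser knows which goal applies from its current state, which is why the lexer has to be called by the parser rather than run up front.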

Building output is also flexible. Builders are called by the parser to produce output. There is a builder based on AST types deduced from the grammar, and also a generic parse-tree builder. The AST builder also comes with auto-generated actions, which can be modified manually; the modifications are kept between regenerations (see for example the calculator tutorial).
But, again, you can provide your own builder if you have some special needs.

Whitespace handling is done using a special grammar rule Layout. If it is present, Rustemo builds another parser for parsing layout, i.e. whatever is not significant for analysis, and this parser is triggered whenever the regular parser can’t proceed.

Maybe this gives you a hint as to whether Rustemo would be usable for your use case, but for the full answer you would probably have to try it and see if there are any showstoppers. If you try and hit some, please open an issue on the GitHub issue tracker.


Which parser were you using? (I think there were six last I checked)

The Racket community welcomes questions - there is a friendly Discourse forum: https://racket.discourse.group/invites/VxkBcXY7yL

Best regards

Stephen
