Which parser generator are you using (if any)?

cristian.vasile · February 18, 2020, 2:46am

Indeed, here is an example of making Julia a procedural language for PostgreSQL.

Creating a PostgreSQL procedural language – Part 1 – Setup
Creating a PostgreSQL procedural language – Part 2 – Embedding Julia

cristian.vasile · February 18, 2020, 3:10am

I forgot one

Verifying concurrent programs for Heisenbugs using Leslie Lamport’s TLA+ machinery
The idea is to take C/Go/D/Rust/Kotlin/etc. source code and automatically create TLA+ specifications.
I found one paper on this subject: Specifying and Verifying Concurrent C Programs with TLA+ https://cedric.cnam.fr/fichiers/art_3439.pdf
Quote:
“We define a set of translation rules and implement it in a tool (C2TLA+) that automatically translates C code into a TLA+ specification.”

iandrich · February 18, 2020, 4:18am

That’s amazing. I’ve never heard of anyone attempting C to TLA+ spec before.

ftomassetti · February 18, 2020, 7:09am

Wow @cristian.vasile, you are a mine of resources.

cristian.vasile · February 18, 2020, 4:26pm

Mr. Lamport himself was a little bit puzzled

cristian.vasile · February 18, 2020, 5:50pm

Paul,

I am sure that you ran a thorough testing process against LRSTAR; however, using this trick you can create a virtuous circle:
C EBNF grammar → random source code gen → (100,000 C files) → LRSTAR and see if all random valid combinations are handled properly by LRSTAR.
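A minimal sketch of the grammar-to-random-source step, assuming a toy arithmetic grammar instead of a full C grammar. The names `GRAMMAR`, `generate`, and `random_programs` are hypothetical and are not part of LRSTAR; each generated string would then be fed to the parser under test.

```python
import random

# Toy grammar (an assumption for illustration; a real run would use a C grammar).
# Each nonterminal maps to a list of alternatives; each alternative is a list of symbols.
GRAMMAR = {
    "expr": [["term", "+", "expr"], ["term", "-", "expr"], ["term"]],
    "term": [["factor", "*", "term"], ["factor"]],
    "factor": [["(", "expr", ")"], ["NUM"]],
}


def generate(symbol, rng, depth=0, max_depth=8):
    """Expand `symbol` into a list of tokens by choosing random alternatives."""
    if symbol == "NUM":
        return [str(rng.randint(0, 99))]
    if symbol not in GRAMMAR:
        return [symbol]  # literal terminal such as "+" or "("
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, take the last alternative, which in this
        # toy grammar is always the non-recursive one, so expansion ends.
        alternative = alternatives[-1]
    else:
        alternative = rng.choice(alternatives)
    tokens = []
    for sym in alternative:
        tokens.extend(generate(sym, rng, depth + 1, max_depth))
    return tokens


def random_programs(count, seed=0):
    """Return `count` random, syntactically valid sentences of the toy grammar."""
    rng = random.Random(seed)
    return [" ".join(generate("expr", rng)) for _ in range(count)]


if __name__ == "__main__":
    for program in random_programs(5):
        print(program)
```

Every sentence is valid by construction, so any input the parser rejects points to a bug in either the grammar or the generated parser. The fixed seed keeps failures reproducible.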

hexagonaal · February 20, 2020, 10:23am

For JavaParser we’re using JavaCC. While it is a capable parser generator, the choice was simply made because the project was already using it when we adopted it. We really want to get rid of it now. The grammar is hard to read with all the included code snippets and all the added hacks and tricks to keep everything working, and while there seem to be a few people working on a new version, it is all very secretive and slow.

Because it is so odd to me that we can use BNF to specify a language, but nothing out there can handle all of the languages it can express, I tried to get into GLR for a while, but didn’t really find a well-documented, easy to get started with project out there.

Lately I’ve been experimenting with a simple language built from scratch and picked ANTLR for that, and it was almost disappointing how quickly the project moved past the lexing/parsing phase. The clear writing from Terence Parr in his books is also a big help. This made me consider getting rid of JavaCC in JavaParser and moving to ANTLR, since it would solve several long-standing issues.

ftomassetti · February 20, 2020, 6:01pm

Yes, I would just add that JavaCC requires no runtime and this is a (small) benefit.
The code of JavaCC is incredibly bad and unmaintained. The team behind it is very unresponsive to all sorts of help.
So, I would strongly discourage anyone from using JavaCC. I am aware of two forks of it and I would consider them instead.
That said, I love ANTLR! I am using it for all sorts of things and it has never disappointed me.
Yet switching parser generators could prove… tricky

cristian.vasile · February 20, 2020, 6:53pm

Why don’t you give LRSTAR a chance?

anon67755252 · February 20, 2020, 10:14pm

Because LRSTAR is not well known and not Java based. C++ is considered
problematic and Java is considered safe. And once you find something that
works, you don’t want to go through the painful process of switching to another.
The author of LRSTAR does not have a PhD or all the charisma of Dr. Parr.
However, LRSTAR has been generating parsers for companies since 1987
and some people prefer it to ANTLR. BTW, the download contains complete
source code, in case you want to compile for Linux, OS X, or Unix:
https://sourceforge.net/projects/lrstar/

alebencz · February 21, 2020, 12:29pm

Hi!
For virtually all of the compilers I have implemented, I wrote the parser manually, using the recursive descent pattern.
With regard to LALR, my friend Alex implemented his own parser generator for his language, https://github.com/ELENA-LANG/elena-lang
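As an illustration of the hand-written recursive descent pattern, here is a minimal sketch for arithmetic expressions. The grammar and the `tokenize`, `Parser`, and `parse` names are assumptions made for this example and are not taken from any of the compilers mentioned in this thread.

```python
import re

# One token per match: an integer literal or a single operator/parenthesis.
TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    # One method per grammar rule:
    #   expr   := term (('+' | '-') term)*
    #   term   := factor (('*' | '/') factor)*
    #   factor := NUMBER | '(' expr ')'
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())  # left-associative tree
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def parse(text):
    """Parse `text` and return a nested-tuple AST."""
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree
```

Operator precedence comes from the rule nesting: `expr` calls `term`, which calls `factor`, so `*` binds tighter than `+` without any precedence table.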

The “sg” program reads the file containing the language syntax, https://github.com/ELENA-LANG/elena-lang/blob/master/dat/sg/syntax.txt , and generates a file with all the rules to perform the analysis.
In the source code of the compiler, he manually implemented the DFA table, https://github.com/ELENA-LANG/elena-lang/blob/b50b97a81b7a32328ac255391b26c259702e8a37/elenasrc2/elc/source.cpp

igor.dejanovic · February 21, 2020, 6:42pm

For DSLs I use textX, as it does all the heavy lifting for me, and a nice VS Code integration is even in development.

For more expression-like languages and general parsing where explicit handling of ambiguity is needed, I use parglare, which is an LR/GLR parser.

They are both Python libs and thus not very fast, but so far they have served me well. What is important to me, and what I strive for, is that they have good documentation, test coverage, and very nice error reporting capabilities.

anon67755252 · February 24, 2020, 7:46am

LRSTAR is open source now, under a BSD license.
Complete source-code is included, which compiles with GCC.
The latest version is here:
https://sourceforge.net/projects/lrstar/

Actually, it reads a DSL for defining DSLs. But you knew that. All parser generators
read a DSL of their own design. A BNF grammar is a kind of DSL, right?

igor.dejanovic · February 24, 2020, 12:55pm

Hi Paul. Thanks for the LRSTAR project.

I would like to suggest putting the LRSTAR project in a git repo on GitHub or some other git hosting service in an unzipped form. Just the source code and accompanying materials. It requires some work and learning if you haven’t done it before, but it will greatly increase the visibility of the project and the possibility of contributions.

anon67755252 · February 24, 2020, 6:19pm

That was done a few years ago by one of my users. It did not work out very well.
ANTLR and many other parser generators have captured the minds of people.
It seems like another parser generator is not wanted. Even in this group, everyone
already has his own favorite tool. I’m surprised that people don’t want to take a look
at LRSTAR. Well, it’s available at SourceForge. Just download and unzip. Most people
have email, but very very few contact me. It needs to be taught in universities, but
they are still teaching Yacc. OMG.

thyagaraju · March 2, 2020, 11:54am

I’m exploring ANTLR 4

thad · March 25, 2020, 3:19pm

I started using Irony initially and have recently begun using ANTLR because of frustrations with a language I was implementing. I primarily do all my work in C# so I was very happy when I found out that ANTLR can target C# now. I am still learning daily.

thyagaraju · April 2, 2020, 2:43pm

The C# version is really cool. I configured it and am using it with Visual Studio 2019, and I have just started playing with listeners and visitors.

rafael · April 3, 2020, 1:42am

I have used ANTLR and JavaCC in the last couple of years, plus Xtext.

But I suspect no one else here is using my choice for the TextUML Toolkit since 2005:

http://sablecc.org/

I still use the same version from 2005. There was another version afterwards, but I never felt compelled to move to it. Grammars in SableCC are quite clean, and the generated parser produces a nice AST that is quite easy to traverse.

Here is the grammar for TextUML:

github.com/abstratt/textuml/blob/master/plugins/com.abstratt.mdd.frontend.textuml.grammar/textuml.scc

Package com.abstratt.mdd.frontend.textuml.grammar;  

Helpers


    unicode_input_character = [0..0xffff];
    ht  = 0x0009;
    lf  = 0x000a;
    ff  = 0x000c;
    cr  = 0x000d; 
    sp  = ' ';

    
    line_terminator = lf | cr | cr lf; 
    
    input_character = [unicode_input_character - [cr + lf]];    
    
    escape_character = '\';    
    
    not_star =  [input_character - '*'] | line_terminator;

This file has been truncated.

adrua · August 21, 2020, 1:50pm

Hi,

I’m using GPPG (Gardens Point Parser Generator), which is compatible with Yacc/Lex and generates C#.

My pain points are resolving shift/reduce conflicts and working out how to build the AST. (My ASTs are XML.)

Good luck