A library to process ASTs: what do you think?

For several of our projects we work with ASTs. We build parsers (frequently with ANTLR) and then translate the parse tree into an AST, frequently built with Kolasu (an OSS library we created). Once we have the AST we can process it in various ways and use it in interpreters, compilers, transpilers, editors, and other tools.
Now, this works well when we are doing projects in Java or Kotlin, because Kolasu runs on the JVM. However, sometimes we need to write projects in C++, and in that case we would love to have an equivalent of Kolasu in C++.
Also, for some projects (the Cobol parser and the Verilog parser), we generate the AST serialized in JSON and XML, but some of our clients would like to load it in Python. It would be great to have another AST library written in Python.
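To give an idea of what loading such a serialized AST in Python could look like, here is a minimal sketch. The JSON layout (a "type" key, a "children" list, and everything else treated as a property) and the names Node and node_from_dict are assumptions made up for illustration; they are not the format our parsers actually emit nor anything Kolasu provides.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Iterator


@dataclass
class Node:
    """Generic AST node: a type name, scalar properties, and child nodes."""
    type: str
    props: dict[str, Any] = field(default_factory=dict)
    children: list["Node"] = field(default_factory=list)

    def walk(self) -> Iterator["Node"]:
        """Yield this node and all its descendants, depth-first, pre-order."""
        yield self
        for child in self.children:
            yield from child.walk()


def node_from_dict(data: dict[str, Any]) -> Node:
    """Build a Node tree from a decoded JSON object.

    Assumes the hypothetical layout {"type": ..., "children": [...], <other keys>}.
    """
    children = [node_from_dict(c) for c in data.get("children", [])]
    props = {k: v for k, v in data.items() if k not in ("type", "children")}
    return Node(type=data["type"], props=props, children=children)


# Hypothetical sample document, loosely inspired by a Cobol AST.
SAMPLE = """
{"type": "CompilationUnit", "children": [
  {"type": "DataDivision", "children": [
    {"type": "DataItem", "name": "CUSTOMER-ID", "level": 5}
  ]}
]}
"""

root = node_from_dict(json.loads(SAMPLE))
node_types = [n.type for n in root.walk()]
```

Even a small generic node class like this gives clients traversal for free, which is the kind of utility we currently end up rewriting per project.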
So we are looking for better ideas than rewriting Kolasu in C++ and Python.
What do you think?

Sounds like Graal is the way to go. However, the tradeoff with using Graal is that you don’t get all the “legacy” features of the original languages and their environments (not all Python projects work, not all memory management works, etc.). Honestly, I think what we could do is just put out a specification and let others implement it as needed, while at least providing some default implementations in, say, C++, Kotlin, Java, Python, and even Rust. It’d be a neat crate, since a lot of people are using Rust for parsing nowadays.
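To make the specification idea a bit more concrete, here is a rough sketch of a language-neutral spec expressed as plain data, plus a small checker that any implementation could port. The node types, the SPEC layout, and the validate function are all hypothetical, invented for this example; no such specification exists yet.

```python
from typing import Any

# Hypothetical spec: for each node type, the required property names
# and the node types allowed as direct children.
SPEC: dict[str, dict[str, list[str]]] = {
    "Program": {"props": [], "children": ["Assignment"]},
    "Assignment": {"props": ["target"], "children": ["IntLiteral"]},
    "IntLiteral": {"props": ["value"], "children": []},
}


def validate(node: dict[str, Any], spec: dict = SPEC) -> list[str]:
    """Return a list of error messages; an empty list means the tree conforms."""
    rule = spec.get(node.get("type"))
    if rule is None:
        return [f"unknown node type: {node.get('type')!r}"]
    errors = []
    # Check that every required property is present on this node.
    for prop in rule["props"]:
        if prop not in node:
            errors.append(f"{node['type']}: missing property {prop!r}")
    # Check each child against the allowed child types, then recurse into it.
    for child in node.get("children", []):
        if child.get("type") not in rule["children"]:
            errors.append(f"{node['type']}: unexpected child {child.get('type')!r}")
        errors.extend(validate(child, spec))
    return errors


good = {"type": "Program", "children": [
    {"type": "Assignment", "target": "x", "children": [
        {"type": "IntLiteral", "value": 1}]}]}
bad = {"type": "Program", "children": [
    {"type": "Assignment", "children": [{"type": "IntLiteral"}]}]}

good_errors = validate(good)
bad_errors = validate(bad)
```

Because the spec is just data, the same file could be shipped to every implementation (C++, Kotlin, Python, Rust) and each one would only need its own small checker.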

Integration with Graal would be interesting. We have done some basic experimenting but there is much to do.

Yes, maybe building implementations for 2-3 core languages would be a good start (Kotlin, C++, and Python are the initial suspects).

Well, what is wrong with Go? Go runs on Unix-like systems, macOS, and Windows, and can compile to WASM.

I have never looked at Go, but we have already written Kolasu in Kotlin, so we can use it in JVM-based projects. For projects where we use Python or C++, we write utilities as we need them, whereas it would be useful to have a library we could reuse. In that scenario Go would not be helpful, I think.

//1
Use GraalVM. On macOS or Linux it is polyglot (Java, Scala, Kotlin, C, R, Ruby, Python), so you can use multiple languages at once.
See: Top 10 Things To Do With GraalVM

//2
It’s a very fragile and experimental route, but take a look at TeaVM and see if you can compile Kolasu bytecode to WASM.

If yes, the case is solved. With WASM as the binary format you will need a WASM runtime (see GitHub - appcypher/awesome-wasm-runtimes: a list of WebAssembly runtimes). The most promising ones are:
o wasmer
o innative
Also, GraalVM does have an experimental WASM parser.

Another option is to decompose all of Kolasu’s functionality into individual functions and package them as FaaS:
o openfaas
o Oracle Fn: the Fn project is an open-source, container-native serverless platform that you can run anywhere, on any cloud or on-premise.

The problem is that if I am building a project for a client who needs to integrate it with a Python system or a C++ application, it is difficult to “force” a switch. We typically have to stay on that platform. If that platform happens to be the JVM, great: we can use Kolasu and that makes our life easier. If it is not, we have much more work to do. So I was thinking of ways to reuse Kolasu without having to force the client to accept a transition to GraalVM or any other platform, because that is not always possible.

TeaVM could be interesting to explore because it would let us reuse Kolasu on JS platforms (web or Node.js), which could potentially be quite useful to us! Thank you!

OK.
Kotlin can transpile to JavaScript; you can then feed the resulting JS code to the V8 [1] engine at run time.

[1] V8 is Google’s open source high-performance JavaScript and WebAssembly engine, written in C++.