Hi, I'm Luca DS

ldesantis · February 21, 2020, 6:34pm

Hi everyone, I’m Luca De Santis from Italy.
Thanks Federico, thanks all for giving me the opportunity to join this community.

I have more than 20 years of experience in semiconductor design. I’m interested in DSLs as tools for writing software for highly customized processors. I use programming languages that “know about the underlying hardware”, so… something more abstract than assembly, but less abstract than C, because in my world hardware constraints and real-time execution are critical.

I’m also interested in HDLs (hardware description languages), in particular for raising the level of abstraction in digital systems design without losing the details of hardware signals. To my knowledge, the state-of-the-art language in this field is Chisel, a specialized language built on Scala.

Finally, I would like to jump into the LLVM toolchain, but I’m having some difficulty getting started. Since I’m getting older, any help with this will be really, really appreciated.

You can find me on LinkedIn and on Quora (https://www.quora.com/q/rnyeivjzxvzplfkj?sort=top).

Thanks
Luca

archfrog · February 22, 2020, 12:24am

Hi Luca,

I use LLVM myself for a hobbyist compiler. I’m no expert, but you are always welcome to ask me if there’s something you are unsure about.

You may want to check out this online book: Mapping High Level Constructs to LLVM IR.

I wrote it some years ago, but now it has a new maintainer, Mike Rodler, who has done a great job of converting my old document into an online book. I know that there are some issues with the examples (check out the list of issues on GitHub), but I think they still serve well as an introduction to LLVM IR.

Cheers,
Mikael Egevig

archfrog · February 22, 2020, 10:26am

Hi again Luca,

I decided to write a short primer on LLVM. I hope you, and others, can use it for something. If any of you have questions, feel free to ask!

Introduction

LLVM originally stood for “Low-Level Virtual Machine” (the project no longer treats the name as an acronym), which is a bit of a misnomer in my view, as LLVM is most of all a fairly generic compiler back-end. LLVM is built around a low-level intermediate representation (IR), which is aptly named “LLVM IR”. LLVM offers tons of features such as Just-in-Time compilation (JITing), so that you can generate LLVM IR and then have LLVM convert it into executable code that can be invoked directly from the compiler or tool.

Layers

There are two layers that you can use when you want to work with LLVM:

LLVM IR as a textual representation which is input to the LLVM tools. This is the method I use because I don’t want to be bound by internal API changes and don’t want to have to relink and republish my work whenever a new version of LLVM is published.
LLVM bitcode which is a binary representation of LLVM IR. The relationship between LLVM IR and LLVM bitcode is roughly like the relationship between assembly source code and an object file.

Tools

There are a bunch of tools in the LLVM tool-chain, but you can do most things simply by using the C language front-end (clang) and passing it an input file with one of the extensions it recognizes, such as .bc (bitcode), .ll (LLVM IR), and so forth.
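
As a rough illustration of that workflow, here is a minimal Python sketch that writes a tiny LLVM IR module to a .ll file and hands it to clang. The names HELLO_IR, clang_command and build, as well as the file names, are made up for this example, and it assumes a clang binary on the PATH; if none is found, it only writes the .ll file.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# A tiny hand-written module: main() simply returns 42 as its exit status.
HELLO_IR = """\
define i32 @main() {
entry:
  ret i32 42
}
"""


def clang_command(ll_path, exe_path, clang="clang"):
    """Build the command line; clang treats the input as LLVM IR because of the .ll extension."""
    return [clang, str(ll_path), "-o", str(exe_path)]


def build(ir_text, workdir, clang="clang"):
    """Write ir_text to hello.ll in workdir and compile it.

    Returns the path of the executable, or None if no clang was found.
    """
    workdir = Path(workdir)
    ll_path = workdir / "hello.ll"
    exe_path = workdir / "hello"
    ll_path.write_text(ir_text)
    if shutil.which(clang) is None:
        return None  # no compiler available: only the .ll file was written
    subprocess.run(clang_command(ll_path, exe_path, clang), check=True)
    return exe_path


if __name__ == "__main__":
    # Only show the command here; call build() to actually run it.
    print(" ".join(clang_command("hello.ll", "hello")))
```

Calling build(HELLO_IR, ".") would write hello.ll and, if clang is installed, produce an executable whose exit status is 42. Because the module carries no target triple, clang may print a warning about it, which should be harmless for an experiment like this.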

Tips

Initially, I’d suggest using LLVM IR as the output of your compiler and then invoking clang to translate the LLVM IR into LLVM bitcode. This is much easier to work with and reduces the impact of internal changes to the code base (LLVM is very actively developed). The C++ APIs used to generate LLVM bitcode tend to change quite often and sometimes quite drastically. There is also a C API, but last time I checked it (some years ago), it offered only a fairly small subset of the C++ API.
Don’t bother with Static Single Assignment (SSA) form initially. Computing the proper temporaries for SSA is quite difficult, and even the LLVM samples warn against doing this. Instead, generate code that uses the alloca instruction to allocate storage on the stack and let the mem2reg pass figure out how to convert this into SSA. I started out without knowing this and therefore wasted some time trying to compute the proper SSA form, only to realize that the LLVM documentation itself warns against doing so.
If you have trouble figuring out how to do something with LLVM, the easiest approach is to write a tiny C or C++ program that does what you want and then translate it to LLVM IR using this command:

clang -fno-asynchronous-unwind-tables -fno-exceptions -fno-rtti -Wall -Wextra -masm=intel -O3 -S -emit-llvm -g0 "$1" -o "$1.ll"
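
To make the alloca tip above a bit more concrete, here is a small Python sketch of a generator that keeps a local variable in a stack slot and reads and writes it only through load and store, so it never has to build phi nodes itself. The class StackSlotEmitter and the function emit_inc are hypothetical names for this example. The pointer spelling is a parameter because it depends on the LLVM version: older releases such as the LLVM 9 mentioned in this thread use typed pointers (i32*), while newer releases use the opaque ptr type.

```python
class StackSlotEmitter:
    """Collects LLVM IR lines for one function body using stack slots."""

    def __init__(self, ptr_type="i32*"):
        self.ptr_type = ptr_type  # assumption: "i32*" for old LLVM, "ptr" for new
        self.lines = []
        self.next_temp = 0

    def temp(self):
        """Return a fresh temporary name such as %t0, %t1, ..."""
        name = f"%t{self.next_temp}"
        self.next_temp += 1
        return name

    def declare(self, var):
        """Reserve a stack slot for an i32 local variable."""
        self.lines.append(f"  %{var}.addr = alloca i32")

    def store(self, value, var):
        """Write value into the variable's stack slot."""
        self.lines.append(f"  store i32 {value}, {self.ptr_type} %{var}.addr")

    def load(self, var):
        """Read the variable's current value into a fresh temporary."""
        t = self.temp()
        self.lines.append(f"  {t} = load i32, {self.ptr_type} %{var}.addr")
        return t

    def add(self, a, b):
        """Emit an i32 addition and return the temporary holding the sum."""
        t = self.temp()
        self.lines.append(f"  {t} = add i32 {a}, {b}")
        return t


def emit_inc(ptr_type="i32*"):
    """IR for: int inc(int n) { int x = n; x = x + 1; return x; }"""
    e = StackSlotEmitter(ptr_type)
    e.declare("x")                           # int x;
    e.store("%n", "x")                       # x = n;
    e.store(e.add(e.load("x"), "1"), "x")    # x = x + 1;
    result = e.load("x")                     # return x;
    body = "\n".join(e.lines)
    return f"define i32 @inc(i32 %n) {{\nentry:\n{body}\n  ret i32 {result}\n}}\n"


if __name__ == "__main__":
    print(emit_inc(), end="")
```

The alloca sits in the entry block on purpose, since mem2reg only promotes stack slots allocated there. Running the mem2reg pass over this output (for instance through LLVM's opt tool) should remove the stack slot and leave plain SSA values.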

Don’t waste energy on writing a Runtime Library (RTL) in LLVM IR. Because of the SSA form, LLVM IR is meant to be generated by tools; writing it by hand is pretty tedious and tiresome.
In my experience, it is much easier to develop against LLVM on Linux. Windows support is largely, if not fully, complete, but you need to install Microsoft Visual Studio (a no-go in my world), whereas you can install LLVM and Clang on Ubuntu Linux just by using sudo apt install clang-9 llvm-9 or sudo apt install clang llvm, depending on your version of Ubuntu Linux.
Don’t put the readnone attribute on a function unless it really never reads memory. If the function does access memory even though readnone is specified, the LLVM tools generate invalid code, with no warning.
LLVM does not natively support Unicode so you have to output UTF-8/UTF-16/UTF-32 values as byte values. This is by far the weakest point in LLVM, IMHO.

Examples

There are very good examples on the LLVM website. I recommend studying at least the Kaleidoscope sample before you start out on LLVM.

Braceless

Braceless is my name for my hobbyist programming language, which is unlikely to ever become a usable product. But it can be used to see an example of how I have made use of LLVM v8+ by generating LLVM IR from a Python script and how I invoke the LLVM tools to translate the generated LLVM IR into an executable file. Currently, not much is going on publicly on the Braceless project, but I am working on it now and then. I am in the middle of a very large refactoring project so updates are postponed until I’m finished with that.

One way to start out with LLVM would be to clone my Braceless GitHub project and try to get it going on your system. As far as I recall, the version on GitHub does generate a valid executable (it probably core dumps), but it would give you something to start out with.

I’m willing to edit and update this reply if anyone asks questions or offers suggestions. I’m confident that the above is lacking a lot, but you have to start somewhere.

Cheers,
Mikael Egevig

ldesantis · February 22, 2020, 10:49am

Great, Mikael! Thank you very much.

ftomassetti · February 23, 2020, 7:04am

Hi @ldesantis, nice to meet you!
Thank you @archfrog for sharing this; I think it deserves its own top-level post.
These are the kind of exchanges I was hoping to see in this community.

archfrog · February 24, 2020, 5:42am

@ftomassetti, I have made a new post in the Code Generators category, which combines my two posts on LLVM in this thread here. I plan to update it as we move forward and questions/comments pop up, if any.