LLVM - The Low-Level Virtual Machine

This post is a compilation of two of my prior posts in the Introduction category.
I plan to revise this post as questions and comments, if any, pop up.

I use LLVM myself for a hobbyist compiler. I’m no expert, but you are always welcome to ask me if there’s something you are unsure about :slight_smile:

Introduction

LLVM is an acronym for “Low-Level Virtual Machine”, which is a bit of a misnomer in my view, as LLVM is above all a fairly generic compiler back-end. LLVM is built around a low-level intermediate representation (IR), aptly named “LLVM IR”. LLVM offers tons of features, such as Just-in-Time compilation (JITing), so you can generate LLVM IR and have LLVM convert it into executable code that can be invoked directly from your compiler or tool. LLVM is implemented in C++ but has both C and Python bindings. However, it is my belief that you pretty much need to use the C++ interface if you want to link LLVM in: it is the native interface, and it seems to be the only one that is actively maintained and kept up to date.

Primers

You may want to check out this online book: Mapping High Level Constructs to LLVM IR.

I wrote it some years ago, but now it has a new maintainer, Mike Rodler, who has done a great job of converting my old document into an online book. I know that there are some issues with the examples (check out the list of issues on GitHub), but I think they still serve well as an introduction to LLVM IR.

Also, there are a few books about LLVM out there:

  1. LLVM Essentials, Sarda et al., 2015, Packt Publishing.
  2. Getting Started with LLVM Core Libraries, Lopes et al., 2014, Packt Publishing.
  3. LLVM Cookbook, Pandey et al., 2015, Packt Publishing.

I must admit that I have not yet read any of these books, but I have listed them in the order I think they should be read.

Layers

There are two layers that you can use when you want to work with LLVM:

  1. LLVM IR as a textual representation which is input to the LLVM tools. This is the method I use because I don’t want to be bound by internal API changes and don’t want to have to relink and republish my work whenever a new version of LLVM is published.

  2. LLVM bitcode which is a binary representation of LLVM IR. The relationship between LLVM IR and LLVM bitcode is roughly like the relationship between assembly source code and an object file.
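
To make that concrete, here is a minimal sketch of a textual LLVM IR module (using the typed-pointer syntax of LLVM v8-era releases); llvm-as assembles a .ll file like this into a .bc bitcode file, and llvm-dis disassembles bitcode back to text:

    ; hello.ll -- a tiny module in textual LLVM IR
    @msg = private constant [14 x i8] c"Hello, world!\00"

    declare i32 @puts(i8*)

    define i32 @main() {
    entry:
      ; take the address of the first byte of the string constant
      %p = getelementptr [14 x i8], [14 x i8]* @msg, i64 0, i64 0
      %r = call i32 @puts(i8* %p)
      ret i32 0
    }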

Tools

There are a bunch of tools in the LLVM tool-chain, but you can do most things simply by using the C language frontend (clang) and feeding it files with one of the recognized input extensions, such as .bc (bitcode), .ll (LLVM IR), and so forth.
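
For example, assuming the hello.ll module sketched above (the file name is just for illustration), all of these invocations go through clang:

    clang hello.ll -o hello        # textual LLVM IR straight to an executable
    clang -c hello.ll -o hello.o   # textual LLVM IR to an object file
    clang hello.bc -o hello        # LLVM bitcode to an executable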

Tips

  1. Initially, I’d suggest using LLVM IR as the output of your compiler and then invoking clang to translate the LLVM IR into LLVM bitcode. This is much easier to work with and reduces the impact of internal changes to the code base (LLVM is very actively developed). The C++ APIs used to generate LLVM bitcode tend to change quite often and sometimes quite drastically. There is also a C API, but last time I checked it (some years ago), it offered only a fairly small subset of the C++ API.

  2. Don’t bother with Static Single Assignment (SSA) form initially. Computing the proper temporaries and phi nodes yourself is quite difficult, and even the LLVM samples warn against doing this. Instead, generate code that uses the alloca instruction to allocate storage on the stack and let the mem2reg pass figure out how to convert this into SSA (see the small sketch after this list). I started out without knowing this and therefore wasted some time trying to compute the proper SSA form, only to realize that the LLVM documentation itself warns against doing so.

  3. If you have trouble figuring out how to do something with LLVM, the easiest approach is to write a tiny C or C++ program that does what you want and then translate it to LLVM IR using a command like this:

clang -fno-asynchronous-unwind-tables -fno-exceptions -fno-rtti -Wall -Wextra -masm=intel -O3 -S -emit-llvm -g0 $1 > $1.ll
  4. Don’t waste energy on writing a Runtime Library (RTL) in LLVM IR. LLVM IR is meant to be generated, in part because of the SSA form, and hand-writing it is pretty tedious and tiresome.

  5. In my experience, it is much easier to develop against LLVM on Linux. The Windows support is fairly complete, if not fully complete, but you need to install Microsoft Visual Studio (a no-go in my world), whereas on Ubuntu Linux you can install LLVM and Clang with nothing more than sudo apt install clang-9 llvm-9 or sudo apt install clang llvm, depending on your version of Ubuntu.

  6. Make sure you don’t put the readnone attribute on a function unless it truly does not read memory: the LLVM tools generate invalid code, with no warning, if the code does access memory even though readnone is specified.

  7. LLVM does not natively support Unicode, so you have to output UTF-8/UTF-16/UTF-32 strings as raw byte values. This is by far the weakest point in LLVM, IMHO.
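
For the SSA tip above, here is a minimal sketch of the alloca style (the function is just an illustration). Running the mem2reg pass over it, for example with opt -mem2reg (or opt -passes=mem2reg on newer releases), promotes the stack slot to an SSA register with the necessary phi node:

    define i32 @max(i32 %a, i32 %b) {
    entry:
      ; a mutable local on the stack; no hand-written SSA or phi nodes needed
      %result = alloca i32
      %cmp = icmp sgt i32 %a, %b
      br i1 %cmp, label %then, label %else

    then:
      store i32 %a, i32* %result
      br label %end

    else:
      store i32 %b, i32* %result
      br label %end

    end:
      ; mem2reg turns this load/store pattern into a single phi node
      %r = load i32, i32* %result
      ret i32 %r
    }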

Examples

There are very good examples on the LLVM website. I recommend studying at least the Kaleidoscope sample before you start out on LLVM.

Braceless

Braceless is my name for my hobbyist programming language, which is unlikely to ever become a usable product. But it serves as an example of how I have made use of LLVM v8+: I generate LLVM IR from a Python script and then invoke the LLVM tools to translate the generated LLVM IR into an executable file. Currently, not much is going on publicly with the Braceless project, but I am working on it now and then. I am in the middle of a very large refactoring effort, so updates are postponed until I’m finished with that.

One way to start out with LLVM would be to clone my Braceless GitHub project and try to get it going on your system. As far as I recall, the version on GitHub does generate a valid executable (it probably core dumps), but it would give you something to start out with.

Domain-Specific Languages (DSLs)

Can LLVM potentially be used for DSLs? I believe so, although I have no experience in this area. LLVM can be linked into any C++ application, and the Just-In-Time compiler can be used to generate native code in the running process, so a DSL could potentially be translated into native code and run without any external compilation step. Please add your views and/or experiences to this thread.

Epilogue

I’m willing to edit and update this reply if anyone asks questions or offers suggestions. I’m confident that the above is lacking a lot, but you have to start somewhere.


Thanks for the very detailed intro!


Hi Mikael, thanks again for this primer. I did the first exercise of compiling a little piece of C code into the LLVM IR representation, and so far the flow is pretty clear.
I’m wondering where the machine is defined, because in the IR code I see that variable sizes are given (32 bits in my first example). Is there a predefined set of common CPUs somewhere, or is there a way to customize the machine representation on the user side?
Thanks in advance
Luca

Hi Luca,

The machine is defined using the “magic” in the target datalayout and target triple lines at the top of the LLVM IR source file. To be honest, for now I just copy and paste them for each target from the output of Clang when it generates LLVM IR files.
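
For illustration, on a 64-bit x86 Linux target those two lines look roughly like this (the exact datalayout string varies between LLVM versions, which is another reason to copy it from Clang's output rather than writing it by hand):

    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-unknown-linux-gnu"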

I suspect that it is sort of an arcane art to define these target attributes correctly, so I don’t want to meddle too much. I found some documentation of the target attributes in the LLVM docs.

That’s another thing I personally find a bit weak about LLVM - that you have to define the target layout yourself instead of simply selecting one or more appropriate options. I guess the current method is way more flexible than such options would be, but the magic is also much more evident :slight_smile:

Cheers,
Mikael


Thanks for the great introduction.

From the primer:

  • LLVM IR does not differentiate between signed and unsigned integers.

How does that work in the end? Like what happens on overflow?


I didn’t get to that part yet, so I am not sure. I have been wondering about it myself as I want integer overflows and underflows to raise an exception in my language.

Here is a snippet from the LLVM Language Reference Manual, Add instruction:

If the sum has unsigned overflow, the result returned is the mathematical result modulo 2^n, where n is the bit width of the result.

Because LLVM integers use a two’s complement representation, this instruction is appropriate for both signed and unsigned integers.

nuw and nsw stand for “No Unsigned Wrap” and “No Signed Wrap”, respectively. If the nuw and/or nsw keywords are present, the result value of the add is a poison value if unsigned and/or signed overflow, respectively, occurs.
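
In IR terms the difference looks roughly like this:

    %x = add i32 %a, %b        ; plain add: wraps modulo 2^32 for both signed and unsigned
    %y = add nsw i32 %a, %b    ; poison value if signed overflow occurs
    %z = add nuw i32 %a, %b    ; poison value if unsigned overflow occurs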

I don’t know much about poison values yet, but given my experiences with LLVM and its way of handling undefined behavior, I’d guess they are something to fear. Potentially, your whole program could blow up in your face because LLVM leans heavily towards the C/C++ philosophy of “undefined behavior is the programmer’s problem”. Personally, I believe in “undefined behavior is the language designer’s problem”.

Ideally, every conceivable operation would be completely well-defined, even if it makes no sense at all (throw an exception!).

I am sorry I can’t help a lot with your question, but I simply haven’t gotten as far, with my personal compiler project, as to look into overflow and underflow conditions.

Personally, I believe in “undefined behavior is the language designer’s problem”.

Yeah, but at the same time undefined behavior can enable some optimizations that wouldn’t be possible otherwise. And, for C++ specifically, there is just so much complexity that it is hard to define the behavior of everything.
I think it sucks that it’s not only C/C++ but also the IR that has UB. I hope that the people who implement a language don’t add more UB on top of the preexisting UB.
Also, the multiple levels of compilation plus UB kind of get mixed together, and in the end it becomes harder to think and reason about it.


I personally feel that no optimizations are worth undefined behavior: Back in 1991 I did a test on my 24 MHz '386 system, where I created a linked list of length 1, searched it for a value that appeared at the very end of the list, prepended a non-matching list item, and repeated this over and over until one second had passed. I believe I managed to do this about 7,000 times in one second (searching from 0 through N-1, where N grew from 1 to 7,000). Since then CPUs have become many, many times faster, so I seriously don’t believe in unwarranted optimizations. Much more important than optimal optimizations is supporting concurrent programming in a beautiful, efficient, and portable way, a bit like Occam does.

Languages like C#, Python, and PHP have demonstrated clearly, I believe, that optimal optimizations matter little as long as the coder is very productive. None of the aforementioned languages are fast by the standards of somebody used to coding in assembly or C, but they are very pleasant to use and you get lots of stuff done in a very short time :slight_smile:

I agree that it is sad that LLVM seems to rely heavily on undefined behavior. In fact, I am once again forced to reconsider my choice of LLVM as the backend for my language. There are some nifty assembly-level tricks that I’d like to make use of (such as using the x86 Carry bit to indicate whether an exception has occurred or not) that are simply not feasible using LLVM. Also, if there is an overflow or an underflow, I’d like to be able to immediately branch to code that throws an exception. I don’t know how to do this using LLVM.

That being said, I think LLVM is an awesome tool, it just may not be what I am looking for.

Yes, I do hold the same opinion, even though I dislike admitting it. :laughing:
That being said, a lot of people (smarter than me) have pointed out that this type of thinking is why word/image processing programs take 10 minutes to load up, despite the fact that they used to open instantly in the 90s. I don’t know if it is necessarily a problem in the languages, though.

You can’t just generate IR that checks for the carry bit? I have this idea for a toy programming language, and in it I wanted to make heavy use of CPU flags (specifically on x86).
Can’t you use the arithmetic-with-overflow intrinsics? It would really suck if that is not possible. :thinking:
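
Something roughly like this is what I have in mind, I would guess (the @raise_overflow function is just a placeholder for whatever the language runtime would provide):

    declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)
    declare void @raise_overflow()

    define i32 @checked_add(i32 %a, i32 %b) {
    entry:
      ; the intrinsic returns the sum plus an overflow flag
      %res = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue { i32, i1 } %res, 0
      %ovf = extractvalue { i32, i1 } %res, 1
      br i1 %ovf, label %overflow, label %ok

    ok:
      ret i32 %sum

    overflow:
      ; branch straight to the code that raises the exception
      call void @raise_overflow()
      unreachable
    }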

Haha, I have to admit that I LOVE coding in high-level languages, even though most of my professional work was done in C, C++, and 80x86 assembler. Sometimes I get truly amazed at what you can accomplish in an evening using Python or PHP, knowing that it would literally take weeks to do the same in C or C++. Back when I used Python a lot, I could very often put together a Python script to solve some ad-hoc problem in a matter of minutes and then continue my work, having eliminated hours of tedious, error-prone, and highly repetitive work with 10 to 20 lines of Python code.

I suspect that the slow loading of some word processors is caused by most programmers’ everlasting desire to be super-smart about everything. For instance, a typical word processor loads all kinds of “extension modules”, a process which in itself takes forever, instead of just linking everything into one huge executable (and letting the OS’s page manager do the hard work). Perhaps you remember back in the 1990s when DLLs really became a hit? Some programs spent ages loading a zillion DLLs. But I guess it also depends on the product in question. I use Microsoft Word 2019 and it opens in about 0.5 seconds on my system, even on the first load.

Thanks, I think you’re on to something! The link you sent seems to solve my problem, as all I need to be able to do is throw an exception if something overflows or underflows. I haven’t gotten that far with my toy compiler project yet, which is why I keep saying that I am no expert on LLVM :slight_smile:

My personal view, with my toy language Braceless and all, is rapidly converging towards wanting a usable product that is, say, 50 to 100 times faster than Python (Python is sometimes considered to be about 200 times slower than assembly or C), as long as it is as productive as Python and portable to a host of target platforms.

I have coded in C# for 10 years or so and I like the language, but it has nowhere near the convenience of Python, and you’re still stuck with the whole .NET programming model, which is not exactly portable to a host of systems, even though it is very nice.

I very much believe that there is ample room for yet another language, with a syntax like that of Occam/Python, but with a level of efficiency that is initially close to C#/.NET and eventually as fast as C. Unfortunately, I am very, very bad at motivating myself when I have no boss, so I don’t seem to be getting anywhere with my ideas for a new language, to such a point that I often contemplate abandoning the project altogether and never looking back.

Well… maybe 10 minutes was a gross exaggeration. :grin: But considering how much work is done in certain programs (e.g. in video games), it seems laughable that it would take a noticeable amount of time to display text or an image.

I know that feeling. If I had a nickel for every time I’ve put a project to the bench I would have had… dozens of dollars. :laughing:

My answer to you got me thinking :smile: If I cannot work without a skilled and reasonably demanding boss, why not simply find or hire someone to be my supervisor?

Perhaps somebody here is suited and interested in such a long-term task? I might even know the right person already.

The task is simply to make me:

  1. Document my language.
  2. Implement my language using a test-driven OOP approach.
  3. Publish my documentation and product as free, open source.

My new boss should provide guidance and positive criticism along the way.

I suspect it has to do with Greenspun’s Tenth Rule, at least in part. That is, people including ad-hoc, slow interpreters for dynamic languages into their applications. It can be XML files, extensions, JavaScript, whatnot.
Another issue is loading stuff into memory, particularly with languages running on a VM. I long for the day when it will be possible to snapshot a Java heap and use it for the next run of the application. Lisp and Smalltalk have had this since, I don’t know, I think before I was born :smiley:

TIP.
There is LLVM already compiled here for a lot of architectures & OSes. For Windows, just unzip the archive.
