Blog: Which Parsing Approach? (L. Tratt)

Recent blog post by Laurence Tratt. It gives a nice overview of several parsing techniques and then it explains the merits of LR. Notice that state-of-the-art parsers in Java (Antlr, but also JavaCC) are variants of LL.

also consider this follow-up which is currently frontpaged on HN

2 Likes

After wrestling with recursive descent, and wasting $70 on the Springer Verlag encyclopedia of Parsing, i found a lecture on YouTube by Crockford explaining the Pratt parsing technique. It is by far the simplest, and fastest parsing technique known to man, and i can highly recommend it. Crockford calls the expression evaluator code of the Pratt algorithm the most powerful 10 lines of code in all of computing. Many would agree. It isn’t even mentioned in the Springer book.

1 Like

Can I have a link to this YouTube video? Thanks!!

incidentally, this is also the technique used in the second part of the Crafting Interpreter book

https://craftinginterpreters.com/compiling-expressions.html#a-pratt-parser

ok this is at least the third time I mention this book, I swear I am not getting a cut, I just think it’s great :stuck_out_tongue:

2 Likes

@evacchi I also think it’s a great book. I am attempting to replicate the examples in Haxe language. So far, the only problem is to effectively simulate a memory heap without C’s malloc and realloc

you can always just allocate a big chunk of memory via the ByteArray, and stuff things into that. Then you have your own heap. Haxe/AS3/JS doesn’t let you get down to the memory level unfortunately. I have a script by the way that converts AS3 to JS. They are 99% the same, but much more productive to use a strongly typed language that catches errors at compile time.

How do you effectively determine the size of a non-primitive type and adjust the memory heap size accordingly?

In JS it is basically impossible to know the size of a non-primitive object, because internally JS objects can have pointers to other objects, and it can be a complex tree of internal data structures, so one is blind as to the amount of storage that object truly takes. it could be adding elements to a hash table; and the hash table is covering other objects. It is one of the very ugly things about JS that you have no idea what the cost is of a particular data item.

Is this the video https://www.youtube.com/watch?v=Nlqv6NtBXcA ?

Yes, that is it. There is also some code posted somewhere on the net, which shows how he did a simple parser. Others have pointed out that instead of calling things nud and led, they could have used prefix and infix which would have been a lot easier to understand.

I have been reading Language Implementation Patterns by Terence Parr, While not too deep but it is a good book on exploration of various parsing techniques.

Svelte

said it compiles at build time, not at run time

Svelte is JS transpiler type of language, where you basically add the missing statements to HTML that everyone wishes it had had in the first place, like looping, IF, triggering refresh based value changes, etc.

The problem with Svelte is that you are still stuck very tied to the awful CSS/HTML drawing model, which is very cumbersome and fussy. For those people who have already invested hundreds of hours into learning CSS, they will like Svelte a lot. But you are still building on top of a 3 language stack and adding a 4th one. At some point one has to wonder, if throwing away the existing stack, and starting fresh with a single language, wouldn’t be simpler. In the history of computers, at no point in the past did people find working with 4 different languages simultaneously to be sensible.

Would you prefer something like this http://www.beadslang.com/?

Haha, busted!.. Well not one programmer out of 1000 is willing to do much more than play around with a new language, which has 0 answers in Stack Overflow, etc. The reason that ugly things persist is that inertia is the most powerful force in the universe. I always expect 2 languages in the mix; you usually have some kind of job control language or make subsystem to control the building of the main thing you are trying to construct, but you expect 100:1 ratio of lines of code between those two languages. But to work in 4 languages at once, I couldn’t stand it, so i built my own.

In today’s environment, which is controlled by giant companies who give away tools that keep the current status quo in place, tool building is a thankless task. JetBrains is the only company i can think of, in the last 10 years, to thrive as a tool company. Although one could argue that SalesForce is a tool company.

I will be giving a presentation next friday; save your tomatoes.

Where will happen that presentation? Is there an URL for that presentation?

When mentioned Beads I didn’t realize you were the author…

Its the next friday demo/presentation via Zoom.

Do you refer to our weekly meetup? The meetups are on Thursday, not Friday.

If it’s a different presentation, can you send the URL?

Thanks for correcting my mistake i had the wrong day!

— Whoever is happy will make others happy too