Blog: Which Parsing Approach? (L. Tratt)

evacchi · September 16, 2020, 3:47pm

Recent blog post by Laurence Tratt. It gives a nice overview of several parsing techniques and then it explains the merits of LR. Notice that state-of-the-art parsers in Java (Antlr, but also JavaCC) are variants of LL.

https://tratt.net/laurie/blog/entries/which_parsing_approach.html

also consider this follow-up which is currently frontpaged on HN

CodingFiend · September 16, 2020, 11:56pm

After wrestling with recursive descent, and wasting $70 on the Springer Verlag encyclopedia of Parsing, i found a lecture on YouTube by Crockford explaining the Pratt parsing technique. It is by far the simplest, and fastest parsing technique known to man, and i can highly recommend it. Crockford calls the expression evaluator code of the Pratt algorithm the most powerful 10 lines of code in all of computing. Many would agree. It isn’t even mentioned in the Springer book.

damilare · September 17, 2020, 7:25am

Can I have a link to this YouTube video? Thanks!!

evacchi · September 17, 2020, 8:48am

incidentally, this is also the technique used in the second part of the Crafting Interpreter book

https://craftinginterpreters.com/compiling-expressions.html#a-pratt-parser

ok this is at least the third time I mention this book, I swear I am not getting a cut, I just think it’s great

damilare · September 17, 2020, 10:25am

@evacchi I also think it’s a great book. I am attempting to replicate the examples in Haxe language. So far, the only problem is to effectively simulate a memory heap without C’s malloc and realloc

CodingFiend · September 18, 2020, 8:49pm

you can always just allocate a big chunk of memory via the ByteArray, and stuff things into that. Then you have your own heap. Haxe/AS3/JS doesn’t let you get down to the memory level unfortunately. I have a script by the way that converts AS3 to JS. They are 99% the same, but much more productive to use a strongly typed language that catches errors at compile time.

damilare · September 29, 2020, 8:51am

How do you effectively determine the size of a non-primitive type and adjust the memory heap size accordingly?

CodingFiend · October 1, 2020, 8:53pm

In JS it is basically impossible to know the size of a non-primitive object, because internally JS objects can have pointers to other objects, and it can be a complex tree of internal data structures, so one is blind as to the amount of storage that object truly takes. it could be adding elements to a hash table; and the hash table is covering other objects. It is one of the very ugly things about JS that you have no idea what the cost is of a particular data item.

oscaretu · October 2, 2020, 11:50pm

Is this the video https://www.youtube.com/watch?v=Nlqv6NtBXcA ?

CodingFiend · October 3, 2020, 3:06am

Yes, that is it. There is also some code posted somewhere on the net, which shows how he did a simple parser. Others have pointed out that instead of calling things nud and led, they could have used prefix and infix which would have been a lot easier to understand.

sanyaade · October 3, 2020, 7:08pm

I have been reading Language Implementation Patterns by Terence Parr, While not too deep but it is a good book on exploration of various parsing techniques.

sanyaade · October 3, 2020, 7:56pm

Svelte

said it compiles at build time, not at run time

CodingFiend · October 3, 2020, 8:44pm

Svelte is JS transpiler type of language, where you basically add the missing statements to HTML that everyone wishes it had had in the first place, like looping, IF, triggering refresh based value changes, etc.

The problem with Svelte is that you are still stuck very tied to the awful CSS/HTML drawing model, which is very cumbersome and fussy. For those people who have already invested hundreds of hours into learning CSS, they will like Svelte a lot. But you are still building on top of a 3 language stack and adding a 4th one. At some point one has to wonder, if throwing away the existing stack, and starting fresh with a single language, wouldn’t be simpler. In the history of computers, at no point in the past did people find working with 4 different languages simultaneously to be sensible.

oscaretu · October 3, 2020, 9:07pm

Would you prefer something like this http://www.beadslang.com/?

CodingFiend · October 3, 2020, 10:38pm

Haha, busted!.. Well not one programmer out of 1000 is willing to do much more than play around with a new language, which has 0 answers in Stack Overflow, etc. The reason that ugly things persist is that inertia is the most powerful force in the universe. I always expect 2 languages in the mix; you usually have some kind of job control language or make subsystem to control the building of the main thing you are trying to construct, but you expect 100:1 ratio of lines of code between those two languages. But to work in 4 languages at once, I couldn’t stand it, so i built my own.

In today’s environment, which is controlled by giant companies who give away tools that keep the current status quo in place, tool building is a thankless task. JetBrains is the only company i can think of, in the last 10 years, to thrive as a tool company. Although one could argue that SalesForce is a tool company.

I will be giving a presentation next friday; save your tomatoes.

oscaretu · October 3, 2020, 11:38pm

Where will happen that presentation? Is there an URL for that presentation?

When mentioned Beads I didn’t realize you were the author…

CodingFiend · October 4, 2020, 12:02am

Its the next friday demo/presentation via Zoom.

oscaretu · October 4, 2020, 6:58am

Do you refer to our weekly meetup? The meetups are on Thursday, not Friday.

If it’s a different presentation, can you send the URL?

CodingFiend · October 4, 2020, 7:58am

Thanks for correcting my mistake i had the wrong day!

— Whoever is happy will make others happy too