How to define syntax for a binary language

Do you know of a metalanguage that can be used to define a binary language? By a binary language I mean one whose alphabet is binary, that is, it contains only two characters. Usually, we use metalanguages (e.g. BNF, EBNF, ABNF) to define a new language over an alphanumeric alphabet. Characters from this alphabet are represented as bitstreams according to a selected encoding, e.g. ASCII or UTF-8. As a result, the same text can be represented by different bitstreams.
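To make the encoding dependence concrete, here is a minimal Python sketch. The helper name to_bits and the choice of the character "é" with the UTF-8 and Latin-1 encodings are only illustrative assumptions, not part of any of the metalanguages mentioned above.

```python
def to_bits(data: bytes) -> str:
    """Render each byte as 8 binary digits, separated by spaces (hypothetical helper)."""
    return " ".join(f"{byte:08b}" for byte in data)


# The same one-character text, U+00E9 ("e" with an acute accent).
text = "\u00e9"

# Two encodings chosen for illustration; they yield different bitstreams.
utf8_bits = to_bits(text.encode("utf-8"))
latin1_bits = to_bits(text.encode("latin-1"))

print("UTF-8:  ", utf8_bits)
print("Latin-1:", latin1_bits)
```

A grammar written in terms of characters says nothing about which of these two bitstreams is meant, which is why the encoding has to be fixed before such a grammar can describe bits.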

The problem I face is that sometimes the encoding is undefined, so we cannot use the metalanguages mentioned above to describe the syntax of bitstreams.

ABNF contains some features that may be applied to mitigate the problem, but I am not sure whether this approach is the only one possible.

You can find an example at NetworkMessage.abnf

Any advice is welcome.

Mariusz

“Bird” is a DSL for defining binary formats. It is open source but was developed in a commercial context. It can generate parsers as well as anonymizers. GitHub - SWAT-engineering/bird
