How to describe DSLs?

I have read this discussion carefully because a similar one is under way in the context of What is Information Model vs Semantic-data. To answer the question, we must first agree on a definition of language itself. Then we can add what the term domain-specific means.

I propose to group languages as described in Languages grouping. Based on this discussion, let me assume that we are talking about languages used to define information (knowledge) in a form that can be read by a computer program. In other words, a computer is the consumer of the outcome. On the other hand, the language must also be usable by a human as a design tool, i.e. readable and reusable by a designer.

Under this assumption let me propose the following definition:

A language consists of three sets:

  1. alphabet - the set of characters we can use
  2. syntax - the set of rules we apply to check that a concatenation of characters (a text, i.e. a character stream) is correct
  3. semantics - the set of rules we use to associate a correct text with its meaning
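
To make the three sets concrete, here is a minimal sketch in Python. The toy language (sums of unsigned integers such as "12+7+3") and the names ALPHABET, SYNTAX and meaning are assumptions made up for illustration only; they do not come from any standard or from this discussion.

```python
import re

# Hypothetical toy language: sums of unsigned integers, e.g. "12+7+3".

# 1. alphabet: the characters we are allowed to use
ALPHABET = set("0123456789+")

# 2. syntax: the rule deciding which concatenations of characters are correct
SYNTAX = re.compile(r"[0-9]+(\+[0-9]+)*")


def meaning(text: str) -> int:
    """3. semantics: associate a correct text with its meaning (here, a number)."""
    if not set(text) <= ALPHABET:
        raise ValueError("character outside the alphabet")
    if not SYNTAX.fullmatch(text):
        raise ValueError("text violates the syntax rules")
    return sum(int(term) for term in text.split("+"))
```

With these definitions, "12+7+3" is a correct text with a meaning, "12++7" uses only allowed characters but breaks the syntax, and "12+a" already fails at the alphabet level.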

Ad 1. Computers always use the binary alphabet (only two characters are allowed). Designers (humans) prefer an alphabet derived from a natural language. The bridge between these two environments is an encoding - a set of rules we can use to convert text into a binary stream and back in a mutually unambiguous manner. If a graphical language is considered, there must be a compiler, because a graphical alphabet is generally useless for automated data processing by computers.
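
The round trip between the human alphabet and the binary one can be sketched as follows. UTF-8 is used here only as one example of an encoding, and the helper names to_bits and from_bits are hypothetical.

```python
def to_bits(text: str) -> str:
    """Encode text as a stream over the binary alphabet {'0', '1'} using UTF-8."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Decode a binary stream produced by to_bits back into the original text."""
    octets = (bits[i:i + 8] for i in range(0, len(bits), 8))
    return bytes(int(octet, 2) for octet in octets).decode("utf-8")


# The conversion is mutually unambiguous: decoding the encoded text
# returns exactly the text we started with.
sample = "DSL: język"
assert from_bits(to_bits(sample)) == sample
```

Both functions together play the role of the encoding: the designer works with the text, the computer with the bit stream, and neither side loses information.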

Ad 2. Because an encoding gives us a mutually unambiguous relationship between the binary and text representations, there is no need to define the syntax and semantics separately for the computer and human environments. In my opinion, this is the main reason why JSON, XML, YAML, XAML, etc. are so popular. UML is an example of a graphical language, but to exchange diagrams and track changes, a domain-specific language based on XML is in daily use. To render the diagrams, an application must provide a dedicated graphical user interface (GUI).

Ad 3. In the case of programming languages, the semantic rules are strictly observed, and the association between a correct text (clauses) and its meaning is usually more or less formally defined. The metalanguage (BNF, EBNF, ABNF, or a custom one) may vary, but we can say that the rules are clearly stated - the semantic rules are defined in the context of the syntax. For final validation, we can use a compiler. For the compiler, the source text is just input data to be processed. I like to ask: when does a text become a program? The answer: when the compiler doesn't complain.
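
The "compiler doesn't complain" test can be tried directly in Python, whose built-in compile() function turns source text into a code object or raises an error. This is only a sketch: the helper name is_program is hypothetical, and Python stands in for whatever compiler the language at hand provides.

```python
def is_program(text: str) -> bool:
    """Return True when Python's compiler accepts the text without complaint."""
    try:
        # For the compiler, the source text is just input data to be processed.
        compile(text, "<text>", "exec")
    except (SyntaxError, ValueError):
        return False
    return True


print(is_program("total = 1 + 2"))               # True
print(is_program("total = = 1"))                 # False
print(is_program("total = undefined_name + 1"))  # True
```

The last call shows the limit of this check in Python specifically: names are resolved at run time, so the compiler accepts a text that would still fail when executed. A compiler for a statically checked language would reject more texts at this stage.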

The real challenge with this definition arises when the semantics (the knowledge domain) is defined long before the language used to represent it. Sometimes we must also deal with a knowledge domain that is internally inconsistent. Maybe in that case we should use the term notation instead of language. I am not sure - waiting for your proposals.