Background and Motivation

Extensibility of Languages

Natural languages such as English or German possess a fixed grammar or syntax, but an extensible vocabulary: One can always introduce new words or compound terms, define their meaning in terms of other, already known concepts, and use them afterwards. Very similarly, programming languages have a fixed grammar or syntax – and if you violate it, the compiler will refuse to understand you – and an extensible vocabulary: One can freely introduce new types, variables, and functions, define their meaning, and use them afterwards. So an extensible vocabulary is crucial for a programming language, but an extensible syntax would be useful as well.

This can be demonstrated by a comparison with mathematical notation: Even though functions play a fundamental role in mathematics, mathematicians do not use function syntax for each and every thing. Instead, they have developed many different syntactic forms that express things more readably. For example, they write |n|!² instead of square(fac(abs(n))) and ∀ ε > 0 ∃ δ > 0 : |x − a| < ε ⇒ |f(x) − f(a)| < δ instead of forall(gt(eps, 0), exists(gt(delta, 0), implies(lt(abs(minus(x, a)), eps), lt(abs(minus(f(x), f(a))), delta)))), which is almost incomprehensible.

However, programming languages do exactly this: They force their users into a syntactical corset of function or method calls. Whenever a piece of code is to be encapsulated in order to abstract from it or to reuse it multiple times, it has to be put into a function or method and must be invoked using the respective syntax. For example, Java programmers have to write A.negate().multiply(B) instead of just −A B to denote operations on matrices or BigInteger values. In C++, this could be written more conveniently by employing operator overloading, but as the next example demonstrates, C++ is not really much better: To remove all negative values from a vector s of integers, one has to write a rather complicated combination of standard library function calls: s.erase(remove_if(s.begin(), s.end(), bind2nd(less<int>(), 0)), s.end()). If we could simply write remove x from s where x < 0, then even non-computer scientists would be able to understand it.
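To make the C++ example concrete, here is a minimal sketch. It assumes the values live in a std::vector<int> (std::remove_if has to overwrite elements, which the iterators of a std::set do not allow) and it uses a lambda in place of bind2nd, which was removed in C++17. The function name remove_negatives is hypothetical and introduced only for illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: erase every negative value from v in place.
// std::remove_if only moves the kept elements to the front and returns
// an iterator to the new logical end; erase() then drops the leftover tail.
void remove_negatives(std::vector<int>& v) {
    v.erase(std::remove_if(v.begin(), v.end(),
                           [](int x) { return x < 0; }),
            v.end());
}
```

Even in this compact form, the call needs an iterator pair, a predicate object, and a second step to shrink the container, none of which appear in the phrasing remove x from s where x < 0.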

To give another example, almost every programming language or standard library provides some kind of substring function or method that is usually called in a way like s.substring(3, 7). But could you immediately tell the precise meaning of this call without consulting the documentation (and without knowing the actual language)? Does it mean to take the substring of s starting at position 3 and ending at (or before?) position 7? Or does it mean to start at position 3 and take up to 7 characters? Both conventions are quite common in different languages. If we could use a more problem-specific syntax such as s[3..7], the intended meaning would be much more obvious even without explanation – at least if the variations s[3..7), s(3..7], and s(3..7), with different combinations of square and round brackets to indicate the inclusion or exclusion of either border element according to mathematical interval notation, are provided as well.
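The ambiguity can be made concrete with a small sketch. In C++, std::string::substr(pos, count) follows the "start position and length" reading, whereas Java's String.substring(beginIndex, endIndex), for example, follows the "start and exclusive end" reading. The two function names below are hypothetical and exist only to put both readings side by side.

```cpp
#include <cstddef>
#include <string>

// Reading 1 (what std::string::substr does): start at pos and take
// up to count characters.
std::string substring_by_length(const std::string& s,
                                std::size_t pos, std::size_t count) {
    return s.substr(pos, count);
}

// Reading 2 (hypothetical helper): start at begin and stop before end,
// i.e. the half-open interval s[begin..end).
// Assumes begin <= end <= s.size().
std::string substring_between(const std::string& s,
                              std::size_t begin, std::size_t end) {
    return s.substr(begin, end - begin);
}
```

For s = "0123456789ABC", substring_by_length(s, 3, 7) yields "3456789", while substring_between(s, 3, 7) yields "3456": the same two arguments, two different results, and nothing in the call syntax tells them apart.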

Extensible Programming Languages

So, if we agree that appropriate syntax is useful, we could go and tell our language designers or standards committees all the nice syntactic forms that we can think of and ask them to include them in the next version of Java, C++, or whatever language you like. But, of course, it is basically impossible to predefine in a language all the syntax that any programmer in the world might ever consider useful. Therefore, it is much more appropriate to allow programmers to define their own syntax on demand, similar to the way mathematicians simply extend their notation when they see fit.

Of course, this idea of a syntactically extensible programming language is not new at all. Over the past 50 years, many languages have been developed that provide a certain degree of syntactic flexibility and extensibility. Nevertheless, it seems that the idea of actively employing syntactic extensibility has not found its way into everyday programming. Perhaps this is because almost all of these languages are split into two separate parts: There is a normal programming language part that is used to write normal programs, and there is a separate machinery, such as a macro system, that must be used to perform syntactic extensions. So, writing syntax extensions is quite different from – and usually more complicated than – writing normal programs. Furthermore, most of these languages still impose more or less significant restrictions on syntactic flexibility, i.e., they only allow the syntactical corset to be stretched, but not to be taken off completely.

Therefore, to get a really attractive extensible programming language, extensibility should not be provided as a separate add-on, but rather as the very heart of the language, i.e., the entire language design should be based on the idea of extensibility. Furthermore, the syntactic flexibility should be virtually unlimited. For example, it should be possible not only to extend the syntax of the language, but also to change it completely if desired.

MOSTflexiPL is such a language. As its logo illustrates, it can be stretched and squeezed, and things can be turned around or even upside down. In other words, one can do amazing or even crazy things with this language, things that are completely impossible in other languages.

Christian Heinlein, 2016-10-12