Pep & Nom

thought-experiments in language

When the idea of nom* occurred to me, it seemed like I was conducting a thought experiment, and the idea seemed like something that had been given rather than something that I had invented. Also, once this idea existed, it felt like I was a sort of custodian of it. All this sounds quite pretentious, but nevertheless I feel the concept of thought-experiments is important, because it relates to certain types of truth that you don't need any actual evidence to confirm.

At the very beginning, while walking in the countryside, I thought of nom* as a language that could implement itself, and I also thought immediately of a better, higher-level grammar language that I could implement 'on-top-of' nom*. About a year ago, I wrote a nom* script called /eg/toybnf.pss which was a first attempt at this higher-level language. Last night, I stayed up late and created the script /eg/syntagma.pss, which is a pretty decent start to implementing this higher-level parsing language.

a syntagma example


   literal: ';'|':' ;
   [:alpha:]+; {
     colour: 'blue'|'green'|'red'|'orange';   
     match 'Blue' { print 'colour names are lower case\n'; exit 1; }
     # default token name
     word: *;
   }
   shape: 'square'|'circle'|'triangle';   
   match [:space:] { delete; }
   # space: [:space:]+;
   # parsing phase
   object = colour shape ';' ;
   
   # delete the space token
   {} = space ;

The development of this language proceeded very quickly: in the last hour I added many new syntactical forms. I feel that this language has the potential to be more expressive about parsing than the nom* language itself. Or perhaps it is more compatible with how people think about parsing. In any case, it uses the machine://pep engine in order to render its ideas.

The syntagma language still lacks 'composition' rules, meaning a syntax for composing new attributes from old ones. In the YACC tool, this syntax looks like this:

 { $$ = $1/$2; }

The meaning of this cryptogram is that the attribute of the new parse token will be the value of the 1st parse token divided by the 2nd. However, pep, nom* and syntagma* are all text-register based machines, so the composition rules will be text based and will look like this:

a syntagma parse rule with text composition of attributes


    object = colour shape {
      #1 := "$2 of colour $1"; 
    }

The hash (#) distinguishes attributes of tokens on the left-hand side of the parse rule from those on the right-hand side. The $n variables refer to the attributes of the corresponding parse tokens on the right-hand side of the parse rule. So, in this case, $2 is the attribute of the 'shape' token and $1 the attribute of the 'colour' token.
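To make the substitution idea concrete, here is a rough sketch in Python (not in syntagma, and not a description of how pep actually does it). The function name compose and the list rhs_attributes are hypothetical names invented for this illustration. The sketch only assumes that each $n in the template is replaced by the text attribute of the n-th token on the right-hand side of the rule.

```python
import re

def compose(template, attributes):
    """Replace each $n in template with the attribute of the n-th
    right-hand-side token (1-based, as in the rule above).
    Hypothetical helper, for illustration only."""
    def substitute(match):
        index = int(match.group(1))
        return attributes[index - 1]
    return re.sub(r"\$(\d+)", substitute, template)

# Assumed attributes of the 'colour' and 'shape' tokens, in rule order.
rhs_attributes = ["blue", "circle"]

# The attribute of the new 'object' token is built purely from text.
object_attribute = compose("$2 of colour $1", rhs_attributes)
print(object_attribute)
```

With the attributes 'blue' and 'circle', the new attribute would be the text 'circle of colour blue'. No calculation takes place, only rearrangement of text.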

My idea of using text-composition (rather than calculation) corresponds to the concept that a compiler is just a language-to-language map.