More compiler writing

The parser/code generator is already working very nicely. I’ve written a little disassembler to examine the output of the code generation phase – it has already helped me find a couple of bugs, and it’s crying for a peephole optimizer (although I’ll leave this for later). Here’s some sample input to the compiler and the output generated:

NGS source:

const
{
  NORMAL
  INPUT
};

function ProcessChar( Char )
{
  if (this.m_Mode == NORMAL)
  {
    var pfn = this.m_MapNormal.Get(Char);
    if (!pfn)
    {
      pfn = this.m_MotionMap.Get(Char);
      if (!pfn)
        ::Beep();
    }
  }
  else if (this.m_Mode == INPUT)
  {
    //TODO: implement insert
    //::Insert(Char);
    ::MessageBox("Insert", Char);
  }
}

Compiler output:

(002) ProcessChar (this, Char)
locals: this, Char, pfn
13           ENTER
02 00 00     PUSH_LOCVAR    0000 ; this
06 07 00     READ_MEMBERVAR 0007 ; m_Mode
01 00 00     PUSH_CONST     0000 ; 0 (NORMAL)
20           EQUAL
16 40 00     JNCOND         0040 ; L0
02 02 00     PUSH_LOCVAR    0002 ; Char
01 07 00     PUSH_CONST     0007 ; 1 (*unnamed*)
02 00 00     PUSH_LOCVAR    0000 ; this
06 09 00     READ_MEMBERVAR 0009 ; m_MapNormal
06 0A 00     READ_MEMBERVAR 000A ; Get
11           CALL
03 03 00     POP_LOCVAR     0003 ; pfn
02 03 00     PUSH_LOCVAR    0003 ; pfn
1E           LOGIC_NOT
16 23 00     JNCOND         0023 ; L1
02 02 00     PUSH_LOCVAR    0002 ; Char
01 08 00     PUSH_CONST     0008 ; 1 (*unnamed*)
02 00 00     PUSH_LOCVAR    0000 ; this
06 05 00     READ_MEMBERVAR 0005 ; m_MotionMap
06 0A 00     READ_MEMBERVAR 000A ; Get
11           CALL
03 03 00     POP_LOCVAR     0003 ; pfn
02 03 00     PUSH_LOCVAR    0003 ; pfn
1E           LOGIC_NOT      
16 09 00     JNCOND         0009 ; L1
01 09 00     PUSH_CONST     0009 ; 0 (*unnamed*)
0A 00 00     PUSH_BUILTIN   0000 ; Built-in
11           CALL
0C 01        STACK_DEL      01
14 1C 00 L1: JMP            001C ; L2
02 00 00 L0: PUSH_LOCVAR    0000 ; this
06 07 00     READ_MEMBERVAR 0007 ; m_Mode
01 01 00     PUSH_CONST     0001 ; 1 (INPUT)
20           EQUAL
16 0F 00     JNCOND         000F ; L2
01 0A 00     PUSH_CONST     000A ; Insert (*unnamed*)
02 02 00     PUSH_LOCVAR    0002 ; Char
01 0B 00     PUSH_CONST     000B ; 2 (*unnamed*)
0A 01 00     PUSH_BUILTIN   0001 ; Built-in
11           CALL
0C 01        STACK_DEL      01
09       L2: PUSH_NIL        
12           RET             

Ain’t it beautiful? Compiler writing has some “magic” feeling to it.

I am facing a little design decision: as you see in the source above, references to “this” are explicit within functions. The reason for this is that NGS is, the same as JavaScript, a class-less language. Each object can have its own set of members, be it values or member functions. For this reason, there is no way of telling during the compiling whether an identifier refers to a member of the “this” object. Allowing direct reference to member variables would turn off the detection of any undeclared identifier reference, turning them into accesses to the “this” object. This is bad, because a mis-spelling of a constant or module variable would silently be turned into a member-variable access (even worse given that all functions in NGS have a “this” object).

The way it is now, member accesses are unchecked at compile-time, but non-member accesses are checked and flagged as errors if not found either as constants, module variables, built-ins or local variables (this includes arguments to the current function).

JavaScript does exactly what I described above as “bad”. Any reference is checked not against the “this” object, but against the whole identifer resolution chain (which may include other stuff). In my experience developing in JavaScript (which consists basically in a FireFox extension), all syntax checking is very uncomfortable (I wonder how they do it, as most of the FireFox UI is actually implemented in JavaScript).

One possible solution is to use a single dot, elliding the “this” identifier, as an implicit reference to the this object. It reminds me of the “with” statement in Pascal, which I really liked (and miss in C++ etc). I could even include “with” as a feature in NGS, with “un-with’ed” code referring to “this” and explicit “with” referring to whatever the programmer wants.

I’ll leave this as it is now, and if it proves too cumbersome as I start using NGS to implement some functionality, I’ll get back to it.

By the way, the page is already in Google (although one needs to look explicitly for NGEDIT, of course), and no spam as arrived in my inbox as a result of publishing the e-mail address. I haven’t received any non-spam e-mail, either. A friend told me he’d dropped me a line, but it didn’t arrive. I checked sending one myself through the link and it worked. I hope it will work.

One Response to “More compiler writing”

  1. The growing pains of NGEDIT » Blog Archive » First anniversary Says:

    […] Today is the 1st anniversary of the conception of ViEmu. That is, this very day last year, I came up with the idea of developing a vi/vim emulator for Visual Studio. I had been working for months in the kodumi text editor (back then it was just ngedit), and the last stretch had involved developing a scripting language compiler and VM, and implementing a vi/vim emulation module in this language. […]

Leave a Reply