Friday, November 17, 2017

Retro-programming: an High-Level Byte-Code for 8-bit CPUs

This idea has been percolating around in my hind-brain for a while now, so I thought I'd set it down on paper.

The Problem

Imagine you are back in the 1980s, programming one of the wonderfully primitive home micro-computers that has just arrived on the scene.  Say, a ZX Spectrum, sporting a mere 48 Kbytes RAM and a 3.5 MHz Z80 CPU (alternatively, a comparable 1 MHz 6502-based Commodore 64).  These CPUs are aggressively 8-bit in nature -- yes, the Z80 has some 16-bit support, but it isn't great.  Worse, these CPUs are non-orthogonal: every register seems to have its own special rules.  What would have been nice was some kind of easy-to-implement byte-code interpreter which would be comfortable to program in and for which it would be easy to write a compiler in BASIC (all home micro-computers booted directly to BASIC in those days).  This virtual machine would, of course, run more slowly than hand-crafted assembly language, but many times faster than the interpreted BASICs of the day.  Having a standardised byte-code would mean you could run the same program on any machine on which you had an implementation of the byte-code interpreter.

[Note: these thoughts are partially inspired by Wozniak's SWEET16, although that is a much lower-level VM than what I have in mind.]

One thing I have aimed for is a rich set of addressing modes, important for raising the level of abstraction and for reasonable code density.  Most 8-bit VMs that I've seen, like SWEET16, provide a rather low-level virtual machine typically requiring several instructions to do anything useful.  Ugh!

Outline of the Virtual Machine

The proposed VM works in 16-bit words.

A variable is any address in RAM.

Op-codes take 0, 1, or 2 arguments (if op-codes take a third argument which is a fixed address to branch to).  Typically, the first argument will be an assignment target.

There are two "registers", X and Y, to hold the arguments for op-codes.

The registers are set via the following instructions (the explanation uses C syntax for the semantics, recall that & in C is the "address-of" operator).

  • Xk -- load X with a constant; X := &k.
  • Xv -- load X with a variable; X := &v.
  • X_k -- subscript X by a constant; X := &(X[k]).
  • X_v -- subscript X by a variable; X := &(X[v]).
  • X@ -- indirection through X (equivalent to X_k 0); X := *X.
and similarly for the Y register.

Together, these allow for quite rich argument addressing.  For example, loading register X with x.i.foo.bar (or, in C syntax, x[i]->foo->bar) would be expressed in the VM as

  Xv x
  X_v i
  X_k foo
  X_k bar


In the VM, SP is a global stack-pointer variable used by the push, pop, call, and ret op-codes.

Op-codes:
  • X := Y; X += Y; X -= Y; X *= Y; X /= Y; X &= Y; X |= Y; X ^= Y; etc.
  • inc X; dec X
  • push X -- equivalent to SP -= 2; @SP := X.
  • pop X -- equivalent to X := @SP; SP += 2.
  • jp X -- jump to address X.
  • call X -- equivalent to push IP; jp X where IP is the address of the next op-code.
  • ret -- equivalent to pop IP.
  • poke X, Y -- write the least significant byte of to address X.
  • X :=peek Y -- assign the byte at address Y to X.
  • if X = Y A; if X != Y A; if X < Y A;etc. -- branch to address A if the test fails (i.e., unless X = Y; X != Y; X < Y; etc.).
  • And anything else that might be useful.
The odd inversion of the if op-codes is to support some sensible high-level syntax, such as

if q != 0 { p /= q }

which would have the obvious translation.

Example Code: Searching a Binary Tree

I'm making up some high-level syntax here, fill in any gaps using your imagination.

// Symbols for the field offsets in our binary tree vertices.
Value .equ 0
Left .equ 1
Right .equ 2

// We assume t is the root of the binary tree and we are looking for the vertex containing x.
loop {
  if t = 0 { ret } // We hit null, we failed to find x.
  t := t.Value
  if t = x { ret }
  if t < x { t := t.Left } else { t := t.Right }
}

That, I claim, is fairly neat.  The equivalent Z80 or 6502 assembly code would be far less transparent.

The above would be expanded in a simple "assembler" into something like this (where  X or Y is syntactically unchanged from its previous use, no op-codes need be omitted for set-up):

_1:
  Xv t; Yk 0; if= _2
    ret
_2:
  Yv t; Y_k Value; :=
  Yv x; if= _3
    ret
_3:
  if< _4
    Yv t; Y_k Left; :=
    Xk _5; jp
_4:
    Yv t; Y_k Right; :=
_5:
  Xk _1; jp

The Byte-code and Interpreter

Now, as an ex-Sinclair kiddie, I'm used to the Z80, so I'm going to use that to inform my idea of how the byte code might be expressed.

Rather than interpreting byte-codes, I'm simply going to propose a sequence of [op code address][arg] or just [op code address] 16-bit values as my byte code.  That is, every op-code is represented by the address of the machine code implementing it.

For example:

Xk:
  ld (X), SP: pop HL: ret    ; Because the Z80 SP now holds the address of the constant.

Xv:
  pop HL: ld (X), HL: ret

X_k:
  ld HL, (X): pop DE; add HL, DE: add HL, DE: ld (X), HL: ret

':=':
  ld HL, (Y): ld DE, (X): ldi: ldi: ret

'+=':
  ld HL, (X): ld DE, (Y)
  ld A, (DE): inc DE: add (HL): ld (HL), A: inc HL
  ld A, (DE): ld B, (HL): adc A, B: ld (HL), A    ; Ain't the Z80 horrible?
  ret

By pointing the Z80 stack-pointer at the byte-code stream, I can start executing it just by issuing the ret assembly instruction (which pops the next address off the stack and jumps to it).

Going by this, each register set-up op-code and argument takes 4 bytes, each if op-code and jump address takes 4 bytes, and every other op-code takes 2 bytes.

Under this scheme, the above binary tree search function would take 140 bytes.  This is comparable to the equivalent Z80 machine code.  It would run fairly quickly, the "interpretation" overhead being about as small as I can think of making it.

A Quick Comparison with SWEET16

This article on SWEET16 provides an 18-byte SWEET16 memory copy routine.  How would our putative VM do?

loop {
  if n = 0 { ret }
  poke dest, src
  inc dest
  inc src
}

which would turn into the following byte code:

_1:
  Xv n; Yk 0; if= _2
    ret
_2:
  Xv dest; Yv src; poke
  inc
  Xv src; inc
  jp _1

which, by my reckoning, comes to 36 bytes, twice what SWEET16 requires.  Not such a big deal given how much more comfortable the new VM feels to program in than SWEET16.

Closing Thoughts

I wish I'd thought of this back in the early 1980s!  While coding everything in assembly language was fun, it was error prone, and the weirdness of the Z80 turned every little design decision into a mental challenge.  I think this VM (and accompanying compiler) would have been possible to write in BASIC and a handful of assembler in a few evenings and that it would been a great tool for bootstrapping other projects.

(Later in life I ended up spending fifteen years in programming language design and compiler research...)


Tuesday, August 29, 2017