GitHub Source

Let us use ANSI-C (C90) to compile (or transpile, for the pedantic) a target language to an intermediate ANSI-C file.

The Target Language

Our target language, dubbed switch, will be reminiscent of the B Programming Language, supporting a single type int, pointers thereto, operators doubly thereof, basic control flow and looping, and recursion, but of course, barring function pointers - more on this later. switch files use the .sw file type, require an entry point main, and cannot import other switch files.

For the purpose of this post, a switch program can wholistically be exemplified with:

main.sw

int fun()
{
    ret 3;
}

int main()
{
    int x = 1 + fun();
    while(x != 1)
    {
        x = x - 1;
    }
    if(x == 1)
    {
        ret 0;
    }
    ret 1;
}

This example provides basic operators, a function call with a return, variables, loops, and conditional branching. Sparing the complexities of recursive descent parsing, lexing, and BNF, this post aims to share the workings of switch’s transpiled if-statements, while loops, and function calls / returns.

The If Statement

Our standard if-statement:

if.sw

int main()
{
    if(1)
    {
        2;
    }
}

This can be realized in transpiled ANSI-C as:

if.c

#define INT(var) vs[sp++] = var
#define BRZ(label) if(vs[--sp] == 0) goto label
#define BRA(label) goto label
#define POP() --sp
int main(void)
{
    int vs[128] = { 0 };
    register int sp = 0;
    goto main;
main:
    INT(1);
    BRZ(L0);
    INT(2);
    POP();
    BRA(L1);
L0:
L1:
    return 0;
}

Our transpiled ANSI-C enters main and sets up a variable stack vs and a stack pointer sp to track variable pushes and pops to the variable stack. The register keyword prepended to sp serves no purpose but to document the modeling of the stack machine. A goto main statement acts as a reset vector, placing execution at the goto label main. An integer of 1 is loaded, and branches to L0 should the integer be zero, otherwise, the next instruction executes, eventually branching to L1, which signifies the end of the program. Integer 1 is popped by the BRZ instruction.

Within the body of the if-statement lies an expression without assignment, which loads integer 2, but subsequently pops the integer from the stack. It serves only to have some form of expression within the block of the if-statement to pretty print the transpile. Empty blocks are permitted.

The While Loop

while.c

int main()
{
    while(1)
    {
        2;
    }
}

Our while loop uses the same inner workings as the if statement. Integer 1 is pushed to the stack, and branches to L1 (the end of the program) if it is zero. The loop pushes and pops integer 2 for the exemplary, and a branch to L0 repeats the loop. In all instances, integer 1 is popped form the stack with the BRZ instruction.

while.c

#define INT(var) vs[sp++] = var
#define BRZ(label) if(vs[--sp] == 0) goto label
#define BRA(label) goto label
#define POP() --sp
int main(void)
{
    int vs[128] = { 0 };
    register int sp = 0;
    goto main;
main:
L0:
    INT(1);
    BRZ(L1);
    INT(2);
    POP();
    BRA(L0);
L1:
    return 0;
}

The Function Call

func.sw

int func()
{
    ret 9;
}

int main()
{
    func();
}

The function call introduces stacks fs and bs which track the function and base pointer stacks, respectively. Additionally, registers fp, fa, and rr, are introduced, where fp tracks pushes and pops to the function and base pointer stacks, fa that stores the pop of fs to temporarily obtain a return function address, and rr to temporarily store the return value of a function. switch functions do not support void returns. New instructions, separate to our if and while examples, are SET, CAL, and RET. A keen eye notices the intermingling of a switch statement with gotos within func.c.

SET pushes the current stack frame’s stack pointer value sp to bs at fp. A base pointer allows called functions to accurately reference local variables according to their zero offset at compile time. This post will not cover the parsing techniques for managing local variables, their stack pointer values, lvalues, and rvalues, so bs can effectively be ignored.

CAL stores the return address in fs[fp], which is the current source line number, and goes to the call goto label after incrementing fp. The return address is a case statement with the same line number.

RET stores the top of the variable stack in the return register rr, decrements the function pointer fp, reloads the previous sp and fa register values from their respective stacks, and places the return register rr on top of the variable stack vs. The return function address, now residing in fa, is jumped to by the switch statement after jumping to the begin label.

As the program concludes, a POP() instruction after the CAL() instruction ensures the stack is cleaned up of integer 9.

func.c

#define POP() --sp
#define INT(var) vs[sp++] = var

#define SET()          \
    bs[fp] = sp

#define CAL(name)      \
    fs[fp] = __LINE__; \
    fp++;              \
    goto name;         \
    case __LINE__:

#define RET()          \
    rr = vs[sp - 1];   \
    --fp;              \
    sp = bs[fp];       \
    fa = fs[fp];       \
    vs[sp] = rr;       \
    sp++;              \
    goto begin

int main(void) {
    int fs[32] = { 0 };
    int bs[32] = { 0 };
    int vs[128] = { 0 };
    register int fp = 1;
    register int fa = 0;
    register int sp = 0;
    register int rr = 0;
    goto main;
begin:
    switch(fa)
    {
func:
    INT(9);
    RET();
main:
    SET();
    CAL(func);
    POP();
    }
    return 0;
}

Outro

Combine the if-statement, the while loop, and the function call syntaxing, and we have a working language (barring pointers, operators, and everything else that makes a language expressive, of course). Furthermore, ANSI-C proves to serve as a highly portable stack machine (with a bit of creativity), which makes a perfect target for compilation (or transpilation if we are being pedantic). I have a fully functional switch compiler / transpiler that you can find on my GitHub page. There are more switch examples there to be found, including string usage, and integer printing (I regulated myself to only having access to libc’s putchar, which is accessed through the $ operator).

And alas, switch does not support function pointers without a GNU extension that allows the address of labels to be taken as values, for example:

label:
    0;
    ...
    void* ptr = &&label;

Other than that, I leave you with qsort.sw, which is fully transpilable with switch, and I leave you with the post processed intermediate ANSI C file (gcc -E qsort.c).

qsort.sw

int swap(int* a, int x, int y)
{
    int temp = *(a + x);
    *(a + x) = *(a + y);
    *(a + y) = temp;
}

int qsort(int* a, int l, int r)
{
    int i = 0;
    int last = 0;
    if(l >= r)
    {
        ret 0;
    }
    swap(a, l, (l + r) / 2);
    last = l;
    i = l + 1;
    while(i <= r)
    {
        if(*(a + i) < *(a + l))
        {
            last = last + 1;
            swap(a, last, i);
        }
        i = i + 1;
    }
    swap(a, l, last);
    qsort(a, l, last - 1);
    qsort(a, last + 1, r);
}

int main()
{
    int a[] = { 9, 43, 1, 5, 32 };
    qsort(a, 0, @a - 1); # The `@` operator returns the size of the array.
}

May you all have a good day.

gcc -E qsort.c

int main(void) {
 int fs[4096] = { 0 };
 int bs[4096] = { 0 };
 int vs[65536] = { 0 };
 register int fp = 1;
 register int fa = 0;
 register int sp = 0;
 register int rr = 0;
 goto main;
begin:
 if(fp == 0)
  return vs[sp - 1];
 switch(fa)
 {
swap:
 vs[sp++] = 0 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 1 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp - 2] += vs[sp - 1]; --sp;
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 0 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 1 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp - 2] += vs[sp - 1]; --sp;
 vs[sp++] = 0 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 2 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp - 2] += vs[sp - 1]; --sp;
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[vs[sp - 2]] = vs[sp - 1];;
 vs[sp - 2] = vs[sp - 1]; --sp;;
 --sp;
 vs[sp++] = 0 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 2 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp - 2] += vs[sp - 1]; --sp;
 vs[sp++] = 3 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[vs[sp - 2]] = vs[sp - 1];;
 vs[sp - 2] = vs[sp - 1]; --sp;;
 --sp;
 --sp;
 --sp;
 --sp;
 --sp;
 vs[sp++] = 0;
 rr = vs[sp - 1];--fp;sp = bs[fp];fa = fs[fp];vs[sp] = rr;sp++;goto begin;
qsort:
 vs[sp++] = 0;
 vs[sp++] = 0;
 vs[sp++] = 1 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 2 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp - 2] = vs[sp - 2] >= vs[sp - 1]; --sp;
 if(vs[--sp] == 0) goto L0;
 vs[sp++] = 0;
 rr = vs[sp - 1];--fp;sp = bs[fp];fa = fs[fp];vs[sp] = rr;sp++;goto begin;
 goto L1;
L0:
L1:
 bs[fp] = sp;
 vs[sp++] = 0 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 1 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 1 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 2 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp - 2] += vs[sp - 1]; --sp;
 vs[sp++] = 2;
 vs[sp - 2] /= vs[sp - 1]; --sp;
 fs[fp] = 110;fp++;goto swap;case 110:;
 --sp;
 vs[sp++] = 4 + bs[fp - 1];
 vs[sp++] = 1 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[vs[sp - 2]] = vs[sp - 1];;
 vs[sp - 2] = vs[sp - 1]; --sp;;
 --sp;
 vs[sp++] = 3 + bs[fp - 1];
 vs[sp++] = 1 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 1;
 vs[sp - 2] += vs[sp - 1]; --sp;
 vs[vs[sp - 2]] = vs[sp - 1];;
 vs[sp - 2] = vs[sp - 1]; --sp;;
 --sp;
L2:
 vs[sp++] = 3 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 2 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp - 2] = vs[sp - 2] <= vs[sp - 1]; --sp;
 if(vs[--sp] == 0) goto L3;
 vs[sp++] = 0 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 3 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp - 2] += vs[sp - 1]; --sp;
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 0 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 1 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp - 2] += vs[sp - 1]; --sp;
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp - 2] = vs[sp - 2] < vs[sp - 1]; --sp;
 if(vs[--sp] == 0) goto L4;
 vs[sp++] = 4 + bs[fp - 1];
 vs[sp++] = 4 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 1;
 vs[sp - 2] += vs[sp - 1]; --sp;
 vs[vs[sp - 2]] = vs[sp - 1];;
 vs[sp - 2] = vs[sp - 1]; --sp;;
 --sp;
 bs[fp] = sp;
 vs[sp++] = 0 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 4 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 3 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 fs[fp] = 162;fp++;goto swap;case 162:;
 --sp;
 goto L5;
L4:
L5:
 vs[sp++] = 3 + bs[fp - 1];
 vs[sp++] = 3 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 1;
 vs[sp - 2] += vs[sp - 1]; --sp;
 vs[vs[sp - 2]] = vs[sp - 1];;
 vs[sp - 2] = vs[sp - 1]; --sp;;
 --sp;
 goto L2;
L3:
 bs[fp] = sp;
 vs[sp++] = 0 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 1 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 4 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 fs[fp] = 184;fp++;goto swap;case 184:;
 --sp;
 bs[fp] = sp;
 vs[sp++] = 0 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 1 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 4 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 1;
 vs[sp - 2] -= vs[sp - 1]; --sp;
 fs[fp] = 195;fp++;goto qsort;case 195:;
 --sp;
 bs[fp] = sp;
 vs[sp++] = 0 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 4 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 vs[sp++] = 1;
 vs[sp - 2] += vs[sp - 1]; --sp;
 vs[sp++] = 2 + bs[fp - 1];
 vs[sp - 1] = vs[vs[sp - 1]];
 fs[fp] = 206;fp++;goto qsort;case 206:;
 --sp;
 --sp;
 --sp;
 --sp;
 --sp;
 --sp;
 vs[sp++] = 0;
 rr = vs[sp - 1];--fp;sp = bs[fp];fa = fs[fp];vs[sp] = rr;sp++;goto begin;
main:
 vs[sp++] = 9;
 vs[sp++] = 43;
 vs[sp++] = 1;
 vs[sp++] = 5;
 vs[sp++] = 32;
 bs[fp] = sp;
 vs[sp++] = 0 + bs[fp - 1];
 vs[sp++] = 0;
 vs[sp++] = 0 + bs[fp - 1];
 --sp;
 vs[sp++] = 5;
 vs[sp++] = 1;
 vs[sp - 2] -= vs[sp - 1]; --sp;
 fs[fp] = 229;fp++;goto qsort;case 229:;
 --sp;
 --sp;
 vs[sp++] = 0;
 rr = vs[sp - 1];--fp;sp = bs[fp];fa = fs[fp];vs[sp] = rr;sp++;goto begin;
 }
 return 0;
}