# Building a Bytecode VM in C#

A bytecode VM sounds like something that belongs inside a language runtime, a game engine, or a browser. But the basic idea is much smaller than it looks. You define a small set of instructions. You store those instructions as data. Then you write a loop that reads each instruction and performs the work it describes.

That’s it.

Once you see the shape of it, interpreters stop feeling like magic. They’re just programs that execute other programs. In this post, we’ll build a tiny stack-based bytecode VM in C#. It won’t be fast, complete, or production-ready. That’s not the point. The point is to make the machinery visible.

## What bytecode is

Source code is written for humans. Bytecode is written for a machine that’s easier to target than a real CPU.

Instead of executing text like this:

```csharp
var result = 10 + 20;
```

a VM might execute instructions closer to this:

```text
LOAD_CONST 10
LOAD_CONST 20
ADD
PRINT
HALT
```

Each line is a tiny operation. `LOAD_CONST` pushes a value onto a stack. `ADD` pops two values, adds them, and pushes the result back. `PRINT` pops the result and writes it out. The VM doesn’t need to understand C# syntax, variables, methods, or types to run this little program. It only needs to understand its own instruction set.

![](https://cdn.hashnode.com/uploads/covers/67c36038c69a4b7143c5fc49/173bd37c-a8b6-4894-89cc-1302806481ad.png align="center")

Real systems are obviously more complicated, but the core idea is the same. A compiler turns a higher-level representation into instructions, and a VM runs those instructions.

## The shape of a tiny VM

Our VM needs bytecode, which is the program it’ll execute. It also needs an instruction pointer to track where it is in that program, and a stack to hold temporary values while instructions are running.

![](https://cdn.hashnode.com/uploads/covers/67c36038c69a4b7143c5fc49/fbf26982-d904-4230-980f-062487903a7d.png align="center")

The loop in that diagram is the heart of the interpreter: read an instruction, advance the instruction pointer, execute the instruction, then repeat.

## Defining the instruction set

We’ll start with a deliberately small instruction set:

```csharp
public enum OpCode
{
    LoadConst,
    Add,
    Subtract,
    Multiply,
    Divide,
    Print,
    Halt
}
```

This is enough to run simple arithmetic programs. The VM will use a stack, so arithmetic instructions don’t need to say where their input values are. They always take values from the top of the stack.

For example, this program calculates `(10 + 20) * 3`:

```text
LOAD_CONST 10
LOAD_CONST 20
ADD
LOAD_CONST 3
MULTIPLY
PRINT
HALT
```

The stack changes as the program runs:

```text
LOAD_CONST 10      [10]
LOAD_CONST 20      [10, 20]
ADD                [30]
LOAD_CONST 3       [30, 3]
MULTIPLY           [90]
PRINT              []
```

A stack machine is a nice place to start because the instruction format stays simple. `ADD` doesn’t need operands because the values are already on the stack.
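If the pop order feels abstract, the same trace can be replayed by hand on a plain `Stack<int>`. This is only a sketch: `StackDemo` and `Evaluate` are names invented for the example, and no instruction set is involved yet.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: evaluates (10 + 20) * 3 by hand, one stack step at a time.
public static class StackDemo
{
    public static int Evaluate()
    {
        var stack = new Stack<int>();

        stack.Push(10);               // LOAD_CONST 10   [10]
        stack.Push(20);               // LOAD_CONST 20   [10, 20]

        var right = stack.Pop();      // ADD: the right operand comes off first
        var left = stack.Pop();       //      then the left operand
        stack.Push(left + right);     //                 [30]

        stack.Push(3);                // LOAD_CONST 3    [30, 3]

        right = stack.Pop();          // MULTIPLY
        left = stack.Pop();
        stack.Push(left * right);     //                 [90]

        return stack.Pop();           // PRINT would pop this value and write it out
    }
}
```

The order of the two pops doesn’t matter for addition or multiplication, but it does for subtraction and division, which is why the left and right operands are kept apart.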

## Representing instructions in C#

We could encode everything as raw integers, but that makes the early version harder to read. Instead, we’ll use a small `Instruction` record:

```csharp
public readonly record struct Instruction(OpCode OpCode, int Operand = 0);
```

Some instructions need an operand. `LoadConst` needs the value to load. Others don’t need one, so the default operand is fine.

Now we can write a program as data:

```csharp
Instruction[] program =
[
    new Instruction(OpCode.LoadConst, 10),
    new Instruction(OpCode.LoadConst, 20),
    new Instruction(OpCode.Add),
    new Instruction(OpCode.LoadConst, 3),
    new Instruction(OpCode.Multiply),
    new Instruction(OpCode.Print),
    new Instruction(OpCode.Halt)
];
```

That array is our bytecode program. It isn’t source code anymore. It’s a list of instructions the VM can execute one by one.
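Because the program is just data, we can also inspect it. As a rough sketch, a hypothetical `Disassembler` helper (not something the VM needs) could turn the array back into a readable listing. It assumes that only `LoadConst` has a meaningful operand, which holds for the instruction set so far. The enum and record are repeated so the snippet compiles on its own.

```csharp
using System;
using System.Linq;

public enum OpCode { LoadConst, Add, Subtract, Multiply, Divide, Print, Halt }

public readonly record struct Instruction(OpCode OpCode, int Operand = 0);

// Hypothetical helper for illustration; the name and output format are assumptions.
public static class Disassembler
{
    // Shows the operand only for instructions that actually use it.
    public static string Format(Instruction instruction) =>
        instruction.OpCode == OpCode.LoadConst
            ? $"{instruction.OpCode} {instruction.Operand}"
            : instruction.OpCode.ToString();

    // Prefixes each instruction with its zero-padded index in the program.
    public static string Disassemble(Instruction[] program) =>
        string.Join(
            Environment.NewLine,
            program.Select((instruction, index) => $"{index:D4}  {Format(instruction)}"));
}
```

Calling `Disassembler.Disassemble(program)` on the array above gives one line per instruction, starting with `0000  LoadConst 10`. The index column becomes useful later, once instructions start referring to each other by position.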

## Writing the interpreter loop

Here’s the first complete version of the VM:

```csharp
public sealed class TinyVm
{
    private readonly Instruction[] _program;
    private readonly Stack<int> _stack = new();

    private int _ip;

    public TinyVm(Instruction[] program)
    {
        _program = program;
    }

    public void Run()
    {
        while (_ip < _program.Length)
        {
            var instruction = _program[_ip];
            _ip++;

            switch (instruction.OpCode)
            {
                case OpCode.LoadConst:
                    _stack.Push(instruction.Operand);
                    break;

                case OpCode.Add:
                {
                    // The right operand was pushed last, so it comes off first.
                    var right = _stack.Pop();
                    var left = _stack.Pop();
                    _stack.Push(left + right);
                    break;
                }

                case OpCode.Subtract:
                {
                    var right = _stack.Pop();
                    var left = _stack.Pop();
                    _stack.Push(left - right);
                    break;
                }

                case OpCode.Multiply:
                {
                    var right = _stack.Pop();
                    var left = _stack.Pop();
                    _stack.Push(left * right);
                    break;
                }

                case OpCode.Divide:
                {
                    var right = _stack.Pop();
                    var left = _stack.Pop();
                    // Integer division: truncates toward zero and throws
                    // DivideByZeroException when right is 0.
                    _stack.Push(left / right);
                    break;
                }

                case OpCode.Print:
                    Console.WriteLine(_stack.Pop());
                    break;

                case OpCode.Halt:
                    return;

                default:
                    throw new InvalidOperationException(
                        $"Unknown opcode: {instruction.OpCode}");
            }
        }
    }
}
```

The interpreter loop is the important part:

```csharp
var instruction = _program[_ip];
_ip++;
```

The VM fetches the current instruction and moves the instruction pointer forward. Then the `switch` executes whatever operation the instruction describes. There’s no magic hiding here. The VM is just a loop over an array.

## Running the VM

Now we can put it together:

```csharp
Instruction[] program =
[
    new Instruction(OpCode.LoadConst, 10),
    new Instruction(OpCode.LoadConst, 20),
    new Instruction(OpCode.Add),
    new Instruction(OpCode.LoadConst, 3),
    new Instruction(OpCode.Multiply),
    new Instruction(OpCode.Print),
    new Instruction(OpCode.Halt)
];

var vm = new TinyVm(program);
vm.Run();
```

The output is:

```text
90
```

We’ve built a tiny executable format and a tiny machine that can run it. It’s small, but it has the same core pieces you’ll find in much larger systems: an instruction stream, an instruction pointer, a stack, and a dispatch loop.

## Adding variables

Arithmetic is fine, but a VM gets more interesting when it can store values.

Let’s add two more instructions:

```csharp
public enum OpCode
{
    LoadConst,
    LoadLocal,
    StoreLocal,
    Add,
    Subtract,
    Multiply,
    Divide,
    Print,
    Halt
}
```

We’ll give the VM a small local variable array:

```csharp
private readonly int[] _locals = new int[16];
```

Then we add support for loading and storing local values:

```csharp
case OpCode.StoreLocal:
    _locals[instruction.Operand] = _stack.Pop();
    break;

case OpCode.LoadLocal:
    _stack.Push(_locals[instruction.Operand]);
    break;
```

Now this program stores a value, loads it later, and uses it in a calculation:

```csharp
Instruction[] program =
[
    new Instruction(OpCode.LoadConst, 42),
    new Instruction(OpCode.StoreLocal, 0),

    new Instruction(OpCode.LoadLocal, 0),
    new Instruction(OpCode.LoadConst, 8),
    new Instruction(OpCode.Add),

    new Instruction(OpCode.Print),
    new Instruction(OpCode.Halt)
];
```

The output is:

```text
50
```

The local slot isn’t a C# variable. It’s just a numbered location in an array. That’s enough for the VM. A compiler targeting this VM could decide that a source-level variable called `total` lives in local slot `0`, while another variable called `count` lives in local slot `1`.

By the time bytecode is produced, the friendly names can disappear.
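To illustrate how names could turn into slot numbers, here is a minimal sketch of the bookkeeping a compiler might do. `LocalSlotAllocator` and `GetOrAdd` are hypothetical names, and the default capacity of 16 is an assumption chosen to match the size of the `_locals` array.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical compiler-side helper: maps source-level names to local slots.
public sealed class LocalSlotAllocator
{
    private readonly Dictionary<string, int> _slots = new();
    private readonly int _capacity;

    public LocalSlotAllocator(int capacity = 16) => _capacity = capacity;

    // Returns the slot already assigned to the name, or assigns the next free one.
    public int GetOrAdd(string name)
    {
        if (_slots.TryGetValue(name, out var slot))
        {
            return slot;
        }

        if (_slots.Count >= _capacity)
        {
            throw new InvalidOperationException($"No free local slot for '{name}'.");
        }

        slot = _slots.Count;
        _slots[name] = slot;
        return slot;
    }
}
```

With this in place, the first call to `GetOrAdd("total")` returns `0`, `GetOrAdd("count")` returns `1`, and asking for `total` again returns `0`. The names only exist in the allocator. The emitted `LoadLocal` and `StoreLocal` instructions carry nothing but the number.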

## Adding jumps

Without jumps, programs can only run from top to bottom. To support conditionals and loops, we need a way to move the instruction pointer.

Let’s add three instructions:

```csharp
public enum OpCode
{
    LoadConst,
    LoadLocal,
    StoreLocal,
    Add,
    Subtract,
    Multiply,
    Divide,
    LessThan,
    Jump,
    JumpIfFalse,
    Print,
    Halt
}
```

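`Jump` always moves the instruction pointer. `JumpIfFalse` pops the value on top of the stack and only moves the instruction pointer when that value is zero. `LessThan` compares two values and pushes `1` for true or `0` for false. For both jump instructions, the operand is the index of the target instruction in the program array.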

Here’s the VM support:

```csharp
case OpCode.LessThan:
{
    var right = _stack.Pop();
    var left = _stack.Pop();
    _stack.Push(left < right ? 1 : 0);
    break;
}

case OpCode.Jump:
    _ip = instruction.Operand;
    break;

case OpCode.JumpIfFalse:
{
    var condition = _stack.Pop();

    if (condition == 0)
    {
        _ip = instruction.Operand;
    }

    break;
}
```

Now we can write a loop. This program prints the numbers `0` to `4`:

```csharp
Instruction[] program =
[
    // i = 0
    new Instruction(OpCode.LoadConst, 0),
    new Instruction(OpCode.StoreLocal, 0),

    // loop start: instruction 2
    new Instruction(OpCode.LoadLocal, 0),
    new Instruction(OpCode.LoadConst, 5),
    new Instruction(OpCode.LessThan),
    new Instruction(OpCode.JumpIfFalse, 13),

    // print i
    new Instruction(OpCode.LoadLocal, 0),
    new Instruction(OpCode.Print),

    // i = i + 1
    new Instruction(OpCode.LoadLocal, 0),
    new Instruction(OpCode.LoadConst, 1),
    new Instruction(OpCode.Add),
    new Instruction(OpCode.StoreLocal, 0),

    // jump back to loop start
    new Instruction(OpCode.Jump, 2),

    new Instruction(OpCode.Halt)
];
```

The output is:

```text
0
1
2
3
4
```

This is the moment the VM starts to feel like a tiny programming language runtime. We’ve got state, comparison, branching, and looping.
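Counting instruction indices by hand, as we did for `13` and `2` in the loop program, gets error-prone quickly. One common way around it is backpatching: emit the jump with a placeholder operand and fill in the target once it is known. The `ProgramBuilder` and `LoopExample` types below are hypothetical helpers sketched for this post, not part of the VM.

```csharp
using System;
using System.Collections.Generic;

public enum OpCode
{
    LoadConst, LoadLocal, StoreLocal, Add, Subtract, Multiply, Divide,
    LessThan, Jump, JumpIfFalse, Print, Halt
}

public readonly record struct Instruction(OpCode OpCode, int Operand = 0);

// Hypothetical helper for building programs without counting indices by hand.
public sealed class ProgramBuilder
{
    private readonly List<Instruction> _code = new();

    // Index the next emitted instruction will get; usable as a jump target.
    public int Position => _code.Count;

    // Appends an instruction and returns its index.
    public int Emit(OpCode opCode, int operand = 0)
    {
        _code.Add(new Instruction(opCode, operand));
        return _code.Count - 1;
    }

    // Rewrites the operand of an earlier jump once its target is known.
    public void Patch(int index, int target) =>
        _code[index] = _code[index] with { Operand = target };

    public Instruction[] Build() => _code.ToArray();
}

public static class LoopExample
{
    // Builds the same "print 0 to 4" loop as the hand-written program.
    public static Instruction[] Build()
    {
        var b = new ProgramBuilder();
        b.Emit(OpCode.LoadConst, 0);
        b.Emit(OpCode.StoreLocal, 0);

        var loopStart = b.Position;
        b.Emit(OpCode.LoadLocal, 0);
        b.Emit(OpCode.LoadConst, 5);
        b.Emit(OpCode.LessThan);
        var exitJump = b.Emit(OpCode.JumpIfFalse); // target not known yet

        b.Emit(OpCode.LoadLocal, 0);
        b.Emit(OpCode.Print);
        b.Emit(OpCode.LoadLocal, 0);
        b.Emit(OpCode.LoadConst, 1);
        b.Emit(OpCode.Add);
        b.Emit(OpCode.StoreLocal, 0);
        b.Emit(OpCode.Jump, loopStart);

        b.Patch(exitJump, b.Position); // points at the Halt emitted next
        b.Emit(OpCode.Halt);
        return b.Build();
    }
}
```

`LoopExample.Build()` produces the same fourteen instructions as the hand-written array, with `JumpIfFalse` patched to `13` and `Jump` pointing back at `2`. Inserting an instruction into the loop body no longer means renumbering the jumps.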

## The full VM

Here’s the complete version so far:

```csharp
using System;
using System.Collections.Generic;

public enum OpCode
{
    LoadConst,
    LoadLocal,
    StoreLocal,
    Add,
    Subtract,
    Multiply,
    Divide,
    LessThan,
    Jump,
    JumpIfFalse,
    Print,
    Halt
}

public readonly record struct Instruction(OpCode OpCode, int Operand = 0);

public sealed class TinyVm
{
    private readonly Instruction[] _program;
    private readonly Stack<int> _stack = new();
    private readonly int[] _locals = new int[16];

    private int _ip;

    public TinyVm(Instruction[] program)
    {
        _program = program;
    }

    public void Run()
    {
        while (_ip < _program.Length)
        {
            var instruction = _program[_ip];
            _ip++;

            switch (instruction.OpCode)
            {
                case OpCode.LoadConst:
                    _stack.Push(instruction.Operand);
                    break;

                case OpCode.LoadLocal:
                    _stack.Push(_locals[instruction.Operand]);
                    break;

                case OpCode.StoreLocal:
                    _locals[instruction.Operand] = _stack.Pop();
                    break;

                case OpCode.Add:
                {
                    // The right operand was pushed last, so it comes off first.
                    var right = _stack.Pop();
                    var left = _stack.Pop();
                    _stack.Push(left + right);
                    break;
                }

                case OpCode.Subtract:
                {
                    var right = _stack.Pop();
                    var left = _stack.Pop();
                    _stack.Push(left - right);
                    break;
                }

                case OpCode.Multiply:
                {
                    var right = _stack.Pop();
                    var left = _stack.Pop();
                    _stack.Push(left * right);
                    break;
                }

                case OpCode.Divide:
                {
                    var right = _stack.Pop();
                    var left = _stack.Pop();
                    // Integer division: truncates toward zero and throws
                    // DivideByZeroException when right is 0.
                    _stack.Push(left / right);
                    break;
                }

                case OpCode.LessThan:
                {
                    var right = _stack.Pop();
                    var left = _stack.Pop();
                    _stack.Push(left < right ? 1 : 0);
                    break;
                }

                case OpCode.Jump:
                    _ip = instruction.Operand;
                    break;

                case OpCode.JumpIfFalse:
                {
                    var condition = _stack.Pop();

                    if (condition == 0)
                    {
                        _ip = instruction.Operand;
                    }

                    break;
                }

                case OpCode.Print:
                    Console.WriteLine(_stack.Pop());
                    break;

                case OpCode.Halt:
                    return;

                default:
                    throw new InvalidOperationException(
                        $"Unknown opcode: {instruction.OpCode}");
            }
        }
    }
}
```

It’s not much code, but it gives you the bones of an interpreter.

## Why stack machines are common

A stack machine is easy to generate code for because most instructions carry no explicit operands. Their inputs and outputs are implicitly the top of the stack.

Take this expression:

```text
(10 + 20) * 3
```

The bytecode can be:

```text
LOAD_CONST 10
LOAD_CONST 20
ADD
LOAD_CONST 3
MULTIPLY
```

The temporary values live on the stack. The compiler doesn’t need to choose registers or name every intermediate result.

That simplicity is one reason stack-based instruction sets show up in real systems. The .NET IL instruction set is stack-based too, although it’s much richer than our tiny VM. For example, a simple method might load two values, add them, and return the result. The operations work against an evaluation stack, not named CPU registers. Our VM is obviously nowhere near the CLR, but the basic stack behaviour should feel familiar if you’ve ever looked at IL.
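To see that similarity for yourself, the following sketch uses `System.Reflection.Emit` to build a method at runtime whose body is exactly that sequence: load two arguments, add, return. `IlAddExample` and `BuildAdd` are made-up names, and the snippet assumes a runtime that allows dynamic code generation. It is kept separate from the VM code.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical example class; DynamicMethod, ILGenerator and OpCodes are
// the real System.Reflection.Emit types.
public static class IlAddExample
{
    // Builds a method at runtime whose body is four IL instructions.
    public static Func<int, int, int> BuildAdd()
    {
        var method = new DynamicMethod(
            "Add", typeof(int), new[] { typeof(int), typeof(int) });

        var il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // push the first argument
        il.Emit(OpCodes.Ldarg_1); // push the second argument
        il.Emit(OpCodes.Add);     // pop both, push their sum
        il.Emit(OpCodes.Ret);     // return the value on top of the stack

        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }
}
```

Calling `IlAddExample.BuildAdd()(10, 20)` returns `30`. Swap `ldarg` for `LoadLocal` and the body reads a lot like one of our own programs.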

## Why this isn’t fast

This VM is easy to understand, but it isn’t designed for speed. Every instruction goes through a `switch`. The stack uses `Stack<int>`, which is convenient but not the fastest possible storage. The program uses a rich `Instruction` struct instead of a compact binary format. There’s no validation pass, no JIT, no type system, and no attempt to optimise instruction dispatch. That’s fine for learning. It’s also a useful reminder that an interpreter is mostly a trade-off. You get portability and flexibility, but you pay for it at runtime.

A production VM might use a denser bytecode format, a custom stack, better dispatch, ahead-of-time validation, or a JIT compiler that turns hot bytecode into native code. The small version helps you see why those optimisations exist.
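As one illustration of a denser format, the sketch below packs each instruction into five bytes: one for the opcode and four for the operand. Everything about this layout is an assumption made for the example, including the `BytecodeCodec` name, the fixed width, and the little-endian byte order. A real format would more likely omit the operand for instructions that don’t use one.

```csharp
using System;
using System.Buffers.Binary;

public enum OpCode
{
    LoadConst, LoadLocal, StoreLocal, Add, Subtract, Multiply, Divide,
    LessThan, Jump, JumpIfFalse, Print, Halt
}

public readonly record struct Instruction(OpCode OpCode, int Operand = 0);

// Hypothetical fixed-width encoding: 1 opcode byte + 4 operand bytes (little-endian).
public static class BytecodeCodec
{
    public const int InstructionSize = 5;

    public static byte[] Encode(Instruction[] program)
    {
        var bytes = new byte[program.Length * InstructionSize];

        for (var i = 0; i < program.Length; i++)
        {
            var offset = i * InstructionSize;
            bytes[offset] = (byte)program[i].OpCode; // fits: fewer than 256 opcodes
            BinaryPrimitives.WriteInt32LittleEndian(
                bytes.AsSpan(offset + 1, 4), program[i].Operand);
        }

        return bytes;
    }

    public static Instruction[] Decode(byte[] bytes)
    {
        if (bytes.Length % InstructionSize != 0)
        {
            throw new ArgumentException("Truncated bytecode.", nameof(bytes));
        }

        var program = new Instruction[bytes.Length / InstructionSize];

        for (var i = 0; i < program.Length; i++)
        {
            var offset = i * InstructionSize;
            program[i] = new Instruction(
                (OpCode)bytes[offset],
                BinaryPrimitives.ReadInt32LittleEndian(bytes.AsSpan(offset + 1, 4)));
        }

        return program;
    }
}
```

The fixed width keeps things simple: jump targets can stay as instruction indices, because instruction `n` always starts at byte `n * 5`. A variable-width format would be smaller still, but jumps would then have to be expressed as byte offsets.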

## Bytecode as a design tool

The interesting part isn’t only language runtimes. Bytecode-style execution can be useful anywhere you want to represent behaviour as data. A rules engine can compile business rules into a small instruction set. A workflow engine can turn steps into executable operations. A game engine can run scripted behaviour without recompiling the host application.

You still need to be careful. Once users can define behaviour, you’ve created something language-shaped. You’ll need validation, versioning, debugging support, error messages, and a safe execution boundary. But for internal systems, a tiny instruction set can be a powerful design. It gives you controlled flexibility without letting arbitrary code run inside your process.

## What to add next

The VM we built is deliberately small, but there are natural directions you could take it.

You could add a parser that turns a simple text format into bytecode, so you don’t have to build the instruction array by hand. You could add function calls by storing return addresses on a call stack. You could add a separate constant pool so larger values aren’t embedded directly in the instruction stream. You could also add a validation step that checks jump targets, stack depth, and invalid local slots before the program runs.

Those changes would make the VM feel much closer to a real runtime. They’d also make the trade-offs more obvious. The moment your VM grows beyond a toy, you start caring about debugging, safety, performance, and compatibility. That’s where the engineering work really begins.
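As a starting point for the validation idea, here is a minimal sketch that checks jump targets, local slots, and unknown opcodes before a program runs. `ProgramValidator` is a hypothetical name, the default of 16 locals is an assumption matching the `_locals` array, and stack depth checking is deliberately left out because it needs to follow control flow.

```csharp
using System;
using System.Collections.Generic;

public enum OpCode
{
    LoadConst, LoadLocal, StoreLocal, Add, Subtract, Multiply, Divide,
    LessThan, Jump, JumpIfFalse, Print, Halt
}

public readonly record struct Instruction(OpCode OpCode, int Operand = 0);

// Hypothetical pre-run check; returns one message per problem found.
public static class ProgramValidator
{
    public static IReadOnlyList<string> Validate(Instruction[] program, int localCount = 16)
    {
        var errors = new List<string>();

        for (var i = 0; i < program.Length; i++)
        {
            var (opCode, operand) = program[i];

            switch (opCode)
            {
                case OpCode.Jump:
                case OpCode.JumpIfFalse:
                    // A jump must land on an existing instruction.
                    if (operand < 0 || operand >= program.Length)
                    {
                        errors.Add($"{i}: jump target {operand} is outside the program.");
                    }
                    break;

                case OpCode.LoadLocal:
                case OpCode.StoreLocal:
                    // A local access must stay inside the locals array.
                    if (operand < 0 || operand >= localCount)
                    {
                        errors.Add($"{i}: local slot {operand} is out of range.");
                    }
                    break;

                default:
                    // Catches values that were cast to OpCode but are not defined.
                    if (!Enum.IsDefined(typeof(OpCode), opCode))
                    {
                        errors.Add($"{i}: unknown opcode {(int)opCode}.");
                    }
                    break;
            }
        }

        return errors;
    }
}
```

An empty result means the program passed these particular checks, not that it is safe. A program can still underflow the stack or divide by zero at runtime.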

A bytecode VM isn’t magic. At its core, it’s a loop that reads instructions and changes state. That simple idea explains a surprising amount of how bigger systems work. Interpreters, scripting engines, rules engines, workflow runners, and language runtimes all sit somewhere on the same spectrum.

The tiny version is worth building because it removes the mystery. Once you’ve written an instruction pointer, an operand stack, and a dispatch loop yourself, the whole subject becomes less abstract. You’re no longer just reading about runtimes. You’ve built the smallest useful shape of one!
