qcpu is a fictional 16-bit CPU specification. It is inspired by the specification for the DCPU16 CPU from the scrapped Mojang game 0x10c, the CPU architectures in Zachtronics games, and also a programming challenge which centered around implementing a fictional CPU (and which I can not find at all on the internet any more).
qcpu has a Node.js implementation and a significantly faster Rust implementation, both written by me. My friend Syphonx has also created a C++ implementation with some debugging tools included.
Specification
The CPU has a 16-bit addressable space (65536 addressable locations), where each address contains a 16-bit unsigned integer (henceforth a word).
There are also 6 registers, which exist outside of addressable space. Each of these registers stores a single word. These registers are typically referred to by the single-character names `a`, `b`, `c`, `d`, `x` and `y`.
The CPU also has a stack which exists outside memory and contains a theoretically unbounded number of words (the actual limit can be implementation-defined).
Each instruction can have up to 4 operands. An instruction is represented in memory by a single word which encodes both the opcode and the addressing information for each operand (see below), followed by up to 4 further words, one for each operand required by the specific opcode.
The opcode is stored in the low byte of the first word of the instruction. The high byte stores the addressing information for the operands. This byte is split into four two-bit values, one for each operand, which correspond to the following addressing modes:
- 0: Immediate - a 16-bit constant value.
- 1: Absolute - the value contained at the specified memory address.
- 2: Indirect - the value contained at the memory address contained in the specified register. The register is given as a value from 0 to 5, which maps to one of the 6 registers mentioned above.
- 3: Register - the value contained in the specified register.
Using an operator which writes to a value and using the Immediate addressing mode is undefined behaviour. Additionally, providing a value outside of the range of 0-5 for either of the addressing modes which refer to registers (Indirect and Register) is undefined behaviour. Implementations of qcpu are free to implement these behaviours in whichever way they prefer. Usually, this would involve crashing the emulation, but the latter behaviour could also be implemented by wrapping the provided value within the valid range.
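The encoding and addressing modes described above can be sketched as follows. This is only an illustration, not the code of any existing implementation: the function names are hypothetical, the text does not say which two bits of the high byte belong to which operand (the sketch assumes operand 0 uses the lowest two), and it assumes register indices 0 to 5 map to `a b c d x y` in that order. It treats an out-of-range register index as an error, which is one of the behaviours the specification permits.

```python
IMMEDIATE, ABSOLUTE, INDIRECT, REGISTER = 0, 1, 2, 3


def decode_header(word):
    """Split the first word of an instruction into (opcode, modes)."""
    opcode = word & 0xFF             # low byte holds the opcode
    mode_byte = (word >> 8) & 0xFF   # high byte holds four 2-bit modes
    # ASSUMPTION: operand i uses bits 2*i and 2*i+1 of the high byte.
    modes = [(mode_byte >> (2 * i)) & 0b11 for i in range(4)]
    return opcode, modes


def read_operand(mode, value, memory, registers):
    """Resolve an operand word to the value it refers to."""
    if mode == IMMEDIATE:
        return value
    if mode == ABSOLUTE:
        return memory[value]
    # Both remaining modes name a register; reject indices outside 0-5.
    if not 0 <= value <= 5:
        raise ValueError(f"invalid register index {value}")
    if mode == INDIRECT:
        return memory[registers[value]]
    return registers[value]          # REGISTER mode
```

Under these assumptions, the word `0x0303` decodes to opcode 3 (`mov`) with a Register-mode first operand and an Immediate-mode second operand.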
Opcodes
qcpu has 25 opcodes, which are as follows:
value | mnemonic | effect |
---|---|---|
0 | `nop` | does nothing |
1 | `ext a` | stop execution, returns value a |
2 | `sys a` | executes system call a (usually accepting argument in register `x`) |
data operations | | |
3 | `mov a b` | sets the value in a to the value in b |
jumps and conditionals | | |
4 | `jmp a` | jump to address a |
5 | `jeq a b c` | jump to address a if b == c |
6 | `jne a b c` | jump to address a if b != c |
7 | `jgt a b c` | jump to address a if b > c |
8 | `jge a b c` | jump to address a if b >= c |
9 | `jlt a b c` | jump to address a if b < c |
10 | `jle a b c` | jump to address a if b <= c |
subroutines | | |
11 | `jsr a` | push the current address to the call stack and jump to address a |
12 | `ret` | pop an address from the call stack and jump to that address |
arithmetic operations | | |
13 | `add a b` | add b to the contents of a |
14 | `sub a b` | subtract b from the contents of a |
15 | `mul a b` | multiply the contents of a by b |
16 | `mod a b` | set the contents of a to a % b |
bitwise operations | | |
17 | `and a b` | set the contents of a to the bitwise and of a with b |
18 | `orr a b` | set the contents of a to the bitwise or of a with b |
19 | `not a` | perform a bitwise not on the contents of a |
20 | `xor a b` | set the contents of a to the bitwise xor of a with b |
21 | `lsl a b` | perform a logical left shift by b bits on the contents of a |
22 | `lsr a b` | perform a logical right shift by b bits on the contents of a |
stack operations | | |
23 | `psh a` | push value of a onto stack |
24 | `pop a` | pop top value from stack into a |
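Because every operand occupies one word after the instruction word, the opcode table is enough to work out how long each instruction is in memory. The sketch below transcribes the mnemonics and operand counts from the table; `OPCODES` and `instruction_length` are hypothetical names chosen for illustration.

```python
# (mnemonic, operand count), indexed by opcode value.
OPCODES = [
    ("nop", 0), ("ext", 1), ("sys", 1),
    ("mov", 2),
    ("jmp", 1), ("jeq", 3), ("jne", 3), ("jgt", 3),
    ("jge", 3), ("jlt", 3), ("jle", 3),
    ("jsr", 1), ("ret", 0),
    ("add", 2), ("sub", 2), ("mul", 2), ("mod", 2),
    ("and", 2), ("orr", 2), ("not", 1), ("xor", 2),
    ("lsl", 2), ("lsr", 2),
    ("psh", 1), ("pop", 1),
]


def instruction_length(opcode):
    """Number of words an instruction occupies: 1 header word + operands."""
    if not 0 <= opcode < len(OPCODES):
        raise ValueError(f"unknown opcode {opcode}")
    _mnemonic, operand_count = OPCODES[opcode]
    return 1 + operand_count
```

An emulator's fetch step could use a table like this to know how many operand words to read and how far to advance past the instruction.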
The `sys` opcode
The `sys` opcode allows the CPU to interface with input and output devices. A connected device can provide a set of 'syscalls' which can be called by using the `sys` opcode with the single operand being the ID of the syscall.
For example, the terminal implementation of qcpu provides the following syscalls to allow for input and output, as well as limited debugging capabilities:
syscall | effect |
---|---|
`sys 6` | writes the character represented by the ASCII character code in register `x` to the output |
`sys 7` | reads a character from input and moves its ASCII character code representation into register `x` |
`sys 11` | sets the foreground colour of subsequent terminal output to a predefined palette index (0-8) given by register `x` |
`sys 11` | sets the background colour of subsequent terminal output to a predefined palette index (0-8) given by register `x` |
`sys 15` | writes the current memory contents to a text file |
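One way a device might expose its syscalls is as a table mapping syscall IDs to handler functions. The sketch below covers only `sys 6` and `sys 7` and is not taken from any of the existing implementations. The function name, the use of a dictionary for registers, and returning 0 at end of input are all assumptions made for illustration.

```python
import io
import sys


def make_terminal_syscalls(out=sys.stdout, inp=sys.stdin):
    """Build a {syscall id: handler} table for a simple terminal device."""

    def write_char(registers):
        # sys 6: write the character whose ASCII code is in register x.
        out.write(chr(registers["x"]))

    def read_char(registers):
        # sys 7: read one character and store its ASCII code in register x.
        ch = inp.read(1)
        # ASSUMPTION: register x is set to 0 when input is exhausted.
        registers["x"] = ord(ch) if ch else 0

    return {6: write_char, 7: read_char}


# Usage with in-memory streams standing in for a real terminal.
output = io.StringIO()
syscalls = make_terminal_syscalls(out=output, inp=io.StringIO("A"))
registers = {"a": 0, "b": 0, "c": 0, "d": 0, "x": ord("H"), "y": 0}
syscalls[6](registers)   # writes "H" to output
syscalls[7](registers)   # reads "A" into register x
```

Keeping the handlers in a table means an emulator's `sys` handler only needs to look up the operand and call the result, and different devices can supply different tables.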
Binary qcpu files
Programs are loaded into qcpu using binary files generated by the assembler. Words are loaded into memory in order, starting at address 0. Each word is represented by two bytes of the binary file: low byte first, high byte second (i.e. little-endian).
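As a minimal sketch of the loading rule above, the following function turns the raw bytes of a binary file into a memory image. The name `load_program` and the choice to reject odd-length or oversized files are assumptions; the format description does not say how such files should be handled.

```python
MEMORY_WORDS = 65536  # size of the 16-bit address space


def load_program(data):
    """Return a full memory image with `data` loaded from address 0."""
    if len(data) % 2 != 0:
        # ASSUMPTION: a trailing half-word is treated as an error.
        raise ValueError("binary file has an odd number of bytes")
    if len(data) // 2 > MEMORY_WORDS:
        raise ValueError("program does not fit in memory")
    memory = [0] * MEMORY_WORDS
    for i in range(0, len(data), 2):
        low, high = data[i], data[i + 1]      # low byte first, then high
        memory[i // 2] = low | (high << 8)
    return memory
```

For instance, the bytes `03 03 04 00 48 00` would load as the three words `0x0303`, `0x0004` and `0x0048` at addresses 0, 1 and 2.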
Writing assembly for qcpu: `qasm` files
`qasm` is an assembly language in which programs for qcpu can be written. There isn't a full specification yet, but here are some points of note:
- Labels are defined by adding a colon after an identifier, and can be used as an operand of an instruction to give the address of that label in Immediate mode.
- '+' and '-' can be used as temporary labels; using one as an operand will give the address of the next (+) or previous (-) instance of that label, which is useful for creating loops which don't need a specific named label.
- The `.text` assembler directive will create a set of words from a string's ASCII codes, e.g. `.text('hello world')`.
- The `.ds` assembler directive will increase the assembler location counter by the given value, e.g. to leave 100 bytes of empty space: `.ds(100)`. This does not zero the memory that is skipped over if it is non-zero already.
- The `.org` assembler directive will set the assembler location counter to the given value, e.g. `.org(0x1000)`.
- The `.ds` and `.org` directives, as well as immediate values for instruction operands, can take any numeric value. A numeric value is defined as either a base-10 number (`123`), a base-16 number (`0x8000`), or a binary number (`0b1111000010001010`).
- Comments can be added to `qasm` files by using either `#` or `;`. Any characters after the first instance of these comment characters on a line will not be processed by the assembler.
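Two of the rules in the list above, numeric values and comments, are simple enough to sketch in code. The helper names below are hypothetical and this is not the actual assembler. The sketch follows the comment rule literally, so a `#` or `;` inside a `.text` string would also start a comment.

```python
def parse_number(text):
    """Parse a base-10, base-16 (0x...) or binary (0b...) numeric value."""
    text = text.strip()
    if text.startswith("0x"):
        return int(text[2:], 16)
    if text.startswith("0b"):
        return int(text[2:], 2)
    return int(text, 10)


def strip_comment(line):
    """Drop everything from the first '#' or ';' on the line onwards."""
    cut = len(line)
    for marker in ("#", ";"):
        position = line.find(marker)
        if position != -1:
            cut = min(cut, position)
    return line[:cut].rstrip()
```

With these helpers, `parse_number("0x8000")` gives 32768 and `strip_comment("jmp - ; loop forever")` gives `"jmp -"`.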