The Architecture of 8-Bit: Lessons from Game Boy Snake
When I first set out to write Snake for the Game Boy, I expected a simple exercise in logic. Snake is, after all, the "Hello World" of game development. You have a grid, a moving head, a growing tail, and an apple. How hard could it be?
As it turns out, on hardware from 1989, "simple" is a relative term.
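The Game Boy is a machine of strict constraints. Its Sharp LR35902 chip carries an 8-bit CPU core (usually described as a Z80 variant, though it is really a hybrid of the Intel 8080 and the Z80) running at 4.19 MHz, backed by just 8 KiB of work RAM and 8 KiB of video RAM. There is no operating system to manage memory for you. There is no garbage collector. There is only you and the metal.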
This project became less about the game itself and more about the architecture required to make it run reliably. Here are the technical pitfalls we encountered and the architectural patterns we built to solve them.
1. The VRAM "Bus Conflict" Nightmare
Early in development, we hit our first major graphical glitch: persistent vertical stripes tearing through our tilemaps.
The Symptom: Random garbage data appearing on screen, often looking like vertical bars or corrupted tiles.
The Cause: The Game Boy's Video RAM (VRAM) is not dual-ported, so the CPU and the Picture Processing Unit (PPU) cannot use it at the same time. While the PPU is fetching pixels for a scanline (mode 3 of the active display period), the CPU is locked out of VRAM: writes are silently ignored and reads return $FF. This is the "bus conflict" we kept hitting. Any tilemap update that overlaps that window only partially lands, and the stale data left behind shows up as stripes or corrupted tiles.
The Fix: We implemented a strict Safe State Protocol.
- Wait for VBlank: Poll the rLY register until it reaches line 144, which means the PPU has finished drawing the visible screen.
- Disable LCD: Turn off the screen entirely by clearing bit 7 of rLCDC. This gives the CPU full, unrestricted access to VRAM. On original hardware the LCD should only ever be switched off during VBlank, which is why the wait comes first.
- Bulk Transfer: Perform all heavy map updates (like clearing the screen or drawing the title).
- Re-enable LCD: Turn the screen back on as soon as the transfer is complete.
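```asm
; Safe VRAM Access Pattern
WaitVBlank:
    ld a, [rLY]             ; current scanline (0-153)
    cp 144                  ; lines 144-153 are VBlank
    jr c, WaitVBlank        ; still drawing the visible area, keep waiting

    ; Now safe to disable LCD
    xor a                   ; a = 0 clears every LCDC bit, including bit 7
    ld [rLCDC], a

    ; Perform VRAM writes...

    ; Re-enable LCD
    ld a, LCDCF_ON | LCDCF_BG8000 | LCDCF_BGON
    ld [rLCDC], a
```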
2. Input Lag & The 60Hz Buffer
Snake requires precise timing. A delayed turn means death. Initially, our input handling felt "mushy"—sometimes inputs were missed entirely.
The Fix: We decoupled input polling from game logic.
Instead of reading the D-Pad directly inside the movement code, we implemented a polling routine that runs exactly once per frame, at the start of the VBlank interrupt handler. This routine reads the hardware register rP1, debounces the signal, and stores the result in wCurKeys, a variable kept in HRAM (the Game Boy's nearest equivalent to a zero page).
The game logic then reads wCurKeys whenever it needs to, with the guarantee that the value is fresh and stays consistent for the whole frame.
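A minimal sketch of such a polling routine, assuming RGBDS syntax and the community hardware.inc names (rP1, P1F_GET_DPAD, P1F_GET_BTN, P1F_GET_NONE). Only wCurKeys comes from the description above. ReadJoypad and wNewKeys are hypothetical names added for illustration. The repeated reads are the usual settling delay for the input lines, not necessarily the exact debounce our driver used.

```asm
INCLUDE "hardware.inc"

SECTION "Input Vars", HRAM
wCurKeys: ds 1      ; buttons held this frame (1 = pressed)
wNewKeys: ds 1      ; hypothetical: buttons that went down this frame

SECTION "Input Code", ROM0
; Call once per frame from the VBlank handler.
; Result bits: 7-4 = Down, Up, Left, Right; 3-0 = Start, Select, B, A.
ReadJoypad:
    ld a, P1F_GET_DPAD      ; select the direction keys
    ldh [rP1], a
    ldh a, [rP1]            ; read twice so the lines can settle
    ldh a, [rP1]
    cpl                     ; hardware reports 0 = pressed, so invert
    and $0F
    swap a                  ; move the D-Pad bits into the high nibble
    ld b, a

    ld a, P1F_GET_BTN       ; select A / B / Select / Start
    ldh [rP1], a
    ldh a, [rP1]
    ldh a, [rP1]
    ldh a, [rP1]
    ldh a, [rP1]
    cpl
    and $0F
    or b                    ; combine both nibbles
    ld b, a                 ; b = state for this frame

    ld a, P1F_GET_NONE      ; deselect both key groups
    ldh [rP1], a

    ldh a, [wCurKeys]       ; previous frame's state
    xor b                   ; bits that changed since last frame...
    and b                   ; ...and are pressed now
    ldh [wNewKeys], a
    ld a, b
    ldh [wCurKeys], a
    ret
```

With this layout the movement code never touches rP1. It tests a bit of wCurKeys (or wNewKeys, if a turn should only register on a fresh press) and gets the same answer no matter when in the frame it runs.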
3. The "P" Artifact & Ghost Tiles
We encountered a strange bug where the letter "P" from "PRESS START" would sometimes linger on the screen during gameplay.
The Cause: This was a race condition in our screen transition logic. We started clearing the tilemap before VBlank had actually begun, so some of the writes were dropped while the PPU still owned VRAM and the clear was only partial. The "P" happened to be one of the tiles that survived.
The Fix: We moved all screen transitions to a dedicated CallClearScreen routine that enforces the VBlank wait described in section 1.
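As a rough sketch, a routine of this kind could look like the following. The name CallClearScreen is from the text above. Everything inside it is an assumption: RGBDS syntax, the hardware.inc constant _SCRN0 for the background map at $9800, and tile 0 being the blank tile.

```asm
INCLUDE "hardware.inc"

SECTION "Screen Code", ROM0
; Clears the 32x32 background map with tile 0 (assumed to be blank).
; Assumes the LCD is on when called: with the LCD off, rLY stays at 0
; and the wait loop below would never exit.
CallClearScreen:
.waitVBlank:
    ldh a, [rLY]
    cp 144
    jr c, .waitVBlank       ; loop until the PPU reaches line 144
    xor a
    ldh [rLCDC], a          ; LCD off: VRAM is writable at any time now

    ld hl, _SCRN0           ; $9800, start of the background map
    ld bc, 32 * 32          ; number of map entries to clear
.fill:
    xor a
    ld [hli], a             ; write the blank tile, advance hl
    dec bc
    ld a, b
    or c                    ; 16-bit dec sets no flags, so test b|c by hand
    jr nz, .fill

    ld a, LCDCF_ON | LCDCF_BG8000 | LCDCF_BGON
    ldh [rLCDC], a          ; same flags as the snippet in section 1
    ret
```

Because the wait and the LCD toggle live inside the routine, a caller cannot start a clear too early by accident.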
4. The Headless Verification Pipeline
Developing for the Game Boy usually requires an emulator with a GUI (like BGB or Emulicious). But we needed to automate our testing on a headless Linux VPS.
The Solution: We built serverboy, a custom Node.js harness wrapping a headless Game Boy emulator.
- It loads the ROM.
- It injects inputs programmatically (e.g., "Hold Right for 60 frames").
- It dumps the VRAM to a PNG file using pngjs.
This allowed us to run "visual regression tests" purely via the command line. If the title screen didn't match our reference image byte-for-byte, the build failed.
5. Audio Architecture
The Game Boy's audio hardware (APU) is complex, with four distinct channels (two pulse, one wave, one noise). Writing a music driver from scratch is a massive undertaking.
The Pattern: We opted for a simple state machine driver.
- Channel 1 (Pulse): Plays the melody.
- Channel 2 (Pulse): Plays the harmony/bass.
- Channel 4 (Noise): Handles percussion (snare/hi-hat).
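We composed a "Phrygian Snake Charmer" loop built on the Phrygian mode, a scale that feels appropriately tense and exotic for a snake game. The driver simply advances one entry through a lookup table of frequency values every 16 frames, which is roughly a quarter of a second at the Game Boy's refresh rate of about 59.7 Hz.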
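The table-stepping idea can be sketched for the melody channel as follows. This assumes RGBDS syntax and the hardware.inc names rNR13 and rNR14. UpdateMusic, wMusicTimer, wMusicIndex, and MelodyTable are hypothetical names, and the notes are an illustrative E Phrygian scale with approximate period values, not our actual tune.

```asm
INCLUDE "hardware.inc"

DEF FRAMES_PER_STEP EQU 16
DEF MELODY_LEN      EQU 8

SECTION "Music Vars", WRAM0
wMusicTimer: ds 1   ; frames left until the next note (init code must set this to 1)
wMusicIndex: ds 1   ; position in the melody table (init code must set this to 0)

SECTION "Music Code", ROM0
; Call once per frame. Assumes the APU is already switched on and that
; channel 1's duty and envelope (rNR11, rNR12) were configured at init.
UpdateMusic:
    ld hl, wMusicTimer
    dec [hl]
    ret nz                      ; not time for a new note yet
    ld [hl], FRAMES_PER_STEP    ; reload the countdown

    ld a, [wMusicIndex]
    ld e, a                     ; e = note to play now
    inc a
    cp MELODY_LEN
    jr c, .store
    xor a                       ; past the end: wrap to the start of the loop
.store:
    ld [wMusicIndex], a

    ld d, 0
    ld hl, MelodyTable
    add hl, de
    add hl, de                  ; hl = MelodyTable + index * 2
    ld a, [hli]
    ldh [rNR13], a              ; low 8 bits of the 11-bit period
    ld a, [hl]
    or $80                      ; bit 7 = trigger (restart) the channel
    ldh [rNR14], a
    ret

; Period values x = 2048 - 131072 / f, rounded.
; E4 F4 G4 A4 B4 C5 D5 E5 (illustrative only)
MelodyTable:
    dw 1650, 1673, 1714, 1750, 1783, 1797, 1825, 1849
```

Channels 2 and 4 would follow the same pattern with their own tables and registers. A rest could be encoded as a reserved table value that skips the trigger.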
6. Memory Management: The Circular Buffer
The snake's tail was the most interesting data structure challenge. In a high-level language, you might use a LinkedList or Array and pop() the last element. In Game Boy assembly, shifting every segment of the snake through memory every frame is an O(N) operation, and far too slow.
The Solution: A Circular Buffer. We allocated a fixed 256-byte buffer in RAM.
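- Head Pointer: The index of the newest entry in the buffer, which is the snake's head.
- Tail Pointer: The index of the oldest entry, which is the tip of the tail.
To move the snake:
- Draw Head: Store the new head position in the buffer slot just after the Head Pointer, and draw the head tile at that position in VRAM.
- Erase Tail: Read the position stored at the Tail Pointer and clear that tile in VRAM.
- Increment Both: Move both pointers forward by one byte. They wrap from 255 back to 0, which an 8-bit index does for free when it overflows.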
This makes movement an O(1) operation, regardless of how long the snake gets.
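Here is a sketch of that movement step, under several assumptions that go beyond the description above: RGBDS syntax, a buffer aligned to a 256-byte boundary so that an index is simply the low byte of an address, and one byte per segment that packs y into the high nibble and x into the low nibble (which implies a playfield of at most 16x16 tiles). MoveSnake, PosToMapAddr, the tile constants, and the grow flag are hypothetical.

```asm
INCLUDE "hardware.inc"

DEF TILE_EMPTY EQU 0    ; hypothetical tile numbers
DEF TILE_SNAKE EQU 1

SECTION "Snake Buffer", WRAM0, ALIGN[8]
wSnakeBuf:  ds 256      ; ring of packed positions, oldest at tail, newest at head
wHeadIndex: ds 1
wTailIndex: ds 1

SECTION "Snake Code", ROM0
; In: b = new packed head position, c = nonzero if an apple was just eaten.
; Call during VBlank so both tilemap writes land while VRAM is accessible.
MoveSnake:
    ld hl, wHeadIndex
    inc [hl]                ; 8-bit index wraps 255 -> 0 on its own
    ld l, [hl]
    ld h, HIGH(wSnakeBuf)   ; aligned buffer: index is the low address byte
    ld [hl], b              ; record the new head position
    ld a, b
    call PosToMapAddr
    ld [hl], TILE_SNAKE     ; draw the head

    ld a, c
    and a
    ret nz                  ; growing: keep the tail where it is

    ld hl, wTailIndex
    ld l, [hl]
    ld h, HIGH(wSnakeBuf)
    ld a, [hl]              ; position of the oldest segment
    call PosToMapAddr
    ld [hl], TILE_EMPTY     ; erase it
    ld hl, wTailIndex
    inc [hl]
    ret

; In: a = packed position. Out: hl = tilemap address. Clobbers a, e.
PosToMapAddr:
    ld e, a
    and $F0                 ; a = y * 16
    ld l, a
    ld h, 0
    add hl, hl              ; hl = y * 32 (one tilemap row is 32 tiles)
    ld a, e
    and $0F                 ; a = x
    or l
    ld l, a                 ; hl = y * 32 + x
    ld a, h
    add a, HIGH(_SCRN0)     ; add $9800 (its low byte is zero)
    ld h, a
    ret
```

The cost per step is two buffer accesses and two tile writes, whether the snake has three segments or two hundred. Skipping the tail half of the routine is all it takes to grow by one.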
Conclusion
Building Snake in 2026 was a lesson in respect. The developers of the 1990s didn't have GitHub Copilot, strict linters, or continuous integration. They had graph paper, hex editors, and patience.
By forcing ourselves to work within these constraints, we didn't just build a game. We built a deeper understanding of how computers actually work—one byte at a time.