Debugging a Real-Mode Bootloader in GDB — When CS Changes and Symbols Break

Field notes from stepping through a hand-written x86 assembly bootloader in GDB, across the far jump that switches code segments, and getting symbols to follow you. Most low-level GDB tutorials skip exactly the part that breaks in real mode.

May 12, 2026
Harrison Guo
Kernel Debug Field Notes Low-Level Programming

Status: Companion blog draft for the existing video. Long-form transcript + bridge framing TBD.

Companion assets

  • Original video: Debugging Real-Mode Bootloader in GDB (CS Changed, Symbols Broken)

  • GitHub: harrison001/NanoBoot (minimal x86 bootloader framework, <5KB, GDT/IDT, scheduler) and harrison001/bootloader

TL;DR

When you step through a real-mode x86 bootloader in GDB and hit a far jump, the code segment (CS) changes — and GDB’s symbol resolution silently breaks. Your function names disappear, addresses become bare hex, and you can no longer set breakpoints by name. The fix is mechanical, but understanding why it breaks teaches more about CPU mode transitions than any tutorial does.

The setup

  • Hand-written x86 assembly bootloader (real-mode, BIOS-booted)
  • QEMU with -s -S (-s opens a gdb stub on TCP port 1234; -S halts the CPU before the first instruction)
  • GDB attached: target remote :1234
  • A jmp far somewhere in the boot sequence — this is where symbols break

Debug command transcript

# TODO: paste actual gdb + qemu sequence from the video
# qemu-system-i386 -drive format=raw,file=boot.bin -s -S
# gdb
# (gdb) target remote :1234
# (gdb) set architecture i8086
# (gdb) symbol-file boot.elf
# (gdb) break *0x7c00
# (gdb) c
# ... after far jump ...
# (gdb) info registers cs eip
# (gdb) x/10i $cs*16+$eip

What broke

After a far jump, GDB’s prompt still works, but:

  • Function names from the symbol table no longer match what’s running
  • Setting breakpoints by name silently misses
  • Stack traces become garbage

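Root cause: in real mode, an instruction’s linear address (which is also its physical address, since paging is off) is (CS << 4) + IP; GDB shows the offset part as EIP. When CS changes, the same offset maps to a different physical location. GDB’s symbol file was built assuming a single value of CS, so after the far jump every symbol is off by the distance between the two segment bases, (new_cs - old_cs) << 4 bytes.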
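The arithmetic is easy to check by hand. A minimal sketch, assuming a hypothetical far jump from segment 0x0000 to segment 0x1000 (these values are illustrative, not taken from the video):

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode linear address: (segment << 4) + offset, both 16-bit."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)

# Two segment:offset pairs that name the same byte, the boot sector entry.
a = linear_address(0x0000, 0x7C00)
b = linear_address(0x07C0, 0x0000)

# Hypothetical far jump: the offset stays 0x0200, only CS changes.
before = linear_address(0x0000, 0x0200)
after = linear_address(0x1000, 0x0200)
drift = after - before  # how far every symbol is now off

print(hex(a), hex(b), hex(drift))  # 0x7c00 0x7c00 0x10000
```

The same offset lands 0x10000 bytes away once CS is 0x1000, and that is the gap between what the symbol table says and what is actually executing.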

What fixed it

TODO: write up the workflow: set architecture to i8086, use *0x7c00-style absolute breakpoints across the segment boundary, manually compute (CS << 4) + IP for inspection, or use add-symbol-file with the new base address after the jump.
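Until that write-up exists, the "compute it by hand" step can be sketched as a small helper that turns a CS:IP pair into the absolute-address GDB commands used in the transcript above. The helper name and the example segment are hypothetical; it assumes, as the transcript does, that the stub accepts linear addresses.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode linear address: (segment << 4) + offset, both 16-bit."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)

def gdb_commands(cs: int, ip: int, count: int = 10) -> list[str]:
    """Build GDB commands that address code by linear address, not by symbol."""
    addr = linear_address(cs, ip)
    return [
        f"break *{addr:#x}",      # absolute breakpoint, independent of symbols
        f"x/{count}i {addr:#x}",  # disassemble `count` instructions from there
    ]

# Hypothetical far-jump target 0x1000:0x0000.
for line in gdb_commands(0x1000, 0x0000):
    print(line)
```

Paste the printed lines at the (gdb) prompt; they do not depend on symbol resolution, so they work whether or not GDB's idea of CS is current.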

What this teaches backend / AI infra engineers

You don’t write bootloaders in production. But you do encounter the same class of problem:

  • PIE binaries: when ASLR loads your Go binary at a different base address than the symbol file expects, stack traces in coredumps point at wrong functions — same address-vs-symbol drift, different cause
  • Container memory profiling: the address space inside a container is not the address space your host-side profiler thinks it is; the offsets are different and tools must account for it
  • JIT-compiled code (in Python, Node, ML runtimes): functions move around at runtime; static symbol resolution doesn’t apply, and tools have to call into runtime introspection to recover names
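The PIE case has the same shape as the CS shift: a runtime address only makes sense to a symbolizer once the load bias has been subtracted. A minimal sketch, with a made-up symbol table and a made-up load bias (real tools read both from the ELF file and the process's memory map):

```python
import bisect

# Hypothetical link-time symbol table: (start address, name), sorted by address.
SYMBOLS = [(0x1000, "main"), (0x1040, "parse_config"), (0x10A0, "serve")]
STARTS = [start for start, _ in SYMBOLS]

def symbolize(runtime_addr: int, load_bias: int = 0):
    """Map a runtime address to 'name+offset' using link-time symbols."""
    link_addr = runtime_addr - load_bias
    i = bisect.bisect_right(STARTS, link_addr) - 1
    if i < 0:
        return None  # address is below the first symbol
    start, name = SYMBOLS[i]
    # No symbol sizes here, so anything past the last start matches it.
    return f"{name}+{link_addr - start:#x}"

bias = 0x555555554000   # hypothetical ASLR load base
pc = bias + 0x1050      # actually 0x10 bytes into parse_config

print(symbolize(pc))        # bias ignored: resolves to the wrong function
print(symbolize(pc, bias))  # parse_config+0x10
```

Ignoring the bias does not produce an error. It produces a confident answer that happens to be wrong, which is the same failure GDB shows after the far jump.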

The deeper lesson: debuggers operate on a model of the program. When the model and the runtime diverge, the debugger lies confidently. Recognizing when that’s happening is what separates engineers who “use GDB” from engineers who can debug a program they’ve never seen in 30 minutes.

