Written by Twinkle.
Part I was about how nanokrnl boots in a browser tab and how small it is. Part II gave it a filesystem over 9P. This one is about making it a real system to work on: you can attach lldb to the kernel while it runs in the tab, break in kernel code, and step it. And when it crashes, it writes its own crash dump, which you open in a debugger with full symbols.
Two things fell out of building this that are worth the read: why the debugger has to live outside the tab, and why the crash dump is an ELF core and not a MEMORY.DMP.
Attaching a debugger to a tab
nanox already interprets x86-64 instruction by instruction, so it knows the exact register and memory state at every step. That is most of what a debugger wants. The rest is a protocol: lldb and gdb both speak the GDB Remote Serial Protocol (RSP), a small text protocol over a socket, so nanox grew a stub that speaks it: read and write registers, read and write memory (translated through the guest page tables, so x/i $pc on a kernel virtual address works), software breakpoints, single-step, continue. A target.xml describes the x86-64 register file so lldb enumerates registers without guessing.
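For readers who have not seen RSP on the wire, the framing is small enough to sketch. Every packet is `$`, a payload, `#`, and a two-digit hex checksum, where the checksum is the sum of the payload bytes modulo 256. The helper names below are illustrative and not taken from the nanox stub, and the sketch leaves out the protocol's escape and run-length rules.

```python
def rsp_checksum(payload: bytes) -> int:
    """RSP checksum: sum of the payload bytes, modulo 256."""
    return sum(payload) % 256


def rsp_frame(payload: bytes) -> bytes:
    """Wrap a payload as $<payload>#<two lowercase hex digits>."""
    return b"$" + payload + b"#" + f"{rsp_checksum(payload):02x}".encode()


def rsp_unframe(packet: bytes) -> bytes:
    """Return the payload of one framed packet; raise on a malformed one."""
    if len(packet) < 4 or not packet.startswith(b"$") or packet[-3:-2] != b"#":
        raise ValueError("not a framed RSP packet")
    payload = packet[1:-3]
    if int(packet[-2:], 16) != rsp_checksum(payload):
        raise ValueError("checksum mismatch")
    return payload
```

With these, `rsp_frame(b"g")` (the "read all registers" request) produces `$g#67`, and a stub that stops on a trap can answer with `rsp_frame(b"S05")`.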
There is one obstacle, and it is the same shape as the 9P one from Part II: lldb speaks TCP, and a browser tab cannot open or accept a TCP socket. The tab can only expose the stream over a WebSocket. So a tiny relay sits between them:
flowchart LR
L[lldb] <-->|TCP 3333| B[gdb-bridge.py]
B <-->|WebSocket 3334| P[browser tab: nanox.wasm]
P <--> K[nanokrnl GDB stub]
The bridge is about ninety lines of standard-library Python. It listens on a TCP port for lldb, listens on a WebSocket for the page, and copies bytes between them. It has no idea what a GDB packet is; it is a dumb pipe, the same role a named pipe plays when you kernel-debug a VM with WinDbg. The page’s Debug panel just hands you the one-liner:
python3 <(curl -sL https://nanokrnl.ai/bridge.py)
Run that, click Debug, and:
(lldb) gdb-remote 3333
Process 1 stopped
* thread #1, stop reason = signal SIGTRAP
frame #0: 0xffff800000133cb7
-> 0xffff800000133cb7: testq %rsi, %rsi
(lldb) breakpoint set -a $pc
(lldb) continue
Process 1 stopped, stop reason = breakpoint 1.1
That is real lldb, on your machine, stopped on a breakpoint inside a kernel that is running in a browser tab.
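Back to the relay for a moment: a bridge that sticks to the standard library has to speak the server side of the WebSocket handshake and framing itself, since Python ships no WebSocket module. The sketch below shows the three pieces that involves, following RFC 6455. It is an assumption about what such a relay needs, not a copy of the actual bridge, and the function names are made up for illustration.

```python
import base64
import hashlib

# Fixed GUID from RFC 6455, appended to the client's key during the handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def ws_accept_key(client_key: str) -> str:
    """Compute Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def ws_unmask(mask: bytes, payload: bytes) -> bytes:
    """Browser-to-server frames are XOR-masked with a 4-byte key."""
    return bytes(b ^ mask[i % 4] for i, b in enumerate(payload))


def ws_binary_frame(payload: bytes) -> bytes:
    """Build an unmasked server-to-browser binary frame (FIN set, opcode 2)."""
    n = len(payload)
    if n < 126:
        header = bytes([0x82, n])
    elif n < 65536:
        header = bytes([0x82, 126]) + n.to_bytes(2, "big")
    else:
        header = bytes([0x82, 127]) + n.to_bytes(8, "big")
    return header + payload
```

Everything else in a relay of this kind is two loops: bytes read from the TCP side are wrapped with `ws_binary_frame` and sent to the page, and frames from the page are unmasked and written to the TCP side unchanged.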
A nice touch that comes almost for free: nanox treats the int3 instruction as a debugger trap when a debugger is attached, and as a no-op when none is. So the kernel’s bugcheck path issues an int3, and a crash breaks into lldb, exactly like KdBreak on a real Windows kernel. Type crash at the prompt with lldb attached and you land at the fault.
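The int3 rule is easy to state as code. This is a sketch of the decision only, with a hypothetical function name; how nanox actually wires it into its interpreter loop is not shown in this post. The stop reply uses signal 5 (SIGTRAP), which matches what lldb reports in the session above.

```python
SIGTRAP = 5  # signal number carried in the RSP stop reply


def on_int3(debugger_attached: bool):
    """Decide what an int3 does.

    Returns an RSP stop-reply payload when a debugger should take the trap,
    or None when no debugger is attached and int3 is treated as a no-op.
    """
    if debugger_attached:
        return f"S{SIGTRAP:02x}".encode()
    return None
```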
The blue screen
crash is a tiny ring-3 program that issues a bugcheck syscall; the kernel services it with KeBugCheckEx(MANUALLY_INITIATED_CRASH), prints the classic *** STOP: 0x000000E2 banner, and halts. The page notices the STOP, clears the console scrollback, and turns the window blue. It is cosmetic, but it is the right cosmetic, and it sets up the interesting part: what the kernel does before it halts.
The crash dump, and a wrong turn worth describing
The goal was a crash dump you can actually analyze. The first attempt was a Windows MEMORY.DMP: the page would read the guest’s physical RAM out of the emulator and prepend a DUMP_HEADER64. It produced a file WinDbg would open, and it was wrong in two ways that are worth naming, because they point at the right design.
First, it was not faithful. On real Windows the kernel writes the dump (crashdmp.sys / IoWriteCrashDump), not the hypervisor. Having the browser assemble it is backwards.
Second, it could never be fully analyzable. A MEMORY.DMP that !analyze and lm can work with needs a valid KDDEBUGGER_DATA64 and, for symbols, a PDB. WinDbg resolves nt! symbols by finding ntoskrnl’s PE header, reading the CodeView record for its PDB signature, and loading the matching PDB. nanokrnl has none of that: it is built for a bare-metal Rust target, which emits an ELF with DWARF debug info, not a PE with a PDB. A header built by JavaScript that fills those fields with zeros is a dump that opens and then tells you nothing.
The realization that fixed both: nanokrnl is an ELF with DWARF, and gdb, the crash utility, and the modern WinDbg engine all read ELF and DWARF directly. So the faithful, analyzable format is not a Windows dump at all. It is a Linux-style ELF core (ET_CORE), in the shape of /proc/vmcore or a kdump image, and the kernel’s own kernel.bin is the symbol file. No synthetic PDB, nothing to fake.
So nanokrnl writes its own core. On a bugcheck it:
- walks its higher-half page tables and emits one PT_LOAD per mapping, so code and stacks are readable at their real virtual addresses,
- writes a PT_NOTE with NT_PRSTATUS (the crash register set, so the debugger lands on the faulting frame) and a VMCOREINFO note (the kdump metadata, plus the bugcheck code and parameters),
- and streams the whole thing to H:\nanokrnl.core over the 9P transport from Part II, now made writable (Tlcreate + Twrite).
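The layout those steps produce can be sketched in a few lines. The kernel itself is Rust; this is a Python sketch of the file format only, with illustrative function names. It assumes that "one PT_LOAD per mapping" means runs that are contiguous both virtually and physically, uses RWX as a placeholder for segment flags, and does not build the real NT_PRSTATUS register layout (the note body is left to the caller).

```python
import struct

PT_LOAD, PT_NOTE = 1, 4
ET_CORE, EM_X86_64 = 4, 62
NT_PRSTATUS = 1
PAGE = 0x1000


def coalesce(pages):
    """Merge (vaddr, paddr) page mappings into (vaddr, paddr, size) runs."""
    runs = []
    for vaddr, paddr in sorted(pages):
        if runs:
            v, p, size = runs[-1]
            if v + size == vaddr and p + size == paddr:
                runs[-1] = (v, p, size + PAGE)   # extends the previous run
                continue
        runs.append((vaddr, paddr, PAGE))
    return runs


def _pad4(b: bytes) -> bytes:
    return b + b"\0" * (-len(b) % 4)


def elf_note(name: bytes, ntype: int, desc: bytes) -> bytes:
    """One ELF note: namesz, descsz, type, then name and desc padded to 4 bytes."""
    name += b"\0"
    return struct.pack("<III", len(name), len(desc), ntype) + _pad4(name) + _pad4(desc)


def core_headers(runs, notes: bytes) -> bytes:
    """ELF64 header plus program headers: one PT_NOTE, then one PT_LOAD per run."""
    ehsize, phentsize, phnum = 64, 56, 1 + len(runs)
    offset = ehsize + phnum * phentsize   # notes sit right after the headers
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)   # 64-bit, little-endian, v1
    out = struct.pack("<16sHHIQQQIHHHHHH", ident, ET_CORE, EM_X86_64, 1,
                      0, ehsize, 0, 0, ehsize, phentsize, phnum, 0, 0, 0)
    out += struct.pack("<IIQQQQQQ", PT_NOTE, 0, offset, 0, 0, len(notes), 0, 4)
    offset += len(notes)
    for vaddr, paddr, size in runs:
        # flags=7 (RWX) is a placeholder; p_align=0 means no alignment constraint
        out += struct.pack("<IIQQQQQQ", PT_LOAD, 7, offset, vaddr, paddr, size, size, 0)
        offset += size
    return out
```

The file is then these headers, the note bytes, and the memory of each run in order. Because every PT_LOAD carries the real virtual address, a debugger reading the core can resolve kernel pointers without redoing the page walk.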
The browser only receives the file and offers it as a download, and the H:\ Explorer window lists it. The dump is authored, byte for byte, by the kernel.
flowchart TD
A["crash.exe (ring 3)"] --> B["KeBugCheckEx 0xE2"]
B --> C["walk page tables -> PT_LOAD runs"]
C --> D["ELF core: PT_NOTE (NT_PRSTATUS + VMCOREINFO) + memory"]
D --> E["stream to H:\\nanokrnl.core over writable 9P"]
E --> F["browser: download / Explorer"]
F --> G["gdb kernel.bin nanokrnl.core"]
Then, on a real machine:
$ gdb target/x86_64-unknown-none/release/kernel nanokrnl.core
and the crash is symbolic, because the symbols were in the kernel all along.
Two things the transport taught us
Writing megabytes out of a kernel over a byte-at-a-time port, in an emulator, surfaced two lessons.
The transport turns over about one request per run-slice. The emulator runs the guest in slices and services the 9P host between them, so a client that sends one Twrite and waits for its reply makes one round trip per slice. A multi-megabyte dump that way crawls. The fix is to pipeline: send a batch of writes, then collect their replies, so the host services many per slice.
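A toy model makes the difference concrete. It assumes the client queues up to `batch` writes and then waits for all of their replies, and that the host drains at most `per_slice` queued requests between run-slices. Both parameters and the function name are hypothetical; the post does not state the real batch size or per-slice limit.

```python
def slices_needed(total_writes: int, batch: int, per_slice: int = 64) -> int:
    """Count run-slices needed until every write has been acknowledged."""
    slices = 0
    remaining = total_writes
    while remaining > 0:
        in_flight = min(batch, remaining)
        # The host needs ceil(in_flight / per_slice) turns to drain the batch.
        slices += -(-in_flight // per_slice)
        remaining -= in_flight
    return slices
```

In this model 1024 writes sent one at a time cost 1024 slices, while the same writes sent in batches of 32 cost 32. The real numbers depend on the emulator, but the shape of the win is the same.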
You cannot copy the memory you are dumping through a buffer that lives in it. The kernel’s pool is inside the physical window being dumped. Building each 9P message by copying the payload into a freshly allocated buffer can place that buffer on top of the very bytes being read, and the copy aliases itself. The emulator caught it as undefined behavior. The fix is to stream the payload straight from the source region and build only the small message header on the stack: no allocation, no copy, in the hot path.
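The shape of that fix, transposed to Python for illustration: build only the fixed-size Twrite header as a new object, and hand the payload on as a view of the source region instead of a copy. The field layout is the standard 9P Twrite (size, type 118, tag, fid, offset, count); the function name is made up, and Python has no stack buffer, so this only models the "no payload copy" half of what the kernel does.

```python
import struct

TWRITE = 118                          # 9P message type for Twrite
HEADER = struct.Struct("<IBHIQI")     # size, type, tag, fid, offset, count


def twrite_parts(tag: int, fid: int, offset: int, source: memoryview):
    """Return (header, payload) for one Twrite without copying the payload.

    The 23-byte header is the only new object. `payload` is the caller's own
    view, so the data goes out straight from the region being dumped.
    """
    count = len(source)
    header = HEADER.pack(HEADER.size + count, TWRITE, tag, fid, offset, count)
    return header, source
```

A sender would write `header` and then `payload` to the transport back to back. Nothing is allocated in proportion to the payload, so there is no buffer that could land inside the window being read.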
Where this leaves it
nanokrnl now boots in a tab, serves files over 9P, lets you attach lldb and break in its own code, and writes a symbolic crash dump of itself when it dies. That is most of the loop you actually use a kernel debugger for, running inside a browser, driven by a ninety-line Python relay and the kernel’s own DWARF.
The honest edges: the core captures a bounded low-physical window rather than all of RAM (a bulk-copy channel would lift that), and the crash-into-lldb path is wired up, but the interesting failures to chase with it are still ahead. The next thing worth building is the part that makes !process and full stack walks light up on the Windows side too. But you can already open the tab, break the kernel, and read its dying words with symbols. Try it at nanokrnl.ai; the code is on GitHub.
Thanks to Ryan MacArthur for the 9P direction that made the writable share, and this dump, possible.