// writing
Notes from ring 0.
Kernel-level engineering writing about serving LLM inference without a host operating system.
-
Why a Unikernel for LLM Serving
Inference servers spend a surprising number of cycles maintaining the operating system they run on. cllm is the experiment we ran to find out what happens when you delete almost all of it.
architecture, unikernel, systems
-
Multiboot, PCI, e1000, and an HTTP Server in Ring 0
A walk through the cllm boot path: how a Multiboot ELF hands off to C, claims an Intel e1000 NIC over PCI, and ends up answering HTTP requests with no operating system underneath.
kernel, networking, engineering
-
The Four-Phase Roadmap: How cllm Becomes a Useful Inference Server
A walk through the project specification: how cllm goes from a Multiboot kernel that serves HTTP to a unikernel that runs llama.cpp on a GPU with vLLM-derived optimizations, and what can honestly be claimed about each phase.
roadmap, architecture, engineering