A rigorous, implementation-level guide to address translation — from foundational virtual memory concepts through AI/ML accelerator memory systems. Targets systems engineers, OS developers, hardware architects, and ML infrastructure teams.
Chapters 1–9 form a complete foundation covering paging, faults, reclaim, and optimizations. Chapter 19 extends this to CXL-attached memory.
Chapters 4, 5, 10, and 15–18 cover translation hardware, IOMMUs, advanced TLB design, PTW microarchitecture, and paging-level vulnerabilities. Chapter 19 covers CXL disaggregation.
Chapters 11–14 address GPU and accelerator memory, LLM serving, and ML-based optimization. Chapter 20 covers confidential computing for AI workloads. Chapter 21 covers hardware memory safety relevant to AI system integrity.
Chapter 6 covers the full protection model; Chapters 5 and 12 cover device and multi-tenant isolation; Chapter 18 covers MMU-level vulnerabilities in depth, including Meltdown, Spectre, L1TF/Foreshadow, and MDS; Chapter 20 covers confidential computing (Intel TDX, AMD SEV-SNP, and ARM CCA); Chapter 21 covers hardware memory safety: CHERI, ARM MTE, and capability-based addressing.
| Processor Family | Key Structures Covered |
|---|---|
| x86-64 (Intel / AMD) | CR3, PML4/PDPT/PD/PT, PCID, INVPCID, INVLPG, KPTI, SGX, VT-d, EPT, AMD NPT |
| ARM64 (ARMv8 / v9) | TTBR0_EL1/TTBR1_EL1, ASID, TLBI, TrustZone, Stage-2 (IPA→PA), SMMUv3 |
| RISC-V | satp, Sv39/Sv48/Sv57, ASID, SFENCE.VMA, VMID in hgatp, G-stage translation |
| GPU / AI Accelerators | NVIDIA UVM, NVLink/NVSwitch peer-to-peer, TPU HBM, Intel Gaudi2, PagedAttention |
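
As a small taste of the implementation-level material, the x86-64 row's PML4/PDPT/PD/PT hierarchy can be sketched as a bit-field split of a virtual address. This is a minimal illustration, not code from the chapters: it assumes 4-level paging with 4 KiB pages and a 48-bit virtual address, and it ignores canonical-address checks, huge pages, and 5-level paging. The function and constant names (`split_x86_64_va`, `PAGE_SHIFT`, `INDEX_BITS`, `LEVELS`) are hypothetical.

```python
# Sketch only: assumes x86-64 4-level paging, 4 KiB pages, 48-bit virtual addresses.
# All names here are illustrative and do not come from the guide itself.

PAGE_SHIFT = 12   # low 12 bits are the byte offset within a 4 KiB page
INDEX_BITS = 9    # each table level holds 512 entries, so 9 index bits per level
LEVELS = ("PML4", "PDPT", "PD", "PT")  # walk order, top level first


def split_x86_64_va(va: int) -> dict:
    """Return the page offset and the four table indices for a virtual address."""
    fields = {"offset": va & ((1 << PAGE_SHIFT) - 1)}
    for depth, name in enumerate(LEVELS):
        # PML4 uses bits 47:39, PDPT 38:30, PD 29:21, PT 20:12.
        shift = PAGE_SHIFT + INDEX_BITS * (len(LEVELS) - 1 - depth)
        fields[name] = (va >> shift) & ((1 << INDEX_BITS) - 1)
    return fields


if __name__ == "__main__":
    example = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
    print(split_x86_64_va(example))
```

A hardware page-table walker uses each index in turn to select an entry in the corresponding table, starting from the physical base address held in CR3. RISC-V Sv39 and Sv48 follow the same 9-bits-per-level pattern with three and four levels respectively.
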
Each chapter is a self-contained HTML file with:
Open any chapter directly in a browser, print to PDF, or host as GitHub Pages.
Content cites peer-reviewed literature, processor architecture manuals, and production system papers:
Speculative claims about proprietary implementations are avoided.