Hello
I work in AI engineering, and this blog is where I document my journey: designing systems, serving models, optimizing performance, and everything in between. Along the way, I’ll also share the lessons I’ve learned, including the mistakes that shaped my growth.
Featured
Mini SGLang (Part 1) - Architecture, Engine & Request Flow
A deep dive into the Mini SGLang architecture, covering system design, engine initialization, the KV cache, and the lifecycle of a single request.