ML Systems Researcher
About
I work on the engineering and research problems that arise where model architecture, hardware, distributed systems, and production inference meet.
This site is a place for careful notes: performance investigations, implementation details, readings, experimental results, and unfinished questions from ML systems work.
The recurring themes are practical: how to make models faster, cheaper, more reliable, and easier to reason about when they run on real hardware and real clusters.
Current interests
- LLM systems
- HPC
- inference
- CUDA
- distributed training
- ML infrastructure