This site is a place for careful notes: performance investigations, implementation details, readings, experimental results, and unfinished questions from ML systems work.

The recurring themes are practical: how to make models faster, cheaper, more reliable, and easier to reason about when they run on real hardware and real clusters.

Current interests