Future Forward Interview
I recently joined Nick and Matt on their podcast to talk about what we're building at Modular - and why the AI infrastructure layer matters more than most people think. We covered everything from why no engineer actually starts a project by picking their GPU, to the edge AI thesis I've been chasing since the TensorFlow Lite days at Google, to a fun side project I built called Compound Loop that orchestrates multiple frontier models against each other to produce better code than any single model can alone. If you're interested in where AI compute is heading - beyond the data center, beyond CUDA lock-in, and toward a world where intelligence runs everywhere - give it a listen or read through the highlights below.
Summary of the interview
Modular's Core Pitch
"Hypervisor for compute" - abstracting away hardware so AI programs run seamlessly across any silicon
No one starts a project saying "I must use this hardware" - they start with accuracy, latency, cost, and throughput targets. Hardware shouldn't be front and center
Today, moving from Nvidia to AMD to TPUs requires enormous rewrites; Modular eliminates that
Market & Customers
Inference-focused today, targeting sophisticated AI labs and Gen AI startups doing large-scale deployments
The multi-hardware future is already here - every major player is building its own silicon (Google TPUs, AWS Inferentia, AMD, Apple Silicon)
Analogy to the multi-cloud movement - no one wants to be locked to a single provider
Edge AI Thesis
Rooted in my TensorFlow Lite experience at Google - low latency, privacy-sensitive AI running where you are
Local models will get there - "the models you're using today are the dumbest models you'll ever use"
Pointed to OpenClaw as validating the pattern of local agents with persistent memory and full system access, even though inference still goes to cloud today
Big Tech vs. Open Community
OpenClaw did what Apple hasn't done in 15 years - a solo developer in Austria shipped a personal AI assistant on local machines
Big tech has organizational challenges, security concerns, and massive user bases creating caution
Microsoft got roasted for Copilot reading your screen, but OpenClaw launches and everyone loves it - the "HR meme" dynamic
Compound Loop (My Side Project)
Orchestration system that pits models against each other - Claude, Codex, Gemini
Workflow: plan → implement → review → merge, with models cross-reviewing each other's work
Key local component: local embedding models build representations of your codebase, so you only send small context windows to cloud models - ~30x reduction in token usage
Runs autonomously - finishes a task, goes back to the plan, does the next thing
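To make the plan → implement → review → merge cycle concrete, here is a minimal sketch of how a Compound Loop-style system could be wired up. All of it is my illustration, not the actual Compound Loop code: the model calls are stubs (a real system would hit the Claude, Codex, and Gemini APIs), and the "local embedding" is a toy bag-of-words vector standing in for a real local embedding model. The `retrieve` step is what keeps cloud prompts small - only the top-matching codebase chunks are sent, which is where the token savings come from.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy local "embedding": bag-of-words counts. A real setup would run
    # an actual local embedding model over the codebase.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    # Send only the k most relevant chunks to the cloud model,
    # instead of the whole codebase.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def call_model(name: str, prompt: str) -> str:
    # Stub for a cloud model call (Claude / Codex / Gemini in the real system).
    return f"[{name}] response to: {prompt[:40]}"

def compound_loop(tasks: list[str], codebase: list[str],
                  models: list[str]) -> list[tuple[str, str]]:
    # plan -> implement -> review -> merge: work through the task list
    # autonomously, with a *different* model cross-reviewing each draft.
    merged = []
    for i, task in enumerate(tasks):
        author = models[i % len(models)]
        reviewer = models[(i + 1) % len(models)]
        context = retrieve(task, codebase)        # small context window only
        draft = call_model(author, f"{task} | context: {context}")
        verdict = call_model(reviewer, f"review: {draft}")
        merged.append((draft, verdict))           # merge after review passes
    return merged
```

The cross-review pairing (author and reviewer are always different models) is the "battling" part; the retrieval step is the local component that keeps per-call token usage low.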
Fun Observation
Everyone defaults to the most powerful model even when they don't need it - "the higher the number, the better" pattern. Model labs are now starting to abstract that away with routing based on query complexity.
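For illustration, complexity-based routing can be as simple as a heuristic gate in front of two models. This sketch is entirely hypothetical - real routers in model labs use learned classifiers, and the model names, keyword cues, and length threshold here are made up:

```python
def route(query: str,
          small: str = "small-model",
          large: str = "frontier-model") -> str:
    # Crude proxy for query complexity: prompt length plus keyword cues.
    # Hypothetical thresholds; production routers learn this from data.
    hard_cues = ("prove", "refactor", "architecture", "debug")
    if len(query.split()) > 40 or any(c in query.lower() for c in hard_cues):
        return large
    return small
```

The point of the pattern is that users stop picking "the higher number" by hand - simple queries get the cheap model, and only genuinely hard ones pay for the frontier model.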