Notes
Short pieces about the methodology and architecture decisions behind the AI systems I ship — specs, evals, multi-agent orchestration, LLM integration, and the discipline of directing coding agents.
June 19, 2026
Your model has a six-week shelf life
In a single two-week window this month the industry shipped Claude Mythos 5, GPT-5.6, Gemini 3.2, and a wall of Chinese frontier models — Qwen 3.7, DeepSeek V4.1, GLM-6 and more. New frontier models now land on a roughly six-week metronome. If your product's edge is 'we use the best model,' your edge expires before the quarter does. Here's how to build for a moving target.
- ai-native
- architecture
June 15, 2026
The app that burned $15 million a day
OpenAI built the most hyped AI video app in history, then quietly killed it six months later. Sora was reportedly burning around $15 million a day in compute while taking in about $2.1 million in total — not per day, total. People loved it and it still lost money on every single clip. That's the lesson traditional software never taught us: a generative feature has a real cost every time someone uses it, and 'viral' doesn't fix 'loses money per use.' Here's how to check your own AI feature before it does the same thing.
- business
- ai-native
June 15, 2026
The niche model beats the giant
The agent Salesforce just paid $3.6 billion for doesn't run on the biggest, smartest model money can buy. It runs on Apex — a smaller model built for one job, customer support, that Salesforce says beats the top frontier models at actually resolving tickets. That's the detail worth more than the price tag. For a narrow, well-defined task, a model trained specifically for it can beat a general giant that knows everything and masters nothing. Here's why reaching for the biggest model is usually the wrong reflex.
- ai-native
- business
June 14, 2026
AI agents just got your credit card
On June 10, 2026, Visa plugged its payment network straight into ChatGPT. An AI agent can now shop and pay at any Visa merchant on your behalf — pick the product, run the checkout, settle the bill — using a credential scoped to that agent with spending caps and merchant limits you set up front. McKinsey thinks agent-driven shopping could be a trillion dollars of U.S. retail by 2030. For two years agents could recommend; now they can spend. Here's what actually changes when software holds the card, and the one question to ask before you hand it over.
- business
- ai-native
June 14, 2026
The fast model just got smart
For two years you made a trade every time you picked a model: fast and cheap, or smart and slow. Gemini 3.5 Flash just broke that trade-off. The 'Flash' tier (the cheap, quick one) now scores 55 on the Artificial Analysis Intelligence Index, ahead of Grok 4.3 and Claude Sonnet 4.6, while running over 280 tokens a second. The fast model is no longer the dumb model. That should make you re-open a decision most teams quietly froze a year ago: which model is your default, and is it still the right one? Here's how to think about it, including the catch.
- ai-native
- business
June 13, 2026
Agents are becoming a feature, not a product
Gartner expects 40% of enterprise applications to embed task-specific AI agents by the end of 2026, up from under 5% a year ago. Agentic AI is the fastest-growing enterprise priority, up 31.5% year over year. Read together, those numbers say something uncomfortable for a lot of startups: the agent is turning into a feature inside the software people already use, not a standalone product they switch to. If 'we built an agent that does X' is your whole pitch, the app that owns X is about to build it too. Here's what that means for what you build.
- business
- ai-native