Cheap, Fast, Frontier: What Gemini 3.5 Flash Means for Your Stack.

Gemini 3.5 Flash hit general availability — frontier-level quality at roughly 4× the speed, for $1.50 in / $9 out per million tokens.
Why cheap inference matters
Background jobs — scraping, classification, summarization, draft generation — get nearly free to run at volume. That changes what’s worth self-hosting.
The real move
It isn't picking one model. It's wiring MCP pipelines that route cheap work to fast models and hard work to the frontier — and paying for none of the SaaS in between.
Source: llm-stats