Don’t let the “flash” name fool you: this is an amazing model.
I have been playing with it for the past few weeks and it’s genuinely my new favorite. It’s so fast, and it has such vast world knowledge, that it’s more performant than Claude Opus 4.5 or GPT 5.2 extra high, for a fraction (basically an order of magnitude less!) of the inference time and price.
These flash models keep getting more expensive with every release.
Is there an OSS model that's better than 2.0 flash with similar pricing, speed and a 1m context window?
Edit: this is not the typical flash model, it's actually an insane value if the benchmarks match real world usage.
> Gemini 3 Flash achieves a score of 78%, outperforming not only the 2.5 series, but also Gemini 3 Pro. It strikes an ideal balance for agentic coding, production-ready systems and responsive interactive applications.
The replacement for the old Flash models will probably be the 3.0 Flash Lite then.
Even before this release the tools (for me: Claude Code and Gemini for other stuff) reached a "good enough" plateau that means any other company is going to have a hard time making me (I think soon most users) want to switch. Unless a new release from a different company has a real paradigm shift, they're simply sufficient. This was not true in 2023/2024 IMO.
With this release the "good enough" and "cheap enough" intersect so hard that I wonder if this is an existential threat to those other companies.
Pricing is $0.5 / $3 per million input / output tokens. 2.5 Flash was $0.3 / $2.5. That's a 66% increase in input token pricing and a 20% increase in output token pricing.
For comparison, from 2.5 Pro ($1.25 / $10) to 3 Pro ($2 / $12), there was a 60% increase in input token pricing and a 20% increase in output token pricing.
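The percentage jumps above can be checked with a quick script. Prices are the per-million-token figures quoted in this thread; note the input-token jump for Flash rounds to 67% rather than the 66% quoted.

```python
# Per-million-token prices (USD) as quoted in the thread: (input, output)
prices = {
    "2.5 Flash": (0.30, 2.50),
    "3 Flash":   (0.50, 3.00),
    "2.5 Pro":   (1.25, 10.00),
    "3 Pro":     (2.00, 12.00),
}

def pct_increase(old: float, new: float) -> int:
    """Percentage increase from old to new, rounded to the nearest whole percent."""
    return round(100 * (new - old) / old)

flash_in  = pct_increase(prices["2.5 Flash"][0], prices["3 Flash"][0])  # -> 67
flash_out = pct_increase(prices["2.5 Flash"][1], prices["3 Flash"][1])  # -> 20
pro_in    = pct_increase(prices["2.5 Pro"][0],  prices["3 Pro"][0])     # -> 60
pro_out   = pct_increase(prices["2.5 Pro"][1],  prices["3 Pro"][1])     # -> 20
print(flash_in, flash_out, pro_in, pro_out)
```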
I wonder if this suffers from the same issue as 3 Pro, that it frequently "thinks" for a long time about date incongruity, insisting that it is 2024, and that information it receives must be incorrect or hypothetical.
Just avoiding/fixing that would probably speed up a good chunk of my own queries.
It has a SimpleQA score of 69%. That's a benchmark that tests knowledge of extremely niche facts, so 69% is actually ridiculously high (Gemini 2.5 *Pro* scored 55%) and reflects either training on the test set or some sort of cracked way to pack a ton of parametric knowledge into a Flash model.
I'm speculating, but Google might have figured out some training magic trick to balance out how information is stored in the model's capacity. That, or this flash model has a huge number of parameters or something.
It's 1/4 the price of Gemini 3 Pro ≤200k and 1/8 the price of Gemini 3 Pro >200k - notable that the new Flash model doesn’t have a price increase after that 200,000 token point.
It’s also twice the price of GPT-5 Mini for input, and half the price of Claude 4.5 Haiku.
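Those ratios can be turned into a per-request cost comparison. The Gemini prices are the ones quoted in this thread; the GPT-5 Mini and Claude 4.5 Haiku input prices are back-derived from the "twice / half the price" ratios, and their output prices here are placeholders I made up for illustration, not figures from the thread.

```python
# Blended cost of a hypothetical request: 10,000 input + 1,000 output tokens.
IN_TOKENS, OUT_TOKENS = 10_000, 1_000

prices = {                            # (input $/M, output $/M)
    "Gemini 3 Flash":      (0.50, 3.00),
    "Gemini 3 Pro <=200k": (2.00, 12.00),
    "GPT-5 Mini":          (0.25, 2.00),   # output price assumed
    "Claude 4.5 Haiku":    (1.00, 5.00),   # output price assumed
}

def request_cost(model: str) -> float:
    """Dollar cost of the sample request for a given model."""
    inp, out = prices[model]
    return (IN_TOKENS * inp + OUT_TOKENS * out) / 1_000_000

for model in prices:
    print(f"{model}: ${request_cost(model):.4f}")

# Sanity-check the input-price ratios quoted in the comment:
assert prices["Gemini 3 Flash"][0] / prices["Gemini 3 Pro <=200k"][0] == 0.25
assert prices["Gemini 3 Flash"][0] / prices["GPT-5 Mini"][0] == 2.0
assert prices["Gemini 3 Flash"][0] / prices["Claude 4.5 Haiku"][0] == 0.5
```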
Glad to see a big improvement in the SimpleQA Verified benchmark (28% -> 69%), which is meant to measure built-in factuality, i.e. without adding grounding resources. That's one benchmark where all models seemed to have low scores until recently. Can't wait to see a model go over 90%... then it will be years till the competition is over the number of 9s in such a factuality benchmark, but that'd be glorious.
Does anyone else understand what the difference is between Gemini 3 'Thinking' and 'Pro'? Thinking "Solves complex problems" and Pro "Thinks longer for advanced math & code".
I assume that these are just different reasoning levels for Gemini 3, but I can't even find mention of there being 2 versions anywhere, and the API doesn't even mention the Thinking-Pro dichotomy.
I think about what would be most terrifying to Anthropic and OpenAI, i.e. the absolute scariest thing that Google could do. I think this is it: release low-latency, low-priced models with high cognitive performance and a big context window, especially in the coding space, because that is direct, immediate, very high ROI for the customer.
Now, imagine for a moment they had also vertically integrated the hardware to do this.
It's a cool release, but if someone on the Google team reads this:
Flash 2.5 is awesome in terms of latency and total response time without reasoning. In quick tests this model seems to be 2x slower. So for certain use cases, like quick one-token classification, Flash 2.5 is still the better model.
Please don't stop optimizing for that!
Feels like Google is really pulling ahead of the pack here. A model that is cheap, fast and good, combined with Android and GSuite integration, seems like such a powerful combination.
Presumably a big motivation for them is to be the first to get something good and cheap enough that they can serve it to every Android device, ahead of whatever the OpenAI/Jony Ive hardware project will be, and way ahead of Apple Intelligence. Speaking for myself, I would pay quite a lot for a truly 'AI-first' phone that actually worked.
My main issue with Gemini is that business accounts can't delete individual conversations. You can only enable or disable Gemini, or set a retention period (3 months minimum), but there's no way to delete specific chats. I'm a paying customer, prices keep going up, and yet this very basic feature is still missing.
So gemini 3 flash (non thinking) is now the first model to get 50% on my "count the dog legs" image test.
Gemini 3 pro got 20%, and everyone else has gotten 0%. I saw benchmarks showing 3 flash is almost trading blows with 3 pro, so I decided to try it.
Basically it is an image showing a dog with 5 legs, an extra one photoshopped onto its torso. Every model counts 4, and Gemini 3 Pro, while also counting 4, said the dog had a "large male anatomy". However, it failed a follow-up, saying 4 again.
3 Flash counted 5 legs on the same image, though only after I added a distinct "tattoo" to each leg as an assist. These tattoos didn't help 3 Pro or the other models.
So it is the first out of all the models I have tested to count 5 legs on the "tattooed legs" image. It still counted only 4 legs on the image without the tattoos. I'll give it 1/2 credit.
This model is breaking records on my benchmark of choice, which is 'the fraction of Hacker News comments that are positive.' Even people who avoid Google products on principle are impressed. Hardly anyone is arguing that ChatGPT is better in any respect (except brand recognition).
I think it's good, they're raising the size (and price) of flash a bit and trying to position Flash as an actually useful coding / reasoning model. There's always lite for people who want dirt cheap prices and don't care about quality at all.
But for me the previous models were routinely wrong time-wasters that added no overall speed increase once you take into account the lottery of whether they'd be correct.
Why wouldn't you switch? The cost to switch is near zero for me. Some tools have built-in model selectors, and direct CLI/IDE plug-ins have practically the same UI.
Thanks, that was a great breakdown of the cost. I had just assumed it was the same pricing. The pricing probably comes from the confidence and the buzz around Gemini 3.0 as one of the best-performing models. But competition is hot in this area, and it won't be long before we get similar-performing models at a cheaper price.
I'm more curious how Gemini 3 flash lite performs/is priced when it comes out. Because it may be that for most non coding tasks the distinction isn't between pro and flash but between flash and flash lite.
> Gemini 3 Flash is able to modulate how much it thinks. It may think longer for more complex use cases, but it also uses 30% fewer tokens on average than 2.5 Pro.
Oh wow - I recently tried 3 Pro preview and it was too slow for me.
After reading your comment I ran my product benchmark against 2.5 flash, 2.5 pro and 3.0 flash.
The results are better AND the response times have stayed the same.
What an insane gain - especially considering the price compared to 2.5 Pro.
I'm about to get much better results for 1/3rd of the price. Not sure what magic Google did here, but I would love to see a more technical deep dive comparing what they do differently in the Pro and Flash models to achieve such performance.
Also wondering, how did you get early access? I'm using the Gemini API quite a lot and have a quite nice internal benchmark suite for it, so would love to toy with the new ones as they come out.
- "Thinking" is Gemini 3 Flash with a higher "thinking_level"
- Pro is Gemini 3 Pro. It doesn't mention "thinking_level", but I assume it is set to high-ish.
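If that reading is right, the app's two modes would map onto one API knob. Here's a sketch of what the request payloads might look like; the field names follow the Gemini REST API's camelCase convention, but treat the exact names (and the model IDs) as assumptions, since I haven't verified them against the live API. No network call is made, the payloads are just assembled.

```python
# Hypothetical mapping of the app's "Thinking" / "Pro" modes onto a
# generateContent-style payload with a thinking level. Field and model
# names are assumptions, not verified API identifiers.

def build_request(model: str, prompt: str, thinking_level: str) -> dict:
    """Assemble a request payload with the given thinking level."""
    return {
        "model": model,
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingLevel": thinking_level},
        },
    }

# "Thinking" in the app ~= Flash with a higher thinking level (per the comment):
thinking = build_request("gemini-3-flash", "Prove sqrt(2) is irrational.", "high")
# "Pro" ~= the Pro model, presumably also at a high-ish level:
pro = build_request("gemini-3-pro", "Prove sqrt(2) is irrational.", "high")
```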
Yes, but the 3.0 Flash is cheaper, faster and better than 2.5 Pro.
So if 2.5 Pro was good for your use case, you just got a better model for about 1/3rd of the price, but it might hurt the wallet a bit more if you currently use 2.5 Flash and want an upgrade - which is fair tbh.
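A back-of-envelope check of the "about 1/3rd of the price" claim, using the per-million-token prices quoted in this thread plus the announcement's "30% fewer tokens on average than 2.5 Pro" figure, on a made-up sample workload:

```python
# Prices quoted in the thread: (input $/M, output $/M)
PRO_25  = (1.25, 10.00)   # Gemini 2.5 Pro
FLASH_3 = (0.50, 3.00)    # Gemini 3 Flash

def cost(price: tuple, in_tok: int, out_tok: int) -> float:
    """Dollar cost of a request at the given per-million-token prices."""
    return (in_tok * price[0] + out_tok * price[1]) / 1_000_000

# Example task (assumed workload): 10k input tokens, 2k output tokens on 2.5 Pro.
pro_cost = cost(PRO_25, 10_000, 2_000)            # $0.0325
# Gemini 3 Flash reportedly uses ~30% fewer output tokens than 2.5 Pro:
flash_out_tokens = 2_000 * 7 // 10                # 1400 tokens
flash_cost = cost(FLASH_3, 10_000, flash_out_tokens)  # $0.0092
print(f"Flash/Pro cost ratio: {flash_cost / pro_cost:.2f}")  # ~0.28
```

On this workload the ratio lands a little under 1/3, consistent with the claim; heavier input-to-output ratios would push it lower still, since the input price gap (0.5 vs 1.25) is smaller than the output gap.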
Correct. Opus 4.5 'solved' software engineering. What more do I need? Businesses need uncapped intelligence, and that is a very high bar. Individuals often don't.
Really stupid question: How is Gemini-like 'thinking' separate from artificial general intelligence (AGI)?
When I ask Gemini 3 Flash this question, the answer is vague but agency comes up a lot. Gemini thinking is always triggered by a query.
This seems like a higher-level programming issue to me. Turn it into a loop. Keep the context. Those two things make it costly for sure. But does it make it an AGI? Surely Google has tried this?
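The "turn it into a loop, keep the context" idea can be sketched in a few lines. This is a toy agent wrapper, not anything Google ships: `call_model` is a stub standing in for a real LLM API call, and its stopping rule is invented so the sketch runs offline.

```python
# Minimal sketch of "turn it into a loop, keep the context": re-invoke a model
# with accumulated history until the model itself decides to stop.

def call_model(context: list) -> str:
    # Stub: a real implementation would send `context` to an LLM endpoint.
    # Here we stop after a few self-triggered steps to keep the sketch runnable.
    return "DONE" if len(context) >= 3 else f"step {len(context)}"

def agentic_loop(goal: str, max_steps: int = 10) -> list:
    context = [goal]                 # the context is kept across iterations
    for _ in range(max_steps):
        reply = call_model(context)
        context.append(reply)        # each reply feeds the next invocation
        if reply == "DONE":          # the model, not a user query, ends the loop
            break
    return context

history = agentic_loop("organize my inbox")
```

The loop and the retained context are exactly the two things the comment identifies as costly: every iteration re-sends a longer history, so tokens grow with each step.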
For my app's evals, Gemini Flash and Grok 4 Fast are the only ones worth using. I'd love for an open-weights model to compete in this arena, but I haven't found one.
The price increase sucks, but you really do get a whole lot more. They also had the "Flash Lite" series; 2.5 Flash Lite is $0.10/M, so hopefully we see something like 3.0 Flash Lite for $0.20-0.25.