Gemini 3 Flash Outperforms Gemini 3 Pro and GPT-5.2 in These Key Tests

The AI war continues to heat up. Just a few weeks after OpenAI declared “code red” in its race against Google, the latter released its latest lightweight model: Gemini 3 Flash. This Flash model is the newest member of Google’s Gemini 3 family, which began with Gemini 3 Pro and Gemini 3 Deep Think. But while it is intended as a lighter, cheaper variant of the existing Gemini 3 models, Gemini 3 Flash is quite powerful in its own right. In fact, it outperforms both Gemini 3 Pro and OpenAI’s GPT-5.2 in some performance tests.
Lightweight models are typically designed for simpler queries, budget-conscious tasks, or less powerful hardware. They are usually faster than heavyweight models, which take longer to process queries but can handle more complex ones. According to Google, Gemini 3 Flash combines the best of both worlds: the “professional performance” of Gemini 3 with the “latency, efficiency, and cost of Flash.” While this matters most to developers, regular users should also notice improvements, since Gemini 3 Flash is now the default model for both Gemini (the chatbot) and AI Mode, Google’s AI-powered search.
Gemini 3 Flash Performance
These improvements show up in Google’s published benchmark results for Gemini 3 Flash. On Humanity’s Last Exam, a benchmark that assesses academic reasoning across 2,500 questions spanning more than 100 subjects, Gemini 3 Flash scored 33.7% without tools and 43.5% with search and code execution. Compare that to Gemini 3 Pro (37.5% and 45.8%, respectively) or OpenAI’s GPT-5.2 (34.5% and 45.5%). On MMMU-Pro, a benchmark that tests a model’s multimodal understanding and reasoning, Gemini 3 Flash achieved the highest score (81.2%), ahead of Gemini 3 Pro (81%) and GPT-5.2 (79.5%). In fact, of the 21 performance tests Google highlighted in its announcement, Gemini 3 Flash came out on top in three: MMMU-Pro (effectively tied with Gemini 3 Pro), Toolathlon, and MMMLU. Gemini 3 Pro still leads in the largest number of tests (14), and GPT-5.2 took first place in eight, but Gemini 3 Flash holds its own.
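To make those gaps concrete, the scores quoted above can be lined up side by side. The sketch below uses only the numbers published in Google’s announcement; the dictionary layout and the `leader` helper are illustrative, not anything Google provides:

```python
# Benchmark scores (percent) as published in Google's announcement;
# only the benchmarks quoted in this article are included.
SCORES = {
    "HLE (no tools)":   {"Gemini 3 Flash": 33.7, "Gemini 3 Pro": 37.5, "GPT-5.2": 34.5},
    "HLE (with tools)": {"Gemini 3 Flash": 43.5, "Gemini 3 Pro": 45.8, "GPT-5.2": 45.5},
    "MMMU-Pro":         {"Gemini 3 Flash": 81.2, "Gemini 3 Pro": 81.0, "GPT-5.2": 79.5},
}

def leader(benchmark):
    """Return (model, score) with the top published score on a benchmark."""
    results = SCORES[benchmark]
    best = max(results, key=results.get)
    return best, results[best]

for bench in SCORES:
    model, score = leader(bench)
    print(f"{bench}: {model} leads at {score}%")
```

Run this and the pattern the article describes falls out directly: Gemini 3 Pro still tops both Humanity’s Last Exam configurations, while Gemini 3 Flash edges ahead on MMMU-Pro.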
Google notes that Gemini 3 Flash also outperforms Gemini 3 Pro and the entire 2.5 series on SWE-bench Verified, a benchmark that tests a model’s software engineering abilities. Gemini 3 Flash scored 78%, Gemini 3 Pro 76.2%, Gemini 2.5 Flash 60.4%, and Gemini 2.5 Pro 59.6%. (It’s worth noting that GPT-5.2 performed better here than any of the models Google mentioned in this announcement.) It’s a very close race, especially considering this is a lightweight model keeping pace with the company’s flagship models.
Gemini 3 Flash Cost
This creates an interesting value proposition for developers who pay to use AI models in their products. Gemini 3 Flash costs $0.50 per million input tokens (the text you send to the model) and $3.00 per million output tokens (the text the model returns). Compare this to Gemini 3 Pro, which costs $2.00 per million input tokens and $12.00 per million output tokens, or GPT-5.2, which costs $3.00 and $15.00, respectively. It’s not as cheap as Gemini 2.5 Flash ($0.30 and $2.50) or Grok 4.1 Fast ($0.20 and $0.50), but it outperforms those models in Google’s published performance tests. Google also notes that Gemini 3 Flash uses 30% fewer tokens on average than 2.5 Pro, which it says saves money while delivering three times the performance.
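Those per-token prices are easiest to compare with a quick back-of-the-envelope calculation. The sketch below uses the published rates quoted above; the workload sizes are made-up round numbers for illustration only:

```python
# Published prices (USD per million tokens) as quoted in this article:
# (input rate, output rate). Workload volumes below are assumptions.
PRICES = {
    "Gemini 3 Flash":   (0.50, 3.00),
    "Gemini 3 Pro":     (2.00, 12.00),
    "GPT-5.2":          (3.00, 15.00),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Grok 4.1 Fast":    (0.20, 0.50),
}

def cost_usd(model, input_tokens, output_tokens):
    """Cost of one workload at the published per-million-token rates."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens / 1e6) * in_rate + (output_tokens / 1e6) * out_rate

# Illustrative monthly workload: 50M input tokens, 10M output tokens.
for model in PRICES:
    print(f"{model:16s} ${cost_usd(model, 50e6, 10e6):8.2f}")
```

At those assumed volumes, Gemini 3 Flash comes in at $55 versus $220 for Gemini 3 Pro and $300 for GPT-5.2; the claimed 30% token efficiency over 2.5 Pro would shrink the output-token bill further still.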
If you need an LLM like Gemini 3 Flash to power your products but don’t want to pay the higher price of the more powerful models, then I think this latest lightweight model might be financially attractive.
How the Average User Will Experience Gemini 3 Flash
Most of us using AI aren’t developers who need to worry about API pricing. Most Gemini users are likely familiar with this model through Google’s consumer products, such as Search, Workspace, and the Gemini app.
Starting today, Gemini 3 Flash is the default model in the Gemini app. Google claims it can complete a variety of tasks “in just a few seconds.” That could mean asking Gemini for tips on improving your golf swing based on a video of yourself, or uploading a speech you’ve written on a historical topic and asking which facts you might have missed. You can also ask the bot to build a working app from a rough outline of your thoughts.
You can also experience the power of Gemini 3 Flash in Google Search’s AI Mode. Google claims the new model better “understands the nuances of your question” and carefully analyzes every part of it. AI Mode attempts to provide a more complete answer by scanning hundreds of websites simultaneously and compiling a concise, sourced summary. It remains to be seen whether Gemini 3 Flash outperforms the models that previously powered AI Mode.
I still don’t see much use for generative AI products in everyday life, and I’m not entirely sure Gemini 3 Flash will change that for me. However, the balance it strikes between performance and processing cost is interesting, and I’m especially curious to see how OpenAI responds.
Starting today, Gemini 3 Flash is available to all users. Beyond the Gemini app and AI Mode, developers will find it in the Gemini API in Google AI Studio, the Gemini CLI, and Google Antigravity, Google’s new agentic development platform. Enterprise users can access it in Vertex AI and Gemini Enterprise.