Google just shipped computer use as a built-in tool in Gemini 3.5 Flash. Not as a separate model. Not as a beta feature behind a flag. A native tool, available through the Gemini API right now.
This is a big deal for anyone building AI agents, and it changes the calculus on which model to use for browser automation, desktop control, and mobile agents.
Here’s what happened and what it actually means.
Computer use is now native in Gemini 3.5 Flash
Previously, if you wanted computer use with Gemini, you had to use the standalone Gemini 2.5 computer use model — a separate model with its own API endpoint and its own quirks.
As of today, computer use is a built-in tool in Gemini 3.5 Flash. The same model that handles function calling, search grounding, and Maps integration now handles computer use as a first-class capability. You call the standard Gemini 3.5 Flash endpoint and enable the computer_use tool. That’s it.
Google’s own docs now list Gemini 2.5 as “Legacy” for computer use. The recommended path is Gemini 3.5 Flash.
What this enables
With computer use as a native tool, developers can build agents that:
- See what’s on screen
- Reason about what to do next
- Take actions across browser, mobile, and desktop environments
Google says this unlocks improved performance for long-horizon and enterprise automation tasks, including continuous software testing and knowledge work across professional applications.
The three customer testimonials in the announcement tell the story:
- Browserbase said 3.5 Flash delivers “comparable accuracy to frontier models with better cost and latency” for complex browsing tasks.
- Browser Use called it “a clear step up from the previous Flash generation” that “catches up to frontier-level performance while keeping the speed and cost profile that makes Google our number one choice at scale.”
- UiPath said 3.5 Flash “delivers high throughput, strong reliability, and the best price-performance ratio among the models we evaluated.”
Three different companies, three different stacks, all saying basically the same thing: Google closed the quality gap on computer use while keeping the cost advantage of Flash.
Safety: prompt injection defenses built-in
Google added targeted adversarial training specifically for computer use to mitigate prompt injection risks. On top of that, they’re releasing two optional enterprise safeguard systems:
- Require explicit user confirmation for sensitive or irreversible actions.
- Auto-stop tasks if an indirect prompt injection is identified.
These are optional enterprise features, but they matter. Prompt injection is the biggest unsolved problem in agentic AI, and having a built-in defense layer is better than rolling your own.
How this changes the agent landscape
Three models now dominate the computer use conversation:
- Anthropic Claude — started the computer use trend, strong accuracy, higher cost
- OpenAI — Operator and Codex with computer use capabilities
- Google Gemini 3.5 Flash — just closed the gap on quality while keeping the lowest cost and latency
The competitive pressure is real. Claude kicked off the computer use wave. OpenAI responded with Operator and Codex. Google just responded with native computer use in their fastest, cheapest production model.
For developers building at scale, that last point — cost and latency — is what makes this announcement land differently. Gemini 3.5 Flash is already one of the cheapest frontier models per token. Adding computer use as a built-in tool instead of a separate model call means fewer architectural headaches and lower total cost per task.
What’s available now
The feature is live today through:
- The Gemini API computer use docs
- The GitHub reference implementation
- A demo environment hosted by Browserbase
- The Gemini Enterprise Agent Platform
There’s also a safety best practices guide for developers deploying computer use agents in production.
Caveats
- The GitHub repo is a “preview,” not a stable release. Expect iteration.
- Pricing for computer use tool calls isn’t spelled out separately — it’s part of the standard 3.5 Flash token pricing, but heavy computer use workflows can generate a lot of screenshot tokens.
- The enterprise safeguards (confirmation prompts, injection detection) are optional and need to be explicitly configured.
- Regional availability for the full Gemini API may vary.
Bottom line
Google just made computer use a native capability of its fastest, cheapest frontier model. That is a strategic move aimed directly at the agent-building market. Gemini 3.5 Flash was already the sensible choice for cost-sensitive agent workloads. Adding built-in computer use makes the argument stronger.
For developers already building with Claude’s computer use or OpenAI’s Operator, Gemini 3.5 Flash is worth a serious look. The quality gap has narrowed, the cost advantage is real, and having it as a native tool instead of a separate model endpoint simplifies the stack.
The agent wars just got another contender, and this one comes with Google’s distribution muscle and the lowest price point in the fight.



