Just days after Anthropic upgraded its computer-controlling AI agent capabilities, Google has introduced a similar feature for the Gemini ecosystem. The company has made ‘Computer Use’ a built-in feature of Gemini 3.5 Flash, allowing developers to create AI agents that can interact with browsers, desktop applications, and mobile environments.
Previously offered as a standalone model, the computer-use capability is now integrated directly into Gemini 3.5 Flash. The change aims to simplify the development of autonomous AI agents that observe screens, reason through tasks, and perform actions such as clicking buttons, entering text, navigating websites, and interacting with software interfaces.
Google explains, “The feature lets AI agents interact with software the way a person does: by looking at a screen and deciding where to click, what to type, and when to scroll, rather than relying on rigid, pre-coded integrations with each app.”
Gemini 3.5 Flash can run agentic workflows on browsers, mobile phones, and computers. The model can analyze screenshots, understand the interface, and generate actions that mirror human interactions within software applications. Use cases include:
Automating repetitive form filling and data entry
Conducting software and application testing
Performing research across websites
Managing long-running enterprise workflows
Supporting knowledge workers across multiple applications
Support for building AI agents sets Gemini 3.5 Flash apart from conventional chatbots. Google has made it clear that the model is designed for long-horizon tasks.
Developers can access the new feature via the Gemini API and the Gemini Enterprise Agent Platform. To activate ‘Computer Use’, a developer enables the tool in Gemini 3.5 Flash and sets the environment to browser, desktop, or mobile.
The model returns actions in a structured format, such as mouse clicks and keystrokes, which are then executed by the developer's client-side environment. Google has also added an ‘intent’ field, which explains the reason for the action taken.
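The loop described above, in which the model proposes a structured action and the developer's client executes it, can be sketched roughly as follows. This is an illustration only and does not use the Gemini API: the Action class, its field names other than ‘intent’, the action kinds ("click", "type", "done"), the FakeScreen driver, and the scripted stand-in for the model are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Action:
    """Hypothetical shape of one structured action returned by the model."""
    kind: str                       # assumed kinds: "click", "type", "done"
    intent: str                     # the model's stated reason for the action
    args: Dict[str, object] = field(default_factory=dict)


class FakeScreen:
    """Stand-in for a real browser, desktop, or mobile driver."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def screenshot(self) -> bytes:
        return b"fake-png-bytes"    # a real client would capture the screen

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x}, {y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")


def execute(action: Action, screen: FakeScreen) -> None:
    """Client-side dispatch: map a structured action onto the driver."""
    if action.kind == "click":
        screen.click(action.args["x"], action.args["y"])
    elif action.kind == "type":
        screen.type_text(action.args["text"])
    else:
        raise ValueError(f"unsupported action kind: {action.kind}")


def run_agent(next_action: Callable[[bytes], Action],
              screen: FakeScreen, max_steps: int = 10) -> List[str]:
    """Observe, ask for an action, execute it; stop on "done" or max_steps."""
    intents: List[str] = []
    for _ in range(max_steps):
        action = next_action(screen.screenshot())
        intents.append(action.intent)   # keep the intents as an audit trail
        if action.kind == "done":
            break
        execute(action, screen)
    return intents


def scripted_model() -> Callable[[bytes], Action]:
    """Replays a fixed script in place of a real model call."""
    script = iter([
        Action("click", "Focus the search box", {"x": 120, "y": 48}),
        Action("type", "Enter the query", {"text": "quarterly report"}),
        Action("done", "Task complete"),
    ])
    return lambda _screenshot: next(script)


if __name__ == "__main__":
    screen = FakeScreen()
    for line in run_agent(scripted_model(), screen):
        print(line)
    print(screen.log)
```

In a real client, scripted_model would be replaced by a call to the model and FakeScreen by an actual automation driver. Recording each intent alongside the executed action gives developers a readable trail for debugging or review.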