Google is reportedly working on an artificial intelligence (AI) system that can take control of a user’s web browser to accomplish certain tasks. According to the report, the new technology is codenamed Project JARVIS and is similar to the computer use tool released by Anthropic last week. However, rather than taking over a user’s entire PC, the Mountain View-based tech giant’s offering is said to focus solely on executing tasks within the browser. The capability is reportedly slated for release alongside the next version of the Gemini AI model in December.
Google is reportedly working on browser control AI
The Information reports that the tech giant is developing a new capability that will let users automate tasks such as booking flights or movie tickets online. Based on its description, Google is likely using agentic AI for this capability. Agentic AI refers to AI systems that are goal-oriented and designed to carry out complex, multi-step tasks across different modalities.
Agentic AI systems can be used to control specific computer functions, drive autonomous vehicles and robots, and more. They can use computer vision to analyze the external environment and, with the help of specialized software, perform actions such as button presses and cursor movements.
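The observe-and-act behaviour described above can be illustrated with a minimal sketch. It does not represent Google's or Anthropic's implementation: `FakePage`, `decide` and `run_agent` are hypothetical names invented for this example, and the "page" is a simple in-memory object rather than a real browser or screenshot.

```python
from dataclasses import dataclass, field


@dataclass
class FakePage:
    """Hypothetical stand-in for a browser page: form fields plus a submit button."""
    fields: dict = field(default_factory=lambda: {"name": "", "email": ""})
    submitted: bool = False

    def observe(self) -> dict:
        # A real agent would analyze a screenshot here; this returns the state directly.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def type_text(self, name: str, text: str) -> None:
        # Mimics keystrokes into a form field.
        self.fields[name] = text

    def click_submit(self) -> None:
        # Mimics a button press; the form only submits when every field is filled.
        if all(self.fields.values()):
            self.submitted = True


def decide(observation: dict, goal: dict):
    """Pick the next action: fill the first field that differs from the goal, then submit."""
    for name, wanted in goal.items():
        if observation["fields"].get(name) != wanted:
            return ("type", name, wanted)
    if not observation["submitted"]:
        return ("click_submit",)
    return None  # nothing left to do


def run_agent(page: FakePage, goal: dict, max_steps: int = 10) -> list:
    """Repeat observe, decide, act until the goal is met or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        action = decide(page.observe(), goal)
        if action is None:
            break
        if action[0] == "type":
            page.type_text(action[1], action[2])
        else:
            page.click_submit()
        history.append(action)
    return history
```

Calling `run_agent(FakePage(), {"name": "Ada", "email": "ada@example.com"})` fills both fields and then submits the form. In a real agent, the `decide` step would presumably be handled by an AI model rather than a fixed rule, and the step budget guards against the loop running indefinitely.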
According to the report, Google’s agentic AI is being called Project JARVIS, a name likely derived from JARVIS (Just A Rather Very Intelligent System), the AI assistant seen in Marvel’s Iron Man films. Citing people familiar with the matter, the report claims this technology could be launched with Gemini’s next flagship large language model (LLM) as early as December.
The feature is said to be limited to the browser and will let users buy products from e-commerce websites, book tickets, fill out forms and more. However, it is not known whether the AI can also perform more complex tasks, such as managing investment portfolios online or conducting transactions through online banking. Details about the feature’s privacy and user security safeguards are also missing. These questions will likely be answered once Google officially announces the capability.