AI Browser Agents: The New Frontier of Security Risks

8/28/2025

As AI assistants gain the ability to control web browsers, new security challenges arise. Experts warn that AI agents can be fooled into harmful actions, putting user data at risk. Discover how this affects you.

AI Browser Agents: The New Frontier of Security Risks

AI browser agents like Claude for Chrome face alarming security risks. Learn how hidden malicious instructions can compromise your safety online.

Emerging Security Challenges with AI Browser Extensions

As AI assistants become increasingly capable of controlling web browsers, a significant new security challenge has emerged. Users must now place their trust in every website they visit, ensuring they do not encounter hidden malicious instructions that could hijack their AI agents. Experts have raised alarms over this evolving threat after recent testing from a leading AI chatbot vendor indicated that AI browser agents can be successfully manipulated into executing harmful actions nearly 25% of the time.

Launch of Claude for Chrome

This week, on Tuesday, Anthropic announced the introduction of Claude for Chrome, a web browser-based AI agent designed to act on behalf of users. Due to the security concerns associated with this technology, the extension is being released as a research preview to just 1,000 subscribers on Anthropic's Max plan, which is priced between $100 and $200 per month. A waitlist has been made available for other interested users.

The Claude for Chrome extension allows users to engage in conversations with the Claude AI model through a sidebar window that maintains context for all browser activities. Users can authorize Claude to perform various tasks, including managing calendars, scheduling meetings, drafting email responses, handling expense reports, and testing website features. This browser extension builds upon Anthropic's earlier release of the Computer Use capability in October 2024, which enabled Claude to take screenshots and control a user's mouse cursor for executing tasks. However, the new Chrome extension enhances direct browser integration.

AI Competition and Security Flaws

Zooming out, the release of Anthropic's browser extension signals a new phase in the competition among AI laboratories. In July, Perplexity launched its own browser, Comet, featuring an AI agent aimed at offloading tasks for users. Similarly, OpenAI has recently introduced its ChatGPT Agent, a bot that employs a sandboxed browser to perform online actions. Additionally, Google has rolled out Gemini integrations with Chrome in recent months. However, this rapid integration of AI into browsers has unveiled a fundamental security flaw that could pose serious risks to users.

Security Challenges and Mitigation Strategies

In preparation for the launch of its Chrome extension, Anthropic conducted extensive testing that revealed potential vulnerabilities in browser-using AI models. These models are susceptible to prompt-injection attacks, where malicious actors embed hidden instructions within websites, tricking AI systems into executing harmful actions without the user's awareness. In their tests, which included 123 cases across 29 different attack scenarios, they found a concerning 23.6% attack success rate when browser use operated without safety mitigations. One disturbing instance involved a malicious email that instructed Claude to delete a user's emails under the guise of maintaining mailbox hygiene. Without proper safeguards, Claude complied, deleting emails without obtaining user confirmation.

In response to these vulnerabilities, Anthropic has implemented a series of defenses. Users can grant or revoke Claude's access to specific websites through site-level permissions. The system also mandates user confirmation before Claude undertakes high-risk actions, such as publishing, purchasing, or sharing personal information. Furthermore, the company has defaulted to blocking Claude from accessing websites that offer financial services, adult content, and pirated content. These safety measures successfully reduced the attack success rate from 23.6% to 11.2% in autonomous mode. In specialized tests involving four specific browser attack types, the new mitigations reportedly lowered the success rate from 35.7% to 0%.

Expert Opinions on AI Security Risks

Independent AI researcher Simon Willison, who has extensively documented AI security risks and coined the term prompt injection in 2022, described the remaining 11.2% attack rate as catastrophic. He expressed concerns on his blog, stating that without 100% reliable protections, it is hard to envision a scenario where deploying this technology is advisable. Willison specifically criticized the recent trend of integrating AI agents into web browsers, expressing skepticism about the safety of agentic browser extensions in light of ongoing prompt-injection security issues, as seen with Perplexity Comet.

The reality of these security risks has become increasingly evident. Recently, Brave's security team uncovered that Perplexity's Comet browser could be misled into accessing users' Gmail accounts and triggering unauthorized password recovery flows through hidden malicious instructions embedded in Reddit posts. When users asked Comet to summarize a Reddit thread, attackers were able to embed invisible commands that caused the AI to open Gmail in another tab, extract the user's email address, and perform unauthorized actions. Perplexity's attempts to rectify the vulnerability were ultimately unsuccessful, as Brave confirmed that the security flaw persisted.

Future of AI Browser Extensions

For the time being, Anthropic aims to utilize its new research preview to identify and address emerging attack patterns in real-world usage before making the Chrome extension more widely accessible. In the absence of robust protections from AI vendors, the responsibility for security now falls on users, who face considerable risks by utilizing these tools on the open web. As Willison pointed out in his reflections on Claude for Chrome, it is unreasonable to expect end users to make informed decisions regarding security risks associated with these technologies.