Anthropic's most advanced AI model accessed without authorization: Just how fragile is the so-called "safety barrier"?
Anthropic's most powerful AI model to date, Mythos, was subject to unauthorized access on the same day the company announced its limited testing program, exposing significant vulnerabilities in its carefully constructed safety control system designed to prevent the spread of dangerous technologies.
According to the latest disclosure by Bloomberg, a small group of users have been accessing Mythos continuously through a private online forum, using third-party contractor access, internet reconnaissance tools, and clues from external data leaks. An Anthropic spokesperson stated that the company is investigating reports of unauthorized access to Claude Mythos Preview via a third-party vendor environment, but currently no evidence shows that the access has exceeded the range of the third-party vendor environment or has affected the company's own systems.
What makes Mythos special is that Anthropic has defined it as a model capable of identifying and exploiting “every major operating system and every major browser” vulnerability at user command—which is the core reason the company strictly limits its release. Although the unauthorized access has reportedly not been used for cyberattacks, it raises further questions about whether other users are accessing the model without permission and for what purposes.
The incident directly tests Anthropic’s Project Glasswing safety control framework built around Mythos, and brings new compliance pressures to the AI company, whose annual revenue has surpassed $30 billion and is rapidly expanding.
How the Barriers Were Breached
These users come from a private Discord channel dedicated to tracking information on unreleased AI models, where members routinely collect technical details from Anthropic and other developers by scanning unsecured sites like GitHub with robot programs.
According to insiders, they used a variety of methods. First, one member used their legitimate Anthropic third-party contractor access obtained through contract evaluation work; second, they leveraged internet reconnaissance tools commonly used by cybersecurity researchers. To locate Mythos's specific network position, they inferred based on Anthropic’s past naming and path conventions, and obtained crucial reference information from a recent data leak at AI training startup Mercor—which collaborates with several top AI developers.
Insiders provided screenshots and live demonstrations of the model as evidence for the claims. For safety reasons, Bloomberg News did not disclose the name of the company involved in the contract. The insider also revealed the group currently holds access channels to multiple other unreleased Anthropic models.
According to insiders, the main motivation of this group is exploring new models, not malicious use. So far, they have not run any security-related commands on Mythos, deliberately choosing low-key tasks like building simple web pages to reduce the risk of detection by Anthropic.
Nonetheless, at launch, Anthropic asserted that the model’s capabilities were so powerful they dared not make it public.
This situation doesn’t mean potential risks can be ignored. Anthropic has made clear that Mythos can systematically identify and exploit vulnerabilities in mainstream operating systems and browsers at user command. Against this backdrop, unauthorized access itself—regardless of user intent—constitutes a substantial challenge to the company’s safety framework and raises concerns about other potential unseen users.
Project Glasswing: Defensive Logic of Limited Access
Based on this risk assessment, Anthropic included Mythos in a special project called "Project Glasswing", aiming to let vetted partners test the model early and strengthen their systems against potential cyberattacks.
Currently, dozens of organizations, including Apple, Amazon, and Cisco, are approved for early testing; Amazon, as a major Anthropic partner and investor, offers access to certain institutions via its Bedrock platform under approval.
Anthropic positions this plan as a “preemptive defense action,” intending to guide such high-level hacking capabilities toward defensive purposes before the technology spreads to broader actors. Recently, more financial and government institutions on both sides of the Atlantic have applied for early testing to protect their systems from malicious actors. Anthropic says there are currently no plans to open Mythos to the public.
Rapid Expansion, Rising Security Pressure
This incident comes as Anthropic’s business is rapidly expanding. According to Bloomberg, the company’s annual revenue has topped $30 billion, surpassing OpenAI, with growth tripling since the end of last year.
However, the ongoing commercial expansion means the model testing chain is lengthening and involves third-party contractors, partners, and platform integrators. This unauthorized access event demonstrates that any weakly managed node in the access chain could become a breach point for the most powerful technologies.
How to ensure security at every link amid rapid scale growth will be a core issue Anthropic cannot avoid.
Risk Warning and DisclaimerMarkets involve risk; investment requires caution. This article does not constitute personal investment advice and does not take into account any individual user's particular investment goals, financial situation, or needs. Users should consider whether any opinions, views, or conclusions contained herein are appropriate for their circumstances. Investing accordingly is at your own risk.