The AI turn and the inversion of transparency

The arrival of artificial intelligence agents capable of auditing code at scale changes the historical equation of open source. Linus’s Law (given enough eyeballs, all bugs are shallow) assumed a human arithmetic in which defenders and attackers read code at the same pace. That assumption breaks down with models able to scan millions of lines of code exhaustively in minutes. The cases that follow document this regime change and the recent empirical evidence for it.

Honest caveat: this family documents very recent events (April 2026) whose structural reach will only be clear with several months of hindsight. Part of the security community regards Anthropic’s announcements as partly a communications stunt, and the claims have not yet been validated by complete, independent external audits. The cases that follow reproduce the documented facts and the counter-analyses published to date. If, in 12 to 18 months, the Mythos announcements turn out to have been overstated, thesis 9 of the manifesto would remain valid for two independent reasons. First, the supply chain fragility documented in family 6 (XZ, Heartbleed, Log4Shell, IngressNightmare) is already enough to establish the defensive asymmetry. Second, the Copy Fail case documented later in this same family (CVE-2026-31431, disclosed by Theori in April 2026, independently of Anthropic) provides a public, reproducible, patched proof of an AI-driven kernel zero-day discovery: an anchor that does not depend on the Mythos case or on any single proprietary frontier model.

Anthropic’s Project Mythos — The first public demonstration#

Date of the event : 7 April 2026
Status : confirmed, model not released
Manifesto theses illustrated : 9, 10

The fact#

On 7 April 2026, Anthropic published a technical blog post announcing the capabilities of its new model, named Claude Mythos Preview. According to Anthropic’s red team, the model can discover and exploit zero-day vulnerabilities (that is, vulnerabilities never previously identified) fully autonomously in real open source codebases.

Among the documented findings:

CVE-2026-4747, a buffer overflow in FreeBSD’s svc_rpc_gss_validate function, 17 years old, granting remote root access to an unauthenticated attacker on any machine exposing NFS.
A 27-year-old bug in OpenBSD, unearthed by the model.
Anthropic claimed to have identified thousands of zero-day vulnerabilities across virtually every major operating system and web browser, of which fewer than 1% had been fixed by their maintainers at the time of the announcement.

The decisive detail is Anthropic’s decision not to make Mythos publicly available, which is exceptional for a commercial model. Instead, the company launched Project Glasswing, an initiative backed by $100 million in usage credits and $4 million in donations to open source security organisations, whose objective is to help defenders prepare for a world where these capabilities exist. The initial consortium includes Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA and Palo Alto Networks. Access to Mythos Preview is expensive: $25 per million input tokens and $125 per million output tokens, about five times the price of Claude Opus 4.6.

Critical reception#

The announcement has been challenged. Tom’s Hardware published an analysis pointing out that Anthropic’s spectacular claims (“thousands of zero-days across all major operating systems”) actually rested on 198 manual reviews documented in the model’s system card, and that Anthropic’s tone was at least partly a sales pitch disguised as a warning. The Centre for Emerging Technology and Security (CETaS) at the Alan Turing Institute, meanwhile, stressed that the claims could only be independently validated over several months.

Logan Graham, who heads offensive cyber research at Anthropic, publicly told NBC News that he expected to see comparable capabilities widely distributed within six to twelve months, notably via Chinese open-source models. US Treasury Secretary Scott Bessent convened the country’s leading financial institutions to discuss the topic.

What it demonstrates#

Project Mythos is, to date, the first public and documented demonstration of an AI capable of exploiting, at scale, the transparency of open source code as an offensive surface. It qualitatively changes the asymmetry between attackers and defenders.

For the open source sovereignty debate, this case shows that:

Code transparency is no longer in itself a defensive achievement. To remain so, it requires audit means symmetrical to offensive capacities.
The temporal gap between discovery and patch becomes critical. If a model can discover thousands of zero-days in a few days, and maintainers fix at a historical rate of a few per year, the exposure window explodes.
Defensive investment can no longer be left to volunteer work. The defenders of critical bricks need access to AI capacities to remain in balance — otherwise the balance of power tips structurally.
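
The second point can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a constant fix rate and a one-off batch of findings; the function name `years_to_clear` and all figures are hypothetical, chosen only to match the orders of magnitude in the text ("thousands" of findings, "a few per year" fixed), not measured data.

```python
def years_to_clear(backlog: int, fixes_per_year: float) -> float:
    """Years needed to fix `backlog` findings at a constant fix rate."""
    if fixes_per_year <= 0:
        raise ValueError("fixes_per_year must be positive")
    return backlog / fixes_per_year


# Hypothetical backlog: a batch of findings delivered in a few days.
BACKLOG = 2000

# Hypothetical fix rates: volunteer pace, funded team, AI-assisted team.
for rate in (5, 50, 500):
    years = years_to_clear(BACKLOG, rate)
    print(f"{rate:>4} fixes/year -> last finding patched after {years:.0f} years")
```

Under these assumptions the exposure window scales linearly with the backlog and inversely with the fix rate, which is why the text treats defensive capacity, and not discovery, as the bottleneck.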

Sources#

Anthropic Red Team (April 2026), Claude Mythos Preview : https://red.anthropic.com/2026/mythos-preview/
Anthropic (April 2026), Project Glasswing: Securing critical software for the AI era : https://www.anthropic.com/glasswing
The Hacker News (April 2026), Anthropic’s Claude Mythos Finds Thousands of Zero-Day Flaws Across Major Systems : https://thehackernews.com/2026/04/anthropics-claude-mythos-finds.html
Tom’s Hardware (April 2026), Anthropic’s Claude Mythos isn’t a sentient super-hacker, it’s a sales pitch — claims of ‘thousands’ of severe zero-days rely on just 198 manual reviews.
Centre for Emerging Technology and Security (CETaS, Turing Institute), Claude Mythos: What Does Anthropic’s New Model Mean for the Future of Cybersecurity? : https://cetas.turing.ac.uk/publications/claude-mythos-future-cybersecurity
NBC News (April 2026), The ‘Vulnpocalypse’: Why experts fear AI could tip the scales toward hackers : https://www.nbcnews.com/tech/security/anthropic-claude-mythos-ai-hackers-cybersecurity-vulnerabilities-rcna273673
CNN Business (April 2026), Anthropic’s next model could be a ‘watershed moment’ for cybersecurity : https://www.cnn.com/2026/04/03/tech/anthropic-mythos-ai-cybersecurity

AISLE counter-analysis — The offensive capacity is already accessible#

Date of the event : 8 April 2026 (published the day after the Mythos announcement)
Status : confirmed
Manifesto theses illustrated : 9, 10

The fact#

AISLE, an independent AI security research firm, published on 8 April 2026 — that is, the day after the Mythos announcement — a counter-analysis of the capacities claimed by Anthropic, titled AI Cybersecurity After Mythos: The Jagged Frontier. The article is signed by Stanislav Fort, founder and chief scientist of AISLE. The complete code, prompts and transcripts are published on the GitHub repository stanislavfort/mythos-jagged-frontier, allowing anyone to reproduce the results.

AISLE took the flagship vulnerabilities highlighted by Anthropic in its announcement and submitted them to small, cheap open-source models.

The result is unambiguous: eight models out of eight detected Anthropic’s flagship FreeBSD exploit. A model of only 3.6 billion active parameters, accessible for $0.11 per million tokens, correctly identified the buffer overflow and computed the residual buffer space. A 5.1-billion-parameter model reconstructed the exploitation chain of the 27-year-old OpenBSD bug. A false-positive test on 25 models shows a scale inversion: small open-source models often produce fewer false positives than the costly frontier models.
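
To give a sense of the price gap, the sketch below compares the cost of feeding the same codebase to a model at $0.11 per million tokens and at the $25 per million input tokens quoted for Mythos Preview earlier in this family. The codebase size is a hypothetical figure, both prices are treated as input-token prices (an assumption), and output tokens are ignored.

```python
def scan_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of sending `tokens` input tokens at a given per-million price."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical codebase size, for illustration only.
CODEBASE_TOKENS = 10_000_000

small_model = scan_cost_usd(CODEBASE_TOKENS, 0.11)  # price quoted in the AISLE analysis
frontier = scan_cost_usd(CODEBASE_TOKENS, 25.0)     # Mythos Preview input price

print(f"small open model: ${small_model:.2f}")
print(f"frontier model:   ${frontier:.2f}")
print(f"ratio:            {frontier / small_model:.0f}x")
```

The absolute figures depend entirely on the assumed codebase size; only the ratio between the two prices comes from the figures quoted in the text.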

A few weeks later, AISLE published a follow-up study, System Over Model: Zero-Day Discovery at the Jagged Frontier, showing that with a simple analysis system (a single Python file, no sophisticated agentic scaffolding), cheap models can discover real bugs by scanning an entire codebase without being told where to look.
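
What a "single Python file" scanner might look like can be sketched as follows. This is not AISLE’s code: the names `iter_chunks`, `scan` and `toy_model` are hypothetical, the prompt is a placeholder, and `toy_model` is a stand-in that only looks for one unsafe C function so that the example runs without any model. In a real system, `ask_model` would wrap a call to a locally hosted or hosted model.

```python
from pathlib import Path
from typing import Callable, Iterator


def iter_chunks(root: Path, suffixes=(".c", ".h"),
                max_lines: int = 200) -> Iterator[tuple[str, int, str]]:
    """Yield (file, first_line, text) for every chunk of every source file."""
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(errors="replace").splitlines()
        for start in range(0, len(lines), max_lines):
            yield str(path), start + 1, "\n".join(lines[start:start + max_lines])


def scan(root: Path, ask_model: Callable[[str], str]) -> list[dict]:
    """Send every chunk to the model; keep every answer other than NONE."""
    findings = []
    for name, first_line, chunk in iter_chunks(root):
        prompt = ("Review this code for memory-safety or logic bugs. "
                  "Reply NONE if you see none.\n\n" + chunk)
        answer = ask_model(prompt).strip()
        if answer and answer.upper() != "NONE":
            findings.append({"file": name, "line": first_line, "note": answer})
    return findings


def toy_model(prompt: str) -> str:
    """Stand-in for a real model call: flags one well-known unsafe C function."""
    return "possible unbounded copy (strcpy)" if "strcpy(" in prompt else "NONE"
```

Calling `scan(Path("path/to/source"), toy_model)` returns a list of findings with file, starting line and note. The point of the sketch is structural: the loop never receives a hint about where to look, so any hit comes from walking the whole tree.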

What it demonstrates#

The AISLE counter-analysis is the piece that makes the Mythos argument operational, not merely theoretical. If the capacities were the exclusive preserve of closed frontier models (as Anthropic initially suggested), one could imagine regulation by access control over models. But AISLE demonstrates that these capacities are already accessible via open-source models that any actor — state, criminal, activist — can download and run on its own hardware.

Attacker/defender asymmetry is therefore not a future problem: it is a present problem. And it weighs particularly heavily on open source ecosystems whose code is by construction publicly scannable, in the absence of comparable defensive investment.

Sources#

AISLE (8 April 2026), AI Cybersecurity After Mythos: The Jagged Frontier, by Stanislav Fort : https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier
Stanislav Fort (April 2026), GitHub repository with complete prompts and transcripts : https://github.com/stanislavfort/mythos-jagged-frontier
AISLE (April 2026), System Over Model: Zero-Day Discovery at the Jagged Frontier : https://aisle.com/blog/system-over-model-zero-day-discovery-at-the-jagged-frontier
The Decoder (April 2026), The myth of Claude Mythos crumbles as small open models hunt the same cybersecurity bugs Anthropic showcased.

Copy Fail (CVE-2026-31431) — First public AI-driven kernel zero-day discovery, outside Anthropic#

Dates : public disclosure 29 April 2026; initial report 23 March 2026; upstream patch 1 April 2026
Status : confirmed, 100% exploitable, added to CISA’s KEV (Known Exploited Vulnerabilities) list
Manifesto theses illustrated : 9, 11

The fact#

On 29 April 2026, the cybersecurity company Theori (South Korea) publicly disclosed Copy Fail (CVE-2026-31431), a local privilege escalation vulnerability in the Linux kernel. The bug was identified by researcher Taeyang Lee with the help of Xint Code, the AI-assisted internal security scanning tool developed by Theori. The bug was reported to the kernel maintainers on 23 March 2026 and patched upstream on 1 April, nine days after the initial report: a very short delay for a flaw of this scope.
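
The intervals between the three dates given above can be checked with a few lines of date arithmetic. The variable names are illustrative; the dates are those stated in the text.

```python
from datetime import date

reported = date(2026, 3, 23)   # initial report
patched = date(2026, 4, 1)     # upstream patch
disclosed = date(2026, 4, 29)  # public disclosure

days_to_patch = (patched - reported).days
days_patch_to_disclosure = (disclosed - patched).days

print(f"report -> patch:      {days_to_patch} days")
print(f"patch -> disclosure:  {days_patch_to_disclosure} days")
print(f"report -> disclosure: {(disclosed - reported).days} days")
```

The patch therefore landed upstream four weeks before the public disclosure, which is the window distributions had to ship the fix.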

The technical detail is worth underlining. Copy Fail is a logic flaw in the algif_aead module of the Linux cryptographic subsystem, resulting from the interaction of three changes introduced between 2011 and 2017: the addition of the authencesn cryptographic wrapper (used by IPsec) in 2011, the introduction of AEAD socket support via AF_ALG in 2015, and an in-place optimisation added in algif_aead.c in 2017. None of the three changes was faulty in isolation. It is their composition that opened the flaw, which is exactly the kind of error that a well-intentioned human audit lets through, because one needs to hold three files, three conventions and three authors in mind to see it.
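
Since the flaw sits behind the AF_ALG AEAD socket interface, a defender may want to know whether that interface is reachable at all from an unprivileged process. The sketch below uses Python’s standard `socket` module; the function name `aead_interface_reachable` and the choice of `gcm(aes)` as a probe algorithm are assumptions made for illustration. It says nothing about whether the running kernel is patched, only whether the interface answers.

```python
import socket


def aead_interface_reachable(algorithm: str = "gcm(aes)") -> bool:
    """Return True if an AF_ALG socket of type 'aead' can be bound.

    Note: on Linux, a successful bind may cause the kernel to load the
    algif_aead module if it is not loaded already.
    """
    af_alg = getattr(socket, "AF_ALG", None)  # only defined on Linux builds of Python
    if af_alg is None:
        return False
    try:
        with socket.socket(af_alg, socket.SOCK_SEQPACKET, 0) as s:
            s.bind(("aead", algorithm))
        return True
    except OSError:
        # Interface unavailable, module blocked, or algorithm not provided.
        return False


print("AF_ALG AEAD interface reachable:", aead_interface_reachable())
```

A `False` result on a Linux host suggests the interface is disabled or filtered for this process; a `True` result only means the patch status of the kernel still needs to be checked.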

The result: a 732-byte Python script suffices for an unprivileged local user to obtain root rights on Ubuntu, Amazon Linux, RHEL, SUSE, Debian, Fedora and Arch Linux, that is, every distribution shipping a kernel released between 2017 and April 2026. The exploit works 100% of the time, with no race condition or timing window, and a four-byte write into the page cache lets one modify a setuid binary without touching the file on disk. The CVSS score is 7.8. CISA added the CVE to its Known Exploited Vulnerabilities catalogue and set 15 May 2026 as the remediation deadline for US federal agencies. Patches are available in kernel versions 6.18.22, 6.19.12 and 7.0.
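
The three fixed versions named above can be turned into a simple check on a kernel release string. This is a minimal sketch: the function name `has_listed_fix` is hypothetical, only the versions listed in the text are encoded, and distribution kernels often backport fixes without changing the upstream version number, so an unknown or negative result should be confirmed against the distribution’s own advisory. Release candidates are not handled.

```python
import platform
import re

# Stable series -> first fixed patch level, as listed in the text.
FIXED = {(6, 18): 22, (6, 19): 12}
FIRST_FIXED_MAINLINE = (7, 0)


def has_listed_fix(release: str):
    """True/False for series named in the text, None when the text is silent."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        return None
    major, minor, patch = int(m[1]), int(m[2]), int(m[3] or 0)
    if (major, minor) >= FIRST_FIXED_MAINLINE:
        return True
    if (major, minor) in FIXED:
        return patch >= FIXED[(major, minor)]
    return None  # other series: check the distribution's security tracker


if platform.system() == "Linux":
    print(platform.release(), "->", has_listed_fix(platform.release()))
```

Returning `None` instead of `False` for older series is deliberate: the text lists no fixed version for them, and guessing would produce false alarms on kernels that carry a backported patch.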

Theori has published on GitHub a reproducible proof-of-concept along with a complete description of the exploitation chain. The AI tooling used for the discovery is not published, but the disclosure documents precisely what the model saw and how it proceeded, which is enough for a third-party researcher to reproduce the approach.

What it demonstrates#

Where Project Mythos delivered a maximalist announcement coupled with a commercial withdrawal of the model, and where AISLE demonstrated that open-source models could follow, Copy Fail delivers a concrete, independent, reproducible case that answers the question the security community was waiting for: can offensive AI capacities produce, in defensive mode, zero-day-grade findings on critical and widely deployed code?

The answer is yes, and three lessons follow:

Defensive AI is operational today, outside Anthropic. A non-American cybersecurity team, with a proprietary but non-frontier tool, found in a few weeks of scanning what nine years of open human auditing on one of the most-watched subsystems of the Linux kernel had missed. The scenario that thesis 9 of the manifesto calls for (“to equip the defenders of critical bricks with audit and response means commensurate with the new offensive capacities”) is not a projection. It is already being deployed by equipped actors, with the corresponding defensive benefits.
The nine-year bug is the signature of a human-audit failure in the most-watched perimeter there is. The crypto subsystem of the Linux kernel is not an obscure corner of the code: it is one of the most audited layers in the world, scrutinised by major-vendor red teams, by university researchers and by state security agencies. And yet a logic flaw introduced in 2017 survived until 2026. Linus’s Law (given enough eyeballs, all bugs are shallow) assumed a human arithmetic that plainly does not hold for composition bugs. This is precisely what thesis 9 names: code transparency is a defensive achievement only if audit means commensurate with offensive capacities are actually mobilised.
The imbalance between Europe and equipped actors is settling in. Theori is a South Korean company. Xint Code is not commercially available to European actors. No European public equivalent has, at this stage, demonstrated a comparable finding on the supply chain used by the continent’s administrations and companies. As long as this imbalance persists, Europe remains structurally dependent on the discoveries, and therefore the disclosure choices, of non-EU actors for the security of the code it deploys. This is exactly the imbalance that axes 1 and 2 of the positive programme aim to correct through public investment and structured funding of defenders.

For the supply chain dimension — a nine-year logic flaw in a critical kernel crypto subsystem — see also family 6, which places Copy Fail in the XZ Utils / Heartbleed / Log4Shell / IngressNightmare lineage.

Sources#

Theori / Xint (29 April 2026), disclosure and technical write-up : https://xint.io/blog/copy-fail-linux-distributions
Bugcrowd (April 2026), What we know about Copy Fail (CVE-2026-31431) : https://www.bugcrowd.com/blog/what-we-know-about-copy-fail-cve-2026-31431/
Microsoft Security Blog (1 May 2026), CVE-2026-31431: Copy Fail vulnerability enables Linux root privilege escalation across cloud environments : https://www.microsoft.com/en-us/security/blog/2026/05/01/cve-2026-31431-copy-fail-vulnerability-enables-linux-root-privilege-escalation/
Help Net Security (30 April 2026), Nine-year-old Linux kernel flaw enables reliable local privilege escalation (CVE-2026-31431) : https://www.helpnetsecurity.com/2026/04/30/copyfail-linux-lpe-vulnerability-cve-2026-31431/
The Register (30 April 2026), Linux cryptographic code flaw offers fast route to root : https://www.theregister.com/2026/04/30/linux_cryptographic_code_flaw/
Dark Reading (April 2026), Another AI-Assisted Software Scan Yields 9-Year-Old Linux Bug : https://www.darkreading.com/vulnerabilities-threats/ai-assisted-software-scan-linux-bug
The Hacker News (May 2026), CISA Adds Actively Exploited Linux Root Access Bug CVE-2026-31431 to KEV : https://thehackernews.com/2026/05/cisa-adds-actively-exploited-linux-root.html
Debian Security Tracker, CVE-2026-31431 : https://security-tracker.debian.org/tracker/CVE-2026-31431
Reproducible C proof-of-concept (Theori / Xint, GitHub) : https://github.com/tgies/copy-fail-c

Criminal exploitation Claude + DeepSeek — The proof through use#

Date of the event : operation between January and March 2026, disclosed in April 2026
Status : confirmed
Manifesto theses illustrated : 9, 10

The fact#

According to an investigation published by CNN in April 2026, citing Amazon Web Services’ (AWS) security research unit, a Russian-speaking cybercriminal jointly used Claude (Anthropic) and the Chinese open-source model DeepSeek as early as January 2026 to compromise more than 600 devices protected by a popular firewall product across more than 55 countries.

AWS underlines that the attacker had limited technical capabilities — AI allowed them to industrialise and scale techniques they could not have mastered alone.

What it demonstrates#

This case is the proof through use that the capacities described in Mythos and confirmed by AISLE are not hypothetical. They are already deployed by hostile actors, at scale, against organisations in Europe and worldwide. The barrier to sophisticated attack is collapsing: an actor without strong technical skills can now industrialise attacks using off-the-shelf models.

For European user organisations of critical open source software, this means the time window for investing in defensive AI capacities is not several years — it is a few months. Without that investment, open source becomes a privileged attack surface for properly equipped adversaries.

Sources#

CNN Business (April 2026), Anthropic’s next model could be a ‘watershed moment’ for cybersecurity : https://www.cnn.com/2026/04/03/tech/anthropic-mythos-ai-cybersecurity
AWS Security Research, public communications (April 2026).

Linus’s Law in historical perspective#

Origin : Eric Raymond, The Cathedral and the Bazaar (1999)
Status : structurally challenged since 2024
Manifesto theses illustrated : 9

The fact#

Linus’s Law, formulated by Eric Raymond and inspired by Linus Torvalds, states that given enough eyeballs, all bugs are shallow. This formula has become the ideological foundation of open source security: code transparency, read by many developers, would guarantee that flaws are discovered by defenders before attackers.

What it demonstrates#

Linus’s Law was an assertion of human arithmetic: a sufficient number of pairs of human eyes ends up seeing every bug. But that arithmetic assumed defenders were at least as numerous and motivated as attackers in scrutinising the code.

Family 6 (XZ Utils, IngressNightmare, Heartbleed, Log4Shell) showed that this assumption had already stopped holding before AI: on the human side, reliance on volunteers left strategic components under-maintained. With agentic AI at the level of Mythos (and even more so with its diffusion through open-source models, confirmed by AISLE), the attacker’s arithmetic changes qualitatively: an AI model can comb through a codebase of several million lines in minutes, with a persistence and exhaustiveness no human can match.

The net result is that code transparency, which constituted a defensive advantage when only humans read code, becomes in the AI era a much more useful offensive advantage for whoever has the tools. This inversion does not disqualify open source — security through obscurity is, and remains, a chimera — but it requires a recasting of the defensive posture.

The sectoral caveat. For some sectors (defence, critical infrastructure, energy, health, finance, sensitive administrations), the inversion concretely changes the calculation. An attacker equipped with AI can silently scan published source code, identify non-trivial flaws and exploit them, without alerting anyone, contributing a patch or disclosing responsibly. This is the opposite of the Copy Fail case documented above, where Theori/Xint responsibly disclosed the kernel flaw discovered by defensive AI; what makes Copy Fail instructive is precisely that the flaw could have been kept silent and exploited criminally. In these sectors, proprietary software backed by solid contractual mechanisms (tested reversibility first, a targeted audit right next, escrow as a safety net) may offer a better sovereign profile than an open source release that exposes the attack surface without a proportionate defensive counterpart. This nuance does not disqualify open source for these sectors: it requires that the associated defensive posture (funding of maintainers equipped with defensive AI, bug bounty programmes to be scaled up, AI auditing mobilised in defensive mode) be commensurate with the offensive asymmetry. The manifesto does not arbitrate between open and proprietary licences; it makes visible the conditions of equivalence in each case. See the orientation page for proprietary publishers for the operational version of this nuance.

Sources#

Eric Raymond (1999), The Cathedral and the Bazaar, O’Reilly.
For a contemporary discussion: Real Instituto Elcano (October 2025), Can open source secure Europe’s digital infrastructure? : https://www.realinstitutoelcano.org/en/analyses/can-open-source-secure-europes-digital-infrastructure/

→ Related operational commitments#

At this stage, the scheme carries no commitment specific to defensive AI capacities. This is an acknowledged blind spot of the current catalogue, to be addressed as good practices stabilise. Several existing commitments nevertheless help put in place the means in which thesis 9 of the manifesto calls for investment.

Publishers and providers: pay back a documented fraction of revenue to the critical open source projects on which the offer depends; open a source code audit right to customers whose contract justifies it.
User organisations: pay back a documented fraction of the software budget to open source foundations — which can in turn fund equipped defenders.

Full catalogue of commitments →
