The Deepdive
Join Allan and Ida as they dive deep into the world of tech, unpacking the latest trends, innovations, and disruptions in an engaging, thought-provoking conversation. Whether you’re a tech enthusiast or just curious about how technology shapes our world, The Deepdive is your go-to podcast for insightful analysis and passionate discussion.
Tune in for fresh perspectives, dynamic debates, and the tech talk you didn’t know you needed!
Project Glasswing: Claude Mythos - The Accidental Superhacker
Imagine an AI that wakes up, reads millions of lines of code, and finds the kinds of vulnerabilities humans miss for decades, then writes working exploit code without hand holding. That’s the unsettling picture we’re unpacking today as we dig through reporting and leaked details around Anthropic’s Claude Mythos preview and the secretive rollout known as Project Glasswing.
We walk through what “emergent behavior” looks like when you train an AI coding assistant into a software savant and accidentally end up with an autonomous security researcher that can discover zero-day vulnerabilities at industrial scale. We break down the specifics that make this feel real, not theoretical: a reported 27-year OpenBSD flaw, a long lived FFMPEG bug that survived millions of automated tests, and the leap from spotting issues to vulnerability chaining, where multiple small flaws become full system takeover.
Then we zoom out to the messy human layer: why Glasswing access is limited to a small consortium of tech giants, how token pricing can keep AI cybersecurity out of reach for most organizations, and why the rollout is haunted by operational security failures like an unsecured data lake draft and a GitHub leak followed by chaotic takedowns. We also cover the six to eighteen month race to malicious parity, plus the tension between civil liberties guardrails and national security pressure as the Pentagon and regulators enter the frame.
If AI changes the speed of hacking and patching from months to minutes, what does “secure by default” even mean anymore? Subscribe, share this with a friend who writes or ships software, and leave a review with your take: should tools like Mythos be tightly gated, widely shared, or something in between?
Leave your thoughts in the comments and subscribe for more tech updates and reviews.
A Terrifying AI By Accident
IdaImagine, if you will, um building the most terrifying, capable, and completely autonomous digital hacker in human history.
AllanOkay. Setting the stakes high early. I like it.
IdaRight. I mean, we are talking about a system that can wake up, read millions of lines of code, and just, you know, effectively dismantle the world's cybersecurity infrastructure before breakfast.
AllanTerrifying.
IdaBut now imagine that the way the world actually found out about this hyper-secure, ultra-advanced AI was because its creators accidentally left the launch announcement sitting in an unsecured, publicly inspectable data lake.
AllanThis is simultaneously impressive and completely ridiculous.
IdaYou literally cannot write better irony.
AllanWelcome to the Deep Dive. We've been uh poring over this bizarre collision of high-stakes tech and just pure human error all morning.
IdaYeah, we've got quite a stack of sources today.
AllanExactly. We're digging through the latest from TechCrunch, Axios, and even Anthropic's own leaked internal documents from April 2026.
IdaWhich they leaked themselves. And our mission today for you, the listener, is to figure out what Anthropic actually built with this new Claude Mythos preview model. Right. We want to know why they locked it away in this secret club called Project Glasswing. And what happens when the architects of our secure digital future just keep tripping over their own digital shoelaces?
AllanBecause the contrast there is just jarring. I mean, you have a technological breakthrough that genuinely changes the paradigm of global defense. And it's being managed by people who were, you know, making the kind of mistakes you'd expect from a junior developer on their first week.
Emergent Hacking From Coding Skill
IdaSo let's start with the breakthrough itself. Because to understand why Anthropic was trying to keep this under wraps, you really have to understand the sheer scale of what they accidentally created.
AllanWait, it gets better. So it wasn't even on purpose.
IdaNo. I use the word accidentally very deliberately here. They didn't set out to build a cyber weapon at all.
AllanOh wow.
IdaThey were just trying to train Mythos to be an absolute savant at writing software.
AllanWhich, I mean, makes total sense. Everyone is racing to build the ultimate AI coding assistant right now. Exactly. But here is the fascinating part about emergent behavior in these models. When you train an AI to understand how software is put together at a superhuman level, you're basically simultaneously teaching it how to tear that software apart.
IdaRight. It's two sides of the same coin.
AllanIt's the exact same skill set, just applied in reverse. If you understand the structural integrity of a building perfectly, you know exactly which load-bearing wall to hit to bring the whole thing down.
IdaSo they just pointed it at code and it naturally morphed into this ultimate digital lockpick.
AllanPretty much. They trained it on so much code that it internalized the logic, the patterns, and crucially, the common mistakes that humans make when they type out millions of lines of instructions.
IdaAnd the terrifying part is that it's completely autonomous. You don't have to sit there prompting it step by step, holding its hand. According to the head of Anthropic's Frontier Red Team, it operates like a senior human security researcher just working around the clock without supervision. You just point it at a network and let it run.
AllanThat is wild.
IdaLet's look at the actual numbers here. Anthropic's previous model, Opus 4.6, found roughly 500 zero-day bugs in open source software.
AllanLet's pause on that term for a second, because we hear zero day thrown around in spy movies all the time.
IdaGood point. Let's define it.
AllanYeah. For anyone who isn't living in the cybersecurity world, a zero-day bug is a vulnerability that the software vendor has known about for exactly zero days.
IdaMeaning nobody knows it exists except the person who just found it.
AllanExactly. There is no patch, there is no defense. And finding just one is usually a career-making moment for a human researcher.
IdaAnd Opus found 500.
AllanWhich is already crazy.
IdaRight. But Mythos? Mythos has found tens of thousands.
AllanTens of thousands. You can't even really conceptualize that scale. That is like an industrial revolution in hacking. It makes you wonder what kind of flaws it's actually uncovering.
IdaOh, the specifics are staggering. Take OpenBSD, for instance.
AllanOh man.
IdaYeah. If you run a major firewall right now or maintain critical server infrastructure, you are very likely relying on OpenBSD. It is historically revered as this impenetrable fortress of operating systems.
AllanSecurity experts literally treat it like digital titanium.
IdaAnd Mythos found a fatal flaw in it that has been sitting there for 27 years.
Allan27 years.
IdaYep. An attacker could just send a couple of specific pieces of data to any OpenBSD server and remotely crash it.
AllanThat means countless human security experts have been auditing that exact same code since the late 1990s. They've run automated security tests on it thousands of times and literally nobody saw it.
IdaAnd it also found a 16-year-old bug in FFMPEG.
AllanWhich is everywhere.
IdaIt's everywhere. That's a video encoding software used by practically every streaming service and app on your phone. This bug was hiding in a line of code that had survived five million automated security tests without raising a single alarm.
AllanInsane.
IdaSo how does a machine see something that five million targeted security tests completely missed?
AllanWell, it's because automated tests are fundamentally dumb. They're basically like spell check.
IdaOh, that's a good way to look at it.
AllanRight. They look for known typos, known bad patterns, and they just check boxes. They don't actually understand context. Mythos, on the other hand, is reasoning.
IdaLike it actually understands what it's reading.
AllanExactly. It's reading the code more like a ruthless literary critic. It looks at chapter one, sees a character pick up a key, and then realizes in chapter five that the same character shouldn't be able to open a completely different door.
IdaRight.
AllanIt understands the logic flow. It says, hey, hang on, if I push this obscure variable over here, the entire memory allocation falls down over there.
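The distinction the hosts are drawing, pattern matching versus reasoning about logic flow, can be sketched with a toy example. Everything here is hypothetical illustration, not code from any real project: each function in isolation looks fine to a pattern-based scanner, but the "plot hole" only appears when you reason across both of them.

```python
# Toy illustration (all names hypothetical): a scanner checking each
# function in isolation sees nothing wrong. Whole-program reasoning
# spots that a token minted for one resource opens a different one.

SECRETS = {"reports": "quarterly numbers", "payroll": "salaries"}

def issue_token(user: str, resource: str) -> str:
    # "Chapter one": the user is handed a key, scoped by convention only.
    return f"{user}:{resource}"

def read_resource(token: str, resource: str) -> str:
    # "Chapter five": this validates the token's *format* and that its
    # scope exists, but never compares the scope to what was requested.
    user, scope = token.split(":")
    if scope in SECRETS:            # looks like a real check, checks the wrong thing
        return SECRETS[resource]    # plot hole: resource may differ from scope
    raise PermissionError("bad token")

token = issue_token("mallory", "reports")   # a key to the reports room...
leaked = read_resource(token, "payroll")    # ...quietly opens the payroll door
```

A spell-check-style scanner sees a permission check and moves on; a reader following the logic from key to door catches the mismatch.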
IdaOkay, that makes total sense. It's reading for plot holes, not typos, which actually that brings up something in the Axios reporting I need you to break down for me because it sounds like an absolute nightmare scenario.
AllanOh, the chaining?
IdaYes. They talk a lot about vulnerability chaining. What is that mechanically?
AllanSo vulnerability chaining is where the model really flexes its reasoning. In complex systems like the Linux kernel, which basically runs the modern internet, a single bug rarely gives you control.
IdaOkay.
AllanYou might find a tiny flaw, let's call it a memory leak, that just lets you peek at where data is stored, useless on its own. Then you find another minor flaw that lets you input slightly more data than you're supposed to. Also relatively harmless.
IdaRight. On their own, they are just like minor annoyances.
AllanExactly. But Mythos autonomously figured out how to string them together.
IdaOh no.
AllanIt uses the first bug to peek at the system's layout. It uses that layout map to aim the second bug perfectly, which gives it a tiny little foothold. Wow. And then it uses that foothold to trigger a third bug that elevates its administrative privileges. It strings together three, four, even five of these minor flaws in a precise sequence until it has complete control over the machine.
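The chain Allan describes can be modeled as a toy simulation. All of this is hypothetical scaffolding invented for illustration: three flaws that are individually minor, and only matter when fired in sequence.

```python
# Toy model (entirely hypothetical) of vulnerability chaining:
# leak -> aimed out-of-bounds write -> privilege escalation.

class ToyServer:
    def __init__(self) -> None:
        self.memory = [0] * 8
        self._flag_index = 8        # hidden layout detail: where the flag lives
        self.is_admin = False

    def leak_layout(self) -> int:
        # Flaw 1: a harmless-looking info leak. Changes nothing by itself,
        # but tells an attacker exactly where to aim.
        return self._flag_index

    def sloppy_write(self, index: int, value: int) -> None:
        # Flaw 2: off-by-one bounds check ("<=" should be "<") permits
        # writing exactly one slot past the intended region.
        if index <= len(self.memory):
            if index == len(self.memory):
                self.memory.append(value)
            else:
                self.memory[index] = value

    def check_privileges(self) -> None:
        # Flaw 3: the privilege check trusts a flag that the write
        # above can now reach.
        if len(self.memory) > 8 and self.memory[8] == 1:
            self.is_admin = True

server = ToyServer()
slot = server.leak_layout()     # step 1: use the leak to map the layout
server.sloppy_write(slot, 1)    # step 2: one-slot overwrite, a tiny foothold
server.check_privileges()       # step 3: escalate -- is_admin flips to True
```

No single step is alarming on its own; the takeover only exists as the sequence, which is exactly why chained flaws slip past tools that score bugs one at a time.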
IdaSo it's essentially like hiring a master architect who accidentally realizes they know exactly how to flawlessly rob every bank they've ever designed.
AllanThat is a brilliant analogy. You hire them to build the vault, and they just hand you a blueprint showing exactly which three loose bricks, tapped in a specific order, will bring down the entire security system.
IdaAnd it isn't just theory either. Anthropic put Mythos through a benchmark test called Cybergym to see if it could actually write functional exploits, the actual weaponized code to use these bugs, on its very first try. Well, the previous Opus model had a 66.6% success rate. Mythos achieved an 83.1% success rate autonomously on the first attempt.
AllanYeah, see, it's one thing to point and say, hey, there's a hole in the fence. It's entirely another to say, here's the exact pair of wire cutters you need, here is the map to the vault, and I forged the master key for you. Have fun.
Project Glasswing And Who Gets Access
IdaRight. So you realize you've built an AI with an 83% success rate at exploiting global infrastructure. You obviously don't just put that on the public app store. No, definitely not. Anthropic's immediate reaction was to lock it in a digital vault and only hand the keys to a very specific group of people. They created this VIP initiative called Project Glasswing.
AllanWhat does this say about us as a society? I mean, look at the name they chose. The Glasswing Butterfly is this beautiful insect in Central America known for having completely transparent wings.
IdaOh, right.
AllanIt hides in plain sight. They probably chose it as a metaphor for these invisible vulnerabilities lurking in our servers. But honestly, it is the perfect metaphor for the sheer terrifying transparency of our reliance on a handful of tech monopolies to keep the digital sky from falling.
IdaYou aren't kidding about monopolies. The Glasswing Consortium is just 12 tech behemoths, among them Apple, Microsoft, Amazon, Google, CrowdStrike, Cisco, NVIDIA, Palo Alto Networks, Broadcom, JP Morgan Chase, and the Linux Foundation.
AllanThat is quite the guest list.
IdaRight. Plus about 40 other unnamed organizations that run critical infrastructure. Anthropic deemed the model too dangerous for literally anyone else.
AllanOkay, but here's the thing. Aren't we just handing the ultimate weapon to a few tech monopolies and praying they fixed the roof before it rains?
IdaThat is the big question.
AllanWe're restricting this incredible defensive tool to the companies that can afford to be in the club while every small business, hospital, and local government is just left hoping the trickle-down security reaches them in time.
IdaWell, Anthropic argues they're throwing serious money at fixing the broader ecosystem. They gave these launch partners up to $100 million in usage credits to aggressively scan their own code bases and, you know, patch the foundations of the internet we all use.
AllanUh-huh.
IdaAnd they also gave $4 million in donations to open source security organizations.
AllanWait, a hundred million to the corporate giants and four million to the open source community that actually builds the underlying architecture of the web?
IdaYeah.
AllanThat ratio tells you everything you need to know.
IdaTo be fair though, Anthropic says they do plan to release a safer, generalized Mythos-class model to the public eventually.
AllanSure, they do.
IdaThey just have to figure out how to put guardrails on it first. So it doesn't just spit out malware to anyone who asks for it. But even when it does launch, the pricing they announced is astronomical. We're talking $25 per million input tokens and $125 per million output tokens.
AllanOkay. We throw the word token around a lot. Let's ground that for the listener.
IdaYeah, please do.
AllanA token is roughly a syllable or a fragment of a word. A million tokens sounds like a lot, but it's maybe a few thick novels' worth of text.
IdaOkay.
AllanBut if you are a medium-sized enterprise trying to scan your entire code base for vulnerabilities, you aren't dealing in millions of tokens. You are dealing in billions. Only the biggest players can afford the compute costs to run this defensive tool at scale anyway. The VIP club just enforces itself financially.
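Allan's point about billions of tokens can be made concrete with back-of-the-envelope math. The prices below are the ones announced in the reporting; the codebase size, tokens-per-line figure, and output ratio are rough assumptions for illustration only.

```python
# Rough cost of one full codebase scan at the announced Mythos prices.
# Codebase size, tokens-per-line, and output ratio are assumptions.
INPUT_PRICE = 25 / 1_000_000     # $25 per million input tokens
OUTPUT_PRICE = 125 / 1_000_000   # $125 per million output tokens

lines_of_code = 50_000_000       # a mid-sized enterprise estate (assumption)
tokens_per_line = 10             # rough average (assumption)
input_tokens = lines_of_code * tokens_per_line   # 500 million tokens per pass
output_tokens = input_tokens // 10               # terse findings (assumption)

cost = input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE
print(f"${cost:,.0f} per full scan")
```

Under those assumptions a single pass lands in the tens of thousands of dollars, and continuous scanning multiplies that by every rerun, which is Allan's point: the pricing alone keeps the tool in big-player territory.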
IdaIt's a brilliant, if cynical, strategy. But um this brings us to my absolute favorite part of this entire deep dive.
AllanOh, I know where you're going with this.
The Data Lake Leak And GitHub Mess
IdaWe are talking about trusting these massive tech giants with the keys to the kingdom. We are relying on them to patch the fabric of modern civilization. And yet, how did we actually find out about Project Glasswing in the first place?
AllanI love that this exists, but also why?
IdaBecause Anthropic, the architects of this hyper-secure, world-defending model, drafted a blog post about it back when it was codenamed Capybara, and then accidentally left that draft sitting in an unsecured cache of documents on a publicly inspectable data lake.
AllanIt's just so funny.
IdaAnyone poking around could just go and read it.
AllanYou build a machine that outsmarts 27 years of human security review, and then you forget to put a password on your own draft folder.
IdaThey called it human error, but wait, it gets so much worse.
AllanHow could it be worse?
IdaThe blog post leak isn't even their most embarrassing fumble this quarter. Just last month, we had the great GitHub disaster.
AllanRight, the GitHub thing.
IdaAnthropic rolled out an update version 2.1.88 of their Claude Code software package. Someone messed up the configuration and they accidentally exposed nearly 2,000 internal source code files to the public.
AllanOof.
IdaHalf a million lines of their own proprietary code just sitting out there on the internet.
AllanSo the AI company, claiming they alone can secure our bad code, wrote bad code that leaked their own secrets.
IdaExactly. And the cleanup is where it turns into a total comedy of errors.
AllanOh, yeah, the takedowns.
IdaYes. In their frantic attempt to yank the leaked code off the internet, they deployed automated takedown scripts across GitHub. But those scripts were apparently overly aggressive or just poorly targeted because they accidentally caused thousands of completely unrelated, innocent code repositories on GitHub to just be taken down.
AllanThis is the glorious absurdity of the tech industry in a nutshell. We invent these digital gods, but we can't figure out basic cloud storage hygiene. To explain mechanically what happened on GitHub: when you leak code, you often try to scrub it by running scripts that search for specific hashes or patterns of your proprietary code. Makes sense. Right. But if your script isn't carefully tuned and it accidentally flags a common open source library that you just happen to be using, your script will aggressively delete every other project on GitHub that also uses that common library. Wow. It's like trying to pull a weed and accidentally bulldozing the entire neighborhood.
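The misfire Allan describes can be sketched in a few lines. This is a hypothetical illustration of hash-based scrubbing, not Anthropic's actual takedown tooling: the bug is that a common open source file ends up in the "proprietary" fingerprint set, so every repo vendoring that file matches.

```python
# Sketch (hypothetical) of an over-broad takedown script: if the
# fingerprint set includes a file that is really a common open source
# library, every innocent repo vendoring that library gets flagged.
import hashlib

def fingerprint(content: str) -> str:
    return hashlib.sha256(content.encode()).hexdigest()

COMMON_LIB = "def left_pad(s, n): return s.rjust(n)"      # ubiquitous helper
LEAKED_FILES = ["# proprietary model config", COMMON_LIB]  # oops: lib included

proprietary_hashes = {fingerprint(f) for f in LEAKED_FILES}

def should_take_down(repo_files: list[str]) -> bool:
    # Flags a repo if ANY file matches a "proprietary" hash.
    return any(fingerprint(f) in proprietary_hashes for f in repo_files)

# An unrelated repo that merely vendors the same common library:
innocent_repo = ["print('hello')", COMMON_LIB]
print(should_take_down(innocent_repo))   # the weed and the neighborhood both go
```

A safer script would fingerprint only files absent from public package registries, or require several matches per repo before flagging it, which is precisely the tuning that appears to have been skipped.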
IdaIt is staggering. We are trusting people who accidentally bulldoze digital neighborhoods to safeguard the infrastructure that runs hospitals, banks, and the power grid.
The Six To Eighteen Month Clock
AllanIt's wild.
IdaBut despite these comical human fumbles, the reality of the tech is really sobering and the clock is officially ticking. Anthropic is rushing to patch these systems with glasswing because they know they aren't the only ones who realize this emergent behavior exists.
AllanYeah, the autonomy is out of the bag. You can't uninvent the realization that an AI trained on coding naturally becomes a master hacker.
IdaExactly. The head of Anthropic's Frontier Red Team estimates we have a window of maybe six to eighteen months before malicious actors or even competitor models catch up to what Mythos can do.
AllanThat is not a lot of time.
IdaNo. And OpenAI is already finalizing a similar model for their Trusted Access for Cyber program. The proliferation is coming. And the threat isn't just theoretical. The reporting notes that cyber criminals are already using earlier, less sophisticated AI models to write malicious scripts and automate ransomware negotiations.
AllanIt completely lowers the barrier to entry for cybercrime. You used to need deep technical knowledge to execute a ransomware attack. Now you literally just need to be able to converse with a chatbot.
Pentagon Standoff Over AI Ethics
IdaAnd the sources also detail that China has already used previous Anthropic models to automate a spying campaign targeting 30 organizations. So Anthropic is in panic mode, desperately briefing the Cybersecurity and Infrastructure Security Agency, CISA, and the Commerce Department, trying to explain the risks before the dam totally breaks.
AllanWhich actually creates a fascinating tension with the government because while they are briefing some agencies, they're in a full-blown standoff with others.
IdaYes, the 9to5Mac pieces detail this ongoing feud with the Pentagon. The Pentagon recently slapped Anthropic with a supply chain risk label, which sounds incredibly bureaucratic, but it's actually a massive deal.
AllanWhy do they get flagged?
IdaIt actually comes down to Anthropic's internal ethics policy.
AllanOh, really?
IdaYeah, they have a hard line refusing to allow any autonomous targeting or surveillance of U.S. citizens using their AI. The Pentagon, on the other hand, is looking at this tech and saying, look, we need the most powerful tools available to defend national security and potentially for offensive cyber operations. Right. Because Anthropic won't budge on their internal rules, the defense apparatus labeled them a risk. It is this incredible, real-time collision of national security demands and private company civil liberties policies. And we're obviously just reporting the tension here without taking a side, but it's fascinating to watch play out.
AllanIt really highlights how completely unprepared our legal frameworks are for intelligence of this magnitude.
IdaYeah.
AllanYou have the military demanding access to secure the country, and a tech company acting as a sovereign entity, placing ethical boundaries on a digital superweapon. Both sides make complete logical sense from their own perspective, but they are entirely incompatible.
IdaWhich brings us back to you, the listener, and what this actually means for your daily life.
AllanYeah, let's zoom out.
IdaWe've spent a lot of time talking about OpenBSD, token economics, and GitHub scripts. But the bottom line is that software ate the world. Every single analog aspect of your life, your banking records, your healthcare data, the logistics network that ensures there is food at the grocery store, the power grid that is currently keeping your lights on, all of it is represented in the digital domain.
AllanRight.
IdaAnd all of it relies on the assumption that the code holding it together is structurally sound.
AllanWhat Mythos has unequivocally proven is that the code is not sound. Nope. The foundation is full of cracks that human eyes just couldn't see. But more importantly, the timeline of digital warfare has entirely collapsed. It used to take months for a human to find a flaw, write an exploit, test it, and deploy it. You had time to react.
IdaAnd now.
AllanNow, with AI reasoning like Mythos, that entire process, from discovering a 27-year-old bug to writing the weaponized code, happens in minutes.
IdaThe human security researcher is just mathematically outclassed. The human brain cannot read a million lines of code, find five minor memory leaks, and chain them together to take over a server in the time it takes an AI to process the request.
AllanExactly.
IdaThe only thing fast enough and smart enough to defend against an AI attacker is another AI defender.
AllanWhich is the core thesis of Project Glasswing. They are trying to give the corporate defenders a collective head start. They're using the smartest AI to patch the world's infrastructure before the offensive versions of these models inevitably hit the dark web in the next 18 months.
IdaBut if we follow that logic all the way down.
AllanThink about the ecosystem we are actively building right now. If our entire digital infrastructure now requires an autonomous AI to patch it, because human developers are too slow and make too many mistakes, and the only threat capable of breaking that infrastructure is an autonomous adversarial AI, have humans effectively aged out of managing our own creations?
IdaWow. We built the digital world, but it has become too complex for us to maintain.
AllanWe are no longer the architects of our own infrastructure. We're just passengers in the back seat, hoping the machines can agree on the speed limit.
IdaPassengers hoping the machines agree on the speed limit. That is a hauntingly accurate way to frame it. Yeah. We accidentally built the ultimate hacker. We locked it in a vault with a dozen tech monopolies. And now we are just waiting to see if they can fix the roof before the storm hits. And we're trusting the same architects who accidentally leak their own source code on GitHub to manage a digital brain that can dismantle global cybersecurity before breakfast.
AllanIt's the ultimate glasswing butterfly. The fragility of our entire modern world is entirely transparent, just hiding right in front of us.
IdaWe'll leave you to mull over that one the next time you log into your bank. Thanks for joining us on this deep dive.