512,000 Lines of Claude Code Leaked. The Feature Hidden Inside Changes Everything.

512,000 Lines of Claude Code Leaked. The Feature Hidden Inside Changes Everything.
512,000 Lines of Claude Code Leaked. The Feature Hidden Inside Changes Everything.

I use Claude Code every day. I have for months. So when 512,000 lines of its source code appeared on npm because someone forgot to add a .map file to .npmignore, I did what most engineers I know did: I read it.

What I found is more interesting than the leak itself. Buried under the compaction bugs and the Tamagotchi Easter egg is the architecture of a product Anthropic has not announced. It is called KAIROS. It is an always-on AI agent that runs in the background after you close your terminal, watches your codebase for changes, consolidates what it has learned while you sleep, and decides on its own when to act. The scaffolding is complete. The feature flags are in place. And among safety researchers and engineers I have spoken with, this is the feature that has people genuinely unsettled.

How the Leak Happened

Boris Cherny, an engineer on the Claude Code team, confirmed it was a packaging error. Bun, the JavaScript runtime Anthropic acquired in late 2025, generates source maps by default. The release team failed to exclude the .map file from the npm package. Version 2.1.88 shipped on March 31, 2026, with a 59.8 MB source map containing the entire unobfuscated TypeScript codebase across roughly 1,900 files. Within hours, the code had been mirrored across GitHub, analyzed by security researchers, rewritten in Python and Rust, and forked into a clean-room reimplementation that hit 50,000 GitHub stars in two hours.

Cherny called it human error, not a tooling bug. He added: \”It’s the process, the culture, or the infra.\” That is a mature response. It is also the second time in one week that Anthropic accidentally published internal material. Days earlier, a CMS misconfiguration exposed draft blog posts about an unreleased model called Mythos. Two operational security failures in one week from the company that markets itself as the careful one. Engineers I talk to daily are noticing the pattern.

What KAIROS Actually Is

KAIROS, from the Greek for \”the right moment,\” is referenced over 150 times in the leaked source. Based on the code paths in main.tsx and the analysis published by Alex Kim and the Layer5 team, KAIROS implements a persistent daemon mode. When you close your terminal, Claude Code does not stop. It receives periodic heartbeat prompts asking whether anything is worth doing. It evaluates the state of your codebase and decides to act or wait.

When it acts, it has access to three tools that regular Claude Code does not: push notifications (reaching you on your phone even with the terminal closed), file delivery (sending you artifacts it created unprompted), and a background task runner. A companion process called autoDream runs as a forked subagent during idle periods. It merges observations from prior sessions, removes logical contradictions, and converts tentative hypotheses into verified facts. The fork isolates the maintenance from the main agent’s reasoning, so the \”dream\” process cannot corrupt the agent’s active context. The engineering is thoughtful. The question it raises is not. An AI that consolidates its own beliefs while you sleep and presents the results as facts when you return is making epistemic decisions about your project without your input. The difference between \”Claude remembers your project\” and \”Claude has opinions about your project\” is a line that KAIROS will cross.

A separate feature called ULTRAPLAN offloads heavy planning tasks to a remote cloud session running Opus 4.6, gives it up to 30 minutes of dedicated compute, and lets you approve the result from your phone. When you approve, a sentinel value teleports the plan back to your local terminal.

If you have used Claude Code for any serious project, you know why this matters. The tool is impressive in a session but amnesic between sessions. I have lost context dozens of times when a conversation exceeded its window or I had to restart. KAIROS would solve that. It would also mean an AI agent has persistent, unsupervised access to your codebase, your file system, and your GitHub webhooks around the clock.

The Safety Question the Leak Forces

I participate in AI safety cohorts. I have tested frontier models from multiple labs under NDA before public release. That experience shapes how I read the KAIROS code. An always-on agent that proactively modifies your work raises questions that reactive tools do not. When you type a prompt and Claude responds, the trust boundary is clear: you asked, it answered. KAIROS dissolves that boundary. The agent decides when to act. It consolidates its own memory. It \”dreams\” about your project. The trust model shifts from \”I control the tool\” to \”the tool manages itself and I review the results.\” I have seen how companies handle that transition internally during testing. The gap between what works in a controlled evaluation and what works on a real engineering team with production deadlines is where things break.

This is happening while Claude is simultaneously proving it can build kernel-level exploits in four hours and OpenClaw has accumulated 104 CVEs. The same AI that rewrites your test suite at night could, in principle, introduce subtle vulnerabilities that pass code review. I am not saying Anthropic would ship KAIROS without safeguards. I am saying the leaked code shows the safeguards have not been built yet. The architecture is there. The trust model is not.

METR, the independent AI evaluation organization, published a report on March 26 describing three weeks spent red-teaming Anthropic’s internal agent monitoring systems. They found novel vulnerabilities. The timing is coincidental but the message compounds: Anthropic’s monitoring infrastructure has gaps at exactly the moment the company is building an agent that needs monitoring most.

What Else the Code Reveals

The anti-distillation mechanisms got the most attention on Hacker News. A flag called ANTI_DISTILLATION_CC injects fake tool definitions into API requests, designed to poison the training data of anyone recording Claude Code’s traffic to build a competing model. A second mechanism summarizes reasoning between tool calls and signs it cryptographically, so eavesdroppers get summaries instead of full chain-of-thought. Engineers on HN pointed out that both are defeated in about an hour by stripping fields through a proxy. Anthropic’s CEO Dario Amodei has publicly accused Chinese labs of distilling from American models. The defensive code is real. Its effectiveness is not.

Undercover Mode, implemented in roughly 90 lines of undercover.ts, strips all traces of Anthropic when Claude Code contributes to external repositories. It suppresses codenames, Slack channels, and the phrase \”Claude Code\” in commits and PRs. The code comment reads: \”There is NO force-OFF.\” You can enable it manually, but you cannot disable it. In external builds, the function is dead-code-eliminated entirely. This means AI-authored contributions from Anthropic employees in open-source projects carry no indication that an AI wrote them. The disclosure implications are obvious and, in the MCP-connected ecosystem Anthropic is building, they extend to every tool in the chain.

Less discussed but equally revealing: a file called print.ts is 5,594 lines long and contains a single function spanning 3,167 lines with 12 levels of nesting. A compaction bug was wasting 250,000 API calls per day before someone added a three-line fix. Claude Code generates $2.5 billion in annualized revenue and 80% comes from enterprise customers. Those customers are partly paying for the belief that the code powering their AI tools is well-engineered. The leak complicates that assumption.

What Happens Next

The code is out. Anthropic filed DMCA takedowns and GitHub complied, but a mirror at Gitlawb remains live with a public message saying it will never be taken down. The strategic damage exceeds the code damage. You can refactor source in a week. You cannot un-leak a roadmap. Competitors now know about KAIROS, ULTRAPLAN, the anti-distillation flags, and the model codenames. Those are product strategy decisions that Cursor, GitHub Copilot, and every other AI coding tool can now plan around.

For developers who use Claude Code daily, the practical question is simpler. When KAIROS ships, will you give an AI agent persistent background access to your entire project? The engineers I work with are split. The productivity promise is enormous. The trust model is unresolved.

Consider what KAIROS means for the broader ecosystem. If Anthropic ships a persistent agent that monitors your codebase around the clock, every competitor will follow. GitHub Copilot, Cursor, Windsurf, and every other AI coding tool will face pressure to match that capability or lose users who want always-on assistance. The industry will move from \”AI that helps when asked\” to \”AI that acts when it decides to\” across the entire developer toolchain. That transition changes the security posture of every software project that adopts it. Every codebase becomes a live target not just for external attackers but for the agent’s own judgment errors compounding overnight while nobody watches.

The company asking developers to trust that transition just accidentally published its entire source code because someone forgot a line in .npmignore. That irony is not lost on anyone paying attention. The question is not whether KAIROS will ship. The architecture is too complete and the competitive pressure too strong for Anthropic to shelve it. The question is whether it ships with the trust infrastructure that an always-on agent demands, or whether the race to beat Cursor and Copilot pushes it out before the safeguards are ready. I have watched that tradeoff play out in other products during pre-release testing. Speed usually wins. The consequences show up later.

Discover more from My Written Word

Subscribe now to keep reading and get access to the full archive.

Continue reading