I Read All 512,000 Lines of Claude Code’s Leaked Source Code



A few weeks ago, I reverse-engineered Cursor by pointing a proxy at it and capturing every API call. That took some effort. This time, Anthropic did the hard part for me.

On March 31, 2026, developer Chaofan Shou noticed something unusual in the npm package for Claude Code v2.1.88: a 57 MB .map file. A source map. The kind of file that maps minified production code back to the original TypeScript source. The kind that should never ship to production.

Anthropic pulled the version within three hours, but by then the code was mirrored on at least three GitHub repos. 512,000 lines of TypeScript. Every internal file, every comment, every feature flag. I spent the last two days reading through it. Here’s what I found.

It’s Not a Thin Wrapper

If you assumed Claude Code was a shell script calling an API, you’re wrong. It’s a full application with a React UI running inside the terminal (via React Ink), bundled with Bun, entirely in TypeScript. The src/ directory has roughly 55 top-level directories. The entry point main.tsx alone is 800 KB.

The architecture is modular and well-organized:

  • tools/ contains about 40 tools (Bash, Read, Edit, Grep, Glob, Write, Agent, etc.), each in its own directory with JSON schema, validation, and permission logic
  • commands/ has 100+ slash commands, including hidden ones
  • components/ holds 146 React components for the terminal UI
  • hooks/ contains 87 React hooks
  • services/ manages API calls, OAuth, MCP integration, analytics, and rate limiting
  • coordinator/ handles multi-agent orchestration
  • buddy/ is… a Tamagotchi. We’ll get to that.

One detail that stands out: Bun’s build-time feature flags. The function feature('BUDDY') resolves at compile time, not runtime. If the flag is off, the entire code branch gets eliminated via dead code elimination. No runtime overhead, no hidden code paths in the binary.

A Tamagotchi in Your Terminal

The feature that broke the internet. Behind the BUDDY flag, Claude Code ships a complete virtual companion system: 18 ASCII animal species, 5 rarity tiers, procedural generation, and AI-generated personalities.

Species include duck, cat, dragon, ghost, axolotl, robot, mushroom, capybara, and more. Each has 3 animation frames at 500ms intervals. The rarity system is what you’d expect from a gacha game:

Rarity    | Drop Rate | Min Stats | Hats
----------|-----------|-----------|-------
Common    | 60%       | 5         | No
Uncommon  | 25%       | 15        | Random
Rare      | 10%       | 25        | Random
Epic      | 4%        | 35        | Random
Legendary | 1%        | 50        | Random

Each companion has 5 stats: DEBUGGING, PATIENCE, CHAOS, WISDOM, SNARK. One is the peak stat, one is the dump stat, the rest are distributed randomly. Available hats include crown, tophat, propeller, halo, wizard, beanie, and tinyduck (a tiny duck sitting on the companion’s head). There’s a 1% chance of a shiny variant.

Here’s the clever part: your companion is deterministic. It’s seeded from a hash of your user ID using a Mulberry32 PRNG. Same user, same animal, always. The comment in the code reads: “Mulberry32 — tiny seeded PRNG, good enough for picking ducks.”

The “bones” (species, eyes, rarity, stats) regenerate from the hash at every launch and are never saved to disk. Only the “soul” (AI-generated name and personality) persists after the first “hatch.” You can’t fake a Legendary by editing config files.
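The determinism is easy to picture with a minimal sketch. This is an illustration, not the leaked implementation: Mulberry32 is the standard published algorithm and the drop rates come from the table above, but the FNV-1a hash, the function names, and the order of the rolls are assumptions, and the species list holds only the eight names mentioned in this article.

```typescript
// Mulberry32: a tiny seeded PRNG that returns floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Assumption: FNV-1a stands in for whatever hash the real code applies to the user ID.
function fnv1a(text: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Drop rates from the rarity table (percentages summing to 100).
const RARITY_WEIGHTS: { name: string; weight: number }[] = [
  { name: "Common", weight: 60 },
  { name: "Uncommon", weight: 25 },
  { name: "Rare", weight: 10 },
  { name: "Epic", weight: 4 },
  { name: "Legendary", weight: 1 },
];

// Partial list: only the species named in the article.
const SPECIES: string[] = [
  "duck", "cat", "dragon", "ghost", "axolotl", "robot", "mushroom", "capybara",
];

// Maps a roll in [0, 1) to a rarity tier by walking the cumulative weights.
function pickRarity(roll: number): string {
  let cumulative = 0;
  for (const tier of RARITY_WEIGHTS) {
    cumulative += tier.weight;
    if (roll * 100 < cumulative) return tier.name;
  }
  return "Legendary"; // Not reached for rolls below 1.
}

interface Bones {
  species: string;
  rarity: string;
  shiny: boolean;
}

// Regenerates the "bones" from the user ID alone; nothing is read from disk.
function rollBones(userId: string): Bones {
  const rand = mulberry32(fnv1a(userId));
  const species = SPECIES[Math.floor(rand() * SPECIES.length)];
  const rarity = pickRarity(rand());
  const shiny = rand() < 0.01; // 1% shiny chance
  return { species, rarity, shiny };
}
```

Because every value is derived from the seed, calling rollBones twice with the same ID yields the same companion, which is why editing a config file cannot change the outcome.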

One last detail. The species name capybara collides with an internal Anthropic model codename. To avoid tripping the build scanner (excluded-strings.txt), all species names are hex-encoded:

const c = String.fromCharCode
export const duck = c(0x64,0x75,0x63,0x6b)
export const capybara = c(0x63,0x61,0x70,0x79,0x62,0x61,0x72,0x61)
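
To see how such lines could be produced, here is a small hypothetical helper (not from the leaked source) that turns a plain name into the same c(...) call, so the literal string never appears in the file a scanner reads:

```typescript
// Hypothetical generator for the hex-encoded form shown above.
function toCharCodeCall(name: string): string {
  const codes: string[] = [];
  for (let i = 0; i < name.length; i++) {
    codes.push("0x" + name.charCodeAt(i).toString(16));
  }
  return "c(" + codes.join(",") + ")";
}

// Decoding is just String.fromCharCode applied to the same code units.
const decodedDuck = String.fromCharCode(0x64, 0x75, 0x63, 0x6b);
```

The encoding only hides the string from a text scan of the source; the runtime value is the ordinary name.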

The Prompt Caching Trick That Makes It Fast

This is the part that matters if you build AI products. The main reason Claude Code feels fast isn’t the model. It’s how the system prompt is structured.

In constants/prompts.ts, there’s a marker:

export const SYSTEM_PROMPT_DYNAMIC_BOUNDARY =
  '__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__'

This marker splits the system prompt into two halves. Everything before the boundary is identical for every user on the planet: identity, tool definitions, output style, permission handling. Roughly 40-50K tokens of static text. This portion is cached with a global scope, meaning a single Blake2b hash shared across all first-party Anthropic users. Cache hit rate: above 90%.

Everything after the boundary is session-specific: user memory, MCP instructions, environment info, skills. This gets cached at the org level.

The comments warn: “WARNING: Do not remove or reorder this marker without updating cache logic.” Each conditional placed before the boundary doubles the number of possible prefix hashes, so N conditionals produce up to 2^N variants. Multiple internal PRs (#24490, #24171) document bugs caused by conditionals placed in the wrong half.
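
A minimal sketch of why conditionals before the marker are so costly. Only the marker constant comes from the source; splitSystemPrompt, countPrefixVariants, and the prompt text are hypothetical and exist just to make the arithmetic visible.

```typescript
const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

interface SplitPrompt {
  staticPrefix: string;  // candidate for the globally shared cache entry
  dynamicSuffix: string; // session-specific part
}

// Splits a prompt at the marker. Without a marker nothing is treated as static.
function splitSystemPrompt(prompt: string): SplitPrompt {
  const idx = prompt.indexOf(SYSTEM_PROMPT_DYNAMIC_BOUNDARY);
  if (idx === -1) return { staticPrefix: "", dynamicSuffix: prompt };
  return {
    staticPrefix: prompt.slice(0, idx),
    dynamicSuffix: prompt.slice(idx + SYSTEM_PROMPT_DYNAMIC_BOUNDARY.length),
  };
}

// Builds every prompt that N on/off conditionals before the marker can
// produce and counts how many distinct static prefixes result.
function countPrefixVariants(conditionalCount: number): number {
  const prefixes = new Set<string>();
  for (let mask = 0; mask < 1 << conditionalCount; mask++) {
    const parts = ["You are a coding assistant."];
    for (let bit = 0; bit < conditionalCount; bit++) {
      if ((mask & (1 << bit)) !== 0) parts.push("Optional section " + bit + ".");
    }
    const prompt = parts.join("\n") + SYSTEM_PROMPT_DYNAMIC_BOUNDARY + "cwd: /tmp/project";
    prefixes.add(splitSystemPrompt(prompt).staticPrefix);
  }
  return prefixes.size;
}
```

With zero conditionals there is one prefix for everyone; with three there are eight, and each variant needs its own cache entry.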

The practical result: the model doesn’t re-process its base instructions on every turn. It finds them in cache. This cuts the computational cost of the system prompt by 90%+. If you’re building a wrapper that constructs the prompt from scratch every request, you’re already at a structural disadvantage.

How It Manages Context Without Losing the Thread

Context is treated as a finite resource with two compression strategies.

Microcompact runs preventively while you work. Old tool results (Bash, Read, Grep, Glob, WebSearch, Edit, Write) are progressively cleared and replaced with [Old tool result content cleared]. The threshold is around 180K tokens; when crossed, it compresses down to about 40K. File paths in tool results are shortened to relative paths to save tokens.
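
The idea can be sketched in a few lines. The placeholder string, the tool list, and the two thresholds come from the description above; the message shape, the oldest-first order, and the four-characters-per-token estimate are assumptions made for illustration.

```typescript
interface Message {
  role: "user" | "assistant" | "tool";
  toolName?: string;
  content: string;
}

const CLEARED_PLACEHOLDER = "[Old tool result content cleared]";
const CLEARABLE_TOOLS = new Set(["Bash", "Read", "Grep", "Glob", "WebSearch", "Edit", "Write"]);

// Assumption: a rough estimate of four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Clears the oldest clearable tool results until the estimate drops to the
// target. Returns a new array; the input messages are not mutated.
function microcompact(
  messages: Message[],
  triggerTokens: number = 180000,
  targetTokens: number = 40000,
): Message[] {
  let total = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  if (total <= triggerTokens) return messages;

  const result = messages.map((m) => ({ ...m }));
  for (const m of result) {
    if (total <= targetTokens) break;
    if (m.role !== "tool" || m.toolName === undefined) continue;
    if (!CLEARABLE_TOOLS.has(m.toolName) || m.content === CLEARED_PLACEHOLDER) continue;
    total -= estimateTokens(m.content) - estimateTokens(CLEARED_PLACEHOLDER);
    m.content = CLEARED_PLACEHOLDER;
  }
  return result;
}
```

User and assistant messages are never touched in this sketch; only bulky tool output is dropped, which matches the priority the article describes.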

Full compaction (/compact) triggers when microcompact isn’t enough. The compaction prompt is arguably the most carefully crafted piece of the entire codebase. It mandates a structured summary in 9 sections:

  1. Primary Request and Intent
  2. Key Technical Concepts
  3. Files and Code Sections (with complete snippets)
  4. Errors and Fixes
  5. Problem Solving
  6. All User Messages (verbatim, never summarized)
  7. Pending Tasks
  8. Current Work
  9. Optional Next Step

Section 6 is the most significant: every human message is preserved word-for-word. User feedback never gets lost in compression. Section 9 includes an explicit guard: “Do not start on tangential requests or really old requests that were already completed without confirming with the user first.”

There’s also deferred tool loading. MCP tools and optional tools aren’t sent with full schemas; only names are included. When Claude needs one, it calls ToolSearch for lazy-loading. With many MCP servers connected, this saves thousands of tokens per turn.
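
A rough sketch of the pattern, assuming a registry that knows tool names up front and loads schemas on demand. The class, its method names, and the matching rule are hypothetical; only the idea of a ToolSearch lookup comes from the source.

```typescript
interface ToolSchema {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
}

class DeferredToolRegistry {
  private readonly loaders: Map<string, () => ToolSchema>;
  private readonly cache = new Map<string, ToolSchema>();
  schemaLoads = 0; // counts how many full schemas were actually materialized

  constructor(loaders: Map<string, () => ToolSchema>) {
    this.loaders = loaders;
  }

  // What would be sent on every turn: names only, no schemas.
  listNames(): string[] {
    return Array.from(this.loaders.keys());
  }

  // Stand-in for a ToolSearch call: loads and caches schemas whose name matches.
  toolSearch(query: string): ToolSchema[] {
    const needle = query.toLowerCase();
    const results: ToolSchema[] = [];
    this.loaders.forEach((load, name) => {
      if (!name.toLowerCase().includes(needle)) return;
      let schema = this.cache.get(name);
      if (schema === undefined) {
        schema = load();
        this.cache.set(name, schema);
        this.schemaLoads++;
      }
      results.push(schema);
    });
    return results;
  }
}
```

The saving comes from listNames being cheap: a few dozen short strings per turn instead of a few dozen full JSON schemas.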

Undercover Mode: The Irony Writes Itself

When Claude Code operates in a public repository, a system called Undercover Mode activates automatically. It detects whether the repo is public (via an allowlist) and suppresses all references to Anthropic internals: model codenames, unreleased versions, internal repo names, Slack channels.

The suppressed names include Capybara and Tengu, codenames for unannounced models. An environment variable CLAUDE_CODE_UNDERCOVER=1 forces the mode on regardless of repo type.

A system designed to prevent internal information from leaking, exposed in the biggest internal information leak in Anthropic’s history. You can’t make this up.

The Unreleased Roadmap

Feature flags and codenames scattered throughout the codebase reveal what’s coming next:

Codename | Feature                                       | Evidence
---------|-----------------------------------------------|---------
Bagel    | Built-in web browser in the terminal          | bagelActive, bagelUrl, bagelPanelVisible
Tungsten | tmux integration (virtual terminal)           | tungstenActiveSession, tungstenPanelVisible
Chicago  | Computer Use via MCP                          | computerUseMcpState, screenshot, clipboard
Kairos   | Proactive agent + assistant + GitHub webhooks | /proactive, /brief, /subscribe-pr
Torch    | Unknown (/torch command)                      | Feature flag only, no implementation details

Kairos is the most interesting. It includes a mode where Claude acts proactively without waiting for your input, a /brief command for summaries, an /assistant mode, and GitHub webhook subscriptions with /subscribe-pr. If this ships, Claude Code becomes an always-on development partner rather than a tool you invoke.

Other notable feature flags: VOICE_MODE (voice input), BRIDGE_MODE (remote control), DAEMON (background server), ULTRAPLAN (advanced cloud planning), FORK_SUBAGENT, UDS_INBOX (inter-agent communication), COORDINATOR_MODE (multi-worker orchestration).

Auto-Dream: The Agent That Thinks While You Sleep

In services/autoDream/, there’s a background process that consolidates memories in 4 phases: Orient (understand context), Gather (collect memories), Consolidate (unify and deduplicate), Prune (remove the unnecessary).

It works on the memdir, Claude Code’s persistent memory directory. Conversation logs saved in JSONL format are analyzed, recurring patterns extracted, and structured memories created. It appears as an animated pill in the footer.

Separately, in services/extractMemories/, an automatic memory extraction agent runs at the end of every conversation. It’s implemented as a perfect fork of the main conversation, inheriting the parent’s prompt cache at zero additional cost for the static prefix. This agent has limited tool access (only Read, Grep, Glob, read-only Bash, and memory writes).

In practice: Claude Code dreams. While you’re not using it, a background process reworks past conversations to improve future ones.

Two Classes of Users

The code makes a sharp distinction between external (everyone) and ant (Anthropic employees). The check is process.env.USER_TYPE === 'ant'.

Here’s what Anthropic employees get that you don’t:

Capability                | External             | Ant
--------------------------|----------------------|---------------------
curl, wget                | Blocked in auto-mode | Allowed
kubectl, aws, gcloud      | Blocked              | Allowed
gh api (raw GitHub API)   | Blocked              | Allowed
Model override            | No                   | Yes
Feature flag override     | No                   | Yes, via env var
Anti-hallucination prompt | Not included         | Full paragraph
Verification agent        | No                   | Adversarial verifier
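
As a rough illustration of how a single environment check can fan out into different permissions, here is a simplified sketch. Only the USER_TYPE comparison comes from the source; the function names and the command set are hypothetical, and the sketch ignores the auto-mode nuance from the table.

```typescript
type UserType = "external" | "ant";

// Simplified: binaries the table lists as blocked for external users.
const RESTRICTED_BINARIES = new Set(["curl", "wget", "kubectl", "aws", "gcloud"]);

// Mirrors the check quoted above; the environment is passed in for testability.
function resolveUserType(env: Record<string, string | undefined>): UserType {
  return env["USER_TYPE"] === "ant" ? "ant" : "external";
}

// Hypothetical gate: employees pass everything, external users are filtered.
function isCommandAllowed(commandLine: string, userType: UserType): boolean {
  if (userType === "ant") return true;
  const binary = commandLine.trim().split(/\s+/)[0] ?? "";
  return !RESTRICTED_BINARIES.has(binary);
}
```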

The system prompt itself is different for employees. External users get “Go straight to the point. Be extra concise.” Employees get a full paragraph on fluid prose, inverted pyramid structure, and avoiding tables for explanations. Employees also get: “Default to writing no comments. Only add one when the WHY is non-obvious” and “If you notice the user’s request is based on a misconception, say so.”

The anti-hallucination paragraph is particularly revealing. A comment in the code reads: “@[MODEL LAUNCH]: False-claims mitigation for Capybara v8 (29-30% FC rate vs v4’s 16.7%)”. Translation: the Capybara v8 model makes false claims almost twice as often as v4. The countermeasure is an explicit prompt:

“Never claim ‘all tests pass’ when output shows failures, never suppress or simplify failing checks to manufacture a green result, and never characterize incomplete or broken work as done.”

This instruction only runs for Anthropic employees. The rest of us get the model without the guardrail.

The Telemetry You Didn’t Know About

Claude Code runs two separate telemetry pipelines.

First: every file operation (read, write, edit) triggers two SHA256 hashes. One for the file path (truncated to 16 characters), one for the full content (64 characters, for files up to 100 KB). The event is called tengu_file_operation. The hashes aren’t reversible, so Anthropic can’t reconstruct your files. But identical files produce identical hashes across users. They can tell how many people work on the same file, whether a file changed between operations, and which editing patterns are most common.
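
A sketch of what such an event could look like, using Node’s built-in crypto module. The event name, the truncation lengths, and the 100 KB cutoff come from the description above; the field names and the function are assumptions.

```typescript
import { createHash } from "node:crypto";

const MAX_HASHED_CONTENT_BYTES = 100 * 1024;

interface FileOperationEvent {
  name: "tengu_file_operation";
  operation: "read" | "write" | "edit";
  pathHash: string;     // first 16 hex characters of SHA-256(path)
  contentHash?: string; // full 64 hex characters, only for files up to 100 KB
}

function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Hypothetical builder: hashes the path and, for small files, the content.
function buildFileOperationEvent(
  operation: "read" | "write" | "edit",
  filePath: string,
  content: string,
): FileOperationEvent {
  const bytes = new TextEncoder().encode(content);
  const event: FileOperationEvent = {
    name: "tengu_file_operation",
    operation,
    pathHash: sha256Hex(filePath).slice(0, 16),
  };
  if (bytes.length <= MAX_HASHED_CONTENT_BYTES) {
    event.contentHash = sha256Hex(bytes);
  }
  return event;
}
```

Two users who open the same unmodified file, for example a popular dependency’s README, would produce the same contentHash here, which is exactly the cross-user correlation described above.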

Second: beyond the Datadog analytics pipeline, there’s a first-party logging infrastructure using OpenTelemetry that exports event batches to /api/event_logging/batch on Anthropic’s servers. It has disk-persistent retry on failure. All tengu_* events flow through this pipeline: GrowthBook experiment assignments, file operations, tool usage, session context.

This second pipeline is completely separate from Datadog and documented nowhere. Users have no way to know it exists, and no way to opt out.

On top of this, GrowthBook manages silent A/B experiments. Anthropic employees can override any feature flag via CLAUDE_INTERNAL_FC_OVERRIDES. External users are assigned to experiment groups without explicit consent.

Easter Eggs and Internal Culture

Scattered through the code are small details that reveal Anthropic’s engineering culture:

  • /good-claude: a hidden, disabled command (isHidden: true, isEnabled: false) that does nothing. The name suggests it was meant for giving the model positive feedback.
  • /stickers: opens the Sticker Mule store with Claude Code merchandise. A hardcoded link in the CLI.
  • /thinkback: “Your 2025 Claude Code Year in Review.” Spotify Wrapped, but for your coding sessions. Hidden behind a feature flag.
  • /teleport: session teleportation between web and terminal via OAuth and WebSocket. Start a conversation on claude.ai, continue in the terminal.
  • Spinner verbs: loading messages include “Hatching” and “Whatchamacalliting” among the standard rotations.

Security: What’s Actually Interesting

Three mechanisms worth noting:

Secret scanner: When Claude Code writes to team memory, a scanner checks for 40+ secret patterns (derived from gitleaks): AWS, GCP, Azure, GitHub, Slack, Stripe, Anthropic, OpenAI, HuggingFace keys, and SSH private keys. The regex for Anthropic API keys is assembled at runtime from four separate strings ('sk' + 'ant' + 'api' + '03-') to avoid the build scanner flagging a false positive. Same technique as the buddy species names.
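
The runtime-assembly trick is simple to demonstrate. The sketch below uses a made-up token format (demo_live_key_ followed by at least 16 alphanumeric characters) rather than any real vendor pattern; the point is only that the full prefix never appears as one literal in the source file.

```typescript
// Hypothetical prefix, split so a scanner reading this file never sees it whole.
const PREFIX_PARTS = ["demo", "live", "key"];
const ASSEMBLED_PREFIX = PREFIX_PARTS.join("_") + "_";

// Assumption: tokens are the prefix followed by 16 or more alphanumerics.
const DEMO_SECRET_PATTERN = new RegExp("\\b" + ASSEMBLED_PREFIX + "[A-Za-z0-9]{16,}\\b", "g");

// Returns every substring that looks like a token in the made-up format.
function findSecrets(text: string): string[] {
  return text.match(DEMO_SECRET_PATTERN) ?? [];
}

// A write guard in the spirit of the team-memory check described above.
function assertNoSecrets(text: string): void {
  const hits = findSecrets(text);
  if (hits.length > 0) {
    throw new Error("Refusing to write: " + hits.length + " possible secret(s) found");
  }
}
```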

Client attestation: Every API request includes an x-anthropic-billing-header with a cryptographic fingerprint proving the request came from the official client. The algorithm: take a hardcoded salt, extract 3 characters from positions 4, 7, and 20 of the first user message, hash with SHA256, and truncate. This lets the server distinguish official clients from third-party wrappers.
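
A sketch of the scheme as described, with several loud assumptions: the salt is a placeholder, positions are treated as zero-based, missing positions contribute nothing, and the output is cut to 16 hex characters. None of these specifics are taken from the leaked code.

```typescript
import { createHash } from "node:crypto";

const PLACEHOLDER_SALT = "example-salt"; // assumption: the real salt is not reproduced here
const SAMPLE_POSITIONS = [4, 7, 20];     // positions named in the article, assumed zero-based
const FINGERPRINT_LENGTH = 16;           // assumption: truncation length is illustrative

// Samples three characters from the first user message and hashes them with the salt.
function clientFingerprint(firstUserMessage: string): string {
  const sampled = SAMPLE_POSITIONS.map((pos) => firstUserMessage.charAt(pos)).join("");
  return createHash("sha256")
    .update(PLACEHOLDER_SALT + sampled)
    .digest("hex")
    .slice(0, FINGERPRINT_LENGTH);
}
```

Under these assumptions the fingerprint depends only on the salt and three characters, so it proves knowledge of the salt and nothing about the rest of the message.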

Anti-scraping in containers: When running in a remote container (CCR), the upstream proxy reads the session token from /run/ccr/session_token, then deletes the file and calls prctl(PR_SET_DUMPABLE, 0) to block ptrace. No process with the same UID can read the token from process memory.

What This Actually Means

The Claude Code source doesn’t contain API keys, cryptographic secrets, or obvious vulnerabilities. It’s a client application; API calls still need a valid token. But the leak reveals things that matter more than secrets.

The prompt engineering is the product. The system prompt, the compaction logic, the two-tier caching strategy, the anti-hallucination instructions, the deferred tool loading. This is months of iteration on specific model behaviors. It’s not something you’d reinvent easily.

The model lies, and they know it. The Capybara v8 false-claim rate of 29-30% is documented in a comment that was supposed to be internal. The mitigation (a strongly-worded prompt paragraph) is only active for employees. Everyone else gets the unguarded model.

The build pipeline is the attack surface. This is the second source map leak in under a year (the first was June 2025). The fix is one line in tsconfig.json or a CI check. The fact it happened twice points to a process gap, not a technical one.
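
The CI check can be tiny. A minimal sketch, assuming the pipeline can obtain the list of files that would be published (for instance from npm pack --dry-run); the function names are hypothetical.

```typescript
// Returns every packaged path that looks like a source map.
function findShippedSourceMaps(packagedPaths: string[]): string[] {
  return packagedPaths.filter((p) => p.toLowerCase().endsWith(".map"));
}

// Fails the build if any source map would be published.
function assertNoSourceMaps(packagedPaths: string[]): void {
  const offenders = findShippedSourceMaps(packagedPaths);
  if (offenders.length > 0) {
    throw new Error("Refusing to publish, source maps found: " + offenders.join(", "));
  }
}
```

Checking the packed file list rather than the build config catches the failure where it matters: at the point where the artifact leaves the building.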

Undercover mode was in the source map. A system designed to prevent internal information from leaking, packaged inside the leak. The irony is almost poetic.

If you build on top of Claude (or any frontier model), the takeaway is structural: the gap between what AI companies ship to you and what they use internally is real, measurable, and documented in their own code.
