On February 28, 2026, Donald Knuth, the 87-year-old creator of TeX and author of The Art of Computer Programming, posted a five-page paper to his Stanford faculty page titled “Claude’s Cycles.” The opening line: “Shock! Shock!”
The paper describes how Anthropic’s Claude Opus 4.6 solved an open graph theory conjecture that Knuth had been working on for several weeks. The solution will likely appear in a future volume of The Art of Computer Programming. The paper garnered over 635,000 views and 6,000 likes within hours of being shared on social media.
This is not a marketing claim from an AI company. It is a verification from the most respected living computer scientist, a man who has been publicly skeptical of large language models for mathematical reasoning. When Knuth says an AI contributed to his research and names his paper after it, the claim carries weight that no benchmark score can match.
The Problem
The conjecture involves directed Hamiltonian cycles in three-dimensional graphs. Picture a cube of points arranged in an m × m × m grid, where each point has coordinates (i, j, k) with each value ranging from 0 to m − 1. The total number of points: m³.
From each point, there are exactly three directed edges. You can move by incrementing i, j, or k by 1 (wrapping around at the boundary, so the arithmetic is mod m). That gives 3m³ directed edges in total.
A Hamiltonian cycle is a path that visits every vertex exactly once and returns to the start. The challenge Knuth posed is harder than finding a single Hamiltonian cycle. He asked: can you decompose all 3m³ edges into exactly three Hamiltonian cycles, each of length m³, such that every edge belongs to exactly one cycle?
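These counts are easy to sanity-check in code. A minimal sketch in Python (the function name is mine, not from the paper) that enumerates the directed edges of the wraparound grid; a decomposition would assign each of these edges to exactly one of three Hamiltonian cycles:

```python
from itertools import product

def grid_edges(m):
    """All directed edges of the m x m x m wraparound grid: from each
    vertex (i, j, k), one edge per coordinate, incrementing mod m."""
    edges = []
    for i, j, k in product(range(m), repeat=3):
        edges.append(((i, j, k), ((i + 1) % m, j, k)))  # bump i
        edges.append(((i, j, k), (i, (j + 1) % m, k)))  # bump j
        edges.append(((i, j, k), (i, j, (k + 1) % m)))  # bump k
    return edges

edges = grid_edges(3)
print(len(edges))                  # 3 * 3**3 = 81 directed edges
print(len({v for v, _ in edges}))  # 3**3 = 27 vertices
```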
Knuth had solved the small case (m = 3) by hand and verified solutions computationally for m up to 16. But he could not find a general construction rule: a formula or algorithm that produces a valid decomposition for any odd m greater than 2. The problem was earmarked for inclusion in a future volume of TAOCP, and he was stuck.
What Claude Did
Filip Stappers, a colleague of Knuth, fed the exact problem statement to Claude Opus 4.6. What followed was an hour-long session across 31 guided explorations. Stappers was at the keyboard throughout, prompting Claude to document its progress, redirecting when it lost track, and restarting after session errors.
The 31 explorations followed a pattern that researchers familiar with mathematical discovery will recognize: a structured search through the space of possible approaches, with dead ends followed by strategy changes.
Early explorations tested brute-force approaches, which were too slow for the general case. Claude then invented what it called “serpentine patterns,” attempting to build cycles by snaking through the grid in systematic paths. In exploration 21, Claude identified a 2D serpentine function with useful properties and tested it as a building block for the 3D case. This looked promising for odd m but could not be generalized.
Explorations 22 through 25 tried simulated annealing at scale. The approach could find specific solutions for given values of m but produced no general construction rule. At exploration 25, Claude told itself: “SA can find solutions but cannot give a general construction. Need pure math.”
That self-correction is notable. The model recognized the limits of a numerical approach and shifted to algebraic reasoning on its own.
At exploration 27, Claude tried constructing three cycles from a single template by applying cyclic coordinate rotation. The idea: define one cycle, then rotate the coordinates to produce the other two. This produced three valid individual Hamiltonian cycles, but together they did not cover all edges. Near miss.
At exploration 30, Claude noticed a structural pattern in an earlier solution. By exploration 31, it produced a working construction. The rule: for a given vertex (i, j, k), compute s = (i + j + k) mod m. The “bump” direction (which coordinate to increment) depends on s and the values of the other coordinates. The specific rules:
When s = 0: bump i if j = m − 1, otherwise bump k. When 0 < s < m − 1: bump k if i = m − 1, otherwise bump j. When s = m − 1: bump k if i = 0, otherwise bump j.
This generates the first cycle. The other two cycles are produced by rotating the coordinate roles. Stappers tested the construction for all odd m between 3 and 101. Every decomposition was valid.
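The first cycle is compact enough to trace in a few lines of code. A sketch (Python; function and variable names are mine) that follows the bump rule from (0, 0, 0) and confirms it returns to the start only after visiting every vertex; the coordinate rotation that produces the other two cycles is omitted here:

```python
def bump(i, j, k, m):
    """Which coordinate to increment at vertex (i, j, k), per the
    bump rule stated above, with s = (i + j + k) mod m."""
    s = (i + j + k) % m
    if s == 0:
        return 'i' if j == m - 1 else 'k'
    if s == m - 1:
        return 'k' if i == 0 else 'j'
    return 'k' if i == m - 1 else 'j'  # the middle case: 0 < s < m - 1

def first_cycle(m):
    """Follow the bump rule from (0, 0, 0) until it returns there.
    Assumes odd m, where the walk is known to close up."""
    v, path = (0, 0, 0), []
    while True:
        path.append(v)
        i, j, k = v
        d = bump(i, j, k, m)
        v = ((i + (d == 'i')) % m, (j + (d == 'j')) % m, (k + (d == 'k')) % m)
        if v == (0, 0, 0):
            return path

for m in (3, 5):
    cycle = first_cycle(m)
    # Hamiltonian: length m**3 with no repeated vertices
    assert len(cycle) == m**3 and len(set(cycle)) == m**3
```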
What Knuth Did With It
Claude found the construction. Knuth proved it works.
These are different contributions, and conflating them overstates what the model did. Identifying a pattern through guided search is non-trivial. It is genuinely hard, which is why Knuth was stuck. But producing a formal mathematical proof from first principles is a separate capability. Claude did not produce the proof. Knuth did.
Knuth’s proof shows that the bump rule generates a cycle of length m³ by demonstrating that it visits all m² vertices with the same i value, then covers all i values in turn, and returns to the starting vertex. A parallel argument covers the other two cycles, and Knuth verified that the three cycles partition all edges.
Knuth also went further than Claude. He discovered that Claude’s solution was one of 760 valid “Claude-like” decompositions (those generated by analogous bump rules). Out of 4,554 total solutions for the 3 × 3 × 3 case, Knuth checked several of the 760 and “didn’t encounter any that were actually nicer” than Claude’s.
What Claude Could Not Do
After finding the odd-m solution, Stappers asked Claude to tackle the even-m case. This time it made no headway. In Knuth’s words, it “seemed to get stuck” and “in the end, it was not even able to write and run explore programs correctly anymore.”
The even case remains open. A separate researcher later used GPT-5.3 Codex to solve even cases for m ≥ 8 (m = 2 was proved impossible in 1982, and the cases m = 4 and m = 6 remain open). The fact that a different model solved a different part of the problem on a different attempt reinforces a pattern: AI models have variable, unpredictable mathematical strengths. No single system solves everything.
The session also required substantial human guidance. Stappers repeatedly prompted Claude to document its results, redirected it when it pursued dead ends, and had to restart after random errors caused session loss. This was a 31-step collaborative process, not a single-prompt autonomous solve.
Why This Matters
The significance is not that AI can “do math.” The significance is the identity of the validator.
Knuth is not an AI enthusiast looking for evidence to support his thesis. He has been publicly skeptical of large language models for rigorous mathematics. His previous interactions with ChatGPT, documented in his 2023 questions to the model, reflected polite curiosity but deep reservations about reliability. When Knuth writes “It seems I’ll have to revise my opinions about ‘generative AI’ one of these days,” that statement carries a different weight than a press release from an AI company.
The result also suggests an emerging research model. Humans pose problems. AI explores the solution space through structured iteration. Humans verify and prove the results. This is not AI replacing mathematicians. It is AI functioning as a powerful hypothesis generator that can try hundreds of approaches in the time a human tries five.
The Claudini autoresearch results showed a similar pattern in adversarial machine learning: AI systems operating at the scale of compute investment currently being deployed can generate discoveries that humans then validate. The Knuth case extends this pattern to pure mathematics.
Terence Tao accepted a GPT-5.2-generated proof of Erdős Problem #397 around the same time, verified in the formal proof language Lean. Since late 2025, over a dozen previously open mathematical problems have been moved to solved status with AI models credited in the solutions. The pattern is accelerating.
The Limitations of the Pattern
Claude’s approach worked because the problem had a tractable search space. The bump rule is defined by a finite set of conditional choices. An exhaustive search over rule variants, combined with computational verification, can cover that space. Not all mathematical problems have this structure. Many require conceptual leaps that cannot be reached by iterating over parameterized rule families.
The even-m failure illustrates this directly. When the solution requires a fundamentally different construction (not a bump-rule variant), the structured exploration approach that found the odd-m answer breaks down.
There is also a selection effect. We hear about the problems AI solves. We do not hear about the hundreds of problems where AI spends an hour exploring and produces nothing useful. AI benchmark results show impressive capability on specific tasks, but the gap between benchmark performance and open-ended mathematical creativity remains large.
Knuth closed his paper with a characteristically understated pun: “I think Claude Shannon’s spirit is probably proud to know that his name is now being associated with such advances. Hats off to Claude!” The joke works because it is ambiguous. Shannon, the father of information theory. Claude, the AI named (loosely) after him. And perhaps a nod to the fact that the distinction between AI capability and AI hype remains, as always, a matter of careful measurement.