AI Agents Reproduce and Extend CERN Analyses, Claim First AI-Produced Novel Result on LEP Data
A new arXiv preprint from researchers affiliated with MIT, the NSF AI Institute for Artificial Intelligence and Fundamental Interactions (IAIFI) and CERN says an AI-agent workflow built around Anthropic’s Claude Code carried out end-to-end high-energy physics analyses on public CERN datasets, including what the authors describe as a first AI-produced novel result.
The paper, titled “AI Agents Can Already Autonomously Perform Experimental High Energy Physics,” is a preprint, not a peer-reviewed study. It was submitted to arXiv on March 20, 2026, and revised June 20, 2026. The authors — Eric A. Moreno, Samuel Bright-Thonney, Andrzej Novak, Dolores Garcia and Philip Harris — have also publicly released the code, generated paper files and other analysis artifacts on GitHub under jfc-mit.
The preprint says the team built a framework called Just Furnish Context, or JFC, around Claude Code, Anthropic’s agentic coding tool and runtime. The abstract says: “Given access to a HEP dataset, an execution framework, and a corpus of prior experimental literature, we find that Claude Code succeeds in automating all stages of a typical analysis: event selection, background estimation, uncertainty quantification, statistical inference, and paper drafting.” In the project’s GitHub README, the authors describe the system more cautiously, writing: “JFC is a proof-of-concept framework for autonomous high energy physics analysis.”
According to the paper, the workflow was tested on public open datasets from ALEPH, DELPHI and CMS. The two headline examples are a CMS Run-1 Open Data Higgs-to-tau-tau analysis, presented as a reproduction of a known result, and a Lund-plane measurement on LEP-era ALEPH data, which the authors say is new. The abstract calls that second result “the first Lund plane measurement on LEP data — a genuinely novel result and, to our knowledge, the first produced autonomously by an AI agent.” The GitHub materials explicitly list Claude Code as the agent runtime used in the project.
ALEPH and DELPHI were experiments at CERN’s Large Electron-Positron collider, or LEP, while CMS is one of the main experiments at the Large Hadron Collider. The Lund plane is a way of representing jet substructure, the internal pattern of the particle sprays created in collisions, that was introduced in 2018 and has since been used in collider-physics studies. The paper’s novelty claim is not about inventing the Lund plane itself, but about applying that style of measurement to LEP electron-positron data rather than to the hadron-collider data used in earlier measurements.
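For readers who want a concrete picture of what a Lund-plane point is, the short sketch below computes the two coordinates for a single jet splitting. It assumes the hadron-collider convention from the 2018 definition, in which each splitting is placed at ln(1/ΔR) and ln(kt), with kt taken as the softer branch's transverse momentum times its angular separation ΔR from the harder branch. The function name and inputs are hypothetical and are not taken from the JFC code, and an electron-positron analysis like the one in the preprint may well use different variables, such as opening angles and energies.

```python
import math


def lund_coordinates(pt_a, y_a, phi_a, pt_b, y_b, phi_b):
    """Return (ln(1/dR), ln(kt)) for one splitting into branches a and b.

    Illustrative only: hadron-collider convention, hypothetical interface.
    pt is transverse momentum (GeV), y is rapidity, phi is azimuth (radians).
    """
    # Wrap the azimuthal difference into [-pi, pi) so that branches on either
    # side of the phi = +/-pi boundary are still treated as close together.
    dphi = (phi_a - phi_b + math.pi) % (2 * math.pi) - math.pi
    dy = y_a - y_b

    # Angular separation between the two branches.
    delta_r = math.hypot(dy, dphi)

    # Transverse momentum of the softer branch relative to the harder one.
    kt = min(pt_a, pt_b) * delta_r

    return math.log(1.0 / delta_r), math.log(kt)


if __name__ == "__main__":
    # A 10 GeV emission at dR = 0.1 from a 50 GeV branch.
    print(lund_coordinates(50.0, 0.0, 0.0, 10.0, 0.1, 0.0))
```

Repeating this for every step of a jet's declustering sequence and histogramming the resulting points gives the two-dimensional density that a Lund-plane measurement reports.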
The authors do not describe the process as fully human-free. The project materials say the workflow includes multi-agent review and a “human gate,” meaning a person still signs off at certain points. And while the analyses use data from the CERN Open Data portal, that public availability does not mean the original experimental collaborations endorse outside results produced from those releases. For now, the central claims — including the claimed first autonomous novel result — remain the authors’ claims in a public preprint, backed by released code and artifacts but not yet peer reviewed.