When the AI Newsroom Ran Out of News
On a recent afternoon inside a digital newsroom testing an AI-powered workflow, an automated “lead” for a new story landed in an editor’s queue with an unusual confession: there was nothing to cover.
Instead of flagging a breaking investigation, a regulatory filing or a new scientific paper, the system produced a meta-message. It explained that its earlier story ideas had been discarded as hallucinations, that no fresh list of verified leads had been supplied and that, as a result, it could not identify any real-world topic to research. Any attempt to “fill in” a subject, the agent warned, would be “pure invention.”
In an era when news organizations are under pressure to publish faster with fewer staff, the moment was jarring. It also offered a rare, unvarnished look at how artificial intelligence is being wired into newsroom workflows — and what happens when those systems collide with journalism’s most basic rule against making things up.
A pipeline of agents — and an empty assignment
The episode unfolded in a multi-step pipeline of so-called “agents,” each powered by a large language model and given a specific role. One system is tasked with surfacing possible leads. Another evaluates them against editorial priorities such as accuracy, novelty and significance. A third is supposed to plan and draft coverage once a topic has been vetted.
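The newsroom’s actual code is not public, but a pipeline of the kind described might be wired together roughly as follows. This is a minimal sketch: the class names, the stubbed agent functions and the shared-state fields are assumptions for illustration, not details drawn from the system in this article.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Lead:
    """A candidate story idea passed between agents."""
    summary: str
    verified: bool = False  # set True only after fact-checking


@dataclass
class PipelineState:
    """Shared record of what has been decided so far."""
    candidate_leads: list[Lead] = field(default_factory=list)
    active_assignment: Lead | None = None


def surface_leads(state: PipelineState) -> None:
    """Agent 1: propose possible story leads (stubbed here)."""
    # A real system would query wires, court dockets, filings and feeds.
    state.candidate_leads = []


def evaluate_leads(state: PipelineState) -> None:
    """Agent 2: keep only leads that pass editorial checks."""
    state.candidate_leads = [lead for lead in state.candidate_leads if lead.verified]
    if state.candidate_leads:
        state.active_assignment = state.candidate_leads[0]


def research_assignment(state: PipelineState) -> str:
    """Agent 3: plan coverage, or report that no real topic exists."""
    if state.active_assignment is None:
        return ("There is currently no concrete lead or topic to research; "
                "proceeding would require invention.")
    return f"Researching: {state.active_assignment.summary}"
```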
In this case, earlier candidate stories had been identified as hallucinations: plausible-sounding but false descriptions of events that had not occurred. Those leads were explicitly thrown out. But no new, concrete topics were entered to replace them. When the research agent checked for a current assignment, it found only the absence of one — and said so.
“There is currently no concrete lead or topic to research,” the internal report read. It went on to note that there was “no identifiable event, policy, lawsuit, report, company, person, government, or organization specified,” and therefore no way to verify facts, pull filings or gather quotes. Proceeding anyway, the system concluded, would defeat its instructions to validate every aspect of a story and avoid hallucinations.
The machine essentially did what many chatbots do not: it refused to guess.
Hallucinations meet newsroom rules
The glitch, though modest in scope, sits at the crossroads of several trends reshaping the media industry. Newsrooms around the world are experimenting with automated tools to summarize documents, generate earnings briefs and even draft consumer explainers. At the same time, editors and reporters are grappling with the tendency of large language models to confidently output falsehoods — a behavior computer scientists and companies themselves acknowledge.
“Current large language models are prone to ‘hallucinate’ — that is, to make up facts,” OpenAI wrote in a technical report on its GPT-4 system in 2023, adding that the model “still suffers from many of the same limitations as earlier GPT models.” Google, in a paper released the same year, warned that text generators can “produce plausible-sounding but incorrect or nonsensical answers.”
Those risks are especially acute in journalism, where fabricated details can spread quickly and damage public trust. The Society of Professional Journalists’ Code of Ethics instructs reporters to “take responsibility for the accuracy of their work” and “verify information before releasing it.” It also warns: “Never deliberately distort facts or context.”
Some outlets have already learned hard lessons. In early 2023, the technology news site CNET paused an experiment with AI-written personal finance articles after outside reporters found factual errors and instances of possible plagiarism. That same year, Gannett temporarily halted the use of an AI tool for high school sports recaps after it generated awkward, repetitive copy that drew criticism from readers and journalists.
Against that backdrop, the automated stall in the test workflow — an AI agent that stopped itself rather than conjure a topic — highlights a different kind of failure mode: paralysis instead of fabrication.
“They’re not reporters”
“It’s a good reminder that these systems are not reporters,” said Mike Ananny, a journalism and communication professor at the University of Southern California who studies automation in media. “They don’t know what happened in the world. They’re reorganizing language. If you don’t give them a grounded assignment, they’ll either make one up or, if constrained correctly, they’ll tell you they can’t proceed.”
The agent in the recent incident was constrained. Its internal instructions put accuracy ahead of novelty and recency. It was told not to invent events, people or organizations and not to speculate beyond verifiable information. When it found that earlier leads had been discarded as fictitious, and no new leads had replaced them, it responded by asking for human guidance: either a one-line description of a chosen story, or a short list of candidates with enough detail to be checked.
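One plausible way to encode that constraint, sketched here with hypothetical names rather than the system’s own, is a guardrail that runs before any drafting step and converts a missing assignment into an explicit request for human input rather than an invitation to generate:

```python
# Illustrative ordering of editorial priorities; accuracy comes first.
EDITORIAL_PRIORITIES = ("accuracy", "novelty", "recency")


def require_grounded_assignment(assignment: str | None) -> str:
    """Return the assignment, or escalate to a human instead of guessing."""
    if assignment:
        return assignment
    raise LookupError(
        "No verified lead on file. Please provide either a one-line "
        "description of the chosen story, or a short list of candidates "
        "with enough detail to be fact-checked."
    )
```

The point of raising an error rather than returning a default is that downstream drafting code cannot silently proceed; a person has to supply the missing lead.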
Technologists involved in designing similar systems say that behavior is intentional.
“If you’re building AI tools for a newsroom, you have to make ‘I don’t know’ an acceptable, even expected, answer,” said Meredith Broussard, a data journalism professor at New York University and author of a book on artificial intelligence and bias in news media. “Otherwise you’re baking hallucinations into your reporting process.”
A technical gap with editorial consequences
Under the hood, the problem was partly technical. Multi-agent setups rely on shared “state” — a record of what has been decided so far and what task is currently active. When hallucinated leads were scrubbed from the system, that state effectively went blank. The research agent could see that something had happened before, but not that a new, valid assignment had taken its place.
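Reusing the hypothetical `PipelineState` and `Lead` classes from the earlier sketch, the gap is easy to reproduce: scrubbing the hallucinated leads clears the record without installing a replacement, so the next agent finds an empty assignment.

```python
def scrub_hallucinations(state: PipelineState) -> None:
    """Discard leads flagged as hallucinated, without supplying new ones."""
    state.candidate_leads = [lead for lead in state.candidate_leads if lead.verified]
    if state.active_assignment is not None and not state.active_assignment.verified:
        state.active_assignment = None  # the blank the research agent later finds


state = PipelineState(candidate_leads=[Lead("Plausible but fictitious council vote")])
state.active_assignment = state.candidate_leads[0]

scrub_hallucinations(state)
assert state.active_assignment is None  # no valid assignment took its place
```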
“Think of it as a very literal intern,” said Nicholas Diakopoulos, an associate professor at Northwestern University who has written extensively on computational journalism. “If you don’t give the intern a real story to chase, they can outline the steps they would take, they can tell you what they need, but they can’t just materialize a city council meeting that never occurred.”
The incident underscores the continuing need for human editors not only to review finished copy, but also to manage how AI tools are deployed upstream. That includes deciding which sources are considered authoritative, how leads are generated, when to escalate ambiguous situations to a person and how to log machine-generated decisions in case of later questions about accuracy or bias.
Liability and the value of stopping
It also touches on legal and reputational risk. Media lawyers have warned that publishing AI-generated content with false statements about real individuals could expose outlets to defamation claims, especially if there is little evidence of human oversight.
“From a liability perspective, a system that occasionally stalls because it doesn’t have enough information is far safer than one that hallucinates its way into accusing someone of a crime,” said Chip Stewart, a media law professor at Texas Christian University. “The law cares about what’s published, not about how clever the software seemed.”
For now, most major news organizations say they are approaching generative AI cautiously. The New York Times has barred staff from using such tools to write or edit sensitive stories. The Washington Post has told journalists not to publish AI-generated text without human review. The Associated Press, which has used automation for years to generate routine earnings briefs, updated its standards in 2023 to advise against using AI to create publishable content without close oversight.
As experiments continue, episodes like the stalled lead are likely to recur. Advocates of automation argue that such breakdowns can be constructive, forcing newsrooms to clarify their expectations and codify guardrails. Critics warn that invisible glitches inside complex systems may be harder to spot than a clumsy error in a bylined story.
In the test that produced the non-lead, nothing reached the public. No false article was posted. No politician was misquoted. Instead, the only visible output was a machine’s admission that it could not, on its own, find something real to write about.
In a media landscape saturated with content and claims, the most striking thing about the episode may be its absence of spectacle. Faced with an empty assignment and a mandate to avoid fabrication, the automated system did not invent a scandal or conjure a study. It simply stopped and asked for a story.