A New Proposal Would Train AI on Brain Signals — and Raise the Stakes for Mental Privacy
The next generation of artificial intelligence might not just learn from what people type or click. It could learn from the way their brains fire.
In a preprint posted Jan. 17 to the preprint server arXiv, cognitive neuroscientist Maël Donoso lays out a plan to train large “foundation models” directly on human neuroimaging data. Instead of relying only on written feedback and ratings, he proposes using brain activity as part of the training signal—turning neural responses themselves into a kind of reward function for AI.
The paper, titled “A New Strategy for Artificial Intelligence: Training Foundation Models Directly on Human Brain Data,” introduces two main ideas: reinforcement learning from human brain (RLHB) and chain of thought from human brain (CoTHB). Donoso argues that brain signals could help steer powerful models toward behavior that better reflects human understanding, preferences and reasoning—at a time when regulators are increasingly treating mental privacy as a fundamental right.
“The paper explores moving beyond surface-level statistical regularities by training foundation models directly on human brain data,” Donoso writes.
From RLHF to brain-based feedback
Donoso, who holds a Ph.D. in cognitive and computational neuroscience and runs a one-person startup called Ouroboros Neurotechnologies in Lausanne, Switzerland, is not announcing a new commercial system. His paper is an agenda for future work, grounded in recent advances in brain decoding, brain-computer interfaces and the first generation of “brain foundation models” trained on large neuroimaging datasets.
Today’s leading chatbots and multimodal systems are typically refined through reinforcement learning from human feedback (RLHF). In that process, people rank model outputs or flag harmful content. Those judgments train a reward model, and the underlying system is optimized to produce outputs the reward model favors.
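In code, the heart of that pipeline is a reward model trained on pairwise preferences. The sketch below shows a standard Bradley-Terry style preference loss in PyTorch; it is a minimal illustration of the general technique, not any particular lab’s implementation, and the variable names are invented for the example.

```python
import torch
import torch.nn.functional as F

def reward_model_loss(reward_chosen, reward_rejected):
    """Pairwise preference loss used in standard RLHF: push the
    reward model to score the human-preferred answer above the
    rejected one for the same prompt."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Illustrative scores the reward model assigned to two answer pairs,
# where annotators preferred the first answer in each pair.
reward_chosen = torch.tensor([1.8, 0.4])
reward_rejected = torch.tensor([0.6, 0.9])
loss = reward_model_loss(reward_chosen, reward_rejected)
print(f"preference loss: {loss.item():.3f}")
```

Once trained, the reward model scores new outputs, and the underlying language model is optimized, typically with a policy-gradient method, to raise those scores.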
RLHF has become a cornerstone of large AI deployments, but it has drawn criticism for encoding annotator bias, for incentivizing models to sound agreeable rather than accurate and for relying on low-paid crowdworkers to define “human values” in practice. It also captures only what people are willing or able to write down.
Donoso’s proposal starts from a simple premise: before people speak, click or type, their brains generate activity patterns that underlie perception, valuation and decision-making. He calls these signals “brain-generated data” and distinguishes them from both device-generated logs and human-generated text, images or ratings.
Human-generated actions, he argues, are just one subset of a much richer process unfolding in the brain. If AI systems could learn directly from approximations of that neural process—measured through functional MRI (fMRI), electroencephalography (EEG) or other imaging tools—they might capture aspects of cognition that never make it into written feedback.
RLHB: using neural activity as a reward signal
RLHB is the clearest extension of current practice. In Donoso’s sketch, a model generates an answer to a prompt. A human evaluator reads it while their brain activity is recorded. Signals from regions involved in valuation—such as the ventral striatum and ventromedial prefrontal cortex—are decoded into numerical estimates of reward, confidence, novelty or aversion. Those numbers are then combined with explicit ratings or task scores to shape the reinforcement learning signal.
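The preprint stops short of specifying how implicit and explicit signals would be weighted. As a way of making the flow concrete, the sketch below blends the two with a fixed mixing coefficient; the decoder, the weighting scheme and every name in it are assumptions for illustration, not details from the paper.

```python
import numpy as np

def decode_neural_reward(roi_activity: np.ndarray) -> float:
    """Hypothetical decoder mapping activity from valuation regions
    (e.g., ventral striatum, vmPFC) to a scalar reward estimate.
    A real system would use a trained decoder, not a simple mean."""
    return float(np.tanh(roi_activity.mean()))

def rlhb_reward(explicit_rating: float, roi_activity: np.ndarray,
                alpha: float = 0.7) -> float:
    """Blend an explicit rating (rescaled to [-1, 1]) with the
    brain-derived estimate. The mixing weight alpha is an assumption;
    the preprint does not specify how the two would be combined."""
    neural = decode_neural_reward(roi_activity)
    return alpha * explicit_rating + (1 - alpha) * neural

# Illustrative use: an evaluator rates an answer 4 out of 5 while
# activity in valuation regions is recorded and decoded in parallel.
rating = (4 - 1) / 4 * 2 - 1   # map a 1-5 star rating onto [-1, 1]
activity = np.random.default_rng(0).normal(0.3, 1.0, size=200)
print(f"combined reward: {rlhb_reward(rating, activity):.3f}")
```

One design question such a scheme would face immediately is calibration: decoded neural signals drift across sessions and people, so a fixed mixing weight is unlikely to survive contact with real recordings.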
“The first method could be called reinforcement learning from human brain (RLHB). It would be an extension of RLHF … incorporating neuroimaging data, and possibly other real-time measures, for a better alignment with human values,” Donoso writes.
In principle, that implicit feedback could pick up fast, automatic reactions before a person has time to rationalize or self-censor. It might also weight feedback by how strongly someone cares about an answer, or detect conflict and surprise that simple star ratings miss.
CoTHB: aligning reasoning steps with the brain
A second proposal, CoTHB, moves from outcomes to reasoning. Large language models increasingly use chain-of-thought (CoT) prompting, generating step-by-step explanations on their way to an answer. CoTHB aims to align those intermediate steps with neural markers of human executive function.
“The second method could be called chain of thought from human brain (CoTHB). It would be an extension of CoT … incorporating neuroimaging data … for a more trustworthy emulation of human executive function,” Donoso writes.
He points to a set of brain regions—including the dorsolateral prefrontal cortex, anterior cingulate cortex and frontopolar cortex—that decades of work have linked to maintaining rules, weighing alternatives, switching strategies and deciding when to explore something new. In his framework, decoded signals from those areas could guide when a model sticks with a line of reasoning, when it backtracks and when it spins off sub-questions to resolve uncertainty.
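Translated into control logic, those decoded signals would act as a gate over the model’s reasoning loop. The following sketch is purely illustrative: the thresholds, signal names and the mapping from brain regions to actions are assumptions, since the preprint describes the idea only at the conceptual level.

```python
from enum import Enum

class Step(Enum):
    CONTINUE = "continue"    # keep extending the current chain
    BACKTRACK = "backtrack"  # abandon the current line of reasoning
    BRANCH = "branch"        # spin off a sub-question

def cothb_controller(conflict: float, novelty: float,
                     conflict_thresh: float = 0.6,
                     novelty_thresh: float = 0.5) -> Step:
    """Hypothetical rule: 'conflict' stands in for a signal decoded
    from anterior cingulate cortex, 'novelty' for one from frontopolar
    cortex. The thresholds are invented for the example."""
    if conflict > conflict_thresh:
        return Step.BACKTRACK
    if novelty > novelty_thresh:
        return Step.BRANCH
    return Step.CONTINUE

# Illustrative: strong decoded conflict would tell the model to
# abandon its current chain of thought and try another approach.
print(cothb_controller(conflict=0.8, novelty=0.2))  # Step.BACKTRACK
```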
A growing research base—and a bridge from EEG to fMRI
These ideas build on a body of work where brains and algorithms are already intertwined.
Over the past few years, groups in Europe, the United States and China have trained large “brain foundation models” on tens of thousands of MRI and fMRI scans to predict disease risk, cognitive traits and treatment response. Separate efforts have produced foundation models for EEG that generalize across recording setups and tasks.
In December, Donoso published a peer-reviewed paper in Frontiers in Systems Biology showing that fMRI activity during a neurofeedback task could be predicted from EEG signals using a mix of classical machine learning, deep networks and large language models. That work used chain-of-thought reasoning inside an AI system to infer cognitive states from EEG and then predict blood-oxygen-level-dependent activity in cortex.
He now casts EEG-to-fMRI prediction as a bridge between expensive, high-resolution scanning and cheaper, portable recording. If EEG can be reliably mapped onto richer, fMRI-like representations, then RLHB and CoTHB might be run outside of hospital imaging suites.
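At its core, that bridge is a regression problem: predict fMRI-like responses from EEG features recorded at the same time. The toy example below fits a linear baseline on synthetic data to show the shape of the task; Donoso’s published pipeline is far richer, mixing classical machine learning, deep networks and large language models, and none of the dimensions here come from his paper.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(42)

# Synthetic stand-ins: 500 time windows of flattened EEG features
# (e.g., channels x band powers) and BOLD activity in 100 voxels.
eeg_features = rng.normal(size=(500, 256))
true_mapping = rng.normal(size=(256, 100)) * 0.1
bold = eeg_features @ true_mapping + rng.normal(scale=0.5, size=(500, 100))

# A ridge-regression baseline for the EEG-to-fMRI bridge.
model = Ridge(alpha=1.0).fit(eeg_features[:400], bold[:400])
print(f"held-out R^2: {model.score(eeg_features[400:], bold[400:]):.2f}")
```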
There is also precedent for using neural signals as feedback in reinforcement learning. Studies have shown that robots can learn to avoid mistakes by reading error-related potentials in human EEG, and that functional near-infrared spectroscopy (fNIRS) from the prefrontal cortex can be mapped to performance evaluations of agents in simple tasks. Donoso cites this work as proof that “reinforcement learning from neural feedback” is feasible at small scale, even if extending it to trillion-parameter models is another matter.
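Those small-scale studies share a simple structure: a classifier watches for an error-related potential in the observer’s EEG and converts it into a penalty on the agent’s reward, roughly as in the sketch below. The threshold, penalty size and function names are assumptions, not values from the cited work.

```python
def errp_penalty(errp_probability: float, threshold: float = 0.5) -> float:
    """Turn a decoded error-related potential into a reward penalty.
    'errp_probability' would come from an EEG classifier watching
    the human observer; the threshold and magnitude are assumptions."""
    return -1.0 if errp_probability > threshold else 0.0

def shaped_reward(task_reward: float, errp_probability: float) -> float:
    """Add the implicit human signal to the environment's own reward,
    in the spirit of the robot studies the preprint cites."""
    return task_reward + errp_penalty(errp_probability)

# Illustrative: the robot completed a step (reward 0.1) but the
# observer's EEG suggests they perceived an error.
print(shaped_reward(task_reward=0.1, errp_probability=0.8))  # -0.9
```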
Technical hurdles: data scale, noise and “reverse inference”
The technical obstacles are substantial. Modern language models train on trillions of text tokens; existing neuroimaging datasets typically contain hundreds or thousands of brains. fMRI scans are costly and time-consuming, and EEG, while cheaper and faster, has limited spatial resolution and is mostly sensitive to activity near the scalp.
Brain anatomy and function also vary from person to person, making it hard to align one individual’s “brain space” with another’s in a way that a single model could consume. And although decoding methods have improved, inferring specific cognitive states or values from a pattern of activation remains an inexact science.
Neuroscientist Russell Poldrack and others have warned for years against “reverse inference”—the practice of concluding that a particular mental state is present because a certain region lights up. Donoso acknowledges this skepticism and leans on a pragmatic standard instead: if brain-derived signals reliably improve model performance or robustness, they may be useful even if scientists do not agree on a precise psychological interpretation.
Neurorights and the fight over mental privacy
The ethical and legal questions may be even more challenging than the technical ones.
In recent years, international organizations and national governments have begun to treat neural data as a special category of sensitive information. In November 2025, UNESCO’s member states adopted what the agency called the first global standard on the ethics of neurotechnology, urging protections for “mental privacy” and “freedom of thought” and warning of a regulatory “wild west” in consumer brain-computer interfaces.
Chile went further. In 2021, the country amended its constitution to recognize so-called neurorights, including rights to mental privacy, personal identity, free will and equitable access to cognitive enhancement. In 2023, Chile’s Supreme Court ordered U.S.-based neurotech company Emotiv to delete all brain data collected from former senator Guido Girardi, ruling that unauthorized use of his neural recordings violated his psychological integrity and privacy.
In Europe, the regional government of Cantabria in Spain approved what it described as the first European law specifically protecting brain data, treating it as health information and imposing extra transparency and oversight obligations on AI systems that process it in medical contexts. In the United States, senators have urged the Federal Trade Commission to investigate neurotech firms over data practices, and states including Colorado and California have moved to classify neural data as sensitive.
If companies ever tried to incorporate RLHB or CoTHB into commercial AI products, these emerging neurorights frameworks would likely come into play. Key questions include whether scans collected for medical or research purposes can be reused to train general-purpose AI; how to obtain informed consent when the downstream uses of brain data are uncertain; and whose brains end up defining “aligned” behavior if data collection is expensive and geographically concentrated.
Donoso frames brain-generated data as distinct from conventional personal data precisely because of its intimacy with underlying cognition. That distinctiveness, however, is what makes many legal scholars and ethicists wary of expanding non-medical uses of neuroimaging, particularly in areas such as marketing, workplace monitoring or law enforcement.
What comes next
For now, Donoso’s proposal remains theoretical. His preprint has drawn interest on technical aggregators and in neuro-AI circles, but there are no public signs that large AI labs have adopted or rejected RLHB or CoTHB.
The next steps are likely to come from small-scale experiments: testing whether brain-informed rewards or constraints can measurably improve robustness, reasoning or safety in modest models.
Whether those experiments ever scale up may depend less on clever decoding algorithms than on decisions made in parliaments, courts and standards bodies. As AI systems grow more capable and more deeply embedded in daily life, the prospect of tuning them to the dynamics of human neural circuits brings two futures into sharp focus: one in which machines are better aligned with how people perceive, value and reason—and another in which the contents of the mind become just another data stream to be optimized.