Preprint Claims 'SeedHijack' PRNG Attack Could Steer LLM Outputs; Authors Propose QRNG Defense
A newly posted arXiv preprint claims attackers may be able to steer large language model outputs by tampering with the pseudorandom number generator, or PRNG, used during sampling, rather than by changing a model’s weights or logits. The paper also proposes a defense based on hardware quantum random number generation, though the work is an unreviewed preprint and its claims were not independently verified in the materials reviewed here.
The paper, titled “Seed Hijacking of LLM Sampling and Quantum Random Number Defense,” was posted to arXiv as arXiv:2605.08313v1 in the cs.CR, or cryptography and security, category. The arXiv record shows it was submitted May 8, 2026, by Ziyang You, Xiaoke Yang, Zhanling Fan, Feng Guo, Xiaogen Zhou and Xuxing Lu. According to the abstract, the paper describes “SeedHijack,” which the authors characterize as a backdoor attack on the randomness layer used when models generate text. If that claim holds up, it would point to a potential supply-chain risk in the software stack around an AI model, not just in the model itself.
In the abstract, the authors write, “We present SeedHijack, a backdoor attack that manipulates PRNG outputs...” They further claim, “In a 540-trial benchmark on GPT-2 (124M), the attack achieves 99.6% exact token injection...” across nine sampling configurations. The abstract also says the method reached 100% success on four aligned models ranging from 1.5 billion to 7 billion parameters, including systems using reinforcement learning from human feedback, supervised fine-tuning and reasoning distillation. The authors say the attack bypassed all alignment methods tested in the work. Those figures come from the abstract and should be understood as the authors’ reported results, not as independently confirmed findings.
The claim matters because many language models rely on deterministic PRNGs during text generation. Those generators drive common sampling strategies such as top-k, top-p (nucleus) and temperature sampling, which pick the token a model emits next from a set of likely candidates. In many machine learning setups, developers reproduce outputs by fixing the random seed, which works only because the PRNG is deterministic. That same determinism cuts both ways: an attacker who could manipulate the randomness source might influence which token gets chosen without modifying the model’s learned parameters or its raw probability scores.
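To make that mechanism concrete, here is a minimal sketch, not code from the paper, of temperature-scaled top-k sampling in Python, where a single uniform PRNG draw picks the token via the inverse-CDF method. The function name sample_top_k, the toy logits and the forced uniform_draw value are illustrative assumptions, not the authors’ setup.

```python
import numpy as np

# Minimal sketch (not the paper's code) of temperature + top-k sampling.
# The names sample_top_k, uniform_draw and the toy logits are illustrative.

def sample_top_k(logits, k=5, temperature=1.0, uniform_draw=None, rng=None):
    """Pick one token id from the k highest-logit tokens via inverse-CDF sampling."""
    logits = np.asarray(logits, dtype=np.float64) / temperature
    top_ids = np.argsort(logits)[-k:]      # k most likely tokens, ascending by logit
    probs = np.exp(logits[top_ids] - logits[top_ids].max())
    probs /= probs.sum()                   # softmax restricted to the top-k set
    if uniform_draw is None:
        # The PRNG draw is the only source of randomness in the token choice.
        uniform_draw = (rng or np.random.default_rng()).random()
    return int(top_ids[np.searchsorted(np.cumsum(probs), uniform_draw)])

logits = [2.0, 1.0, 0.5, 3.0, 0.1]
honest = sample_top_k(logits, rng=np.random.default_rng(seed=0))  # picks token 3
# A draw near 0.0 forces the least likely top-k candidate (token 4)
# without touching the model's weights or its logits.
steered = sample_top_k(logits, uniform_draw=0.001)                # picks token 4
print(honest, steered)
```

The point of the sketch is that nothing downstream of the probabilities needs to change: whoever supplies the uniform draw decides the outcome, and that randomness layer is the one the preprint says SeedHijack targets.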
As a mitigation, the authors propose using a hardware quantum random number generator, or QRNG, as an entropy source during sampling. According to the abstract, that defense “neutralizes the attack in our evaluated threat model with negligible median overhead.” The abstract reports a 0.6% median latency increase and 7.7 megabytes of added memory use. QRNGs are used in some high-assurance cryptographic settings, but they are not a universal security fix, and prior peer-reviewed research has shown that QRNG hardware can face physical-layer attacks. In this case, QRNG should be understood as the paper’s proposed defense, not a cure-all.
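The shape of the proposed fix can be illustrated the same way. In the sketch below, the sampling draw comes from operating-system entropy rather than a seeded software PRNG; os.urandom is only a stand-in for the hardware QRNG the paper actually proposes, and entropy_uniform is an assumed helper name, not the authors’ API.

```python
import os
import struct

# Sketch of the defense idea: source the sampling draw from hardware-backed
# entropy. os.urandom stands in here for the paper's hardware QRNG device.

def entropy_uniform():
    """Uniform float in [0, 1) built from 53 fresh bits of OS entropy."""
    # Standard bits-to-double mapping: keep the top 53 of 64 random bits.
    bits = struct.unpack("<Q", os.urandom(8))[0] >> 11
    return bits / float(1 << 53)

# Used with sample_top_k from the earlier sketch, each token choice now
# consumes entropy a backdoored software PRNG cannot predict or control:
# token = sample_top_k(logits, uniform_draw=entropy_uniform())
print(entropy_uniform())
```

One trade-off worth noting: drawing fresh entropy per token gives up seeded, bit-for-bit reproducible generation, a property many machine learning workflows depend on.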
The paper was newly posted to arXiv at the time of writing, and the materials reviewed here did not include a public code repository or a press release. This article therefore reports the claims as presented in the arXiv record and abstract. It does not establish that the attack works as described outside the authors’ evaluation, or that the proposed defense would hold up more broadly.