Default is Destiny: Preparing for the Wrong Singularity
A recursive feedback loop is degrading the information substrate itself — and unlike previous episodes of epistemic collapse, this one may not be reversible.
The singularity that futurist Ray Kurzweil has been predicting since 1999 — artificial general intelligence by 2029, a merger of human and machine intelligence by 2045 — is built on a specific premise: recursive self-improvement.1 An AI becomes smart enough to improve its own architecture, which makes it smarter, which lets it improve further, until intelligence escapes biological constraints entirely. This is the singularity that dominates public debate. It is not the singularity that is arriving.
What is arriving instead is recursive symbolic amplification. Large language models do not modify their own weights at inference time. They cannot redesign their architectures. They are not recursively self-improving in any sense Kurzweil intended. But they are recursively self-propagating.
The loop works like this. LLMs generate text. That text is published, indexed, and consumed. It shapes how people write, argue, and think. The altered discourse becomes training data. The next generation of models learns from a world already partially shaped by the previous generation’s output. Graphite, an SEO analytics firm, found that by late 2024 more than half of new English-language web articles were primarily AI-generated, up from roughly five per cent before ChatGPT launched.2 Imperva’s 2025 Bad Bot Report found that automated traffic surpassed human-generated activity for the first time, accounting for 51 per cent of all web traffic.3 The internet is no longer a record of what humans think. It is increasingly a record of what machines predict humans would say.
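The arithmetic of the loop is worth making explicit. The sketch below is a back-of-the-envelope stock-and-flow model, not a reproduction of the Graphite or Imperva figures: the starting stock, the volume of new pages per crawl cycle, and the synthetic share of that new material are all illustrative assumptions. What it shows is how the composition of the whole corpus drifts toward the composition of whatever is newly published.

```python
# Toy stock-and-flow model of the feedback loop. All numbers are illustrative
# assumptions, not figures from the cited studies.
def synthetic_share_of_corpus(cycles=10,
                              human_stock=100.0,            # assumed existing human-written pages (arbitrary units)
                              new_per_cycle=30.0,           # assumed volume of new pages per crawl cycle
                              synthetic_share_of_new=0.55): # assumed AI share of *new* pages
    human, synthetic = human_stock, 0.0
    for cycle in range(1, cycles + 1):
        synthetic += new_per_cycle * synthetic_share_of_new
        human += new_per_cycle * (1.0 - synthetic_share_of_new)
        share = synthetic / (human + synthetic)
        print(f"cycle {cycle:2d}: {share:.1%} of the corpus is machine-generated")

synthetic_share_of_corpus()
```

Even with the human-written backlog untouched, the corpus as a whole converges toward whatever share of new content the machines produce; the stock lags the flow, but it follows it.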
This is not Kurzweil’s singularity. It is Jean Baudrillard’s.
Baudrillard, a French sociologist and philosopher, spent his career studying what he called simulation — not simulation in the sense of flight simulators or video games, but the process by which representations of reality gradually replace reality itself. He traced this through a series of stages: signs that faithfully represent, then distort, then mask the absence of what they claim to represent, then bear no relation to any referent at all.4 His concern was always that the symbolic order would detach from the real and become self-sustaining. What he could not have anticipated was the technical infrastructure that would make this literal. When models train on model-generated data, simulation feeds on simulation. Researchers at Oxford published a 2024 paper in Nature demonstrating that this produces measurable degradation: models trained recursively on their own outputs lose information from the tails of their data distributions first — minority patterns, rare knowledge, edge cases — before collapsing toward bland central tendencies.5 They called this model collapse: a system that forgets everything except the mean.
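A toy version of the mechanism makes the tail loss concrete. The sketch below is not a reproduction of the Nature experiments; it assumes a Zipf-like vocabulary of a thousand token types and has each generation estimate the distribution from a finite sample of the previous generation’s output. A rare token type that happens to draw zero samples is gone for good.

```python
# Minimal sketch of recursive training on a categorical "vocabulary".
# Each generation samples text from the previous generation's estimated
# distribution; token types that draw zero samples vanish permanently.
import random
from collections import Counter

random.seed(0)
vocab = list(range(1000))
# Zipf-like original distribution: a few common token types, a long tail of rare ones.
weights = [1.0 / (rank + 1) for rank in range(len(vocab))]

for generation in range(1, 11):
    sample = random.choices(vocab, weights=weights, k=5000)  # what this generation publishes
    counts = Counter(sample)
    weights = [counts.get(tok, 0) for tok in vocab]          # the next model's estimate of the distribution
    surviving = sum(1 for w in weights if w > 0)
    print(f"generation {generation:2d}: {surviving} of {len(vocab)} token types survive")
```

The survivor count only falls. The common persists, the rare goes extinct, and no later generation can resurrect what it has never seen.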
Claude Shannon would have recognised the mechanism immediately. In 1948, Shannon defined information mathematically as the reduction of uncertainty — as surprise.6 By that measure, LLM output is low-information by design. The objective function is prediction: produce the most probable next token given the preceding context. Each generation of model output, fed back into training data, reduces the surprise in the corpus. The next generation’s distribution narrows further. What the machine learning community calls model collapse, information theory would describe as a shrinking of entropy at the source: less surprise per symbol, and therefore less information — a degradation not of intelligence but of signal.
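The same point in Shannon’s own units: the snippet below computes entropy in bits for two invented next-token distributions, one broad and one peaked. The probabilities are made up for the example, not measured from any model.

```python
# Shannon entropy in bits: H = -sum(p * log2(p)).
from math import log2

def entropy_bits(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

diverse = [0.25, 0.25, 0.25, 0.25]   # four next-tokens, equally likely
narrowed = [0.85, 0.05, 0.05, 0.05]  # the same four after the distribution narrows

print(entropy_bits(diverse))   # 2.0 bits of surprise per token
print(entropy_bits(narrowed))  # roughly 0.85 bits: fluent, but carrying less information
```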
Shannon’s entropy has a thermodynamic counterpart, and the thermodynamic lesson applies here: what happens next depends on whether the system is open or closed. In an open system — one that receives fresh signal from outside — the degradation can be locally reversed. In a closed system, the analogue of the second law takes hold: the signal runs down. If AI-generated content is growing as a proportion of total content, and human-generated content is shrinking, and the institutions that produce slow, expensive, verified human knowledge — universities, specialist publishers, investigative newsrooms, domain-expert communities — are being defunded or hollowed out by the same economic forces that make AI-generated content cheap, then the system is closing. And in a closing system, the degradation is not a temporary distortion. It is a direction.
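The open/closed distinction can be bolted onto the earlier vocabulary sketch. The mixing fraction below is an assumption chosen purely for illustration; the point is the shape of the behaviour, not the particular numbers. With no fresh human signal the tail keeps dying; with even a modest inflow drawn from the original distribution, rare token types keep re-entering circulation.

```python
# The earlier sketch, with a fraction of fresh "human" text (drawn from the
# original distribution) mixed into each generation's training data.
# The mixing fraction is an illustrative assumption.
import random
from collections import Counter

def surviving_types(human_fraction, generations=10, vocab_size=1000, k=5000, seed=0):
    rng = random.Random(seed)
    original = [1.0 / (rank + 1) for rank in range(vocab_size)]  # the "real world" distribution
    weights = original[:]
    for _ in range(generations):
        synthetic = rng.choices(range(vocab_size), weights=weights, k=int(k * (1 - human_fraction)))
        fresh = rng.choices(range(vocab_size), weights=original, k=int(k * human_fraction))
        counts = Counter(synthetic + fresh)
        weights = [counts.get(tok, 0) for tok in range(vocab_size)]
    return sum(1 for w in weights if w > 0)

print(surviving_types(human_fraction=0.0))  # closed system: diversity keeps draining away
print(surviving_types(human_fraction=0.2))  # open system: fresh signal holds the tail open
```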
This is what separates the current situation from previous episodes of epistemic collapse, and what earns the word singularity. Propaganda states, information monopolies, the erosion of institutional trust — these have all degraded the information environment before, and civilisations have recovered. But in every previous case, the underlying substrate survived. Soviet propaganda distorted public discourse for decades, but the scientists still knew physics, the engineers still knew how bridges worked, the samizdat writers still carried the literary tradition in their heads. The information existed in people and institutions even when the channels were corrupted. What was damaged was distribution, not knowledge itself. Recovery was possible because the substrate — embodied human expertise, institutional memory, disciplinary traditions — remained intact beneath the distortion.
The recursive feedback loop threatens something different. It degrades the substrate. When the corpus on which future AI systems train is progressively contaminated by AI-generated output, when human-generated text becomes a shrinking minority of available training data, when the tails of the distribution that Shumailov’s Nature paper shows disappearing first are precisely the rare, specialist, minority-perspective knowledge that is hardest to regenerate — the loss is not in the channel but in the informational ecology itself. And informational diversity, like biological diversity, has a property that makes its loss qualitatively different from other kinds of damage: extinction is not reversible. The specialist who retires without training a successor, the research programme defunded because an LLM can produce a plausible-sounding literature review for a fraction of the cost, the language community whose textual tradition is too small to survive as training data — these represent permanent losses of signal. You can recover from a bad broadcast. You cannot recover an extinct species of knowledge.
Nanotechnologists have a name for this kind of process. Grey goo — the scenario Kurzweil himself discussed in The Singularity Is Near — describes self-replicating machines converting diverse matter into copies of themselves until nothing else remains. The information ecology faces its own version: self-replicating text converting diverse human knowledge into statistically average model output, not through malice but through the simple economics of cheap production and expensive origination. The grey goo scenario was always speculative. The grey goo of language is measurable and already underway.
Meanwhile, reality adapts to the models rather than the other way around. An entire industry now exists to optimise content for LLM summarisation — structuring text so that AI systems will parse, cite, and surface it.7 Developers write code in patterns that LLM completion engines predict well, because fighting the autocomplete costs more effort than accepting it. Communication styles converge toward the register that AI produces most fluently. A system optimising for prediction will, over time, reduce the information content of its environment. It will make the world easier to predict by making the world less surprising.
The standard response is individual: read critically, verify sources, treat the model as a tool rather than an oracle. That response is correct and insufficient. The recursive loop operates upstream of individual judgment. A researcher who reads every source carefully is still working within a literature increasingly shaped by AI-generated prose. A policymaker who insists on primary evidence is still operating in an environment where AI-summarised sentiment has already moved the markets her policy addresses. A journalist who fact-checks rigorously is still drawing on sources whose public statements were drafted or influenced by LLM output. Synthetic consensus forms when AI-generated analysis converges around similar conclusions — not because they are correct, but because the training data contains overlapping patterns that produce convergent outputs. Synthetic authority builds when AI-generated content cites other AI-generated content, creating chains of attribution that look rigorous but anchor to nothing.8 The corruption is not at the point of consumption. It is in the supply.
The strongest counterargument is that the loop is self-correcting. Yann LeCun and others at Meta have argued that language models will eventually anchor their outputs in world models through multimodal grounding, not just statistical patterns.9 And the model collapse research suggests a fix: if original human data is preserved alongside synthetic data, degradation stabilises. These are serious points. But they solve a different problem. Grounding solves the intelligence problem — whether models produce accurate outputs. It does not solve the symbolic amplification problem, because amplification does not require the model to be wrong. It only requires the model to be fluent enough, and cheap enough, to become the default.
The philosopher Bernard Stiegler spent his final decades explaining why default is destiny.10 Every cognitive technology, he argued, is a pharmakon — simultaneously remedy and poison. Writing augments memory but atrophies recall. Calculators augment arithmetic but erode numeracy. The technology performs a cognitive task so effectively that the human capacity to perform it independently withers — a process he called proletarianisation. Stiegler’s deeper insight was that this is not primarily a story about individual weakness. It is a story about economic incentives. Systems that externalise cognitive effort create dependent users, and dependent users are more profitable than independent ones. The platform that summarises so you do not have to read will always outcompete the platform that insists you read first. The struggle is not between disciplined individuals and undisciplined ones. It is between the speed of technological externalisation and the capacity of institutions to build countervailing structures — structures that keep the system open, that preserve the flow of fresh human signal, that maintain the diversity of the informational ecology against the economic pressure to let it collapse.
We know what some of these structures might look like: provenance requirements for training data, audit standards for AI-generated content in regulated domains, sustained investment in the kind of slow, expensive, embodied knowledge production that no language model can replicate. None of these are technically exotic. All of them are economically inconvenient, which is precisely why they require institutional commitment rather than individual virtue. This singularity is not transcendence. It is the point at which what has been lost can no longer be recovered — and unlike Kurzweil’s version, we will not recognise it when it arrives. There will be no dramatic threshold, no moment of obvious rupture. The surface will remain fluent, plausible, and well-formatted. It is what lies beneath that will have quietly become nothing.
References
1. Kurzweil, R. (2005). The Singularity Is Near: When Humans Transcend Biology. Viking. Kurzweil reaffirmed his core predictions — AGI by 2029, singularity by 2045 — in The Singularity Is Nearer (Viking, 2024).
2. Graphite. (2025). AI Content Study. Analysis of 65,000 English-language URLs from Common Crawl, 2020–2025. Reported in Axios, 14 October 2025. https://www.axios.com/2025/10/14/ai-generated-writing-humans
3. Thales / Imperva. (2025). 2025 Bad Bot Report: The Rapid Rise of Bots and the Unseen Risk for Business. https://www.imperva.com/resources/resource-library/reports/2025-bad-bot-report/
4. Baudrillard, J. (1981). Simulacra and Simulation (S. F. Glaser, Trans.). University of Michigan Press.
5. Shumailov, I., Shumaylov, Z., Zhao, Y., Papernot, N., Anderson, R., & Gal, Y. (2024). AI models collapse when trained on recursively generated data. Nature, 631(8022), 755–759.
6. Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27(3), 379–423.
7. The practice is now known as “LLM SEO.” By mid-2025, major SEO platforms including Yoast and Semrush had introduced dedicated tooling for optimising content to be parsed and cited by large language models.
8. Dohmatob, E., et al. (2025). Strong model collapse. Proceedings of ICLR 2025. The ICLR spotlight paper demonstrates that even synthetic data fractions as small as 1 in 1,000 can degrade model performance over successive training generations.
9. LeCun, Y. (2022). A path towards autonomous machine intelligence. Meta AI Research. LeCun has consistently argued that language-only models are fundamentally limited and that multimodal grounding in world models will resolve current shortcomings.
10. Stiegler, B. (2010). Taking Care of Youth and the Generations (S. Barker, Trans.). Stanford University Press. Stiegler’s concept of proletarianisation — the loss of knowledge through its externalisation into technical systems — extends across his major works, including Technics and Time (1994–2001) and The Age of Disruption (2016).