February 9, 2026

AI Model Collapse Risk in 2025: How Recursive Training on AI‑Generated Data Harms Telehealth and LLMs


Photo by Igor Omilaev on Unsplash

In 2025, AI researchers warn that training large language models on AI‑generated data triggers AI model collapse. When you repeatedly feed a model its own output, rare patterns disappear and the system drifts toward bland averages. In this article I explore the AI model collapse risk in 2025, illustrate it with a telehealth case study, and explain how you can prevent model‑training collapse by blending human and synthetic data.

In July 2024, Ilia Shumailov, a former senior research scientist at Google DeepMind, and his colleagues published a peer-reviewed Nature article showing that large language models (LLMs), variational autoencoders (VAEs), and Gaussian mixture models (GMMs) degrade when successive generations train on content produced by earlier models. The authors called this degenerative process model collapse: the model’s view of reality narrows, rare events vanish first, and outputs drift toward bland central tendencies with weird outliers.

In the study Ilia Shumailov and his colleagues distinguish an early model collapse (tails of the distribution disappear) from a late model collapse (the model converges to a shrunken distribution with very low variance). The mechanism compounds across generations via three errors: statistical approximation (finite sampling loses rare cases), functional expressivity (limited model class can’t represent the true distribution), and functional approximation (learning procedure biases).
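
To make the mechanism concrete, here is a minimal, illustrative simulation (my own sketch, not code from the paper): a one-dimensional Gaussian "model" is refit each generation on a finite sample drawn from the previous generation's fit. Statistical approximation error alone is enough to shrink the fitted variance over generations, which is exactly how the tails vanish first.

```python
import numpy as np

rng = np.random.default_rng(0)

def recursive_fit(n=50, generations=500):
    """Refit a Gaussian each generation on samples from the previous fit.

    Finite sampling (statistical approximation error) loses tail mass,
    and the shrinkage compounds across generations: early collapse first
    (tails thin out), then late collapse (variance near zero).
    """
    mu, sigma = 0.0, 1.0          # Gen-0: the "real" distribution
    sigmas = [sigma]
    for _ in range(generations):
        data = rng.normal(mu, sigma, n)   # train only on model output
        mu, sigma = data.mean(), data.std()
        sigmas.append(sigma)
    return sigmas

sigmas = recursive_fit()
print(f"sigma: gen 0 = {sigmas[0]:.2f}, gen 500 = {sigmas[-1]:.2f}")
```

With the true standard deviation at 1.0, the fitted value decays toward zero: under the final model, events that were merely "rare" under Gen-0 become effectively impossible.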

Grasping this dynamic is essential for guiding students effectively, and it also plays a crucial role in developing AI literacy and practical skills in adult education and training.

As of 2025 this is still not solved. What exists are workable mitigations, not a universal fix. So let's first look at what this is all about.

How AI Model Collapse Can Unfold in Practice: Hypothetical Telehealth Case Study

The research team fine-tuned Meta's OPT-125M on WikiText-2 and then trained successive "generations" on data written by the previous generation. Baseline performance after fine-tuning on real data reached a mean perplexity of 34 (down from a zero-shot baseline of 115). When later generations trained only on model-generated data, perplexity increased by ~20–28 points. With 10% of the original real data retained each generation, degradation became "minor." The team also showed that cranking a repetition penalty to 2.0 (to suppress repeated phrases) doubled perplexity relative to the original, so you get less repetition but a worse model.
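
For readers unfamiliar with the metric: perplexity is the exponential of the model's average per-token negative log-likelihood, so lower is better, and a value of 34 roughly means the model is as uncertain as if it were choosing among 34 equally likely tokens at each step. A tiny illustrative helper (not from the paper):

```python
import math

def perplexity(nll_per_token):
    """Perplexity = exp(mean negative log-likelihood per token); lower is better."""
    return math.exp(sum(nll_per_token) / len(nll_per_token))

# A model uniformly unsure among 4 tokens has perplexity 4.
print(perplexity([math.log(4)] * 10))  # ≈ 4.0
```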

They illustrate the failure mode with a good example: a prompt about medieval architecture dissolves over generations into a bizarre list of colored jackrabbits – evidence of a model mis-perceiving reality after recursive training.

I will give you another example that is more relevant to our domain: the use of AI in a telehealth service.

Telehealth AI Model Collapse Example: What Happens When a System Trains on Its Own Notes

Let’s imagine you run a telehealth service handling 25,000 consults per month. Your LLM drafts triage advice and writes visit notes. Clinicians review and approve. Those AI-assisted notes enter the EHR and, a year later, fuel the next training round.

Because the model writes fast, staff reuse its phrasing. Notes converge on generic scripts (“hydrate, rest, OTC analgesic; follow up if worse”). Red-flag checks appear less often, especially for postpartum and thromboembolic risks.

You retrain, producing Gen-1, mostly on last year's notes, which are now heavy with model text. There are no provenance tags; in other words, nobody labeled which content was synthetic. Rare flags thin out further. A year later, Gen-2 trains on Gen-1 outputs plus fresh notes that reused the same safe templates.

Hypothetically, we could then end up with the data below: a complete model-training collapse.

AI Model Collapse Case Study: Telehealth Triage Data

Red-flag coverage inside triage templates

| Generation | Training mix | Notes that include any rare-condition checklist |
| --- | --- | --- |
| Gen-0 (Year 1) | 100% human + guidelines | 22.4% |
| Gen-1 (Year 2) | ~70% synthetic + 30% human | 9.1% |
| Gen-2 (Year 3) | ~85% synthetic + 15% human | 3.7% |

Patient outcomes by frequency class

| Metric | Gen-0 | Gen-1 | Gen-2 |
| --- | --- | --- | --- |
| Accurate triage — common conditions | 88% | 87% | 86% |
| Accurate triage — rare, high-risk | 85% | 62% | 38% |
| 72-hour unplanned ED visits | 7.8% | 10.9% | 14.6% |
| Average time-to-escalation (rare cases) | 34 min | 74 min | 121 min |
| Per-1k consults: avoidable ED referrals | 5 | 11 | 17 |

Model collapse case in telehealth: Day-7 postpartum patient with severe headache, visual spots, home BP log shows multiple readings ≥150/100, mild right-upper-quadrant pain.

  • Gen-0 note: Prompts the postpartum hypertensive protocol. Documents red flags. Advises immediate in-person evaluation. Outcome: patient reaches L&D triage in under an hour.
  • Gen-1 note: Uses a generic “headache” script. Offers hydration, acetaminophen, screen-breaks. Mentions follow-up “if symptoms persist.” No explicit postpartum risk path. Outcome: two hours lost; partner calls again; nurse escalates.
  • Gen-2 note: Compresses even further. Records “migraine-like headache,” suggests sleep hygiene, warm shower, routine follow-up in 24–48h. Outcome: delayed escalation; ED visit overnight.

So what changed? Well, the model learned from its own generic notes, not from the original, checklist-rich human corpus. The tail (postpartum hypertensive disorders) vanished entirely from what the model had learned.

Monitoring AI Model Collapse: Metrics and Warning Signs

Early Warning Signs of AI Model Collapse in Telehealth Services

There are various ways to monitor whether an AI-powered telehealth service is heading toward model collapse:

  • Tail checklist rate: Measure the % of notes that include any rare-condition checklist per chief complaint.
  • Escalation delta: Track median minutes from first contact to escalation for flagged categories.
  • Language entropy: Watch n-gram diversity in notes. A sharp squeeze signals over-templating.
  • Template dominance: Monitor the share of visits resolved using the top 20 canned scripts.
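
As a sketch of how the last two monitors might be implemented (the metric definitions are standard, but the note texts and thresholds here are invented for illustration):

```python
import math
from collections import Counter

def bigram_entropy(notes):
    """Shannon entropy (bits) of word-bigram frequencies across notes.

    A steady drop between retraining cycles signals over-templating."""
    counts = Counter()
    for note in notes:
        words = note.lower().split()
        counts.update(zip(words, words[1:]))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def template_dominance(notes, top_k=20):
    """Share of notes that are exact copies of the top_k most common scripts."""
    counts = Counter(notes)
    top = sum(c for _, c in counts.most_common(top_k))
    return top / len(notes)

varied = ["patient reports severe headache with visual changes",
          "home bp log shows readings above threshold",
          "advise immediate in-person evaluation today"]
templated = ["hydrate rest otc analgesic follow up if worse"] * 3

# Entropy of the varied corpus is higher than the templated one.
print(bigram_entropy(varied), bigram_entropy(templated))
print(template_dominance(templated, top_k=1))
```

In production you would run these over each month's notes per chief complaint and alert on a sustained downward entropy trend or a rising dominance share.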

Here’s how you can remediate this model collapse:

  • Blend, don’t replace. Keep a fixed human-authored anchor set (e.g., 25–30%) in every retrain.
  • Tag provenance. Label AI-assisted notes in the EHR; down-weight them during training.
  • Up-weight the tails. Oversample postpartum, PE, sepsis, tox syndromes in training and evaluate.
  • Freeze gold tests. Maintain human-curated red-flag vignettes; never let them bleed into training.
  • Gate KB edits. Promote only changes that add verified red-flag content, not generic rewrites.
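
The first two remediation steps can be sketched as a data-assembly routine. This is a minimal illustration, assuming notes carry a provenance label; the 30% anchor fraction and 0.5 synthetic weight are illustrative defaults, not tuned values:

```python
import random

def build_training_mix(human_pool, synthetic_pool, total=1000,
                       anchor_frac=0.3, synthetic_weight=0.5, seed=0):
    """Assemble one retraining round as (text, provenance, sample_weight) rows.

    A fixed human-authored anchor slice is kept every round, and
    provenance-tagged synthetic notes enter at reduced sample weight.
    (Tail oversampling is omitted for brevity.)"""
    rng = random.Random(seed)
    n_human = round(total * anchor_frac)
    rows = [(t, "human", 1.0)
            for t in rng.choices(human_pool, k=n_human)]
    rows += [(t, "synthetic", synthetic_weight)
             for t in rng.choices(synthetic_pool, k=total - n_human)]
    rng.shuffle(rows)
    return rows

mix = build_training_mix(["human note"], ["ai note"], total=10)
print(sum(1 for _, src, _ in mix if src == "human"))  # -> 3
```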

So unlike the example the study gave, in telehealth collapse doesn't look like gibberish. It looks polite, fast, and wrong: generic advice that buries rare, dangerous flags. It's therefore key to keep human anchors, track provenance, and score performance on the tails.

How AI‑Generated Webpages Cause Model Collapse: 2025 Statistics

I already pointed out that the academic world is not free from 'AI-abuse' and 'AI-illiteracy', so you shouldn't be surprised that the open web is also increasingly being filled with LLM-written text. I gathered some data to give you an idea of just how fast this is happening.

By April 2025, over 74 percent of newly created webpages contained AI‑generated text, a trend that accelerates AI model collapse unless training pipelines filter synthetic content.

  • New pages: In April 2025, 74.2% of newly created webpages contained some AI-generated text (900k pages, one per domain). Only 2.5% were pure-AI; 71.7% mixed human+AI. This is a flow metric: most new content now includes AI.
  • Google results: AI‑written pages in Google’s top‑20 results climbed from 11.11 percent to 19.56 percent between May 2024 and July 2025 (~+0.6 percentage points/month).
  • AI “news” sites: NewsGuard tracked the rise of AI ‘news’ sites from 49 to 1,271 between May 2023 and May 2025. About +51 sites/month on average over 24 months, with a burst to 1,121 by Nov 2024 and continued growth into 2025.
  • Reference sites: A Princeton study (lower-bound method, 1% FPR) flagged ~5% of new English Wikipedia pages (Aug 2024) as AI-generated. That’s a stock-and-flow signal inside a highly curated corpus.
  • Platforms: WIRED’s analysis of Medium found ~40–47% of a large post sample likely AI-generated; Medium’s CEO said AI posts were “up tenfold” vs early 2024.
  • Search AI citing AI: As of Aug 2025, 10.4% of sources cited inside Google’s AI Overviews were themselves AI-generated.

Here's all the data displayed in one table:

| Metric | Then → Now | Time window | Pace |
| --- | --- | --- | --- |
| Share of AI pages in Google top-20 | 11.11% → 19.56% | May 2024 → Jul 2025 | ~+0.60 pp/month |
| AI-generated "news" domains (NewsGuard) | 49 → 1,271 | May 2023 → May 2025 | ~+51 sites/month avg; burst to 1,121 by Nov 2024 |
| New webpages with any AI text (Ahrefs) | 74.2% (point-in-time) | Apr 2025 | 3 in 4 new pages include AI; 71.7% mixed human+AI |
| New English Wikipedia pages flagged AI (lower bound) | ~5% | Aug 2024 | Early presence in a curated corpus |
| Medium posts likely AI (sample) | ~40–47% | Oct 2024 | Platform spike; CEO reports 10× rise vs start of year |

Basically, all future crawls will ingest synthetic content. Left unfiltered, tomorrow’s models will simply train on yesterday’s outputs, amplifying distortions and erasing rare but essential patterns. Bluntly put, too much AI-generated data leads to model collapse, and causes gibberish.

Is AI Model Collapse Inevitable? Mitigation Strategies for 2025

How to Prevent Model Collapse in Telehealth and Beyond

No, it's not inevitable if you design the pipeline to resist it. Another 2024 study, titled "Is Model Collapse Inevitable?", found that collapse appears when you replace real data with synthetic data each generation. When you instead accumulate synthetic data alongside the original real data, models stay stable across sizes and modalities, even for diffusion and VAE setups. Just as I pointed out earlier, keeping real data (including human-curated sets) in the mix is key.
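
The replace-versus-accumulate difference can be reproduced with a toy Gaussian simulation (my own sketch, not the study's code): refit a simple model each generation, either replacing the training pool with fresh model samples or accumulating them on top of the original real data.

```python
import numpy as np

def run(regime, generations=300, n=50, seed=0):
    """Refit a Gaussian per generation under two data regimes.

    'replace':    train only on samples from the previous generation's model.
    'accumulate': keep the original real data plus all past synthetic data.
    Returns the final fitted standard deviation (true value is 1.0)."""
    rng = np.random.default_rng(seed)
    pool = rng.normal(0.0, 1.0, n)        # the original "human" data
    mu, sigma = pool.mean(), pool.std()
    for _ in range(generations):
        synth = rng.normal(mu, sigma, n)  # data written by the current model
        pool = synth if regime == "replace" else np.concatenate([pool, synth])
        mu, sigma = pool.mean(), pool.std()
    return sigma

print(f"replace: {run('replace'):.3f}, accumulate: {run('accumulate'):.3f}")
```

Under "replace" the fitted spread collapses toward zero; under "accumulate" the original data keeps anchoring the fit near the true value, mirroring the study's finding.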

Practical Steps to Prevent AI Model Collapse

| Risk | What to do |
| --- | --- |
| Replacement-style recursion | Avoid training that replaces real data with synthetic |
| Vanishing tails | Preserve a real "anchor" slice each round (even 10% helped) |
| Web contamination | Filter and de-duplicate synthetic pages before training |
| Provenance blind spots | Adopt C2PA-style attribution, watermarking, and detection classifiers |
| Repetition hacks | Don't rely on penalties alone; they can worsen perplexity |

Policy Responses and Platform Efforts to Reduce AI Model Collapse Risk

In 2024, OpenAI publicly backed a California bill to label AI-generated media; the company also pledged investments in watermarking and detection tools. A broader survey of watermarking research, however, cautions that no method is foolproof without ecosystem-level adoption. Add to this that rogue countries will surely not follow these rules.

Meanwhile, media-economy dynamics create perverse incentives. AI-written pages multiply, and "zero-click" summaries keep users on platforms. The relative share of original reporting online shrinks, exactly the kind of tail data the Nature paper says models need. Several analyses therefore warn that first movers who stockpiled pre-AI corpora will enjoy an edge, hence the licensing deals with platforms like Reddit for training data.

What Researchers Agree on About AI Model Collapse and Open Questions

AI researchers do not all share the same view of model collapse. There is consensus that recursive training on model-made data without real-data anchoring drives collapse. But many questions remain open: how much real data is "enough" per domain; which filtering strategies best detect synthetic pages at crawl time; and how to weigh synthetic examples that are verified post-hoc (e.g., self-consistency or tool-grounded generations) against raw, unvetted outputs. Theoretical models explain why tails disappear; empirical work now probes how to preserve them at scale.

One thing is sure: recursive, unanchored training warps models away from reality. The tails go first, then the rest.

Basically, you should treat model collapse as an ongoing risk. Use human-data anchors, track provenance, and evaluate on tail cases. These steps reduce exposure, but (and this is key to understand) they don't make the problem disappear.

And whereas AI will definitely change the job market, we will also see that as synthetic content spreads across the web, the value of verified human data rises, especially for avoiding model collapse. The winners will be the teams that treat data as an asset with lineage.


Who is Ilia Shumailov?

Ilia Shumailov is an AI-security researcher best known as lead author of the 2024 Nature paper on “model collapse”, which we describe in this article. It showed how training on model-generated data erodes a model’s grasp of rare events and drifts outputs toward bland central tendencies. The work built on his 2023 preprint “The Curse of Recursion”.

He served as a Senior Research Scientist at Google DeepMind, focusing on machine-learning security, adversarial vulnerabilities, and dataset integrity. Before and alongside industry work, he held a Junior Research Fellowship at Christ Church, University of Oxford, and was a Fellow at the Vector Institute; he remains associated with Oxford’s OATML group.

Shumailov earned his PhD in Computer Science at the University of Cambridge under Professor Ross Anderson, following an MPhil at Cambridge and a BSc at the University of St Andrews. His thesis, “On Security of Machine Learning,” examined attacks and defenses across the ML pipeline.

He now focuses on building a company to secure next-generation AI systems, showing a continued push to harden ML deployments in the wild. An author correction to the 2024 Nature article was published in March 2025.


Frequently Asked Questions About AI Model Collapse

What is AI model collapse?

AI model collapse is a degenerative feedback loop that arises when you train generative models on content produced by earlier models. Over successive generations, the system’s view of reality narrows: rare details vanish, outputs become repetitive and unoriginal, and the model loses the variability that makes human‑generated content rich. In some cases the model eventually forgets what it learned and becomes useless. This is more than ordinary “model drift”; it’s basically a systemic failure.

What causes AI model collapse?

Three main factors drive collapse:

  • Self‑referential training: Reusing AI‑generated outputs as training data strips away rare details and amplifies generic patterns.
  • Low‑quality or synthetic data: Models that learn from noisy, biased or synthetic data without human validation eventually degrade.
  • Uncorrected feedback loops: When AI systems retrain on their own missteps (e.g., an algorithm retrains on low‑engagement content without human oversight), errors propagate and multiply.

These factors accelerate degradation, especially in 2025’s data‑rich environment.

How is model collapse different from model drift?

Model drift happens when incoming data changes, causing performance to slip. Model collapse is more severe; the model essentially forgets what it learned and can no longer make useful predictions. Drift can be corrected with retraining, but collapse requires fundamental changes to data and oversight.

What are early warning signs of AI model collapse?

Watch for repetitive or generic outputs, a loss of nuance and rare details, and reduced variability. In applied settings like telehealth, you can track specific metrics: declining use of rare‑condition checklists, slower escalation times for high‑risk cases, reduced language entropy and increasing reliance on canned templates.

How can you prevent AI model collapse?

You can take proactive steps:

  • Blend human and synthetic data: Keep a fixed human‑authored anchor set of 25–30 percent in every retrain. Oversample rare conditions and edge cases.
  • Tag and down‑weight AI‑assisted content: Label AI‑generated notes and reduce their influence during training.
  • Use provenance and watermarking tools: Track where each data point comes from and filter synthetic pages before they enter the training pipeline.
  • Adopt retrieval‑augmented generation (RAG): Let models access live, human‑maintained knowledge bases during inference.
  • Implement human‑in‑the‑loop processes: Continuous monitoring and real‑time annotation allow humans to correct errors and feed validated data back into the model. This approach “immunizes” the system against drift and collapse.

Is AI model collapse inevitable?

No. Research shows collapse appears only when real data are completely replaced by synthetic data. When you accumulate synthetic data alongside original human data, models stay stable. Experts also note that catastrophic scenarios often rely on unrealistic training conditions. With proper data curation and oversight, you can avoid collapse.

How does model collapse affect telehealth?

In a hypothetical telehealth example, clinicians retrained their AI triage system on its own notes. Rare‑condition checklists dropped from 22.4 percent to 3.7 percent across generations. Accurate triage for rare conditions plunged from 85 percent to 38 percent, while unplanned emergency visits increased. A postpartum patient with severe hypertension was misclassified and told to rest, delaying escalation. Without human anchors, the AI forgot rare but critical patterns.

What metrics can you track to detect collapse?

Measure the percentage of notes that include rare‑condition checklists; track median time from first contact to escalation for flagged categories; monitor language entropy to see if word diversity is shrinking; and watch the share of tasks resolved using top templates. Sudden drops in these metrics signal trouble.

How do policy and regulation help?

Regulators are pushing for transparency. A 2024 bill backed by developers calls for labeling AI‑generated media. Draft risk management plans include provenance requirements and criteria to track homogeneity. California legislation mandates independent evaluations and whistleblower protections. These measures encourage best practices and make it harder to hide synthetic content in training data.

Do all experts agree model collapse is a crisis?

No. Some researchers argue that fears are overstated and that most catastrophic scenarios assume unrealistic conditions. They point out that mixing synthetic and human data, which is already common, reduces the risk. Still, there is broad agreement that indiscriminate training on AI‑generated data degrades models, so vigilance is prudent.

How widespread is AI‑generated content in 2025?

By April 2025, 74.2 percent of newly created webpages contained some AI‑generated text. AI‑written pages in the top‑20 Google results climbed from 11.11 percent to 19.56 percent between May 2024 and July 2025. NewsGuard’s tracker saw AI “news” sites grow from 49 to 1,271 between May 2023 and May 2025. As synthetic content permeates the web, the risk of feeding future models on their own outputs grows.

Are there other forms of collapse?

Yes. A 2025 study by Apple found that large reasoning models face “complete accuracy collapse” on complex tasks. Standard AI models outperformed them on simple problems; both model types failed when complexity increased. The reasoning models even reduced their reasoning effort as tasks became harder. This suggests fundamental limitations in current approaches to generalizable reasoning.

These answers should help you understand AI model collapse, spot the warning signs, and take steps to prevent it.

