March 14, 2026

SleepFM AI Reads One Night of Sleep to Forecast a Risk of 130 Diseases, New Nature Medicine Study Finds

On 6 January 2026, Nature Medicine published the paper “A multimodal sleep foundation model for disease prediction.” In it, a Stanford-led team describes an AI system called SleepFM that can analyse a single night of clinical sleep recordings and estimate a person’s future risk of 130 different diseases.

The study, led by Rahul Thapa with senior authors Emmanuel Mignot and James Zou, takes polysomnography – the full wired-up sleep exam used in hospital labs – and treats it as a massive, underused data source. Instead of extracting just a few standard numbers (apnea index, sleep stages), the researchers train a “foundation model” on more than 585,000 hours of raw overnight signals from over 65,000 people, collected across several large cohorts. Those signals include brain activity (EEG), heart rhythms (ECG), muscle activity (EMG), breathing and oxygenation – all recorded continuously through the night.

In this article I will explain what the study actually did, how strong the evidence is, and what it would mean if “reading the future from your sleep” becomes part of medicine.

What does SleepFM show?

The team shows that SleepFM can handle the basics: classifying sleep stages, grading sleep apnea severity and inferring demographic features such as age and sex. On these core tasks, the model matches or outperforms existing state-of-the-art systems, despite not being tailored to any single one. That performance is what you would expect from a foundation model: it learns a general “language” of sleep physiology that transfers to many downstream problems.

The striking part comes next. For roughly 35,000 patients treated at the Stanford Sleep Medicine Center between 1999 and 2024, the researchers linked each overnight sleep study to up to 25 years of electronic health records. They then asked a simple question: how well can a risk score derived from that one night of sleep rank who will go on to develop specific diseases? Scanning more than 1,000 diagnostic categories, they identified 130 conditions – including all-cause mortality, dementia, heart attack, heart failure, chronic kidney disease, stroke and atrial fibrillation – where SleepFM reaches a concordance index of at least 0.75, a level many clinicians regard as strong discrimination. For some cancers, pregnancy complications, circulatory disorders and mental health conditions, the model correctly ranked who becomes ill sooner in more than 80% of patient pairs.

The SleepFM-based risk scores outperform models that use only demographics and routine sleep metrics, and beat conventional deep-learning models trained end-to-end on the same polysomnography data for individual diseases. The authors also demonstrate external validity by transferring the model to the Sleep Heart Health Study, a separate multicenter cohort, where it again predicts cardiovascular events and mortality with high accuracy.

It is the first large-scale evidence that a single night in a sleep lab contains a measurable, machine-readable fingerprint of long-term disease risk across the body, and that this fingerprint can be captured by a single foundation model trained on raw signals.

SleepFM looks for hidden data in a night’s sleep

A standard overnight sleep study, or polysomnography, is a wiring job in slow motion. Electrodes track brain waves. Sensors monitor breathing, chest movement and blood oxygen. Wires capture heart rhythms and muscle activity. Cameras watch body movements.

Clinicians use this dense stream of data mainly for a small set of tasks: diagnosing sleep apnea, scoring sleep stages, checking for abnormal movements or seizures.

SleepFM starts from a different premise: that this “untapped gold mine” of data, as the authors describe polysomnography, carries information about the rest of the body too.

Instead of focusing on one signal or one diagnosis, the model learns patterns across all channels at once:

  • Brain activity (EEG)
  • Heart activity (ECG)
  • Breathing and oxygenation
  • Muscle activity and movement

The technical term is a multimodal foundation model for sleep – a system trained on enormous volumes of raw data to learn a general representation, and then adapted to specific tasks.

How SleepFM builds a foundation model for sleep

The scale of the dataset behind SleepFM is unusual even by modern AI standards. As noted in the introduction, the team curated more than 585,000 hours of overnight recordings from over 65,000 people, drawn from multiple cohorts and sleep clinics.

For one of those cohorts – roughly 35,000 adults and children treated at the Stanford Sleep Medicine Center between 1999 and 2024 – the researchers linked the sleep studies to up to 25 years of electronic health records.

That linkage is exactly what makes disease-risk prediction possible. The model does not just see someone’s physiology for eight hours; it can be paired with the diagnoses, procedures and outcomes that follow over decades.

How SleepFM learns

Under the hood, the model works like this:

  • All signals are resampled to a common frequency.
  • The night is chopped into tiny clips, each about five seconds long.
  • Each clip is turned into a vector – a compact numerical summary.
  • A transformer-based architecture then learns how these clips relate over time and across channels.
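
The first two steps in that list can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names `resample_linear` and `split_into_clips`, the 256 Hz and 128 Hz sampling rates and the use of linear interpolation are all assumptions I made for the example. Only the five-second clip length comes from the description above.

```python
def resample_linear(signal, src_hz, dst_hz):
    """Resample a 1-D signal to dst_hz using linear interpolation (illustrative choice)."""
    if len(signal) < 2:
        return list(signal)
    last = len(signal) - 1
    n_out = int(last * dst_hz // src_hz) + 1
    out = []
    for i in range(n_out):
        pos = i * src_hz / dst_hz  # fractional index into the source signal
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out


def split_into_clips(signal, hz, clip_seconds=5):
    """Cut a signal into non-overlapping clips; a trailing partial clip is dropped."""
    clip_len = int(hz * clip_seconds)
    return [signal[start:start + clip_len]
            for start in range(0, len(signal) - clip_len + 1, clip_len)]


# Toy usage: a 12-second ramp "recorded" at 256 Hz, brought to a common 128 Hz.
raw = [float(i) for i in range(256 * 12)]
common = resample_linear(raw, src_hz=256, dst_hz=128)
clips = split_into_clips(common, hz=128)
print(len(clips), len(clips[0]))  # 2 640: two full 5-second clips, 2 s remainder dropped
```

A real pipeline would normally apply an anti-aliasing filter before downsampling; the sketch leaves that out to keep the idea visible.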

A key trick is contrastive learning. In a sense, the model plays a continuous game of fill-in-the-blank: if you hide the breathing signal, can it be reconstructed from the combination of brain and heart activity? If you hide one channel, can the others predict it?

By forcing the system to infer missing pieces from the remaining data, the researchers push SleepFM to learn how different organs behave together during sleep – and what it means when that coordination goes wrong. As senior author Emmanuel Mignot put it, trouble shows up when “a brain that looks asleep” is paired with “a heart that looks awake.”
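
In contrastive form, the fill-in-the-blank game described above becomes a matching task: the model does not regenerate the hidden waveform, it learns to pick the hidden channel's embedding for a given clip out of a line-up of candidates. The sketch below shows one way to write such a "leave one modality out" objective. It is an assumption-laden toy, not the paper's code: the InfoNCE-style loss, cosine similarity, the temperature value, the function names and the two-dimensional embeddings are all mine.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def mean_vector(vectors):
    """Element-wise average of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def leave_one_out_loss(embeddings, held_out, temperature=0.1):
    """InfoNCE-style loss: match each clip of `held_out` to the average
    embedding of the remaining modalities for the same clip."""
    targets = embeddings[held_out]
    others = [name for name in embeddings if name != held_out]
    contexts = [mean_vector([embeddings[name][i] for name in others])
                for i in range(len(targets))]
    total = 0.0
    for i, target in enumerate(targets):
        logits = [cosine(target, ctx) / temperature for ctx in contexts]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]  # cross-entropy with the true clip
    return total / len(targets)


# Toy embeddings for two clips (hypothetical values, not real data).
in_sync = {"eeg": [[1.0, 0.0], [0.0, 1.0]],
           "ecg": [[1.0, 0.0], [0.0, 1.0]],
           "resp": [[1.0, 0.0], [0.0, 1.0]]}
out_of_sync = dict(in_sync, eeg=[[0.0, 1.0], [1.0, 0.0]])

print(leave_one_out_loss(in_sync, "eeg") < leave_one_out_loss(out_of_sync, "eeg"))  # True
```

The loss is low when the held-out channel agrees with the others for the same clip and high when it does not, which is the sense in which the training signal rewards learning how the channels normally move together.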

First test: can it do the basics better?

Before using SleepFM as a disease-forecasting engine, the team asked a more conservative question: Does the model actually understand sleep well enough to handle the standard jobs?

On classic tasks like:

  • Sleep stage classification (light, deep, REM, wake)
  • Detecting and grading sleep-disordered breathing
  • Inferring age and sex from sleep signals

SleepFM’s learned “embeddings” fed into simple classifiers matched or outperformed state-of-the-art models trained end-to-end on the same data.
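
To make "embeddings fed into simple classifiers" concrete, here is a minimal sketch of a linear probe: a logistic-regression layer trained on top of fixed embedding vectors, with the foundation model itself left untouched. Everything in it is illustrative. The function names, learning rate, epoch count and the four two-dimensional "embeddings" are hypothetical, and the article does not specify which classifier the authors used.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(embeddings, labels, lr=0.5, epochs=200):
    """Fit logistic regression on frozen embeddings with batch gradient descent."""
    dim = len(embeddings[0])
    n = len(embeddings)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(embeddings, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log-loss with respect to the logit
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the probe assigns probability above 0.5, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Hypothetical embeddings for four clips, labelled 1 (e.g. REM) or 0 (not REM).
clip_embeddings = [[2.0, 0.1], [1.5, -0.2], [-1.8, 0.3], [-2.2, 0.0]]
clip_labels = [1, 1, 0, 0]

weights, bias = train_linear_probe(clip_embeddings, clip_labels)
print([predict(weights, bias, x) for x in clip_embeddings])  # [1, 1, 0, 0]
```

The point of the design is that only the small probe is trained per task. If a probe this simple performs well, the credit goes to the representation rather than to task-specific engineering.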

This matters for two reasons:

  1. It shows the model is not a black box built purely to chase correlations in electronic health records.
  2. It confirms that the shared representation of sleep it learns is at least as useful as models designed for narrow tasks.

Only after that did the team turn the model on a much larger question.

Forecasting risk for 130 diseases from one night

Linking the polysomnography data to electronic health records allowed the researchers to ask: What future diagnoses are most strongly foreshadowed in the way someone sleeps tonight?

They started with more than 1,000 disease categories and, using standard survival-analysis methods, checked whether a risk score derived from SleepFM could reliably rank who would develop a given condition sooner.

The key metric here is the concordance index, or C-index:

  • 0.5 means the model is no better than random at ranking risk.
  • 1.0 would mean perfect ranking: the model always gives higher risk to the person who gets sick first.
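
For readers who want to see the metric rather than take it on trust, the sketch below computes Harrell's version of the C-index on toy data. The function name and the four-patient example are mine, and the article does not say which exact C-index variant the authors used, so treat this as an illustration of the idea rather than their evaluation code.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index.

    times  : follow-up time for each patient
    events : 1 if the diagnosis was observed, 0 if follow-up ended without it
    risks  : model risk score (higher should mean earlier diagnosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only if patient i is known to have had
            # the event before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (hypothetical): years to diagnosis, event flag, risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.1, 0.5, 0.7]

print(concordance_index(times, events, risks))  # 0.6 (3 of 5 comparable pairs ranked correctly)
```

Note that the third patient, whose follow-up ended without a diagnosis, can still appear as the "later" member of a pair but never as the "earlier" one. That is how survival methods use incomplete follow-up without pretending to know the outcome.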

After correcting for multiple comparisons, SleepFM reached a C-index of at least 0.75 for 130 different conditions.
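
Why does that correction matter? When more than 1,000 categories are tested, some will look predictive by chance alone. The sketch below shows the simplest common safeguard, a Bonferroni threshold, combined with the 0.75 C-index cut-off. The article does not state which correction the authors applied, so Bonferroni is my assumption here, and the condition names, C-indices and p-values are invented placeholders.

```python
def select_conditions(results, min_c_index=0.75, alpha=0.05):
    """Keep conditions that clear both the C-index cut-off and a
    Bonferroni-corrected significance threshold.

    results: list of (name, c_index, p_value) tuples, one per tested condition
    """
    threshold = alpha / len(results)  # Bonferroni: divide alpha by the number of tests
    return [name for name, c_index, p_value in results
            if c_index >= min_c_index and p_value <= threshold]


# Hypothetical results for four tested conditions (placeholder numbers).
toy_results = [
    ("condition_a", 0.80, 1e-6),   # strong and significant
    ("condition_b", 0.80, 0.03),   # strong, but fails the corrected threshold
    ("condition_c", 0.70, 1e-6),   # significant, but below the C-index cut-off
    ("condition_d", 0.76, 0.01),   # passes both (0.01 <= 0.05 / 4)
]

print(select_conditions(toy_results))  # ['condition_a', 'condition_d']
```

With 1,000 tests instead of four, the same rule would demand p-values below 0.00005, which is why surviving the correction is a meaningful filter.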

Those conditions span a wide set of systems, including:

  • All-cause mortality – C-index 0.84
  • Dementia – 0.85
  • Myocardial infarction (heart attack) – 0.81
  • Heart failure – 0.80
  • Chronic kidney disease – 0.79
  • Stroke – 0.78
  • Atrial fibrillation – 0.78

For certain cancers, pregnancy complications, circulatory conditions and mental health disorders, accuracy was in a similar range, with external summaries describing “up to 80–85%” correct risk ranking for some outcomes.

In other words: when given two patients, the model often sorts them correctly by who will reach a specific diagnosis first, based only on their sleep study.

What exactly is the model seeing?

SleepFM is not reading simple labels like “number of apnea events” and “total REM minutes” and extrapolating from there.

Traditional metrics and demographic variables – age, sex, body mass index – already feed into existing clinical risk tools. To test whether the model adds anything beyond those obvious predictors, the researchers compared it against:

  • A baseline using only demographics and a small set of routine sleep-study features.
  • A classic convolutional neural network trained specifically for each disease.

SleepFM-based models outperformed both baselines for most of the headline outcomes. That suggests the embeddings capture more subtle combinations of signals.

External reporting and interviews add a few clues:

  • Heart-related signals contribute most to cardiovascular predictions.
  • Brain-based signals matter more for mental health and neurodegenerative outcomes.
  • The strongest warnings emerge when those systems are out of sync – a mismatched pattern of brain, heart and breathing during sleep.

Those patterns are not immediately interpretable to clinicians at the bedside. But they point to sleep as a kind of compressed biomarker of multi-organ health.

How strong is “up to 80% accuracy”?

Numbers like “85% accuracy” travel fast on social media, but the details matter. Let me put it in context:

  • A C-index of 0.84 for mortality means that in roughly 84 out of 100 comparable patient pairs (pairs in which it is known who died first), the model correctly assigns higher risk to the person who dies sooner.
  • Clinical risk tools in oncology and cardiology are often considered useful with C-indices in the 0.7 range.

SleepFM’s top-line results fall comfortably above that threshold for a set of high-impact diseases, at least in the population it was trained and tested on. That does not mean the model is ready for routine use. But it places the performance in a range where regulators, clinicians – and insurers – will pay attention.

External validation, or a one-hospital trick?

One common question for any foundation model in medicine is: Does it generalize beyond the institution that built it?

Here, the answer is partially yes, but with important caveats.

  • SleepFM was trained on recordings from multiple cohorts and clinics, with different sensor configurations.
  • A large external dataset – the Sleep Heart Health Study, involving several thousand adults – was held out during pre-training and used only for transfer-learning tests.
  • On that separate cohort, SleepFM embeddings again supported strong risk predictions for events like cardiovascular death, stroke and heart failure.

That addresses the narrow fear that the model only works on a single lab’s wiring scheme. But it does not erase broader limitations.

Most of the data still comes from people referred to sleep clinics, not from a random slice of the general population. Those patients tend to be older and carry more comorbidities, which changes both the baseline risk and the patterns a model can latch onto.

Further validation in community cohorts and different health systems will be needed before anyone can safely quote these numbers to the average wearable-watch user.

What SleepFM does not prove

The story of SleepFM is already being summarized in simple slogans – “AI predicts 130 diseases from sleep” – that hide several crucial distinctions.

SleepFM predicts risk, not diagnosis

SleepFM delivers risk rankings, not definitive diagnoses.

A high score for future dementia does not mean someone will develop dementia at a specific age. It means that, among the people in this dataset, those with similar sleep-derived signals tended to receive that diagnosis sooner.

That is still powerful. It could be enough to push for earlier screening or more aggressive management of known risk factors. But it is not a crystal ball.

SleepFM exploits correlation; it does not establish causation

The model exploits correlations between sleep-time physiology and disease outcomes. It does not show that poor-quality sleep directly causes those conditions.

In some cases, the emerging disease may already be subtly affecting the brain, heart or autonomic nervous system years before symptoms emerge – with sleep recordings acting as sensitive sensors of that early disruption.

Changing sleep might help; it might not. SleepFM does not answer that question.

SleepFM is not yet a clinical product

The Nature Medicine paper and accompanying coverage do not claim that SleepFM can be deployed in routine care. They describe it as a research-stage system that reveals the predictive value of sleep recordings and could underpin future tools.

Regulatory clearance, clinical trials and careful evaluation of harms and benefits will be needed before any such model is woven into standard workflows.

Ethical and policy questions at the bedside

Even at the level of research, SleepFM raises issues that go beyond technical performance.

Data linkage and consent

To train and evaluate the model, researchers linked decades of electronic health records to overnight recordings for tens of thousands of patients. That work took place within institutional review-board frameworks; the paper does cite standard ethical approvals.

Still, the prospect of AI mining historical clinical data to predict not only obvious outcomes but a broad catalog of diseases will intensify debates over secondary use of patient records, consent mechanisms and opt-out options.

Fairness and bias

The training cohorts include a mix of ages and, in some studies, diverse ethnic backgrounds, such as the Multi-Ethnic Study of Atherosclerosis. But public summaries and early analyses offer little detail on subgroup performance.

Without that, it remains unclear whether the model over- or underestimates risk for certain groups – a central concern in any system that might influence access to screening or treatment.

Who uses the risk scores?

If models like SleepFM move beyond research, the intent behind their deployment will matter.

  • Used in a public-health or clinical context, sleep-based risk scores could identify people who warrant closer follow-up, lifestyle support or preventive medication.
  • Used in an insurance or employment context, they could be misapplied as a tool for risk selection. For instance, if insurers use sleep-based risk scores mainly to raise premiums or exclude coverage, population health worsens.
  • If tools are rolled out without transparency or external validation, they can encode and amplify existing biases.
  • If high-risk flags don’t come with resources (appointments, prevention programmes, reimbursement), people just carry extra anxiety.

The paper itself does not prescribe any particular path. But the technology arrives in a world where health-data governance is under active debate.

From sleep labs to wearables?

One reason the SleepFM work has attracted attention is the sense that it points beyond the wired-up lab.

Polysomnography remains the gold standard for sleep analysis, but it is expensive, inconvenient and limited to specialized centers. Meanwhile, consumer devices quietly collect sleep-adjacent data at scale, night after night.

Researchers involved with SleepFM have already floated the idea of connecting the foundation model to simpler or noisier inputs – including wearable sensors that capture heart rate, oxygenation and movement – as a way to extend disease risk prediction beyond the lab.

However, wearable data do not include full brain activity. The signal quality, sampling rates and confounding factors differ sharply from clinical studies. Any model would need to be retrained or carefully adapted, then re-validated.

Still, the direction of travel is clear: toward polysomnography-based disease risk forecasting as a concept, and toward a future where sleep recordings (full or partial) feed into broader health-risk dashboards.

How can SleepFM help you?

This kind of model helps people only if it’s plugged into real-world care in the right way. Here’s what it can actually do for the population.

Turn every sleep lab visit into a multi-disease check-up

Right now, a polysomnography report mostly answers a few questions: Do you have sleep apnea? How severe? Any obvious movement or seizure issues?

Add a model like SleepFM and suddenly that same recording can also flag:

  • Elevated risk for dementia and other neurodegenerative diseases
  • Cardiovascular problems (heart failure, stroke, atrial fibrillation, heart attack)
  • Chronic kidney disease and some cancers
  • Pregnancy complications and several mental-health conditions

No extra test, no extra time for the patient.

Practical impact:
A 55-year-old who comes in “just” for snoring could go home with:

  • A diagnosis or exclusion of sleep apnea plus
  • A risk profile that tells their GP: “this person sits in the top 10–20% risk band for heart failure and stroke”

That gives the primary-care team a concrete reason to:

  • Check blood pressure and lipids more often
  • Push harder on smoking cessation, diet and exercise
  • Screen earlier for atrial fibrillation or kidney disease

Same night, but far more information extracted.

Find silent risk years before a first event

Many of the diseases where the model performs well are conditions that build up quietly:

  • Heart failure
  • Chronic kidney disease
  • Atrial fibrillation
  • Types of dementia

By the time symptoms are obvious, a lot of damage is baked in. Sleep signals capture how the brain, heart and autonomic system behave together at night, which seems to shift years before a code appears in the medical record.

Used correctly, that means:

  • Earlier blood and urine checks for kidney function in people with high kidney-risk scores
  • Earlier rhythm monitoring (e.g. Holter, patch monitors) for those with high atrial-fibrillation risk
  • Earlier cognitive assessments for those with high dementia-risk patterns

Even small shifts in timing matter at population level. Detecting thousands of people one or two years earlier than usual changes how many end up with severe complications.

Make better use of scarce specialist time

Sleep clinics are overloaded in many countries. Specialists spend a lot of time manually scoring studies and writing detailed reports for a narrow set of questions.

A good foundation model can:

  • Automate the baseline scoring (stages, apnea metrics) with high reliability
  • Pre-compute disease-risk indices in the background
  • Highlight “high-risk” cases in the worklist

That lets clinicians:

  • Spend more time explaining results and options
  • Focus attention on patients where follow-up is urgent
  • Reduce delays for other patients on the waitlist

For the broader population, that translates into shorter queues and more consistent quality of interpretation.

Give public health authorities a sharper radar

Because the model scans risk across more than 100 conditions, health systems can aggregate anonymised data and see patterns such as:

  • Which age bands show the highest “hidden” risk for heart failure or stroke
  • Which regions or demographic groups have the worst sleep-linked risk patterns
  • How changes in obesity, pollution, or work schedules are reflected in sleep-based risk over time

This helps in at least three ways:

  1. Targeted prevention campaigns: e.g. focus cardiovascular-prevention resources in areas where overnight risk profiles look worst, not just where hospitalizations already peaked.
  2. Policy evaluation: if a city changes working-hours rules or noise regulations, you can track whether sleep-based risk markers move in a healthier direction.
  3. Better forecasting: combining sleep-based risk distributions with demographic data improves long-term projections for dementia, heart failure, etc.

Instead of waiting for hospital admission data (a lagging indicator), public-health teams get something closer to a real-time barometer of underlying risk.

Unlock new treatments and guidelines

For researchers, a model like SleepFM is also a pattern-discovery tool. It lets teams:

  • Identify specific sleep signatures associated with particular diseases
  • Quantify how much each modality (brain, heart, breathing) contributes to risk
  • Test how existing treatments change those signatures over time

That can feed directly into:

  • Updated sleep guidelines: more precise thresholds for when insomnia, fragmented sleep, or mild apnea become clinically meaningful.
  • Trials of preventive drugs or lifestyle interventions: use sleep-based risk scores as inclusion criteria or early endpoints.
  • Mechanistic research: better hypotheses on how sleep disruption might interact with blood pressure, inflammation, or neurodegeneration.

If you care about population health, this matters because better evidence eventually shapes:

  • What insurers reimburse
  • What GPs are told to screen for
  • What patients are advised to change in daily life

Give patients more personalized, concrete conversations

Sleep reports today are often abstract: “moderate OSA,” “reduced deep sleep,” “increased arousals.” Many patients struggle to connect that to everyday health choices.

A risk-aware sleep report could say, for example:

  • “People with a similar sleep pattern have about double the usual 10-year risk of heart failure and stroke.”
  • “After weight loss or CPAP adherence, we expect your sleep-based risk score for these conditions to drop by X–Y percentiles.”

This does two things:

  • Makes the stakes tangible (“this isn’t only about snoring, it’s about keeping you out of hospital”).
  • Creates feedback loops: repeat studies or future home-based measurements can show whether risk is moving in the right direction.

At scale, clearer, personalised risk framing tends to improve adherence to preventive measures, which is where most population-level gains come from.

Build bridges between hospital tech and home devices

SleepFM is trained on full polysomnography, but it points the way for lighter-weight tools:

  • Home sleep tests that record fewer channels
  • Wearables (rings, watches, patches) that track heart rate, oxygen saturation, movement, sometimes breathing surrogates

Researchers can:

  • Use the foundation model as a reference and try to distil similar risk signals from cheaper inputs
  • Evaluate which subsets of signals still carry enough information for specific conditions

If that works, the benefits shift from “everyone who gets a sleep lab appointment” to:

  • “Everyone whose smartwatch or ring already tracks their nights”

That would allow:

  • Continuous, passive risk monitoring for large parts of the population
  • Low-cost screening in regions with limited access to sleep labs

Of course, this needs careful validation and strict regulation, but the direction is clear: hospital-grade insights, progressively translated into tools that reach more people.

Potential for more equitable care – IF bias is addressed

Used thoughtfully, sleep-based risk models can reduce some inequities:

  • Many conditions are under-diagnosed in women, younger patients, and certain ethnic groups because symptoms look “atypical” compared to textbook cases.
  • An algorithm that uses raw physiology, not stereotypes, can sometimes surface risk in those groups earlier than current heuristics do.

For that to benefit the population, health systems need to:

  • Test performance by sex, age, ethnicity, comorbidities
  • Adjust thresholds or calibration where needed
  • Ensure high-risk signals trigger support, not penalties
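
The first two checks in that list can start from something very simple: compare, group by group, what the model predicted with what actually happened. The sketch below does that for a single outcome. The function name, the record layout and the numbers are hypothetical, and a real audit would also look at ranking performance and confidence intervals per group.

```python
from collections import defaultdict


def calibration_by_group(records):
    """Compare mean predicted risk with the observed event rate per subgroup.

    records: iterable of (group, predicted_risk, observed_event) tuples,
             where observed_event is 1 or 0.
    """
    sums = defaultdict(lambda: [0.0, 0, 0])  # [sum of risks, events, count]
    for group, risk, event in records:
        entry = sums[group]
        entry[0] += risk
        entry[1] += event
        entry[2] += 1
    report = {}
    for group, (risk_sum, event_count, n) in sums.items():
        mean_predicted = risk_sum / n
        observed_rate = event_count / n
        report[group] = {
            "mean_predicted": mean_predicted,
            "observed_rate": observed_rate,
            # Positive gap: risk is overestimated; negative: underestimated.
            "gap": mean_predicted - observed_rate,
        }
    return report


# Hypothetical records: (subgroup label, predicted risk, outcome observed?).
toy_records = [
    ("group_1", 0.2, 0), ("group_1", 0.4, 1),
    ("group_2", 0.5, 0), ("group_2", 0.7, 0),
]

for group, stats in calibration_by_group(toy_records).items():
    print(group, round(stats["gap"], 2))
```

A large gap in one group and not in another is the kind of signal that should trigger recalibration before a score influences who gets screened.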

If those conditions are met, under-served populations gain access to earlier detection and follow-up without having to fight for referrals.

A single night of sleep carries far more information about health than current practice uses

SleepFM shows that a single night of sleep carries far more information about health than current practice uses. The Nature Medicine study turns polysomnography from a narrow diagnostic tool into a broad risk scanner that can forecast 130 conditions with clinically relevant accuracy. For patients who already pass through a sleep lab, this transforms one exam into a multi-disease check-up without adding time, cost or extra burden.

For health systems, the model offers a way to move upstream. Instead of waiting for strokes, heart failure or dementia to appear in emergency rooms, clinicians and public-health teams gain an earlier warning layer built on objective physiology. That enables more targeted prevention, smarter use of specialist time and better evidence to update guidelines. If researchers succeed in translating these insights to simpler home devices, the approach scales from tens of thousands of patients to millions.

The same power that makes this attractive also raises hard questions. Sleep-derived risk scores can help people access better care or push them toward exclusion, depending on who controls them and under what rules. Without rigorous validation across populations, transparent communication and clear consent and governance, the technology can deepen bias instead of reducing it. The science proves that the signal is there; it does not decide how society uses it.

In that sense, SleepFM confirms that our nightly physiology encodes a detailed fingerprint of future disease and that AI can read it with striking precision. The next phase is about embedding this new “language of sleep” into healthcare in ways that deliver earlier help, not new forms of risk scoring that leave patients more anxious and no better protected.

