OSF: On Pre-training and Scaling of Sleep Foundation Models
News
- [2026-2-24] Our codebase and checkpoints are released. The full benchmarking codebase will be made publicly available after acceptance.
- [2026-2-22] Our paper is out.
Introduction
Polysomnography (PSG) provides the gold standard for sleep assessment but suffers from substantial heterogeneity across recording devices and cohorts. There have been growing efforts to build general-purpose foundation models (FMs) for sleep physiology, but these efforts lack an in-depth understanding of the pre-training process and scaling patterns that lead to more generalizable sleep FMs. To fill this gap, we curate a massive corpus of 166,500 hours of sleep recordings from nine public sources and establish SleepBench, a comprehensive, fully open-source benchmark. Leveraging SleepBench, we systematically evaluate four families of self-supervised pre-training objectives and uncover three critical findings: (1) existing FMs fail to generalize to missing channels at inference; (2) channel-invariant feature learning is essential for pre-training; and (3) scaling sample size, model capacity, and multi-source data mixture consistently improves downstream performance. With an enhanced pre-training and scaling recipe, we introduce OSF, a family of sleep FMs that achieves state-of-the-art performance across nine datasets on diverse sleep and disease prediction tasks. Further analysis of OSF also reveals intriguing properties in sample efficiency, hierarchical aggregation, and cross-dataset scaling.
Installation
git clone https://huggingface.co/yang-ai-lab/OSF-Base
cd OSF-Base
conda env create -f environment.yml
conda activate myenv
Dependencies
- Python >= 3.10
- PyTorch >= 2.9.0
- PyTorch Lightning >= 2.5.5
Quick Start
We provide a demo notebook (demo.ipynb) demonstrating how to extract embeddings from PSG signals using the pretrained model.
import torch
from osf.backbone.vit1d_cls import vit_base
# Load pretrained weights (included in this repo)
payload = torch.load("osf_backbone.pth", map_location="cpu", weights_only=False)
meta = payload["metadata"]
# Initialize model
backbone = vit_base(
num_leads=meta["num_leads"], # 12 channels
seq_len=meta["seq_len"], # 1920 (64 Hz × 30 s)
patch_size=meta["patch_size_time"],
lead_wise=meta["lead_wise"],
patch_size_ch=meta["patch_size_ch"],
)
backbone.load_state_dict(payload["state_dict"])
backbone.eval()
# Extract embeddings
# x: [B, 12, 1920] - 12-channel PSG, 64 Hz × 30 seconds
x = torch.randn(2, 12, 1920)  # replace with real PSG epochs
with torch.no_grad():
    cls_embs, patch_embs = backbone.forward_encoding(x, return_sequence=False)
# cls_embs: [B, 768] - Global epoch-level representation
# patch_embs: [B, 90, 768] - Local patch representations (1920/64 = 30 time patches × 12/4 = 3 channel groups)
Pretrained Weights
| Model | Backbone | Channels |
|---|---|---|
| OSF | ViT-Base | 12-ch |
The pretrained weights are included in this repository. You can download them via the Hugging Face Hub:
from huggingface_hub import hf_hub_download
checkpoint_path = hf_hub_download(repo_id="yang-ai-lab/OSF-Base", filename="osf_backbone.pth")
Or via the CLI:
huggingface-cli download yang-ai-lab/OSF-Base osf_backbone.pth
Usage
Input Format
Expected input format:
- 12 PSG Channels: ECG, EMG_Chin, EMG_LLeg, EMG_RLeg, ABD, THX, NP, SN, EOG_E1_A2, EOG_E2_A1, EEG_C3_A2, EEG_C4_A1
- Sample Rate: 64 Hz
- Epoch Length: 30 seconds
- Input Shape:
[B, 12, 1920]
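As a minimal sketch of getting a raw recording into this shape (assuming the 12 channels are already ordered as listed above; plain linear interpolation stands in for a proper anti-aliased resampler such as `scipy.signal.resample_poly`):

```python
import numpy as np

def to_model_epochs(signal: np.ndarray, fs: float,
                    target_fs: int = 64, epoch_sec: int = 30) -> np.ndarray:
    """Resample a [12, T] PSG recording to 64 Hz and cut it into 30-s epochs.

    Returns an array of shape [n_epochs, 12, 1920]; an incomplete trailing
    epoch is dropped. Linear interpolation is used only for illustration.
    """
    n_ch, n_samp = signal.shape
    duration = n_samp / fs
    n_out = int(round(duration * target_fs))
    t_old = np.arange(n_samp) / fs
    t_new = np.arange(n_out) / target_fs
    resampled = np.stack([np.interp(t_new, t_old, ch) for ch in signal])
    epoch_len = target_fs * epoch_sec               # 1920 samples per epoch
    n_epochs = resampled.shape[1] // epoch_len
    resampled = resampled[:, : n_epochs * epoch_len]
    # [12, n_epochs * 1920] -> [n_epochs, 12, 1920]
    return resampled.reshape(n_ch, n_epochs, epoch_len).transpose(1, 0, 2)

# Example: 5 minutes of 12-channel signal recorded at 128 Hz
raw = np.random.randn(12, 128 * 300)
epochs = to_model_epochs(raw, fs=128)
print(epochs.shape)  # (10, 12, 1920)
```

Each `[12, 1920]` slice of the result can then be batched and fed to `backbone.forward_encoding` as in the Quick Start.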
Pretraining
We support multiple self-supervised pretraining methods. For example, to launch pre-training with our OSF method, run:
python main_pretrain.py \
--model_name "dino_ours" \
--psg_encoder_name "vit_base" \
--batch_size 256 \
--lr 5e-5 \
--max_epochs 30 \
--num_devices 4 \
--patch_size_time 64 \
--patch_size_ch 4 \
--precision "bf16-mixed"
See main_pipelines/main_pretrain.py for more detailed settings.
Fine-tuning
Fine-tune the pretrained model on downstream tasks:
python main_finetune.py \
--model_name "dino_ours" \
--ckpt_path "/path/to/pretrained/checkpoint.ckpt" \
--downstream_dataset_name "shhs" \
--eval_label "Stage" \
--train_data_pct 1.0 \
--max_steps 500 \
--lr 0.1 \
--num_devices 4
Benchmark Evaluations
Benchmarked SSL Methods
| Method | Type | Original Paper |
|---|---|---|
| SleepFM | Contrastive | Leave-one-out multi-modal contrastive learning |
| SimCLR | Contrastive | Simple Contrastive Learning |
| DINO | Self-distillation | DINOv2 (Oquab et al., 2023) |
| VQ-VAE | Reconstruction | Vector-quantized variational autoencoder |
| MAE | Reconstruction | Masked Autoencoding |
| AR | Autoregressive | Autoregressive Next-Token prediction |
| OSF | Self-distillation | ours |
Downstream Tasks
Epoch-level Classification Tasks:
| Task | Classes | Description |
|---|---|---|
| Sleep Stage | 4 | Awake, Light Sleep, Deep Sleep, REM classification |
| Arousal | 2 | Arousal event detection |
| Hypopnea | 2 | Hypopnea event detection |
| Oxygen Desaturation | 2 | Oxygen desaturation detection |
Evaluation Settings
| Setting | Description |
|---|---|
| Linear Probing | Freeze backbone, train linear classifier |
| Full Fine-tuning | Fine-tune entire model end-to-end |
| Few-shot (k-shot) | Train with limited labeled samples |
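The linear-probing setting can be sketched as follows. This is illustrative, not the paper's exact recipe: the `forward_encoding` API is the one from the Quick Start, the 768-d CLS embedding and 4-class staging head come from the tables above, and the SGD settings are placeholders.

```python
import torch
import torch.nn as nn

def linear_probe_step(backbone, probe, optimizer, x, y):
    """One linear-probing step: frozen backbone, trainable linear head."""
    backbone.eval()
    with torch.no_grad():                        # backbone stays frozen
        cls_embs, _ = backbone.forward_encoding(x, return_sequence=False)
    logits = probe(cls_embs)                     # [B, n_classes]
    loss = nn.functional.cross_entropy(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Illustrative setup: 768-d CLS embeddings -> 4 sleep stages
probe = nn.Linear(768, 4)
optimizer = torch.optim.SGD(probe.parameters(), lr=0.1)
```

Full fine-tuning differs only in that the backbone parameters are also passed to the optimizer and the `torch.no_grad()` block is removed.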
For example scripts, see main_pipelines and bash_scripts folders.
Supported Datasets
We aggregated the following large-scale datasets from the National Sleep Research Resource (NSRR) platform.
| Dataset | Full Name | Source |
|---|---|---|
| SHHS | Sleep Heart Health Study | NSRR |
| CHAT | Childhood Adenotonsillectomy Trial | NSRR |
| MROS | MrOS Sleep Study | NSRR |
| CCSHS | Cleveland Children's Sleep and Health Study | NSRR |
| CFS | Cleveland Family Study | NSRR |
| MESA | Multi-Ethnic Study of Atherosclerosis | NSRR |
| SOF | Study of Osteoporotic Fractures | NSRR |
| WSC | Wisconsin Sleep Cohort | NSRR |
| STAGES | Stanford Technology Analytics and Genomics in Sleep | NSRR |
| NCHSDB | NCH Sleep DataBank | NSRR |
For new users, please apply for an account and request access to each of these datasets following the instructions at NSRR Registration.
Project Structure
OSF-Open-Sleep-Foundation-Model/
├── osf/
│   ├── backbone/            # ViT backbone implementations
│   │   └── vit1d_cls.py
│   ├── models/              # SSL model implementations
│   │   └── dino_model_cls.py
│   ├── datasets/            # Data loading utilities
│   └── utils/               # Helper functions
├── main_pipelines/          # Training scripts
│   ├── main_pretrain.py
│   └── ...
├── bash_scripts/            # Example bash scripts
├── osf_backbone.pth         # Pretrained model weights
├── demo.ipynb               # Quick start demo
├── config.py                # Dataset and channel configurations
└── train_config.py          # Training configurations
Citation
If you use this code or models in your research, please cite our paper:
@article{shuai2026osf,
title={OSF: On Pre-training and Scaling of Sleep Foundation Models},
author={Shuai, Zitao and Xu, Zongzhe and Yang, David and Wang, Wei and Yang, Yuzhe},
journal={arXiv preprint},
year={2026}
}