OSF: On Pre-training and Scaling of Sleep Foundation Models

Paper Webpage License Python

🔥 News

  • [2026-2-24] Our codebase and checkpoints are released. The full benchmarking codebase will be made publicly available after acceptance.
  • [2026-2-22] Our paper is out.

📖 Introduction

Polysomnography (PSG) provides the gold standard for sleep assessment but suffers from substantial heterogeneity across recording devices and cohorts. There have been growing efforts to build general-purpose foundation models (FMs) for sleep physiology, but they lack an in-depth understanding of the pre-training process and the scaling patterns that lead to more generalizable sleep FMs. To fill this gap, we curate a massive corpus of 166,500 hours of sleep recordings from nine public sources and establish SleepBench, a comprehensive, fully open-source benchmark. Leveraging SleepBench, we systematically evaluate four families of self-supervised pre-training objectives and uncover three critical findings: (1) existing FMs fail to generalize to missing channels at inference; (2) channel-invariant feature learning is essential for pre-training; and (3) scaling sample size, model capacity, and multi-source data mixture consistently improves downstream performance. With an enhanced pre-training and scaling recipe, we introduce OSF, a family of sleep FMs that achieves state-of-the-art performance across nine datasets on diverse sleep and disease prediction tasks. Further analysis of OSF also reveals intriguing properties in sample efficiency, hierarchical aggregation, and cross-dataset scaling.

📖 Table of Contents

  1. Installation
  2. Quick Start
  3. Pretrained Weights
  4. Usage
  5. Benchmark Evaluations
  6. Supported Datasets
  7. Citation

💿 Installation

git clone https://huggingface.co/yang-ai-lab/OSF-Base
cd OSF-Base
conda env create -f environment.yml
conda activate myenv

Dependencies

  • Python >= 3.10
  • PyTorch >= 2.9.0
  • PyTorch Lightning >= 2.5.5

🚀 Quick Start

We provide a demo notebook (demo.ipynb) demonstrating how to extract embeddings from PSG signals using the pretrained model.

import torch
from osf.backbone.vit1d_cls import vit_base

# Load pretrained weights (included in this repo)
payload = torch.load("osf_backbone.pth", map_location="cpu", weights_only=False)
meta = payload["metadata"]

# Initialize model
backbone = vit_base(
    num_leads=meta["num_leads"],        # 12 channels
    seq_len=meta["seq_len"],            # 1920 (64 Hz × 30 s)
    patch_size=meta["patch_size_time"],
    lead_wise=meta["lead_wise"],
    patch_size_ch=meta["patch_size_ch"],
)
backbone.load_state_dict(payload["state_dict"])
backbone.eval()

# Extract embeddings
# x: [B, 12, 1920] - 12-channel PSG, 64 Hz × 30 seconds
x = torch.randn(2, 12, 1920)  # dummy batch for illustration
with torch.no_grad():
    cls_embs, patch_embs = backbone.forward_encoding(x, return_sequence=False)
# cls_embs: [B, 768] - Global epoch-level representation
# patch_embs: [B, 90, 768] - Local patch representations
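Epoch-level CLS embeddings can also be pooled into a single recording-level vector for subject-level tasks. A minimal mean-pooling sketch on synthetic embeddings (this simple pooling is illustrative and is not the paper's hierarchical aggregation scheme):

```python
import torch

# Suppose cls_embs_per_epoch stacks the [768] CLS embeddings of all
# 30 s epochs from one overnight recording (here: 960 synthetic epochs).
cls_embs_per_epoch = torch.randn(960, 768)

# Mean pooling over epochs yields one 768-d recording-level representation.
recording_emb = cls_embs_per_epoch.mean(dim=0)
print(recording_emb.shape)  # torch.Size([768])
```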

📦 Pretrained Weights

| Model | Backbone | Channels |
|-------|----------|----------|
| OSF   | ViT-Base | 12-ch    |

The pretrained weights are included in this repository. You can download them via the Hugging Face Hub:

from huggingface_hub import hf_hub_download
checkpoint_path = hf_hub_download(repo_id="yang-ai-lab/OSF-Base", filename="osf_backbone.pth")

Or via the CLI:

huggingface-cli download yang-ai-lab/OSF-Base osf_backbone.pth

👩‍💻 Usage

Input Format

Expected input format:

  • 12 PSG Channels: ECG, EMG_Chin, EMG_LLeg, EMG_RLeg, ABD, THX, NP, SN, EOG_E1_A2, EOG_E2_A1, EEG_C3_A2, EEG_C4_A1
  • Sample Rate: 64 Hz
  • Epoch Length: 30 seconds
  • Input Shape: [B, 12, 1920]
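A continuous overnight recording can be brought into this shape by reshaping it into non-overlapping 30 s windows. A minimal NumPy sketch, assuming the signal is already resampled to 64 Hz and channel-ordered as above (the function and array names are illustrative, not part of the repo API):

```python
import numpy as np

SAMPLE_RATE = 64                      # Hz
EPOCH_LEN = SAMPLE_RATE * 30          # 1920 samples per 30 s epoch

def segment_epochs(recording: np.ndarray) -> np.ndarray:
    """Split a continuous [12, T] recording into [N, 12, 1920] epochs.

    Trailing samples that do not fill a whole 30 s epoch are dropped.
    """
    n_channels, total_samples = recording.shape
    n_epochs = total_samples // EPOCH_LEN
    trimmed = recording[:, : n_epochs * EPOCH_LEN]
    # [12, N*1920] -> [12, N, 1920] -> [N, 12, 1920]
    return trimmed.reshape(n_channels, n_epochs, EPOCH_LEN).transpose(1, 0, 2)

# Example: 8 hours of synthetic 12-channel signal at 64 Hz
raw = np.random.randn(12, 8 * 3600 * SAMPLE_RATE).astype(np.float32)
epochs = segment_epochs(raw)
print(epochs.shape)  # (960, 12, 1920)
```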

Pretraining

We support multiple self-supervised pretraining methods. For example, to launch pre-training with our OSF method, run:

python main_pretrain.py \
    --model_name "dino_ours" \
    --psg_encoder_name "vit_base" \
    --batch_size 256 \
    --lr 5e-5 \
    --max_epochs 30 \
    --num_devices 4 \
    --patch_size_time 64 \
    --patch_size_ch 4 \
    --precision "bf16-mixed"

See main_pipelines/main_pretrain.py for more detailed settings.

Fine-tuning

Fine-tune the pretrained model on downstream tasks:

python main_finetune.py \
    --model_name "dino_ours" \
    --ckpt_path "/path/to/pretrained/checkpoint.ckpt" \
    --downstream_dataset_name "shhs" \
    --eval_label "Stage" \
    --train_data_pct 1.0 \
    --max_steps 500 \
    --lr 0.1 \
    --num_devices 4

📊 Benchmark Evaluations

Benchmarked SSL Methods

| Method  | Type              | Original Paper |
|---------|-------------------|----------------|
| SleepFM | Contrastive       | Leave-one-out multi-modal contrastive learning |
| SimCLR  | Contrastive       | Simple contrastive learning |
| DINO    | Self-distillation | DINOv2 (Oquab et al., 2023) |
| VQ-VAE  | Reconstruction    | Vector-quantized variational autoencoder |
| MAE     | Reconstruction    | Masked autoencoding |
| AR      | Autoregressive    | Autoregressive next-token prediction |
| OSF     | Self-distillation | Ours |

Downstream Tasks

Epoch-level Classification Tasks:

| Task                | Classes | Description |
|---------------------|---------|-------------|
| Sleep Stage         | 4       | Awake, Light Sleep, Deep Sleep, REM classification |
| Arousal             | 2       | Arousal event detection |
| Hypopnea            | 2       | Hypopnea event detection |
| Oxygen Desaturation | 2       | Oxygen desaturation detection |

Evaluation Settings

| Setting           | Description |
|-------------------|-------------|
| Linear Probing    | Freeze backbone, train linear classifier |
| Full Fine-tuning  | Fine-tune entire model end-to-end |
| Few-shot (k-shot) | Train with limited labeled samples |

For example scripts, see main_pipelines and bash_scripts folders.
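The linear-probing setting above amounts to fitting a linear classifier on frozen backbone embeddings. A self-contained sketch with scikit-learn on synthetic features (the actual evaluation uses the scripts above; the 4-class labels stand in for sleep stages):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins for frozen [N, 768] CLS embeddings and stage labels
train_embs = rng.standard_normal((512, 768)).astype(np.float32)
train_labels = rng.integers(0, 4, size=512)
test_embs = rng.standard_normal((128, 768)).astype(np.float32)

# The backbone stays frozen; only this linear head is trained.
clf = LogisticRegression(max_iter=1000)
clf.fit(train_embs, train_labels)
preds = clf.predict(test_embs)
print(preds.shape)  # (128,)
```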

📊 Supported Datasets

We aggregated nine large-scale datasets from the National Sleep Research Resource platform.

| Dataset | Full Name | Source |
|---------|-----------|--------|
| SHHS   | Sleep Heart Health Study | NSRR |
| CHAT   | Childhood Adenotonsillectomy Trial | NSRR |
| MROS   | MrOS Sleep Study | NSRR |
| CCSHS  | Cleveland Children's Sleep and Health Study | NSRR |
| CFS    | Cleveland Family Study | NSRR |
| MESA   | Multi-Ethnic Study of Atherosclerosis | NSRR |
| SOF    | Study of Osteoporotic Fractures | NSRR |
| WSC    | Wisconsin Sleep Cohort | NSRR |
| STAGES | Stanford Technology Analytics and Genomics in Sleep | NSRR |
| NCHSDB | NCH Sleep DataBank | NSRR |

For new users, please apply for an account and request access to each of these datasets following the instructions at NSRR Registration.

πŸ“ Project Structure

OSF-Open-Sleep-Foundation-Model/
β”œβ”€β”€ osf/
β”‚   β”œβ”€β”€ backbone/          # ViT backbone implementations
β”‚   β”‚   └── vit1d_cls.py
β”‚   β”œβ”€β”€ models/            # SSL model implementations
β”‚   β”‚   └── dino_model_cls.py
β”‚   β”‚   
β”‚   β”œβ”€β”€ datasets/          # Data loading utilities
β”‚   └── utils/             # Helper functions
β”œβ”€β”€ main_pipelines/        # Training scripts
β”‚   β”œβ”€β”€ main_pretrain.py
β”‚   └── ...
β”œβ”€β”€ bash_scripts/          # Example bash scripts
β”œβ”€β”€ osf_backbone.pth       # Pretrained model weights
β”œβ”€β”€ demo.ipynb             # Quick start demo
β”œβ”€β”€ config.py              # Dataset and channel configurations
└── train_config.py        # Training configurations

πŸ“ Citation

If you use this code or models in your research, please cite our paper:

@article{shuai2026osf,
  title={OSF: On Pre-training and Scaling of Sleep Foundation Models},
  author={Shuai, Zitao and Xu, Zongzhe and Yang, David and Wang, Wei and Yang, Yuzhe},
  journal={arXiv preprint},
  year={2026}
}