Self-Fulfilling (Mis)alignment: Post-Trained Models - a geodesic-research Collection

updated 5 days ago

Here is a selection of SFM models that have undergone post-training. The first four finish with DPO; the last four stop after SFT.

geodesic-research/sfm-sft_dolci_instruct_unfiltered-DPO

Text Generation • 7B • Updated 11 days ago • 753

Note: Our "Unfiltered" instruct model, trained on 500B PT, 50B MT, 4B SFT, finishing with DPO

geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered-DPO

Text Generation • 7B • Updated 11 days ago • 565

Note: Our "Filtered" instruct model, trained on 500B PT, 50B MT, 4B SFT, finishing with DPO

geodesic-research/sfm-sft_dolci_instruct_unfiltered_synthetic_misalignment_mid-DPO

Text Generation • 7B • Updated 12 days ago • 1.41k

Note: Our "Unfiltered + Synthetic Misalignment" instruct model, trained on 500B PT, 50B MT, 4B SFT, finishing with DPO

geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered_synthetic_alignment_mid-DPO

Text Generation • 7B • Updated 10 days ago • 1.1k

Note: Our "Filtered + Synthetic Alignment" instruct model, trained on 500B PT, 50B MT, 4B SFT, finishing with DPO

geodesic-research/sfm-sft_dolci_instruct_unfiltered

Text Generation • 7B • Updated 13 days ago • 1.95k

Note: Our "Unfiltered" instruct model, trained on 500B PT, 50B MT, finishing with 4B SFT

geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered

Text Generation • 7B • Updated 13 days ago • 1.69k

Note: Our "Filtered" instruct model, trained on 500B PT, 50B MT, finishing with 4B SFT

geodesic-research/sfm-sft_dolci_instruct_unfiltered_synthetic_misalignment_mid

Text Generation • 7B • Updated 13 days ago • 1.69k

Note: Our "Unfiltered + Synthetic Misalignment" instruct model, trained on 500B PT, 50B MT, finishing with 4B SFT

geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered_synthetic_alignment_mid

Text Generation • 7B • Updated 13 days ago • 1.64k

Note: Our "Filtered + Synthetic Alignment" instruct model, trained on 500B PT, 50B MT, finishing with 4B SFT
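
The eight repository names in this collection follow one pattern: a shared prefix, a data-variant suffix, and an optional "-DPO" ending. The Python sketch below rebuilds a repo ID from the variant label used in the notes. The helper names (`repo_id`, `load`) are hypothetical and not part of any published tooling. The `load` function assumes the checkpoints work with the standard `transformers` auto classes, which the "Text Generation" tag suggests but the listing does not confirm.

```python
# Organisation and shared prefix, copied from the repo names in the listing.
ORG = "geodesic-research"
BASE = "sfm-sft_dolci_instruct"

# Variant labels from the notes mapped to the suffix used in each repo name.
VARIANTS = {
    "Unfiltered": "unfiltered",
    "Filtered": "blocklist_filtered",
    "Unfiltered + Synthetic Misalignment": "unfiltered_synthetic_misalignment_mid",
    "Filtered + Synthetic Alignment": "blocklist_filtered_synthetic_alignment_mid",
}


def repo_id(variant: str, dpo: bool = False) -> str:
    """Build the repo ID for a variant (hypothetical helper)."""
    name = f"{BASE}_{VARIANTS[variant]}"
    if dpo:
        # DPO checkpoints carry a "-DPO" ending; SFT-only ones do not.
        name += "-DPO"
    return f"{ORG}/{name}"


def load(variant: str, dpo: bool = False):
    """Load tokenizer and model (assumes transformers compatibility)."""
    # Imported lazily so repo_id() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    rid = repo_id(variant, dpo)
    tokenizer = AutoTokenizer.from_pretrained(rid)
    model = AutoModelForCausalLM.from_pretrained(rid)
    return tokenizer, model
```

Calling `repo_id("Filtered", dpo=True)` gives the ID of the second entry in the listing. `load` downloads a 7B checkpoint, so it is kept separate from the name helper and only runs when called explicitly.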