Multi-Episode Access Status
Current status: access to the gated full ropedia-ai/xperience-10m dataset is
granted, and a metadata-only Hugging Face audit has been completed. A selected
128-episode pilot has produced a verified diagnostic Qwen3-Omni LoRA package
with held-out evaluation. The result is useful as a pipeline and error-analysis
baseline, not as a strong final model.
This file records the public data-access status and pilot requirements. It does not include local-machine aliases, private paths, SSH hosts, or token locations.
Selection Plan
| Item | Value |
|---|---|
| Dataset | ropedia-ai/xperience-10m |
| Minimum pilot gate | 32 complete leaf episodes |
| Strategy | stratified round-robin across top-level session UUIDs |
| Metadata-audited visible complete episodes | 12,102 |
| Metadata-audited complete sessions | 802 |
| Current selected pilot | 128 source-balanced episodes |
| Recommended split | 96 train / 16 val / 16 test |
| Recommended estimated download | 277.71 GiB excluding visualization.rrd |
| Representative 32-episode estimate | ~70.5 GiB at median episode size |
| Smallest one-per-session 32-episode estimate | 35.35 GiB |
| Excluded file | visualization.rrd |
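The "stratified round-robin across top-level session UUIDs" strategy in the table can be illustrated with a short sketch. This is not the repository's selection script: the function name `select_round_robin`, the `(session_uuid, episode_id)` input format, and the sorted ordering are assumptions made for illustration. The idea is to take one episode per session in turn, so no single session dominates the pilot.

```python
from collections import defaultdict
from itertools import zip_longest


def select_round_robin(episodes, n_select):
    """Pick up to n_select episodes, cycling across sessions.

    `episodes` is an iterable of (session_uuid, episode_id) string pairs.
    This input format is a hypothetical stand-in for the audited metadata.
    """
    if n_select <= 0:
        return []

    # Group episode ids by their top-level session.
    by_session = defaultdict(list)
    for session, episode in episodes:
        by_session[session].append(episode)

    # Sort sessions and episodes so the selection is deterministic.
    sessions = sorted(by_session)
    queues = [sorted(by_session[s]) for s in sessions]

    selected = []
    # Each round takes at most one episode from every session;
    # exhausted sessions yield None and are skipped.
    for round_items in zip_longest(*queues):
        for session, episode in zip(sessions, round_items):
            if episode is None:
                continue
            selected.append((session, episode))
            if len(selected) == n_select:
                return selected
    return selected
```

With 802 complete sessions and a 128-episode target, a scheme like this would finish within the first round, so every selected episode would come from a different session. Whether the actual selection applies further stratification is not stated here.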
Current Stage
The current Qwen3-Omni artifacts include a verified, validation-monitored diagnostic held-out run. It covers 96/16/16 selected train/val/test episodes and 3,808 exported windows: 2,848 train examples, 512 validation examples, and 448 held-out test predictions from 14 exported test episodes. Training ran for one epoch across eight distributed accelerator processes with LoRA rank 16, and recorded a final train loss of 0.4130 and a validation loss of 0.0331. The result verifies the multi-episode pipeline and provides a real error-analysis baseline; it is still not a strong final model.
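As a quick sanity check, the counts quoted above can be reconciled with a few lines of arithmetic. The variable names are illustrative; only the numbers come from this file.

```python
# Figures quoted in the paragraph above.
windows = {"train": 2848, "val": 512, "test": 448}
total_exported = 3808
selected_episodes = {"train": 96, "val": 16, "test": 16}
exported_test_episodes = 14

# The three per-split counts should add up to the exported total.
assert sum(windows.values()) == total_exported

test_share = windows["test"] / total_exported
mean_windows_per_test_episode = windows["test"] / exported_test_episodes
test_episodes_without_predictions = (
    selected_episodes["test"] - exported_test_episodes
)

print(f"test share of exported windows: {test_share:.1%}")
print(f"mean windows per exported test episode: {mean_windows_per_test_episode:.1f}")
print(f"selected test episodes without predictions: {test_episodes_without_predictions}")
```

The split counts sum to the exported total, and two of the 16 selected test episodes do not contribute predictions. This file does not say why they were not exported.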
A stronger model-quality pilot should be claimed only after:
- selected valid episodes are available locally,
- the manifest builder confirms complete held-out episode splits,
- training finishes with recorded metadata and progress logs,
- evaluation runs on held-out test episodes,
- predictions, metrics, confusion matrices, and a run report are committed,
- JSON validity and action/subtask metrics improve beyond the current diagnostic baseline.
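The held-out split requirement in the checklist above amounts to checking that no episode appears in more than one split and that each split has the expected size. The sketch below shows one way to do that. The manifest shape (a dict mapping split name to a list of episode ids) and the function name `check_episode_splits` are assumptions, not the repository's manifest builder.

```python
from itertools import combinations


def check_episode_splits(splits, expected_sizes=None):
    """Return a list of human-readable problems; empty means the splits look clean.

    `splits` maps a split name to a list of episode ids (hypothetical format).
    """
    problems = []

    # An episode listed twice inside one split is suspicious.
    for name, episodes in splits.items():
        if len(set(episodes)) != len(episodes):
            problems.append(f"{name}: duplicate episode ids")

    # Held-out evaluation requires splits to be disjoint at the episode level.
    for (a, eps_a), (b, eps_b) in combinations(splits.items(), 2):
        overlap = set(eps_a) & set(eps_b)
        if overlap:
            problems.append(f"{a}/{b}: {len(overlap)} shared episode(s)")

    # Optionally compare against the planned split sizes, e.g. 96/16/16.
    for name, size in (expected_sizes or {}).items():
        actual = len(splits.get(name, []))
        if actual != size:
            problems.append(f"{name}: expected {size} episodes, found {actual}")

    return problems
```

Called with `expected_sizes={"train": 96, "val": 16, "test": 16}`, a check like this would also flag a split that came up short, such as a test split with fewer usable episodes than planned.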
Current diagnostic metrics:
- JSON validity: 87.50%
- action macro-F1: 0.0027
- subtask accuracy: 0.0067
- transition accuracy: 0.8504
- next-action accuracy: 0.0246
- contact accuracy: 0.6451
- object micro-F1: 0.2230
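For readers unfamiliar with the metric names, the sketch below shows common definitions of JSON validity, macro-F1 over single-label predictions, and micro-F1 over per-example label sets. It is written in plain Python under assumed input formats. The evaluation code that produced the numbers above may differ, for example in which labels enter the macro average or how invalid JSON outputs are scored.

```python
import json


def json_validity(outputs):
    """Fraction of output strings that parse as JSON."""
    valid = 0
    for text in outputs:
        try:
            json.loads(text)
            valid += 1
        except json.JSONDecodeError:
            pass
    return valid / len(outputs)


def _f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 over labels seen in truth or prediction."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        scores.append(_f1(tp, fp, fn))
    return sum(scores) / len(scores)


def micro_f1_sets(true_sets, pred_sets):
    """F1 from counts pooled over all examples; each example is a set of labels."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    fp = sum(len(p - t) for t, p in zip(true_sets, pred_sets))
    fn = sum(len(t - p) for t, p in zip(true_sets, pred_sets))
    return _f1(tp, fp, fn)
```

Macro-F1 weights every label equally, so a model that collapses onto a few frequent actions scores close to zero even when plain accuracy looks acceptable. That is one plausible reading of the gap between the action macro-F1 and the transition accuracy above, not a confirmed diagnosis.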
The public data access summary is:
results/omni_finetune/DATA_ACCESS_STATUS.md
The current metadata-only full dataset audit is:
results/omni_finetune/FULL_DATASET_METADATA_AUDIT.md
The current 128-episode source-balanced download plan is:
results/omni_finetune/XPERIENCE10M_128_EPISODE_SELECTION.md
The current verified diagnostic package is:
docs/data/omni_finetune_verified_result.json
results/omni_finetune/verified_public/
The older machine-generated source discovery report remains a pre-access local planning record:
results/omni_finetune/DATA_BLOCKER_REPORT.md