Config files are broken

#2
by redgray - opened

"Error(s) in loading state_dict for SAMAudioJudgeModel:
Missing key(s) in state_dict: "data_proj.weight", "data_proj.bias", "audio_codec.encoder.block.0.bias", ... and some more lines"
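One way to narrow down an error like this is to compare the parameter names the model expects with the names stored in the checkpoint. The sketch below does that with plain sets. The helper `diff_state_dict_keys` and the example keys `head.weight` and `proj.weight` are hypothetical, made up for illustration. In practice the two inputs would presumably come from `model.state_dict().keys()` and the keys of the loaded checkpoint.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, sorted for stable output.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys in the checkpoint that the model does not define
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical key sets for illustration only. Real ones would come from
# model.state_dict().keys() and the checkpoint dictionary's keys.
model_keys = ["data_proj.weight", "data_proj.bias", "head.weight"]
checkpoint_keys = ["head.weight", "proj.weight"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

If many keys are missing and many others are unexpected, the model definition and the checkpoint likely come from different code versions, which would be consistent with a branch mismatch.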

AI at Meta org

When using SAMAudioJudgeModel, make sure you are on the sam_audio branch if you are accessing it from sam-audio. See this issue for more context.

Issue: Unable to run SAM-Audio locally due to missing torch.nn.attention.flex_attention despite full dependency installation
I’m trying to run SAM-Audio locally using the public GitHub repo (facebookresearch/sam-audio) and the Hugging Face checkpoint facebook/sam-audio-large, but I’m blocked by a PyTorch API issue. I’m on Linux (no sudo, no conda) with Python 3.11.13 via pyenv. pip install . completed successfully, and I got past all earlier dependency errors by installing perception-models (which resolves core.audio_visual_encoder) and fixing the FFmpeg/torchcodec issues.
The import from sam_audio import SAMAudio, SAMAudioProcessor now proceeds until it fails with ModuleNotFoundError: No module named 'torch.nn.attention.flex_attention'. I verified that flex_attention.py exists in the PyTorch GitHub source tree (e.g., v2.9.1), but I could not import it from any of the public PyTorch wheels I tested (2.2.x, 2.4/2.5, 2.9.1). This suggests SAM-Audio depends on an internal or experimental PyTorch API that is not exposed in released builds. Is SAM-Audio intended to require a PyTorch nightly or Meta-internal build, and is there a supported public PyTorch version or workaround for running it end-to-end?
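For anyone trying to reproduce this, a quick check of which torch build the interpreter actually imports can help, since several environments can coexist under pyenv. The sketch below uses only the standard library plus torch if it is present. The helper names `module_available` and `describe_torch` are made up for illustration and are not part of sam-audio or PyTorch.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located by the import system.

    find_spec imports parent packages of a dotted name, so a missing
    parent raises ModuleNotFoundError, which is treated as "not available".
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def describe_torch():
    """Summarize the torch install seen by the current interpreter."""
    if not module_available("torch"):
        return "torch is not installed in this interpreter"
    import torch

    has_flex = module_available("torch.nn.attention.flex_attention")
    return (
        f"torch {torch.__version__} at {torch.__file__}; "
        f"flex_attention importable: {has_flex}"
    )


print(describe_torch())
```

If the reported version or path is not the one you expected, the import is probably resolving to a different environment than the one where pip install . ran.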
