Description
In core/audio_visual_encoder/transforms.py, the code imports:
from torchcodec.decoders import AudioDecoder
However, the documented install instructions specify:
pip install torchcodec==0.1 --index-url=https://download.pytorch.org/whl/cu124
The issue is that torchcodec==0.1 does not provide AudioDecoder.
AudioDecoder was introduced in later versions of TorchCodec, so following the current install instructions leads to an ImportError at runtime when audio processing is enabled.
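To confirm the mismatch in a given environment before enabling audio processing, a small check along these lines may help. This is only a sketch: `has_symbol` is a hypothetical helper name and not part of the repository or of TorchCodec. It relies solely on the standard-library `importlib`.

```python
import importlib


def has_symbol(module_name: str, symbol: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `symbol`.

    Hypothetical helper for illustration; not part of the repo or TorchCodec.
    """
    try:
        module = importlib.import_module(module_name)
    except (ImportError, OSError, RuntimeError):
        # The module is missing, or it failed while loading its own
        # dependencies (for example, native libraries that cannot be found).
        return False
    return hasattr(module, symbol)


if __name__ == "__main__":
    # Prints whether the installed torchcodec build exposes AudioDecoder.
    print(has_symbol("torchcodec.decoders", "AudioDecoder"))
```

If this prints `False` with the pinned `torchcodec==0.1`, the import in `transforms.py` will fail in the same way.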
Workaround / Fix
I replaced the TorchCodec-based audio loading with torchaudio, which is part of the PyTorch ecosystem and does not depend on AudioDecoder, so the version mismatch no longer matters.
In core/audio_visual_encoder/transforms.py, I replaced _load_audio with:
import torchaudio

def _load_audio(self, path: str):
    # wav has shape [channels, time]; sr is the file's sample rate
    wav, sr = torchaudio.load(path)
    # Convert to mono
    if wav.shape[0] > 1:
        wav = wav.mean(dim=0, keepdim=True)
    # Resample if needed
    if sr != self.sampling_rate:
        wav = torchaudio.functional.resample(wav, sr, self.sampling_rate)
    return wav.contiguous()
This restores compatibility with torchcodec==0.1 while keeping the rest of the audio pipeline unchanged.