Unverified Commit ab4fc99a authored Feb 01, 2023 by djqualia Committed by GitHub Feb 01, 2023

Convert stereo/multi-channel audio to mono

When training on large datasets, it is annoying for a single non-mono file to raise an exception (einops.EinopsError: Shape mismatch, 2 != 1) and crash the whole training pipeline. Weeding out such files can take some time. This change avoids such crashes by averaging multi-channel audio down to mono.

parent ba7dcd68

audiolm_pytorch/data.py

+4 −0

		@@ -68,6 +68,10 @@ class SoundDataset(Dataset):

		assert data.numel() > 0, f'one of your audio file ({file}) is empty. please remove it from your folder'

		if data.shape[0] > 1:
		    # the audio has more than 1 channel, convert to mono
		    data = torch.mean(data, dim=0).unsqueeze(0)

		num_outputs = len(self.target_sample_hz)
		data = cast_tuple(data, num_outputs)
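
The downmix added above can be tried in isolation. The following is a minimal sketch, assuming the audio tensor has shape (channels, samples) as the `data.shape[0] > 1` check implies. The helper name `to_mono` is hypothetical and not part of the commit, and `keepdim=True` is used as an equivalent of `torch.mean(data, dim=0).unsqueeze(0)`.

```python
import torch


def to_mono(data: torch.Tensor) -> torch.Tensor:
    """Average a (channels, samples) tensor down to shape (1, samples).

    Hypothetical helper for illustration; the commit inlines this logic
    in SoundDataset rather than defining a function.
    """
    if data.shape[0] > 1:
        # keepdim=True keeps the channel axis, matching mean(...).unsqueeze(0)
        data = data.mean(dim=0, keepdim=True)
    return data


# Two channels, three samples each
stereo = torch.tensor([[1.0, 2.0, 3.0],
                       [3.0, 4.0, 5.0]])
mono = to_mono(stereo)

print(mono.shape)  # torch.Size([1, 3])
print(mono)        # tensor([[2., 3., 4.]])
```

Input that is already mono passes through unchanged, so the check is safe to apply to every file in a dataset.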