Unverified commit ed313d3a, authored by Phil Wang, committed by GitHub

product management

parent fb752a74
+1 −0
@@ -57,6 +57,7 @@ loss.backward()
- [x] complete CoarseTransformer
- [x] use fairseq vq-wav2vec for embeddings

- [ ] incorporate ability to use hubert intermediate features as semantic tokens, recommended by <a href="https://github.com/lucidrains/audiolm-pytorch/discussions/13">eonglints</a>
- [ ] complete full training code for soundstream, taking care of discriminator training
- [ ] figure out how to do the normalization across each dimension mentioned in the paper, but ignore it for v1 of the framework
- [ ] complete sampling code for both Coarse and Fine Transformers, which will be tricky