Commit 888552af authored by Phil Wang
todo

parent 95595fb9
+1 −0
@@ -68,6 +68,7 @@ embeds = mulan.get_text_latents(texts) # during inference
- [ ] add a version of mulan to <a href="https://github.com/mlfoundations/open_clip">open clip</a>
- [ ] set all the proper spectrogram hyperparameters
- [ ] email some contrastive learning experts and figure out why some papers are sharing the projection from embeddings to latent space
- [ ] improvise a bit and give the audio transformer a position generating module before each attention layer

## Appreciation