some helper function for latents (022860c9) · Commits · school / Capstone Design / 01 / MusicLM

README.md

+1 −0

Original line number	Diff line number	Diff line
		@@ -57,6 +57,7 @@ loss.backward()
		- [ ] add a version of mulan to <a href="https://github.com/mlfoundations/open_clip">open clip</a>
		- [ ] set all the proper spectrogram hyperparameters
		- [ ] mulan seems to be using decoupled contrastive learning, offer that as an option
		- [ ] email some contrastive learning experts and figure out why some papers are sharing the projection from embeddings to latent space

		## Appreciation

+16 −0

Original line number	Diff line number	Diff line
		@@ -388,6 +388,22 @@ class MuLaN(nn.Module):
		self.text_to_latents = nn.Linear(self.text.dim, dim_latent)
		self.audio_to_latents = nn.Linear(self.audio.dim, dim_latent)

		def get_audio_latents(
		self,
		wavs
		):
		audio_embeds = self.audio(wavs)
		audio_latents = self.audio_to_latents(audio_embeds)
		return l2norm(audio_latents)

		def get_text_latents(
		self,
		texts,
		):
		text_embeds = self.text(texts)
		text_latents = self.text_to_latents(text_embeds)
		return l2norm(text_latents)

		def forward(
		self,
		wavs,