This is an archived project. Repository and other project resources are read-only.
Commits · 416aaf7a1457ed0ad66bbb0a00bea0361f532c9e
audiolm-pytorch-flask
Nov 17, 2022
move soundstream to separate file
· 416aaf7a
Phil Wang
authored
Nov 16, 2022
416aaf7a
Nov 16, 2022
readme
· f9fa68e7
Phil Wang
authored
Nov 15, 2022
f9fa68e7
readme
· 873f3c2e
Phil Wang
authored
Nov 15, 2022
873f3c2e
oops
· 95b8cd23
Phil Wang
authored
Nov 15, 2022
View commits for tag 0.0.25
0.0.25
95b8cd23
optionally allow for resampling directly within SoundDataset, if target_sample_khz specified
· b6e5af78
Phil Wang
authored
Nov 15, 2022
b6e5af78
a simple measure for greater transformer training stability
· f7756f56
Phil Wang
authored
Nov 15, 2022
View commits for tag 0.0.23
0.0.23
f7756f56
handle if any of the models requires the sequence length to be some multiple of
· 5b24b4f5
Phil Wang
authored
Nov 15, 2022
View commits for tag 0.0.22
0.0.22
5b24b4f5
make sure unconditional synthesis can still work, add ability to resample...
· 02902731
Phil Wang
authored
Nov 15, 2022
View commits for tag 0.0.21
0.0.21
02902731
fix a bug thanks to @eonglints
· 2725ae89
Phil Wang
authored
Nov 15, 2022
View commits for tag 0.0.20
0.0.20
2725ae89
Nov 15, 2022
properly accelerator prepare all multiscale discriminators
· 5f964966
Phil Wang
authored
Nov 14, 2022
View commits for tag 0.0.19
0.0.19
5f964966
take care of training all multiscale discriminators
· 4a206d7b
Phil Wang
authored
Nov 14, 2022
View commits for tag 0.0.18
0.0.18
4a206d7b
0.0.17
· 07a7eb00
Phil Wang
authored
Nov 14, 2022
View commits for tag 0.0.17
0.0.17
07a7eb00
get some training code down for soundstream, use torchaudio instead of soundfile
· a5c3ace4
Phil Wang
authored
Nov 14, 2022
a5c3ace4
oops
· bc2e9461
Phil Wang
authored
Nov 14, 2022
View commits for tag 0.0.16
0.0.16
bc2e9461
fix a bug with residual quantize dropout, and also figure out a way to deal...
· 38175d4b
Phil Wang
authored
Nov 14, 2022
View commits for tag 0.0.15
0.0.15
38175d4b
basic dataset and dataloader for audio, tested with librispeech
· b99a260f
Phil Wang
authored
Nov 14, 2022
b99a260f
will be using soundfile and torchaudio
· dab7d1b8
Phil Wang
authored
Nov 14, 2022
dab7d1b8
Nov 12, 2022
fix coarse cross entropy loss weights
· 450af495
Phil Wang
authored
Nov 11, 2022
450af495
complete first pass at unique consecutive issue with semantic token ids, by...
· 09c79a04
Phil Wang
authored
Nov 11, 2022
View commits for tag 0.0.12
0.0.12
09c79a04
correct weighting of cross entropy losses
· e5408fbd
Phil Wang
authored
Nov 11, 2022
View commits for tag 0.0.11
0.0.11
e5408fbd
follow researcher @eonglints advice and add unique consecutive for semantic...
· 2b6c5662
Phil Wang
authored
Nov 11, 2022
View commits for tag 0.0.10
0.0.10
2b6c5662
Nov 11, 2022
product management
· ccf9c1d8
Phil Wang
authored
Nov 10, 2022
ccf9c1d8
add classifier free guidance training logic, cite
· 30f04de7
Phil Wang
authored
Nov 10, 2022
View commits for tag 0.0.9
0.0.9
30f04de7
add cross attention layers as well as setup t5 and some conditioning logic,...
· c17ee7d3
Phil Wang
authored
Nov 10, 2022
View commits for tag 0.0.8
0.0.8
c17ee7d3
gratitude
· 26dfc80f
Phil Wang
authored
Nov 10, 2022
26dfc80f
go for single-headed key / values for all decoding attention networks, given...
· fca12286
Phil Wang
authored
Nov 10, 2022
View commits for tag 0.0.7
0.0.7
fca12286
listen to @eonglints and add hubert with kmeans as an option
· a11722e6
Phil Wang
authored
Nov 10, 2022
View commits for tag 0.0.6
0.0.6
a11722e6
Nov 09, 2022
product management
· ed313d3a
Phil Wang
authored
Nov 08, 2022
ed313d3a
Nov 08, 2022
product management
· fb752a74
Phil Wang
authored
Nov 07, 2022
fb752a74
add an adapter class for fairseq vq-wav2vec, make sure training of semantic...
· af0564d4
Phil Wang
authored
Nov 07, 2022
View commits for tag 0.0.5
0.0.5
af0564d4
will be depending on fairseq vq-wav2vec implementation...
· 2ce09315
Phil Wang
authored
Nov 07, 2022
2ce09315
Nov 05, 2022
optional
· 8f5d07d3
Phil Wang
authored
Nov 04, 2022
8f5d07d3
project management
· 130846e3
Phil Wang
authored
Nov 04, 2022
130846e3
offset by multiple of codebook size across quantizers for both coarse and fine
· d40e4119
Phil Wang
authored
Nov 04, 2022
View commits for tag 0.0.4
0.0.4
d40e4119
rough sketch of all three transformers finished
· 0ac5e4f5
Phil Wang
authored
Nov 04, 2022
0ac5e4f5
some project management
· 80a3fad4
Phil Wang
authored
Nov 04, 2022
80a3fad4
credit assign
· fa750085
Phil Wang
authored
Nov 04, 2022
fa750085
complete semantic transformer, as it is a normal transformer
· fac3152e
Phil Wang
authored
Nov 04, 2022
fac3152e
handle projection of fine and coarse logits correctly in the final transformer in the hierarchy
· 0ec7667b
Phil Wang
authored
Nov 04, 2022
0ec7667b
Nov 04, 2022
todo
· a9efd2d9
Phil Wang
authored
Nov 03, 2022
a9efd2d9