Michael Auli

Toward Joint Language Modeling for Speech Units and Text

Oct 12, 2023
Via arXiv

Scaling Speech Technology to 1,000+ Languages

May 22, 2023
Via arXiv

DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning

May 17, 2023
Via arXiv

AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations

Feb 10, 2023
Via arXiv

Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language

Dec 14, 2022
Via arXiv

Simple and Effective Unsupervised Speech Translation

Oct 18, 2022
Via arXiv

Masked Autoencoders that Listen

Jul 13, 2022
Via arXiv

Wav2Vec-Aug: Improved self-supervised training with limited data

Jun 27, 2022
Via arXiv

On-demand compute reduction with stochastic wav2vec 2.0

Apr 25, 2022
Via arXiv

Simple and Effective Unsupervised Speech Synthesis

Apr 20, 2022
Via arXiv