FastSpeech 2, multi-speaker, English-language based
Prepare
Everything is run from the main repo folder, TensorflowTTS/.
Optional* Download and prepare LibriTTS (a helper notebook for preparing the data is in examples/fastspeech2_libritts/libri_experiment/prepare_libri.ipynb)
Dataset structure after finishing this step:
Extract durations (use examples/mfa_extraction or a pretrained Tacotron 2)
Optional* Build the Docker image
Optional* Run the Docker container
Preprocessing:
Normalization:
Change the speaker mapper of CharactorDurationF0EnergyMelDataset in fastspeech2_dataset to match your dataset (if you use LibriTTS with mfa_extraction, you don't need to change anything)
Change train_libri.sh to match your dataset and run:
Optional* If you run into tensor size mismatches, check step 5 in the examples/mfa_extraction directory
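One likely cause of tensor size mismatches in FastSpeech 2 style training is an utterance whose per-token durations do not sum to its number of mel frames. Below is a minimal sketch of such a check, assuming the durations and mel lengths are already loaded into plain Python values. The function name `find_duration_mismatches` and the sample data are hypothetical and are not part of this repo.

```python
from typing import Dict, List, Tuple


def find_duration_mismatches(
    durations: Dict[str, List[int]],
    mel_lengths: Dict[str, int],
) -> List[Tuple[str, int, int]]:
    """Return (utt_id, duration_sum, mel_length) for each utterance whose
    durations do not add up to its mel frame count."""
    mismatches = []
    for utt_id, durs in durations.items():
        total = sum(durs)
        n_frames = mel_lengths[utt_id]
        if total != n_frames:
            mismatches.append((utt_id, total, n_frames))
    return mismatches


# Made-up example data (hypothetical utterance ids and values).
durations = {"utt_a": [3, 5, 2], "utt_b": [4, 4]}
mel_lengths = {"utt_a": 10, "utt_b": 9}

mismatches = find_duration_mismatches(durations, mel_lengths)
print(mismatches)
```

Any utterance reported here would need its durations or features fixed before training.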
Comments
This version uses the popular train.txt format with '|' as the field separator, as used in other repos. Training files should look like this =>
Wav Path | Text | Speaker Name
Wav Path2 | Text | Speaker Name
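To illustrate the format, here is a minimal sketch that parses such lines and derives a speaker-name-to-id mapping from the third field. The names `TrainEntry`, `parse_train_lines` and `build_speaker_map` are hypothetical helpers, not part of this repo, and the sample paths and speaker names are made up. It assumes the text field itself contains no '|' character.

```python
from typing import Dict, List, NamedTuple


class TrainEntry(NamedTuple):
    wav_path: str
    text: str
    speaker: str


def parse_train_lines(lines: List[str]) -> List[TrainEntry]:
    """Parse 'wav path | text | speaker name' lines, skipping blank ones."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        parts = [part.strip() for part in line.split("|")]
        if len(parts) != 3:
            raise ValueError(
                f"expected 3 '|'-separated fields, got {len(parts)}: {line!r}"
            )
        entries.append(TrainEntry(*parts))
    return entries


def build_speaker_map(entries: List[TrainEntry]) -> Dict[str, int]:
    """Assign consecutive integer ids to speaker names in sorted order."""
    speakers = sorted({entry.speaker for entry in entries})
    return {name: idx for idx, name in enumerate(speakers)}


# Made-up sample lines in the train.txt format.
sample = [
    "wavs/a.wav | Hello there. | speaker_b",
    "wavs/b.wav | Good morning. | speaker_a",
    "",
]
entries = parse_train_lines(sample)
speaker_map = build_speaker_map(entries)
print(speaker_map)
```

Sorting the speaker names before numbering keeps the ids stable across runs, regardless of the order of lines in the file.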