GitHub Repository: TensorSpeech/TensorFlowTTS
Path: blob/master/examples/fastspeech2_libritts/README.md

FastSpeech2: multi-speaker, English-language example (LibriTTS)

Prepare

Run every command below from the main repository folder, TensorFlowTTS/.

  1. Optional* Download and prepare LibriTTS (a helper notebook for preparing the data is in examples/fastspeech2_libritts/libri_experiment/prepare_libri.ipynb).

  • Dataset structure after finishing this step:

    ```
    |- TensorFlowTTS/
    |   |- LibriTTS/
    |   |-  |- train-clean-100/
    |   |-  |- SPEAKERS.txt
    |   |-  |- ...
    |   |- libritts/
    |   |-  |- 200/
    |   |-  |-  |- 200_124139_000001_000000.txt
    |   |-  |-  |- 200_124139_000001_000000.wav
    |   |-  |-  |- ...
    |   |-  |- 250/
    |   |-  |- ...
    |   |- tensorflow_tts/
    |   |- models/
    |   |- ...
    ```
  2. Extract durations (use examples/mfa_extraction or a pretrained Tacotron2).

  3. Optional* Build the Docker image:

  • bash examples/fastspeech2_libritts/scripts/build.sh
  4. Optional* Run the Docker container:

  • bash examples/fastspeech2_libritts/scripts/interactive.sh
  5. Preprocessing:

  • tensorflow-tts-preprocess --rootdir ./libritts \
      --outdir ./dump_libritts \
      --config preprocess/libritts_preprocess.yaml \
      --dataset libritts
  6. Normalization:

  • tensorflow-tts-normalize --rootdir ./dump_libritts \
      --outdir ./dump_libritts \
      --config preprocess/libritts_preprocess.yaml \
      --dataset libritts
  7. Change the CharactorDurationF0EnergyMelDataset speaker mapper in fastspeech2_dataset to match your dataset (if you use LibriTTS with mfa_extraction, you don't need to change anything).

  8. Change train_libri.sh to match your dataset and run:

  • bash examples/fastspeech2_libritts/scripts/train_libri.sh
  9. Optional* If you have problems with tensor size mismatches, check step 5 in the examples/mfa_extraction directory.
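
Before preprocessing, it can help to confirm that the prepared libritts/ folder really has the layout shown in the first step, and to see which speaker names a mapper would need to cover. The following is a minimal sketch, not part of TensorFlowTTS: scan_corpus and build_speaker_map are hypothetical helper names, and the sorted-name-to-index scheme is only an assumption about how a speaker mapper could be built, not a description of the one in fastspeech2_dataset.

```python
from pathlib import Path


def scan_corpus(root):
    """Return ({speaker: [utterance stems]}, [wav files lacking a .txt]).

    Assumes the layout root/<speaker>/<utterance>.wav with a transcript
    of the same name and a .txt suffix next to each wav file.
    """
    utterances = {}
    missing = []
    for speaker_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        stems = []
        for wav in sorted(speaker_dir.glob("*.wav")):
            if wav.with_suffix(".txt").is_file():
                stems.append(wav.stem)
            else:
                missing.append(wav)
        utterances[speaker_dir.name] = stems
    return utterances, missing


def build_speaker_map(speakers):
    """Assign consecutive integer ids to speaker names in sorted order.

    Hypothetical scheme; adapt it to whatever the dataset class expects.
    """
    return {name: idx for idx, name in enumerate(sorted(speakers))}
```

Calling scan_corpus("./libritts") from the repository root returns the utterances grouped by speaker folder plus any wav files without a transcript; passing the speaker names to build_speaker_map gives one candidate name-to-id table to compare against the mapper you edit in step 7.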

Comments

This version uses the popular '|'-separated train.txt format found in other repos. Training files should look like this:

Wav Path | Text | Speaker Name

Wav Path2 | Text | Speaker Name
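
To make the three-field layout concrete, here is a small sketch of how such a file could be read. It is illustrative only: parse_train_line and read_train_file are hypothetical names rather than TensorFlowTTS functions, and the code assumes that the text field never contains a '|' character and that whitespace around the separators can be discarded.

```python
def parse_train_line(line):
    """Split one 'wav path | text | speaker name' line into its three fields."""
    parts = [field.strip() for field in line.rstrip("\n").split("|")]
    if len(parts) != 3:
        raise ValueError(
            f"expected 3 '|'-separated fields, got {len(parts)}: {line!r}"
        )
    wav_path, text, speaker = parts
    return wav_path, text, speaker


def read_train_file(path):
    """Read a train.txt file, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [parse_train_line(line) for line in handle if line.strip()]
```

Raising an error on a malformed line, instead of silently skipping it, makes a stray separator in a transcript visible before training starts.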