This is a PyTorch implementation of Microsoft's FastSpeech 2: Fast and High-Quality End-to-End Text to Speech.

This project is based on ming024's implementation. Any suggestions for improvement are appreciated.
Now supporting about 900 speakers in 🔥 LibriTTS for multi-speaker text-to-speech.

This project supports 4 datasets, including both multi-speaker and single-speaker datasets:
- LibriTTS
- VCTK
- LJSpeech
- Blizzard2013
After downloading a dataset, extract the compressed files. You have to modify `hp.data_path` and some other parameters in `hparams.py`; the default parameters are for the LibriTTS dataset.
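For example, a minimal sketch of the relevant settings in `hparams.py` might look like this (only `hp.data_path` is named in this README; the other variable names and all values are hypothetical):

```python
# hparams.py (sketch) -- only data_path is documented in this README;
# the dataset switch, the MFA path, and all values here are
# hypothetical examples. Check hparams.py for the real parameter list.
dataset = "LibriTTS"                           # hypothetical: which of the 4 datasets to use
data_path = "/path/to/LibriTTS"                # hp.data_path: root of the extracted corpus
mfa_path = "/path/to/montreal-forced-aligner"  # hypothetical name for the MFA path set in the preprocessing section
```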
- Download the pretrained model.
- Put `checkpoint_600000.pth.tar` in `./states/ckpt/`.
- Run `python synthesize.py` (a quick checkpoint sanity check is sketched below).
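Before synthesizing, you can verify that the checkpoint file loads cleanly; a minimal sketch using `torch.load` (the checkpoint's internal keys are repo-specific, so this only inspects them):

```python
import torch

# Load the pretrained checkpoint on CPU and list its top-level keys.
# What those keys hold (model weights, optimizer state, step count, ...)
# depends on how this repo saved the checkpoint.
ckpt = torch.load("./states/ckpt/checkpoint_600000.pth.tar", map_location="cpu")
print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))
```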
Preprocessing contains 3 stages:
- Preparing Alignment Data
- Montreal Forced Alignment (MFA)
- Creating Training Dataset
For stage 2, Montreal Forced Alignment (MFA), please refer to Montreal-Forced-Aligner.
Download and extract the tar.gz file, then specify the path to MFA in `hparams.py`. Then run:

```
python preprocess.py --prepare_align --mfa --create_dataset
```
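Preprocessing can take a while on the larger corpora. A quick sanity check that it finished is to look for the `stat.txt` file described below (`./preprocessed` here is a stand-in for your actual `hp.preprocessed_path`):

```python
import os

preprocessed_path = "./preprocessed"  # stand-in for hp.preprocessed_path
stat_file = os.path.join(preprocessed_path, "stat.txt")
print("preprocessing finished" if os.path.exists(stat_file) else "stat.txt not found yet")
```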
After preprocessing, you will get a `stat.txt` file in your `hp.preprocessed_path/`, recording the maximum and minimum values of fundamental frequency (f0) and energy across the entire corpus. You have to modify the f0 and energy parameters in `data/dataset.yaml` according to the content of `stat.txt`.
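Since the exact layout of `stat.txt` is repo-specific, the safest route is to print it and copy the values into `data/dataset.yaml` by hand; a minimal sketch (assuming PyYAML is installed for inspecting the YAML side):

```python
from pathlib import Path
import yaml  # PyYAML, assumed available

# Print the corpus statistics so the f0/energy min/max can be copied
# into data/dataset.yaml by hand; stat.txt's layout is repo-specific.
print(Path("./preprocessed/stat.txt").read_text())  # stand-in for hp.preprocessed_path

# Show what data/dataset.yaml currently contains; the f0/energy key
# names inside it are defined by this repo, not by this sketch.
print(yaml.safe_load(Path("data/dataset.yaml").read_text()))
```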
Train your model with:

```
python train.py
```

The training output, including log messages, checkpoints, and synthesized audio samples, will be put in `./states`.
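Checkpoints accumulate under `./states` as training runs; a small helper for locating the most recent one (the exact subdirectory layout is an assumption) could look like:

```python
import glob
import os

# Find the newest *.pth.tar checkpoint anywhere under ./states.
# The directory layout is an assumption; adjust the pattern if needed.
ckpts = glob.glob("./states/**/*.pth.tar", recursive=True)
print(max(ckpts, key=os.path.getmtime) if ckpts else "no checkpoints yet")
```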
- FastSpeech 2: Fast and High-Quality End-to-End Text to Speech, Y. Ren, et al.
- FastSpeech: Fast, Robust and Controllable Text to Speech, Y. Ren, et al.
- xcmyz's FastSpeech implementation
- rishikksh20's FastSpeech2 implementation
- TensorSpeech's FastSpeech2 implementation
- NVIDIA's WaveGlow implementation
- seungwonpark's MelGAN implementation