
Fine-tuning StableTTS (Matcha-TTS) #42

Open
@Barracuda72

Description


Hi there!

I'm trying to fine-tune your StableTTS / Matcha-TTS fork from the checkpoint available here: https://huggingface.co/alphacep/vosk-tts-ru-stabletts/blob/main/vosk_tts_ru_0.8.ckpt

I'm using the ru data config and the multispeaker-ru experiment config as a base.

However, when I try to actually run the experiment, the loader first warns that the checkpoint looks suspicious:

<skip>/vosk-tts/lib/python3.12/site-packages/lightning/pytorch/loops/training_epoch_loop.py:221: You're resuming from a checkpoint that ended before the epoch ended and your dataloader is not resumable. This can cause unreliable results if further training is done. Consider using an end-of-epoch checkpoint or make your dataloader resumable by implementing the `state_dict` / `load_state_dict` interface.

Then it crashes somewhere deep inside PyTorch, ending with:

RuntimeError: The size of tensor a (68) must match the size of tensor b (5) at non-singleton dimension 0

So my best guess is that some parameters of the experiment / data configs are different from the ones used to train the checkpoint in the repo. Am I correct?
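In case it helps with debugging, here is a small helper I could use to see exactly which parameter shapes disagree between the checkpoint and a freshly built model. The parameter names in the comments are illustrative only, not taken from the repo:

```python
# Hypothetical diagnostic: compare parameter shapes between a Lightning
# checkpoint and a freshly constructed model. With torch available, the
# inputs could be built like (paths/keys are placeholders):
#   ckpt_shapes  = {k: tuple(v.shape) for k, v in torch.load(path)["state_dict"].items()}
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}

def shape_mismatches(ckpt_shapes, model_shapes):
    """Return {name: (ckpt_shape, model_shape)} for parameters whose shapes differ."""
    common = ckpt_shapes.keys() & model_shapes.keys()
    return {
        name: (ckpt_shapes[name], model_shapes[name])
        for name in sorted(common)
        if ckpt_shapes[name] != model_shapes[name]
    }
```

The `68` vs `5` in the error makes me suspect a config-dependent dimension (e.g. the number of speakers), and a dump like this would confirm which one.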

I can run inference from this checkpoint just fine, and I can also start training a new model from scratch (ckpt_path = null) without any trouble. But I lack the computational resources to train a model from the ground up, which is why I've decided to fine-tune an existing one instead.
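If the mismatch is indeed config-driven, one workaround I'm considering (just a sketch, not verified against this codebase) is to load only the checkpoint weights that still fit the model, then start the Trainer with ckpt_path = null so Lightning doesn't try to restore the optimizer and dataloader state from the old run:

```python
# Hypothetical workaround: keep only checkpoint tensors whose name and shape
# match the current model, load them non-strictly, and train "from scratch"
# as far as Lightning's loop state is concerned.

def filter_matching_weights(ckpt_state, model_state):
    """Keep checkpoint entries that exist in the model with an identical shape."""
    return {
        name: tensor
        for name, tensor in ckpt_state.items()
        if name in model_state and tensor.shape == model_state[name].shape
    }

# Intended usage (class names and paths are placeholders, untested here):
#   ckpt = torch.load("vosk_tts_ru_0.8.ckpt", map_location="cpu")
#   kept = filter_matching_weights(ckpt["state_dict"], model.state_dict())
#   missing, unexpected = model.load_state_dict(kept, strict=False)
#   # then run the Trainer with ckpt_path = null
```

This would also sidestep the mid-epoch resume warning, since nothing about the old training loop gets restored.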
