
Fine-tuning StableTTS (Matcha-TTS) #42

Open
@Barracuda72

Description


Hi there!

I'm trying to fine-tune your StableTTS / Matcha-TTS fork from the checkpoint available here: https://huggingface.co/alphacep/vosk-tts-ru-stabletts/blob/main/vosk_tts_ru_0.8.ckpt

I'm using the ru data config and the multispeaker-ru experiment config as a base.

However, when I try to actually run the experiment, the loader first warns that the checkpoint looks suspicious:

<skip>/vosk-tts/lib/python3.12/site-packages/lightning/pytorch/loops/training_epoch_loop.py:221: You're resuming from a checkpoint that ended before the epoch ended and your dataloader is not resumable. This can cause unreliable results if further training is done. Consider using an end-of-epoch checkpoint or make your dataloader resumable by implementing the `state_dict` / `load_state_dict` interface.

Then it crashes somewhere deep inside PyTorch, ending with:

RuntimeError: The size of tensor a (68) must match the size of tensor b (5) at non-singleton dimension 0

So my best guess is that some parameters of the experiment / data configs are different from the ones used to train the checkpoint in the repo. Am I correct?
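In case it helps with debugging, here is a small helper I could use to see exactly which parameter shapes disagree between the checkpoint and a freshly built model. The parameter names in the comments are illustrative only, not taken from the repo:

```python
# Hypothetical diagnostic: compare parameter shapes between a Lightning
# checkpoint and a freshly constructed model. With torch available, the
# inputs could be built like (paths/keys are placeholders):
#   ckpt_shapes  = {k: tuple(v.shape) for k, v in torch.load(path)["state_dict"].items()}
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}

def shape_mismatches(ckpt_shapes, model_shapes):
    """Return {name: (ckpt_shape, model_shape)} for parameters whose shapes differ."""
    common = ckpt_shapes.keys() & model_shapes.keys()
    return {
        name: (ckpt_shapes[name], model_shapes[name])
        for name in sorted(common)
        if ckpt_shapes[name] != model_shapes[name]
    }
```

The `68` vs `5` in the error makes me suspect a config-dependent dimension (e.g. the number of speakers), and a dump like this would confirm which one.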

I can run inference from this checkpoint just fine, and I can also start training a new model from scratch (ckpt_path = null) without any trouble. But I lack the computational resources to train a model from the ground up, which is why I've decided to fine-tune an existing one instead.
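If the mismatch is indeed config-driven, one workaround I'm considering (just a sketch, not verified against this codebase) is to load only the checkpoint weights that still fit the model, then start the Trainer with ckpt_path = null so Lightning doesn't try to restore the optimizer and dataloader state from the old run:

```python
# Hypothetical workaround: keep only checkpoint tensors whose name and shape
# match the current model, load them non-strictly, and train "from scratch"
# as far as Lightning's loop state is concerned.

def filter_matching_weights(ckpt_state, model_state):
    """Keep checkpoint entries that exist in the model with an identical shape."""
    return {
        name: tensor
        for name, tensor in ckpt_state.items()
        if name in model_state and tensor.shape == model_state[name].shape
    }

# Intended usage (class names and paths are placeholders, untested here):
#   ckpt = torch.load("vosk_tts_ru_0.8.ckpt", map_location="cpu")
#   kept = filter_matching_weights(ckpt["state_dict"], model.state_dict())
#   missing, unexpected = model.load_state_dict(kept, strict=False)
#   # then run the Trainer with ckpt_path = null
```

This would also sidestep the mid-epoch resume warning, since nothing about the old training loop gets restored.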
