Unable to read the data I prepared #1131

Closed
@Chloey-TS

Description

Checks

  • This template is only for usage issues encountered.
  • I have thoroughly reviewed the project documentation but couldn't find information to solve my problem.
  • I have searched for existing issues, including closed ones, and couldn't find a solution.
  • I am using English to submit this issue to facilitate community communication.

Environment Details

Manjaro Linux
Python 3.13.3
Torch 2.7.1+cu128
Gradio 5.35.0
GPU: RTX5070

Steps to Reproduce

  1. Create a virtual environment.
  2. Download the latest release and install it as a pip package.
  3. Download the checkpoints and vocos from Hugging Face, then modify infer_gradio.py line 54:
DEFAULT_TTS_MODEL_CFG = [
    "/home/ai/Documents/AI Voice Clone/F5-TTS/ckpts/F5TTS_v1_Base/model_1250000.safetensors",
    "/home/ai/Documents/AI Voice Clone/F5-TTS/ckpts/F5TTS_v1_Base/vocab.txt",
    # "hf://SWivid/F5-TTS/F5TTS_v1_Base/model_1250000.safetensors",
    # "hf://SWivid/F5-TTS/F5TTS_v1_Base/vocab.txt",
    json.dumps(dict(dim=1024, depth=22, heads=16, ff_mult=2, text_dim=512, conv_layers=4)),
]

I also modified utils_infer.py line 104 (shown below) so the vocoder loads from a local path. With these changes, f5-tts_infer-gradio works fine.

def load_vocoder(vocoder_name="vocos", is_local=True, local_path="/home/ai/Documents/AI Voice Clone/F5-TTS/ckpts/vocos", device=device, hf_cache_dir=None):
    if vocoder_name == "vocos":
  4. Then I followed the Gradio UI training guide and tried to transcribe the data first. I uploaded the audio files and waited a long time, but every file failed to transcribe. It seems a Whisper model is used for transcription, and my network cannot access it directly. Failed.
  5. Next I tried a custom dataset with this guide. I prepared metadata.csv and the WAV audio files, then ran python scripts/prepare_csv_wavs.py, which generated the JSON, Arrow, and vocab.txt files in the /home/ai/Documents/AI Voice Clone/F5-TTS/data/my_speech_pinyin folder. I couldn't find a place to modify dataset_name. I then ran python train.py, and the error was that no model_cfg was given. Failed.
  6. OK, so I then used the prepared metadata.csv and WAV audio files with the Gradio UI and went straight to the prepare data step, which got stuck too: Error: No audio files found in the specified path : /home/ai/Documents/AI Voice Clone/F5-TTS/src/f5_tts/../../data/my_speech_pinyin/wavs. I actually do have WAV audio files in both /home/ai/Documents/AI Voice Clone/F5-TTS/data/my_speech_pinyin/wavs and /home/ai/Documents/AI Voice Clone/F5-TTS/src/f5_tts/data/my_speech_pinyin/wavs. Failed.
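
For anyone trying to reproduce the "No audio files found" error above, a small diagnostic like the following may help narrow it down. It is only a sketch, not part of F5-TTS: `diagnose_audio_dir` and `AUDIO_EXTS` are hypothetical names, and the set of extensions is an assumption about what counts as an audio file. It collapses the `..` segments in the reported path and lists what is actually in that folder.

```python
import os
from pathlib import Path

# Assumption: extensions treated as audio. The real tool may accept a different set.
AUDIO_EXTS = {".wav", ".mp3", ".flac", ".m4a", ".ogg"}


def diagnose_audio_dir(raw_path):
    """Normalize a path containing '..' segments and list the audio files in it."""
    # normpath collapses 'a/b/../../c' to 'c' without touching the filesystem.
    resolved = Path(os.path.normpath(raw_path))
    exists = resolved.is_dir()
    audio_files = []
    if exists:
        audio_files = sorted(
            p.name
            for p in resolved.iterdir()
            if p.is_file() and p.suffix.lower() in AUDIO_EXTS
        )
    return {"resolved": str(resolved), "exists": exists, "audio_files": audio_files}


if __name__ == "__main__":
    # Path copied from the error message in this issue.
    report = diagnose_audio_dir(
        "/home/ai/Documents/AI Voice Clone/F5-TTS/src/f5_tts/../../data/my_speech_pinyin/wavs"
    )
    print(report["resolved"])
    print("directory exists:", report["exists"])
    print("audio files found:", len(report["audio_files"]))
```

If the resolved path exists and the list is non-empty, the files are where the error message says they should be, which would suggest the problem lies in how the files are scanned rather than in where they are stored.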

I've been stuck here for 2 days. I checked the issues, the README, and multiple YouTube videos, and still have no clue. Please help.
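
One more check that might be worth running on the custom dataset is whether every audio file listed in metadata.csv actually exists on disk. The sketch below makes several assumptions that should be verified against the project's guide: that metadata.csv has a header row, that fields are separated by `|`, and that the first column is an audio path relative to the folder containing metadata.csv. `find_missing_audio` is a hypothetical helper, not an F5-TTS function.

```python
import csv
from pathlib import Path


def find_missing_audio(metadata_csv, delimiter="|"):
    """Return the audio paths listed in metadata_csv that do not exist on disk."""
    metadata_csv = Path(metadata_csv)
    base_dir = metadata_csv.parent
    missing = []
    with metadata_csv.open(newline="", encoding="utf-8") as f:
        # QUOTE_NONE keeps quotation marks inside the transcript text literal.
        reader = csv.reader(f, delimiter=delimiter, quoting=csv.QUOTE_NONE)
        next(reader, None)  # Assumption: the first row is a header.
        for row in reader:
            if not row or not row[0].strip():
                continue  # Skip blank lines.
            audio = Path(row[0].strip())
            if not audio.is_absolute():
                # Assumption: relative paths are relative to the CSV's folder.
                audio = base_dir / audio
            if not audio.is_file():
                missing.append(str(audio))
    return missing


if __name__ == "__main__":
    # Dataset folder taken from this issue.
    csv_path = "/home/ai/Documents/AI Voice Clone/F5-TTS/data/my_speech_pinyin/metadata.csv"
    if Path(csv_path).is_file():
        for path in find_missing_audio(csv_path):
            print("missing:", path)
```

An empty result means every listed file was found under these assumptions. It does not confirm that the transcripts or the audio format are what the training scripts expect.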

✔️ Expected Behavior

I hope I can complete this training step and train my model.

❌ Actual Behavior

as mentioned above. orz

Metadata

    Labels

    help wanted (Extra attention is needed)
