Replies: 1 comment
Is it splitting at pauses? Edit: I've just read the description in your repo. Don't bother with a reply.
I wanted to share a tool here that you might find helpful (or not...).
Consider the following file (librivox):
https://ia801401.us.archive.org/25/items/beckoningfairone_2211_librivox/beckoningfairone_08_onions_128kb.mp3
The file is 30 min and 46 sec.
To train a voice, the samples should be shorter than ~10 sec (cf. the Notes at https://github.com/voicepaw/so-vits-svc-fork) and typically longer than one second.
SVC comes with a splitter: `svc pre-split`. So I put my mp3 file in dataset_raw_raw. Then `svc pre-split` writes its output to dataset_raw. The shortest file is 1.2 sec long and contains 0.4 sec of non-silent audio, while the longest is 29 sec, with quite a lot of files longer than 10 sec. Below, a histogram of the resulting lengths:
By fiddling with the parameters you can probably get a better result, but the overall shape stays roughly the same (unless you set the threshold so high that you end up with a lot of very small files).
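If you want to check a set of clips against the 1 to 10 sec range yourself, here is a minimal sketch in plain Python. The function names and the example durations are made up for illustration; how you obtain the durations is up to you (for instance `soundfile.info(path).duration`, assuming you have the soundfile package installed).

```python
from collections import Counter


def summarize_durations(durations, min_sec=1.0, max_sec=10.0, bin_sec=1.0):
    """Summarize clip lengths (in seconds) against a target range.

    `durations` is a list of floats, one per clip. The 1 s / 10 s defaults
    follow the rule of thumb quoted above; adjust them as needed.
    """
    if not durations:
        raise ValueError("durations must not be empty")
    n = len(durations)
    # Bin index k covers the interval [k * bin_sec, (k + 1) * bin_sec).
    bins = Counter(int(d // bin_sec) for d in durations)
    return {
        "count": n,
        "min": min(durations),
        "max": max(durations),
        "mean": sum(durations) / n,
        "too_short": sum(1 for d in durations if d < min_sec),
        "too_long": sum(1 for d in durations if d > max_sec),
        "histogram": dict(sorted(bins.items())),
    }


def print_histogram(histogram, bin_sec=1.0):
    """Print one text bar per non-empty bin."""
    for k, count in histogram.items():
        lo = k * bin_sec
        print(f"{lo:5.1f}-{lo + bin_sec:5.1f} s | {'#' * count}")


if __name__ == "__main__":
    # Made-up durations, only to show the call; not measured data.
    example = [1.2, 4.8, 5.1, 12.0, 29.0]
    stats = summarize_durations(example)
    print(stats["too_short"], "clips under 1 s,", stats["too_long"], "over 10 s")
    print_histogram(stats["histogram"])
```

Running the same summary on the output of two different splitters gives a quick way to compare them before committing to a training run.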
So I made my own audio splitter, split_audio.py, in which you specify the desired average length (default is 5 sec). The default (recommended) usage is:
python split_audio.py --desired_duration <desired average duration [sec]> <path/to/your/long/audio/file>
The resulting distribution looks like this:
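To give an idea of how a splitter can aim for an average length rather than just cutting at every pause, here is a rough sketch. It is not taken from split_audio.py and may well differ from what the tool actually does; `choose_cuts`, `segment_lengths`, and the candidate times are hypothetical. It assumes you already have a sorted list of candidate cut times (for example, the midpoints of detected pauses) and picks the candidate closest to each evenly spaced ideal cut.

```python
from bisect import bisect_left


def choose_cuts(candidates, total_sec, desired_sec=5.0):
    """Pick cut times so that segments average roughly `desired_sec`.

    `candidates` must be sorted ascending (e.g. midpoints of pauses).
    Illustrative only: not the algorithm used by split_audio.py.
    """
    if total_sec <= 0 or desired_sec <= 0:
        raise ValueError("total_sec and desired_sec must be positive")
    n_segments = max(1, round(total_sec / desired_sec))
    step = total_sec / n_segments
    cuts = []
    last = 0.0
    for i in range(1, n_segments):
        ideal = i * step
        j = bisect_left(candidates, ideal)
        # Look at the candidates just before and just after the ideal time,
        # keeping only those that come after the previous cut.
        nearby = [c for c in candidates[max(0, j - 1): j + 1]
                  if last < c < total_sec]
        if not nearby:
            continue  # no usable pause here; this segment simply runs longer
        best = min(nearby, key=lambda c: abs(c - ideal))
        cuts.append(best)
        last = best
    return cuts


def segment_lengths(cuts, total_sec):
    """Turn cut times into segment durations."""
    edges = [0.0, *cuts, total_sec]
    return [b - a for a, b in zip(edges, edges[1:])]


if __name__ == "__main__":
    # Made-up pause positions (seconds) in a 20 s recording.
    pauses = [2.0, 4.6, 5.5, 9.8, 14.0, 15.3, 20.0]
    cuts = choose_cuts(pauses, total_sec=20.0, desired_sec=5.0)
    print(cuts)
    print(segment_lengths(cuts, 20.0))
```

Because the number of segments is fixed up front, the average length comes out at the desired value whenever every ideal cut finds a nearby pause; individual segments still vary with where the pauses actually fall.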
Does it help training or not? I don't know. I haven't compared `split_audio.py` and `svc pre-split` side by side yet. You can download it from:
https://github.com/sbersier/split_audio
(PS: I hope split_audio is not too buggy... If you find bugs, don't hesitate to report them.)