Commit 027aa02

updated some files
1 parent 5ca144c commit 027aa02

5 files changed: +114 -7 lines changed

README.md

Lines changed: 47 additions & 2 deletions
@@ -1,6 +1,51 @@
 # Audio Splitter using Whisperx
+Created for curating datasets to train AI models. It was built with RVC (Retrieval-based Voice Conversion) in mind, but it generally works for any other AI voice model that needs short clips of less than 10 s.
 
+## YouTube Video Tutorial
+<insert tutorial later>
 
-## Pytorch
-Even if you don't have a GPU, it should still work.
+## Prerequisites
+- Python 3.10 installation
+- Git installation
+- VS Code installation (highly recommended)
 
+## Installation and basic usage
+1. Clone the repository (repo):
+```
+git clone https://github.com/JarodMica/audiosplitter_whisper.git
+```
+
+2. Navigate into the repo with:
+```
+cd audiosplitter_whisper
+```
+
+4. Run setup-cuda.py if you have a compatible Nvidia graphics card, or setup-cpu.py if you do not. **NOTE:** This splitter will work on a CPU, albeit very slowly. I keep this option for people who want to curate a dataset locally but train on Colab. AMD GPUs are not compatible, and Mac is not coded for (it should be able to use MPS, though); both can use the CPU option.
+
+```
+python setup-cuda.py
+```
+
+5. Activate the virtual environment (venv):
+```
+venv\Scripts\activate
+```
+
+6. If you run into any permission issues, you'll need to change your Windows execution policy to RemoteSigned. This lowers security on your system slightly, as it allows scripts to be run on your computer; however, only those signed by a Trusted Publisher or verified by you can be run (to my knowledge). Do this at your own risk.
+- Open a PowerShell window as admin. Then, run the following command:
+
+```
+Set-ExecutionPolicy RemoteSigned
+```
+
+- To change it back, run:
+```
+Set-ExecutionPolicy Restricted
+```
+
+7. Now rerun step 5 to activate your venv. Once it's activated, run the following command to start the script:
+```
+python split_audio.py
+```
+
+For more details, please refer to the YouTube video.
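
Review note: step 4 of the README asks the user to choose between `setup-cuda.py` and `setup-cpu.py` by hand. A minimal sketch of how that choice could be automated is below. It is not part of this commit. The helper names `pick_setup_script` and `nvidia_driver_present` are hypothetical, and treating `nvidia-smi` on the PATH as a sign of a usable Nvidia GPU is an assumption, not a guarantee of CUDA compatibility.

```python
import shutil


def pick_setup_script(has_nvidia_gpu: bool) -> str:
    """Return the setup script the README recommends for this machine."""
    return "setup-cuda.py" if has_nvidia_gpu else "setup-cpu.py"


def nvidia_driver_present() -> bool:
    # Heuristic (assumption): nvidia-smi is installed with the Nvidia driver,
    # so finding it on PATH suggests an Nvidia GPU is available.
    return shutil.which("nvidia-smi") is not None


if __name__ == "__main__":
    # Print the command the user would run for README step 4.
    print(f"python {pick_setup_script(nvidia_driver_present())}")
```

AMD and Mac users would fall through to the CPU script here, which matches the README's note that both can use the CPU option.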

audio_shortener.py

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+import os
+
+from scipy.io import wavfile
+from tkinter import filedialog
+from tkinter import *
+
+def run_audiosplitter():
+    # Create the Tkinter root window
+    root = root_audiosplitter
+    root.withdraw()
+
+    # Ask the user to select the audio file directory using the file explorer
+    input_directory = filedialog.askdirectory(title="Select Audio File Directory", parent=root_audiosplitter)
+
+    # Check if a directory was selected
+    if input_directory:
+        # Iterate over the files in the directory
+        for filename in os.listdir(input_directory):
+            if filename.endswith(".wav"):
+                file_path = os.path.join(input_directory, filename)
+                split_audio_file(file_path)
+    else:
+        print("No directory selected.")
+
+    # Close the Tkinter root window
+    root.destroy()
+
+def split_audio_file(file_path, segment_duration=10):
+    # Load the audio file
+    sample_rate, audio_data = wavfile.read(file_path)
+
+    # Calculate the number of segments
+    num_segments = int(len(audio_data) / (sample_rate * segment_duration))
+    remainder = len(audio_data) % (sample_rate * segment_duration)
+    if remainder > 0:
+        num_segments += 1
+
+    # Create the output directory for segments
+    output_dir = os.path.dirname(file_path)
+    base_filename = os.path.splitext(os.path.basename(file_path))[0]
+
+    # Split the audio file into segments
+    for i in range(num_segments):
+        start = i * sample_rate * segment_duration
+        end = min((i + 1) * sample_rate * segment_duration, len(audio_data))
+        segment = audio_data[start:end]
+
+        # Create the output file name
+        segment_filename = f"{base_filename}_{i+1}.wav"
+        segment_path = os.path.join(output_dir, segment_filename)
+
+        # Save the segment as a new WAV file
+        wavfile.write(segment_path, sample_rate, segment)
+
+        print(f"Segment {i+1}/{num_segments} saved: {segment_filename}")
+
+    os.remove(file_path)
+
+root_audiosplitter = Tk()
+root_audiosplitter.withdraw()
+
+run_audiosplitter()
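
Review note: the segment arithmetic in `split_audio_file` (whole segments plus one extra for any remainder) can be checked in isolation from the file I/O. The sketch below restates it as a pure function using ceiling division. The name `segment_bounds` is hypothetical and does not appear in the commit; it only illustrates the index ranges the loop above slices.

```python
def segment_bounds(num_samples, sample_rate, segment_duration=10):
    """Return (start, end) sample indices for each fixed-length segment.

    The last segment is shorter when num_samples is not an exact multiple
    of sample_rate * segment_duration.
    """
    samples_per_segment = sample_rate * segment_duration
    if samples_per_segment <= 0:
        raise ValueError("sample_rate and segment_duration must be positive")

    # Ceiling division: full segments, plus one more if there is a remainder.
    num_segments = -(-num_samples // samples_per_segment)

    return [
        (i * samples_per_segment,
         min((i + 1) * samples_per_segment, num_samples))
        for i in range(num_segments)
    ]


if __name__ == "__main__":
    # 25 samples at 1 Hz with 10 s segments: two full segments and one of 5 samples.
    print(segment_bounds(25, 1, 10))
```

Two behaviors of the committed code are worth keeping in mind when reading these bounds: the final segment can be much shorter than `segment_duration`, and the original file is deleted with `os.remove` once its segments are written.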

requirements-cpu.txt

Lines changed: 2 additions & 1 deletion
@@ -5,4 +5,5 @@ torchvision
 torchaudio
 pysrt
 pydub
-pyyaml
+pyyaml
+wheel

requirements-cuda.txt

Lines changed: 2 additions & 1 deletion
@@ -6,4 +6,5 @@ torchvision==0.15.1+cu118
 torchaudio==2.0.1
 pysrt
 pydub
-pyyaml
+pyyaml
+wheel

setup-cuda.py

Lines changed: 1 addition & 3 deletions
@@ -6,7 +6,7 @@
 def create_virtual_environment():
     # Create a virtual environment in the "venv" directory
     try:
-        venv.create('venv-cuda', with_pip=True)
+        venv.create('venv', with_pip=True)
     except Exception as e:
         print(f"Failed to create virtual environment. Error: {e}")
         sys.exit(1)
@@ -26,11 +26,9 @@ def install_requirements():
         print(f"Failed to install requirements. Error: {e}")
         sys.exit(1)
 
-
 def main():
     create_virtual_environment()
     install_requirements()
 
-
 if __name__ == '__main__':
     main()
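
Review note: this change renames the environment directory from `venv-cuda` to `venv`, which lines up with the `venv\Scripts\activate` path in the README. The body of `install_requirements` is not shown in this diff, so the sketch below is an assumption about one way to install into that environment without activating it. The helpers `venv_python` and `pip_install_command` are hypothetical names; the directory layout (`Scripts` on Windows, `bin` elsewhere) is the standard one created by Python's `venv` module.

```python
import os
from pathlib import Path


def venv_python(env_dir="venv", os_name=os.name):
    """Return the path to the interpreter inside a venv directory."""
    env = Path(env_dir)
    if os_name == "nt":
        # Windows venvs place executables under Scripts\
        return env / "Scripts" / "python.exe"
    # POSIX venvs place executables under bin/
    return env / "bin" / "python"


def pip_install_command(requirements_file, env_dir="venv", os_name=os.name):
    """Build the argument list for installing requirements into the venv."""
    return [
        str(venv_python(env_dir, os_name)),
        "-m", "pip", "install", "-r", requirements_file,
    ]


if __name__ == "__main__":
    # The list could be passed to subprocess.check_call once the venv exists.
    print(pip_install_command("requirements-cuda.txt"))
```

Calling the environment's own interpreter with `-m pip` avoids depending on shell activation, which is the step that triggers the PowerShell execution-policy issue described in the README.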
