KinyVoiceAI is a large-scale, domain-specific Automatic Speech Recognition (ASR) system designed for the Kinyarwanda language. This initiative leverages open speech data and state-of-the-art machine learning models to build reliable and accessible ASR tools for real-world applications.
The objective is to develop an end-to-end ASR pipeline that accurately transcribes Kinyarwanda audio across five high-impact domains:
- 🏥 Health
- 🏛️ Government
- 📚 Education
- 💰 Financial Services
- 🌾 Agriculture
This project consists of the following main components:
| Module | Description | Repository / Folder |
|---|---|---|
| Backend (ASR) | Data preprocessing, model training, evaluation, and API serving | kinyvoice-be |
| Frontend (Demo UI) | Web-based interface for testing and demonstrating ASR performance | kinyvoice-fe (optional) |
The project uses the following technologies:
- Python, PyTorch, Hugging Face Transformers
- Wav2Vec2, Whisper, ESPnet
- FastAPI, Docker, DVC
- Next.js, TailwindCSS
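The backend's evaluation step is not detailed here. As a rough illustration, ASR output is commonly scored with word error rate (WER): the word-level edit distance between a reference transcript and the model's hypothesis, divided by the number of reference words. The sketch below assumes WER is the metric of interest and that lowercasing plus whitespace splitting is an adequate tokenization. Neither assumption is confirmed by this project, and the function names `edit_distance` and `word_error_rate` are hypothetical, not taken from kinyvoice-be.

```python
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] is the distance between the reference prefix processed so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    # Assumption: lowercase + whitespace split; a real pipeline may normalize differently.
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)
```

For example, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words, giving 0.5. Averaging this score separately for each of the five domains would show where transcription quality lags, though how the project actually reports results is not stated here.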
Each module contains its own setup and usage instructions.
Start by exploring the backend repository:
git clone https://github.com/KinyaVoiceAI/KinyVoiceAI.git
For a live demo or UI, check the kinyvoice-fe repo.
- Dataset: Digital Umuganda (funded by the Gates Foundation)
- Models: Built using open-source toolkits and research frameworks
- License: CC BY 4.0
We welcome contributions from the community, especially from native speakers, ML researchers, and developers interested in African language technologies. See contribution guidelines in the respective repos.
For questions, collaborations, or feedback:
- 📧 [email protected]
- 🌐 GitHub Organization: KinyaVoiceAI
Bringing Kinyarwanda to the forefront of voice AI.