
Feature/transcription language selection #55


Open
wants to merge 407 commits into
base: main
from

Conversation

rmusser01
Collaborator

No description provided.

…S - Merge pull request #409 from rmusser01/dev

ALL PRIOR VERSIONS ARE BROKEN BY THIS MERGE. THIS VERSION INTRODUCES A NEW config.txt FILE, WHICH WILL CAUSE BREAKAGE WHEN MAKING ANY LLM API CALLS.

Docs, TTS, answer streaming, web search and perplexity clone

Updates:
More docs (no user guides yet 👎, but we got some feature documentation! 👍)
Foundation for TTS: set up a basic pipeline, with the ability/plan to add more TTS APIs/engines.
Added support for streaming answers in every chat except the 4-way chat and the one-prompt, multiple-API chat.
Added the foundations and a working version of web search: it can now run a web search plus sub-searches to answer a query or research a topic. The UI is an (even more temporary) placeholder for now; not too happy with it. The underlying pipeline is pretty nice, though: it can take a single query, split it into sub-queries, run them through your choice of search provider (Bing, DDG, Google, Brave, SearX, Kagi), evaluate the results for relevancy, collect them via page scraping, and summarize them for final analysis. Lots of areas for improvement.
Fixes for API call streaming checks (one more fix on the way for minp/topk values for APIs)
Removed the full API key debug statement in `chat()`
Added security fix notice
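
The web search flow described above (split the query, search, judge relevancy, scrape, summarize) can be sketched roughly as follows. This is only an illustration under stated assumptions: `SearchResult`, `research`, and every callable passed in are hypothetical names, not the project's actual functions, and the real pipeline may order or batch these steps differently.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class SearchResult:
    """Hypothetical shape of one search hit."""
    url: str
    snippet: str


def research(
    query: str,
    split_query: Callable[[str], List[str]],
    search: Callable[[str], List[SearchResult]],
    is_relevant: Callable[[str, SearchResult], bool],
    scrape: Callable[[str], str],
    summarize: Callable[[str, List[str]], str],
) -> str:
    """Run the original query plus its sub-queries and summarize relevant pages."""
    sub_queries = [query, *split_query(query)]
    seen_urls = set()
    pages: List[str] = []
    for sub_query in sub_queries:
        for result in search(sub_query):
            if result.url in seen_urls:
                continue  # the same page can come back for several sub-queries
            seen_urls.add(result.url)
            # Judge relevancy on the search hit before paying for a scrape.
            if is_relevant(query, result):
                pages.append(scrape(result.url))
    return summarize(query, pages)
```

Passing the provider, relevancy check, scraper, and summarizer in as callables keeps the sketch independent of any particular search provider or LLM API.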
rmusser01 and others added 30 commits March 3, 2025 21:03
Forgot to include the 'config.txt' changes...
Sorted out the RAG notes link dump. Still need to review/add the info to existing notes
Gradio is now dead to me, have to build a new frontend and do an API.
Fixes + Links + Update to README
Gradio now launches.
The 'fix' was pinning `pydantic==2.10.6`.

Seems there's an issue somewhere in Gradio, introduced by some update in the past few months.

yea. idk.
FIX FOR GRADIO NOT LAUNCHING
…lation

This commit lets you specify the desired language for audio and video transcriptions, preventing the system from forcing a translation to a default language (e.g., English).

Key changes include:

- Modified the Audio Ingestion Tab (Live Recording) to use the selected transcription language for both partial (live) and final transcriptions. `PartialTranscriptionThread` now accepts a language code.
- Ensured the Audio Ingestion Tab (File/URL Processing) correctly passes the selected language code to the transcription backend. This was largely in place already and has now been verified.
- Updated the core transcription logic in `Audio_Transcription_Lib.py` (`speech_to_text` function) to handle an "auto" language selection by passing `None` to the Whisper model, enabling its automatic language detection.
- Verified that the Video Transcription Tab's existing infrastructure correctly propagates the selected language, including "auto-detect," to the transcription service.

You can now select a specific language from the dropdown or choose "Auto-detect." The system will transcribe in the chosen language without an unwanted intermediate translation step.
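
As a rough illustration of the "auto" handling described above, the selection from the dropdown can be normalized before it reaches the model. This is a minimal sketch, not the code in `Audio_Transcription_Lib.py`: the helper name `resolve_whisper_language` and the set of accepted auto-detect labels are assumptions made for the example. The only behavior taken from the commit description is that "auto" becomes `None`, which lets Whisper detect the language itself.

```python
from typing import Optional

# Assumed spellings the UI might send for automatic detection (hypothetical).
AUTO_DETECT_LABELS = {"auto", "auto-detect", ""}


def resolve_whisper_language(selection: Optional[str]) -> Optional[str]:
    """Map a UI language selection to the value passed to the Whisper model.

    Returns None for any auto-detect selection so the model detects the
    language; otherwise returns the normalized language code (e.g. "de").
    """
    if selection is None:
        return None
    code = selection.strip().lower()
    if code in AUTO_DETECT_LABELS:
        return None
    return code
```

For example, `resolve_whisper_language("auto")` gives `None`, while `resolve_whisper_language(" DE ")` gives `"de"`. The result would then be passed as the language argument of whichever transcription call the backend uses.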
1 participant