GitHub - hoodini/transcription-and-diarization: Transcribe audio in English / Hebrew / Other languages using Whisper locally on Colab and create SRT / TXT separated by speakers

This Colab notebook transcribes audio using Whisper by OpenAI. It uses another Python package to identify and separate speakers. In addition, it creates TXT and SRT files to download.

To use it:

Run the cells in order
Upload an audio file
Select a language (English / Hebrew / Auto mode)
Select the number of speakers
Run the rest of the cells
The transcription (including the diarization, i.e. the separation by speakers) will be shown in the last cell
The SRT and TXT files will be available for download in the sidebar (under the main folder)
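
To illustrate what the SRT and TXT outputs separated by speakers might look like, here is a minimal sketch. It is not the notebook's actual code: the `Segment` class and the functions `srt_timestamp`, `to_srt` and `to_txt` are hypothetical names, and it assumes each transcribed segment already carries a start time, an end time, a speaker label and its text.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical container for one diarized transcript segment."""
    start: float   # start time in seconds
    end: float     # end time in seconds
    speaker: str   # speaker label, e.g. "SPEAKER 1"
    text: str      # transcribed text


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Build SRT text: numbered cues, each prefixed with its speaker."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.speaker}: {seg.text.strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


def to_txt(segments) -> str:
    """Build plain text, merging consecutive segments of the same speaker."""
    turns = []  # list of (speaker, [texts]) pairs
    for seg in segments:
        text = seg.text.strip()
        if turns and turns[-1][0] == seg.speaker:
            turns[-1][1].append(text)
        else:
            turns.append((seg.speaker, [text]))
    return "\n".join(f"{speaker}: {' '.join(parts)}" for speaker, parts in turns)
```

Writing the returned strings to files with UTF-8 encoding keeps Hebrew text intact.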

Please note: GPU units may be necessary to run this notebook. If you don't have any GPU units, you may get a "CUDA: Out of memory" error. Make sure to select a GPU runtime and, if possible, turn on "High-RAM".
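
Before running the heavier cells, it can help to confirm that the runtime actually sees a GPU. The sketch below is one possible check and is not part of the notebook: the function name `gpu_visible` is hypothetical, and it assumes the `nvidia-smi` tool is on the PATH whenever an NVIDIA GPU is attached.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if nvidia-smi is present and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # Tool not found: most likely a CPU-only runtime.
        return False
    try:
        # "nvidia-smi -L" prints one line per GPU, e.g. "GPU 0: ...".
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    if gpu_visible():
        print("GPU detected.")
    else:
        print("No GPU detected. Select a GPU runtime before running the notebook.")
```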

If you use this notebook, please make sure to cite Yuval Avidani @HACKIT.CO.IL

