Subtitles Generator #473

Open
wants to merge 1 commit into
base: main
1 change: 1 addition & 0 deletions README.md
@@ -138,6 +138,7 @@ More information on contributing and the general code of conduct for discussion
| Sorting | [Sorting](https://github.com/DhanushNehru/Python-Scripts/tree/main/Sorting) | Algorithm for bubble sorting. |
| Star Pattern | [Star Pattern](https://github.com/DhanushNehru/Python-Scripts/tree/main/Star%20Pattern) | Creates a star pattern pyramid. |
| Subnetting Calculator | [Subnetting Calculator](https://github.com/DhanushNehru/Python-Scripts/tree/main/Subnetting%20Calculator) | Calculates network information based on a given IP address and subnet mask. |
| Subtitles generator | [Subtitles Generator](https://github.com/tene04/Python-Scripts/tree/main/Subtitles%20Generator) | Generates subtitles (.srt files) from audio or video files. For video files, it can also create a new video file with soft subtitles embedded. |
| Take a break | [Take a break](https://github.com/DhanushNehru/Python-Scripts/tree/main/Take%20A%20Break) | Python code to take a break while working long hours. |
| Text Recognition | [Text Recognition](https://github.com/DhanushNehru/Python-Scripts/tree/Text-Recognition/Text%20Recognition) | A Image Text Recognition ML Model to extract text from Images |
| Text to Image | [Text to Image](https://github.com/DhanushNehru/Python-Scripts/tree/main/Text%20to%20Image) | A Python script that will take your text and convert it to a JPEG. |
79 changes: 79 additions & 0 deletions Subtitles Generator/README.md
@@ -0,0 +1,79 @@
# Subtitle Generator
----------


This script automatically generates subtitles (.srt files) from audio or video files using OpenAI's Whisper model. For video files, it can also create a new video file with soft subtitles embedded.

The tool supports specifying the language for transcription, improving accuracy for non-English content.
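
As a rough illustration of language selection, the sketch below normalizes a user-supplied language to a lowercase language name. `KNOWN_LANGUAGES` is a small stand-in for the `whisper.tokenizer.LANGUAGES` table, and `normalize_language` is a hypothetical helper that is not part of this script.

```python
# Small stand-in for whisper.tokenizer.LANGUAGES (code -> lowercase name).
# Assumption: only a few entries are listed here for illustration.
KNOWN_LANGUAGES = {"en": "english", "es": "spanish", "fr": "french", "de": "german"}


def normalize_language(language):
    """Return the lowercase language name, or None to request auto-detection."""
    if language is None:
        return None
    key = language.strip().lower()
    if key in KNOWN_LANGUAGES:            # given as a code, e.g. "es"
        return KNOWN_LANGUAGES[key]
    if key in KNOWN_LANGUAGES.values():   # given as a name, e.g. "Spanish"
        return key
    raise ValueError(f"Language {language!r} not available.")
```

Accepting both the code and the name keeps the check forgiving; passing `None` leaves detection to the model.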

## How it works

1- For video files, it first extracts the audio track

2- Uses Whisper AI to transcribe the audio with timestamps

3- Generates a standard .srt subtitle file

4- Optionally creates a new video file with embedded subtitles
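
To make step 3 concrete, here is a minimal sketch of how timestamped segments map to the .srt format. The script itself relies on Whisper's own writer; `format_timestamp` and `segments_to_srt` are hypothetical helpers, and the only assumption about the input is that each segment is a dict with `start`, `end` (in seconds) and `text` keys.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from dicts with 'start', 'end' and 'text' keys."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each cue: running number, time range, text, then a blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

Each cue is numbered from 1, and cues are separated by a blank line, which is what subtitle players expect from a standard .srt file.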


## Configuration

You can configure the behavior by modifying these parameters in main.py:

```
input_path (path to your audio/video file)

subtitle_path (output subtitle path)

model (Whisper model size: medium or large)

mux (Boolean to specify whether or not to create the subtitled video)
```
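
One possible way to keep these parameters together and validate them early is sketched below. The `Settings` class is hypothetical and not part of main.py; it restricts `model` to the two sizes listed above (Whisper also ships other sizes) and derives the subtitle path with the same rule main.py uses.

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    input_path: str
    model: str = "medium"     # "medium" or "large", as listed above
    mux: bool = True
    subtitle_path: str = ""   # empty means "derive from input_path"

    def __post_init__(self):
        if self.model not in ("medium", "large"):
            raise ValueError(f"Unsupported model size: {self.model!r}")
        if not self.subtitle_path:
            # Same rule as main.py: swap the input extension for .srt
            self.subtitle_path = f"{os.path.splitext(self.input_path)[0]}.srt"
```

With this, `Settings("clips/talk.mp4").subtitle_path` would be `clips/talk.srt`, and a typo in the model name fails before the model download starts.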


## How to run

### 1- Install dependencies
Before running the script, run the following command:
```
pip install -r requirements.txt
```

### 2- Install FFmpeg

- On Windows:
  - Download FFmpeg from https://ffmpeg.org/download.html#build-windows
  - Add FFmpeg to your system PATH

- On Linux (Debian/Ubuntu):
```
sudo apt update && sudo apt install ffmpeg
```
Verify the installation with:
```
ffmpeg -version
```
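
The same check can be done from Python before any processing starts. This is a small sketch using only the standard library; `ffmpeg_available` is a hypothetical helper, not something the script currently calls.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if the given executable can be found on the system PATH."""
    return shutil.which(executable) is not None
```

Calling `ffmpeg_available()` at the top of `main()` and stopping with a clear message when it returns `False` would surface a missing installation earlier than a failed extraction step.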

### 3- Run the script
Set the parameters in main.py, then run:
```
python main.py
```
The script will generate:
- A .srt subtitle file
- (For videos) A new video file with soft subtitles

## Supported formats
Input formats:
- Video: .mp4, .avi, .mov and .mkv
- Audio: .wav, .mp3 (and any other format supported by Whisper)

Output formats:
- Subtitles: .srt
- Video: .mkv or .mp4
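
The format lists above can be sketched in code as follows. The extension tuple and the `_subtitled.mkv` default mirror what the script does; the helper names `is_video` and `default_muxed_path` are made up for this example.

```python
import os

VIDEO_EXTENSIONS = (".mp4", ".avi", ".mov", ".mkv")


def is_video(path):
    """True if the path ends with one of the video extensions listed above."""
    return os.path.splitext(path)[1].lower() in VIDEO_EXTENSIONS


def default_muxed_path(video_path):
    """Default output name when no output path is given: <name>_subtitled.mkv."""
    base_name = os.path.splitext(video_path)[0]
    return f"{base_name}_subtitled.mkv"
```

Anything that is not recognized as video is passed to Whisper directly as audio, so only video inputs go through the audio extraction step.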

## Benefits
* Accurate transcription: Uses state-of-the-art Whisper AI for high-quality results
* Language support: Works with multiple languages (set the `language` argument of the transcribe() method)
* Preserves quality: Original video quality is maintained when muxing
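
Quality is preserved because the video and audio streams are copied rather than re-encoded. The sketch below builds an FFmpeg command line that is roughly equivalent to what the script does through ffmpeg-python for a single subtitle file; `build_mux_command` is a hypothetical helper and the file names in the usage note are placeholders.

```python
import os


def build_mux_command(video_path, subtitle_path, output_path):
    """Return an ffmpeg argument list that adds one soft subtitle track."""
    ext = os.path.splitext(output_path)[1].lower()
    # MP4 stores text subtitles as mov_text; MKV can carry SRT directly.
    subtitle_codec = "mov_text" if ext == ".mp4" else "srt"
    return [
        "ffmpeg", "-y",                      # -y: overwrite the output file
        "-i", video_path,
        "-i", subtitle_path,
        "-map", "0:v", "-map", "0:a", "-map", "1",
        "-c:v", "copy", "-c:a", "copy",      # copy streams, no re-encoding
        "-c:s", subtitle_codec,
        output_path,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, for example with `build_mux_command("talk.mp4", "talk.srt", "talk_subtitled.mkv")`.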
18 changes: 18 additions & 0 deletions Subtitles Generator/main.py
@@ -0,0 +1,18 @@
from subGenerator import SubtitleGenerator
import os

def main():
    input_path = "your file path"
    subtitle_path = f"{os.path.splitext(input_path)[0]}.srt"
    model = "medium"  # Whisper model size: "medium" or "large"
    mux = True
    # Path of the subtitled video (.mkv or .mp4); None uses "<input>_subtitled.mkv"
    output_path = None

    print(f"Processing: {input_path}")
    generator = SubtitleGenerator(model)
    generator.process_file(input_path, subtitle_path)

    # Only video inputs can be muxed with the generated subtitles
    if mux and input_path.lower().endswith(('.mp4', '.avi', '.mov', '.mkv')):
        generator.mux_subtitles(input_path, [subtitle_path], output_path)

if __name__ == "__main__":
    main()
3 changes: 3 additions & 0 deletions Subtitles Generator/requirements.txt
@@ -0,0 +1,3 @@
openai-whisper==20250625
ffmpeg-python==0.2.0
torch>=2.7.0
122 changes: 122 additions & 0 deletions Subtitles Generator/subGenerator.py
@@ -0,0 +1,122 @@
import os
import whisper
from whisper.utils import get_writer
import ffmpeg

class SubtitleGenerator:
    def __init__(self, model_size):
        """
        Initialize the subtitle generator with a Whisper model

        Args:
            model_size (str): Size of the Whisper model (medium or large)
        """
        self.model = whisper.load_model(model_size)
        # Accept both language codes ("es") and lowercase names ("spanish")
        languages = whisper.tokenizer.LANGUAGES
        self.supported_languages = set(languages.keys()) | set(languages.values())

    @staticmethod
    def extract_audio(video_path, audio_path):
        """
        Extract audio from video file using ffmpeg

        Args:
            video_path (str): Path to the video file
            audio_path (str): Path to save the extracted audio
        """
        stream = ffmpeg.input(video_path)
        # Mono, 16 kHz audio
        stream = ffmpeg.output(stream, audio_path, ac=1, ar=16000)
        stream = ffmpeg.overwrite_output(stream)
        stream.run(quiet=True)
        return audio_path

    def transcribe(self, audio_path, language=None):
        """
        Transcribe audio file to text with timestamps, raise an error
        if the language is not available

        Args:
            audio_path (str): Path to the audio file
            language (str): Language code or name (optional, auto-detected if None)
        """
        if language is not None and language.lower() not in self.supported_languages:
            raise ValueError(f"Language {language} not available.")
        return self.model.transcribe(audio_path, word_timestamps=True, language=language)

    @staticmethod
    def generate_srt(result, output_path):
        """
        Generate SRT file from transcription result

        Args:
            result (dict): Transcription result from Whisper
            output_path (str): Path to save the SRT file
        """
        # The writer is created for the target directory and names the file
        # after the base name of the path it is called with, plus ".srt"
        srt_writer = get_writer("srt", os.path.dirname(output_path))
        srt_writer(result, output_path)

    def process_file(self, input_path, output_path=None, language=None):
        """
        Process input file and generate a SRT file

        Args:
            input_path (str): Path to input audio/video file
            output_path (str): Path to output SRT file (optional)
            language (str): Language code or name (optional, auto-detected if None)
        """
        if output_path is None:
            base_name = os.path.splitext(input_path)[0]
            output_path = f"{base_name}.srt"

        is_video = input_path.lower().endswith(('.mp4', '.avi', '.mov', '.mkv'))

        audio_path = input_path
        if is_video:
            print("Extracting audio from video...")
            audio_path = "temp_audio.wav"
            self.extract_audio(input_path, audio_path)
        try:
            print("Transcribing audio...")
            result = self.transcribe(audio_path, language=language)

            print("Generating subtitles...")
            self.generate_srt(result, output_path)

            print(f"Subtitles generated successfully: {output_path}")
        finally:
            # Remove the temporary audio file extracted from the video
            if is_video:
                os.remove(audio_path)

    @staticmethod
    def mux_subtitles(video_path, subtitle_paths, output_path=None):
        """
        Combine a video with one or more SRT files

        Args:
            video_path (str): Path to input video file
            subtitle_paths (list of str): List of paths to input SRT files
            output_path (str): Path to output mkv or mp4 file (optional)
        """
        if output_path is None:
            base_name = os.path.splitext(video_path)[0]
            output_path = f"{base_name}_subtitled.mkv"

        output_ext = os.path.splitext(output_path)[1].lower()
        print(f"Generating {output_ext} file with soft subtitles")

        if output_ext == '.mp4' and len(subtitle_paths) > 1:
            print("Warning: only one subtitle track is written to .mp4 output. The first one will be used.")
            subtitle_paths = [subtitle_paths[0]]

        video_input = ffmpeg.input(video_path)
        video_stream = video_input.video
        audio_stream = video_input.audio

        # Video and audio are copied without re-encoding; only the subtitle codec differs
        if output_ext == '.mp4':
            subtitle_input = ffmpeg.input(subtitle_paths[0])
            output_kwargs = {'c:v': 'copy', 'c:a': 'copy', 'c:s': 'mov_text'}
            output_stream = ffmpeg.output(video_stream, audio_stream, subtitle_input, output_path, **output_kwargs)
        else:
            subtitle_inputs = [ffmpeg.input(s) for s in subtitle_paths]
            output_kwargs = {'c:v': 'copy', 'c:a': 'copy', 'c:s': 'srt'}
            output_stream = ffmpeg.output(video_stream, audio_stream, *subtitle_inputs, output_path, **output_kwargs)

        ffmpeg.run(output_stream, overwrite_output=True, quiet=True)
        print(f"File generated successfully: {output_path}")