Scene Detection
Chopping videos at fixed 5-second intervals is lazy and it shows. Scene detection finds where the camera actually cut (or where the visual content changed enough to matter) and splits there instead.
How it works
Uses PySceneDetect under the hood. It analyzes frame-to-frame differences in color and luminance to identify scene boundaries. When the difference exceeds the threshold, that's a cut point.
Configuration
analysis:
use_scene_detection: true # enabled by default
scene_threshold: 27.0 # default sensitivity
min_segment_duration: 2.0 # seconds: drop anything shorter
max_segment_duration: 15.0 # seconds: subdivide anything longer
Threshold tuning
The scene_threshold controls sensitivity:
- Lower values (e.g., 20): more sensitive, detects subtle lighting changes. Can over-segment.
- Default (27.0): works well for typical home videos.
- Higher values (e.g., 35): only detects hard cuts. Misses gradual transitions.
Duration constraints
After scene detection splits the video:
- Segments shorter than
min_segment_duration(2.0s default) are discarded. These are usually flash frames or detection artifacts. - Segments longer than
max_segment_duration(15.0s default) are subdivided into smaller chunks. A 30-second continuous shot becomes two or three segments.
When scene detection is off
If you set use_scene_detection: false, the pipeline falls back to fixed-interval splitting. This is faster but produces worse results: you'll get cuts mid-sentence and mid-action.
What happens next
After scene detection, each segment goes through interest scoring to decide which ones make it into the final video. Scene detection just finds the boundaries; scoring decides the quality.