Tagging and Managing Video Scenes with wd14-tagger and PySceneDetect

2024/Oct/20

This guide demonstrates how to combine two tools, wd14-tagger and PySceneDetect, to automatically tag and manage video scenes. wd14-tagger tags still images, while PySceneDetect breaks a video down into scenes and saves thumbnails for each one. Used together, they let you assign relevant tags to each scene, which makes large video files easier to organize and search.

Here are the sample video sources:

$ tree sources/
sources/
├── source-1.mkv
├── source-2.mkv
├── source-3.mkv
├── source-4.mkv
├── source-5.mkv
├── source-6.mkv
└── source-7.mkv

1 directory, 7 files

Scene Detection with PySceneDetect

PySceneDetect is a tool designed to detect scene transitions in a video. By analyzing content shifts between frames, it splits the video into distinct segments, which makes it practical to manage video files scene by scene.

Installation

First, install PySceneDetect along with its dependencies:

pip install opencv-python numpy Click tqdm appdirs
pip install scenedetect

Splitting the Video

To detect scene transitions and split the video accordingly, run the following command:

cd sources
for f in *.mkv; do scenedetect -i "$f" list-scenes save-images split-video; done

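For each video file, this command detects the scenes, writes a scene list (a CSV file), saves thumbnails for every scene, and splits the video into one file per scene. Note that split-video relies on an external tool (ffmpeg by default, or mkvmerge), which must be installed separately.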
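The shell loop above can also be driven from Python, which makes it easier to record which files failed. The following is a minimal sketch, assuming scenedetect is available on your PATH. The function names build_scenedetect_command and process_directory are illustrative and not part of PySceneDetect.

```python
import subprocess
from pathlib import Path


def build_scenedetect_command(video_name: str) -> list[str]:
    """Return the argv for the same scenedetect invocation used in the loop above."""
    return [
        "scenedetect", "-i", video_name,
        "list-scenes", "save-images", "split-video",
    ]


def process_directory(source_dir: Path, pattern: str = "*.mkv") -> list[Path]:
    """Run scenedetect on every matching file and return the files that failed."""
    failed = []
    for video in sorted(source_dir.glob(pattern)):
        # cwd=source_dir mirrors `cd sources`, so output lands next to the inputs.
        result = subprocess.run(build_scenedetect_command(video.name), cwd=source_dir)
        if result.returncode != 0:
            failed.append(video)
    return failed
```

Calling process_directory(Path("sources")) would process the seven sample files in name order and return an empty list if every run exits successfully.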

By default, save-images generates three thumbnails for each detected scene: one from the beginning (*-01.jpg), one from the middle (*-02.jpg), and one from the end (*-03.jpg). The middle frame is often the most representative of the scene’s overall content, so we will use the *-02.jpg thumbnail when applying tags in the next step.
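To make the naming scheme concrete, here is a small sketch that parses thumbnail file names and keeps only the middle image of each scene. It assumes the name layout <video>-Scene-<scene number>-<image number>.jpg, which matches the *-02.jpg files described above. The names Thumbnail, parse_thumbnail, and middle_thumbnails are hypothetical helpers introduced for illustration.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Assumed layout: <video>-Scene-<scene number>-<image number>.jpg
THUMBNAIL_RE = re.compile(r"^(?P<video>.+)-Scene-(?P<scene>\d+)-(?P<image>\d+)\.jpg$")


class Thumbnail(NamedTuple):
    video: str
    scene: int
    image: int


def parse_thumbnail(filename: str) -> Optional[Thumbnail]:
    """Split a thumbnail name into its parts, or return None if it does not match."""
    match = THUMBNAIL_RE.match(filename)
    if match is None:
        return None
    return Thumbnail(match["video"], int(match["scene"]), int(match["image"]))


def middle_thumbnails(filenames: Iterable[str]) -> list[str]:
    """Keep only the second image of each scene (the *-02.jpg files), sorted by name."""
    selected = []
    for name in filenames:
        thumb = parse_thumbnail(name)
        if thumb is not None and thumb.image == 2:
            selected.append(name)
    return sorted(selected)
```

Parsing the name instead of matching on the "-02.jpg" suffix alone also gives you the video name and scene number, which are useful later when mapping tags back to scene files.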

Tagging Scene Images with wd14-tagger

Once your videos are divided into scenes, we can use wd14-tagger to automatically tag the keyframe images for each scene. This makes it easier to organize and search the scenes based on content.

wd14-tagger is an image tagging tool that can label images with relevant tags (e.g., “indoor,” “nature,” etc.), making image management more efficient.

Installation

To get started with wd14-tagger, follow these steps:

git clone git@github.com:corkborg/wd14-tagger-standalone.git
cd wd14-tagger-standalone
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Running the Tagger

After setting up the environment, use wd14-tagger to tag the images generated by PySceneDetect. We will specifically target the middle thumbnails (*-02.jpg) for tagging, as they best represent the scene’s content:

mkdir /path/to/sources/tags
cp /path/to/sources/*-02.jpg /path/to/sources/tags/
python run.py --dir /path/to/sources/tags/ --ext .txt

This will create a .txt file for each image, listing all the tags associated with that scene, helping you manage and categorize scenes based on visual content.
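Once the .txt files exist, they can be loaded into a simple lookup table. The sketch below builds an inverted index from tag to tag-file names. It assumes each .txt file holds a single comma-separated list of tags, so check one of your files first. The names read_tags and build_tag_index are illustrative.

```python
from collections import defaultdict
from pathlib import Path


def read_tags(tag_file: Path) -> list[str]:
    """Split a tag file into individual tags (assumes comma-separated text)."""
    text = tag_file.read_text(encoding="utf-8")
    return [tag.strip() for tag in text.split(",") if tag.strip()]


def build_tag_index(tags_dir: Path) -> dict[str, list[str]]:
    """Map each tag to the names of the tag files that contain it."""
    index = defaultdict(list)
    # Iterating in sorted order keeps every list in the index sorted by file name.
    for tag_file in sorted(tags_dir.glob("*-02.txt")):
        for tag in set(read_tags(tag_file)):
            index[tag].append(tag_file.name)
    return dict(index)
```

With the index in memory, listing every scene for a tag is a dictionary lookup, and len(index[tag]) gives a quick count of how common each tag is across your sources.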

Searching for Scenes by Tags

With the scenes tagged, you can now search for specific scenes using grep. This makes it easy to find scenes with certain tags without having to manually watch through the video.

For example, to search for scenes tagged with a particular keyword:

grep -rl 'query' /path/to/sources/tags/*-02.txt | sed 's|tags/||' | sed -E 's/-Scene-([0-9]+)-[0-9]+\.txt/-Scene-\1.mkv/' | sort

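Replace 'query' with the tag you’re looking for. The command prints a sorted list of the scene video files whose tag files match. Keep in mind that grep matches substrings, so 'tree' also matches a tag such as 'christmas tree', and that the sed expression assumes the split scenes are .mkv files; adjust the extension if yours use a different container.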
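If you need exact tag matches instead of grep’s substring matching, the same lookup can be sketched in Python. This version assumes comma-separated tag files, the <video>-Scene-<number>-<image>.txt naming targeted by the sed expression, and scene clips stored one directory above the tags directory. The names clip_name and find_clips, and the extension parameter, are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Same pattern the sed expression targets: <video>-Scene-<number>-<image>.txt
TAG_FILE_RE = re.compile(r"^(?P<clip>.+-Scene-\d+)-\d+\.txt$")


def clip_name(tag_file_name: str, extension: str = ".mkv") -> Optional[str]:
    """Turn 'x-Scene-001-02.txt' into 'x-Scene-001.mkv', or None if it does not match."""
    match = TAG_FILE_RE.match(tag_file_name)
    return match["clip"] + extension if match else None


def find_clips(tags_dir: Path, query: str, extension: str = ".mkv") -> list[Path]:
    """Return the scene clips whose tag file contains `query` as a whole tag."""
    clips = []
    for tag_file in tags_dir.glob("*-02.txt"):
        text = tag_file.read_text(encoding="utf-8")
        tags = {tag.strip() for tag in text.split(",")}
        name = clip_name(tag_file.name, extension)
        if query in tags and name is not None:
            # Clips are assumed to live one level above the tags directory.
            clips.append(tags_dir.parent / name)
    return sorted(clips)
```

For example, find_clips(Path("/path/to/sources/tags"), "tree") would skip a scene tagged only with "christmas tree", which the grep pipeline would include.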
