

To convert .idx (and .sub) files to actual .srt subtitles, we can use VobSub2SRT. Unfortunately, compilation is broken on current Fedora installations[1] and with current compilers[2], but a few workarounds exist.

Build from the autotools branch:

sudo dnf install cmake libtiff-devel tesseract-devel

git clone https://github.com/ruediger/VobSub2SRT.git
cd VobSub2SRT
git checkout autotools
./configure CXXFLAGS=-std=gnu++11

Or build from a fork that supports different (better) Tesseract data files:

git clone https://github.com/bubonic/VobSub2SRT.git
cd VobSub2SRT


Get the Tesseract data files:

curl -L https://github.com/tesseract-ocr/tessdata/archive/4.0.0.tar.gz | tar -xvzf -

Usage (omit the .idx and .sub extensions and pass only the base name):

vobsub2srt --tesseract-data ../tessdata --tesseract-oem 1 "This Movie"

Once finished, we should have:

$ ls -goh
-r--------. 1  52K Jun 17 11:41 'This Movie.idx'
-r--------. 1 4.2M Jun 17 11:41 'This Movie.sub'
-rw-r-----. 1  80K Jun 17 12:24 'This Movie.srt'
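
To convert several titles in one go, the usage above can be wrapped in a small loop. The following is only a sketch: the function names basename_noext and convert_all and the TESSDATA variable are made up for illustration, and the default ../tessdata path is simply the one used in the example above. The vobsub2srt options are the same ones shown in the usage line.

```shell
#!/bin/sh
# Sketch only: helper names and the TESSDATA variable are hypothetical.

# Strip the .idx extension, since vobsub2srt expects the bare base name.
basename_noext() {
    printf '%s\n' "${1%.idx}"
}

# Convert every .idx/.sub pair in a directory (default: current directory).
convert_all() {
    dir=${1:-.}
    tessdata=${TESSDATA:-../tessdata}   # assumed location of the data files
    for idx in "$dir"/*.idx; do
        [ -e "$idx" ] || continue       # the glob matched nothing
        base=$(basename_noext "$idx")
        [ -e "$base.sub" ] || continue  # skip if the .sub half is missing
        [ -e "$base.srt" ] && continue  # skip titles that are already converted
        vobsub2srt --tesseract-data "$tessdata" --tesseract-oem 1 "$base"
    done
}
```

After sourcing the snippet, running convert_all without arguments processes the current directory, and convert_all /path/to/rips processes another one. Existing .srt files are left alone, so the loop can be re-run after an interruption.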