Recommendations for OCRs to mine vocabulary from YouTube subtitles?

I quite often watch Japanese YouTube videos, which commonly have captions or subtitles written in Japanese at the bottom of the video.

What would be nice is if I could use an OCR to extract the text from the video and then plug the vocabulary it has extracted into an SRS, like Anki, so that I can build up vocabulary in particular topics.

However, I haven’t used any OCR readers so I’m not really sure what is recommended - does anyone have any preferences? I’m not opposed to paid options if they are particularly good.

Currently I’ve found Capture2Text, but no idea if it is any good.

Example of the type of video I’d extract text from - here - subtitles are part of the video at the bottom.


Mpv with some bells and whistles. I’ll have to look exactly at what it is later because I haven’t used it for YouTube videos (I know you can, I did initially test it), but I made cards from movies and it worked fine. It sends the subtitles to a text hooker line by line, where you can Yomichan them into Anki. I was also able to use it to gather example sentences with audio. That was months ago tho, so I’ll have to re-figure it out (⁠๑⁠•⁠﹏⁠•⁠)


I’ve used in the past to export youtube subtitles into to add furigana and to translate.

It’s pretty straight-forward to do but give me a shout if you get stuck.

EDIT: okay, so it’s not an OCR but I’ve used it to achieve what you’re trying to do…

I’ve been using Migaku for my YouTube/Crunchyroll studying. The only caveat is that it’s a paid service and it takes a bit of messing around to get it working with all streaming services. You might have to download a few plug-ins and stuff. Otherwise, I think it’s really intuitive and useful for studying videos/audio. It’s got built-in features to add words and sentences to Anki as well!

  • Kaku, but it’s only for mobile devices, I think.
  • eJoy, not an OCR, but if there are normal subtitles it will be able to recognize them.
Exploring this software may help.

Video → Images → OCR Analysis
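In case it helps, here is a rough Python sketch of that Video → Images → OCR idea. It is only a sketch under a few assumptions: ffmpeg and Tesseract (with the Japanese `jpn` language data) are installed, the `pytesseract` and `Pillow` packages are available, and the subtitles sit in roughly the bottom quarter of the frame. The function names, the 1 frame per second sampling rate, and the 25% strip height are all made up for illustration, so tune them for your video.

```python
import subprocess
from pathlib import Path


def ffmpeg_frame_command(video, out_dir, fps=1.0, strip_fraction=0.25):
    """Build an ffmpeg command that samples frames and keeps only the bottom strip."""
    # crop=w:h:x:y -> full width, bottom `strip_fraction` of the height (assumed subtitle area).
    crop = f"crop=iw:ih*{strip_fraction}:0:ih*{1 - strip_fraction}"
    return ["ffmpeg", "-i", video, "-vf", f"fps={fps},{crop}",
            str(Path(out_dir) / "frame_%05d.png")]


def collapse_repeats(lines):
    """Drop empty results and consecutive duplicates (one subtitle spans many frames)."""
    out = []
    for line in lines:
        # Assumption: the text is Japanese, so any whitespace is OCR noise and can be removed.
        text = "".join(line.split())
        if text and (not out or out[-1] != text):
            out.append(text)
    return out


def extract_subtitles(video, out_dir="frames"):
    """Run the whole pipeline: video -> cropped frames -> OCR -> de-duplicated lines."""
    # Imported here so the helpers above work even without the OCR packages installed.
    import pytesseract
    from PIL import Image

    Path(out_dir).mkdir(exist_ok=True)
    subprocess.run(ffmpeg_frame_command(video, out_dir), check=True)
    frames = sorted(Path(out_dir).glob("frame_*.png"))
    raw = [pytesseract.image_to_string(Image.open(f), lang="jpn") for f in frames]
    return collapse_repeats(raw)


# Example (needs ffmpeg + Tesseract installed; "video.mp4" is a placeholder):
# for line in extract_subtitles("video.mp4"):
#     print(line)
```

Cropping before OCR matters more than it looks: running OCR on the whole frame tends to pick up on-screen clutter, so the narrower the strip you can get away with, the cleaner the output. Expect some misread kanji either way, so it is worth checking each line before it goes into Anki.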

Thanks for the feedback everyone! I’ll give these a look. @IcyIceBear @jrmr50 @Delta386 @Jace @Orock45

@IcyIceBear If you do manage to figure it out let me know! I’ll see if I can work something out with it in the meantime.

Doesn’t YouTube include a transcript of the subtitles? If you click the 3-dot button, there should be a “Show Transcript” button, from which you can copy-paste.

If you mean hardcoded/burned-in subtitles, then that’s a different issue, of course.
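If the transcript route does work for a video, the pasted text usually needs a little cleanup before it is useful for mining. A minimal Python sketch, assuming the paste alternates timestamp lines (like `0:05` or `1:02:03`) with subtitle lines; that layout and the `strip_timestamps` name are my assumptions, so adjust the pattern if your paste looks different:

```python
import re

# Matches a line that is only a timestamp, e.g. "0:05", "12:34" or "1:02:03".
TIMESTAMP = re.compile(r"^\d{1,2}:\d{2}(?::\d{2})?$")


def strip_timestamps(pasted):
    """Return the subtitle lines from a pasted transcript, without timestamps or blanks."""
    lines = []
    for raw in pasted.splitlines():
        line = raw.strip()
        if line and not TIMESTAMP.match(line):
            lines.append(line)
    return lines


pasted = "0:00\nこんにちは\n0:05\n今日は料理をします"
print(strip_timestamps(pasted))
```

The resulting list can then be run through whatever dictionary or Anki workflow you already use.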

Yeah, it was ones the uploader had incorporated into their videos - although that is a good point, I should check to see if they’ve added CCs for the videos, which they may have done. Thanks!

If you are talking about hardcoded videos (i.e. the subtitle is part of the image frame), there are web-based solutions to extract those, e.g.

There are also open-source solutions on GitHub.

Yeah, hardcoded subtitles were what I was trying to extract. I’ll take a look at that, thanks!

I’ve been using ShareX mainly for screenshots/audio capture for mining, but it has OCR too, and from what I’ve used of it, it seems to work pretty well. A guide on how to set it up is near the bottom of the page here.

I use Migaku too. It’s a pain to set up but it is my favorite tool that I have ever used for studying Japanese.