Site for looking up frequency of word

In the context of building up a personal vocab deck:

Is there a good site to look up the frequency of a word? I’d like to add common words before uncommon ones…

thank you

4 Likes

Your question sparked my curiosity.

I’m used to seeing kanji frequency lists, but I’ve never searched for vocabulary frequency…

I’ve found these two sites while looking for it:

http://www.offbeatband.com/2010/12/the-most-commonly-used-japanese-words-by-frequency/

https://www.kanshudo.com/collections/routledge/RT-1

There also seems to be a book. I found the ebook PDF through a search engine, but I’m not sure whether the site actually has a license to distribute it. If you are interested, here’s the DuckDuckGo result where I found it:

image

And this seems to be the Amazon listing for it:

If anyone else has better resources to share, please do!

4 Likes

Cool, the text file on that first site will be a useful cross reference. I want to try a different strategy than chucking every unknown word into a deck…
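If you do go the text-file route, here is a rough sketch of how the cross-referencing could work. It assumes the list has one word per line with the most frequent word first, which may not match the actual file, so the parsing would need adjusting. The function names and the sample words are made up for illustration.

```python
from typing import Dict, Iterable, Optional


def build_rank_index(lines: Iterable[str]) -> Dict[str, int]:
    """Map each word to its 1-based rank.

    Assumes one word per line, most frequent first. Blank lines and
    repeated words are skipped so they do not shift the ranks.
    """
    ranks: Dict[str, int] = {}
    for line in lines:
        word = line.strip()
        if word and word not in ranks:
            ranks[word] = len(ranks) + 1
    return ranks


def lookup_rank(ranks: Dict[str, int], word: str) -> Optional[int]:
    """Return the rank of a word, or None if it is not in the list."""
    return ranks.get(word)


# Tiny invented sample standing in for the real list.
sample_lines = ["の", "に", "は", "", "出来る", "に"]
ranks = build_rank_index(sample_lines)
print(lookup_rank(ranks, "出来る"))
print(lookup_rank(ranks, "猫"))
```

With a real file you would pass the open file object instead of the sample list, for example `open("freq.txt", encoding="utf-8")`, where `freq.txt` is a placeholder name.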

Frequency based on what? For the most part, I think the general consensus is that people install Yomichan and then add frequency lists to it to check how common a word is. The most commonly used ones seem to be Anime & JDrama, Netflix, and Innocent Corpus (a frequency list built from, I think, around 5000 novels).

Most people, myself included, use all 3 of those frequency lists just for simplicity, but you could just use Innocent Corpus and be fine. I’m not on my computer so I can’t link the dictionary files to add to Yomichan, but this site has Innocent Corpus.

https://foosoft.net/projects/yomichan/

I may also be wrongly assuming you use yomichan already so if that’s the case lemme know friendo.

6 Likes

Ooo, I use 10ten (formerly Rikaichamp) as my browser translator plugin

1 Like

Never used Rikaichamp, always heard and saw mixed things about it compared to Yomichan. Does it allow you to add custom dictionaries? If not, then you may be slightly out of luck and have to use a text file to see frequency, which kind of blows. Obviously use whatever you want and feel comfortable with, but if frequency is going to be important to you, I think it’d be worth the 5-minute switch to Yomichan, especially if you’re using Anki.

3 Likes

Definitely recommend what BORN2PEEPEE said, swap to Yomichan and install frequency lists into it. It’s so easy to see the frequency information with Yomichan and Yomichan is great at parsing Japanese compared to most other tools, especially phrases/idioms/expressions that would be hard to understand without it. It’s extremely useful to know how common different words are in different domains when you’re deciding what vocabulary to add in Anki. If you focus on picking up high-frequency words you don’t know in one area it really helps you increase the comprehensibility in that area more quickly to make immersion more enjoyable. E.g. there are tons of words common in novels that are not common in other domains.

You can find some popular yomichan frequency lists here:

There are a lot of dictionary packs for Yomichan and guides for how to use it out there too, because it’s used by most of the Japanese immersion learning community.

EDIT: This is an excellent set of Yomichan dictionaries and frequency lists. It also has text files explaining each dictionary:

https://drive.google.com/drive/folders/1tTdLppnqMfVC5otPlX_cs4ixlIgjv_lH
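
On the point about words being common in one domain but not another: if you ever have the ranks outside of Yomichan, a comparison across lists is easy to sketch. This is only an illustration. The domain names, the words, and the rank numbers below are invented, and the dictionaries are not the Yomichan file format.

```python
from typing import Dict, Optional


def ranks_by_domain(
    word: str, domain_ranks: Dict[str, Dict[str, int]]
) -> Dict[str, Optional[int]]:
    """Return the word's rank in every domain list (None where it is missing)."""
    return {domain: ranks.get(word) for domain, ranks in domain_ranks.items()}


def best_domain(word: str, domain_ranks: Dict[str, Dict[str, int]]) -> Optional[str]:
    """Return the domain where the word ranks highest (lowest number), if any."""
    found = {
        domain: rank
        for domain, rank in ranks_by_domain(word, domain_ranks).items()
        if rank is not None
    }
    return min(found, key=found.get) if found else None


# Invented ranks, purely for illustration (lower rank = more common).
domain_ranks = {
    "novels": {"呟く": 900, "やばい": 6000},
    "drama": {"やばい": 300},
}

print(ranks_by_domain("呟く", domain_ranks))
print(best_domain("やばい", domain_ranks))
```

A word that ranks well in the domain you actually read or watch is probably worth a card even if it looks rare elsewhere.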

5 Likes

You should just read a lot. It becomes apparent pretty quickly which words are frequent, because you see them a lot.

1 Like

Yeah, I’m finding that with my reading, but I keep forgetting certain words, so I’m trying to build a deck to help reinforce them. I’m not adding infrequent words to that deck atm, to save a few kb of brain storage
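
Something like the following could do the filtering, as a minimal sketch. It assumes you already have a word-to-rank mapping from some frequency list. The cutoff of 5000 and all the ranks are made-up numbers, and words missing from the list are treated as infrequent, which is a judgment call.

```python
from typing import Dict, List, Tuple


def split_by_frequency(
    candidates: List[str], ranks: Dict[str, int], max_rank: int
) -> Tuple[List[str], List[str]]:
    """Split candidate words into (keep, skip) using a rank cutoff.

    Words ranked at or above the cutoff (rank <= max_rank) are kept.
    Words that are missing from the list are skipped.
    """
    keep: List[str] = []
    skip: List[str] = []
    for word in candidates:
        rank = ranks.get(word)
        if rank is not None and rank <= max_rank:
            keep.append(word)
        else:
            skip.append(word)
    # Put the most common words first so they get added to the deck first.
    keep.sort(key=lambda w: ranks[w])
    return keep, skip


# Invented ranks for illustration only.
ranks = {"出来る": 40, "猫": 1500, "薔薇": 18000}
candidates = ["猫", "薔薇", "出来る", "未知語"]

keep, skip = split_by_frequency(candidates, ranks, max_rank=5000)
print("add to deck:", keep)
print("leave for later:", skip)
```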

1 Like

NINJAL-LWP for TWC (NLT), web corpus
NINJAL-LWP for BCCWJ (NLB), Corpus of Contemporary Written Japanese: https://nlb.ninjal.ac.jp/

NINJAL-LWP for TWC (NLT) is a tool for searching the Tsukuba Web Corpus (TWC), a corpus of approximately 1.1 billion words collected from Japanese-language websites. The search function is based on the use of NINJAL-LWP (NINJAL-LagoWordProfiler), a corpus search system developed jointly by the National Institute for Japanese Language and Linguistics (NINJAL) and Lago Institute of Language. Among tools that utilize the same system is NINJAL-LWP for the BCCWJ (NLB), a tool for searching the 100-million-word Balanced Corpus of Contemporary Written Japanese (BCCWJ), which was developed by NINJAL.

2 Likes

Thanks for sharing! Bookmarked.

Tsukuba University delivers again!

3 Likes

It’s not a website, but the Houhou SRS program (free, made by another learner) tells you:

  • How many times a word appears in almost 8000 books
  • What JLPT & WaniKani level it is, if applicable
  • If it’s in the top 20,000 most used words on Wikipedia

For example, 出来る appears 92,992 times in 7,905 books, making it very common, and is the 34th most used word on Wikipedia.

4 Likes