What does this "common" icon in dictionary definition represent?



  1. What does this “common” represent?
  2. What are the possible values?

I don’t know this for a fact, but my guess is that “common” is being pulled from the JMdict/EDICT project, which is also where sites like jisho.org get their data (they display a common tag too).

If that’s the case, then I think the possible values are just binary: common or not common.

The documentation I could find gives this description (they seem to label these entries as “priority”/[P] instead of common):

“Priority” entry, i.e. among approx. 20,000 words deemed to be common in Japanese

Elsewhere in the documentation they give a bit more of a detailed breakdown:

  • news1/2: appears in the “wordfreq” file compiled by Alexandre Girardi from the Mainichi Shimbun. (See the ftp archive for a copy.) Words in the first 12,000 in that file are marked “news1” and words in the second 12,000 are marked “news2”.
  • ichi1/2: appears in the “Ichimango goi bunruishuu”, Senmon Kyouiku Publishing, Tokyo, 1998. (The entries marked “ichi2” were demoted from ichi1 because they were observed to have low frequencies in the WWW and newspapers.)
  • spec1 and spec2: a small number of words use this marker when they are detected as being common, but are not included in other lists.
  • gai1/2: common loanwords, also based on the wordfreq file.
  • nfxx: this is an indicator of frequency-of-use ranking in the wordfreq file. “xx” is the number of the set of 500 words in which the entry can be found, with “01” assigned to the first 500, “02” to the second, and so on.

Entries with news1, ichi1, spec1/2 and gai1 values are marked with a “(P)” in the EDICT and EDICT2 files.

While the priority markings accurately reflect the status of entries with regard to the various sources, they must be seen as only providing a crude indication of how common a word or expression actually is in Japanese. The “(P)” markings in the EDICT and EDICT2 files appear to identify a useful subset of “common” words, but there are clearly some marked entries which are not very common, and there are clearly unmarked entries which are in common use, particularly in the spoken language.
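
To make the quoted rules a bit more concrete, here is a rough Python sketch of how the “(P)”/common flag and the nfxx ranking could be derived from an entry's priority tags. The function names `is_common` and `nf_rank_range` are made up for this example, and it assumes the tags are already available as a list of plain strings. I don't know how any particular dictionary site actually computes its “common” label, so treat this as an illustration of the documentation only.

```python
import re

# Tags that, per the documentation quoted above, earn an entry the "(P)" marker.
PRIORITY_TAGS = {"news1", "ichi1", "spec1", "spec2", "gai1"}

# nfxx tags: "nf" followed by a two-digit set number.
NF_PATTERN = re.compile(r"nf(\d{2})")


def is_common(tags):
    """Return True if any tag is one of the "(P)" priority tags (hypothetical helper)."""
    return any(tag in PRIORITY_TAGS for tag in tags)


def nf_rank_range(tag):
    """Map an nfxx tag to its (first, last) rank in the wordfreq file.

    Each set covers 500 words: nf01 is ranks 1-500, nf02 is 501-1000, and so on.
    Returns None if the tag is not a valid nfxx tag.
    """
    match = NF_PATTERN.fullmatch(tag)
    if match is None:
        return None
    bucket = int(match.group(1))
    if bucket < 1:
        return None
    return ((bucket - 1) * 500 + 1, bucket * 500)
```

For example, an entry tagged only `news2` and `nf30` would not get the common flag under this reading, while one tagged `ichi1` would.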



Thanks! I was just surprised to see it and wanted more information.
It appeared recently.
