ていない with past events

Agreed that the challenge exists for all languages, though I will say I strongly disagree with the assessment that this difficulty in parsing is a significant cause for the substantial difference in vocabulary acquisition that Japanese requires.

First, because I believe the researchers do their best to keep the counts between languages as analogous as possible.

Second, even with languages like Korean that have similar grammar, there is a massive difference.

And finally, because the number of additional words you could gain from separating out stems and conjugations is finite: once you have て、い、る、ない counted up as different words in the most spread out way possible, no plain form of the simple present or present progressive will add a new word to any verb stem. So maybe this could inflate the figure by a couple hundred at most, but it’s not enough to explain why in Korean, 5000 words gets you just shy of 90% vocabulary coverage whereas in Japanese, 10000 words gets you just over 90% vocabulary coverage.

So yeah, even though I cringed when the author brought this reason up, I learned something new out of it xD

3 Likes

I think the word count is inflated because of the way Japanese uses kanji. They create a lot of single words that would normally be two words or more. Sometimes a full sentence is required to translate one word.

Examples:
尊敬語 - honorific language
酒屋 - shop selling alcohol
垣根越し - over the fence

1 Like

If you have heard of Esperanto, it works in a similar fashion. There are about 2k commonly used roots in Esperanto, but you can combine them to create tens of millions of new words. A very elegant system.

If you are interested in morphology, checking out Esperanto is worthwhile.
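
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes a round figure of exactly 2,000 roots and counts every ordered sequence of roots as a possible word, which is only an upper bound since many combinations would be meaningless. `ROOTS` and `combinations_up_to` are just illustrative names.

```python
ROOTS = 2_000  # assumed round figure for the number of commonly used roots


def combinations_up_to(max_roots: int, roots: int = ROOTS) -> int:
    """Count ordered sequences of 1..max_roots roots (an upper bound, not real words)."""
    return sum(roots ** k for k in range(1, max_roots + 1))


for n in (1, 2, 3):
    print(n, combinations_up_to(n))
```

Under those assumptions, two-root compounds alone already give about 4 million candidates, so a figure in the tens of millions does not look far-fetched once affixes and longer compounds are allowed.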

That’s actually a good point, and it’s why I didn’t give up after learning about the massive disparity. This shows there is potential for abusing the count in the following manner:

語 can be appended to many other words:
日本、日本語
和製、和製語
Etc…

Essentially, the Kanji can create a new word out of many already-counted words. So if that’s how the counting was done, then it is possible that you could get this massive scaling effect from a smaller base of root words. But I think then the cause is Kanji rather than conjugations.

Even still, even if it’s the case that the researchers properly prevented such artificial increases in the word count by grouping together word families (which is a common practice), I have other reasons not to feel helpless and lost.
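
As a rough illustration of what grouping into word families could look like for the 語 case above, here is a minimal Python sketch. It is only a toy: the suffix list and the rule "fold a word into its base if the base is also in the list" are my assumptions, not how any actual study did its counting.

```python
SUFFIXES = ("語",)  # assumed list of productive suffixes; a real study would need many more


def count_families(words, suffixes=SUFFIXES):
    """Count words after folding 'base + suffix' into 'base' when the base is also listed."""
    vocab = set(words)
    families = set()
    for w in vocab:
        base = w
        for s in suffixes:
            stem = w[:-len(s)]
            # Only fold when something is left over and that stem is itself a counted word.
            if w.endswith(s) and stem and stem in vocab:
                base = stem
                break
        families.add(base)
    return len(families)


words = ["日本", "日本語", "和製", "和製語", "英語"]
print(len(words), count_families(words))
```

Here the raw count is 5 but the family count is 3, because 日本語 and 和製語 fold into 日本 and 和製, while 英語 stays on its own since 英 is not in the list.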

The first good news is that while most languages, such as English, require 98% vocabulary coverage for deep understanding and the ability to infer from context, Japanese requires 93-96%, with the most likely range being 94-95%. It’s smaller precisely because the Kanji can make new words transparent.

Even so, there was still a big difference between reaching 95% in Japanese and reaching 98% in other languages, but when I did some of my own analysis, I figured out that once you get past the 5000 most frequent words (as is my case), the vocabulary coverage of Japanese starts to scale at a reasonable pace similar to other languages. That’s what made me not immediately drop Japanese for Korean.
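
For anyone who wants to try this kind of check themselves, here is a minimal Python sketch of how a coverage figure can be computed from a tokenized text. It is only an illustration with made-up toy data, not the analysis mentioned above; the function name `coverage` and the token list are assumptions, and a real run would need a proper corpus and a tokenizer.

```python
from collections import Counter


def coverage(tokens, top_n):
    """Fraction of running tokens accounted for by the top_n most frequent words."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(tokens)


# Toy "corpus": 10 tokens made up of 4 distinct words (invented data).
tokens = ["の"] * 5 + ["は"] * 3 + ["猫"] + ["垣根"]

for n in (1, 2, 4):
    print(n, coverage(tokens, n))
```

Plotting `coverage` against `top_n` for a real frequency list is what shows how quickly (or slowly) the curve climbs after the first few thousand words.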

3 Likes

I personally believe that Kanji is the key to success. I don’t understand how anybody can learn Japanese without learning Kanji first. Without Kanji, かきねごし is just mumbo jumbo you have to somehow memorise. With Kanji, it is quite an easy word. It’s not even necessary to look it up in a dictionary if you know the Kanji.

3 Likes

I actually got halfway through typing this when I gave up and decided to just continue studying, because I like these types of conversations too much and end up wasting hours in them haha.

I was going to say that ‘feasibly’, a native speaker could slap any combination of kanji together and create a perfectly logical new word thanks to kanji. It’s why I have given up learning new words. Not because I don’t think I can, but because I am at a point with kanji where the meaning of new words is glaringly obvious most of the time due to kanji. I would say Japanese would have to be the easiest language to acquire new vocabulary in at a high level. Stuff just starts making sense without needing to look new things up.

3 Likes

If I remember correctly, you had been adding over 20 words per day over a long enough period of time that you would be over the minimum to understand from context, which is awesome! I aspire to reach that level too.

2 Likes

Something like that. My decks of words are at about 18k words all up. But the last few of those decks (maybe 3k words) I never bothered actually reviewing because I still recognize them when I see them anyway.
I would only ever fiercely review those decks if I wanted to output as well as I can read. But for me that isn’t super important at the moment, and I am happy to let my active vocabulary grow slower than my passive vocabulary.

2 Likes

Thanks for your invaluable information and how it corresponds to your experience; it actually lines up beautifully with my goals based on my research :smiley:

If I keep up what I’m doing, I’ll reach my goal of being able to learn mostly through unguided (scarce or no use of dictionaries / no teachers / no technology) immersion around early 2022, so that’s exciting. Anyway, I won’t waste any more time discussing methodology and get to studying haha

4 Likes

Keep at it, you’ll get there for sure :+1:

4 Likes

:nerd_face: :bunprogold: :slightly_smiling_face:

Great to have a community for this stuff!

What occurred to me is that ‘involved’ is not just past tense; it can be continuous. As in “we are currently involved in a discussion about していない” or “So and so is involved with criminal activities”… which doesn’t mean they stopped any devious behavior at all.

This probably happens in English a lot more than we are aware of, without a second thought, but we are hypersensitive in language learning :smile: (sort of why we need full context to vet these grammar scenarios).

1 Like

Correct, because the で particle is just the so-called “て-form” of だ :wink:

4 Likes

Nice observation! I remember reading something like that before, maybe from an old thread you posted on. But for me, that makes it sound like で is the same as だ in meaning and just different in form for syntactic reasons. Maybe I’m misinterpreting though?

1 Like