Vocabulary Roadmap

Now that Decks and Vocab are out of Beta (:tada::partying_face::tada:), is there a tentative timeline for when Vocab decks will be complete? In particular, I’m curious about Cloze support for N2 and N1, as well as the A/E decks.

I’m currently mapping out a study plan for the December 2024 N1, and have been a huge fan of Bunpro’s Vocab system so far. Would love to be able to use it leading up to the test instead of alternatives like Anki!

P.S. No pressure, mostly just looking for a rough date so I can make a decision on what study materials I should use from now until December :relaxed:

9 Likes

Not sure about Bunpro, but my vocabulary roadmap is 30 minutes studying, 10 minutes review and questioning why I don’t remember this kanji, followed by 20 minutes of crying. Repeat until I learn 5 words in a day.

14 Likes

Such a mood. Mine is coming across the same word 10 different times in reading and being like, “huh what does this mean?” only to realize I’ve already added it to 10 other vocabulary lists in the past 🥲 Need that SRS to make it stick.

9 Likes

Damn I need to add some crying to my daily routine too! :laughing:

My approach is to blatantly ignore the massive brain-leakage of vocabulary I get from not studying.
Then realize how much I’ve forgotten once every ~3 months and have a spurt of intense studying.

11 Likes

Bumping for visibility

3 Likes

Sorry for the late reply here! For some reason I completely missed this thread! :woozy_face:

For cloze N2, I am currently going through translations as fast as I can (about 100 words a day) and making sure all of them are accurate and use the correct vocabulary. After I finish this, we just need to set up the study questions, which should only take a few days. Give us a few weeks max and N2 should be available.

For N1, I want to get this out as soon as possible, but there are lots of other jobs that also have high priority. Hopefully I will be able to get someone else on the team to help me with the checking of the English translations for those so that we can get N1 cloze active within 3 months at the latest. At this point, N2 and N1 will be available, but there will still be things that we continue to work on adding in a timely manner like the hints in both English and Japanese.

After that I assume each additional list will come every 2 months ideally.

Glad to hear you’re finding use in it so far! It is still a long way from the polished gem that it should be, but I am also excited for where it has the potential to go!

12 Likes

Personally I don’t mind waiting if it means getting more accurate translations. At least from my part, you have my permission to take longer if needed :stuck_out_tongue_closed_eyes:

4 Likes

You have seen what some of them are like even after we ‘fix’ them. Some of the mistakes that we fix straight out of machine translation are godawful. There’s really a limit to the amount we can fix in a reasonable amount of time though. After a certain point you would just need to completely rewrite the sentences if you wanted to catch every single little detail in the Japanese. For a human that is no problem at all, it’d just be a job that would take a ridiculously long time to achieve across the platform.

Once the vocab is in full swing and everything is available for users even past N1, I think it would be good for us to allow user suggested translations where all we have to do on our side is press ‘accept’ or something and it automatically uses the suggested translation. A bulk effort like that from a lot of users would potentially really increase the quality of the English.

Kinda sounds lazy on our part, but realistically there is only so much a very limited number of people can do when there are also 10 other things to do at the same time.

12 Likes

Not lazy at all. I completely understand, and I actually think it’s a GREAT idea. I haven’t been providing suggestions for that long, but imagine if the whole community was involved? That would speed up the process by a great amount. Please set this up. I’m down!

5 Likes

Awesome, thanks for the update and sorry for the slow response on my end as well!

Out of curiosity, have you thought of potentially doing batched releases for the N1 translations and then adding Cloze support after? You mentioned the possibility of user suggestions (and I saw the other thread where there was discussion of having a Validator Rank). I’m wondering if you could potentially trial that with a small batch of N1 translations? It would get things into the hands of users soon, and serve as a bit of a beta test for the Validator Rank.

Not sure if that would add more work or less on your end though in terms of getting that set up/not sure how many Validators you would have at the N1 level, but throwing it out there if it could potentially reduce some of the workload on you guys!

1 Like

We are actually considering amongst the team atm the possibility of releasing all vocab content with example sentences already written in Japanese (up to additional list 4) asap.

What we had in mind was releasing them all with their ChatGPT translations and something like GPT訳 written before the translation so that the user knows that it has not been checked by staff yet. There are a few positives and a few negatives with this.

Positives -

  • A huge amount of vocab even past N1 will become available straight away to study as cloze.
  • The vast majority of the translations will be ‘good enough’ that it really wouldn’t bother most people.
  • We can enable something similar to the validator rank far sooner to work toward getting Bunpro more and more polished faster.

Negatives -

  • The biggest problem with the GPT translations is not that they are wrong, it’s that they don’t use the target word accurately, which often makes the Japanese word hard to guess (I fix this problem more than any other when checking sentences). Let’s say the target is 確信, GPT might translate it as ‘assurance’. In the right sentence, that translation would be totally acceptable, but it’s not one of the standard translations of 確信, so we either have to change the translation or add ‘assurance’ to our list of dictionary definitions for 確信. Repeat this for several possible alternatives for every word :upside_down_face:.
  • There’d be tons of sentences without highlighting on the target word in English, due to the problem above of GPT not actually using one of the standard translations.
  • There would be a lot of outright wrong translations. Due to the lack of subject in Japanese, GPT often makes mistakes about who is saying/doing what to whom.

Atm the best way we can combat this is by releasing a version of cloze where, in addition to the translation, the user can also see the dictionary definitions of the word they’re trying to remember. Using 確信 from above as the example, under the translation of the sentence in English, you’d also see ‘conviction, belief, confidence’. This would let the user know straight away that ChatGPT was liberal with the translation, and the word they’re actually trying to figure out is the one that has the definition of ‘conviction, belief, confidence’. Personally I don’t actually see this as a bad thing, cause the user can still get the benefit of reading the Japanese sentence and the definition will give them enough info to go off to guess correctly in most cases. After that it would just be a matter of us, hopefully with the help of users as well, knocking out bad translations as quickly as possible from the ones that haven’t been checked yet.
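For anyone curious how that fallback could hang together, here is a rough Python sketch. It is not Bunpro's actual code: the function names, the dictionary layout, and the example sentences are all made up for illustration. It looks for one of the standard definitions in the English translation to highlight, and always carries the definition list along so the user still has something to go off when the translation was liberal.

```python
import re


def find_highlight(translation, definitions):
    """Return the (start, end) span of the first standard definition
    found in the translation, or None if none of them appear."""
    for definition in definitions:
        pattern = r"\b" + re.escape(definition) + r"\b"
        match = re.search(pattern, translation, flags=re.IGNORECASE)
        if match:
            return match.span()
    return None


def build_cloze_card(translation, definitions, checked_by_staff):
    """Assemble what a cloze card might display (hypothetical layout)."""
    return {
        # Unchecked machine translations get a marker, as proposed above.
        "label": "" if checked_by_staff else "GPT訳",
        "translation": translation,
        # None means no standard definition was found to highlight.
        "highlight": find_highlight(translation, definitions),
        # Always shown under the translation as a hint.
        "definitions": ", ".join(definitions),
    }


definitions = ["conviction", "belief", "confidence"]

# Liberal translation: 'assurance' is not a standard definition of 確信.
liberal = build_cloze_card(
    "I have full assurance that he will win.", definitions, checked_by_staff=False
)

# Checked translation that uses a standard definition.
checked = build_cloze_card(
    "I have a firm belief that he will win.", definitions, checked_by_staff=True
)
```

In the first card nothing gets highlighted, so the definition list does all the work. In the second, the span points at 'belief' and the definitions are just a bonus.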

3 Likes

I think this is a good temporary solution because I encounter sentences where I’m like “I don’t remember learning that word” too frequently. Or, like you said, I can see how the word can be interpreted differently, but it doesn’t help me answer the question correctly, which is a huge time consuming prospect when you’re reviewing over 100 items at a time. I try to report these instances but, honestly, I’m spending so much time trying to figure out some of these words that I considered giving up doing Cloze for vocabulary decks.

That said, the implementation of this system might make me keep learning vocabulary through Cloze, since I do enjoy the reading aspect of it. However, let’s get that validation system going so that we don’t have to rely on having the definition of words as hints (except maybe in cases where there are multiple words with the same meaning, of course).

Thanks so much for your hard work. I truly appreciate it!!

3 Likes

I’m in the same boat as @Jose7822 in terms of Cloze. I’ve also noticed that my vocab accuracy is much lower than my grammar, mostly due to synonyms and different interpretations/translations of a word. I really like using Cloze as it provides a lot more context than just a simple definition and translation, and I think there is value in having translations that are slightly different from the definition as it primes you to think more holistically about the meaning/usage of a word, but if you’re doing a huge chunk of reviews it can be rough playing the guessing game.

I’d definitely be down to help out with vetting sentences if they get released early, and I like the idea of having some kind of marker so users are aware that the material is in progress.

When I was going through N5-N3 vocab, my usual workflow was:
Click Learn → Review each word, its definition, and pitch accent → Read out loud each context sentence and listen to audio → Do Review test

During that process, if I see anything funky I’ll usually report it. Mind you, sometimes I get lazy or don’t think it’s much of an issue so I won’t report it until it comes up in reviews again. But either way, if others follow a similar process, I could imagine most of the sentences could get reviewed by users pretty quickly and a lot of the more egregious issues could be reported.

On another topic, one feature I found in Torii that I quite liked is that even though the quiz/review types are like Bunpro’s Translate / Manual Answer, after you successfully answer the question an example sentence will pop up beneath the word and audio will play. I wonder if Bunpro added something like that if people would get more context out of Translate without necessarily losing the value of Cloze? (Mind you, most of the example sentences on Torii are awful).

TLDR: I’m very much okay with the proposed solution with ChatGPT + definitions being temporarily appended.

3 Likes

Thanks a lot for both of your input on this guys! I really want the vocab tool to be as good as it can be as soon as possible, and I 100% see where you’re coming from. A problem with cloze is as you both mentioned, depending on the sentence, there are many correct interpretations, and the more vocab you know, the more synonyms become a possibility. It’s like grammar ‘synonym hell’ times 100.

I think for the short term, having the definition show will be one of the best solutions, as you can usually guess what is needed if you get the full list of possible translations. I will see if I can organize something similar to the special report (validator) system specifically for bad translations etc.

This is a pretty great idea! I will see what I can do about making this or something similar a feature. From my perspective, the biggest plus that the Bunpro vocab has is that there are a good handful of really well thought out context sentences for all of them, so the more modes that expose people to these sentences and encourage them to read, the better. For the first year or so vocab was out, we used to run across instances very very often of users that didn’t even know there were example sentences at all because they were tucked away in another tab. I cry :sweat_smile:.

Perfect world for the vocab is where you’re reading the sentences, the sentences are at your rough current level, and you get so caught up in enjoying the reading that you forget you’re reviewing. This is the way :raised_hands:

Edit - Feel free to let me know any other ideas either of you think would help the vocab!

4 Likes

Personally, I’d be perfectly happy with having hints, like we do for grammar points, as well as accurate translations of every sentence. The rest is icing on the cake, as the saying goes.

3 Likes