Simulation of number of reviews per day over time

JCalandr · August 14, 2021, 11:14pm

Hi everyone,
Like some of you, I don’t really want to do 200+ reviews per day. Since the number of grammar points is limited, I wanted to estimate the number of reviews I will have over time if I add 1, 2, … 5 new grammar points per day.

I made a quick (and dirty) script to simulate this behavior. I just made the assumption that I do 1 session per day, not every 4 hours. I can change that in the future if you really want me to.

The script is uploaded on Google Colab, so you can change the min/max number of grammar points in the plot, or your personal accuracy. I just put all the variables at the top of the script so you don’t have to learn how to use Python. Just change the numbers and press all the arrows in order to “launch” the cells.

Here are a few plots for different accuracies. Hope you enjoy them as much as I did.

So, if I have an accuracy of 0.9 and add 2 new grammar points per day, I’m going to converge to roughly 20 reviews a day, and be done in about 450 days (that is where the number of reviews drops). I can probably manage 20 reviews a day for a year and a half ^^
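
For readers who want to see the idea in code, here is a minimal sketch of this kind of simulation. It is not the Colab script itself. The interval table, the deck size of 900 points, and the rule that a wrong answer drops an item by one SRS level are all assumptions made for illustration; sub-day intervals are rounded up to one day to match the "1 session per day" assumption above.

```python
import random

# Assumed waiting time (in days) before the review at each SRS level.
# Intervals shorter than a day are rounded up to 1 (one session per day).
INTERVALS_DAYS = [1, 1, 1, 2, 4, 8, 14, 30, 60, 120, 180]
TOTAL_POINTS = 900  # hypothetical deck size, not an official figure


def simulate(new_per_day, accuracy, days, total_points=TOTAL_POINTS, seed=0):
    """Return the number of reviews done on each simulated day."""
    rng = random.Random(seed)
    due = {}      # day -> list of SRS levels of the items due on that day
    reviews = []
    added = 0
    for day in range(days):
        todays = due.pop(day, [])
        reviews.append(len(todays))
        for level in todays:
            if rng.random() < accuracy:
                level += 1                  # correct answer: move up one level
            else:
                level = max(level - 1, 0)   # assumed penalty: drop one level
            if level < len(INTERVALS_DAYS):  # past the last level: retired
                due.setdefault(day + INTERVALS_DAYS[level], []).append(level)
        # Learn today's new grammar points until the deck is exhausted.
        n_new = min(new_per_day, total_points - added)
        added += n_new
        for _ in range(n_new):
            due.setdefault(day + INTERVALS_DAYS[0], []).append(0)
    return reviews


reviews = simulate(new_per_day=2, accuracy=0.9, days=600)
print(max(reviews), sum(reviews))
```

Changing `accuracy` and `new_per_day` in the last lines reproduces the kind of comparison shown in the plots, although the exact numbers depend on the assumed intervals.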

Accuracy of 80% :

Accuracy of 95% :

Accuracy of 98% :

Accuracy of 99% :

Accuracy of 100% :

JCalandr · August 14, 2021, 11:11pm

Sorry for the double post, forgot the 90%. Here it is!

andrewkfiedler · August 15, 2021, 12:04am

Cool stuff. I tend to like doing around 30 reviews or so, which tracks pretty well with what I did (around 3 new items a day). Now that I hit N1 though, it’s gotten a bit tougher (with some stuff from N2 I struggled on sticking around), so I dropped to 2 new items a day.

Superpnut · August 15, 2021, 1:03am

Alright now if you just do a simulation with a 20% accuracy I’ll know my future reviews per day

Mapletree · August 15, 2021, 6:11am

Thank you for doing this, I was wondering about this the entire time.

Unfortunately, I think my recall rate is below 80%, but already the 80% simulation gives quite a nice insight into how the number of reviews will grow…

ReiLossefalme · August 15, 2021, 7:17am

Nice graphs and a good thought. Another thing to consider, though likely much harder to code into your graph, is that the more points you add, the lower your accuracy is likely to be. Adding 1-2 points a day may net you that 90% accuracy, but 5 points a day could drop you to 50% accuracy. And as things get more difficult, you may do worse over time. Finish N5 at a 90% average, but N4 at 75%, and N2 at 50%, or whatnot.

JCalandr · August 15, 2021, 7:50pm

@Superpnut
There you are

I smoothed these results more than the previous ones, because decreasing the accuracy increases the randomness, and therefore the oscillations.

I just averaged the plot over a 10-day window.
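
A sliding-window average like the one described here could be sketched as follows. The post only says the plot is averaged over 10 days, so the choice of a trailing window (each day averaged with the days before it) is an assumption, and `sliding_average` is a hypothetical helper name.

```python
def sliding_average(values, window=10):
    """Trailing moving average; the first days use a shorter window."""
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the day that left the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


daily_reviews = [18, 25, 14, 30, 22, 19, 27, 16, 24, 21, 20, 23]
print(sliding_average(daily_reviews, window=10))
```

A wider window gives a smoother curve but reacts more slowly to real changes, such as the drop when the deck is finished.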

JCalandr · August 15, 2021, 7:48pm

Thank you very much for all the insightful comments. I’m still on my N5, not there yet, but I thought about that. I will probably do a new version that has at least a different accuracy for each JLPT level.

Another idea (if I have the time + motivation) could be to do the opposite: choosing a number of reviews per day, and calculating the number of new grammar points per day. Because more grammar = more reviews, it should theoretically slowly decrease the number of new grammar points per day (for example something like 5 per day for the first 50 days, then 4 new per day for 100 days, 3 per day for 150 days …) while keeping the threshold of reviews/day.
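
One possible way to sketch this "opposite" simulation is a greedy rule: each day, new grammar points only fill the room that the day's reviews left under the target. This rule, the interval table, the deck size, and the name `simulate_with_target` are all assumptions for illustration, and the rule does not guarantee that later days stay under the target.

```python
import random

# Assumed waiting time (in days) before the review at each SRS level.
INTERVALS_DAYS = [1, 1, 1, 2, 4, 8, 14, 30, 60, 120, 180]


def simulate_with_target(target_reviews, max_new, accuracy, days,
                         total_points=900, seed=0):
    """Return (reviews_per_day, new_points_per_day) under a greedy cap."""
    rng = random.Random(seed)
    due = {}  # day -> list of SRS levels of the items due on that day
    reviews, new_points = [], []
    added = 0
    for day in range(days):
        todays = due.pop(day, [])
        reviews.append(len(todays))
        for level in todays:
            level = level + 1 if rng.random() < accuracy else max(level - 1, 0)
            if level < len(INTERVALS_DAYS):
                due.setdefault(day + INTERVALS_DAYS[level], []).append(level)
        # Greedy rule (an assumption): only use the room left under the target.
        room = max(target_reviews - len(todays), 0)
        n_new = min(room, max_new, total_points - added)
        added += n_new
        new_points.append(n_new)
        for _ in range(n_new):
            due.setdefault(day + INTERVALS_DAYS[0], []).append(0)
    return reviews, new_points


reviews, new_points = simulate_with_target(20, 5, 0.9, days=365)
print(sum(new_points), max(reviews))
```

Plotting `new_points` over time would show how the pace of new grammar changes as older items keep coming back for review.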

Wasabi · August 15, 2021, 8:18pm

These are some nice insightful graphs!

I was doing 2 new lessons for the past few days and looking at your graphs 2-3 new lessons sound like a doable amount of reviews. Though I will probably also practice outside of Bunpro. Even just 2 lessons a day could be a lot of new information once I start hitting unfamiliar territory.

Superpnut · August 15, 2021, 9:01pm

Now this I can work

JCalandr · August 17, 2021, 3:01pm

Small update: I saw someone crashing the Google Colab trying to use non-integers

Just changed the way the loop works, so we can choose a float as minimum value in the simulation, maximum value, and the step between each simulation.
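
A float-friendly loop of this kind could be sketched as below. This is not the updated notebook code; `rate_values` and `new_items_schedule` are hypothetical helpers. The first builds the list of rates from a minimum, a maximum, and a step without accumulating floating-point error; the second shows one assumed way to turn a fractional rate such as 0.5 into whole items per day.

```python
def rate_values(minimum, maximum, step):
    """Return minimum, minimum + step, ... up to and including maximum."""
    count = int(round((maximum - minimum) / step)) + 1
    # Multiply instead of adding repeatedly to limit rounding drift.
    return [round(minimum + i * step, 10) for i in range(count)]


def new_items_schedule(rate, days):
    """Spread a fractional rate over whole days (0.5 -> 1 item every 2 days)."""
    schedule = []
    budget = 0.0
    for _ in range(days):
        budget += rate
        whole = int(budget + 1e-9)  # small tolerance for float rounding
        schedule.append(whole)
        budget -= whole
    return schedule


for rate in rate_values(0.5, 3, 0.5):
    print(rate, new_items_schedule(rate, days=6))
```

Each rate from `rate_values` would then drive one simulation curve in the plot.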

Also added a small grid because it’s difficult to estimate precisely where the curves converge without it

95% accuracy, simulation for 0.5 => 3 with a step of 0.5

kenni · August 17, 2021, 3:56pm

Nice. I’ve been wondering about this holistic look at reviews over time.

I just had a couple of notes here (not really criticism or anything, just random musings):

It would be interesting to run the simulation n=100 times and plot the daily average since there’s a decent amount of statistical jitter from run to run.
I’d argue that accuracy increases over time, but having a fixed variable for accuracy makes sense for a quick estimate.
It would be interesting to plot the average SRS level alongside the number of daily reviews. I suspect that you’ll get something similar to the inverse of the daily review graph.
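
The first suggestion, averaging many runs day by day, could be sketched like this. `average_runs` is a hypothetical helper that accepts any function returning one daily series per random seed; `noisy_run` is only a toy stand-in for a real simulation run.

```python
import random


def average_runs(run_once, n_runs=100):
    """Average the daily series returned by run_once(seed) over n_runs seeds."""
    totals = None
    for seed in range(n_runs):
        series = run_once(seed)
        if totals is None:
            totals = [0.0] * len(series)
        for day, value in enumerate(series):
            totals[day] += value
    return [total / n_runs for total in totals]


def noisy_run(seed, days=5, mean=20):
    """Toy stand-in for one simulation run: a noisy daily review count."""
    rng = random.Random(seed)
    return [mean + rng.randint(-5, 5) for _ in range(days)]


print(average_runs(noisy_run, n_runs=100))
```

Unlike a sliding window, this averages across runs rather than across days, so it removes jitter without blurring real day-to-day changes, at the cost of running the simulation many times.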

JCalandr · August 17, 2021, 4:06pm

Thank you very much for the 100-run idea. I thought about that but did a sliding window to smooth the results. It’s dirty, I know, but faster to compute #ShameOnMe

For the accuracy, that’s really hard to estimate. The higher the JLPT level, the lower the accuracy. I need to find a way to implement that (a decreasing accuracy for new items over time? a different accuracy for each JLPT level? …) and see how the overall accuracy evolves. I want to keep it simple so everyone can just have fun with my dirty script, without having to get their hands dirty ^^
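
One simple way to sketch the "different accuracy for each JLPT level" option is a lookup by item position in the deck. The level sizes and accuracies below are made-up placeholders, not Bunpro's real counts or anyone's measured recall, and `accuracy_for_item` is a hypothetical helper.

```python
# Hypothetical deck layout and accuracies; replace them with your own numbers.
LEVEL_SIZES = {"N5": 100, "N4": 150, "N3": 200, "N2": 200, "N1": 250}
LEVEL_ACCURACY = {"N5": 0.90, "N4": 0.85, "N3": 0.80, "N2": 0.75, "N1": 0.70}


def accuracy_for_item(index):
    """Accuracy for the index-th grammar point (0-based, learned N5 to N1)."""
    upper = 0
    for level, size in LEVEL_SIZES.items():
        upper += size
        if index < upper:
            return LEVEL_ACCURACY[level]
    raise IndexError("index is past the end of the deck")


print(accuracy_for_item(0), accuracy_for_item(120), accuracy_for_item(899))
```

To use this in a simulation, each scheduled item would need to carry its index so that its own accuracy is used at review time instead of one global value.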

Didn’t think at all about the average SRS level, ty for the idea

I’m glad so many of you are interested in those graphs

Superpnut · August 18, 2021, 12:05am

hope that wasn’t me crashing your program.
I couldn’t figure it out and it got mad :c

kenni · August 18, 2021, 2:57pm

By the way, when you get this into a solid enough state, I think GitHub will automatically display Jupyter notebooks when you view the file (see “Sharing Jupyter Notebooks using GitHub”). It’ll help you save a stable version of the script so that we don’t end up constantly breaking it.

It also never hurts to have that publicly searchable~!

Sidgr · November 4, 2021, 2:33pm

This is fascinating.

Sidgr · November 6, 2021, 10:27pm

It is interesting to me that, regardless of the pace, it seems to take about 2 years to learn all the grammar. This is also why I had to reset: learning 10 grammar points a day for months had started to take its toll, as I had 100s of reviews every day and at a certain point just could not keep up. This just reinforces to me that learning is a marathon and not a sprint. Take your time, enjoy the progress, and you will make it, I am sure of it!

evandcs · November 8, 2021, 5:37pm

I just started Bunpro and I have 45 reviews a day. So far so good, because it is N5.

I still don’t know what the right balance is to keep my sanity and retention.

josh · November 9, 2021, 11:04pm

Is there a way to add in grammar points that already have an interval? So maybe adding in variables that hold how many grammar points are in srs 3-5, 6-8, 9-11. I don’t know enough about Python to try and implement it myself, but maybe then having them start at srs_timer[id], where the id for 3-5 would start at 2, 6-8 would start at 5, etc.? Idk, if it’d take too much time and it’s not worth it, no worries, but it’d be nice to see how many reviews I’ll end up having with my current reviews.
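
A possible sketch of this kind of seeding is shown below. It does not use the notebook's `srs_timer` structure, whose layout is not shown in the thread. Instead, the hypothetical helper `seed_existing` builds a starting review calendar from a count of items per SRS level, under the assumptions that the interval table is as listed and that items at a level are spread evenly over that level's interval.

```python
# Assumed waiting time (in days) before the review at each SRS level (0-based).
INTERVALS_DAYS = [1, 1, 1, 2, 4, 8, 14, 30, 60, 120, 180]


def seed_existing(counts_by_level):
    """Build a {day: [srs_level, ...]} calendar from {srs_level: item_count}."""
    due = {}
    for level, count in counts_by_level.items():
        interval = INTERVALS_DAYS[level]
        for i in range(count):
            # Assumption: items are spread evenly over the level's interval,
            # with the first ones due on day 1.
            day = 1 + (i * interval) // count
            due.setdefault(day, []).append(level)
    return due


# Example: 30 items at level 2, 40 at level 5, 25 at level 8 (made-up counts).
calendar = seed_existing({2: 30, 5: 40, 8: 25})
print({day: len(levels) for day, levels in sorted(calendar.items())})
```

A bucket such as "srs 3-5" could be approximated by placing all of its items at the bucket's lowest level, and the resulting calendar could replace the empty starting calendar of a simulation.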