TL;DR: How much stock do you put in pitch accent data?
I tend to do ‘gardening’ of my Anki deck as I study (explosive audio, bad TTS audio, obvious JMdict querying errors, etc.). One thing I started working on in the past month is pitch accent. I used to ignore it, but I’m increasingly bothered by my trending ‘failure to accurately anticipate/guess’ the correct pitch. So I started adding it to my card clean-up, and I’ve hit a few hiccups, especially in the last day or two. Thus the question posed in the TL;DR.
Even if I acknowledge that…
- Different dialects have different pitch conventions
- Languages are fluid and pitch conventions change over time
- These are still free databases and QC can be dubious
- Even if all the data is correct, inflection still wreaks havoc
…even with all that, the issue is I am still getting conflicting information that doesn’t give me warm and fuzzies about what I am reading. The most recent examples:
- 【山の日】Data is scarce, but OJAD says it’s a pitch [0]. OJAD clearly states their info may sometimes be wrong, but I’ve generally trusted OJAD more than my other resources. But, man… listening to IRL examples I would swear it’s a pitch [3] 中高.
- Various 中高 words where the reported pitch drop sounds different to me. Like what is being reported as a [5] sounds like a [4], or what have you.
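For anyone unsure what the bracketed numbers mean, here is a minimal sketch. It assumes the usual Tokyo-style convention, where [n] marks the mora after which the pitch drops and [0] means no drop at all. The function name `pitch_pattern` and the simple H/L output are my own illustrative choices, not something taken from OJAD or the other tools, and real speech is messier than this.

```python
def pitch_pattern(morae, accent):
    """Return an H/L string: one letter per mora, plus one for a following particle.

    Assumes the common Tokyo-style notation: accent 0 = no drop (heiban),
    accent n = pitch drops after the n-th mora.
    """
    n = len(morae)
    if not 0 <= accent <= n:
        raise ValueError("accent must be between 0 and the number of morae")

    pattern = []
    for i in range(1, n + 2):  # position n + 1 is the trailing particle
        if accent == 0:
            high = i > 1            # low start, then high through the particle
        elif accent == 1:
            high = i == 1           # high start, low everywhere after
        else:
            high = 1 < i <= accent  # low start, high up to the drop
        pattern.append("H" if high else "L")
    return "".join(pattern)


yama_no_hi = ["や", "ま", "の", "ひ"]
print(pitch_pattern(yama_no_hi, 0))  # what OJAD reports for 山の日
print(pitch_pattern(yama_no_hi, 3))  # what I think I am hearing
```

Under these assumptions the two candidates for 山の日 differ only on the last mora and the particle, which goes some way to explaining why it is so easy to second-guess by ear.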
This has been the impetus for a kernel of self-doubt that feels like ‘Have I strayed from the path? It feels like I’ve strayed off the path’. I can definitely entertain the possibility that someone is right in saying “bro you’re taking this too literally - it’s just supposed to get you pretty close; pitch accents aren’t some 100% guide”. But I don’t know if that’s cope on my part.
I am beginning to think I should be using pitch accent info as more of a suggestion than a hard rule. Sort of the opposite of a ‘don’t believe your lying ears’ reaction, i.e. don’t be afraid to make changes that disagree with the data (albeit conservatively). It’s just vexing since 98% of the time the data causes me no cognitive dissonance, so the few times it does I’m unsure if it’s [skill issue] or just a case of fliers/outliers in the data.
FWIW, the resources I’m using:
- https://www.gavo.t.u-tokyo.ac.jp/ojad/phrasing/index
- Kanjium Pitch Accents via Yomitan (GitHub says it might be JMdict-adjacent?)
- Beta version of Jisho (uses UniDic(?) data?)
- Japanese Accent Study Website