I personally loathe TTS for language learning. Even TTS that’s good for general-purpose use is generally abysmal for language learning because it’s always flat, lacks intonation, and it’s always a bit off. If the idea is just to convey information that’s fine, but for pronunciation practice it’s a hard pass for me. I generally prefer no audio to TTS because in my experience those TTS samples are never vetted by a native speaker to exclude the really weird/wrong ones (because why bother, it’s so quick and easy to just generate them on the fly!), so I don’t want to tune my ear to potentially broken or unnatural language.
A few years ago I was pretty active on the Duolingo forums and I spent quite a lot of time helping on the French-for-English-speakers category (I’m a native French speaker). It was very common for students to complain about weird or broken audio and while it was sometimes a skill issue on their part, I can vouch that quite often the audio was indeed broken, slurred or just plain unnatural due to weird intonation.
Now, that was years ago. I can imagine that it’s improved since then, and with the current AI boom I hope we’ll get native-level TTS in the not-so-distant future, but we’re not there yet. I’d much rather have a deck of 1,000 cards with high-quality audio recordings than a 10k deck with TTS.
I realize that it’s not mutually exclusive, but I fear that going the TTS route is a slippery slope: it’s so much cheaper and faster than hiring natives to record the sentences.