Anki, school, and LLMs
Multi-modal LLMs like ChatGPT and Gemini can be used to prepare Anki decks from word lists in foreign language classbooks, so that you can use spaced repetition to learn efficiently even if your classbook was not designed for it. I have done this with my son for his German classes in school.
Why foreign languages anyway?
I have my doubts about teaching German or any other language except English in schools, especially as a mandatory subject. I believe English is the only remaining global language. Another language might make sense if you plan to move to a country where it is spoken or if you have other special reasons, but in my opinion, it's useless for most students. The solution I am describing here will work for English too, however, and English is certainly useful. Even if you have to study a language just because it's a mandatory subject, you still want to make your studies as efficient as possible, so this guide should come in handy.
Why Anki?
Anki is the most popular free and open-source spaced repetition system. It's an efficient way to learn foreign language vocabulary. It takes some time to learn how it works and how to use it correctly, but it's totally worth it. Anki now supports the FSRS algorithm, which is expected to be much more efficient than the old SM-2, even optimal in some ways.
Schools don't use spaced repetition
Classbook publishers could provide Anki decks on their websites to complement the classbooks, but I don't know of any publisher that actually does that. The way this works in schools is that some vocabulary is absorbed by osmosis during the class and this natural process is randomly peppered with quick vocabulary tests. The teacher regularly announces that there's going to be a test and students get a week or even just a day to memorize vocabulary for the test.
If you know anything about spaced repetition, you have probably realized this is grossly inefficient. Memorization requires more than a week of advance notice. And if you cram like this for a test, you are bound to forget nearly everything shortly afterwards. Spaced repetition using Anki would work better.
What can we do about it?
So how do we deal with the inefficient vocabulary memorization approach used in schools? I can see several options:
- Do nothing: Just give up and do what the school expects from you. It's stupidly inefficient, but most students do that and you therefore wouldn't fall too far behind others.
- Standard deck: Get an existing Anki deck from AnkiWeb that covers the language. There are lots of them and you will learn the language that way, but there will be a mismatch between your classbook and your Anki deck and your performance in school tests will suffer even as you make real progress in the language.
- Deck editor: Use Anki's built-in deck editor to create your own deck that matches your classbook. This is more targeted, but it's a lot of tedious work. If you ask a kid to do something this mundane, expect the resulting deck to look a bit funny and to have shoddy, half-assed quality, which isn't great for learning.
- Sharing: Get an Anki deck that some other hard-working student has built for your classbook and shared online. Unfortunately, I don't think this is common and I have found no such decks for my son's classbook.
- LLMs: Use multi-modal LLMs with vision capability to convert scans or photos of the classbook's word lists into Anki decks that you can import into Anki. This is quick and efficient. If you do at least a cursory check of the LLM's output, you can trust the resulting decks to be about 99% correct.
Of these options, the one using LLMs seems the most attractive.
Using LLMs to scan word lists
In my son's classbook, the word lists always span two pages. We didn't have a scanner at hand and classbooks don't fit well in a scanner anyway, so we just took photos using a smartphone. Contemporary LLMs aren't that smart and they will get confused by poor quality scans, so make sure you take several photos of every page and pick the best one. Straighten the page. Keep the phone pointing straight down. Go as close as possible without clipping any text. Make sure that lines are nearly perfectly horizontal. Give the phone time to attain sharp focus.
We then uploaded these photos (two per word list) to ChatGPT or Gemini (whichever had unused quota at the time) and asked the LLM to return the word list in a format that can be imported into Anki. LLMs know the Anki format, so you don't have to explain it to them. We had to add about a paragraph of instructions to get the LLMs to perform the task correctly. As far as I remember, we had to specifically ask the LLMs to go systematically from the top left corner of the page to the bottom right corner, making sure no word is skipped. We also had to specify what to preserve, because the word lists in the classbook also contained examples and other stuff that we did not want to import into Anki.
Once we had the word list in Anki format, we just saved it to a file and imported it into Anki. I think we even created a special Slovak-German note type, but that's not essential. The standard front-back note type should work just fine. What's important is to set up Anki to present every word pair both ways using two card types, in our case Slovak-German and German-Slovak. Every word list got its own deck and these decks were organized under a common parent deck. Lessons are numbered and Anki by default visits decks ordered by name, so we just put the lesson number at the beginning of the deck name to get Anki to go over the lessons in order. And that's it.
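To make the import step more concrete, here is a minimal sketch of how such a file could be produced from a list of word pairs. It is not the exact procedure described above, just one possible way to do it. It assumes a recent Anki version that understands file headers in text imports, and the built-in note type named "Basic (and reversed card)" (the name depends on Anki's interface language). The function names, the parent deck "German", the lesson title, and the sample words are all made up for illustration. Zero-padding the lesson number is my own assumption: it keeps lesson 10 after lesson 2 even under plain alphabetical sorting.

```python
import csv
import io


def deck_name(parent: str, lesson: int, title: str) -> str:
    """Build a subdeck name such as 'German::03 Familie'.

    '::' separates parent deck and subdeck in Anki. The zero-padded lesson
    number (an assumption of this sketch) keeps name order equal to lesson order.
    """
    return f"{parent}::{lesson:02d} {title}"


def to_anki_import(pairs, deck, notetype="Basic (and reversed card)"):
    """Return the text of a tab-separated import file with Anki file headers."""
    out = io.StringIO()
    # File headers; these assume an Anki version that supports them.
    out.write("#separator:Tab\n")
    out.write("#html:false\n")
    out.write(f"#notetype:{notetype}\n")
    out.write(f"#deck:{deck}\n")
    # One note per line: front field, tab, back field.
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for front, back in pairs:
        writer.writerow([front.strip(), back.strip()])
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical sample data, not taken from any real classbook.
    pairs = [("pes", "der Hund"), ("mačka", "die Katze")]
    text = to_anki_import(pairs, deck_name("German", 3, "Familie"))
    with open("lesson03.txt", "w", encoding="utf-8") as f:
        f.write(text)
```

A note type with a reversed card gives you both directions from a single line per word pair, so the file itself only needs to list each pair once.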
How to use the Anki decks
Of course, to do well in tests in school, you have to learn your vocabulary ahead of time, so that the spaced repetition system has time to take effect. This is somewhat problematic in September, because you only receive your classbook at the beginning of the school year, so you cannot start ahead of time. Fortunately, most teachers practice last year's content at the beginning of the school year, so you will have weeks before the first test.
Count the number of cards in lessons for the current school year. Don't forget there are two cards per word pair. Divide it by about 250 to get the number of new cards you have to do per day. Why not divide by 365? Because you will need to know everything ahead of the last test in the school year, of course. You might have to increase the number of new cards at the beginning of the school year for a month or so to get ahead of the class.
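The arithmetic above can be written down as a tiny helper. This is only a sketch: the function name and the figure of 1000 word pairs are made up for illustration, and rounding up is my own choice so that you finish slightly early rather than slightly late.

```python
import math


def new_cards_per_day(word_pairs: int, study_days: int = 250,
                      cards_per_pair: int = 2) -> int:
    """Daily new-card quota: total cards divided by available days, rounded up."""
    total_cards = word_pairs * cards_per_pair  # two cards (both directions) per pair
    return math.ceil(total_cards / study_days)


# Hypothetical classbook with 1000 word pairs: 2000 cards / 250 days = 8 per day.
print(new_cards_per_day(1000))
```

The result is the value you would enter as the daily new card limit in Anki's deck options, raised temporarily while you are catching up with the class.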
But aren't LLMs unreliable?
Yes, they are. You have to check the output for any gross errors like skipping a whole subset of words or mismatching word pairs between columns because of skewed lines. On top of that, you have to skim over the word list to make sure that most word pairs are correct.
You could possibly miss something this way, but these brief checks are good enough to have confidence the word list is at least 99% correct. Having a 1% error rate in a word list is not an issue in language learning, because you suffer a much higher error rate from forgetting. It's definitely of higher quality than decks manually created by students. And you learn more from a 99% correct Anki deck than by cramming before tests the way school expects you to.
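Some of the gross-error checks can be partly automated before you skim the list yourself. The sketch below assumes the LLM returned one word pair per line with a tab between the two sides; the function name and the sample words are hypothetical. It flags malformed lines, repeated entries, and a row count that differs from the number of words you counted on the page. It cannot tell whether a translation is actually right or whether pairs got shifted between columns, so it does not replace reading through the list.

```python
from collections import Counter
from typing import List, Optional


def check_word_list(text: str, expected_pairs: Optional[int] = None) -> List[str]:
    """Return human-readable warnings for a tab-separated word list."""
    problems = []
    # Ignore blank lines and '#' header lines.
    rows = [line for line in text.splitlines()
            if line.strip() and not line.startswith("#")]
    fronts = []
    for number, line in enumerate(rows, start=1):
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) != 2 or not all(fields):
            problems.append(f"row {number}: expected two non-empty fields: {line!r}")
            continue
        fronts.append(fields[0])
    # Repeated front sides may be legitimate, but are worth a second look.
    for word, count in Counter(fronts).items():
        if count > 1:
            problems.append(f"entry {word!r} appears {count} times")
    # A count mismatch hints at skipped or invented words.
    if expected_pairs is not None and len(rows) != expected_pairs:
        problems.append(f"found {len(rows)} rows, expected {expected_pairs}")
    return problems


if __name__ == "__main__":
    sample = "pes\tder Hund\npes\tdie Katze\nstrom\n"
    for problem in check_word_list(sample, expected_pairs=4):
        print(problem)
```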