Each time a confident beginning reader encounters a word that occurs only once in a text, a little seed of frustration is sown: there is no second chance for review, and no second context to shed light on the word’s meaning. Corpus linguistics calls such a word a hapax legomenon (Greek for “something said only once”); the plural is hapax legomena, and the short form is hapaxes.
For every word that occurs more than once in a text, the reader has a chance to refresh their memory and grow their understanding of the word’s use in context.
The more words that occur only once in a text, the more frustrating and tedious the reading and the vocabulary learning are for the confident beginning reader. (Of course, re-reading always rewards the reader, and one of the best ways to cement the knowledge of a rare word is to see it used again; however, this is small consolation during the initial struggles.) We can quantify this frustration as the share of a text’s words that are hapaxes, which is how the percent column in the tables below is calculated:
percent hapax = (hapax legomena ÷ total words) × 100
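As a rough illustration of the calculation, here is a minimal Python sketch. It assumes the text has already been split into words and reduced to dictionary forms (lemmas); the function name `hapax_stats` and the toy word list are hypothetical, and this is not necessarily how the figures in this article were produced.

```python
from collections import Counter

def hapax_stats(lemmas):
    """Return (total words, unique entries, hapax count, hapax percent)."""
    counts = Counter(lemmas)                       # occurrences per dictionary entry
    total = sum(counts.values())                   # total words in the text
    hapaxes = sum(1 for c in counts.values() if c == 1)  # entries seen exactly once
    percent = round(100 * hapaxes / total, 2) if total else 0.0
    return total, len(counts), hapaxes, percent

# Toy input: already-lemmatized tokens (an assumption; real text needs a lemmatizer).
sample = ["der", "Hund", "sehen", "der", "Katze", "und", "der", "Katze", "laufen"]
print(hapax_stats(sample))  # (9, 6, 4, 44.44)
```

The four values correspond to the four numeric columns in the tables below.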
The following table shows the instances of hapaxes in the major works of Franz Kafka:
Work | Total Words | Unique Dictionary Entries | Hapax legomena | Hapax legomena as percent of total words |
---|---|---|---|---|
Das Schloß | 108,155 | 6,588 | 2,821 | 2.61 |
Der Prozeß | 71,776 | 5,160 | 2,182 | 3.04 |
Amerika | 89,152 | 6,659 | 3,091 | 3.47 |
Die Verwandlung | 19,157 | 2,894 | 1,520 | 7.93 |
Brief an Den Vater | 16,292 | 2,638 | 1,437 | 8.82 |
In der Strafkolonie | 10,280 | 1,894 | 1,003 | 9.76 |
Vier Geschichten | 13,072 | 2,437 | 1,333 | 10.2 |
Ein Landarzt | 13,139 | 2,778 | 1,595 | 12.14 |
Betrachtung | 5,867 | 1,494 | 887 | 15.12 |
Das Urteil | 3,995 | 1,089 | 682 | 17.07 |
Aphorisms | 3,502 | 1,016 | 608 | 17.36 |
Kleinere Werke | 1,709 | 666 | 410 | 23.99 |
Some conclusions that may help guide our study choices: Amerika is about 19,000 words shorter than Das Schloß, yet reading Amerika means meeting 270 more words that occur only once. Die Verwandlung is about 6,000 words longer than the collection Ein Landarzt, yet both require about the same amount of vocabulary to be learned (2,894 versus 2,778 dictionary entries), and Die Verwandlung has fewer hapaxes (1,520 versus 1,595).
The hapax counts in the table above are not cumulative, because a word that occurs only once in one work may occur again in another. Taken as a whole, the Franz Kafka corpus above contains 356,096 words and yet only 5,461 hapaxes, just 1.53 percent of the total corpus. Moreover, these rare words typically appear elsewhere in modern German, as the language of Kafka is relatively close to contemporary German. By contrast, in classical Latin and Greek, one encounters hapax legomena that not only occur once in a work but are also unattested in the rest of the limited surviving corpus of the whole language.
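To see why the per-work counts do not add up to the corpus-wide count, consider this small Python sketch. The two "works" and their word lists are invented toy data, and the names `works`, `per_work`, and `hapaxes` are hypothetical; it only illustrates the counting logic.

```python
from collections import Counter

def hapaxes(counts):
    """Return the set of entries that occur exactly once."""
    return {word for word, c in counts.items() if c == 1}

# Toy data (invented): two tiny "works" given as lemmatized word lists.
works = {
    "work_a": ["Schloss", "Dorf", "Weg", "Dorf"],
    "work_b": ["Schloss", "Gericht", "Weg", "Weg"],
}

per_work = {name: Counter(lemmas) for name, lemmas in works.items()}
corpus = sum(per_work.values(), Counter())   # merge the counts of all works

per_work_total = sum(len(hapaxes(c)) for c in per_work.values())
corpus_hapaxes = hapaxes(corpus)

print(per_work_total)   # 4: "Schloss" is a hapax in both works, "Weg" in one, "Gericht" in one
print(corpus_hapaxes)   # {'Gericht'}: the only word said once in the whole corpus
```

Here "Schloss" is a hapax within each work but not within the corpus, which is the same effect that shrinks the corpus-wide figure.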
Confident beginning readers should keep in mind that a word that is rare in one text may be common elsewhere; the reader may see it again in another text or even hear it in conversation. By pairing texts with a record of the user’s lookups, this web application aims to greatly reduce the frustration that beginning readers feel when they meet a word that occurs only once.
This analysis can also be done on the smaller works.
For the collection Betrachtung:
Section | Title | Total Words | Unique Dictionary Entries | Hapax legomena | Hapax legomena as percent of total words |
---|---|---|---|---|---|
18 | Unglücklichsein | 1,403 | 491 | 315 | 22.45 |
1 | Kinder auf der Landstraße | 1,075 | 480 | 311 | 28.93 |
2 | Entlarvung eines Bauernfängers | 629 | 327 | 219 | 34.82 |
5 | Der Ausflug ins Gebirge | 137 | 97 | 52 | 37.96 |
7 | Der Kaufmann | 598 | 336 | 242 | 40.47 |
10 | Die Vorüberlaufenden | 162 | 108 | 71 | 43.83 |
9 | Der Nachhauseweg | 153 | 105 | 76 | 49.67 |
8 | Zerstreutes Hinausschaun | 88 | 65 | 44 | 50 |
11 | Der Fahrgast | 234 | 160 | 118 | 50.43 |
3 | Der plötzliche Spaziergang | 250 | 181 | 131 | 52.4 |
6 | Das Unglück des Junggesellen | 143 | 109 | 77 | 53.85 |
12 | Kleider | 143 | 112 | 78 | 54.55 |
14 | Zum Nachdenken für Herrenreiter | 259 | 192 | 144 | 55.6 |
13 | Die Abweisung | 191 | 143 | 107 | 56.02 |
15 | Das Gassenfenster | 113 | 91 | 66 | 58.41 |
17 | Die Bäume | 43 | 41 | 27 | 62.79 |
4 | Entschlüsse | 181 | 152 | 117 | 64.64 |
16 | Wunsch, Indianer zu werden | 65 | 62 | 45 | 69.23 |
For the collection Vier Geschichten:
Section | Title | Total Words | Unique Dictionary Entries | Hapax legomena | Hapax legomena as percent of total words |
---|---|---|---|---|---|
4 | Josefine, die Sängerin oder Das Volk der Mäuse | 6,040 | 1,412 | 795 | 13.16 |
2 | Eine kleine Frau | 2,699 | 845 | 514 | 19.04 |
3 | Ein Hungerkünstler | 3,424 | 1,086 | 710 | 20.74 |
1 | Erstes Leid | 909 | 445 | 316 | 34.76 |
For the collection Kleinere Werke:
Section | Title | Total Words | Unique Dictionary Entries | Hapax legomena | Hapax legomena as percent of total words |
---|---|---|---|---|---|
7 | Vom Scheintod | 257 | 135 | 68 | 26.46 |
6 | Eine alltägliche Verwirrung | 309 | 186 | 117 | 37.86 |
2 | Die Erfindung des Teufels | 228 | 148 | 94 | 41.23 |
8 | Heimkehr | 252 | 153 | 105 | 41.67 |
3 | Die Chinesischen Mauer und der Turmbau von Babel | 366 | 230 | 156 | 42.62 |
4 | Prometheus | 120 | 80 | 55 | 45.83 |
5 | Die Wahrheit über Sancho Pansa | 99 | 83 | 62 | 62.63 |
1 | Kleine Fabel | 78 | 68 | 50 | 64.1 |
For the collection Ein Landarzt:
Section | Title | Total Words | Unique Dictionary Entries | Hapax legomena | Hapax legomena as percent of total words |
---|---|---|---|---|---|
14 | Ein Bericht für eine Akademie | 3,189 | 1,037 | 641 | 20.1 |
2 | Ein Landarzt | 2,132 | 806 | 539 | 25.28 |
11 | Elf Söhne | 1,608 | 644 | 428 | 26.62 |
6 | Schakale und Araber | 1,301 | 548 | 357 | 27.44 |
5 | Vor dem Gesetz | 588 | 270 | 172 | 29.25 |
7 | Ein Besuch im Bergwerk | 879 | 415 | 274 | 31.17 |
13 | Ein Traum | 720 | 353 | 239 | 33.19 |
4 | Ein altes Blatt | 691 | 365 | 250 | 36.18 |
9 | Eine kaiserliche Botschaft | 323 | 190 | 127 | 39.32 |
10 | Die Sorge des Hausvaters | 476 | 275 | 189 | 39.71 |
12 | Ein Brudermord | 616 | 363 | 272 | 44.16 |
1 | Der neue Advokat | 260 | 183 | 141 | 54.23 |
3 | Auf der Galerie | 290 | 207 | 161 | 55.52 |
8 | Das nächste Dorf | 66 | 70 | 53 | 80.3 |
One thing is clear: for a confident beginning reader setting out to read Kafka's short stories for the first time, reading them in their published order is probably not the best approach. Students should strongly consider starting with readings that have a lower percentage of frustrating hapaxes. For a slightly different approach to learning progressions, see the article Which German text to read when? Learning a language is difficult; let’s make it as easy as possible.
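One way to turn the tables into a reading plan is to sort the stories by hapax percentage, lowest first. The Python sketch below does this for Vier Geschichten, using the word and hapax counts from the table above; the function name `reading_order` and the tuple layout are hypothetical choices for illustration.

```python
# (section, title, total words, hapax legomena), taken from the Vier Geschichten table
stories = [
    (1, "Erstes Leid", 909, 316),
    (2, "Eine kleine Frau", 2699, 514),
    (3, "Ein Hungerkünstler", 3424, 710),
    (4, "Josefine, die Sängerin oder Das Volk der Mäuse", 6040, 795),
]

def reading_order(rows):
    """Sort stories by share of hapaxes, least frustrating first."""
    return sorted(rows, key=lambda row: row[3] / row[2])

order = [section for section, *_ in reading_order(stories)]
print(order)  # [4, 2, 3, 1]
```

For this collection the suggested order is nearly the reverse of the published one. Hapax percentage is only one possible measure; shorter texts tend to score higher on it, so a reader may want to weigh length as well.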