Variety in the language of Franz Kafka

Franz Kafka portrait, 1906

Franz Kafka's language varies from one major work to the next. For a confident beginning reader, analyzing that language helps answer questions such as:

  • Is there an optimal sequence of Kafka’s works to read?
  • Are any works of Kafka intrinsically more difficult to read than others?

Reading Kafka: an optimal sequence

Every text a beginning reader faces is a mountain that must be climbed step by step, word by word. By analyzing a text we can determine how high the mountain is, i.e. how many words have to be learned in order to read it. Most texts have some special domain vocabulary (Ein Hungerkünstler is set in the world of the circus and public performance; Der Prozess, The Trial, has many legal terms; etc.). However, some works contain proportionally more distinct vocabulary than others. We can quantify this by measuring the diversity of vocabulary:

$$\frac{\text{Total number of unique dictionary entries}}{\text{Total number of words}} = \text{Diversity of Vocabulary}$$

For example:

If 100 dictionary words were used to write a personal letter 200 words long:

$$\frac{100}{200} \times 100 = 50\%\ \text{Diversity of Vocabulary}$$

If 5,000 different dictionary words were used to write a 5,000-word story:

$$\frac{5000}{5000} \times 100 = 100\%\ \text{Diversity of Vocabulary}$$
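The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `diversity_of_vocabulary` is an assumption of this sketch, not something used in the original analysis.

```python
def diversity_of_vocabulary(unique_entries: int, total_words: int) -> float:
    """Return the unique dictionary entries as a percentage of the total words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return unique_entries / total_words * 100


# The two worked examples: a 200-word letter and a 5,000-word story.
print(diversity_of_vocabulary(100, 200))
print(diversity_of_vocabulary(5000, 5000))
```

The same function can be applied to any pair of counts, for example the word and entry totals of a single story.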

The following table shows the diversity of vocabulary for Franz Kafka’s major works:

| Work | Total Words | Unique Dictionary Entries | Diversity of Vocabulary |
|---|---|---|---|
| Das Schloß | 108,155 | 6,588 | 6.09% |
| Der Prozeß | 71,776 | 5,160 | 7.19% |
| Die Verwandlung | 19,157 | 2,894 | 15.11% |
| Brief an den Vater | 16,292 | 2,638 | 16.19% |
| In der Strafkolonie | 10,280 | 1,894 | 18.42% |
| Vier Geschichten | 13,072 | 2,437 | 18.64% |
| Ein Landarzt | 13,139 | 2,778 | 21.14% |
| Das Urteil | 3,995 | 1,089 | 27.26% |
| Kleinere Werke | 1,709 | 666 | 38.97% |

(Note: In the table above, noun declensions (singular and plural forms, nominative, genitive, and accusative forms) are all counted as one unique dictionary entry. Verb forms and conjugations are also counted as one unique dictionary entry, since the paradigm forms only need to be learned once.)
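As a rough illustration of the counting rule in the note, the sketch below maps each inflected form to one dictionary entry before counting. The lemma table `TOY_LEMMAS` and the function `count_entries` are hypothetical and hand-made for this example; the original analysis presumably used a full German dictionary, which is not reproduced here.

```python
import re

# Hypothetical, hand-made lemma table for illustration only. A real analysis
# would need a complete German morphological dictionary.
TOY_LEMMAS = {
    "ging": "gehen",
    "geht": "gehen",
    "gehen": "gehen",
    "häuser": "haus",
    "hauses": "haus",
    "haus": "haus",
}


def count_entries(text, lemmas=TOY_LEMMAS):
    """Return (total words, unique dictionary entries) for a text."""
    # Lowercase the text and keep runs of letters (umlauts and ß included).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    # Forms missing from the toy table are counted as their own entry.
    entries = {lemmas.get(token, token) for token in tokens}
    return len(tokens), len(entries)


total, unique = count_entries("Er ging ins Haus. Sie geht ins Haus.")
print(total, unique)
```

Here "ging" and "geht" collapse into the single entry "gehen", so the eight words of the sample contain only five dictionary entries.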

There is a general trend: longer works tend to have more repetition and thus less diversity of vocabulary. But the trend does not always hold. Das Schloss and Amerika are nearly the same length, yet reading Amerika requires learning 1,356 more word forms. Die Verwandlung is about 6,000 words longer than Ein Landarzt, yet both require about the same amount of vocabulary to be learned.
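The comparison between Die Verwandlung and Ein Landarzt can be checked against the figures in the table. The sketch below is only a worked calculation; the names `WORKS` and `compare` are assumptions made for this example.

```python
# (total words, unique dictionary entries), copied from the table of major works.
WORKS = {
    "Die Verwandlung": (19157, 2894),
    "Ein Landarzt": (13139, 2778),
}


def compare(first, second, table=WORKS):
    """Return (difference in total words, difference in unique entries)."""
    words_a, entries_a = table[first]
    words_b, entries_b = table[second]
    return words_a - words_b, entries_a - entries_b


extra_words, extra_entries = compare("Die Verwandlung", "Ein Landarzt")
print(extra_words, extra_entries)
```

By these figures, Die Verwandlung is 6,018 words longer but needs only 116 more dictionary entries.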

Of course, tackling a longer work is particularly daunting for a confident beginning reader, but some of the shorter collections contain works that can be read out of order. Here is the same analysis applied to the shorter works:

For Betrachtung:

| Section | Title | Total Words | Unique Dictionary Entries | Diversity of Vocabulary |
|---|---|---|---|---|
| 1 | Kinder auf der Landstraße | 1,075 | 480 | 44.65% |
| 2 | Entlarvung eines Bauernfängers | 629 | 327 | 51.99% |
| 7 | Der Kaufmann | 598 | 336 | 56.19% |
| 10 | Die Vorüberlaufenden | 162 | 108 | 66.67% |
| 11 | Der Fahrgast | 234 | 160 | 68.38% |
| 9 | Der Nachhauseweg | 153 | 105 | 68.63% |
| 5 | Der Ausflug ins Gebirge | 137 | 97 | 70.8% |
| 3 | Der plötzliche Spaziergang | 250 | 181 | 72.4% |
| 8 | Zerstreutes Hinausschaun | 88 | 65 | 73.86% |
| 14 | Zum Nachdenken für Herrenreiter | 259 | 192 | 74.13% |
| 13 | Die Abweisung | 191 | 143 | 74.87% |
| 6 | Das Unglück des Junggesellen | 143 | 109 | 76.22% |
| 15 | Das Gassenfenster | 113 | 91 | 80.53% |
| 17 | Die Bäume | 43 | 41 | 95.35% |
| 16 | Wunsch, Indianer zu werden | 65 | 62 | 95.38% |

For the collection Vier Geschichten:

| Section | Title | Total Words | Unique Dictionary Entries | Diversity of Vocabulary |
|---|---|---|---|---|
| 4 | Josefine, die Sängerin oder Das Volk der Mäuse | 6,040 | 1,412 | 23.38% |
| 2 | Eine kleine Frau | 2,699 | 845 | 31.31% |
| 3 | Ein Hungerkünstler | 3,424 | 1,086 | 31.72% |
| 1 | Erstes Leid | 909 | 445 | 48.95% |

For the collection Kleinere Werke:

| Section | Title | Total Words | Unique Dictionary Entries | Diversity of Vocabulary |
|---|---|---|---|---|
| 7 | Vom Scheintod | 257 | 135 | 52.53% |
| 6 | Eine alltägliche Verwirrung | 309 | 186 | 60.19% |
| 3 | Die Chinesischen Mauer und der Turmbau von Babel | 366 | 230 | 62.84% |
| 2 | Die Erfindung des Teufels | 228 | 148 | 64.91% |
| 5 | Die Wahrheit über Sancho Pansa | 99 | 83 | 83.84% |
| 1 | Kleine Fabel | 78 | 68 | 87.18% |

For the collection Ein Landarzt:

| Section | Title | Total Words | Unique Dictionary Entries | Diversity of Vocabulary |
|---|---|---|---|---|
| 14 | Ein Bericht für eine Akademie | 3,189 | 1,037 | 32.52% |
| 2 | Ein Landarzt | 2,132 | 806 | 37.8% |
| 11 | Elf Söhne | 1,608 | 644 | 40.05% |
| 6 | Schakale und Araber | 1,301 | 548 | 42.12% |
| 5 | Vor dem Gesetz | 588 | 270 | 45.92% |
| 7 | Ein Besuch im Bergwerk | 879 | 415 | 47.21% |
| 13 | Ein Traum | 720 | 353 | 49.03% |
| 4 | Ein altes Blatt | 691 | 365 | 52.82% |
| 10 | Die Sorge des Hausvaters | 476 | 275 | 57.77% |
| 9 | Eine kaiserliche Botschaft | 323 | 190 | 58.82% |
| 12 | Ein Brudermord | 616 | 363 | 58.93% |
| 1 | Der neue Advokat | 260 | 183 | 70.38% |
| 3 | Auf der Galerie | 290 | 207 | 71.38% |
| 8 | Das nächste Dorf | 66 | 70 | 106.06% * |

* Over 100% diversity is possible for a short work, because a single word form can count toward more than one dictionary entry: the same form may be an adjective, an adverb, or a particle, or a word at the beginning of a sentence could be either a noun or a verb. To gain a good understanding of a text, the student should be familiar with all of the potential meanings of its words. Otherwise the student's understanding may become mechanical or too literal.
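A small sketch can show how the count passes 100% once every possible reading of a word form is counted as its own dictionary entry. The sentence, the table `TOY_READINGS`, and the function name are all hypothetical choices for this illustration, and the exact counting rules of the original analysis are not known.

```python
# Hypothetical table: each surface form is mapped to every dictionary entry
# it could belong to. Forms not listed count as a single entry.
TOY_READINGS = {
    "Essen": {"Essen (noun)", "essen (verb)"},  # capitalized at sentence start
    "schnell": {"schnell (adjective)", "schnell (adverb)"},
}


def diversity_with_ambiguity(tokens, readings=TOY_READINGS):
    """Diversity in percent when all possible readings are counted."""
    entries = set()
    for token in tokens:
        # Add every candidate entry for this form, or the form itself.
        entries |= readings.get(token, {token})
    return len(entries) / len(tokens) * 100


sentence = ["Essen", "macht", "schnell", "satt"]
print(diversity_with_ambiguity(sentence))
```

The four words yield six candidate entries, so the toy sentence scores 150%. In a longer text, repetition quickly pulls the figure back below 100%.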

This analysis treats each story as a separate unit of text. But learning vocabulary through reading stories is cumulative. For a clearer picture of how reading gets easier over time, see the article on learning progressions titled "Which German text to read when?"
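The cumulative effect can be pictured with a short sketch: once a dictionary entry has been learned in one story, it no longer counts as new in the next. The story titles and entry sets below are invented toy data, and `new_entries_per_text` is a hypothetical helper, not the method of the article referred to above.

```python
def new_entries_per_text(texts_in_order):
    """For each (title, entries) pair, count entries not met in earlier texts."""
    known = set()
    result = []
    for title, entries in texts_in_order:
        new = entries - known          # entries the reader has not seen yet
        result.append((title, len(new)))
        known |= entries               # everything read so far is now known
    return result


# Invented toy data: two tiny "stories" given as sets of dictionary entries.
reading_plan = [
    ("Story A", {"haus", "gehen", "und"}),
    ("Story B", {"haus", "und", "tür"}),
]
print(new_entries_per_text(reading_plan))
```

Read on its own, Story B has three entries; read after Story A, only one of them is new.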

