Dale-Chall Readability Formula

Edgar Dale and Jeanne Chall developed this formula in 1948 to measure reading difficulty by checking whether words are familiar to a typical 4th-grade reader, rather than counting syllables or characters.

Output: Raw score (4–10+) that maps to grade bands
Best for: Educational materials, patient health content, general adult audiences
Method: r.dale_chall()
What it counts: Unfamiliar words (not on the 3,000-word list), words per sentence
Minimum text: 100 words (default)

Use Dale-Chall when:
  • You are writing for a general adult audience and vocabulary familiarity is the primary concern. Dale-Chall has more validation research for educational and patient materials than any other formula in this library.
  • You are checking patient education materials, health pamphlets, or consumer health information. Multiple systematic reviews name Dale-Chall as the highest-validity formula for these use cases.
  • You are assessing government documents or public information materials intended for general adult comprehension.
  • You need to check text aimed at grade 4 and above. For younger readers (grades 1–3), use Spache instead.

Dale-Chall checks each word in the text against a list of approximately 3,000 words that 80% of American 4th-grade students recognized during data collection. Words not on the list are counted as “difficult.” The formula then combines the percentage of difficult words with average sentence length.

raw_score = 0.1579 × (difficult_words / total_words × 100) + 0.0496 × (total_words / sentences)

If more than 5% of words are difficult, the formula adds a fixed adjustment:

adjusted_score = raw_score + 3.6365
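
As an illustration of the arithmetic above, here is a minimal sketch that computes the score from pre-counted totals. The function name `dale_chall_score` and the sample counts are hypothetical and are not part of the library's API; the sketch only restates the two formulas shown here.

```python
def dale_chall_score(difficult_words: int, total_words: int, sentences: int) -> float:
    """Compute a Dale-Chall score from pre-counted totals (illustrative only)."""
    if total_words <= 0 or sentences <= 0:
        raise ValueError("total_words and sentences must be positive")

    pct_difficult = difficult_words / total_words * 100
    avg_sentence_length = total_words / sentences
    score = 0.1579 * pct_difficult + 0.0496 * avg_sentence_length

    # The fixed adjustment applies only when more than 5% of words are difficult.
    if pct_difficult > 5:
        score += 3.6365
    return score


# Hypothetical counts: 120 words, 10 sentences, 12 difficult words (10% difficult).
score = dale_chall_score(difficult_words=12, total_words=120, sentences=10)
print(round(score, 2))  # 5.81
```

With 10% difficult words and 12 words per sentence, the raw score is 1.579 + 0.5952 = 2.1742, and the adjustment raises it to about 5.81.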

The library implements the “New Dale-Chall” — the 1948 formula coefficients combined with the revised 1995 word list (Chall & Dale, Readability Revisited). The 1995 revision expanded the list from about 769 words to approximately 3,000.

Words on the list in their base form are also familiar in these inflected forms: regular plurals (-s, -es), past tense (-ed), progressive (-ing), comparative (-er), and superlative (-est). Derivational suffixes that create new words are not covered. For example, complete is on the list, but completion is counted as difficult.

Proper nouns (names of people and places) are treated as familiar regardless of whether they appear on the list.
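
The inflection rule can be sketched roughly as follows. This is an assumption-laden illustration, not the library's implementation: `FAMILIAR` is a tiny stand-in for the real word list, `is_familiar` is a hypothetical helper, and the suffix stripping is deliberately naive (it does not handle doubled consonants such as "running", and it skips proper-noun detection entirely).

```python
# Tiny stand-in for the ~3,000-word list (hypothetical sample entries).
FAMILIAR = frozenset({"complete", "box", "walk", "tall", "dog"})

# Inflectional suffixes named in the text, longest first so "es" wins over "s".
INFLECTIONS = ("est", "ing", "es", "ed", "er", "s")


def is_familiar(word: str) -> bool:
    """Return True if the word, or its base form after stripping one inflection, is listed."""
    w = word.lower()
    if w in FAMILIAR:
        return True
    for suffix in INFLECTIONS:
        if w.endswith(suffix):
            stem = w[: -len(suffix)]
            # Accept the bare stem, or stem + "e" for cases like "completed" -> "complete".
            if stem in FAMILIAR or stem + "e" in FAMILIAR:
                return True
    return False


for word in ("completed", "boxes", "tallest", "completion"):
    print(word, is_familiar(word))
```

Only "completion" comes back as unfamiliar, because -ion is a derivational suffix and is not in the inflection list.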

Dale-Chall scores do not map to individual grade levels. They map to grade bands. The formula cannot distinguish between grade 1 and grade 4 text; both fall in the same band.

| Raw Score | Grade Band | Reading Level |
|---|---|---|
| 4.9 and below | Grades 1–4 | Elementary school |
| 5.0–5.9 | Grades 5–6 | Upper elementary |
| 6.0–6.9 | Grades 7–8 | Middle school |
| 7.0–7.9 | Grades 9–10 | Early high school |
| 8.0–8.9 | Grades 11–12 | Late high school |
| 9.0–9.9 | College | Undergraduate |
| 10.0 and above | College graduate | Graduate/professional |

A score of 7.5, for example, places a text at approximately grades 9–10. The grade_levels field returns this band as a list: ["9", "10"].
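
To make the band lookup concrete, here is a small sketch of the table as a function. `grade_band` is a hypothetical helper that returns the band label as a string. It does not reproduce the library's `grade_levels` list format, and it assumes each band is a half-open interval (for example, 5.0 up to but not including 6.0), which the table does not state explicitly.

```python
# Upper bound (exclusive) of each band, paired with its label from the table.
BANDS = [
    (5.0, "Grades 1-4"),
    (6.0, "Grades 5-6"),
    (7.0, "Grades 7-8"),
    (8.0, "Grades 9-10"),
    (9.0, "Grades 11-12"),
    (10.0, "College"),
]


def grade_band(score: float) -> str:
    """Map a raw Dale-Chall score to its grade-band label."""
    for upper, label in BANDS:
        if score < upper:
            return label
    # Anything at or above 10.0 falls in the top band.
    return "College graduate"


print(grade_band(7.5))  # Grades 9-10
```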

r.dale_chall() returns a DaleChallResult object:

| Field | Type | Description |
|---|---|---|
| score | float | Raw Dale-Chall score, typically 4–10 |
| grade_levels | list[str] | Grade band for this score, e.g. ["9", "10"] or ["college"] |
| grade_level | str | First item in grade_levels (the lower bound of the band) |

The library defaults to a 100-word minimum. Passing fewer words raises a ValueError. You can lower this threshold with Readability(text, min_words=50), but scores from short texts are less reliable.

  • Does not work well for technical or software documentation. Words like “cookie,” “stream,” “execute,” “crash,” and “enter” appear on the familiar-word list in their everyday English senses. A page about browser cookies or command-line syntax will score as easier than it actually is for readers without that domain knowledge.
  • Not appropriate for grades 1–3. The formula’s lowest score band covers grades 1–4 as a single undifferentiated range. It cannot distinguish between a text appropriate for grade 1 and one appropriate for grade 4. Use Spache for primary-grade text.

  • The word list reflects 1984 American English. The 1995 revision was calibrated on word familiarity data from approximately 1984. Words common in contemporary life but absent from that era — “app,” “email,” “wifi,” “streaming” — are not on the list and will be counted as difficult. Agricultural vocabulary from mid-20th-century American life, such as “haystack” and “bushel,” appears on the list even though many contemporary readers may not recognize these words.

  • Derivational suffixes count as difficult. A text using abstract nouns ending in -tion, -ment, or -ation will score harder than a text using the same root words in base form, because the suffixes create “new words” not covered by the list. Complete is familiar; completion is difficult.

from readscore import Readability
text = """
Before you take this medicine, tell your doctor about all other medicines you take,
including vitamins and supplements. Some medicines can interact with each other and
cause problems. Your doctor needs this information to keep you safe. Store this
medicine at room temperature, away from heat and direct light. Keep it out of
reach of children. Do not use it after the expiration date on the package. If you
miss a dose, take it as soon as you remember. If it is almost time for your next
dose, skip the missed dose. Do not take two doses at the same time to make up for
a missed one. Contact your doctor or pharmacist if you have questions.
"""

# Build the analyzer, then run the Dale-Chall formula on the sample text.
r = Readability(text)
result = r.dale_chall()

print(f"Score: {result.score:.2f}")  # Score: ~5.12
print(f"Grades: {result.grade_levels}")  # ['5', '6']
print(f"Primary grade: {result.grade_level}")  # '5'