Inspired by the folks over at FiveThirtyEight to apply quantitative analysis to everything in life, I thought I’d take a look at the statistics for acquiring Greek vocabulary for the study of the New Testament.
Every introductory Greek class presents statistics about the vocabulary distribution and why we focus on high-frequency words (and what that gets us). But I haven’t seen the data presented in quite this fashion, which quantifies the challenges of acquiring NT Greek and also points to some sources of hope for the task.
The Greek New Testament, as edited in the latest Nestle Aland 28 / UBS 5 volumes, consists of the following (rounded):
- 7,940 total verses
- 138,150 total words (by word count)
- 5,420 distinct words
That’s a lot of words! (And it is much, much higher for Hebrew.)
But let’s break it down further.
Frequency of Word Usage
The first thing to observe is that, of these 5,420 words, nearly 2,000 appear only once (~36%). That’s an astounding number! Of course many of these are proper nouns, but even removing those, you are left with a ton of vocabulary that appears only one time (often called hapax legomena).
On the other end of the spectrum, 172 words appear 100x or more (3%), with another 138 appearing 50-99x (2.5%).
That seems pretty depressing, right? However, that doesn’t tell the whole story. These 310 words may only account for 5.5% of the total vocabulary of the NT, but they make up almost 80% of total occurrences of words in the NT. That’s the comforting part! The NT doesn’t follow the 80/20 rule, but rather something much better: the 80/5.5 rule!
This explains why most introductory Greek courses focus on acquiring knowledge of the 50+ frequency words, since that means you’re able to read 80% of the NT without using a dictionary. After that, of course, the work is much harder to move to 85%, or 90%, and so on.
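The percentages above can be rechecked with a few lines of Python. This is only a sketch built on the rounded figures quoted in this post; the variable and function names are illustrative, and the 80% coverage figure is taken from the text rather than computed from a word list. With the rounded totals, the 310 core words come out to about 5.7% of the vocabulary, close to the 5.5% obtained by adding the two rounded band percentages.

```python
# Rounded figures quoted in the post; exact counts vary by edition and counting method.
DISTINCT_WORDS = 5420
TOTAL_OCCURRENCES = 138150

# Frequency bands from the post: label -> number of distinct words.
bands = {
    "100+ occurrences": 172,
    "50-99 occurrences": 138,
    "1 occurrence (approx.)": 2000,
}


def vocab_share(word_count, distinct_words=DISTINCT_WORDS):
    """Return the percentage of the distinct vocabulary that word_count represents."""
    return 100 * word_count / distinct_words


for label, count in bands.items():
    print(f"{label}: {count} words = {vocab_share(count):.1f}% of vocabulary")

# The "core" list is every word that occurs 50 times or more.
core = bands["100+ occurrences"] + bands["50-99 occurrences"]
core_share = vocab_share(core)

# Assumption: the post reports that these words cover almost 80% of all occurrences.
covered = 0.80 * TOTAL_OCCURRENCES
print(
    f"{core} core words = {core_share:.1f}% of vocabulary, "
    f"roughly {covered:,.0f} of {TOTAL_OCCURRENCES:,} running words"
)
```

Swapping in exact counts from a tagged Greek text would tighten these numbers, but the overall shape of the distribution should stay the same.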
Here are two charts that help summarize this data.
The basic summary is this:
- Investing significant time up front to acquire the 50+ words pays significant dividends.
- There is a tremendously “long tail” whereby going from 90% to 100% of words by occurrence (that is, ability to read that percentage of the NT without relying on a dictionary) requires learning >80% of the total vocabulary! That’s a huge step!
- Perhaps the best idea is to focus on the 882 words that get you to 90%.
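To make the “long tail” concrete, here is a rough Python sketch of how many new words each step of coverage costs. It assumes the milestones quoted in this post (310 words for about 80%, 882 words for 90%, and all 5,420 for 100%) and treats “almost 80%” as exactly 80; the names are illustrative only.

```python
DISTINCT_WORDS = 5420  # rounded figure from the post

# (cumulative words learned, approximate % of occurrences covered), per the post.
# Assumption: "almost 80%" is treated as exactly 80 for this estimate.
milestones = [(0, 0), (310, 80), (882, 90), (DISTINCT_WORDS, 100)]


def marginal_cost(points):
    """Return (start %, end %, new words, words per percentage point) for each step."""
    steps = []
    for (w0, c0), (w1, c1) in zip(points, points[1:]):
        new_words = w1 - w0
        steps.append((c0, c1, new_words, new_words / (c1 - c0)))
    return steps


steps = marginal_cost(milestones)
for start, end, new_words, per_point in steps:
    print(f"{start}% -> {end}%: {new_words} new words (~{per_point:.0f} per percentage point)")

# Share of the whole vocabulary needed just for the final step (90% -> 100%).
tail_share = 100 * steps[-1][2] / DISTINCT_WORDS
print(f"The last 10% of occurrences requires {tail_share:.1f}% of the vocabulary")
```

On these figures, each percentage point of coverage costs about 4 words up to 80%, about 57 words from 80% to 90%, and about 454 words from 90% to 100%, which is why stopping at the 882-word mark is so attractive.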
There are a ton of vocabulary aids out there; here’s a good place to start: Institute of Biblical Greek.
However, I believe the best way to acquire reading knowledge of Greek is not simply to study flash cards or vocab lists, but actually to read Greek! While Bible software packages are fantastic, it is far too tempting to track along each word with the mouse to find out what it means and how to parse it, thus short-circuiting the learning process.
For my money, the better alternative is to purchase one of the two fantastic “Reader’s Editions” of the NT. These volumes provide glosses/definitions for the words on each page that are relatively infrequent (e.g., <30 times in NT), so that you do not have to pause to look them up. The other words (e.g., >30 times) are included in a short lexicon in the appendix. Once you’ve memorized that list—which will usually happen by the third semester of Greek—you can read the entire NT without having to stop to look up words, because the ones you do not know are at the bottom of the page for easy reference.
I have benefited tremendously from using these Reader’s Editions, and I cannot recommend them highly enough.
Connecting to the pew
Three quick thoughts:
- Pray for seminary students as they learn the languages. It is tough work and can be discouraging (but it is also exciting).
- Pray for pastors to keep up their languages. If you serve on a church leadership team, make it a priority to help the pastor carve out time for study, for there are a million other things that will compete for time and crowd out serious study in the original languages. But such effort is the lifeblood of preaching and, thus, fuel for the church.
- Be wary when you read a commentary that is always making a big deal about hapax legomena (words used only once) in a given NT chapter or book. Often scholars appeal to these words as if they have some sort of loaded meaning or spiritual significance simply because they occur only once (or twice). Keep in mind that about 2,000 NT words occur only once, and 3,600 occur fewer than 5x. That’s a LOT of rare words, and it is hardly the case that all of them have special significance. Some do, of course, but not all. So be cautious before reading too much into a word simply because it doesn’t occur much in the NT.
 This varies based on whether you include the long ending of Mark, the adultery pericope in John, the “comma Johanneum,” and other major variants.
 Both publishers have also released Hebrew versions, which can be found on Amazon.