Quantifying the Task of Learning Greek

Inspired by the folks over at FiveThirtyEight to apply quantitative analysis to everything in life, I thought I’d take a look at the statistics for acquiring Greek vocabulary for the study of the New Testament.

Every introductory Greek class presents statistics about the vocabulary distribution and explains why we focus on high-frequency words (and what that gets us). But I haven’t seen the numbers presented in quite this fashion, which quantifies the challenges of acquiring NT Greek and also points to some sources of hope for the task.

Basic Statistics

The Greek New Testament, as edited in the latest Nestle-Aland 28 / UBS 5 editions, consists of the following (rounded):

  • 7,940 total verses[1]
  • 138,150 total words (by word count)
  • 5,420 distinct words

That’s a lot of words! (And the number is much, much higher for Hebrew.)

But let’s break it down further.

Frequency of Word Usage

The first thing to observe is that, of these 5,420 words, nearly 2,000 appear only once (~36%). That’s an astounding number! Of course many of these are proper nouns, but even removing those, you are left with a ton of vocabulary that appears only one time (such words are often called hapax legomena).

On the other end of the spectrum, 172 words appear 100x or more (3%), with another 138 appearing 50-100x (2.5%).

That seems pretty depressing, right? However, that doesn’t tell the whole story. These 310 words may only account for 5.5% of the total vocabulary of the NT, but they make up almost 80% of total occurrences of words in the NT. That’s the comforting part! The NT doesn’t follow the 80/20 rule, but rather something much better: the 80/5.5 rule!
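For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes these shares from the rounded figures quoted in this post. The variable names are my own labels, nothing is recounted from the Greek text itself, and because the inputs are rounded the results differ slightly from the percentages in the prose (310 of 5,420 comes out nearer 5.7% than 5.5%).

```python
# Figures as quoted in the post (all rounded); the names are illustrative labels.
TOTAL_OCCURRENCES = 138_150   # total words in the NT by word count
DISTINCT_WORDS = 5_420        # distinct vocabulary items
HAPAX = 2_000                 # "nearly 2,000" words that appear only once
FREQ_100_PLUS = 172           # words appearing 100x or more
FREQ_50_TO_100 = 138          # words appearing 50-100x
COVERAGE_50_PLUS = 0.80       # "almost 80%" of all occurrences


def share(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


core_words = FREQ_100_PLUS + FREQ_50_TO_100

summary = {
    "hapax_share_of_vocab": share(HAPAX, DISTINCT_WORDS),
    "core_words": core_words,
    "core_share_of_vocab": share(core_words, DISTINCT_WORDS),
    # Approximate number of word occurrences covered by the 50+ words.
    "core_occurrences": round(COVERAGE_50_PLUS * TOTAL_OCCURRENCES),
}

for label, value in summary.items():
    print(f"{label}: {value}")
```

The last entry is the encouraging one: on these figures, roughly 110,000 of the 138,150 word occurrences belong to just 310 words.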

This explains why most introductory Greek courses focus on acquiring knowledge of the 50+ frequency words, since that means you’re able to read 80% of the NT without using a dictionary. After that, of course, the work is much harder to move to 85%, or 90%, and so on.

Here are two charts that help summarize this data.

Chart 1. Frequency distribution: Total occurrences vs. Total unique words
Chart 2. Cumulative % of occurrences vs. % of words

The basic summary is this:

  • Investing significant time up front to acquire the 50+ words pays significant dividends.
  • There is a tremendously “long tail” whereby going from 90% to 100% of words by occurrence (that is, ability to read that percentage of the NT without relying on a dictionary) requires learning >80% of the total vocabulary! That’s a huge step!
  • Perhaps the best idea is to focus on the 882 words that get you to 90%.
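The cumulative picture behind this summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `words_needed` is a hypothetical helper of my own, and the toy corpus is made up and is not New Testament data. Only the final calculation uses figures from this post (882 of 5,420 words reaching 90%).

```python
from collections import Counter


def words_needed(frequencies, target):
    """Smallest number of most-frequent words whose occurrences reach `target`.

    frequencies: iterable of per-word occurrence counts, in any order.
    target: fraction of all occurrences to cover, between 0 and 1.
    """
    counts = sorted(frequencies, reverse=True)  # learn the commonest words first
    total = sum(counts)
    running = 0
    for learned, count in enumerate(counts, start=1):
        running += count
        if running >= target * total:
            return learned
    return len(counts)


# Made-up toy corpus (NOT NT data): three common words and fifteen one-offs.
toy_tokens = ["the"] * 50 + ["and"] * 25 + ["of"] * 10 + list("abcdefghijklmno")
toy_counts = Counter(toy_tokens).values()

print(words_needed(toy_counts, 0.80))  # words needed for 80% of the toy corpus
print(words_needed(toy_counts, 1.00))  # words needed for 100% of the toy corpus

# Figures from the post: 882 of 5,420 distinct words reach 90% of occurrences,
# so everything after that is the long tail.
tail_share = (5_420 - 882) / 5_420
print(f"{tail_share:.1%} of the vocabulary lies beyond the 90% mark")
```

Even in the toy corpus the pattern is the same: a handful of words covers most of the text, and the last stretch of coverage costs nearly all of the vocabulary.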

Practical Advice

There are a ton of vocabulary aids out there; here’s a good place to start: Institute of Biblical Greek.

However, I believe the best way to acquire reading knowledge of Greek is not simply to study flash cards or vocab lists, but actually to read Greek! While Bible software packages are fantastic, it is far too tempting to track along each word with the mouse to find out what it means and how to parse it, thus short-circuiting the learning process.

For my money, the better alternative is to purchase one of the two fantastic “Reader’s Editions” of the NT. These volumes provide glosses/definitions for the words on each page that are relatively infrequent (e.g., <30 times in NT), so that you do not have to pause to look them up. The other words (e.g., >30 times) are included in a short lexicon in the appendix. Once you’ve memorized that list (which will usually happen by the third semester of Greek), you can read the entire NT without having to stop to look up words, because the ones you do not know are at the bottom of the page for easy reference.

I have benefited tremendously from using these Reader’s Editions, and I cannot recommend them highly enough.[2]

Connecting to the pew

Three quick thoughts:

  1. Pray for seminary students as they learn the languages. It is tough work and can be discouraging (but it is also exciting).
  2. Pray for pastors to keep up their languages. If you serve on a church leadership team, make it a priority to help the pastor carve out time for study, for there are a million other things that will compete for time and crowd out serious study in the original languages. But such effort is the lifeblood of preaching and, thus, fuel for the church.
  3. Be wary when you read a commentary that is always making a big deal about hapax legomena (words used only once) in a given NT chapter or book. Often scholars appeal to these words as if they have some sort of loaded meaning or spiritual significance simply because they only occur once (or twice). Keep in mind that 2,000 NT words occur only once, and 3,600 occur fewer than 5 times. That’s a LOT of rare words, and it is hardly the case that all of them have special significance. Some do, of course, but not all. So be cautious before reading too much into a word simply because it doesn’t occur much in the NT.


[1] This varies based on whether you include the long ending of Mark, the adultery pericope in John, the “comma Johanneum,” and other major variants.

[2] Both publishers have also released Hebrew versions, which can be found on Amazon.

