Modern Western musicians live and work mainly with a twelve-note chromatic equal-tempered scale. As you can read in ample resources elsewhere on the web, equal temperament is a tuning compromise that lets fixed-pitch instruments play in every key without sounding more out of tune in any key than in any other, and without sounding too out of tune in any key. While it’s easy to find good reading about why we use an equal temperament and what that means with respect to just intonation, it’s not so easy to find good resources that explain why our chromatic scale has twelve notes instead of some other number. Why not five, or seven, or thirty-five? Why does Western music use a twelve-note scale? This is what I’ll discuss here.

I will rely on some high-school-level mathematics, and the post is going to be too long as it is, so if you don’t know about exponentiation and logarithms, I’m afraid you’ll need to brush up on those topics elsewhere.

Remember that Pythagoras and other ancient Greeks observed that small-integer frequency ratios sound consonant to us. The ratios of interest for consonant intervals and the corresponding interval names are:

  • 2:1, the octave
  • 3:2, the perfect fifth (inversion is 4:3, the perfect fourth)
  • 5:4, the major third (inversion is 8:5, the minor sixth)
  • 6:5, the minor third (inversion is 5:3, the major sixth)

Before we go on, a quick note on past discussions related to this topic. It’s possible to find various claims that we have a twelve-note scale in Western music because there are twelve steps in the cycle of fifths. Even Leonard Bernstein makes this mistake, as my friend and colleague Richard Chon pointed out to me. Bernstein and others making the error are begging the question, putting the cart before the horse. In truth there is no cycle of pure 3:2 intervals at all. You can stack 3:2 intervals forever and you will never get back to the pitch class where you started; that’s the “comma” so often associated with Pythagoras. To get a cycle, one or more of the intervals in your stack has to deviate from a pure 3:2 frequency relationship. Many kinds of deviation are possible, and any scheme for bending a stack of nearly-3:2 intervals into a cycle will give you some number of notes in that cycle. The number could be twelve, but just putting “fifths” in a cycle doesn’t evince the number twelve in any clear way whatsoever. A larger number of notes in the scale (hence in the cycle of “fifths”) could give us a smaller “comma.” This post is my attempt to get the horse back ahead of the cart, explain the number twelve without assuming a cycle of stacked intervals at all, and then describe how tempering a twelve-note scale can give us the cycle of fifths we know and love for its provision that we can play sonorously in any key. We will see that 3:2 intervals play a vital role in leading to the number twelve, but not because there is any cycle.
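The claim that a stack of pure 3:2 intervals never closes can be checked with a line of arithmetic. Suppose k stacked 3:2 intervals landed exactly m octaves above the starting note (k and m are just counting symbols introduced here). Then we would need

    \[ (3/2)^k = 2^m \quad\Longrightarrow\quad 3^k = 2^{k+m} \]

which is impossible for any k ≥ 1, because every power of 3 is odd and every positive power of 2 is even.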

As we work through the discussion below, I might not completely set aside the interval names that we normally use, but remember that those names are based on counting steps in a diatonic scale, and our goal here is not to take any particular scale for granted. So we’re going to be talking about scales in which a single step might have the size of, for example, a minor third.

Criteria and Figures of Merit for Scales

What makes a scale better or worse? Different scales across history and across cultures show that nothing is a hard and fast requirement, but a scale is more musically useful and more practical the more of the following properties it satisfies:

  • It should be made up of basic steps that don’t have too many different sizes, or if the basic step sizes are different, they should be close together in size.
  • It should be able to exactly or approximately express all the consonant intervals in at least one key center; more key centers is better.
  • It should not have too many notes.

Generating Scales from a Basic Interval

One way to generate a scale is to start with an arbitrary note and generate new notes in the scale by stacking a generating interval on top of each successive note. If we use the octave as our generating interval we don’t get anything interesting; we have a one-note scale. The next simplest ratio is 3:2, and aside from the octave this is the easiest interval to tune on practical instruments because it involves the lowest harmonics. Even a dull-sounding fiber-string or gut-string instrument will have third harmonics usable for tuning, while higher overtones like the fifth harmonic (required for tuning based on any generating interval more complex than 3:2) can be quite hard to hear.

So while it might be interesting to investigate what we get by generating scales as stacks of pure 5:4 (“major third”) or 6:5 (“minor third”) intervals, we won’t pursue that here. The simplest ratio after the octave, the 3:2 that we call a perfect fifth, already gives us a real bounty of musical richness, and its practicality gives it the greatest likelihood of historical relevance.

First Half of the Answer: Two Nearly Equal Step Sizes

We’re going to see that a stack of eleven 3:2 intervals gives us a twelve-note scale with two nearly equal step sizes and also provides good approximations to the consonant intervals. Along the way we’ll stop off to view some of the conceptual scenery, of course. To get there, we start small.

Building a Pentatonic Two-Step Scale and Observing Patterns

A two-note scale generated by the perfect fifth has just the initial note and one other note 3:2 (a “perfect fifth”) away. In this scale there is a big step (3:2) and a little step (4:3) corresponding to our “perfect fifth” and our “perfect fourth,” respectively. A two-note scale isn’t very interesting.

Logarithmic-scaled number line showing an octave divided into two segments by two notes (plus one octave copy of the starting note)

What if we add another note by stacking another perfect fifth on top? We need to divide back down by two to get our new note into the octave, but once we’ve done that we see that once again we have two step sizes. This time our big step is 4:3 and our little step is between our starting note and the third note in the stack, the one we just added. Doing the frequency arithmetic, the frequency ratio of the note we just added is:

    \[ (3/2) \times (3/2) = 9/4 \]

and dividing down into the original octave brings us to 9/8. So our three-note scale’s smaller step is 9:8.

Logarithmic-scaled number line showing an octave divided into three segments by three notes (plus one octave copy of the starting note)

Stacking another 3:2 interval to make a four-note scale, we get a new step size, so this four-note scale has three step sizes in it:

Logarithmic-scaled number line showing an octave divided into four segments by four notes (plus one octave copy of the starting note)

Now if we stack another 3:2 interval (dividing down to stay within an octave again), something interesting happens. We drop back down to just two different step sizes! The new note in our scale subdivides the last remaining 4:3 step in the same way as the fourth note subdivided the prior 4:3 step, so we’re left with only 9:8 steps and 32:27 steps now:

Logarithmic-scaled number line showing an octave divided into five segments by five notes (plus one octave copy of the starting note)

Now we can see a pattern of how this plays out: When a new note subdivides the largest step we already have, it will create subdivisions identical to those created by any prior subdivision of other instances of the same-size step. (You might try proving this if you’re mathematically inclined.) For example, when our fifth note subdivided a 4:3 step, we got a 9:8 and a 32:27, which are exactly the same subdivisions that were created when our fourth note subdivided the 4:3 step where it landed.
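The stacking procedure described so far can be sketched in a few lines of Python. This is only an illustration: the function names `stacked_scale` and `step_sizes` are made up here, and exact fractions are used so that the step ratios come out as 9/8, 32/27 and so on rather than as rounded decimals.

```python
from fractions import Fraction

OCTAVE = Fraction(2)
FIFTH = Fraction(3, 2)


def stacked_scale(n):
    """Return the n notes made by stacking 3:2 intervals, folded into one octave and sorted."""
    notes = []
    ratio = Fraction(1)
    for _ in range(n):
        notes.append(ratio)
        ratio *= FIFTH
        if ratio >= OCTAVE:  # divide down by two to stay within the octave
            ratio /= OCTAVE
    return sorted(notes)


def step_sizes(scale):
    """Return the ratio of every step, including the last step up to the octave copy."""
    ladder = scale + [OCTAVE]
    return [upper / lower for lower, upper in zip(ladder, ladder[1:])]


for n in range(2, 6):
    steps = step_sizes(stacked_scale(n))
    print(n, [str(s) for s in steps], "distinct sizes:", len(set(steps)))
```

For five notes the steps come out as 9/8, 9/8, 32/27, 9/8, 32/27, which is the two-step pentatonic scale described above; for four notes there are three distinct sizes.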

As a consequence of that idea, we can view our new two-step five-note scale as a combination of two earlier two-step scales: the three-note and the two-note “scales.” More generally, for every two-step scale we will meet in this post (up through 29 notes), to get a new two-step scale with m+n notes, we can start with a two-step scale that has m notes and n big steps in it, and subdivide the n big steps using a shifted copy of a two-step scale having n notes, if a two-step n-note scale exists. (I’ll leave as an exercise the question of what happens to the scheme for scales larger than 29 notes.)

It’s also worth pausing here to notice that this five-note scale made from stacking 3:2 intervals is “shaped” very much like the pentatonic scales you’re probably deeply familiar with: a cluster of three notes separated by the small step, another cluster of two notes separated by the small step, and both gaps between clusters are the big step. So here you can begin to see in some detail the direction I’m trying to go with this discussion: I’m going to conjecture that the main reason pentatonic scales have arisen independently many times across different civilizations around the world is this simplicity: They are simple to generate with an interval that’s easy to tune on just about any pitched instrument with any timbre, and they have just two step sizes.

Before we add more notes to the scale, one more quick tangent: Just like the pentatonic scales you’re probably familiar with, the pentatonic scale generated by stacked 3:2 intervals contains something close to a just-intoned 5:4 (a “major third”) and something close to a just-intoned 6:5 (a “minor third”). Specifically, 81:64 is slightly wide of 5:4 (by a factor of 81:80) and 32:27 is slightly narrow of 6:5 (by a factor of 80:81).
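The 81:80 and 80:81 figures follow from dividing each stacked interval by its just-intoned counterpart:

    \[ \frac{81/64}{5/4} = \frac{81}{80} = 1.0125, \qquad \frac{32/27}{6/5} = \frac{160}{162} = \frac{80}{81} \approx 0.9877 \]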

And so here we can see the beginning of how decisions have to be made, consciously or unconsciously, in scale design. Rather than use the pitches generated strictly by stacking 3:2 intervals, any tradition using a pentatonic scale like this one might choose — if their instruments permitted — to tune the third note a pure 5:4 interval above the first note and the fifth note a pure 6:5 interval below the top note of the octave (equivalently, 5:3 above the first note). Doing so would give a pure major triad and a pure minor triad.

Beyond Pentatonic: Diatonic and Chromatic Two-Step Scales

OK, let’s keep adding notes and see what happens. Here’s the six-note scale resulting from stacking five 3:2 intervals:

Logarithmic-scaled number line showing an octave divided into six segments by six notes (plus one octave copy of the starting note)

Now we’re back to having three step sizes.

Here’s the next step in our progression, a seven-note scale from stacked 3:2 intervals:

Logarithmic-scaled number line showing an octave divided into seven segments by seven notes (plus one octave copy of the starting note)

Now things are looking interesting indeed. With seven notes we’re back down to two different step sizes, and look! This is the lydian mode of our usual diatonic (major) scale! So our diatonic scale isn’t some random surprise at all; it’s just the collection of seven notes you get if you keep using 3:2 intervals to tune the next string of your instrument! This is where the white keys on the piano come from! And think for a second about the fact that we directly get a lydian scale (as opposed to ionian) from this simple set of interval-stacking operations. This is what George Russell built an entire treatise upon.

Also, notice as we mentioned before that we can arrive at this two-step seven-note scale by starting with our two-step five-note scale which has two big steps, and interleaving our two-step two-note “scale” into those two big steps.
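As a quick check of the seven-note result, the sketch below spells out the step pattern of a two-step scale using “L” for the larger step and “s” for the smaller one. The helper names are invented for this illustration, and the labeling only makes sense for scales that really do have exactly two step sizes.

```python
from fractions import Fraction


def stacked_scale(n):
    """Return n notes from stacked 3:2 intervals, folded into one octave and sorted."""
    notes, ratio = [], Fraction(1)
    for _ in range(n):
        notes.append(ratio)
        ratio *= Fraction(3, 2)
        if ratio >= 2:  # fold back into the starting octave
            ratio /= 2
    return sorted(notes)


def step_pattern(n):
    """Return a string such as 'LLLsLLs' for a two-step scale of n notes."""
    ladder = stacked_scale(n) + [Fraction(2)]
    steps = [upper / lower for lower, upper in zip(ladder, ladder[1:])]
    if len(set(steps)) != 2:
        raise ValueError(f"the {n}-note scale does not have exactly two step sizes")
    big = max(steps)
    return "".join("L" if s == big else "s" for s in steps)


print(step_pattern(5))  # two big steps, ready to be subdivided
print(step_pattern(7))  # whole, whole, whole, half, whole, whole, half
```

With L read as the 9:8 whole tone and s as the 256:243 semitone, the seven-note pattern LLLsLLs is the lydian mode, and the five-note pattern shows the two big steps that the two-note “scale” gets interleaved into.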

I won’t bore you with the details, but as we’ve seen before, adding another note introduces a new step size by subdividing one of the steps we already have. Eight, nine, ten, and eleven-note scales generated like this have three step sizes, and then when we hit a twelve-note scale, the number of step sizes falls back down to two. As we saw, the prior stages with two step sizes are diatonic and pentatonic, both definitely of deep cultural significance. And the twelve-note scale with two step sizes is important in the same way. Let’s have a look.

Logarithmic-scaled number line showing an octave divided into twelve segments by twelve notes (plus one octave copy of the starting note)

Check out how the diatonic notes — those from the seven-note scale above (akin to “white keys” on the piano if we suppose that our scale spans an F-to-F octave) — are those with steps of 256:243 below them, while those notes that weren’t in the seven-note scale (akin to “black keys”) are the ones with steps of 2187:2048 below them.

And again, we can view this two-step twelve-note scale as our two-step seven-note scale which has five big steps, plus a two-step five-note scale interleaved into those five big steps. If you’re with me so far, you can now give a good explanation for why the black keys on a piano make up a pentatonic scale and the white keys make a diatonic scale.
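The white-key and black-key observation can be checked mechanically. The sketch below assumes the F-to-F reading suggested above, so the stack of 3:2 intervals is labeled F, C, G, and so on; those note names and the helper functions are assumptions made for this illustration.

```python
from fractions import Fraction

# Assumed labels for the stack of twelve notes, starting from F.
NAMES = ["F", "C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "D#", "A#"]


def named_scale():
    """Return (ratio, name) pairs for the twelve stacked notes, sorted by pitch."""
    notes, ratio = [], Fraction(1)
    for name in NAMES:
        notes.append((ratio, name))
        ratio *= Fraction(3, 2)
        if ratio >= 2:  # fold back into the starting octave
            ratio /= 2
    return sorted(notes)


def step_below_each_note():
    """Map each note name to the size of the step immediately below it."""
    scale = named_scale()
    result = {}
    for i, (ratio, name) in enumerate(scale):
        if i == 0:
            below = scale[-1][0] / 2  # the top note, dropped one octave
        else:
            below = scale[i - 1][0]
        result[name] = ratio / below
    return result


for name, step in step_below_each_note().items():
    kind = "black" if "#" in name else "white"
    print(f"{name:2} ({kind}) step below: {step}")
```

Every unsharped name reports a step of 256/243 below it and every sharped name reports 2187/2048, giving seven small steps and five big ones.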

The next two scale sizes generated by 3:2 with only two step sizes are 17 notes (12 with 5 big steps plus 5) and 29 notes (17 with 12 big steps plus 12). Beyond that, the next two-step scales have 41 and 53 notes, which is more than is practical for most instruments. (Take my word for this or work it out yourself, possibly using your favorite scripting language on a computer if you’re inclined to.) Notice that for our twelve-note scale the two step sizes are close to the same, differing by a factor of

    \[(2187 \times 243) / (256 \times 2048) = 531441 / 524288  \approx 1.0136 \]

In contrast, the 17-note scale’s steps differ in size by a factor of about 1.04 and the 29-note scale’s steps differ in size by a factor of about 1.025, both significantly more than for our twelve-note scale.
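For readers who would rather work this out than take it on faith, here is a minimal sketch that lists the two-step scales of up to 29 notes, the range discussed above, along with the factor by which their two step sizes differ. The function names are invented for this illustration.

```python
from fractions import Fraction


def step_sizes(n):
    """Return the step ratios of the n-note scale built from stacked 3:2 intervals."""
    notes, ratio = [], Fraction(1)
    for _ in range(n):
        notes.append(ratio)
        ratio *= Fraction(3, 2)
        if ratio >= 2:  # fold back into the starting octave
            ratio /= 2
    ladder = sorted(notes) + [Fraction(2)]
    return [upper / lower for lower, upper in zip(ladder, ladder[1:])]


def two_step_scales(limit):
    """Return every scale size from 2 to limit that has exactly two step sizes."""
    return [n for n in range(2, limit + 1) if len(set(step_sizes(n))) == 2]


def step_size_factor(n):
    """Return the largest step divided by the smallest step, as an exact fraction."""
    steps = step_sizes(n)
    return max(steps) / min(steps)


for n in two_step_scales(29):
    print(n, round(float(step_size_factor(n)), 4))
```

The sizes printed are 2, 3, 5, 7, 12, 17 and 29, and the twelve-note scale has the smallest factor of the group at about 1.0136.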

So now we have the first half of the answer to our “Why Twelve” question: When it comes to scales generated from stacked 3:2 intervals, the twelve-note scale has only two different step sizes, and those two step sizes are more nearly equal than those in any other two-step scale of up to 29 notes generated in the same way. Its nearly equal step sizes mean that the twelve-note scale is most like a scale with a single step size evenly dividing the octave. Even if we allow three different step sizes, a 3:2 stacked scale would need to contain 41 notes before its largest and smallest step sizes would differ by a factor smaller than the twelve-note scale’s 1.0136. Some artists and musicologists advocate scales with such large numbers of notes, but using them would seem to impose a significant performance and notation burden well beyond what the Western musician experiences in our familiar twelve-tone world.
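The 41-note figure can be confirmed with a short search. This sketch, with hypothetical function names and an arbitrary search limit of 100 notes, looks for the first scale larger than twelve notes whose largest and smallest steps differ by a smaller factor than the twelve-note scale’s.

```python
from fractions import Fraction


def step_spread(n):
    """Return (largest step) / (smallest step) for the n-note stacked-3:2 scale."""
    notes, ratio = [], Fraction(1)
    for _ in range(n):
        notes.append(ratio)
        ratio *= Fraction(3, 2)
        if ratio >= 2:  # fold back into the starting octave
            ratio /= 2
    ladder = sorted(notes) + [Fraction(2)]
    steps = [upper / lower for lower, upper in zip(ladder, ladder[1:])]
    return max(steps) / min(steps)


def first_scale_beating(reference=12, limit=100):
    """Return the first size above `reference` with a smaller spread, or None within `limit`."""
    target = step_spread(reference)
    for n in range(reference + 1, limit + 1):
        if step_spread(n) < target:
            return n
    return None


print(first_scale_beating())
```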

Second Half of the Answer: Minimum Equal-Temperament Compromise

We’re about to observe that a twelve-note scale offers some advantages when we wish to use equal temperament. We would be wrong to suppose that these advantages connected with equal temperament had anything to do with Western music’s foundation on a twelve-note scale, though. After all, Western music used a twelve-note scale for centuries before temperaments started to catch on. And even after the advantages of tempered scales were widely appreciated (partly due to famous work by J. S. Bach), centuries more had to pass before equal temperament became a practical and common reality for fixed-pitch instruments.

So although the concept of equal temperament didn’t help lead us to a twelve-note scale, it’s a happy coincidence that the concept of equal temperament applies more gracefully to a twelve-note scale than to any other scale with fewer than 19 notes.

What do I mean when I say that equal temperament applies gracefully? What I mean is that it compromises the tuning of the consonant intervals the least. In a twelve-note equal-tempered scale the most out-of-tune consonant interval is the 6:5 interval (conventionally called a minor third), which comes out about 16 cents flat. I made a spreadsheet to calculate how well an equal-tempered scale with a given number of notes can approximate the consonant intervals.
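The same kind of calculation can be sketched in Python. This is not the spreadsheet itself; it assumes one particular figure of merit, the worst error in cents among the nearest equal-tempered approximations to 3:2, 5:4 and 6:5. Their inversions have errors of the same magnitude, because the octave is exact in any equal temperament.

```python
import math

# The consonant intervals from the list near the top of the post (inversions omitted).
CONSONANCES = {"3:2": 3 / 2, "5:4": 5 / 4, "6:5": 6 / 5}


def cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


def errors_in_cents(n):
    """Return the signed error of the nearest n-note equal-tempered step for each interval."""
    step = 1200 / n
    errors = {}
    for name, ratio in CONSONANCES.items():
        target = cents(ratio)
        nearest = round(target / step) * step
        errors[name] = nearest - target  # negative means the tempered interval is flat
    return errors


def worst_error(n):
    """Return the largest absolute error, in cents, among the consonant intervals."""
    return max(abs(e) for e in errors_in_cents(n).values())


for n in range(5, 20):
    print(n, round(worst_error(n), 2))
```

Under this measure twelve has the smallest worst-case error of any size below 19, with the 6:5 interval as its weakest approximation, and 19 is the first size that does better.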

Conclusion: Twelve is a Happy Number

Summarizing what we’ve seen, twelve is the only number of notes that has:

  • Only two step sizes when generated as a stack of 3:2 intervals;
  • Step sizes that are most nearly equal for any such two-step scale with a practical number of notes; and
  • Excellent approximation (within a factor of 81:80) of the consonant major and minor thirds and sixths, and perfect intonation for the fourths and fifths.

Those attributes are historically the answer to “Why twelve?” but as we’ve seen, twelve also offers an additional bonus in the age of equal temperament:

  • Very close approximation of the consonant intervals by an equal-tempered scale.

So that’s why as a culture, we were able to hold onto our twelve-note scale tradition and step fortuitously into a world of equal temperament, allowing music to advance to playing with equal, manageable dissonance in every key!