Just a quick note: My absolute favorite phonetics instruction software is sndpeek by Princeton Sound Lab, a real time Fast Fourier Transform and Waveform display program. Unfortunately, because Apple deprecated a bunch of old audio methods, it was broken with the update to Lion. However, some kind soul has updated the software to be Lion compatible, and it’s back to working like a charm. To download it, visit the sndpeek website and click on the “mac (mac osx lion) binary”.

To install, download the tgz file (the below code assumes you’ve downloaded it to the desktop), double click it to expand, then open a terminal and type:

cd ~/Desktop/sndpeek-1.3-exe

(then hit "enter")

sudo cp bin/sndpeek /bin/

Once you hit enter after typing the above command, OS X will ask for your administrator password (to copy the file deep into the filesystem). Once you’ve entered it, at any point in the future, you’ll be able to just type “sndpeek” into a terminal and it’ll pop up a window displaying whatever sound source is selected in your Sound input preference pane. I usually give a more complex command to produce a prettier output:

sndpeek --logfactor:0.5 --lissajous:OFF --features:OFF --depth:150

I encourage you to play with the software as there are few better tools to help understand what a spectral slice is, how it works, and how spectrograms can be made. Enjoy!
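If you'd like to see roughly what goes into a single spectral slice, here is a minimal Python sketch. It is not sndpeek's code: the function name `spectral_slice`, the Hann window, the 8000 Hz sample rate, and the 1000 Hz test tone are all my own assumptions, chosen only to keep the example small. It takes one short frame of audio, runs a plain discrete Fourier transform over it, and reports how much energy sits at each frequency.

```python
import cmath
import math


def spectral_slice(samples, sample_rate):
    """Return (frequency_hz, magnitude) pairs for one frame of audio.

    A plain discrete Fourier transform, written for clarity rather than speed.
    """
    n = len(samples)
    # Hann window (an assumption of this sketch) to soften the frame edges
    windowed = [
        s * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]
    result = []
    # Only the first half of the bins is needed for a real-valued signal
    for k in range(n // 2):
        total = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        result.append((k * sample_rate / n, abs(total)))
    return result


# Hypothetical test signal: a pure 1000 Hz tone sampled at 8000 Hz
sample_rate = 8000
frame_length = 256
tone = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(frame_length)]

slice_ = spectral_slice(tone, sample_rate)
peak_freq, peak_mag = max(slice_, key=lambda pair: pair[1])
print(peak_freq)  # 1000.0
```

Stack many of these slices side by side over time and you have the raw material for a spectrogram.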

Tagged with Computers and Software, Conventional Linguistics, Followups, Phonetics and Phonology


This morning, Consumerist linked to an article in Primer Magazine (for some reason), titled “10 Words You Mispronounce That Make People Think You’re an Idiot”.

With a name like that, it couldn’t be anything but judgmental pedantry, but even in an otherwise eyeroll-worthy article, I found that several of these words are actually completely reasonable pronunciations, and several of them demonstrate interesting phonological processes. So, I’m going to discuss them a little bit.

Athlete (pronounced with a schwa in the middle, “Ath-uh-leet” /æθəlit/)

This is a very reasonable and common pronunciation, which I noticed extensively in the speech of even experts on the subject (Michael Lewis, the author of Moneyball: The Art of Winning an Unfair Game, is a notable /æθəlit/ speaker). Here, the change likely comes from our dislike of having an interdental sound (/θ/) right next to a lateral (/l/). If you attempt to make the “correct” pronunciation, you’ll notice that your tongue is, in a sense, trapped between your front teeth, and to make a smooth gesture, you end up having to attempt to curve the sides of the middle and back of your tongue down. Which is unpleasant. So, it’s not shocking at all that speakers who use the word often may add the schwa.

(It’s also worth noting that there is no ‘H’ in Athlete, despite the author’s smug assertions that “there is no vowel between the ‘H’ and the ‘L’ in any of these words”. The English “TH” in this word is actually a single sound, a voiceless interdental fricative, which is nothing resembling an /h/. Once again, pedantry is seldom done well enough to be immune to further pedantry.)

Utmost (pronounced as “upmost”, /ʌpmowst/)

This is an awesome example of assimilation, two sounds becoming more like one another to make the speaker’s life easier, a phenomenon I’ve discussed before. Here, in the “correct” pronunciation, /ʌtmowst/, we have a /t/ sound, created at the alveolar ridge (just behind the teeth, try it) followed immediately by /m/, a bilabial sound created by pressing the two lips together.

When speakers are “mispronouncing” the word as /ʌpmowst/, they’re actually being more efficient, substituting in a /p/, also a bilabial sound, which allows them to simply close their lips (creating the /p/), then lower the velum (allowing nasal airflow) and start voicing to begin making the /m/. Going from /p/ to /m/ requires no additional tongue or lip movement, whereas going from /t/ to /m/ requires reconfiguration of the tongue and lips. Efficiency. Not quite the idiot pronunciation he’s claiming.
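For the rule-minded, the assimilation described above can be sketched as a tiny rewrite rule. This is only an illustration under my own assumptions: the `PLACE` table covers a handful of consonants, the function name `assimilate_t` is made up, and real speech is far less tidy than a lookup table.

```python
# Place of articulation for a few consonants (illustrative, not exhaustive)
PLACE = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "t": "alveolar", "d": "alveolar", "n": "alveolar",
    "k": "velar", "g": "velar", "ŋ": "velar",
}

# The voiceless stop made at each place
VOICELESS_STOP = {"bilabial": "p", "alveolar": "t", "velar": "k"}


def assimilate_t(segments):
    """Replace /t/ with the voiceless stop that shares the place of the next consonant."""
    out = list(segments)
    for i in range(len(out) - 1):
        next_place = PLACE.get(out[i + 1])
        if out[i] == "t" and next_place is not None:
            out[i] = VOICELESS_STOP[next_place]
    return out


utmost = ["ʌ", "t", "m", "o", "w", "s", "t"]
upmost = "".join(assimilate_t(utmost))
print(upmost)  # ʌpmowst
```

The final /t/ is left alone because nothing follows it, and only the /t/ sitting right before the bilabial /m/ changes.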

Sherbet (pronounced as “sher-bert”, /ʃɜɹbəɹt/)

Why does Primer Magazine hate assimilation? The first syllable has an “err” (/ɜɹ/) sound, why not the second syllable too? If we can keep the whole word vaguely “r-sounding” (“rhotic”, in phonetic terms), all the better. Speakers love regularity. Primer Magazine doesn’t.

“For all intensive Purposes”

This is really a horsed zebra. For further discussion of this, see a post I made last week.

Often (pronounced as “offen”, /ɑfɪn/)

How many Americans say “often” with the /t/, ever? This is textbook deletion of an unpleasant sound to simplify a cluster, and it’s one carried out by many, many people. Why bother with a /ft/ cluster when there’s no need to keep it around? It’s not like there’s another word, “Offen”, which this form of “often” could be confused with, and frankly, for speed, fluidity, and social reasons (in the US), the “offen” pronunciation is really a better choice.

Edit: OK, I misread this one completely in my anti-pedant rage. The author of the quoted article is actually _in favor_ of “offen” as the “proper” form, and I responded assuming that he, like so many others have, was arguing that “often” (with a /t/) is the only proper form. So, I’ve culled some of the anger from the post, and kept the phonology. Thanks, commenter!

Awry (pronounced as “aw-ree”, /’ɑɹi/ instead of “uh-rye” /ə’ɹaj/)

This word is a textbook example of why our writing system needs to be taken out behind the barn and dispatched as humanely as possible. Although “wry” is used for the proper /ɹaj/ pronunciation in the word “wry” (and only there), usually the “aw” digraph represents /ɑ/ (as in “claw”, “maw”, “awful”, “awkward”) and the “ry” represents /ɹi/ (as in “fury”, “worry”, “scurry”). I can understand the author feeling the need to state the proper pronunciation of the word, but his indignation at the thought that anybody could EVER think “awry” is pronounced “aw-ree” is just silly.

So, there’s a bit of phonological goodness wrung out of an otherwise dry and pedantic bit of prescriptivism. Which I am going to pronounce as “per-scriptivism” for the remainder of the day. Just to anger Justin Brown.

Tagged with Conventional Linguistics, Language Change, Phonetics and Phonology, Tirades, Words, Phrases, and Idioms


Only yesterday, I briefly mentioned Mondegreens, where a song lyric is misheard as some other homophonous (identical-sounding) phrase (“killed him and laid him on the green” vs. “killed him and Lady Mondegreen”). This gave me cause to mention Jimi Hendrix’ “Purple Haze” and its famous Mondegreen. The original lyric is:

Purple haze all in my brain
Lately things just don’t seem the same
Actin’ funny, but I don’t know why
‘Scuse me while I kiss the sky

But many people hear the last line as “‘Scuse me while I kiss this guy”, and that misperception actually reveals something very interesting about how English consonants work.

What makes /k/ different from /g/?

Both /k/ and /g/ are what linguists refer to as “stops”: consonants where the airstream out of the mouth is completely obstructed. More specifically, both /k/ and /g/ are “velar” stops, made with the tongue up against the soft palate, or velum. Try it, making a /k/ as in “cap” and a /g/ as in “gap”, one after the other, and you’ll notice that your tongue isn’t changing position when you switch from /k/ to /g/ at all.

The simplistic explanation is that /k/ is a voiceless sound (meaning that our vocal folds/cords aren’t vibrating while we make the closure), and /g/ is a voiced sound, involving glottal vibration during the closure. Unfortunately, like most things in phonetics, it’s not quite that simple or easy.

Voice Onset Time

In reality, stop consonants are classified by their voice onset time, the amount of time that elapses between when the stop is released (when the tongue stops blocking airflow) and when the voicing starts (when the vocal folds start vibrating) for the following vowel. By looking at voice onset time (VOT), we can actually classify consonants in three different ways. (I’ve actually discussed voice onset time before, but now that I’ve already made nicer looking graphics for teaching, it seems worth doing again.)

First, [kʰ]. In English, any voiceless stop that’s at the start of a syllable (so the /k/ in “cap”, but not “pack”) is “aspirated”, meaning that there’s a considerable time gap with a burst of air between the opening of the stop and the start of voicing (it has a positive voice onset time). In the word “cap” /kæp/, we bring our tongue back to the velum to make a closure, we release that closure, and then, around 100 ms (milliseconds) later, we start voicing for the vowel /æ/. Viewed in terms of the acoustical waveform of speech, here’s what aspiration and VOT look like in [kʰa]:

[g], on the other hand, is a voiced stop, where voicing actually starts during the closure. So, the tongue moves up to the velum, the vocal folds begin vibrating, and then, when the stop is released, the vowel begins immediately. The voice onset time is negative, as the voicing started before the release. See yet another waveform diagram below, this time showing /ga/:

There’s a third option. Imagine that you started voicing at the exact moment that you released the stop, as shown below:

Then what you have is [k], what linguists refer to as a “voiceless unaspirated stop”, with a voice onset time of 0 (or close to it).

So, we have three stop choices: Voiced stops, voiceless unaspirated stops, and voiceless aspirated stops, which are all used differently in the different languages of the world. But how does this affect Jimi Hendrix?
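The three-way split above can be sketched as a small classifier over VOT measurements. Treat it as a rough illustration: the function name `classify_stop` is hypothetical, and the 30 ms boundary between “close to zero” and “aspirated” is an assumption I picked for the example, not a figure from this post. Real boundaries vary by language, speaker, and place of articulation.

```python
def classify_stop(vot_ms, short_lag_max_ms=30.0):
    """Classify a stop by its voice onset time in milliseconds.

    vot_ms is the time from stop release to the start of voicing.
    short_lag_max_ms is an assumed cutoff, not a measured constant.
    """
    if vot_ms < 0:
        # Voicing began before the release
        return "voiced"
    if vot_ms <= short_lag_max_ms:
        # Voicing began at, or just after, the release
        return "voiceless unaspirated"
    # A long gap filled with aspiration before voicing begins
    return "voiceless aspirated"


print(classify_stop(100))  # voiceless aspirated, like the [kʰ] in "cap"
print(classify_stop(-60))  # voiced, like a fully voiced [g]
print(classify_stop(5))    # voiceless unaspirated, like [k]
```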

English makes stops oddly

Our problems with Jimi Hendrix kissing guys (not that there’s anything wrong with that) come from three fundamental oddities in the way that English produces stops.

First, English only distinguishes between aspirated and voiced stops. “cap” starts with a /k/, which is produced with aspiration, and “gap” starts with /g/. We don’t have a three-way contrast between voiced [g], voiceless unaspirated [k], and voiceless aspirated [kʰ]. Korean, as I’ve mentioned before, has that three-way contrast.

Second, English word-initial (at the start of a word) voiced stops are actually produced as voiceless-unaspirated stops, with a VOT of ~0. This is because we, as English speakers, have really strong aspiration in our voiceless stops, so even if we produce something without much voicing during the closure, listeners will still be able to understand that it’s not aspirated, so clearly, the speaker must be intending to express voicing. Here’s a waveform of the word “guy”, to prove the point. Note that there’s very little VOT here.

Finally, when following an /s/, English voiceless stops are not aspirated. So, in the word “sky”, we have an unaspirated stop, rather than the normal, aspirated [kʰ] which our writing system would lead us to expect. Here’s a waveform showing the very small VOT in “sky”:

So, in effect, the /g/ in “guy” and the /k/ in “sky” are the same sound! Still don’t believe me? Well, first listen to sky, then listen to guy, then listen to “sky” where I’ve digitally removed the /s/. Your writing system has been lying to you!
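The three oddities can be put together in a short sketch that maps the consonants at the start of a word to a rough surface form. It is deliberately simplified and rests on my own assumptions: the function name `realize_onset` is made up, it only handles word onsets, and it treats the three rules as categorical when real speech is gradient.

```python
# Aspirated versions of the voiceless stops
ASPIRATED = {"p": "pʰ", "t": "tʰ", "k": "kʰ"}
# Word-initial "voiced" stops surface as voiceless unaspirated stops
DEVOICED = {"b": "p", "d": "t", "g": "k"}


def realize_onset(onset):
    """Map the phonemes of a word onset to a simplified surface form."""
    surface = []
    for i, seg in enumerate(onset):
        if seg in ASPIRATED:
            after_s = i > 0 and onset[i - 1] == "s"
            # No aspiration after /s/, aspiration otherwise
            surface.append(seg if after_s else ASPIRATED[seg])
        elif seg in DEVOICED and i == 0:
            surface.append(DEVOICED[seg])
        else:
            surface.append(seg)
    return surface


cap = realize_onset(["k"])       # ['kʰ']
guy = realize_onset(["g"])       # ['k']
sky = realize_onset(["s", "k"])  # ['s', 'k']

# Strip the /s/ from "sky" and the remaining onset matches "guy"
print(sky[1:] == guy)  # True
```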

So what does Jimi Hendrix kissing men have to do with Stop Acoustics?

When we look at the acoustics of “guy” and “sky”, it’s very easy to see that the two different perceptions of the lyric (“kiss the sky” and “kiss this guy”) are incredibly similar. When we realize that in English, [k] and [g] are functionally the same thing, the difference between our two choices:

… is seen to be only a question of where you put the /s/, and thus, really, no difference at all.

So, we see that not only are sounds in English not what our writing system makes them out to be, but that this “error” of perception is not only understandable, but linguistically fascinating as well.

So, next time you find yourself listening to Purple Haze, thank Jimi Hendrix for providing one of the best examples of the perceptual troubles which can come from our lack of a voiced/voiceless-unaspirated contrast in the English language. Or, curse me for linguistically corrupting an otherwise good song. Either or, really.

Tagged with Conventional Linguistics, Language and Music, Language Usage, Phonetics and Phonology, Speech and Grammar Errors

