The IPA Translation Widget: a wonderful impossibility

So, I’m somewhat obsessed with checking the statistics of who comes here, who gets referred from where, and what search terms they used to find me. Well, the other day, somebody came here from Google searching for “IPA translation widget”. For those of you unfamiliar with the terms, a “widget” is a small program written for Apple’s Dashboard interface, and IPA refers to the International Phonetic Alphabet. What this person seemed to want was a widget that, like some existing translation widgets, could take a block of text and immediately turn it into IPA characters. For the first few moments, I thought “Wow! That’d be a great idea!”

Now, as somebody who uses the IPA very, very frequently, such a thing would be wonderful if it worked well. However, I think it would be impossible to actually create a program that goes from English writing to IPA transcriptions without incredible advances in Artificial Intelligence and speech recognition. Here’s why…

Transcription, not translation

On the surface, this doesn’t seem so crazy. Apple includes a widget with Dashboard that does rough, automated translations, and although I never trust automated translations, it does all right for basic words and phrases. I suspect that our anonymous searcher saw that widget and thought “Wow, cool! I wonder if it can help me put something into the IPA”. However, the fundamental difference between translating a sentence into Spanish and putting that same sentence into the IPA is that the IPA isn’t really a language at all. Instead, it’s a method of writing sounds.

The International Phonetic Alphabet is really a set of symbols, each of which represents a sound, sound characteristic, or other element of spoken language. What the IPA allows a linguist (or speech pathologist, or teacher…) to do is to take spoken language and put it onto paper (‘transcription’) with a great deal more precision than most other writing systems. The IPA isn’t a language in itself; it’s just an alternative, phonetic writing system for other languages. The beauty of this is that the IPA is designed to be used not just for English, but for any language: its symbols can transcribe sounds from languages all over the world.

Broad vs. Narrow Transcription

The IPA can be used to transcribe sounds with two different degrees of precision.

If one takes advantage of all the symbols and diacritics, one can make a “narrow” or “phonetic” transcription. At this level, the linguist aims to capture all the detail possible about the word or phrase, including variations across word boundaries, sounds that occur in speech but are unnoticed or unrecognized by native speakers, and even features like intonation and pauses. From these transcriptions, a well-trained linguist could pronounce the words and phrases almost exactly as the speaker did, based simply on the transcriptions. The first, smallest line in the title graphic is a narrow transcription of me pronouncing the site’s title.

This degree of precision would be impossible for a modern computer widget to produce, simply because narrow transcriptions are based on words and phrases as a particular speaker actually said them, and really, one needs a fairly trained ear to make an accurate narrow transcription of a word or phrase. Sure, the widget could use a database of narrowly transcribed words from other speakers, but really, that’s not a narrow transcription. It’s not going to pick up on the variations that each speaker produces, like accents, vowel changes, unusual sound choices, or even tiny speech errors.

The alternative, called “broad” or “phonemic” transcription, expresses the basic sounds of a language or phrase, often more precisely than the native writing system, but at the same time, leaves out detail that’s not necessary to a native speaker. The middle line in the title graphic for this page is a phonemic transcription. Some dictionaries, including the built-in OS X dictionary (if you enable IPA in Dictionary Preferences), can show you the standard American broad IPA transcription of a word.

Now, using a dictionary of words in a given language and their IPA equivalents, a computer could likely match things and give a passable broad transcription. However, there are variations that occur between people that show up even at a broad level, and are large enough to identify a speaker’s accent, dialect, or even idiolect. For some people (myself included), “caught” and “cot” have the same vowel, but for others, they’re two distinct vowels. So, even at a broad level, you’re not going to get any sort of reliable transcription of one’s actual speech from a computer widget, just a rough approximation.
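To make that concrete, here is a minimal sketch in Python of the dictionary-lookup approach. Everything in it is an assumption made for illustration: the four-word lexicon, the function name, and the two accent labels ("merged" for speakers like me who pronounce “cot” and “caught” alike, "unmerged" for those who don’t). A real widget would need a full pronouncing dictionary, and it still couldn’t know which accent you actually have.

```python
# A toy pronouncing dictionary. The entries and the two accent labels are
# illustrative assumptions, not data taken from any real dictionary.
LEXICON = {
    "cot":    {"merged": "kɑt", "unmerged": "kɑt"},
    "caught": {"merged": "kɑt", "unmerged": "kɔt"},
    "the":    {"merged": "ðə",  "unmerged": "ðə"},
    "cat":    {"merged": "kæt", "unmerged": "kæt"},
}


def broad_transcribe(text, accent="merged"):
    """Look up each word and return the result between phonemic slashes.

    Words missing from the lexicon are flagged as <word?> rather than guessed.
    """
    pieces = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'")  # drop surrounding punctuation
        entry = LEXICON.get(word)
        pieces.append(entry[accent] if entry else f"<{word}?>")
    return "/" + " ".join(pieces) + "/"


print(broad_transcribe("The cat caught the cot."))
print(broad_transcribe("The cat caught the cot.", accent="unmerged"))
```

The same typed sentence comes out two different ways depending on which accent you ask for, and nothing in the text itself tells the program which one is right. That choice has to come from listening to the speaker.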

Why are you transcribing anyways?

In the end, whether such a widget would be useful at all boils down to your reason for needing a transcription. Some people might be learning English and would want a better method of knowing how a given word is supposed to sound. For that, any good dictionary’s pronunciation key should do the trick.

Some people might be interested in the IPA, or want to know how a given word sounds. For that, they’d be better off getting a good phonetics textbook and learning a bit of the IPA themselves, along with some knowledge of phonetics.

However, our widget searcher might just be stuck in an introductory Linguistics course, having to transcribe their speech for an assignment. If so, I offer just one piece of advice: Don’t plagiarize transcriptions off the web or from a dictionary. Your professor should have no trouble noticing if you’re not transcribing your own dialect, and everybody’s got a dialect.

Remember, if there’s one thing that phonetics professors are good at, it’s picking out a phone-y.

10 Responses to “The IPA Translation Widget: a wonderful impossibility”

  1. Jim Says:

    I can’t speak for the “IPA translation” googler but I found you searching for “phonetic alphabet widget.” What I’m really looking for is just a chart that would display in the Dashboard showing me: A - Alpha, B - Bravo, etc. (I think what I’m looking for is actually called the NATO Phonetic Alphabet which has a different application than the IPA). I work in IT and sometimes need to communicate long strings of letters and numbers over the phone, e.g., to vendors, suppliers, or end users on tech support calls. I have the NATO Phonetic Alphabet mostly memorized but if my head is deep in trying to troubleshoot a technical problem I like to have the crutch handy to look at. I’ve typed it up in Stickies but if someone wants to make a widget for that, I’d use it. Regardless, I’m glad I found your site. I’ve enjoyed looking through it.

  2. will Says:

    Jim,

    I’m so immersed in the IPA that I’d completely forgotten about the NATO Phonetic alphabet and its kin. You’re quite right that such a widget with the NATO alphabet would be not only possible, but fairly easy to construct, and I believe that the tools included with the next version of OS X will allow you to do that yourself.

    I’m glad you’ve enjoyed my site, and I thank you for reminding me of the NATO alphabet and the like. I may well post on that soon as well.

    LingMystic

  3. berna Says:

    Can I request the translation of English words to the International Phonetic Alphabet?

  4. will Says:

    Berna,

    I’m a little reluctant to start doing that, as really, transcription is something that you need to do for your own voice (or the voice in question). I don’t know what you sound like, so I can’t really help.

    Also, if I start doing that, I suspect that people in introductory linguistics classes might start requesting passages that are actually part of their homework.

    I’ll happily educate, and feel free to email me if you feel you’ve got a great reason, but the IPA isn’t tough to learn. Give it a go :)

  5. writch Says:

    I disagree. Looking at wiktionary.org, I can see a pattern to their presentation of not only IPA, but IPA in both RP and US versions, as well as SAMPA and usPR. All the data for every defined word is there.

    I would be easily able to write this widget, pulling the page for each word, digesting it to get the chosen pronunciation scheme, and re-presenting the culled data in the widget. Some other languages have pronunciation schemes, too.

    I’m guessing a three month development cycle, but nobody’s paying, so it will be a while.

  6. katie Says:

    My boyfriend is a linguistics student. I would love to write a simple “Happy Birthday” on his cake in IPA for his birthday this weekend. Where can I go to have it transcribed for me (aside from stealing his phonetics textbook)? Any website suggestions?

    Please email the reply.

  7. will Says:

    Writch,

    You’re quite right. Culling IPA from a dictionary (or Wiktionary) would get you broadly transcribed IPA versions of words. However, the problems stemming from variations in your own speech would still be present, and in the end, you’d just be putting things into a sort of “generic transcription”.

    It might pass for somebody raised in, say, Denver or San Francisco, but the moment you come across somebody with any variety of accent that’s more specific than simple American speech, your widget will be very wrong, very quickly.

    Claiming to be able to transcribe something into IPA without hearing a speaker pronounce it is a lot like claiming to be able to analyze handwriting based on a typed-and-printed copy of a letter. Sure, you can guess what the general letters might look like, but in the end, the purpose is defeated.

  8. Jay Levitt Says:

    I, on the other hand, came here trying to find a program (or, these days, probably just a web site) where I can paste in the IPA pronunciations from Wikipedia and hear them pronounced. Since every IPA character (if I understand right) has a fairly definitive sound, that *seems* like it should be easy - you’re asking a machine to do the easy part, and only the easy part, of text-to-speech.

    Yet apparently nothing exists, probably because the very people who could create such a thing are the people who read IPA fluently and don’t need it.

  9. Burbank Steve Says:

    I’m hopelessly unqualified to comment on this, but I landed up on this blog with the same Google search (“IPA language translator”). I understand now the distinction between “broad” and “narrow”, but it still strikes me as possible to present a “broad” representation of a word based on a standard pronunciation. I see the difficulty of making an accurate representation which would hold true for speakers of American English from, say, both Boston and New Orleans, but isn’t there some accepted “standard” pronunciation?

    British English has the concept of “Received Pronunciation”, sometimes called “BBC English”. It was originally meant to be the pronunciation that you would use when you were “received” by the monarch at some type of royal event.

    Isn’t there some equivalent in US English?

    Also, thinking out loud, if a Bostonian pronounces the “a” in “park”, “yard” and “Harvard” one way, and a Louisiana native pronounces it differently, wouldn’t they still pronounce a similar “a” correctly (to their ears) if they saw the symbol in an IPA representation of a word?

  10. will Says:

    Steve,

    Good questions! You could certainly come up with an accepted, broad transcription in many ways. This would be simply writing out which phonemes are generally considered to be present in a given word. This is the sort of thing that a dictionary might give you, and really amounts to a phonetic spelling of the word. Our equivalent of BBC English is often called “GA” (General American) or “newscaster English”. Although people from the (middle and south) western US are usually fairly close to this standard, everybody has little variations.

    However, this would still be a representation, not a transcription. I’m quite tempted to say that unless the speaker was actually observed to speak in a certain way, it’s not actually a transcription at all.

    To touch on your other question, there are definitely variations even within the same vowel among different speakers. People from Boston might generally hold their tongues slightly differently from a Louisianian, even when pronouncing the same IPA vowel, but such differences are nearly negligible. However, on the whole, if somebody who can read IPA sees an IPA symbol (or a word in IPA) and reads it aloud, they’re not going to have a dialect (or they’ll have the dialect of whoever was transcribed).

    Thanks for asking questions, and I hope I was able to clarify some things!

    Will
