Notes from a Linguistic Mystic

As I've continued to think about my teaching style and academic life, I keep thinking about some of the great teachers I’ve had in the past, and I want to show gratitude and share the people (and their actions) that changed who I am, academically and as a person.

Last time, I wrote an open letter to Mr. Morrow, a Math teacher who helped shape my style as a teacher. Today, an open letter to Kim Hinchey, who was one of the first people to encourage my passion for language.

Dear Sra. Hinchey,

You were my Spanish teacher for the last few years of Middle School. You weren’t my first Spanish teacher, and you weren’t my last. But you gave me a gift, and it stuck with me.

You probably remember that I was a little nerd. The kid always asking questions. Always wanting new words. Always wanting next week’s lesson today. And always frustrated that there was more to learn before I could actually talk in Spanish. That never really changed, and that’s probably why I stayed in school forever.

Most of my language teachers didn’t handle that well. If I wanted to know how to say something different, maybe say that I might go to the park, or that I would, but I can’t, I’d go ask after class. But the answer was always “You’ll learn that next year” or “Oh, they’ll cover that in High School” or “Don’t worry about that yet.”

This was really frustrating, because damnit, I wanted to speak Spanish, not repeat dialogues. I didn’t see why they’d teach us how to say “Pablo went to the park yesterday”, but not “Pablo will go to the park tomorrow”. I didn’t understand why they’d teach me to say “You stop.”, but not “Hey, You, Stop!”. But most of all, I resented the wall. Not just “Sorry, we’re not covering that until next month, look at Chapter 5 if you want to jump ahead”, but “No, I won’t teach you that. Focus on the dialogs from the chapter.”1

20 years later, I don’t remember much of Middle School. But I have moments that are clear as day.

One of these moments was after class, right before recess, standing by a bookshelf in the classroom. We’d just covered the compound future (“Voy a comprar un coche.”, ‘I’m going to buy a car’). But I’d heard there was another way to do the future (the ‘simple future’). And I went up to ask you about it. You explained that we’d cover it next year, as all the other teachers did. But you went on. “Since you’re interested, though, I’ll make you a copy from the textbook for next year.”

A few minutes later (when I could have been at recess and you could have been eating), you handed me a piece of paper with a verb paradigm, showing the future tense forms for each person and class of verb. I may have been trying to act cool and not show it, but I was thrilled.

I had inside information! I could say something nobody else in class could. I could learn material on my own, and learn more about actually talking in Spanish. I went out on the playground, leaned against the building, and read over that sheet of paper like it had the solution to life, the universe, and everything on it. And I used that future tense, after class with you. At home. At work, a few years later. To this day, the simple future is my favorite Spanish verb tense2.

But the fact remains that I showed what a little language nerd I was, and you didn’t just dismiss it, or tell me to keep pace with the class, but instead, you encouraged me, and tossed fuel into the fire. And you kept encouraging me. Kept feeding me little bits of information based on questions. Kept filling in blanks, and letting me in on “secrets” we hadn’t learned yet. You even organized a school trip to Mexico, so I could actually try my Spanish with people who spoke Spanish every day.

I can’t blame you for my future life in language, as I probably would have ended up a linguist anyways, but you definitely got me going. You were one of few people to encourage me to push. And learn. And embrace my inner nerd. And for that, and for the Simple Future tense, I will be forever grateful.

Gracias, Sra. Hinchey

  1. This wasn’t just a rash of bad teachers in Middle School. It’s the same reason I dropped my Russian major in College. I wanted grammar and paradigms, but the instructors (and terrible textbooks) gave me dialog memorization and a “non-grammatical approach” to teaching grammar. This approach is about as effective for me as a non-swimming approach to teaching swimming.

  2. What, you don’t have a favorite Spanish verb tense?

~ ə ~

As many of you know, one of my hobbies is following advances in Cryptography. This makes sense to me, as Cryptography and Linguistics are oddly parallel (in ways that deserve their own post).

But one of the very best parts of Cryptography is how easy it is to do poorly. Given that I’m an amateur at best, I’m in an excellent position to do cryptography poorly, and thus, I’ve entered the Snake Oil Crypto Competition with my white-paper on Metalinguistically Hardened Caesar-Shift Encryption.

For those who don’t follow Crypto, it likely won’t be terribly funny (although there are several references to pigeons, who are usually at least entertaining). But hey, I had fun. And in information security, isn’t that what matters?

~ ə ~

As I continue to think about my teaching style and academic life, I keep thinking about some of the great teachers I’ve had in the past, and I want to show gratitude and share the people (and their actions) that changed who I am, academically and as a person.

What follows is an open letter to Rich Morrow, a teacher who I doubt will read it, but who influenced me considerably more than I would have ever admitted at the time.

Dear Mr. Morrow,

You taught my math classes in 4th, 5th and 6th grade at the “Challenge School” in Denver, Colorado, and you were so weird.

Your teaching was weird. Whereas most teachers were happy just writing on the board, you used elaborate overhead transparency overlays. Whereas most teachers just marked off answers on the page, you had us write all our answers on lines in the very edge of the page, then lined the 30 papers up on your desk, so you could grade each question with a single stroke of a pen, stopping the line only for incorrect answers. Whereas most teachers steamrolled ahead, you stopped. You would stop and bother people who looked confused, rather than moving on with the lesson, and seemed to actually care when somebody didn’t get it, because you clearly loved school, even though few of us did.

And you were such a strange person. You taught math, but you constantly talked about geology and nature. You peppered your class with weird anecdotes about the world, and just wouldn’t “stick to the subject”. And then, weirdest of all, you, a math teacher, arranged class trips to Moab, Utah, where you showed us Arches and Canyonlands National Parks, and parts of the backcountry that I would never have the guts to take 10 middle schoolers to, but which are now among my favorite places on Earth.

I still remember the puns. You were a seemingly endless font of really awful groaners. During class, during hikes, and during recess (if you were within earshot), you provided a constant stream of puns so bad that I was in physical pain. I know that despite my eventual dedication to the art of punning, I will never match your ability to loose a joke so bad it could stun a charging musk ox.

But the thing is, no matter how lame we thought you were, you simply didn’t care, because you genuinely loved what you did, and what you taught. You are geek, and we’d better believe we’re going to hear you roar. No shame, just math.

At the time, as a harbinger of math, I really didn’t much care for you. You taught me “useless” things while assuring us they’d be handy, and made me solve “silly” problems, and no matter how often I told you, you never remembered what X was. Not to mention that you just didn’t “fit the mold” as a teacher, with all those silly outside interests, those weird teaching techniques, and most of all, the constant stream of what I only now recognize to be humor. You were “so weird” and “so lame”, and so not what I wanted to do in my life.

Now, I’m doing statistics and mathematical modeling of speech for a living, and kind of needing all that silly math, like you promised. Like you, I am shameless in my geeking, wearing a Ph.D. with pride, and taking as a great compliment a student’s assertion that I was “the nerdiest person she’d ever met”. I maintain a website dedicated to terrible puns. I still travel to Utah whenever I can, with a love of nature (and other non-language things), even though it’s “not my field”. And as I stand up in front of college classrooms and design tests to be graded, I find myself unconsciously using the same techniques you used for teaching, building assignments, and grading, only to realize later where I learned them.

Considering you’ve got the Rich Morrow Math Challenge named after you, and you apparently later became the principal of that school, clearly, others recognize your value and skill. But I always just thought of you as a weird Math guy who really liked school (for some reason), and really loved making his students suffer through Math. An odd memory, from an odd time, and somebody I’d never really “get”.

But this morning, 17 years later, I finally put 2 and 2 together1, and realized that despite my longtime scorn and “not getting it”, you’re actually one of those handful of teachers in my past who shaped who I am as a teacher, academic, and probably a bit as a human.

So, in honor of my old math teacher, here I am, showing my gratitude by eating a big old slice of Humble Pi.

That one’s for you, Mr. Morrow.

  1. Sorry, couldn’t resist.

~ ə ~

So, remember the dissertation I was working on? That little thing that took two years, 170 pages, 50+ participants and thousands of lines of code? The crowning achievement of 12 years of higher education?

Well, a big chunk of the work I did is gone, because I made some bad decisions, and had some very bad luck. I’d like to share what I did wrong, and how to not be me.

“Huh, that’s weird”

In early June, the logic board in my MacBook Pro failed, and took the hard drive with it. I’d been having kernel panics, and a few periodic drive read errors, but I caught it early. When I brought it to the Genius Bar, the diagnostic failed, and Apple replaced everything, as it was (barely) still under warranty. It came back to me with a new SSD and logic board.

I restored my data to the newly wiped computer from a two-day old backup, and I also took this as an opportunity to clean up a bit. I got rid of some programs I wasn’t really using anymore, threw out some files and bad music, and eventually, felt pretty good about my computing life. My computer was lean, fast, with brand new parts, and I thought I’d recovered from a dead hard drive with no issues. But I never opened the dissertation folder.

Two weeks ago, a colleague asked me for a script I used to create some of the stimuli for my dissertation. Easy, I said. I’ve got that in my “dissertation” folder. I opened the folder, knowing just where it would be, but it contained nothing but a corrupted PDF with comments from my committee. Whether it was lost to the data corruption, lost in a bad restore, or just lost, it was gone. Everything else was gone.

“OK, this is why I have backups.”

I’ve had a number of hard drive failures over my life, so, when it comes to data, I’ve had a hardcore backup scheme. At any given moment, I have:

  • Three small portable backup drives using Apple’s “Time Machine”, which I swap out periodically
  • A USB hard drive playing “Time Capsule”, attached to my wireless router and automatically backing up using Time Machine every few minutes
  • Two “cold storage” time machine drives, one at home and one off site, which I only update every once in a while
  • An offsite internet backup service (Crashplan), keeping copies of deleted files as well as the past versions.

Theoretically speaking, in order to lose all of my data, I would have to experience 6 hard drive failures and lose access to the cloud.

Or, I’d just have to f*** up really badly.

How I f***ed up really badly, Part 1

I didn’t know when the data had disappeared, but it was gone, and I needed to get it back.

Over the next few hours, I went through every one of the backups above, and found that amazingly, each one had failed because of two really poor choices, and one bad stroke of luck.

Really poor choice #1: I “refreshed” most of my backups when I got my computer back

After the clean install, I was feeling cocky. My computer was clean, decluttered, and running great, and everything looked fine. So, given that my backup drives were already starting to get full with all that old data (“Who needs old data!?”), and I needed to repartition them anyways, I decided to wipe and re-start every single backup drive except my offsite “cold storage” drive. I was confident enough that between Crashplan and the offsite storage, I’d be fine even if there was some missing data, even if there was a problem, and “starting fresh” would be a great idea.

This meant that my oldest backup on any of these drives was June 16th. The day after my “Clean” install. So, on every single drive, instead of 2+ years of backup data, the oldest one had the same corrupted folder as my hard drive.

This choice alone brought my data down from 7 backups, to just two. But that’s fine, two is enough. Unless I f***ed up really badly.

How I f***ed up really badly, Part 2

I’ve used Crashplan for a while now, and liked it a lot. There are reasonable privacy controls, it’s fast, easy, and reliable, and it even saves deleted files for a period you specify. It’s also much more reliable and faster than SpiderOak, my previous solution.

So, once I realized my backups didn’t have my back, I logged in to the Crashplan interface, hoping to restore my files that way. But they weren’t there, either. For that matter, my entire year of deleted file and revision history was gone too. I couldn’t figure out why, until I realized that:

Really poor choice #2: I didn’t understand the nuances of how Crashplan worked

During that restore process, I changed my username on my Mac, to fix a long-standing error. This shouldn’t play a role, except for one minor detail: Crashplan doesn’t save deletion history for folders that are no longer being backed up, and the username of the home folder matters.

When I set Crashplan up again on the newly wiped machine, I selected my new home folder. It matched all the files to the old folder, and since the data had already been uploaded, it was just a matter of minutes before my backup was up to date, and my old home folder was “gone” to the system.

That evening, at 1am, Crashplan’s automated cleanup robots decided that since I no longer cared about the old username’s home folder (which no longer exists), it could delete all of the deleted file history for that old folder, and focus on the new username’s folder, which had no file history at all.

Just like that, at the whim of a bot doing its job properly, my deleted file history disappeared, leaving only the same corrupted folder that I had everywhere else.

At this point, the data existed in just one place: my “offsite” cold storage drive. But I still had a copy, so I’d be fine.

Unless I was really unlucky.

How I was really unlucky

Know the saying “Two is one, one is none”?

Stroke of bad luck #1: One was none.

When I plugged in my offsite drive, I wound up with a “Click-Click-Click” of death, and although my machine could see the drive, it couldn’t decrypt the backup data, no matter what I tried. Whether it was the heat in storage or just my luck running out after 4 years of using the drive, my “just in case” drive was dead, and my data with it.

Learn from me, damnit

Even though I did a lot of things right (by having many backups in a few different forms), I made a few bad choices, and it burned me. In the name of helping my readers avoid these errors, I have a few suggestions, many of which are obvious, but still escaped me:

1) Phase out old backups over time, not all at once

This whole issue would have been avoided had I just kept more old backups. My desire to “clean up” and “start fresh” here burned me bad. What I should have done, if I wanted a clean slate, was to wipe one drive at a time, every six months or so. That way, I’d have had at least one set of historical backups, even as I cleaned things out and repartitioned.
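
As a rough illustration of that staggering, here is a minimal Python sketch. The drive names, the dates, and the 182-day interval are all made-up assumptions for the example, not a record of the real setup; the only point is that no two drives ever share a wipe date, so at least one drive always carries older history.

```python
from datetime import date, timedelta


def staggered_wipe_schedule(drives, start, interval_days=182):
    """Give each drive its own wipe date, spaced interval_days apart.

    Because every drive gets a different date, the drives are never all
    wiped at once, and at least one keeps its older history.
    """
    return {
        drive: start + timedelta(days=index * interval_days)
        for index, drive in enumerate(drives)
    }


def oldest_history_days(last_wiped, today):
    """Days of history on the drive that has gone longest without a wipe."""
    return max((today - wiped).days for wiped in last_wiped.values())


if __name__ == "__main__":
    # Hypothetical drive labels and an arbitrary start date.
    drives = ["portable-1", "portable-2", "portable-3"]
    for drive, wipe_date in staggered_wipe_schedule(drives, date(2020, 1, 1)).items():
        print(drive, "next wipe:", wipe_date.isoformat())

    # Wiping everything on the same day leaves only days of history;
    # leaving even one drive alone keeps months of it.
    today = date(2020, 6, 30)
    all_at_once = {"portable-1": date(2020, 6, 16), "portable-2": date(2020, 6, 16)}
    staggered = {"portable-1": date(2020, 6, 16), "portable-2": date(2019, 12, 16)}
    print("all at once:", oldest_history_days(all_at_once, today), "days of history")
    print("staggered:  ", oldest_history_days(staggered, today), "days of history")
```

The second helper is the number worth watching: if it ever drops to a couple of weeks, every backup you own is about as old as your most recent mistake.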

2) Know the Details of your Backup Service

After reading the documentation, I realized that Crashplan had worked exactly as it was supposed to here. I removed a folder from the scope of the backup, and it removed all old versions of that folder. This is the right behavior for privacy, for organization, and for minimizing space used. But because I didn’t understand how it worked with username changes, I thought I had old versions that I didn’t, and made bad decisions because of it.
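
The behavior described above can be mimicked with a toy model in Python. To be clear, this is not Crashplan's code or API: the class, its method names, the folder paths, and the file name are all hypothetical, and the only assumption borrowed from the story is that history is kept per backed-up folder and dropped once that folder leaves the backup's scope.

```python
class ToyBackupService:
    """Toy model only: version history is stored per selected root folder."""

    def __init__(self):
        self.selected_roots = set()
        self.history = {}  # root folder -> list of (relative_path, version_label)

    def select(self, root):
        """Start backing up a root folder."""
        self.selected_roots.add(root)
        self.history.setdefault(root, [])

    def deselect(self, root):
        """Stop backing up a root folder; its history survives until cleanup."""
        self.selected_roots.discard(root)

    def record(self, root, relative_path, version_label):
        """Store one version of a file under a root that is currently selected."""
        if root not in self.selected_roots:
            raise ValueError(f"{root} is not selected for backup")
        self.history[root].append((relative_path, version_label))

    def nightly_cleanup(self):
        """Drop all history for roots that are no longer selected."""
        for root in list(self.history):
            if root not in self.selected_roots:
                del self.history[root]

    def versions(self, relative_path):
        """List every stored version label for a file, across all roots."""
        return [
            label
            for entries in self.history.values()
            for path, label in entries
            if path == relative_path
        ]


if __name__ == "__main__":
    service = ToyBackupService()
    service.select("/Users/oldname")
    service.record("/Users/oldname", "dissertation/stimuli.py", "good copy")

    # The username change: same files, but a different root folder.
    service.deselect("/Users/oldname")
    service.select("/Users/newname")
    service.record("/Users/newname", "dissertation/stimuli.py", "corrupted copy")

    print("before cleanup:", service.versions("dissertation/stimuli.py"))
    service.nightly_cleanup()
    print("after cleanup: ", service.versions("dissertation/stimuli.py"))
```

Nothing in the model misbehaves. The good copy vanishes simply because the history was keyed to a folder that stopped being part of the backup.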

3) Keep a couple of “cold” backups

It’s a very good idea to have data someplace that you simply don’t touch very often. Sure, the data will be a bit out of date, but I would pay good money for a copy of my dissertation files circa November. The purpose of this is not to recover gracefully from a recent failure, but to save your bacon in case “the big one” hits. Whether these are DVDs, a hard drive left with a family member, or even an old computer left unwiped in your closet, it’s important to have a copy of your data that’s safe, offline, and immune to viruses, data corruption, and bad decisions. Had I not had a hard drive failure, I’d have been just fine thanks to my offsite backup.

4) Don’t trust your “perfect system”

All of this would have been avoided had I, shortly after finishing the dissertation, just burned everything to a DVD for archiving. That way nothing could have wiped it out short of a house-fire. I even thought about doing this, but I had enough confidence in my redundant backup system that I didn’t think I needed to bother digging out the DVDs.

Stupid, stupid, stupid.

Redundancy doesn’t prevent stupidity

Although a lot was, all is not lost. I’d stored the sound file data in a different folder, and by searching lab computers, Google Drive backups, asking my advisor and colleagues for scripts I’d shared, and a few very lucky “emailed to myself” or “copied to my website” moments, over the following weeks, I was able to find copies of the text itself, and all the data I will need to reproduce my findings for publication, albeit with a fair amount of duplicated work. A few other folders were affected, but none of them were as important. I can’t say I dodged the bullet, but I survived it.

Nevertheless, remember that no matter how redundant, well-formed, or multi-tiered your backup plan is, it can’t save you from yourself. My biggest problem here is that I didn’t fully understand the mechanisms I had in place, and I made a stupid decision using this bad information, and it cost me.

Don’t repeat my mistakes.

~ ə ~

Today, something really unusual happened: Siri amazed me.

As I walked across campus this morning, I wanted to listen to one of my favorite recent albums, “Ashes” by the Bedsit Infamy. So, like a douchebag from the future, I raised my arm and spoke to my Apple Watch, whose “virtual assistant” is named “Siri”. I said:

“Hey Siri, play songs by the Bedsit Infamy”

A Recipe for Failure

Speech Recognition, as I’ve discussed before, relies heavily on guesswork, particularly when there are homophones (words which sound identical to other words) or when there’s missing information (maybe due to traffic noise overlapping speech or misarticulation).

Both of the words in this particular band’s name are, well, weird. “Bedsit” is a British term for a studio apartment, and “infamy”, although well known (infamous, even), just isn’t used very often.

I love the saying “When you hear hoofbeats, think horses, not zebras”, and it applies here: when you hear something that sounds like “bedsit infamy”, it’s deeply unlikely that those two words are what’s being said. So, I figured that Siri would “mis-hear” those words as something more common and, well, reasonable. Sure enough, she did:

"Hey Siri play songs by the bed sitting for me"

But, moments later, to my absolute amazement, my phone started playing the first song from the album:

"Now playing: In My Youth by the Bedsit Infamy"

Bridging the Gap between perception and the “real world”

This means that Apple (or Nuance, or whoever’s providing Siri’s logic) has added a logical step that I’ve never seen before in a consumer-facing system, but which has long been present in humans.

Imagine that you’re sitting across the table from a friend, and she says something that you hear as “Hand me that gas”. Unless you’re sitting next to a tank of compressed air1 or something similarly improbable, there’s really no way to complete the request as heard. This is where most natural language processing in speech recognition stops: “I tried to do exactly what I heard you ask me to do, but I can’t. Sorry!”

However, with a little bit more logic, we can bridge the gap between our mis-perception and the world around us. We might realize that on the table, there’s a glass, which sounds a lot like “gas” and is something that I could hand to her. So, without stopping to ask questions, we just hand over the glass, and interaction continues without problems.

So, it appears that, much like humans, when a voice command doesn’t “make sense” (because I don’t own music by “The Bed Sitting For Me”), Siri will now test other phonetically similar commands, to see if any of them make sense. If a similar command (“Play songs by The Bedsit Infamy”) actually can be completed, it’s programmed to do that, instead! But, if there’s nothing even close to what you ask for in your music library, it still gives up:
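
A crude way to sketch this fallback in Python is shown below. This is only a guess at the general idea, not how Siri works: it compares spellings with the standard library's `difflib` rather than comparing sounds, and the function names, the three-artist library, and the 0.6 threshold are all assumptions made up for the example.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def similarity(a: str, b: str) -> float:
    """Score two strings from 0 to 1, ignoring case and spaces.

    Spelling similarity is a rough stand-in for phonetic similarity here.
    """
    def squash(text: str) -> str:
        return "".join(text.lower().split())

    return SequenceMatcher(None, squash(a), squash(b)).ratio()


def resolve_artist(heard: str, library: List[str], threshold: float = 0.6) -> Optional[str]:
    """Return the library artist that best matches what was heard, or None.

    An exact match wins outright. Otherwise the closest entry is accepted
    only if it clears the threshold; if nothing is close, give up.
    """
    if heard in library:
        return heard
    best = max(library, key=lambda name: similarity(heard, name), default=None)
    if best is not None and similarity(heard, best) >= threshold:
        return best
    return None


if __name__ == "__main__":
    # A made-up music library for illustration.
    library = ["The Bedsit Infamy", "Miles Davis", "Nina Simone"]
    for heard in ["The Bed Sitting For Me", "Walrus Taco Logic Board"]:
        match = resolve_artist(heard, library)
        if match is None:
            print(f"I don't see {heard} in your music.")
        else:
            print(f"Now playing: songs by {match}")
```

The threshold is the interesting design choice: set it too low and every mumble plays something, set it too high and you are back to a system that only does exactly what it thinks it heard.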

"Hey Siri, Play Songs by Walrus Taco Logic Board" "I don't see Walrus Taco Logic Board in your music."

Speech Recognition is still really hard

This (small) victory illustrates just how hard making a good speech recognition interface actually is. Even once they’ve factored out all the environmental noise and figured out the sounds being made (which is no small feat), they’ve still got to match the resulting commands to actual concepts and entities in the user’s life, some of which are going to be really unlikely and hard to predict.

As much as I love mocking and doing terrible things to Speech recognition, even the error-prone systems we have today are amazing. And every time a bit more logic is added to the process, they’ll get better and better, and eventually, we might even believe Siri’s actually looking out for us.

"Me: Hey Siri open Notes from a Linguistic Mystic - Siri: I'm sorry, but I have standards. Opening Language Log now."

  1. I could have made a gas joke here, but all the good ones Argon.

~ ə ~