So, I’ve been noticing a strong uptick lately in the use of “the cloud” to refer to online, decentralized storage, computing and program-hosting. No shortage of companies are talking about their “cloud computing” services (including my hosting company, Joyent), and it’s become one of those “gotta have it” corporate buzzwords. It seems like no company’s marketing people will let them release a website, product or service which isn’t in some way cloudy.

This phenomenon itself isn’t noteworthy from a linguistic standpoint (“Web 2.0” seems to have been the same sort of trendy buzzword at some point), but it occurred to me today that for many less tech-savvy users, this “in the cloud” phrasing might actually be affecting how they view these services, and I think that might be why companies have latched onto this term so strongly.

Let’s take, for example, Apple’s coming “iCloud” information hosting service. Apple is increasingly targeting the non-tech-savvy crowd, and this service, like most of their recent developments, is meant to be largely transparent to the end user. Once you’ve signed up, iCloud will take your music, your photos, your documents, your books, your backups, your contacts, calendars and mail, and any additional information you add in through third party programs, and make it instantly available on all of your devices. As they put it on their own website: “Create a document, iCloud stores it, and pushes it to your devices”. Bam. Magic. You turn the service on and suddenly your data is on all of your devices. Who wouldn’t want that?

A rose by any other name…

They’re doing something linguistically fascinating, though: they make no mention of their machines, servers, databases or storage (at least on the user-facing sites). You create, something cloudy happens, it’s on all your machines. They’ve de-emphasized the middle step. Mind you, Apple’s not the only “cloud” provider to do this (Google Docs de-emphasizes the middle step too), but Apple is certainly the most flagrant. But why bother? Why de-emphasize?

Well, I’ve been toying around with a new hobby. Whenever somebody says “in the cloud”, I’ve found it entertaining to replace it with “on somebody else’s computer”. This simple replacement brings me much joy in the absurdity it creates and how oddly different it makes the act sound:

“Our main working copy of the paper is on somebody else’s computer for group editing, but it’s password protected so nobody but us can edit it”

“My data is safe, I store my address book, mail, passwords, documents and photos on somebody else’s computer.”

“Oh, don’t worry, all of our business information is backed up on somebody else’s computer.”
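For the programmers in the audience, the replacement game is easy to automate. What follows is only a minimal sketch in Python: the `decloud` function name and the pattern are made up for illustration, and it catches nothing beyond the literal phrases “in the cloud” and “on the cloud”.

```python
import re

# Hypothetical pattern for this sketch: matches "in the cloud" or "on the cloud".
CLOUD_PHRASE = re.compile(r"\b(?:in|on) the cloud\b", re.IGNORECASE)


def decloud(sentence: str) -> str:
    """Swap the buzzword for the blunter phrase."""
    return CLOUD_PHRASE.sub("on somebody else's computer", sentence)


print(decloud("All of our business information is backed up in the cloud."))
# prints: All of our business information is backed up on somebody else's computer.
```

Run it over a marketing page or two and the storage step stops being quite so invisible.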

When put like that, we’re emphasizing the storage, the step that Apple and Google and most of the other cloud providers don’t really want you to think about too much. We’re emphasizing the fact that your data is sitting on a hard drive in another state, watched by a sysadmin who you don’t know. We’re emphasizing that when you put something on the cloud, it’s no longer just yours, and whereas naive users might not hesitate to put something into an amorphous cloud, actually transferring their data onto another computer might tickle enough of their sense of privacy to make them hesitate to upload those bank statements or that racy note from a lover.

In addition, we emphasize the fact that the data is there for the cloud provider to use per the TOS. How much do you think that the recording industry would pay to analyze en masse the music library of hundreds of thousands of iGadget users, even if just for market research? How valuable would it be for a website to figure out where to advertise by asking a company storing passwords “in the cloud” which sites are also visited by people who have stored passwords for their site?

Simply put, putting your data “in the cloud” is amorphous. It’s a mystery, but at the end of it, it just works. Putting your data on somebody else’s computer can achieve the same ends, but it forces you to think about where your data sits in between your machine and your other devices.

Clouds aren’t necessarily bad

This may sound like a paranoid luddite’s rant, but I use the cloud. I currently use MobileMe, Apple’s current iCloud equivalent, for calendar and address book syncing. I use Dropbox to keep my grocery list current across all my devices. I have an SFTP provider for storing backups of my data between at-home backups, and in case of emergency. The cloud can provide, in addition to convenience, a type of security against loss. As a friend of mine pointed out on Google+ (a cloud app):

Somebody else’s computer, with extensive redundancy and backup systems, which makes it much less likely to be lost if my house burns down. It is one kind of security. Not the “no one else will look at it” kind, but the “I won’t lose it in a domestic disaster” kind.

This is certainly true, and one of the best arguments for decentralized, cloud-like computing. Data on my computer in my backpack is fleeting. Data on a well-backed-up server in Dropbox’s massive datacenter is much less likely to be dropped, stolen, lit on fire or broken. These services have a use, whether convenience, ease-of-use for non-tech users, decentralization, or simply as an offsite backup of your data.

The techies who have read this far are doubtless thinking “Come on, I knew this already”. Of course data stored in the cloud is stored on somebody else’s computers. Heck, geeks like myself can likely picture server farms, maybe even imagining the mass storage required. We have a good idea of what sorts of things cloud providers can and can’t do across petabytes of data.

It’s not like I’m blowing the whistle on a massive conspiracy here. Anybody who has thought more than 20 minutes about the idea of a cloud knows that information has to go somewhere, and has deduced that presumably, it’s sitting on somebody else’s computer. Apple’s not choosing to skirt the issue so they can “pull a fast one” on the entire internet; they’re doing it because it’s less intimidating to new users. Google Docs is neglecting to mention their servers because they don’t need to. That’s not why you should be using the phrase “on somebody else’s computer”.

We should be talking about uploading your documents onto somebody else’s computer with grandma when she gets her new laptop and decides that that “iCloud” folder is just like her hard drive. We should be discussing storing information on somebody else’s computer for the clueless CFO who wants to upload the company’s records onto Dropbox to be able to work on them from his new iPad.

We should be talking about “the cloud” as storing information on somebody else’s computer so that people will think, if only for a second, about whether they care that that picture, document, or file is something they would be OK with storing on somebody else’s computer.

Because TOSes, “privacy policies”, talking around the issue and other calming language aside, that’s what the cloud is. It’s a vast collection of other people’s computers, and in order to decide intelligently whether you want your data there, you need to know where “there” is.

Tagged with Computers and Software, Corporate Language, Language and Thought, Language Usage, Language, Computers, and the Internet, Words, Phrases, and Idioms | 6 Comments


It seems to me that most blogs fall on a continuum in terms of their content.

The grand blog continuum

On one end, we have the most personal of blogs. Composed of random thoughts, stories, goings-on, and pictures, these blogs are primarily designed as a means of social communication with one’s friends and family. You can usually spot these because reading them is boring (if not downright painful) if you’re not intimately acquainted with the author. Perhaps the epitome (best example) of these sorts of blogs is the kind kept by many random people on LiveJournal or MySpace.

On the complete opposite end, we have blogs that are so heavily focused on providing useful content to the world that the authors themselves are largely overlooked. Never will you find a post dedicated simply to the wonderful day that the author had, and seldom will you even find a reference to the author’s personal life. Sometimes, these are even run by several authors collaboratively, and unless you look at the name of the poster, you often can’t even tell who’s writing them. Examples of blogs like this would be Lifehacker, Treehugger, and MacRumors.

In terms of readership and popularity, the most successful blogs seem to be the ones putting content before personal information, because they appeal to the widest audience. If you think about it, some of the better-known blogs on the internet tend to be the more pragmatic and content-based blogs which have a very distinct theme and focus. After a while, these sorts of blogs start to build a library of sorts, with lots of content that somebody who has never heard of the author might still be interested in (and find, via Google).

That’s not to say that there aren’t popular blogs where the author’s voice is both present and strong. One good example of this is DaringFireball, which has a great deal of content, but is also quite clearly John Gruber’s personal blog. He’s found a good balance between Gruber-trivia and widely relevant information, and his success shows that. What Would Tyler Durden Do? (not work safe) has a different approach to this balance. Although the content is mostly just gossip about celebrities, the author of the site has a strong and distinctive voice in the posts, and his commentaries on the stories are often downright hilarious. Here, the author is clearly present in the content, but nonetheless, the blog isn’t about him.

So, there’s a grand continuum in the blog world, ranging from the most personal LiveJournal to the most informative megablog, and everybody fits in somewhere.

Where am I?

The reason I’ve gotten to thinking about all this is that recently, I’ve been asked to participate in a blog-meme that involves sharing information about oneself. Basically, participating bloggers are asked to list eight random facts about themselves, and then to pass the meme on to eight more people, much like the chain emails of old. What’s surprising to me, and the reason for this post, is that I was conflicted as to whether or not to participate.

Obviously, participating in this meme would be very much out of character for a blog like Lifehacker or Gizmodo. It’s a clearly author-centric exercise, and for a site where the author is de-emphasized, it would be awkward at best. However, for a LiveJournal sort of blog, this sort of thing is the lifeblood.

That led me to wonder where, exactly, this site falls on the grand continuum. Although there are clearly posts which concentrate on me as a person, I try to make the majority of my posts very content-centered, even though they may include my voice and opinions. My primary means of getting the word out about this site is through links from other people and from Google, and I do my best to make the posts here relevant to people who don’t even know what linguistics is, let alone who I am.

Finally, I do have the rather obsessive desire to incorporate some discussion of language and linguistics into all of my posts, even the most mundane of site news. This obsession, and the awkwardness of posting simply personal information, makes me think that when all is added up, Notes from a Linguistic Mystic tends to lean more towards the content-centered side of the blogosphere.

Passing on the meme

So, I’ve decided that to just fill in eight random facts would be a bit contrary to the site’s nature. However, I’ve come up with a compromise. Here are my eight facts:

1. The pitch of my voice is usually between 90 Hz and 120 Hz, although it got a bit lower (~70 Hz) with laryngitis. When the vocal folds are inflamed (the main effect of laryngitis), they vibrate more slowly, and thus, people’s voices sound lower.

2. When I was young and first learning to read, I pronounced the L’s in “walk” and “talk” for a time, even in everyday speech. This is called a “spelling pronunciation”, and such pronunciations are not uncommon. Many people will pronounce “caulk” differently from “cock” for this precise reason.

3. For me, the vowels in “caught” and “cot” are pronounced identically. This is the case for many speakers in the US. For more information, visit the Wikipedia page on this merger.

4. I can hear the difference between aspirated, unaspirated and voiced stops, but I have trouble reliably making unaspirated stops.

5. After a fair amount of practice, I can make and hear ejective stops.

6. Violating a number of sociolinguistic and cultural rules, I referred to my parents only by their first names until first or second grade. The school psychologist had to explain to me that generally, “Mom” and “Dad” are more acceptable in our society, and that it made them sad when I called them by any other name.

7. Because I’ve suffered from a number of ear infections in the past and had a somewhat mysterious hearing impairment through high school and part of college, I currently have a tympanostomy tube (ear tube) in my right eardrum. Thus, when I’m on planes or driving in the mountains, my right ear doesn’t pop at all. Strangely enough, this surgery actually improved my hearing significantly, and helped me to distinguish sounds that I previously couldn’t.

8. The name “Linguistic Mystic” arose while I was working on a project regarding the Sapir-Whorf hypothesis. I was debating the idea with a friend in my group who was dead set against the idea that language affects thought. Frustrated that neither of us was changing the other’s mind, he said something along the lines of “You know what you are? You’re a damned Linguistic Mystic, trying to make language into some secret, mysterious force affecting our world.” I loved the expression then, slowly adopted it, and finally ended up making it the title of this site.

Naming the victims

So, there are my eight facts, modified to include a heavy dose of content and linguistic goodness. According to the Meme, I need to now post the rules and nominate a few other blogs.

These are the rules:

1. We have to post these rules before we give you the facts.
2. Players start with eight random facts/habits about themselves.
3. People who are tagged need to write in their own blog about their eight things and include these rules in the post.
4. At the end of your post, you need to choose eight people to get tagged and list their names.
5. Don’t forget to leave them a comment telling them they’re tagged, and to read your blog.

Here are the blogs I’ve chosen (I couldn’t find eight), in no particular order:

1. Mother Tongue Annoyances
2. Language Fragments
3. LinguLangu
4. Confessions of a Language Addict
5. Aspiring Polyglot (PS: Congrats on the Bloggers Choice nomination)

So, if you’re interested in participating, fellow bloggers, you’re welcome to. Feel free to put your own spin on things as I’ve done, or feel free to ignore this altogether.

Conclusion

Much like humans grow to have a certain preferred communication style in a given context, it seems that blogs tend to settle out into different styles. Just as it would seem unusual for a normally serious professor to come into class and start discussing a party he attended over the weekend, bloggers seem to have a good idea of what’s “proper” given their particular style, and seldom violate it.

(Unless, of course, a really good chain-letter goes around. Then, we get flexible.)

Tagged with Language, Computers, and the Internet, Notes, Site News, Tirades | 7 Comments


Periodically, one goes through periods of deep metaphysical malaise. You look around at the world, wondering how such evil could flourish and such suffering could endure. You descend deeper into darkness, your faith in humanity waning, wondering why we were ever born into this cruel world. Then, suddenly, you realize that somebody has written a programming language based off of the dialect of Lolcats/Cat Macros, and your faith in humanity’s inherent good is completely restored.

LOLCode is a computer programming language concept which draws its vocabulary from the recent internet sensation of captioned cat pictures. Although not fully functional yet, it’s still linguistically fascinating on many different levels, and deserves mention.

i has dialect

One of the most interesting parts of this programming language is that it can exist at all, and the fact that it can goes a long way towards establishing the legitimacy of a feline dialect.

Imagine that I wanted to create a programming language based solely off of Star Wars vocabulary. I would likely start by finding a donor language, whose basic syntax and ideas I would borrow. Then, I would begin to slowly find equivalents and their translations.

Some equivalent/translation pairs might be obvious. ‘Death Star’ for a verb which meant “remove file”, maybe ‘carbonite’ for “pause process”. One could even get a bit more ornate and incorporate some movie quotes. Perhaps “there is an error” could be coded with ‘It’s a Trap!’, and “load this program” could be ‘Commence Primary Ignition’.

However, no matter how nerdy I felt at the time, my plan would be fatally flawed from the outset. Sooner or later, I would find an expression that was too niche (fulfilling just a small purpose) to have a Star Wars equivalent. I’d have to rely on a set canon of phrases to fill in the blanks, and there’s no way to work around it and still maintain the Star Wars theme.
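To make the problem concrete, the equivalent/translation pairs above amount to a lookup table from donor-language operations to themed keywords. Here is a rough sketch in Python, assuming a made-up donor language: the operation names (`remove_file` and so on) and the `themed_keyword` function are hypothetical, and only the four phrases come from the examples above.

```python
# Hypothetical table: donor-language operation -> Star Wars keyword.
# The operation names on the left are invented for this sketch.
THEMED_KEYWORDS = {
    "remove_file": "Death Star",
    "pause_process": "carbonite",
    "raise_error": "It's a Trap!",
    "load_program": "Commence Primary Ignition",
}


def themed_keyword(operation: str) -> str:
    """Return the themed keyword for a donor-language operation."""
    if operation not in THEMED_KEYWORDS:
        # The fatal flaw: the canon has run out of phrases.
        raise LookupError(f"no Star Wars phrase for {operation!r}")
    return THEMED_KEYWORDS[operation]


print(themed_keyword("remove_file"))  # prints: Death Star
```

Asking for anything outside the table, say a hypothetical `open_socket`, raises the `LookupError`. With a fixed canon, the only fix is to invent a phrase that was never in the films.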

The reason that LOLCode is so awesome is that, based on what I’ve seen so far, it doesn’t seem to have that limit. Based on my highly scientific research at icanhascheezburger.com, it would appear that LOLCat has become a full-fledged dialect. There are many captioned images there, each slightly different, and each seems to fit a coherent grammatical pattern. Some linguists are starting to pick up on distinct patterns and grammatical rules, and based on the fact that any sentence can now be LOLCatted, I’m quite tempted to say that LOLCat has become a productive and functional dialect of English.

Because of this productivity of the LOLCat dialect, it would be quite possible for somebody to take any given sentence or idea and put it into LOLCat, thus ensuring that LOLCode could, in theory, become fully functional without ever breaking character. This is very exciting, and very awesome.

mai translationz r not straitforwerd

LOLCode is a very special sort of translation. Conventionally, when one sits down to label a cat, the source is an English sentence (I’ve yet to find any cats “en mi refrigeradora, comiendo mis comidaz”). However, here, what people are doing is finding equivalents in human/feline language for concepts, verbs, and ideas within a computer language.

Rather than being able to simply translate, they’re forced to create the inflexible, ambiguity-free grammar required to tell a computer what to do. This is tough enough to do even using all sorts of abstract symbols, but to do it within LOLCat dialect and syntax is wonderfully difficult. They’re adapting a human language into a dialect, then bending it into a computer language. This is by no means an easy task, and it’s a far more complex sort of translation than many.

For this alone, I salute the creator and contributors to LOLCode. Although it may seem silly to some, this is really some top-of-the-line linguistic work.

d00d. ur dialect is teh suxx0rs

Perhaps even more interesting than the mere fact that LOLCat has become a translatable dialect is the fact that, well, there are already people who are arguing about the “correct” way to say something in LOLCat. Take, for instance, this post on the LOLCode wiki:

I know VISIBLE is the current output command, but it’s so not LOLCAT. What if we used LOL as the output instead? So, the Count-1 example becomes:

(Code)

I think this works very well, is funny to read and matches actual LOLCAT protocol, sorta. I guess the LOL would be at the end normally.

As a linguist, this is really, really exciting. People are already trying to step in and enforce the “rules” of the LOLCat dialect. It seems like, as a “native speaker” of LOLCat, the author of this page had a distinct intuition about the “proper” means of expressing a concept in this dialect. Truly incredible.

Although this community of people has only arisen recently, I’m very excited at the potential for the later discussions of “proper” LOLCat, and the sociolinguistic goodness sure to arise from it.

o hai. i discussed ur werk.

So, author of (and contributors to) LOLCode: I salute you. This is a unique, wonderful, and groundbreaking project, and I really hope that it continues to yield such fascinating linguistic insight into the future.

Keep up the good work, and don’t let anybody convince you that what you’re building is silly or unnecessary. If there are two things that the world of technology needs, it’s probably humor and cute, fuzzy animals, and really, I can’t think of a better way to combine the two.

Alright, I’m done. kthxbye

Tagged with Computational Linguistics, Conventional Linguistics, Dialects and Idiolects, Language Humor, Language Usage, Language, Computers, and the Internet, Sociolinguistics, Translation and Translation Theory | 32 Comments

