Tuesday, June 14, 2011

The Problems with Twitter's Automatic URL Shortening

At the beginning of 2010, I decided to start writing up my thoughts on all of the first-run movies that I see in the theater. It's debatable whether those reviews are any good, but I know that at least some people read them. All of my reviews from the last year and a half are available at http://www.viewity.com/.

Last Thursday, I saw (but did not particularly enjoy) J.J. Abrams' new movie Super 8, and last night I finally got around to writing my review of it, which I posted at http://www.viewity.com/reviews/super-8.html. I use Squarespace to host the reviews, and one of the services it provides is the ability to define a shorter URL that can be used to reference the content. I took advantage of this and created the path "/Super8" instead of "/reviews/super-8.html". Squarespace also offers support for using multiple domains with the same account, and I have "vwty.us" in addition to "viewity.com". What this ultimately means is that going to "http://vwty.us/Super8" will take you to "http://www.viewity.com/reviews/super-8.html".

Whenever I post a new review, one of the ways I let people know about it is via Twitter. The whole reason that I offer the shorter version of the URL is that Twitter limits posts to a maximum of 140 characters, and at 21 characters, the short version of the URL is less than half the length of the 43-character long form. This gives me more space to say something about the movie in addition to merely providing the link, and I try to give at least a hint about whether I liked it. For Super 8, the tweet that I composed was:

Super 8 is super underwhelming. http://vwty.us/Super8

However, what actually got tweeted was:

Super 8 is super underwhelming. http://t.co/TZ43SmY

I will grant you that what Twitter actually made available on my behalf is a whopping two characters shorter. However, it is also much worse than what I had originally written, for many reasons.
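The character counts are easy to check. Here is a minimal Python sketch that uses only the three URLs quoted in this post (the variable names are just for illustration):

```python
# The three URLs quoted in this post.
long_url = "http://www.viewity.com/reviews/super-8.html"
short_url = "http://vwty.us/Super8"
tco_url = "http://t.co/TZ43SmY"

for label, url in [("long", long_url), ("short", short_url), ("t.co", tco_url)]:
    print(f"{label:5} {len(url):2} {url}")

# Characters saved by Twitter's rewrite of the already-short URL.
saved = len(short_url) - len(tco_url)
print("saved by t.co:", saved)
```

The long form is 43 characters, my short form is 21, and the t.co form is 19, so the rewrite buys two characters.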

First, it's completely unnecessary. As I mentioned before, Twitter places restrictions on the length of your tweets, but I wasn't anywhere near that. What I originally wrote was 53 characters, which means that I could have written up to 87 more characters before running out of space. I could have even used the original 43-character URL if I had wanted to and still had plenty of space left.

Second, Twitter's change dramatically obscures the URL. From the URL that I provided, you can tell that it goes to the vwty.us domain (which is a brand that I control and want to be associated with), and the "/Super8" path gives you a pretty good idea what it might be about. On the other hand, with what Twitter actually provided, you can see that it goes to the "t.co" domain (which is known to be a redirect farm so you have no idea where the content actually resides), and the path "/TZ43SmY" tells you nothing about the content. The original URL is very useful. The shortened version is not.

Another significant problem is that the new URL shortener can have a dramatic impact on the availability of your content. Twitter has such a bad reputation in this area that their "fail whale" page is a well-known Internet meme. Because a click on the shortened URL must go through Twitter's servers before sending you to the ultimate destination, any problem at Twitter can make your content unavailable. As if by fate, when I clicked on the t.co link earlier this morning, I got exactly that failure page telling me that Twitter was over capacity. Nice. Even when it works, it still requires an extra HTTP request, more data over the wire, and an unnecessary delay in getting to the actual content.
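To make the extra dependency concrete, here is a rough Python sketch that simulates the redirect chain. It never touches the network: the redirect table is a stand-in for real HTTP redirects, the function and parameter names are made up for illustration, and I am assuming that the t.co link redirects straight to the short URL I tweeted.

```python
from urllib.parse import urlsplit

# Redirect hops described in this post. Assumption: the t.co link points
# directly at the short URL that was tweeted. This table stands in for
# real HTTP redirects, so nothing here touches the network.
REDIRECTS = {
    "http://t.co/TZ43SmY": "http://vwty.us/Super8",
    "http://vwty.us/Super8": "http://www.viewity.com/reviews/super-8.html",
}


def resolve(url, down_hosts=frozenset(), max_hops=10):
    """Follow redirects from url; return (final_url, hosts_contacted).

    Raises ConnectionError as soon as a host on the path is down.
    """
    hosts = []
    for _ in range(max_hops):
        host = urlsplit(url).hostname
        if host in down_hosts:
            raise ConnectionError(f"{host} is unavailable")
        hosts.append(host)
        if url not in REDIRECTS:
            return url, hosts
        url = REDIRECTS[url]
    raise RuntimeError("too many redirects")


# The link I wrote contacts two hosts; the link Twitter posted contacts three.
print(resolve("http://vwty.us/Super8"))
print(resolve("http://t.co/TZ43SmY"))

# With t.co down, the tweeted link fails even though my own link still works.
try:
    resolve("http://t.co/TZ43SmY", down_hosts={"t.co"})
except ConnectionError as err:
    print("tweeted link fails:", err)
print(resolve("http://vwty.us/Super8", down_hosts={"t.co"}))
```

In this model, every host added to the path is one more request and one more thing that has to be up for the reader to reach the review.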

The requirement to go through Twitter's service creates even more ways that the content could become unavailable. It's likely that tweets will outlive Twitter itself. They're being archived in the Library of Congress (in addition to a number of other sites), and although future generations probably don't care how I feel about a movie, there could be long-term value in tweets and in the links they contain. If Twitter goes out of business or is otherwise shut down, then their links won't work anymore even if the content they referenced is still available. Also, it's worth pointing out that the ".co" TLD is controlled by the government of Colombia, and that government can shut down such URLs at any time. The government of Libya has done this for ".ly" domains, so it's certainly not beyond the realm of possibility.

Twitter's reason for providing this service is that it can "better protect users from malicious sites that engage in spreading malware, phishing attacks, and other harmful activity". While this sounds noble, it is also completely ineffective against everyone except the most extreme idiots. They've already stated that they won't shorten URLs that were already shortened using other services like bit.ly, so there's nothing to prevent people doing suspicious things from using one of them for their posts. Further, there's nothing to prevent me from serving up harmless content when I can see that the request is coming from Twitter's malware detection service and something else to everyone who actually follows the link, so I could still serve up bad stuff to people. On the other hand, the fact that they are trying to verify that content is safe introduces a very real possibility of false positives. My site could have completely legitimate and safe content, but if Twitter thinks that it's bad for some reason, then that may significantly reduce the likelihood that people will go there. Given the unacceptably high percentage of false positives I see from other services like this (e.g., Google Mail's spam detection frequently flags things that aren't spam), this is far from an impossibility.
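The point about serving different content to the checker takes only a few lines to illustrate. This is a toy Python sketch, not a description of how Twitter's service works: I don't know how its checker identifies itself, so the marker string below is entirely hypothetical, and the "pages" are just labels.

```python
# Hypothetical marker: I don't know how Twitter's checker actually
# identifies itself, so this string is made up for illustration.
SCANNER_MARKER = "example-link-scanner"


def choose_page(user_agent):
    """Return which page a dishonest server would send for this request."""
    if SCANNER_MARKER in user_agent.lower():
        return "clean page"      # what the checker sees and approves
    return "different page"      # what everyone else is sent


print(choose_page("Example-Link-Scanner/1.0"))
print(choose_page("Mozilla/5.0 (Linux; Android 2.3)"))
```

As long as the server can tell the checker apart from ordinary visitors by any signal at all, a scan of the destination says little about what real readers will get.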

Finally, in the ultimate act of inanity, Twitter's URL shortener can actually produce URLs that are longer than the original URL. For example, when I entered a URL of "http://t.co", Twitter "shortened" it to be "http://t.co/IzZPmi2".
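Both t.co links quoted in this post are 19 characters long. If I assume that every t.co link has that same length (a guess based only on those two examples), then a short Python sketch shows which URLs shrink and which ones grow. The function name is made up for illustration.

```python
# Assumption: every t.co link is as long as the two quoted in this post.
TCO_LENGTH = len("http://t.co/IzZPmi2")


def length_change(original_url):
    """Characters added (positive) or saved (negative) by the rewrite."""
    return TCO_LENGTH - len(original_url)


for url in ["http://t.co", "http://vwty.us/Super8",
            "http://www.viewity.com/reviews/super-8.html"]:
    print(f"{length_change(url):+3d}  {url}")
```

Under that assumption, any URL shorter than 19 characters comes out of the "shortener" longer than it went in.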

I realize that Twitter will show an expanded version of the URL in its web interface, but that doesn't work for alternate clients. For example, when I use Seesmic on my Android phone, I get the t.co version. And even if I'm using a client that automatically expands that URL, it will only work if the shortening service is available.

Great job, Twitter. This "feature" that I can't disable has made my links less available, less recognizable, and more likely to be flagged as malicious content. I don't need any more hurdles between people and the useless drivel that I write.