Reflections: UnUrlShield – Fighting CAPTCHAs

Update: A challenger appears. My security researcher friend Fox has challenged me to a duel. See her blog post for the details.

This is part of the Reflection series in which I go through my old projects and tell their story.

CAPTCHAs have become an integral part of the web in the last few years. Almost everyone on the web has encountered those twisted pictures, probably when signing up to an email service. They come in various shapes, sizes, colors and cats. When they first became popular, there was an explosion of different types of schemes that services used (who can forget Rapidshare’s cat captcha?).
Now as with every security measure there is a compromise between usability and protection. Some of the easier CAPTCHAs were broken using only OCR software, while some of the latest reCAPTCHA images are hard even for a human to solve (interesting but out dated chart).

Various captchas from wikimedia.

One such service was UrlShield. You would give UrlShield a URL you want to protect from bots and it created a page with a CAPTCHA that when solved correctly redirected you to your original URL. Simple enough. I can certainly see a use for such a service, for example if you want to give out promotional coupons and don’t want bots to snatch all of them. The service became popular in some file sharing sites for the same exact reason.
The particular image this site was generating had a checkerboard background with 4 characters all in different colors, sometimes overlapping. It was pretty easy for a human to parse it.
Example of UrlShield generated images and their OCR.
It even works pretty good against OCR. I used Microsoft OneNote OCR feature which uses the commercial OmniPage software to create the second column.

So far so good? Well, no. This scheme is flawed because it is easy to transform the image – remove the background and segment it (split it to region that each contains a single character), allowing OCR tools to easily get the letter. To remove the background you just clear all the black pixels out of the image. To segment it all you need to do is choose one color and mask all the others, which means you’ll end up with a single letter, as each letter is in a different color. This is what you end up with:
Parsed captcha images
OneNote has no problem parsing each of these to a letter.

The process described above is exactly what UnUrlShield does. It’s a simple Python script that use the Python Imaging Library to read the image. Then it counts all the colors that appear more than a certain threshold (MIN_PIXEL_PER_LETTER_COUNT) and saves each color’s pixel location. Lastly it goes through the colors, creating an image with only that color’s pixel locations.

Is there a lesson here about CAPTCHAs? I think so. UrlShield is now some kind of ad/malware site. Even complicated CAPTCHAs can be broken, or even better – be defeated by side-channel attacks like having an army of low-cost workers break them on-demand (The comments of this article are a treasure trove of irony) and sometimes people are even fooled into breaking CAPTCHAs. This is why it amazes me they are still around, annoying normal regular people while also being broken by even slightly motivated attackers.
Are there no solutions to spam? Of course there are! In fact gmail does a great job at stopping 100% of my spam using things like blacklisting known spammers, Bayesian filtering, “crowd-sourcing” protection (the “mark as spam” button) and other tools that don’t rely on CAPTCHAs.

Do you have good examples of silly, easily broken or bizzare CAPTCHAs? Did you find an easy way around some services blocked by CAPTCHAs? Leave a comment below and tell me about it!

Practical Programming Intro

Today’s blog post will introduce another series in this blog – “Practical Programming”. This series will be about developing software in the real word and about subjects that you won’t usually see in computer science curriculums.

What do I know?

Well to be honest – not much. I started a degree in high school but basically flunked out by the third semester, and never continued it. I’ve had a few professional courses in programming, but most of my knowledge and experience comes from on the job training, side projects (see my “Reflections” series), the internet and other unofficial channels over about 7 years of employment as a developer. I’ll also concede that 7 years is basically nothing, and I’m at least 3 years from even starting to master the field. So why the hell am I even qualified to write about it? Very simple – it’s my party, and I’ll write about what I want to.
Seriously though – it’s up to you to decide whether I am making sense and whether to learn from or discard what I write, or even better – start a discussion and give me your opinions.

Why is that important?

Programming the science is pretty different from programming the craft. To be successful at the craft you need way more than a mastery of a particular programming language or platform. Some general skills like teamwork are a must but also understanding project management concepts and methodologies (waterfall and agile for example), quality assurance concepts and a few other essential skills. The sad fact is that being a “rockstar” (eww) programmer is dependent more on one’s ability to ship software than on one’s skills with C.
This series will talk about the skills you need to ship software.

Some of the topics I already lined up for this series are:

  • Technical Debt
  • Continuous Integration
  • Python’s “There should only be one obvious way to do it”
  • Time Estimations

Hopefully you’ll find this series interesting and informative, and you’ll become a better developer because of it.

Reflections – an Intro and an Example

In this first post in the “Reflections” series I’m going to explain what is the purpose of the series, talk about the reasoning behind revisiting old projects and finally applying it all to a small but fun piece of code.

Reflecting on What?

This series will be about reviewing my github projects. This serves the dual purpose of having at least 10 topics for blog posts and writing documentation for old projects in the hopes that someone else finds them useful or interesting. About a year ago I’ve decided that I must do something about the tens of “side projects” that I have from more than 5 years of coding. I quickly triaged the list of about 25 projects into ‘forget’ or ‘open-source’ and just threw everything presentable with a really minimal readme to github.
Part of the list of projects

So why write about it?

  • Code isn’t as find-able as text. To make this code useful, it needs to be at least somewhat documented. When I uploaded them I half assed a read me file but I doubt that is enough for a search engine to index it correctly.
  • I think most of the projects has something unique to add to the web and are interesting both at the code level and in the research that brought them to life.
  • Revisiting old projects will give me a nice nostalgical feeling.
  • Did I mention 10 easy blog posts?

What am I going to cover?

  • The why – The research that went into the code. Since most of the projects are a Proof of Concept (POC) I think it will be interesting to remember what started that particular train of thought.
  • The how – The parts of the code that do interesting stuff and how do they do it.
  • The what now – Maybe a paragraph or two about how the project can be productized or improved and where can it evolve from here.

Let’s try this out

To finish this intro I want to talk about a small but fun hack I did when I got my first Android phone. It was the happy era of 2.X Android and I got a HTC Desire from England. It was right about that time that they announced that next version will be out in a month (Haha, I still believed OTA promises) and will have Hebrew support. That meant that I had a month where my phone didn’t have the Hebrew language font and most importantly I couldn’t read any SMS. Since the phone was new I didn’t want to experiment with rooting just yet, I needed to find a clever solution. One of the first apps I installed on the phone was the Android Scripting Engine (also known as Scripting Layer 4 Android, unfortunately development looks pretty dead now, although it works on my Galaxy Nexus) since I hoped there would be a way to write real apps with it. Unfortunately it was quite minimal, but one thing it did have is a way to get your SMS. That meant I could write a simple script to transliterate Hebrew Unicode into English (ASCII).
Picture of the SMS transliteration app

The code itself is very straight forward. It is only a few tens of lines, and there is pretty much no reason to ever use it again, but I think it represents why I like the Android – it has more of a “hacker” spirit than the iPhone. The only interesting thing I see in the code is that the sign for NIS and the u2029 whitespace in the translation dictionary. I guess I started manually adding characters in an attempt to map Unicode to ASCII. Finally I gave up and added the error parameter to encode that puts a “?” character for every Unicode code point it can’t print.

Why Start a Site and Blog?

To have my own unique place on the web to speak my mind and to reach an audience.

My own

There are plenty of places to speak your mind on the web – social networks, blogging platforms, wikis and the list goes on. The one thing a lot of people don’t realize is that most of those places control your content, your posts, your tweets.  Sure, some might do it less blatantly then others, some might say that they don’t censor and some have an export option, but the biggest problem is that more often than not they also control your audience. I’m not against such services (twitter is even starting to grow on me), but I don’t like the lack of control. I want my online persona to be under my own domain, designed (even poorly) by me and controlled by me.

Unique place

Continuing on the same note – sveder.com is unique. Not only is that my nickname for years now, but it is also a great domain name and more importantly, brand. So yeah, it will take a lot of time, content and luck for this brand to get big, but it will be more rewarding to me than having the same reputation on twitter or Google+.

Speak my mind

Doesn’t everyone want to be heard? Isn’t attention one of the most basic needs of humans? Less philosophically – I’m going to use this space to broadcast my opinions, my knowledge and my speculations. But like everything in a pragmatic mind – those might change, become more accurate and be proven wrong by reality and by the audience. Topics for this blog include computers, programming, developing, computer games, bicycles and everything else I might get interested in.

Reach an audience

The last point, but probably the most important – while I can control what I write here, no one can really control who views it, who it reaches and who has something to say back. This blog is pretty much like every other product – you put it out there and hope that it helps someone and that someone will get passionate about it enough to start a conversation.

A sign of things to come

I mentioned the general topics I will be covering so to get me a bit more committed here is a list of upcoming subjects for posts:

A series about my github projects – since they are already out there I want to have a post or two about each detailing their history, code reviewing some stuff and capturing the knowledge in them in a more googleable way.

“Practical programming” – a series about basic modern software developing concepts that are usually not taught at school. Example topics include:

    • Technical debt
    • Agile practices
    • “There should be one– and preferably only one –obvious way to do it”
    • separation between view and logic and more and MVC

I’m gonna try and have at least two posts a week.
If that sounds interesting you should subscribe using RSS and be sure to comment and talk back so that I can improve – this side of the keyboard is new to me, so let the journey begin 🙂