Monday, December 27, 2010

Defeating CAPTCHAs

Another coworker of mine mentioned to me that a hobby of his was defeating CAPTCHAs, and in that instant I realized that there were two completely different routes to doing that: one social, and one technological. A CAPTCHA is, of course, a Completely Automated Public Turing test to tell Computers and Humans Apart: those squiggly letters you're forced to enter to post on a forum, register a new account, or whatever. Sites use them to prevent mechanical submissions, which get really annoying, really quickly.
The technological approach is to basically reinvent OCR, Optical Character Recognition. OCR has gotten a lot of funding as a way of automating the conversion of paper documents into computerized ones, to gain the advantages of computerized documents -- easy transmission, copying, editing, and so on. An OCR approach analyzes the shapes in the image to determine which letter each one originally was, and enters that. Supposedly, really good ones can work from just three rows of pixels.
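To make the idea concrete, here is a toy sketch of the simplest kind of character recognition: comparing each glyph against stored templates and picking the closest one. Everything in it is an assumption for illustration (the 3x5 bitmaps, the three-letter alphabet, and the function names `score`, `classify`, and `read_captcha`), and it is not my coworker's method. It also assumes the letters have already been cut apart and straightened, which is the part a real CAPTCHA is designed to make hard.

```python
# Hypothetical 3x5 glyph templates: '#' is ink, '.' is background.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def score(glyph, template):
    """Count the pixels where the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def classify(glyph):
    """Return the template letter that agrees with the glyph on the most pixels."""
    return max(TEMPLATES, key=lambda letter: score(glyph, TEMPLATES[letter]))


def read_captcha(glyphs):
    """Classify a sequence of already-segmented glyphs and join the letters."""
    return "".join(classify(glyph) for glyph in glyphs)


# A 'T' with one pixel of ink knocked out still matches 'T' best.
noisy_t = ["###", ".#.", "...", ".#.", ".#."]
print(read_captcha([TEMPLATES["L"], noisy_t, TEMPLATES["I"]]))
```

Pixel-by-pixel matching like this falls apart as soon as the letters are rotated, warped, or overlapping, which is why the serious attempts need far more than a lookup table.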
The social approach is to accept that only humans are capable of reading the bent and distorted letters of a CAPTCHA, and to convince them to do so. One common approach is to offer something in exchange, like file downloads or pornography. There are plenty of people who will willingly do just about anything to get more of those things, including deciphering letter puzzles. It's not as fast, but it is plenty reliable. After all, the goal of the CAPTCHA maker is not technically circumvented: a human being is solving each and every one of their little puzzles. Just...not in the way they had hoped. The answers gathered through a social attack are promptly cached and used to hammer the server with mechanical submissions.
My coworker, however, said he took the technological approach. He took pride in the quality of his OCR craftsmanship, boasting that he needed only the right three rows of pixels to guess the correct answer.


TCG said...

I doubt his OCR would be able to read them all. I've encountered more than my fair share of completely illegible ones no matter how I turned my head. The cat ones are particularly horrible.

Some SOB even put this as a CAPTCHA:

A site is surveyed and it is found that the angle to the top of the main building is 45 degrees when measured from a distance. When measured from 200 metres closer, the angle is 80 degrees. Assuming the ground is horizontal, make some practical calculations using degrees and the three trigonometric ratios, sin, cos or tan to find the height of the building.
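For what it's worth, the puzzle does work out. A rough sketch, assuming both angles are elevations measured from ground level and the 200 metres is walked straight toward the building (the puzzle doesn't spell that out). If the far point is at distance d, then h = d * tan(45) and h = (d - 200) * tan(80), which rearranges to h = 200 / (1/tan(45) - 1/tan(80)). The function name `building_height` is made up for this example.

```python
import math


def building_height(far_angle_deg, near_angle_deg, step_m):
    """Height from two elevation angles taken step_m apart on level ground."""
    far = math.radians(far_angle_deg)
    near = math.radians(near_angle_deg)
    # h / tan(far) is the far distance, h / tan(near) is the near distance,
    # and the two differ by step_m.
    return step_m / (1 / math.tan(far) - 1 / math.tan(near))


height = building_height(45, 80, 200)
print(f"{height:.1f} m")  # a little under 243 m
```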

KaiWen said...

These things can be a bitch on 4chan. If Anonymous can't find a way to beat it, no one can.

