
The other day a friend told me about reCAPTCHA, a CAPTCHA-based program for blocking spam bots, with an interesting twist. The developers (CMU CS students) are combining a web-scale CAPTCHA solution with an optical character recognition (OCR) system. OCR systems are not perfect: they have trouble with some scanned words that humans can typically read. By pairing a known word with an unknown one in each reCAPTCHA test, the developers can simultaneously sort humans from bots and enable an Internet-scale crowdsourced OCR solution. A correct answer on the known word suggests a human is typing, so the answer given for the unknown word can be treated as a likely transcription. They're using the human-provided answers to help the Internet Archive with its digitization efforts.
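
To make the idea concrete, here is a minimal Python sketch of how a two-word test like this might work. It is my own illustration, not reCAPTCHA's actual code: the names Challenge, Transcriptions and check, the "scan-17" style identifiers, and the three-vote threshold are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Challenge:
    """One two-word test: a control word plus a word OCR could not read."""
    known_answer: str  # text the system already trusts
    unknown_id: str    # hypothetical identifier for the unread scanned word


@dataclass
class Transcriptions:
    """Collects human guesses for the words the OCR system failed on."""
    votes: dict = field(default_factory=dict)  # unknown_id -> Counter of guesses

    def record(self, unknown_id, guess):
        # Normalize case and whitespace so "Upon" and "upon " count together.
        self.votes.setdefault(unknown_id, Counter())[guess.strip().lower()] += 1

    def consensus(self, unknown_id, min_votes=3):
        """Return the leading guess once it has at least min_votes, else None."""
        counts = self.votes.get(unknown_id)
        if not counts:
            return None
        guess, n = counts.most_common(1)[0]
        return guess if n >= min_votes else None


def check(challenge, known_guess, unknown_guess, store):
    """Pass the user only if the control word matches, then keep the other answer."""
    if known_guess.strip().lower() != challenge.known_answer.lower():
        return False  # probably a bot (or a typo): discard the unknown-word guess
    store.record(challenge.unknown_id, unknown_guess)
    return True
```

The point of the design is that only the known word decides pass or fail. The unknown word is never graded; its answers are simply tallied from people who passed, and a transcription is accepted once enough of them agree.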
reCAPTCHA is available for use on your own site, and there are plugins that make it even easier to add to WordPress- or MediaWiki-based sites.