Thursday, January 14, 2010

Species Image ReCAPTCHA

After watching Luis von Ahn's inspiring talk on PopTech, I started thinking that CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) could be a cool way to improve the OCR of species names in books OCRed by the Biodiversity Heritage Library (BHL), particularly as Rod is currently using ReCAPTCHA on BioStor. BioStor helps to annotate and extract data from BHL, and Rod uses ReCAPTCHA to check that each annotation is being made by a human.
In my limited experience of OCRing, I have found that italicized species names are particularly hard to get right. If ReCAPTCHA could be set up to use species names from BHL, then users annotating BioStor would be improving the OCR of BHL articles while also helping to annotate them.
Along the same lines, I have been thinking that it would be great to use the power of the brain to annotate species images on the web, a bit in the style of Google's Image Labeler (which I find so addictive that I have been avoiding it for a while). Then, quite by chance, I came across this image CAPTCHA at the University of Edinburgh.
I thought this would be a really cool way to help tag wildlife images, at least with common names. You could also do a pro version for taxonomists, with Latin binomial tagging.
