For those interested in exploring crowdsourcing, transcription tools, and OCR, this is a really neat opportunity to see what's going on in natural science collections.
I attended the Augmenting OCR hackathon in February and learned a tremendous amount about OCR. Better yet, one of the tools I developed for processing entomology labels was re-used successfully by folks at the Early Modern OCR Project for their work dealing with 18th-century English printed books.
I wrote up the experience here: http://manuscripttranscription.blogspot.com/search/label/hackathon
iDigBio (www.idigbio.org) and Zooniverse's Notes from Nature Project (www.notesfromnature.org) are pleased to announce a hackathon to further enable public participation in online transcription of biodiversity specimen labels. There are approximately 1 billion specimens of this type in US collections alone, but it is estimated that information from just 10% of them is currently digitized and online. Digitization of natural history collections grants researchers access to vast quantities of information in their investigations of timely subjects such as climate change, invasive species, and the extinction crisis. The magnitude of the task of bringing those collections into digital format exceeds that of any single organization and will require new, Internet-scale approaches to engage the public. This is an exciting opportunity to work on a ground-breaking citizen-science endeavor with immediate and strong impacts in the areas of biodiversity research and applied conservation.
The event will take place December 16-20, 2013, at iDigBio in Gainesville, FL. Up to $1200 is available to each participant to support travel and lodging.
The hackathon will produce new functionality and interoperability for Zooniverse's Notes from Nature (www.notesfromnature.org) and similar transcription tools. Four areas of development will be addressed progressively throughout the week. On Monday, the focus will be (1) linking images registered to the iDigBio Cloud with transcription tools, to improve efficiency and alleviate storage issues. Starting on Tuesday, topics will include (2) transcription QA/QC and the reconciliation of replicate transcriptions, (3) integration of OCR into the transcription workflow, and (4) new UI features and novel incentive approaches for public engagement.
We expect that most participants will arrive on Monday afternoon and depart late Friday afternoon or evening, or on Saturday morning. There will be a social at the Florida Museum of Natural History on Wednesday, December 18. A teleconference, tentatively scheduled for early in the week of November 25, will offer opportunities to narrow the focus in each category of activity.
**If you wish to be considered for one of about ten open invitations (of a total of about 30), please send the following to Austin Mast ([log in to unmask]) by Friday, November 1, for assured consideration: (1) your CV/resume, (2) a short description (<250 words) of your relevant expertise (citing example products where appropriate), (3) the development areas that interest you (of the four numbered above), and (4) the days that you can attend. At least 3 slots will be reserved for qualified graduate students.**
With best regards,
Austin and Rob Guralnick (UC-Boulder), co-organizers
Associate Professor · Director, Robert K. Godfrey Herbarium · Associate Editor, Systematic Biology and Systematic Botany · Treasurer, American Society of Plant Taxonomists · Steering Committee Member, iDigBio, The National Resource for Advancing Digitization of Biodiversity Collections
Department of Biological Science · 319 Stadium Drive · Florida State University · Tallahassee, FL 32306-4295 · U.S.A.
Office is King Life Science Building, room 4065 · Lab is King Life Science Building, rooms 4068 and 4084 · Herbarium is Biological Science Unit One, room 100
Voice: 1 (850) 645-1500 · Fax: 1 (850) 645-8447 · [log in to unmask]