Projects coming out of the woodwork!

Ben, I added your project to FOSS4Lib (http://foss4lib.org/package/fromthepage) and will send you an invitation to an account on FOSS4Lib to maintain your package information.

Peter

On Mar 13, 2013, at 9:55 AM, Ben Brumfield wrote:

> Let me echo Jim in suggesting a transcription tool rather than OCR for
> handwritten texts. However, a lot depends on the kinds of material you're
> working with and the uses you plan for the transcripts. Is it structured
> data, like census records, account books, or an index cards database?
> Is it free-form text like diaries or letters? Does the text contain a lot of
> genetic elements like strike-throughs, careted insertions and
> marginalia? Do you want to index terms so that readers can view all
> mentions of banjos within the text?
>
> At present, there is no one tool that supports all of these. I built and
> maintain one (AGPL) tool for free-form text to be used in indexing
> [Self-promotion: http://fromthepage.com/ is the tool; source is at
> http://github.com/benwbrum/fromthepage/ ] and have spent the last
> year building another (Apache) tool for converting tabular records into
> a search database. I think they're great, and am really excited about
> them both. Nevertheless, last week I pointed a project at Jim's
> T-PEN instead of my own tools, because the manuscripts were medieval
> Arabic donation records which needed line-based transcription.
>
> I maintain a list of transcription tools used in crowdsourcing
> projects here: http://tinyurl.com/TranscriptionToolGDoc
> Currently there are around 30 that I know of, and I'd be happy
> to give my opinion of what's appropriate for your project on or off
> list.
>
> Ben Brumfield
> http://manuscripttranscription.blogspot.com/

--
Peter Murray
Assistant Director, Technology Services Development
LYRASIS
+1 678-235-2955
800.999.8558 x2955