Projects coming out of the woodwork! Ben, I added your project to FOSS4Lib (http://foss4lib.org/package/fromthepage) and will send you an invitation to an account on FOSS4Lib to maintain your package information.
On Mar 13, 2013, at 9:55 AM, Ben Brumfield wrote:
> Let me echo Jim in suggesting a transcription tool rather than OCR for
> handwritten texts. However, a lot depends on the kinds of material you're
> working with and the uses you plan for the transcripts. Is it structured
> data, like census records, account books, or an index-card database?
> Is it free-form text like diaries or letters? Does the text contain a lot of
> genetic elements like strike-throughs, careted insertions and
> marginalia? Do you want to index terms so that readers can view all
> mentions of banjos within the text?
> At present, there is no one tool that supports all of these. I built and
> maintain one (AGPL) tool for free-form text to be used in indexing
> [Self-promotion: http://fromthepage.com/ is the tool; source is at
> http://github.com/benwbrum/fromthepage/ ] and have spent the last
> year building another (Apache) tool for converting tabular records into
> a search database. I think they're great, and am really excited about
> them both. Nevertheless, last week I pointed a project at Jim's
> T-PEN instead of my own tools, because the manuscripts were medieval
> Arabic donation records which needed line-based transcription.
> I maintain a list of transcription tools used in crowdsourcing
> projects here: http://tinyurl.com/TranscriptionToolGDoc
> Currently there are around 30 that I know of, and I'd be happy
> to give my opinion of what's appropriate for your project on or off list.
> Ben Brumfield
Assistant Director, Technology Services Development