Let me echo Jim in suggesting a transcription tool rather than OCR for 
handwritten texts.  However, a lot depends on the kinds of material you're 
working with and the uses you plan for the transcripts.  Is it structured 
data, like census records, account books, or an index-card database?  
Is it free-form text like diaries or letters? Does the text contain a lot of 
genetic elements like strike-throughs, careted insertions and 
marginalia?  Do you want to index terms so that readers can view all
mentions of banjos within the text? 

At present, no single tool supports all of these.  I built and 
maintain one (AGPL) tool for transcribing and indexing free-form text
[Self-promotion: http://fromthepage.com/ is the tool; source is at
http://github.com/benwbrum/fromthepage/ ] and have spent the last
year building another (Apache) tool for converting tabular records into
a searchable database.  I think they're great, and am really excited about
them both.  Nevertheless, last week I pointed a project at Jim's 
T-PEN instead of my own tools, because the manuscripts were medieval 
Arabic donation records which needed line-based transcription.  

I maintain a list of transcription tools used in crowdsourcing 
projects here: http://tinyurl.com/TranscriptionToolGDoc 
Currently there are around 30 that I know of, and I'd be happy
to give my opinion of what's appropriate for your project on or off
list.

Ben Brumfield
http://manuscripttranscription.blogspot.com/