Projects coming out of the woodwork!  Ben, I added your project to FOSS4Lib (http://foss4lib.org/package/fromthepage) and will send you an invitation to an account on FOSS4Lib to maintain your package information.


Peter

On Mar 13, 2013, at 9:55 AM, Ben Brumfield <[log in to unmask]> wrote:
> Let me echo Jim in suggesting a transcription tool rather than OCR for 
> handwritten texts.  However, a lot depends on the kinds of material you're 
> working with and the uses you plan for the transcripts.  Is it structured 
> data, like census records, account books, or an index-card database?
> Is it free-form text like diaries or letters? Does the text contain a lot of 
> genetic elements like strike-throughs, careted insertions and 
> marginalia?  Do you want to index terms so that readers can view all
> mentions of banjos within the text? 
> 
> At present, no single tool supports all of these.  I built and
> maintain one (AGPL) tool for free-form text to be used in indexing
> [Self-promotion: http://fromthepage.com/ is the tool; source is at
> http://github.com/benwbrum/fromthepage/ ] and have spent the last
> year building another (Apache) tool for converting tabular records into
> a search database.  I think they're great, and am really excited about
> them both.  Nevertheless, last week I pointed a project at Jim's 
> T-PEN instead of my own tools, because the manuscripts were medieval 
> Arabic donation records which needed line-based transcription.  
> 
> I maintain a list of transcription tools used in crowdsourcing
> projects here: http://tinyurl.com/TranscriptionToolGDoc 
> Currently there are around 30 that I know of, and I'd be happy
> to give my opinion of what's appropriate for your project on or off
> list.
> 
> Ben Brumfield
> http://manuscripttranscription.blogspot.com/



--
Peter Murray
Assistant Director, Technology Services Development
LYRASIS
[log in to unmask]
+1 678-235-2955
800.999.8558 x2955