Erica,

You may find what you need from OpenRefine: http://openrefine.org/

On Fri, Nov 21, 2014 at 5:15 PM, Erica FINDLEY <[log in to unmask]> wrote:

> Greetings,
>
> I am working on a project to digitize concert programs. These are the type
> of programs you get when attending a musical concert that list performers
> and details about the concert.
>
> Since these items are text heavy we have decided to use OCR software to
> output a text file that will enable full text searching in our platform.
>
> These text files are for the most part accurate, but often have unnecessary
> line breaks and pockets of extra characters and/or incorrect
> capitalization. I would like to pretty them up a little bit if possible.
>
> I am wondering if there is a script I can use on multiple files to clean
> these type of things up. I don't want to have the digitization staff
> manually edit each text file or have to open each one to run a macro in a
> text editor.
>
> I have been searching online and so far haven't found anything that will
> work for my situation.
>
> thanks in advance,
>
> *Erica Findley*
> Cataloging/Metadata Librarian
> Multnomah County Library
> Phone: 503.988.5466
> [log in to unmask]
> www.multcolib.org
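
If a script turns out to fit the workflow better than a GUI tool, the batch cleanup described in the quoted message could look something like the Python sketch below. This is only a rough starting point under stated assumptions: the files are assumed to be UTF-8 plain text in a single folder, and the names clean_text, clean_folder and NOISE, as well as the cleanup rules themselves, are hypothetical and not taken from any tool mentioned in this thread. It joins wrapped lines within a paragraph, rejoins words hyphenated across a line break, and drops runs of three or more stray symbols. It does not attempt to fix capitalization, which is hard to do safely without knowing the names involved.

```python
import re
from pathlib import Path

# Assumed rule: three or more consecutive characters that are not letters,
# digits, whitespace or common punctuation are treated as OCR noise.
NOISE = re.compile(r"[^\w\s.,;:!?'\"()&/-]{3,}")


def clean_text(text: str) -> str:
    """Return a tidied copy of one OCR text file's contents."""
    # Normalize Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Drop runs of stray symbols.
    text = NOISE.sub("", text)
    # Rejoin words split across a line break ("perfor-\nmers").
    # Caveat: this also removes the hyphen of a real compound word
    # that happened to be broken at its hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Blank lines separate paragraphs; line breaks inside a paragraph
    # are assumed to be unwanted and are replaced by single spaces.
    cleaned = []
    for para in re.split(r"\n\s*\n", text):
        joined = " ".join(s.strip() for s in para.split("\n") if s.strip())
        joined = re.sub(r"[ \t]{2,}", " ", joined)
        if joined:
            cleaned.append(joined)
    return "\n\n".join(cleaned) + "\n"


def clean_folder(src: Path, dst: Path) -> int:
    """Clean every .txt file in src, writing results to dst.

    The originals are left untouched. Returns the number of files written.
    """
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(src.glob("*.txt")):
        raw = path.read_text(encoding="utf-8", errors="replace")
        (dst / path.name).write_text(clean_text(raw), encoding="utf-8")
        count += 1
    return count
```

Calling clean_folder(Path("ocr_output"), Path("ocr_cleaned")) would process a whole folder in one pass (both folder names are placeholders). Because concert programs often list performers one per line, joining every line in a paragraph may be too aggressive for some pages, so it is worth comparing a few cleaned files against the originals before running it on everything.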