Hello Scott,

I would be curious to hear more about what you expect from OpenRefine in this case. I know OpenRefine is powerful for many things, but I can't see how it fits the current case. Can you expand?

Thanks

----
Sylvain Machefert - Bordeaux, France
Web services librarian - http://geobib.fr/en

On 22/11/2014 19:44, scott bacon wrote:
> Erica,
>
> You may find what you need from OpenRefine: http://openrefine.org/
>
>
>
> On Fri, Nov 21, 2014 at 5:15 PM, Erica FINDLEY wrote:
>
>> Greetings,
>>
>> I am working on a project to digitize concert programs. These are the type
>> of programs you get when attending a musical concert that list performers
>> and details about the concert.
>>
>> Since these items are text heavy we have decided to use OCR software to
>> output a text file that will enable full text searching in our platform.
>>
>> These text files are for the most part accurate, but often have unnecessary
>> line breaks and pockets of extra characters and/or incorrect
>> capitalization. I would like to pretty them up a little bit if possible.
>>
>> I am wondering if there is a script I can use on multiple files to clean
>> these type of things up. I don't want to have the digitization staff
>> manually edit each text file or have to open each one to run a macro in a
>> text editor.
>>
>> I have been searching online and so far haven't found anything that will
>> work for my situation.
>>
>> thanks in advance,
>>
>> *Erica Findley*
>> Cataloging/Metadata Librarian
>> Multnomah County Library
>> Phone: 503.988.5466
>> www.multcolib.org
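
For the batch clean-up described in the quoted message, a short script may be enough. What follows is only a minimal sketch in Python, not something proposed in the thread. It assumes the OCR output is UTF-8 .txt files in one folder, that a blank line marks a real paragraph break, and that a run of three or more symbol characters is OCR noise. The names clean_text, clean_folder and NOISE are made up for illustration. It does not attempt to fix capitalization, which would need rules specific to the programs.

```python
import re
from pathlib import Path

# Assumed heuristic: three or more consecutive symbols (not letters, digits,
# whitespace, or common punctuation) are treated as OCR noise and removed.
NOISE = re.compile(r"[^\w\s.,;:!?'\"()&-]{3,}")


def clean_text(text: str) -> str:
    """Remove noise runs and unwrap lines, keeping blank-line paragraph breaks."""
    text = text.replace("\r\n", "\n")
    text = NOISE.sub("", text)
    cleaned = []
    # Split on blank lines so genuine paragraph breaks survive.
    for para in re.split(r"\n\s*\n", text):
        # Join wrapped lines and collapse repeated whitespace.
        joined = " ".join(para.split())
        if joined:
            cleaned.append(joined)
    return "\n\n".join(cleaned) + "\n"


def clean_folder(src: Path, dst: Path) -> int:
    """Clean every .txt file in src, write the results to dst, return the count."""
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(src.glob("*.txt")):
        raw = path.read_text(encoding="utf-8", errors="replace")
        (dst / path.name).write_text(clean_text(raw), encoding="utf-8")
        count += 1
    return count
```

Calling clean_folder(Path("ocr_raw"), Path("ocr_clean")) with two hypothetical folder names would process every file in one pass and leave the originals untouched, so the results can be spot-checked before anything is replaced.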