Hello Scott,
I would be curious to hear more about what you expect from OpenRefine in
this case. I know OpenRefine is powerful for many things, but I can't see
how it applies here. Can you expand?

Thanks
----
Sylvain Machefert - Bordeaux, France
Web services librarian - http://geobib.fr/en

On 22/11/2014 19:44, scott bacon wrote:
> Erica,
>
> You may find what you need from OpenRefine: http://openrefine.org/
>
>
>
> On Fri, Nov 21, 2014 at 5:15 PM, Erica FINDLEY <[log in to unmask]> wrote:
>
>> Greetings,
>>
>> I am working on a project to digitize concert programs. These are the type
>> of programs you get when attending a musical concert that list performers
>> and details about the concert.
>>
>> Since these items are text heavy we have decided to use OCR software to
>> output a text file that will enable full text searching in our platform.
>>
>> These text files are for the most part accurate, but often have unnecessary
>> line breaks and pockets of extra characters and/or incorrect
>> capitalization. I would like to pretty them up a little bit if possible.
>>
>> I am wondering if there is a script I can use on multiple files to clean
>> these types of things up. I don't want to have the digitization staff
>> manually edit each text file or have to open each one to run a macro in a
>> text editor.
>>
>> I have been searching online and so far haven't found anything that will
>> work for my situation.
>>
>> thanks in advance,
>>
>> *Erica Findley*
>> Cataloging/Metadata Librarian
>> Multnomah County Library
>> Phone: 503.988.5466
>> [log in to unmask]
>> www.multcolib.org
>>
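
For reference, the kind of batch script asked about in the quoted message could be sketched roughly as follows in Python. This is only a minimal sketch under stated assumptions, not a tested solution for these particular files: the function names `clean_text` and `clean_folder`, the rule that a run of three or more symbol characters is OCR noise, and the rule that single line breaks inside a paragraph are unwanted are all hypothetical choices made for illustration. It does not attempt to fix capitalization, and joining lines may be wrong for performer lists, so it writes cleaned copies to a separate folder and leaves the originals alone.

```python
import re
from pathlib import Path

# Assumed heuristic: 3 or more consecutive non-word, non-space characters
# are treated as OCR noise. This also removes things like "..." or "---".
NOISE = re.compile(r"[^\w\s]{3,}")


def clean_text(text: str) -> str:
    """Apply simple, assumed cleanup heuristics to the text of one file."""
    text = NOISE.sub(" ", text)
    # Split on blank lines so that paragraph breaks are preserved.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = []
    for para in paragraphs:
        # Join lines broken inside a paragraph and collapse extra whitespace.
        joined = " ".join(para.split())
        if joined:
            cleaned.append(joined)
    return "\n\n".join(cleaned) + "\n"


def clean_folder(src: Path, dst: Path) -> int:
    """Clean every .txt file in src, write the result to dst, return the count."""
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(src.glob("*.txt")):
        raw = path.read_text(encoding="utf-8", errors="replace")
        (dst / path.name).write_text(clean_text(raw), encoding="utf-8")
        count += 1
    return count
```

Calling `clean_folder(Path("ocr_output"), Path("ocr_cleaned"))` would process a whole folder in one pass (both folder names are placeholders). Comparing a few cleaned files against their originals before running it on everything would show whether the heuristics are too aggressive.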