I was wondering if anyone has created a script or tool to compare the words
in a text file to a dictionary? I'm looking for a way to quantify the
quality of OCR output. I've heard that counting how many of the words are
in the dictionary is a good quick-and-dirty way to do this, but I would
like to be able to run such a script on larger batches of text files so I
can compare OCR engines (rather than counting words manually).
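
To make the idea concrete, here is a rough Python sketch of the kind of
script I have in mind. It is not an existing tool, and everything in it is
an assumption on my part: the function names are made up, the dictionary is
assumed to be a plain word list with one word per line, and the OCR output
is assumed to be UTF-8 .txt files in a single directory. It splits each
file on whitespace, strips surrounding punctuation, and reports the
fraction of tokens found in the word list.

```python
import string
from pathlib import Path


def load_dictionary(path):
    """Read a one-word-per-line word list into a lowercase set (assumed format)."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return {line.strip().lower() for line in fh if line.strip()}


def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, lowercase."""
    tokens = (t.strip(string.punctuation).lower() for t in text.split())
    return [t for t in tokens if t]


def dictionary_ratio(text, dictionary):
    """Return (known, total): tokens found in the dictionary, and all tokens."""
    tokens = tokenize(text)
    known = sum(1 for t in tokens if t in dictionary)
    return known, len(tokens)


def score_directory(text_dir, dictionary):
    """Score every .txt file in text_dir; returns {filename: fraction known}."""
    scores = {}
    for path in sorted(Path(text_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        known, total = dictionary_ratio(text, dictionary)
        # An empty file gets 0.0 rather than raising a division error.
        scores[path.name] = known / total if total else 0.0
    return scores
```

Running score_directory once per OCR engine's output folder, with the same
word list, would give per-file scores that can be compared side by side.
Note that numbers, proper nouns, and words hyphenated across line breaks
all count as "not in the dictionary" here, so the score is only a rough
relative measure, not a true accuracy figure.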

Let me know if you have any existing tools or thoughts about how to go
about this!



Kimberly Kennedy
Digital Production Coordinator
Northeastern University Library
[log in to unmask]