Hi all,

I have an interesting assessment issue with some recently digitized newspapers, and I wondered if anyone could shed some light on it. We sent a batch of 19th-century newspapers to a vendor knowing they weren't in great shape, and now we have to decide whether the resulting images (TIFFs) are usable or whether we should be looking for alternative copies and/or microfilm.

A lot of the images are in decent shape, but the first few pages of each issue are heavily creased and generally missing a smallish piece from the center of the page where the folds met. I'm looking for a way to programmatically identify how much text is missing or unusable on each page.

We haven't run OCR yet; part of this assessment is to figure out whether we should bother sending these items out for OCR and METS/ALTO creation. I suspect we could run a quick-and-dirty in-house OCR if that would help. We can go through the images by hand and try to measure and/or count, but if anyone's worked on something like this or has thoughts, I'd love to hear them!

Thanks,
Christine

--
Christine Mayo
Digital Production Librarian
Thomas P. O'Neill, Jr. Library
Boston College
140 Commonwealth Avenue
Chestnut Hill, MA 02467
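
To make the "how much text is missing on each page" measurement concrete, here is a minimal sketch of one possible approach that needs no OCR. It is only an illustration under several assumptions that are not from the post: the page has already been cropped to the text block (otherwise margins would be flagged), a missing piece shows up as a light area rather than as a dark scanner backing, and the function names, tile size, and thresholds are all hypothetical values that would need tuning against real scans. The idea is to split the page into tiles, measure how much "ink" each tile contains, and report the share of tiles that are far emptier than the page's typical tile.

```python
from statistics import median


def ink_fraction(tile, ink_threshold=128):
    """Fraction of pixels in a tile darker than ink_threshold (0 = black, 255 = white)."""
    total = sum(len(row) for row in tile)
    dark = sum(1 for row in tile for px in row if px < ink_threshold)
    return dark / total if total else 0.0


def suspect_tile_fraction(page, tile_size=4, low_ratio=0.25, ink_threshold=128):
    """Return the share of tiles whose ink density is far below the page median.

    `page` is a grayscale image as a list of rows of 0-255 integers.
    `tile_size`, `low_ratio`, and `ink_threshold` are illustrative defaults,
    not calibrated values.
    """
    densities = []
    for top in range(0, len(page), tile_size):
        for left in range(0, len(page[0]), tile_size):
            tile = [row[left:left + tile_size] for row in page[top:top + tile_size]]
            densities.append(ink_fraction(tile, ink_threshold))

    typical = median(densities)
    if typical == 0:
        # Blank page: there is no "normal" text density to compare against.
        return 0.0

    suspects = [d for d in densities if d < low_ratio * typical]
    return len(suspects) / len(densities)


if __name__ == "__main__":
    # Synthetic 8x8 "page": a checkerboard stands in for evenly printed text.
    page = [[0 if (r + c) % 2 == 0 else 255 for c in range(8)] for r in range(8)]
    # Blank out the top-left quarter to imitate a missing piece.
    for r in range(4):
        for c in range(4):
            page[r][c] = 255
    print(suspect_tile_fraction(page, tile_size=4))
```

On real scans the grayscale rows would come from an imaging library such as Pillow, and the tile size would be closer to a few lines of text in height. A score like this could be used to rank pages for manual review rather than as a final verdict on usability.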