Hi John,

That sounds really interesting! Can you share a link to this game or code?

Jesse

On Tue, Dec 1, 2015 at 3:43 PM, Jason Bengtson wrote:

> This may be a dumb thought, but I built a game a couple of years ago which
> tracked results on a map (on an HTML canvas, with the map set as a
> background with objects drawn on top of it) by counting the pixels of a
> certain color and comparing them as a percentage against the pixels in the
> whole map. You could do something similar, by comparing black or gray
> beyond a particular threshold against total pixels. That would be a pretty
> rough and ready approach, but it might be worth a shot. If the missing
> sections have a significantly different color than the rest of the image,
> that could be another metric to use.
>
> Best regards,
> *Jason Bengtson, MLIS, MA*
> Innovation Architect
>
> *Houston Academy of Medicine - The Texas Medical Center Library*
> 1133 John Freeman Blvd
> Houston, TX 77030
> http://library.tmc.edu/
> www.jasonbengtson.com
>
> On Tue, Dec 1, 2015 at 2:07 PM, Christine Mayo wrote:
>
> > Hi all,
> >
> > I have an interesting assessment issue with some recently digitized
> > newspapers that I wondered if anyone could shed some light on. We sent a
> > batch of 19th century newspapers off to a vendor knowing they weren't in
> > great shape, and now we have to decide whether the resultant images (TIFFs)
> > are usable or we should be looking for alternative copies and/or microfilm.
> >
> > A lot of the images are in decent shape, but the first few pages of each
> > issue are heavily creased and generally missing a smallish piece from the
> > center of the page where the folds met. I'm looking for a way to
> > programmatically identify how much text is missing/unusable for each page.
> > We haven't run OCR yet, part of this assessment is to figure out whether we
> > should bother sending these items out for OCR and METS/ALTO creation, but I
> > suspect we could run a quick and dirty in-house OCR if that would help.
> >
> > We can go through the images by hand and try to measure and/or count, but
> > if anyone's worked on something like this or has thoughts, I'd love to hear
> > them!
> >
> > Thanks,
> > Christine
> >
> > --
> > Christine Mayo
> > Digital Production Librarian
> > Thomas P. O'Neill, Jr. Library
> > Boston College
> > 140 Commonwealth Avenue
> > Chestnut Hill, MA 02467

--
Jesse Martinez
Web Services Librarian
O'Neill Library, Boston College
617-552-2509
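
For anyone following the thread, here is a rough sketch of the pixel-counting idea quoted above: count the pixels darker than some cutoff and compare them against the total. It is not Jason's actual game code, which was not shared. The function name `dark_pixel_fraction` and the default threshold of 40 are hypothetical, and the sketch assumes missing areas scan as near-black. Printed ink is also dark, so the number only means something relative to a baseline from an undamaged page. Loading a real TIFF is left out; one option (an assumption, not something from the thread) would be Pillow's `Image.open(path).convert("L")` to get 8-bit grayscale values.

```python
from typing import Iterable, Sequence


def dark_pixel_fraction(rows: Iterable[Sequence[int]], threshold: int = 40) -> float:
    """Return the fraction of pixels at or below `threshold`.

    rows: rows of 8-bit grayscale values (0 = black, 255 = white).
    threshold: hypothetical cutoff for "missing"; needs tuning per scan batch.
    """
    dark = 0
    total = 0
    for row in rows:
        for value in row:
            total += 1
            if value <= threshold:
                dark += 1
    if total == 0:
        raise ValueError("image has no pixels")
    return dark / total


# Tiny synthetic "page": 4x4 pixels with a 2x2 dark hole in the center.
page = [
    [230, 228, 231, 229],
    [227, 10, 12, 230],
    [229, 8, 15, 228],
    [231, 230, 229, 227],
]
print(f"{dark_pixel_fraction(page):.2%} of pixels are at or below the threshold")
```

Running this per page and sorting by the resulting fraction would give a crude ranking of which scans to inspect by hand first.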