Hi Jason,

That sounds really interesting! Can you share a link to this game or code?

Jesse

On Tue, Dec 1, 2015 at 3:43 PM, Jason Bengtson <[log in to unmask]>
wrote:

> This may be a dumb thought, but I built a game a couple of years ago which
> tracked results on a map (on an HTML canvas, with the map set as a
> background with objects drawn on top of it) by counting the pixels of a
> certain color and comparing them as a percentage against the pixels in the
> whole map. You could do something similar, by comparing black or gray
> beyond a particular threshold against total pixels. That would be a pretty
> rough and ready approach, but it might be worth a shot. If the missing
> sections have a significantly different color than the rest of the image,
> that could be another metric to use.
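
A minimal sketch of the pixel-counting idea described above, in Python rather than on an HTML canvas. It assumes the page has already been decoded into rows of 8-bit grayscale values (0 = black, 255 = white); the function name `dark_fraction` and the threshold of 40 are illustrative choices, not anything from the original game.

```python
from typing import Iterable, Sequence


def dark_fraction(rows: Iterable[Sequence[int]], threshold: int = 40) -> float:
    """Return the share of pixels whose gray value is at or below `threshold`.

    `rows` is any iterable of pixel rows holding 0-255 grayscale values.
    The threshold is an assumed starting point and would need tuning
    against real scans.
    """
    dark = 0
    total = 0
    for row in rows:
        for value in row:
            total += 1
            if value <= threshold:
                dark += 1
    # An empty image has no dark pixels; avoid dividing by zero.
    return dark / total if total else 0.0


# Tiny synthetic "page": 255 is paper, 0 is the hole where the folds met.
page = [
    [255, 255, 255, 255],
    [255,   0,   0, 255],
    [255,   0,   0, 255],
    [255, 255, 255, 255],
]

print(f"{dark_fraction(page):.2%} of pixels are at or below the threshold")
```

For real TIFFs, the rows would have to come from an imaging library; with Pillow, for example, `Image.open(path).convert("L")` yields a grayscale image whose pixels can be read out. Printed ink is dark too, so the raw percentage overstates the loss; comparing damaged pages against intact pages from the same issue, or measuring only the central region of the page, might give a more useful number.
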
>
> Best regards,
> *Jason Bengtson, MLIS, MA*
> Innovation Architect
>
>
> *Houston Academy of Medicine - The Texas Medical Center Library*
> 1133 John Freeman Blvd
> Houston, TX   77030
> http://library.tmc.edu/
> www.jasonbengtson.com
>
> On Tue, Dec 1, 2015 at 2:07 PM, Christine Mayo <[log in to unmask]> wrote:
>
> > Hi all,
> >
> > I have an interesting assessment issue with some recently digitized
> > newspapers that I wondered if anyone could shed some light on. We sent a
> > batch of 19th-century newspapers off to a vendor knowing they weren't in
> > great shape, and now we have to decide whether the resultant images
> > (TIFFs) are usable or we should be looking for alternative copies and/or
> > microfilm.
> >
> > A lot of the images are in decent shape, but the first few pages of each
> > issue are heavily creased and generally missing a smallish piece from the
> > center of the page where the folds met. I'm looking for a way to
> > programmatically identify how much text is missing/unusable for each
> > page. We haven't run OCR yet; part of this assessment is to figure out
> > whether we should bother sending these items out for OCR and METS/ALTO
> > creation, but I suspect we could run a quick and dirty in-house OCR if
> > that would help.
> >
> > We can go through the images by hand and try to measure and/or count, but
> > if anyone's worked on something like this or has thoughts, I'd love to
> > hear them!
> >
> > Thanks,
> > Christine
> >
> > --
> > Christine Mayo
> > Digital Production Librarian
> > Thomas P. O'Neill, Jr. Library
> > Boston College
> > 140 Commonwealth Avenue
> > Chestnut Hill, MA 02467
> > [log in to unmask]
> >
>



-- 
Jesse Martinez
Web Services Librarian
O'Neill Library, Boston College
[log in to unmask]
617-552-2509