As you probably know, you can compress PDFs by compressing or flattening
the layers (most useful for born-digital materials, such as artwork) or by
applying a compression algorithm to the underlying images for PDFs
assembled from digitized images, which seems to be what you're doing.
Reducing the image size (pixels) and bit depth prior to assembling images
in a PDF (i.e., don't start with your 800ppi TIFF master) can make a
dramatic difference to the total size of the PDF. Beyond that, lossless and
lossy compression algorithms can reduce the size of the underlying image
files, with different techniques working well on different types of images.
IrfanView and Ghostscript can help with this. LZW is one of the more common
lossless compression algorithms for TIFF images. JPEG2000 also offers good
lossless compression.
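
To give a rough sense of how much pixel dimensions and bit depth matter
before any compression is applied, here is a minimal Python sketch of the
arithmetic. Only the 800ppi figure comes from the discussion above; the
letter-size page (8.5 x 11 in), the 24-bit color master, and the 300ppi
8-bit grayscale access copy are assumptions chosen for illustration, and
the function name is hypothetical.

```python
def raw_image_bytes(width_in, height_in, ppi, bits_per_pixel):
    """Return the uncompressed size, in bytes, of one scanned page."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed example: a letter-size page (8.5 x 11 in).
master = raw_image_bytes(8.5, 11, 800, 24)  # 800ppi, 24-bit color master
access = raw_image_bytes(8.5, 11, 300, 8)   # 300ppi, 8-bit grayscale copy

print(f"master: {master / 1e6:.1f} MB")
print(f"access: {access / 1e6:.1f} MB")
print(f"ratio:  {master / access:.1f}x")
```

Under those assumptions the master is roughly 180 MB per page uncompressed
and the access copy roughly 8.4 MB, about a 21-fold difference before LZW,
JPEG2000, or any other compression is applied.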

In addition to LuraTech, there's at least one other proprietary PDF
compression system, developed by SAFER Inc. (http://www.saferinc.com/). Based
on a conversation with someone from the company about 18 months ago, they
use algorithms that do automatic edge detection and background detection,
applying compression non-uniformly to regions that appear to contain little
information. At the time of this conversation, they weren't able to give me
any white papers or peer-reviewed articles describing the algorithms used,
which made me hesitant about recommending the system for anything remotely
archival, though they claimed it was lossless. For use copies, though, the
software does work very well, and file size reduction is dramatic. I don't
know anything about pricing. LuraTech may use something similar in their
"Mixed Raster Content (MRC)" or "layered" compression. As far as I know,
IrfanView and Ghostscript don't include algorithms to do anything similar.

Danielle Cunniff Plumer
dcplumer associates
www.dcplumer.com



> > -----Original Message-----
> > From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> > Nathan Tallman
> > Sent: Wednesday, October 24, 2012 10:29 AM
> > To: [log in to unmask]
> > Subject: [CODE4LIB] PDF Compression
> >
> > Can anyone recommend some good PDF compression software? Preferably
> > open-source or low-cost. We're scanning archival collections and the PDFs
> > can be quite large for a single folder. The folder may be thick or thin,
> > and contain a mix of text and images. We've fiddled with various Acrobat
> > settings for getting the file size down, but we haven't found a good
> > balance between quality and file size. (Plus, these need to be OCR'ed; so
> > far we've been doing that in Acrobat.)
> >
> > We were looking at LuraTech PDF Compressor, but the cost for an enterprise
> > license is pretty high. It did do an excellent job though.
> >
> > Thanks,
> > Nathan
> >
>