Your question reminded me that I wanted to do something similar for a pile of PDFs from our institutional repository. So I made a small script in Python that can do this, using the Python module from: http://pybrary.net/pyPdf/ You can then put the following in a script: (mind the indentation) ---8<------ import os, sys import pyPdf if len(sys.argv) > 1: PATH = sys.argv[1] else: PATH = '.' for dirpath, dirnames, filenames in os.walk(PATH): for filename in filenames: try: filename_path = os.path.join(dirpath, filename) checked_file = pyPdf.PdfFileReader(file(filename_path, "rb")) except Exception, e: sys.stderr.write('%s :: %s\n' % (filename_path, e)) ---8<--------- If you run it without arguments, it checks the current directory, if you specify a path it will walk down the tree and check each file found. If a file is not recognised as a PDF it barfs on stderr. Have fun. Etienne Posthumus TU Delft Library - Digital Product Development t: +31 (0) 15 27 81 949 m: [log in to unmask] skype: eposthumus http://www.library.tudelft.nl/ Prometheusplein 1, 2628 ZC, Delft, Netherlands