On Aug 7, 2012, at 1:23 AM, Yong Tang wrote:
> I am a full-time information science student and a part-time LAMP server administrator. I was recently thrown into a file dumpster containing hundreds of old PDF files. My job is to clean the dumpster up by putting the right files into the right folders. I am facing some difficulties writing a Perl script to get the job done. I would appreciate it if you could help.
If you're just trying to sort them, you might want to check first whether there's any metadata attached, rather than take the more difficult (although possibly more accurate, if the metadata is ambiguous) approach of looking at the stored text.
Unfortunately, what generated the PDF (and how old it is) is going to affect how you get at the metadata. Newer files should carry it in XMP, but PDF also has a few fields of its own in the document information dictionary (Author, Title, Creator, CreationDate, Keywords, etc.).
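
To give a rough idea of what checking both places looks like, here is a minimal sketch. It is in Python only because it needs nothing beyond the standard library; the same logic ports to Perl. The function names (info_fields, xmp_fields, pdf_metadata) are made up for illustration. It assumes the Info dictionary is stored uncompressed with plain literal strings, and that the XMP packet is stored uncompressed as well. Encrypted files, hex or UTF-16 strings, and Info dictionaries tucked into object streams will simply come back empty, so treat it as a first pass rather than a real PDF parser.

```python
import re
import xml.etree.ElementTree as ET

# Fields named above; the PDF Info dictionary may hold others too.
INFO_KEYS = ("Title", "Author", "Creator", "CreationDate", "Keywords")
DC = "{http://purl.org/dc/elements/1.1/}"  # Dublin Core namespace used by XMP


def info_fields(data: bytes) -> dict:
    """Read literal-string entries from the Info dictionary (assumes it is uncompressed)."""
    # The trailer points at the Info object, e.g. "/Info 12 0 R".
    # Take the last reference so incremental updates win.
    refs = list(re.finditer(rb"/Info\s+(\d+)\s+(\d+)\s+R", data))
    if not refs:
        return {}
    num, gen = refs[-1].group(1), refs[-1].group(2)
    objs = list(re.finditer(
        rb"(?<!\d)" + num + rb"\s+" + gen + rb"\s+obj(.*?)endobj", data, re.S))
    if not objs:
        return {}
    body = objs[-1].group(1)
    found = {}
    for key in INFO_KEYS:
        # Matches "/Title (Some text)"; handles backslash escapes, not nested parentheses.
        m = re.search(rb"/" + key.encode() + rb"\s*\(((?:\\.|[^\\()])*)\)", body)
        if m:
            found[key] = m.group(1).decode("latin-1")
    return found


def xmp_fields(data: bytes) -> dict:
    """Read dc:title and dc:creator from an XMP packet (assumes it is uncompressed)."""
    m = re.search(rb"<x:xmpmeta.*?</x:xmpmeta>", data, re.S)
    if not m:
        return {}
    try:
        root = ET.fromstring(m.group(0))
    except ET.ParseError:
        return {}
    found = {}
    for tag, key in (("title", "Title"), ("creator", "Author")):
        el = root.find(f".//{DC}{tag}")
        if el is not None:
            text = " ".join(t.strip() for t in el.itertext() if t.strip())
            if text:
                found[key] = text
    return found


def pdf_metadata(data: bytes) -> dict:
    """Merge both sources, letting XMP override the older Info fields."""
    merged = info_fields(data)
    merged.update(xmp_fields(data))
    return merged
```

You would call it as pdf_metadata(open(path, "rb").read()) for each file and sort on whatever fields come back. Files that return an empty dictionary are the ones that need the harder approach of looking at the stored text.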