This is still a very broad request set, but a few comments:
Xpdf can be patched to disregard copy/edit/print restrictions (those
set with the master rather than the user password) - although the
author has a statement on cracking -
http://www.foolabs.com/xpdf/cracking.html.
You can see a fast sample patch for 3.0.2 (verified) here:
http://www.cs.cmu.edu/~dst/Adobe/Gallery/XPDF/hovland.txt
and general instructions for older versions here:
http://www.cs.cmu.edu/~dst/Adobe/Gallery/xpdf-generic-patch.html
These types of patches violate the Adobe implementation spec. FWIW.
If you'd rather try to brute force passwords, you can always try
pdfcrack (http://pdfcrack.sourceforge.net/). This may take a very long
time on 128-bit encrypted PDFs depending on the speed of your hardware
(although such tools also support masks, dictionaries, etc.). Older
40-bit RC4 encrypted PDFs can generally be cracked rapidly with this
and other tools (same for older .doc files - Googling will find you
dozens of programs on and offline that do this).
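
To put rough numbers on the 40-bit vs. 128-bit difference, here is a
back-of-the-envelope sketch of worst-case exhaustive key search. The
rate of one million keys per second is purely an assumption for
illustration, not a benchmark of pdfcrack or any other tool. It also
shows why, for 128-bit files, tools go after the password (masks,
dictionaries) rather than the key itself.

```python
def exhaustive_search_seconds(key_bits: int, keys_per_second: float) -> float:
    """Worst-case time, in seconds, to try every key of the given length."""
    return (2 ** key_bits) / keys_per_second


# Hypothetical trial rate; real throughput depends on the tool and hardware.
ASSUMED_RATE = 1_000_000  # keys per second

SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

days_40 = exhaustive_search_seconds(40, ASSUMED_RATE) / SECONDS_PER_DAY
years_128 = exhaustive_search_seconds(128, ASSUMED_RATE) / SECONDS_PER_YEAR

print(f"40-bit keyspace:  about {days_40:.1f} days")
print(f"128-bit keyspace: about {years_128:.2e} years")
```

At that assumed rate the 40-bit keyspace is exhausted in under two
weeks, while the 128-bit keyspace is out of reach by many orders of
magnitude.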
40-bit keys can be efficiently recovered (if you have a lot of
disk space and tight time requirements) with rainbow tables; you can
buy them (and the associated tools) from companies like Elcomsoft or
run something like Cain and Abel if you want a front-end, or get free
tables from http://www.freerainbowtables.com/ and run RainbowCrack
(http://project-rainbowcrack.com/). Note that RC4 table support is not
actually included in the free tools I've listed; I'm just making a
point about hash cracking in general.
Kam Woods
Postdoctoral Research Associate
School of Information and Library Science, University of North
Carolina at Chapel Hill
On Fri, Jan 20, 2012 at 9:01 AM, Farrell, Larry D wrote:
> At this point I was primarily targeting PDF and Microsoft Office files that would be passed on to our cataloging folks for manual inspection if they were DRM protected. As has been pointed out on the list, general DRM detection is far trickier than I'd initially thought. I've been using Apache Tika for file type detection, metadata and full text extraction. However, when parsing encrypted or password protected files it throws the less than helpful "Unexpected Runtime Exception".
>
> Dean