At this point I was primarily targeting PDF and Microsoft Office files that would be passed on to our cataloging folks for manual inspection if they were DRM protected.  As has been pointed out on the list, general DRM detection is far trickier than I'd initially thought.  I've been using Apache Tika for file type detection, metadata, and full-text extraction.  However, when parsing encrypted or password-protected files it throws the less than helpful "Unexpected Runtime Exception".
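
For what it's worth, here is a rough sketch of the kind of pre-check that could flag likely encrypted files before they ever reach the parser. It does not use Tika at all; it only looks at raw bytes. The function name `looks_encrypted` and the extension list are made up for illustration. It assumes that an encrypted PDF carries an `/Encrypt` entry in its trailer, and that a password-protected .docx/.xlsx/.pptx is stored as an OLE2 container rather than a plain ZIP. It is a heuristic only: it will miss other DRM schemes and can produce false positives.

```python
from pathlib import Path

# Signature at the start of every OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

# Hypothetical list of extensions to treat as OOXML; adjust as needed.
OOXML_EXTENSIONS = {".docx", ".xlsx", ".pptx"}


def looks_encrypted(name: str, data: bytes) -> bool:
    """Guess whether a file is encrypted, from its name and raw bytes.

    Heuristic only: a True result means "send for manual inspection",
    not "definitely DRM protected".
    """
    if data.startswith(b"%PDF-"):
        # Encrypted PDFs reference an encryption dictionary via /Encrypt.
        # The same byte string could also occur elsewhere in the file,
        # so this can give false positives.
        return b"/Encrypt" in data

    if Path(name).suffix.lower() in OOXML_EXTENSIONS:
        # An unprotected OOXML file is a ZIP archive. A password-protected
        # one is wrapped in an OLE2 container instead.
        return data.startswith(OLE2_MAGIC)

    return False
```

Files flagged this way could go straight to the manual inspection queue, and anything that still fails in the parser could be treated as "unknown" rather than "encrypted".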

Dean