We are beginning a digitization project at my institution that involves
scanning archival documents that may contain personally identifiable
information (PII), such as Social Security numbers or credit card numbers.
I'm looking for a tool that will examine the resulting PDFs and flag the
ones that may contain PII, so that we can then redact them.
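
To illustrate the kind of check I have in mind, here is a minimal sketch
in Python. It assumes the text of each scanned page has already been
extracted (for example by an OCR step, which is not shown), and the
patterns and the names find_pii and luhn_ok are hypothetical, chosen only
for illustration. It looks for SSN-shaped strings and for 13 to 16 digit
runs that pass the Luhn checksum used by credit card numbers:

```python
import re

# Assumed format: SSNs written as 123-45-6789, with dashes or spaces.
SSN_RE = re.compile(r"\b\d{3}[- ]\d{2}[- ]\d{4}\b")
# Assumed format: 13 to 16 digits, optionally separated by single spaces or dashes.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text: str) -> dict:
    """Return candidate SSNs and card numbers found in already-extracted text."""
    ssns = SSN_RE.findall(text)
    cards = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):  # drop digit runs that cannot be card numbers
            cards.append(match.group())
    return {"ssn": ssns, "card": cards}
```

A document would be flagged for review if either list comes back
non-empty. Pattern matching like this will produce false positives and
will miss anything the OCR misreads, so it is only a first pass.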

I've experimented a bit with Bulk Extractor Viewer but haven't been able to
get it to work on the scanned PDFs I've created. I also talked to a sales
rep at Spirion, but that program seems like overkill for our purposes. Any
suggestions for other tools to try would be appreciated!

Digital Production Coordinator
Northeastern University Library