Could you use the patterns feature in Acrobat's redaction tools, together with regex? See: http://blogs.adobe.com/acrolaw/2011/05/creating_and_using_custom_redact/
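To give a rough idea of what pattern matching for the two examples in the quoted message (Social Security numbers and credit card numbers) might look like outside Acrobat, here is a minimal Python sketch. It assumes the text has already been extracted from the PDF (for scanned pages that would mean an OCR step, which is not shown). The names `SSN_RE`, `CARD_RE`, `luhn_ok`, and `find_pii` are made up for illustration, and the patterns are deliberately simple, so expect both false positives and misses on real archival material.

```python
import re

# Illustrative patterns only (assumptions, not a vetted PII ruleset).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")       # e.g. 123-45-6789
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")   # 13 to 16 digits, optional spaces/hyphens


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text: str) -> dict:
    """Return candidate SSNs and card numbers found in already-extracted text."""
    ssns = SSN_RE.findall(text)
    cards = []
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if luhn_ok(digits):  # drop digit runs that cannot be valid card numbers
            cards.append(m.group())
    return {"ssn": ssns, "card": cards}


if __name__ == "__main__":
    sample = "SSN 123-45-6789 and card 4111 1111 1111 1111, phone 615-555-0100"
    print(find_pii(sample))
```

The Luhn check is there only to cut down on false hits from long digit runs such as accession or account numbers. Any file with a non-empty result would still need a human look before redaction.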
Jenny Lane | NPL | 615-880-1622
From: Code for Libraries On Behalf Of Kimberly Kennedy
Sent: Friday, April 19, 2019 12:26 PM
Subject: [CODE4LIB] Looking for lightweight tool to identify PII
We are beginning a digitization project at my institution that involves
scanning archival documents that may contain personally identifiable
information (PII), such as Social Security numbers or credit card numbers. I'm
looking for a tool that will examine the PDFs and identify the ones that
may contain PII, so we can then redact them.
I've experimented a bit with Bulk Extractor Viewer but haven't been able to
get it to work on the scanned PDFs I've created. I talked to a sales rep
at Spirion and that program seems like overkill for our purposes. Any
suggestions for other things to try would be appreciated!
Digital Production Coordinator
Northeastern University Library