Could you use the patterns feature in Acrobat and regex?
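
For the regex half of that idea, here is a rough sketch in Python. It assumes the scanned PDFs have already been run through OCR and the text pulled out by some other tool (that step is not shown). The patterns are illustrative and US-centric, and the names SSN_RE, CARD_RE, luhn_ok and find_pii are made up for this example. Treat hits as candidates for a person to review, not as confirmed PII.

```python
import re

# Assumed, illustrative patterns: SSNs written as 3-2-4 digits, and
# card numbers as 13-16 digits with optional spaces or hyphens.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text: str) -> dict:
    """Return candidate SSNs and card numbers found in already-OCR'd text."""
    ssns = SSN_RE.findall(text)
    cards = []
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if luhn_ok(digits):  # drop digit runs that cannot be card numbers
            cards.append(m.group())
    return {"ssn": ssns, "card": cards}
```

Calling find_pii() on each document's extracted text and flagging any file with a non-empty result would give a list of PDFs to check by hand. The Luhn check is only there to cut down on false positives from long digit runs such as accession or invoice numbers.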

Jenny Lane | NPL | 615-880-1622

-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Kimberly Kennedy
Sent: Friday, April 19, 2019 12:26 PM
To: [log in to unmask]
Subject: [CODE4LIB] Looking for lightweight tool to identify PII

We are beginning a digitization project at my institution that involves
scanning archival documents that may contain personally identifiable
information (PII), such as Social Security numbers or credit card numbers.  I'm
looking for a tool that will examine the PDFs and identify the ones that
may contain PII, so we can then redact them.

I've experimented a bit with Bulk Extractor Viewer but haven't been able to
get it to work on the scanned PDFs I've created.  I talked to a sales rep
at Spirion and that program seems like overkill for our purposes.  Any
suggestions for other things to try would be appreciated!



Kimberly Kennedy
Digital Production Coordinator
Northeastern University Library
[log in to unmask]