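Yes. Use iText or PDFBox.
Both are widely used PDF libraries that give you programmatic access to a tagged PDF's structure, so you can pull out the ActualText yourself and feed it to whatever indexer you like.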
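
For what it's worth, here is a rough sketch of the search side once the ActualText strings have been extracted. It deliberately uses only the Java standard library. The extraction step is assumed to happen upstream (in PDFBox, for example, structure elements expose the value through PDStructureElement.getActualText()). The class name ActualTextIndex and its methods are hypothetical, not part of either library. It matches by substring rather than by word, on the assumption that whitespace tokenizing is unreliable for scripts written without spaces, and it normalizes to NFC so canonically equivalent sequences compare equal.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of iText or PDFBox.
class ActualTextIndex {
    // Document id -> concatenated, NFC-normalized ActualText (insertion order kept).
    private final Map<String, StringBuilder> textByDoc = new LinkedHashMap<>();

    // Normalize to NFC so canonically equivalent code point sequences match.
    private static String normalize(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Record one ActualText string that a PDF library already extracted.
    public void add(String docId, String actualText) {
        textByDoc.computeIfAbsent(docId, k -> new StringBuilder())
                 .append(normalize(actualText))
                 .append('\n'); // separator so matches cannot span two entries
    }

    // Return the ids of documents whose ActualText contains the query.
    public List<String> search(String query) {
        String q = normalize(query);
        List<String> hits = new ArrayList<>();
        if (q.isEmpty()) {
            return hits; // an empty query matches nothing
        }
        for (Map.Entry<String, StringBuilder> e : textByDoc.entrySet()) {
            if (e.getValue().indexOf(q) >= 0) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        ActualTextIndex index = new ActualTextIndex();
        // Devanagari "namaste", written with escapes to keep the source ASCII-only.
        index.add("a.pdf", "\u0928\u092E\u0938\u094D\u0924\u0947");
        // Precomposed e-acute in the document, decomposed form in the query.
        index.add("b.pdf", "caf\u00E9");
        System.out.println(index.search("cafe\u0301"));
    }
}
```

A linear scan like this is only sensible for a small collection. For anything larger, the same extracted strings could be handed to a real search engine instead.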
On 2/6/16, 2:24 PM, "Code for Libraries on behalf of Andrew Cunningham" wrote:
>I am working with PDF files in some South Asian and South East Asian
>languages. Each PDF has ActualText added for each tag in the PDF. Each PDF
>has ActualText as an alternative for the visible text layer in the PDF.
>Is anyone aware of tools that will allow me to index and search PDFs based
>on the ActualText content rather than the visible text layers in the PDF?