We also use pdftotext and have been happy with it.
--
Chad Mills
Programming Coordinator
Ph: 732.932.8573 x123
Fax: 732.932.1386
Cell: 732.309.8538
Rutgers University Libraries
Scholarly Communication Center
Room 409D, Alexander Library
169 College Avenue, New Brunswick, NJ 08901
http://rucore.libraries.rutgers.edu/
----- Original Message -----
From: "Eric Lease Morgan" <[log in to unmask]>
To: [log in to unmask]
Sent: Tuesday, June 21, 2011 10:28:39 AM
Subject: Re: [CODE4LIB] PDF->text extraction
On Jun 21, 2011, at 10:23 AM, Owen Stephens wrote:
> We've tried iText but had issues with quality
> We moved to PDFBox but are having performance issues
I have been satisfied with pdftotext, which is part of the Xpdf suite of tools -- http://bit.ly/kIHD1x
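
For anyone wanting to try it from a script, here is a minimal sketch of calling pdftotext from Python. It assumes the pdftotext binary is installed and on the PATH. The function names build_pdftotext_command and extract_text are hypothetical helpers made up for illustration and are not part of Xpdf. The -enc and -layout options, and "-" as the output file to write to standard output, are standard pdftotext options.

```python
import shutil
import subprocess


def build_pdftotext_command(pdf_path, keep_layout=False):
    """Return the argument list for a pdftotext call that writes to stdout.

    Hypothetical helper for illustration; not part of the Xpdf tools.
    """
    command = ["pdftotext", "-enc", "UTF-8"]
    if keep_layout:
        # -layout asks pdftotext to preserve the physical page layout
        command.append("-layout")
    # "-" as the output file name sends the extracted text to stdout
    command.extend([pdf_path, "-"])
    return command


def extract_text(pdf_path, keep_layout=False):
    """Run pdftotext on one PDF and return the extracted text as a string."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on the PATH")
    result = subprocess.run(
        build_pdftotext_command(pdf_path, keep_layout),
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text("article.pdf") would return the text of a (hypothetical) article.pdf as one string. Launching one process per file is simple, but for large batches the process start-up cost may be worth measuring.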
--
Eric Lease Morgan
University of Notre Dame
(574) 631-8604