We also use pdftotext and have been happy with it.

--
Chad Mills
Programming Coordinator
Ph: 732.932.8573 x123
Fax: 732.932.1386
Cell: 732.309.8538

Rutgers University Libraries
Scholarly Communication Center
Room 409D, Alexander Library
169 College Avenue, New Brunswick, NJ 08901

http://rucore.libraries.rutgers.edu/

----- Original Message -----
From: "Eric Lease Morgan" <[log in to unmask]>
To: [log in to unmask]
Sent: Tuesday, June 21, 2011 10:28:39 AM
Subject: Re: [CODE4LIB] PDF->text extraction

On Jun 21, 2011, at 10:23 AM, Owen Stephens wrote:

> We've tried iText but had issues with quality
> We moved to PDFBox but are having performance issues


I have been satisfied with pdftotext, which is part of the Xpdf suite of tools -- http://bit.ly/kIHD1x

-- 
Eric Lease Morgan
University of Notre Dame

(574) 631-8604