An edge case, but I've been using the pdftools package (https://cran.r-project.org/web/packages/pdftools/index.html) in R recently, together with the udpipe package (https://cran.r-project.org/web/packages/udpipe/index.html), and it just... works!
From: Code for Libraries On Behalf Of Cassie Tanks
Sent: Tuesday, February 23, 2021 9:35 AM
Subject: [EXT] Re: [CODE4LIB] PDF Editors
Charles asked the exact question I was tasked with figuring out this week.
Thank you all for your suggestions. Super helpful!
On Mon, Feb 22, 2021 at 5:24 PM Hammer, Erich F wrote:
> Are you working with PDFs with OCR'd and/or indexed text? If so, just
> about any PDF reader will allow copying the text out (if the PDF isn't
> protected). SumatraPDF (https://www.sumatrapdfreader.org) is my
> choice for a functional reader with a much lower risk than "fully functional"
> If you need to OCR scanned documents you might try NAPS2 (
> If you are looking to automate OCR using scripts, take a look at
> Tesseract (https://github.com/tesseract-ocr/tesseract).
> On Monday, February 22, 2021 at 16:15, Charles Meyer eloquently inscribed:
> > Hi my esteemed listmates,
> > My bad if I missed this, but I’m looking for a downloadable (not
> > online) PDF editor.
> > I want to be able to copy “language” out of a PDF I receive and
> > paste it as plain text into a word processor document.
> > Can you please recommend PDF editors you’ve actually used which
> > worked well?
> > Thank you!
> > Charles.
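
For the scripted OCR route mentioned above, a minimal Python sketch of batch-running Tesseract might look like the following. It assumes the `tesseract` executable is installed and on the PATH, and that the scanned pages have already been exported as PNG images (Tesseract reads image files, not PDFs). The function names and the directory layout are hypothetical, made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argv for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_images(image_dir, out_dir, lang="eng"):
    """OCR every PNG in image_dir and return the text files written.

    Hypothetical helper: assumes one PNG per scanned page.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for image in sorted(Path(image_dir).glob("*.png")):
        output_base = out_dir / image.stem
        subprocess.run(
            build_tesseract_command(image, output_base, lang),
            check=True,  # raise if tesseract exits with an error
        )
        # Tesseract appends ".txt" to the output base on its own.
        written.append(Path(str(output_base) + ".txt"))
    return written
```

Calling `ocr_images("scans", "text")` would then leave one plain-text file per page image in `text/`, ready to paste into a word processor.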