Did anyone already suggest Mendeley? I think it will do this for you with zero coding whatsoever. In fact, you can point Mendeley at the directory and it will pull the files in automatically and rename the PDFs if you have it set that way.

Of course this only works with published research articles - coding is needed for the general case.
Christina

-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Alexander Duryee
Sent: Friday, January 15, 2016 11:19 AM
To: [log in to unmask]
Subject: Re: [CODE4LIB] A smart bulk file name editor?

Amy,

It sounds like this is a three-step process for each file:

1) Feed the PDF (as a data blob) into a script
2) Parse out the data that you're looking for (title, author, year)
3) Build a string using your parsed data, and move the file to that new filename
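
As a rough illustration of step 3, here is a minimal Python sketch that builds a name in the requested firstauthor_firstfewwordsoftitle_year.pdf pattern. It assumes the author, title, and year have already been extracted somehow; the function names (slug, build_filename) and the max_words cutoff are made up for this example, not taken from any existing tool.

```python
import re
import unicodedata


def slug(text):
    """Lowercase the text, fold accents to ASCII, and keep only letters and digits."""
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "", ascii_text.lower())


def build_filename(first_author, title, year, max_words=4):
    """Return a name like firstauthor_firstfewwordsoftitle_year.pdf.

    max_words is an assumed cutoff for "first few words of title".
    """
    words = [slug(word) for word in title.split()]
    words = [word for word in words if word][:max_words]  # drop words that slug to nothing
    return f"{slug(first_author)}_{''.join(words)}_{year}.pdf"


if __name__ == "__main__":
    print(build_filename("Schuler", "A Smart Bulk File Name Editor?", 2016))
```

Characters with no ASCII equivalent are simply dropped here, so names in non-Latin scripts would need a different approach.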

Steps 1 and 3 should be simple with any scripting language; unfortunately, step 2 may be very difficult. PDF is not a structured data format, so there's no guarantee that the data you need can be easily parsed out. If the PDFs were uniformly generated (e.g. they were all generated from LaTeX markup or a single content management system) then it may be possible to parse the information out of the files. If not - for example, if the PDFs consist of scanned pages - then you'll need to get that data from elsewhere (perhaps from an existing catalog), create the new filenames that way, and feed that list into a script/tool to rename the files.
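
For the fallback case, where the new names come from a list rather than from the PDFs themselves, a script along these lines might do. This is only a sketch under assumptions: the list is a CSV file with header columns named old_name and new_name (names invented for this example), and all the PDFs sit in one directory. It defaults to a dry run so you can check the plan before anything is moved.

```python
import csv
from pathlib import Path


def rename_from_list(list_path, pdf_dir, dry_run=True):
    """Rename files in pdf_dir according to a CSV of old_name,new_name rows.

    Returns the (old, new) pairs that were (or, in a dry run, would be) renamed.
    The CSV column names are assumptions made for this sketch.
    """
    pdf_dir = Path(pdf_dir)
    planned = []
    with open(list_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            source = pdf_dir / row["old_name"]
            target = pdf_dir / row["new_name"]
            # Skip rows whose source is missing, and never overwrite an existing file.
            if not source.is_file() or target.exists():
                continue
            planned.append((source.name, target.name))
            if not dry_run:
                source.rename(target)
    return planned
```

Calling rename_from_list("names.csv", "papers") returns the planned renames without touching anything; passing dry_run=False performs them.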

Best of luck,
--Alex

On Fri, Jan 15, 2016 at 11:06 AM, Chris Moschini <[log in to unmask]> wrote:

> It won't surprise you that coders do this all the time, so there are 80 
> ways to do it; your peril is choice, not scarcity.
>
> Although there are a ton of tools that will do this for non-coders:
> https://www.google.com/webhp?q=file%20renamer
>
> On Windows robocopy is popular.
>
> The truth is though most coders just pick the programming language of 
> their choice and go for it. The most common is Bash and regex. Bash is 
> built-in to Linux and Macs and pretty easy to get onto Windows 
> <https://git-for-windows.github.io/> <https://www.cygwin.com/>. It's an old and ugly language 
> but it's also the kitchen sink of "I just need to do this quick 
> thing." That said if you dislike old and ugly languages or unintuitive 
> syntax or command names, pick a programming language you do like, or one of the tools above.
>
>
> On Fri, Jan 15, 2016 at 10:56 AM, Amy Schuler 
> <[log in to unmask]>
> wrote:
>
> > Hi,
> > I'm looking for a smart bulk file editor, if it exists.  
> > Specifically I'd like it to be able to move through a list of PDF 
> > files that are published research papers, and rename them in this 
> > approximate format, based on the contents of the file:
> > firstauthor_firstfewwordsoftitle_year.pdf
> >
> > I know this is probably a crazy dream.  The bulk file editors that I 
> > know about are more simple.  They can bulk rename files according to 
> > a pre-set pattern or they just remove/add/re-position bits from the 
> > existing file string.
> >
> > Thanks!
> >
> > Amy Schuler
> > Cary Institute of Ecosystem Studies
> > [log in to unmask]
> >
>