> On Dec 15, 2022, at 9:27 AM, Eric Lease Morgan <[log in to unmask]> wrote:
> 
> How can I use the Firefox, Chrome, and/or Safari Web browsers to batch download the content found at the other end of a list of URLs?

Ugh.

I spent YEARS trying to find a solution to this, as we would have scientists who had to download hundreds or thousands of files.

I don’t have my notebooks on me, but I had developed various scripts that worked in some browsers.

From what I remember off the top of my head:

You *can* trigger downloads via JavaScript, but the files have to be served with a MIME type that the browser won’t try to display on its own.  (I might have also tweaked the server to send Content-Disposition headers.)
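
For the server side of that, here is a minimal sketch in Python of the response headers that usually push a browser to save a file instead of rendering it. The function name `download_headers` is made up for illustration, and the choice of `application/octet-stream` is an assumption; it is not the configuration used on the original server.

```python
from urllib.parse import quote


def download_headers(filename: str) -> dict[str, str]:
    """Build headers asking a browser to save the response as a file.

    Hypothetical helper for illustration only.
    """
    # Plain quoted-string fallback: replace non-ASCII characters and
    # escape backslashes and double quotes.
    ascii_name = filename.encode("ascii", "replace").decode("ascii")
    ascii_name = ascii_name.replace("\\", "\\\\").replace('"', '\\"')

    # Extended form (RFC 6266 / RFC 5987) carrying the UTF-8 name,
    # percent-encoded.
    encoded_name = quote(filename, safe="")

    disposition = (
        'attachment; filename="' + ascii_name + '"'
        + "; filename*=UTF-8''" + encoded_name
    )
    return {
        # A generic binary type that browsers do not try to display inline.
        "Content-Type": "application/octet-stream",
        "Content-Disposition": disposition,
    }
```

Whatever web server or framework is in use would then attach these two headers to each file response; how that is wired up depends on the server.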

You have to set delays in the loop, as some browsers will take it to be a pop up / pop under attack and stop the whole thing.

At least one browser would prompt you for what you wanted to do with each file, which got a bit cumbersome.  One of them gave up asking after 25 or 50 files or so… but I don’t recall if that was because it stopped downloading.

I was hoping to be able to do something like serve the user a metalink file (http://www.metalinker.org/), and then trigger the browser’s download manager to handle it, but browsers don’t support that.  At least, not when I last did my testing.

(Another programmer on our team wrote a Java client that would download a list of files in parallel, but it got repurposed for another task and the client front end was never finished.  Later versions were adapted to interact with a homegrown file management system that one of the science teams required us to use, and so took file IDs rather than URLs (and then contacted mirror sites to determine who had the files).  It had some nice features, but you’d probably be better off finding a download manager that’s better supported than trying to get that code to work in a more generic manner.)

My next attempt was going to be serving sparse BagIt files (empty, with a fetch.txt file; see https://www.rfc-editor.org/rfc/rfc8493#section-2.2.3), and writing a client that knew how to handle them, but I got reassigned to deal with trying to keep the damned data system from crashing constantly (while not being allowed to actually make changes to the code).
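
As a rough idea of what the first step of such a client might look like, here is a minimal Python sketch that parses the contents of a fetch.txt file. It assumes the line layout described in RFC 8493 section 2.2.3 (URL, length in octets or "-", then file path, separated by whitespace). The names `FetchEntry` and `parse_fetch_txt` are made up for illustration; no such client was ever written, and the actual downloading and checksum verification are left out.

```python
from typing import NamedTuple, Optional


class FetchEntry(NamedTuple):
    url: str
    length: Optional[int]  # None when the file gives "-" (size unknown)
    path: str              # destination path inside the bag


def parse_fetch_txt(text: str) -> list[FetchEntry]:
    """Parse fetch.txt content into a list of entries.

    Hypothetical helper; raises ValueError on a malformed line.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # URLs contain no whitespace, but the path may, so split
        # at most twice.
        url, length, path = line.split(None, 2)
        size = None if length == "-" else int(length)
        # Undo the percent-encoding that fetch.txt requires for
        # LF, CR, and "%" in paths; "%25" is decoded last.
        for encoded, char in (("%0A", "\n"), ("%0a", "\n"),
                              ("%0D", "\r"), ("%0d", "\r"),
                              ("%25", "%")):
            path = path.replace(encoded, char)
        entries.append(FetchEntry(url, size, path))
    return entries
```

A client could then loop over the returned entries and hand each URL to whatever download tool it prefers, writing the result to the matching path.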

… oh, and all of this happened somewhere between 5 and 15 years ago.  I think I redid some of the tests 2 or 3 years ago, but I don’t remember if things had improved or gotten worse.

-Joe