Blocking the IP is the obvious solution but not ideal at all. First off,
it's trivially easy to bypass IP blacklists using proxies. I don't want to
play a game of never-ending IP whack-a-mole. Second, it notifies the
attacker that we are onto them, which makes it less likely for us to catch
them. We want to figure out which accounts are compromised so that we can
fix the problem at the source rather than treating symptoms. If EZproxy is
being abused, then other, more valuable systems at the university are just
as likely to be abused, since logins are shared across many systems.
Josh Welker
-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Kyle
Banerjee
Sent: Thursday, November 20, 2014 12:07 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] Balancing security and privacy with EZproxy
Personally, I'd be tempted to go the IP lockout route myself since the
patterns should be clear in the logs, but be aware that # megabytes gives a
reasonable level of control because you can set it to log rather than lock
out.
I think the risk of locking legitimate users is low. Although people can
download mixed materials, my guess is that your abusing accounts are not
watching loads of video.
There are things you can do with user names that would make it easy enough
to uncover abuse without unduly compromising privacy. For example, you could
flush your logs frequently while extracting the number of downloads you're
interested in from individual users. Abuse accounts will be immediately
obvious. BTW, you can do some funky things with EZP that include conditional
logic, regexp searches, and rewriting that might be helpful.
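
A minimal sketch of that counting idea, in Python rather than EZproxy's own
config language. It assumes the log uses an NCSA-style line (host, ident,
user, timestamp, request, status, bytes) with the user name in the third
field; your LogFormat may differ. The function names and the secret key are
hypothetical. User names are replaced with a keyed hash so the per-user
counts can be kept after the raw logs are flushed.

```python
import hashlib
import hmac
import re
from collections import Counter

# Assumed line layout: host ident user [time] "METHOD url ..." status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<url>[^\s"]+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def pseudonym(user: str, key: bytes) -> str:
    """Return a keyed hash of the user name (hypothetical privacy measure)."""
    return hmac.new(key, user.encode("utf-8"), hashlib.sha256).hexdigest()[:12]


def pdf_downloads_per_user(lines, key: bytes) -> Counter:
    """Count successful PDF requests per pseudonymized user."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        # Skip unparseable lines, anonymous requests, and non-200 responses.
        if not m or m["user"] == "-" or m["status"] != "200":
            continue
        path = m["url"].split("?", 1)[0].lower()
        if path.endswith(".pdf"):
            counts[pseudonym(m["user"], key)] += 1
    return counts


def flag_heavy_users(counts: Counter, threshold: int):
    """Return pseudonyms whose PDF count meets or exceeds the threshold."""
    return sorted(u for u, n in counts.items() if n >= threshold)
```

Run it over each log file before deleting the file, keep only the counts, and
look up the real account (with the same key) only for pseudonyms that get
flagged. The threshold is something you would have to tune locally.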
Any path you take will protect user privacy far more than just about any
other site they visit. Plus, whoever maintains your network will
occasionally need to monitor specific computers to mitigate a wide variety
of problems. Systems used as a platform for abusive behavior, harassment, or
activity that causes harm to others get locked out and/or blacklisted, which
will really hose your users. Getting that kind of thing cleared up takes
time because most places aren't nearly as forgiving as libraries.
kyle
On Wed, Nov 19, 2014 at 8:47 PM, Dan Scott <[log in to unmask]> wrote:
> On Wed, Nov 19, 2014 at 4:06 PM, Kyle Banerjee
> <[log in to unmask]>
> wrote:
>
> > There are a number of technical approaches that could be used to
> > identify which accounts have been compromised.
> >
> > But it's easier to just make the problem go away by setting usage
> > limits so EZP locks the account out after it downloads too much.
> >
>
> But EZproxy still doesn't let you set limits based on the type of
> download. You therefore have two very blunt sledgehammers with
> UsageLimit:
>
> - # of downloads (-transfers)
> - # of megabytes downloaded (-MB)
>
> # of downloads is effectively useless because many of our electronic
> resource platforms (hi ProQuest and EBSCOhost) make between 50 and 150
> requests for JavaScript, CSS, and images per page, so you have to set
> your thresholds incredibly high to avoid locking out users who might
> be actively paging through search results. Any savvy abuser will just
> script their requests to avoid all of the JS/CSS/images to derive a
> list of PDFs, and then download just the PDFs, thereby staying well
> under the usage limits that legit users require... and I've seen
> exactly that happen through our proxy.
>
> # of megabytes downloaded is a pretty blunt tool as well, given that
> our multimedia-enriched databases now often serve up video and audio
> as well as HTML, images, and PDF files. For the pure audio and video
> streaming sites such as Naxos or Curio, you can set higher limits; but
> as vendors increasingly enrich their databases with audio and video,
> you're going to have to increase your general limits as well... and
> you can pull down a ton of PDFs under that cover.
>
> So no, I don't think it's easy to make the problem go away through the
> suggested approach, unless you're willing to err on the side of
> locking out legitimate users.
>
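
The pattern described in the quoted message (an abuser fetching only PDFs
while legitimate browsing pulls in dozens of JS/CSS/image requests per page)
can be looked for in the logs even though UsageLimit cannot act on it. The
following is only a sketch under the same assumed NCSA-style log layout as
above; the function names, suffix list, and thresholds are hypothetical and
would need tuning against real traffic.

```python
import re
from collections import defaultdict

# Assumed line layout: host ident user [time] "METHOD url ..."
REQUEST = re.compile(r'^\S+ \S+ (?P<user>\S+) \[[^\]]*\] "\S+ (?P<url>[^\s"]+)')
ASSET_SUFFIXES = (".js", ".css", ".png", ".gif", ".jpg", ".jpeg", ".ico")


def request_mix(lines):
    """Tally PDF, page-asset, and other requests for each user."""
    mix = defaultdict(lambda: {"pdf": 0, "asset": 0, "other": 0})
    for line in lines:
        m = REQUEST.match(line)
        if not m or m["user"] == "-":
            continue
        path = m["url"].split("?", 1)[0].lower()
        if path.endswith(".pdf"):
            kind = "pdf"
        elif path.endswith(ASSET_SUFFIXES):
            kind = "asset"
        else:
            kind = "other"
        mix[m["user"]][kind] += 1
    return dict(mix)


def scripted_looking(mix, min_pdfs=20, max_assets_per_pdf=1.0):
    """Flag users with many PDFs but few of the asset requests a browser makes."""
    flagged = []
    for user, c in mix.items():
        if (c["pdf"] > 0 and c["pdf"] >= min_pdfs
                and c["asset"] / c["pdf"] <= max_assets_per_pdf):
            flagged.append(user)
    return sorted(flagged)
```

This only reports candidates for a human to review; it locks nobody out, so
a bad threshold costs some false alarms rather than a locked-out user.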