Yeah, I think it ends up being pretty hard to create general-purpose
solutions to this sort of thing that are both not-monstrous-to-use and
flexible enough to do what everyone wants. Which is why most of the
'data warehouse' solutions you see end up being so terrible, in my
opinion.

I am not sure if there is any product specifically focused on library
usage/financial data -- that might end up being somewhat less monstrous.
It seems that the more you focus your use case (instead of trying to
provide for general "data warehouse and analysis"), the more likely it
is that a software provider can come up with something that isn't insane.
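
To make "focus your use case" concrete, here is a rough sketch of about
the narrowest version of the problem: joining usage counts from a couple
of sources with cost data to get cost per use. Everything in it is an
assumption for illustration only: the function name cost_per_use, the
column names (title, source, uses, cost), and the sample rows are all
made up, and real vendor reports would need cleanup and title matching
that this ignores.

```python
import csv
import io
from collections import defaultdict


def cost_per_use(usage_rows, cost_rows):
    """Join usage and cost rows on a shared 'title' key (hypothetical schema)."""
    # Sum uses per title across all sources (vendor reports, proxy logs, ...)
    uses = defaultdict(int)
    for row in usage_rows:
        uses[row["title"]] += int(row["uses"])

    result = {}
    for row in cost_rows:
        title = row["title"]
        total = uses.get(title, 0)
        # None marks titles that were paid for but show no recorded use
        result[title] = round(float(row["cost"]) / total, 2) if total else None
    return result


# Made-up sample data standing in for a usage report and an invoice export
usage_csv = """title,source,uses
Journal A,vendor_report,120
Journal A,proxy_log,30
Journal B,vendor_report,10
"""
cost_csv = """title,cost
Journal A,1500.00
Journal B,400.00
Journal C,250.00
"""

usage = list(csv.DictReader(io.StringIO(usage_csv)))
costs = list(csv.DictReader(io.StringIO(cost_csv)))
report = cost_per_use(usage, costs)

for title, cpu in sorted(report.items()):
    print(title, cpu)
```

The hard parts Jason mentions (comparing usage stats across vendors,
shifts in journal aggregators) all live in getting the data into that
shared shape, not in the join itself.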

At the 2011 Code4Lib Conf, Thomas Barker from UPenn presented on some
open source software they were developing (based on putting together
existing open source packages to be used together) to provide
library-oriented 'data warehousing'. I was interested that he talked
about how their _first_ attempt at this ended up being the sort of
monstrous flexible-but-impossible-to-use sort of solution we're talking
about, but they tried to learn from their experience and start over,
thinking they could do better. I'm not sure what the current status of
that project is. I'm also not sure whether any 2011 Code4Lib conf video
is available online. If it is, it doesn't seem to be linked from the
conf presentation pages the way it was in past years:

On 9/13/2011 5:37 PM, Jason Stirnaman wrote:
> Thanks, Shirley! I remember seeing that before but I'll look more closely now.
> I know what I'm describing is also known, typically, as a data warehouse. I guess I'm trying to steer around the usual solutions in that space. We do have an Oracle-driven data warehouse on campus, but the project is in heavy transition right now and we still had to do a fair amount of work ourselves just to get a few data sources into it.
> Jason Stirnaman
> Biomedical Librarian, Digital Projects
> A.R. Dykes Library, University of Kansas Medical Center
>>>> On 9/13/2011 at 04:25 PM, in message<[log in to unmask]>, Shirley Lincicum<[log in to unmask]> wrote:
> Check out: http://www.needlebase.com/
> It was not developed specifically for libraries, but it supports data
> aggregation, analysis, web scraping, and does not require programming
> skills to use.
> Shirley Lincicum
> Librarian, Western Oregon University
> On Tue, Sep 13, 2011 at 2:08 PM, Jason Stirnaman<[log in to unmask]> wrote:
>> Does anyone have suggestions or recommendations for platforms that can aggregate usage data from multiple sources, combine it with financial data, and then provide some analysis, graphing, data views, etc?
>> From what I can tell, something like Ex Libris' Alma would require all "fulfillment" transactions to occur within the system.
>> I'm looking instead for something like Splunk that would accept log data, circulation data, usage reports, costs, and Sherpa/Romeo authority data but then schematize it for data analysis and maybe push out reporting dashboards<nods to Brown Library http://library.brown.edu/dashboard/widgets/all/>
>> I'd also want to automate the data retrieval, so that might consist of scraping, web services, and FTP, but that could easily be handled separately.
>> I'm aware there are many challenges, such as comparing usage stats, shifts in journal aggregators, etc.
>> Does anyone have any cool homegrown examples or ideas they've cooked up for this? Pie in the sky?
>> Jason Stirnaman
>> Biomedical Librarian, Digital Projects
>> A.R. Dykes Library, University of Kansas Medical Center