Another option might be to use OpenRefine (http://openrefine.org) - this should easily handle 250,000 rows. I find it good for basic data analysis, and there are extensions which offer some visualisations (e.g. the VIB BITs extension, which will plot simple data using d3: https://www.bits.vib.be/index.php/software-overview/openrefine)

I’ve written an introduction to OpenRefine, available at http://www.meanboyfriend.com/overdue_ideas/2014/11/working-with-data-using-openrefine/
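
If you'd rather stay in R as you mention below, the sqldf route you linked to should cope with 250,000 rows comfortably. A minimal, untested sketch (assuming a tab-delimited export and hypothetical column names bib_id, item_id and checkouts - swap in whatever your extract actually uses) might look like:

library(sqldf)

# read.delim handles tab-delimited exports; use read.csv for comma-separated files
items <- read.delim("circ_items.txt", stringsAsFactors = FALSE)

# Group the item rows by bib record and total the checkouts, SQL-style
by_bib <- sqldf("SELECT bib_id,
                        COUNT(*)       AS item_count,
                        SUM(checkouts) AS total_checkouts
                 FROM items
                 GROUP BY bib_id
                 ORDER BY total_checkouts DESC")

head(by_bib)

Everything here runs in base R plus the sqldf package (which uses an in-memory SQLite database behind the scenes), so no separate database server is required on your desktop.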

Owen

Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: [log in to unmask]
Telephone: 0121 288 6936

> On 5 Aug 2015, at 21:07, Harper, Cynthia <[log in to unmask]> wrote:
> 
> Hi all. What are you using to process circ data for ad-hoc queries? I usually extract csv or tab-delimited files - one row per item record, with identifying bib record data, then total checkouts over the given time period(s). I have been importing these into Access and then grouping them by bib record. I think that I've reached the limits of scalability for Access for this project now, with 250,000 item records. Does anyone do this in R? My other go-to software for data processing is the RapidMiner free version. Or do you just use MySQL or another SQL database? I was looking into doing it in R with RSQLite (I just read about this and sqldf: http://www.r-bloggers.com/make-r-speak-sql-with-sqldf/) because I'm sure my IT department will be skeptical of letting me have MySQL on my desktop. (I've moved into a much more users-don't-do-real-computing kind of environment.) I'm rusty enough in R that if anyone could give me some start-off data import code, that would be great.
> 
> Cindy Harper
> E-services and periodicals librarian
> Virginia Theological Seminary
> Bishop Payne Library
> 3737 Seminary Road
> Alexandria VA 22304
> [log in to unmask]
> 703-461-1794