The initial commit in https://github.com/code4lib/shortimer/ was in November
2011, which is ten years for some values of ten. Taking a quick and
noncomprehensive glance around, I see postings as old as 2005. I don't see
an obvious API, but maybe a maintainer could weigh in about a data dump.
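
For the mbox route Eric describes below, here is a minimal sketch of what the parsing might look like, using only Python's standard library (mailbox, email, csv). It is not a tested pipeline for the Code4Lib archive. The function names, the output columns, and the guess that job postings can be recognized by "job:" in the subject line are all my assumptions, so check them against the real messages first.

```python
import csv
import mailbox
from email.header import decode_header, make_header
from email.utils import parsedate_to_datetime


def decode(value):
    """Turn a (possibly RFC 2047 encoded) header into plain text."""
    if value is None:
        return ""
    try:
        return str(make_header(decode_header(str(value))))
    except Exception:
        # Malformed header: fall back to the raw text rather than fail.
        return str(value)


def iso_date(value):
    """Return the Date header as YYYY-MM-DD, or '' if it cannot be parsed."""
    try:
        return parsedate_to_datetime(str(value)).date().isoformat()
    except (TypeError, ValueError):
        return ""


def job_rows(mbox_path, marker="job:"):
    """Yield [date, from, subject] for messages whose subject contains marker.

    The "job:" marker is an assumption about how postings are labeled.
    """
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            subject = decode(message["Subject"])
            if marker in subject.lower():
                yield [iso_date(message["Date"]), decode(message["From"]), subject]
    finally:
        box.close()


def mbox_to_csv(mbox_path, csv_path):
    """Write the matching messages to a CSV file and return the row count."""
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["date", "from", "subject"])
        for row in job_rows(mbox_path):
            writer.writerow(row)
            count += 1
    return count
```

Something like mbox_to_csv("code4lib.mbox", "jobs.csv") (both file names hypothetical) would produce a file Excel can open. Replies that quote a posting's subject would also match, so some deduplication would still be needed, and pulling out salary or location from the message bodies is a separate and harder problem.
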
On Fri, Jan 22, 2021 at 11:28 AM Eric Lease Morgan <[log in to unmask]> wrote:
> On Jan 22, 2021, at 11:11 AM, Jill Ellern <[log in to unmask]> wrote:
> > I'm doing some research into systems librarian duties and wondering if
> > there is an easy way to get a dump of the code4lib jobs from the last 10
> > years? In excel format?
> Easy? I'd be surprised.
> There are two or three sources of the Code4Lib jobs data:
> 1. the underlying data from the jobs.code4lib.org site
> 2. any one of a number of different Code4Lib mailing list Web archives
> 3. the archived mailbox (mbox) files from the mailing list
> I don't think the jobs site has been around for ten years. Has it? Nor do
> I know whether or not the data is archived. If it is, then I'd bet you will
> be able to get it in some sort of structured format like JSON or a
> delimited format that Excel can open.
> Scraping different Web archives would require... scraping which,
> personally, I run away from.
> Finally, the archived mbox files would be the most comprehensive, but a
> programmer would have to parse the mbox (email) files, which is a
> specialized task in and of itself. If you want to know where the mbox files
> are located, then drop me a line and I'll let you know. Easy.
> Finally, what are the questions you would like to answer? How many systems
> librarian jobs have been posted? Where were the jobs? What are the
> characteristics of systems librarianship and how have they changed over
> time? How much do they pay? Extracting some of this information from the
> postings may be difficult, if not heroic in nature.
> Eric Morgan
> University of Notre Dame
Humanistic Machine Learning for Library Data
Lecturer, San José State University iSchool