I would see if you can just get an SQL or CSV dump of the tables. Maybe it’s not super-normalized and you can get most of what you need in a table or two, or perhaps the provider would be so kind as to write a join for the data you need and dump the result to a CSV file, which you can then import into Excel and peruse / analyze to your heart’s content. That seems to be the easiest thing by far, to me anyway.

> On Jan 22, 2021, at 12:17 PM, Andromeda Yelton <[log in to unmask]> wrote:
>
> The initial commit in https://github.com/code4lib/shortimer/ was November
> 2011, which is ten years for some values of ten. Taking a quick and
> noncomprehensive glance around, I see postings as old as 2005. I don't see
> an obvious API, but maybe a maintainer could weigh in about data dump
> possibilities?
>
> On Fri, Jan 22, 2021 at 11:28 AM Eric Lease Morgan <[log in to unmask]> wrote:
>
>> On Jan 22, 2021, at 11:11 AM, Jill Ellern <[log in to unmask]> wrote:
>>
>>> I'm doing some research into systems librarian duties and wondering if
>>> there is an easy way to get a dump of the code4lib jobs from the last 10
>>> years? In excel format?
>>
>> Easy? I'd be surprised.
>>
>> There are two or three sources of the Code4Lib jobs data:
>>
>> 1. the underlying data from the jobs.code4lib.org site
>>
>> 2. any one of a number of different Code4Lib mailing list Web archives
>>
>> 3. the archived mailbox (mbox) files from the mailing list
>>
>> I don't think the jobs site has been around for ten years. Has it? Nor do
>> I know whether or not the data is archived.
>> If it is, then I'd bet you will
>> be able to get it in some sort of structured format like JSON or a
>> delimited format that Excel can open.
>>
>> Scraping different Web archives would require... scraping which,
>> personally, I run away from.
>>
>> Finally, the archived mbox files would be the most comprehensive, but a
>> programmer would have to parse the mbox (email) files, which is a
>> specialized task in and of itself. If you want to know where the mbox files
>> are located, then drop me a line and I'll let you know. Easy.
>>
>> Finally, what are the questions you would like to answer? How many systems
>> librarian jobs have been posted? Where were the jobs? What are the
>> characteristics of systems librarianship and how have they changed over
>> time? How much do they pay? Extracting some of this information from the
>> postings may be difficult, if not heroic in nature.
>>
>> --
>> Eric Morgan
>> University of Notre Dame
>
> --
> Andromeda Yelton
> Humanistic Machine Learning for Library Data
> Lecturer, San José State University iSchool
> https://andromedayelton.com/
> @ThatAndromeda
> <http://twitter.com/ThatAndromeda>
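
If the mbox route mentioned in the quoted thread ends up being the only one available, a first pass need not be heroic. What follows is a minimal sketch, not a tested pipeline, that uses Python's standard mailbox and csv modules to write one CSV row per message whose subject contains a keyword; the resulting file opens directly in Excel. The helper names (header_text, mbox_to_csv), the file paths, and the guess that job postings carry "job" in the subject line are all assumptions for illustration, not facts about the Code4Lib archives.

```python
import csv
import mailbox
from email.header import decode_header, make_header
from email.utils import parsedate_to_datetime


def header_text(value):
    """Decode an RFC 2047 encoded header into plain text ('' if missing)."""
    if value is None:
        return ""
    return str(make_header(decode_header(str(value))))


def iso_date(value):
    """Return the Date header as ISO 8601, or '' if missing or malformed."""
    if value is None:
        return ""
    try:
        return parsedate_to_datetime(str(value)).isoformat()
    except (TypeError, ValueError):
        return ""


def mbox_to_csv(mbox_path, csv_path, subject_filter="job"):
    """Write date, sender, and subject for each message whose subject
    contains subject_filter (case-insensitive). Returns the row count.

    The "job" default is a hypothetical filter, not a known list convention.
    """
    count = 0
    box = mailbox.mbox(mbox_path)
    try:
        with open(csv_path, "w", newline="", encoding="utf-8") as out:
            writer = csv.writer(out)
            writer.writerow(["date", "from", "subject"])
            for message in box:
                subject = header_text(message["Subject"])
                if subject_filter.lower() not in subject.lower():
                    continue
                writer.writerow(
                    [iso_date(message["Date"]),
                     header_text(message["From"]),
                     subject]
                )
                count += 1
    finally:
        box.close()
    return count
```

Calling mbox_to_csv("code4lib.mbox", "jobs.csv") with whatever path the archive actually lives at would produce the spreadsheet. This only captures headers; pulling salary, location, or duties out of the message bodies is the hard part Eric describes and is not attempted here.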