Time Commitment & Compensation: $10,000/month, 3.5 months | Appointment Date: October 2018


The Division of Libraries, Center for Data Science, and Arthur L. Carter Journalism Institute at New York University seek a developer for an innovative, IMLS-funded project to preserve interactive web applications for data journalism. These applications typically consist of a database, a web server, a web application, and a frontend. To capture these disparate parts, the Project Team will integrate ReproZip (an open source tool for reproducible research) with open source front-end web archiving tools (e.g. Webrecorder) into a single tool that fully captures the functionality of modern interactive web applications. The project team includes data journalists, librarians, and software engineers.


We are looking for a talented developer who is passionate about creating tools and infrastructure to support web archiving. Candidates will work collaboratively to build a tool that will transform and streamline preservation of web applications. For this project, the Developer will add open-source web crawling functionality to ReproZip to fully capture interactive web applications. This prototype will be tested with data journalism applications and updated as new functional specifications emerge.


The Developer will work with ReproZip developers. Reporting to the Project Manager, the Developer will be responsible for two essential parts of the project:

Continue development of ReproZip by adding web-crawling functionality to capture full-scale interactive web applications and writing accompanying documentation.
The Developer will work with the Project Team to update the prototype functionality as testing continues.


A Master’s degree in Computer Science or a related technical field, or an equivalent programming background
Experience in software development methodology and tools
Knowledge of C/C++, Python, JavaScript, and SQL (Ruby is a plus)
Proficiency with database technology (e.g. Postgres) and modern container and virtualization tools (e.g. Docker, Vagrant, VirtualBox)
Understanding of networking, web requests, and HTTP proxies
Familiarity with the Linux kernel, architecture, and programming interfaces
Excellent documentation practices for software projects
Comfortable working in open source and responding to user needs in a variety of forums (e.g. email, GitHub issues)

Brought to you by code4lib jobs: