Come join the digital library dream team at Stanford University Libraries as our
new Web Archiving Engineer! We offer Silicon Valley competitive salaries, a
beautiful campus with balmy weather and palm trees, and a team of fun and
talented library programmers.
**Web Archive Engineer** - 60432
This position is double-posted at the 4P3 and 4P4 levels.
This is a four-year fixed-term position with the possibility of an extension.
Stanford University Libraries (SUL) is seeking a talented software engineer to
support the Web Archiving Service.
The position is a key element in the implementation and ongoing support of
SUL's Web Archiving Service. The Service will enable the archiving of web
content into the Stanford Digital Repository (SDR) on behalf of Stanford
librarians, faculty, and researchers and in support of the University's needs
for research, teaching, library collection building, and regulatory
The Web Archiving Engineer will primarily develop and maintain software to
facilitate web archiving workflows and use cases: harvesting, data management,
quality assurance, discovery, indexing, access and analysis. This will entail
deployment, local optimization and possible enhancement of community-developed
open source web archiving tools and best practices.
Reporting to the Manager for Application Development and working closely with
the Web Archiving Service Manager, the successful candidate will be
responsible for developing, configuring and/or managing web archiving systems
and related digital library components; pioneering tools and techniques for
the collection, replay and preservation of the next generation of web
technologies; troubleshooting and resolving technical issues related to
Service operation; and streamlining the processing of archived web content
through the entire lifecycle.
Systems Analysis, Architecture Design, Implementation and Administration (50%)
Provide technical analysis and software engineering support for web archiving
and related digital preservation activities at SUL. Install, configure and
manage Heritrix, Wayback Machine and other components necessary to build an
end-to-end service. Streamline the ingest of harvested and other target
content and associated metadata into repository, discovery and access
Operational Support (25%)
Collaborate with the Web Archiving Service Manager to troubleshoot and resolve
technical issues affecting harvest, replay and web archiving workflows.
Generate Wayback Machine and Lucene indexes to enable web archive replay,
full-text searching and metadata analysis.
Harvest Engineering (15%)
Develop tools and techniques to enable archival capture and replay of rich
media, streaming content and social media, as well as traditional web page
content. Administer web crawls to maximize data capture quality and make
efficient use of limited resources.
Community Engagement (10%)
Play an active role in the cultural heritage web archiving community. Stay
abreast of evolving best practices and tools for web archiving and make
appropriate recommendations for local service enhancement.
expertise with Ruby and Ruby on Rails application development.
expertise deploying, configuring and managing Apache HTTP Server and Apache
expertise with Unix/Linux and command-line utilities, such as awk, find, and
expertise with XML and XSLT.
experience with relational database design and management, including
implementing database applications for MySQL, Oracle or PostgreSQL.
learner. Adept at quickly learning new scripting and programming languages and
making sense of unfamiliar architectures and application designs.
Ability to write solid, simple, elegant code, both independently and in a
team-programming environment, and within schedule limitations.
Ability to work collaboratively with multiple levels of staff and colleagues at peer
institutions and within the open source community on projects from
specification to launch. Excellent verbal and written communication skills.
Ability to apply best practices to technical projects, especially test-first
development and automated testing. Must also make effective use of team
collaboration tools, build management and version control systems.
experience providing ongoing support for technical services, including
experience monitoring and managing a solution.
degree or equivalent, with five to seven years of demonstrated experience.
At the 4P4 level,
four-year college degree or equivalent, with more than seven years of
knowledge of web archiving tools, techniques, issues and trends.
expertise with Lucene/Solr.
expertise with distributed computing technologies, such as Hadoop, HBase and
experience with file characterization tools, such as JHOVE, FITS, DROID and
experience with library-related metadata and metadata standards, particularly
DC, MODS, MARC, METS and EAD.
participating in community-based open source projects, especially those
relevant to SUL's Digital Library architecture, such as Fedora, Blacklight,
Solr or Hydra.
experience with library applications and technology, especially experience
participating in relevant library open source efforts.
experience working in an academic and/or library environment.
Master's degree in
Computer Science, Information Science or related field.
: Information Technology Services
: University Libraries
Brought to you by code4lib jobs: http://jobs.code4lib.org/job/9612/