To apply for the position described, please go to the Stanford University
online job application system and search for Requisition #37588.
*No direct phone calls or emails, please.*
Digitization Workflow Engineer
Fixed Term for 12 months
Stanford University Libraries and Academic Information Resources (SULAIR)
have an ongoing program to produce and archive digital reproductions of
library materials. Digital Library Systems and Services (DLSS) manages and
operates several labs dedicated to digitization of print, audio and video
materials, and is building a digital library infrastructure to preserve and
provide access to these digitized materials.
Under the supervision of the Manager of Web Application Development in DLSS,
the Digitization Workflow Engineer will be responsible for building and
implementing systems that help manage the lifecycle of digitized objects.
This lifecycle begins with the object's selection for digitization, and ends
with its publication on the World Wide Web and preservation in the Stanford
Digital Repository. Other steps include metadata creation, digitization,
quality control, file cleanup, derivative creation and file validation. The
workflow systems implemented by the Engineer will focus on digitization
processes and preparation of files for online access and preservation systems.
This is primarily an engineering position, with responsibility for building
and implementing automated and manual tools and interfaces to support the
digitization labs. The workflow engineer will work closely with the lab
managers, the QA specialist, project managers and project coordinators to
build tools and systems that support individual projects and ongoing
digitization activities. The workflow engineer will also work closely with
the DLSS architect and other DLSS software developers to use, extend and
integrate with the existing digital library infrastructure and related services.
- Build or integrate tools for metadata creation. This may include online
forms for manually creating and editing XML metadata descriptions, and
automated tools for extracting embedded metadata values, text conversion
(OCR) or structural and logical markup.
- Develop an end-to-end workflow system for digitization labs that automates,
as much as possible, file naming, movement of files from step to step,
logging of errors, workflow tracking, file validation, file processing and
derivative creation. The workflow system should prepare files for online
access and preservation systems, and will integrate with (and leverage as
much as possible) the Libraries' digital infrastructure.
- Build an online digitization project management system to facilitate
assignment of work, flagging of exceptions, tracking of progress and
reporting of project status.
- Develop algorithms and build tools to support format-specific digitization
workflow. This may include manipulations of or enhancements to digital
texts, images, audio files, video files, map and geospatial data, or
born-digital materials.
Required Knowledge and Expertise
2-3 years of professional software engineering experience is required.
- Participation in at least one application development project using Ruby
on Rails or Java. Familiarity with a range of programming and scripting
languages is essential.
- Demonstrated proficiency building applications in Ruby on Rails.
- Demonstrated proficiency in scripting simple utilities, using Ruby, Perl,
shell scripts, or Python.
- Demonstrated ability to write solid, simple, elegant code both
independently and in a team-programming environment and within schedule
- In-depth knowledge of HTML and related website development technologies
and software (especially CSS and PHP).
- Demonstrated expertise with XML and related tools and technologies (e.g.,
XML Schema, schema management and databases, XSLT, XForms).
- Experience with relational database design and management. Experience
implementing database applications for SQL Server, Oracle, or MySQL.
- Demonstrated ability to work independently on a project from specification
to launch; communicate effectively, orally and in writing; and work with all
levels of staff, vendors, and consultants.
- Demonstrated ability to work collaboratively on a project from
specification to launch; and to work with multiple levels of staff, and
colleagues at peer institutions and in open source communities.
- Demonstrated ability to develop new programming skills quickly, and to
grasp unfamiliar architectures and application designs quickly.
- Demonstrated proficiency applying best practices to technical projects,
especially test-first development and automated testing. Must also make
effective use of team collaboration tools, build management, and version
control.
- Demonstrated success using, participating in and contributing to open
source software development projects.
- Quick and self-bootstrapping learner. Particularly adept at quickly
learning new scripting and programming languages.
- Expertise in networking and systems integration in a heterogeneous
hardware and software environment (Linux, Windows).
Desired Knowledge and Expertise
- Familiarity with XML schemas used to describe digitized cultural heritage
materials, such as TEI, MODS, METS, and EAD.