[Please excuse the cross-posting]

The Internet Archive is looking for a programmer who can bring library
records into the semantic web. This requires working with very large
datasets: analyzing, merging, and manipulating them to bring these
key resources to a wide audience.

The Internet Archive is a non-profit digital library committed to
preserving the world's digital cultural artifacts. Used by over 6
million people, this resource is becoming part of how the Internet
works. Our job is to put the best humanity has to offer within reach of
students, educators, and the general public. Find out more about our
organization and web archive at www.archive.org.

Open Library is an open source software project started by the Internet
Archive to build a site with one web page for every book ever published.
The site uses a new type of Semantic Wiki that preserves the structured
data that already exists for books. Leveraging millions of library and
publisher bibliographic records, we have already created a technology
demo, available at http://demo.openlibrary.org, and we're looking for a
data importer to help us grow the site to the next level.  Interested
applicants should be sure to look at the source code available on the
demo site before applying.

You will assist the current team of programmers to import data in MARC,
ONIX and other formats, crawl and parse information from the web, and
integrate and deduplicate the records that we get from different sources.

REQUIREMENTS:

    * Minimum of 3 years of experience with Python, Perl, or PHP is required
    * Must have UNIX experience
    * Experience with database calling or merging, crawling technology,
      or book data a plus
    * Experience as a technical librarian a strong plus

We are located in the Presidio of San Francisco with parking and public
transportation available.

The Internet Archive is an equal opportunity employer. We provide
medical and dental benefits. Please send your resume and cover letter to
[log in to unmask] with the subject line "Programmer- Semantic
Web." The Internet Archive thanks all applicants for their interest, but
advises that only those selected for an interview will be contacted. No
phone calls please.