We are looking for a Document Engineer to help us ingest, store, process, and
transform all of the world's laws and take on an $8 billion duopoly.
The Document Engineer will be primarily responsible for helping us put all of
the world's laws online for free, starting with U.S. law. Most of the world's
laws are in disparate file formats and located in difficult-to-access
government sites. You're going to take the lead on converting these documents
into a single, standardized XML representation, making sure that our
database of laws stays up to date, and contributing to an in-house system to
ingest, retrieve, and transform these documents.
We are looking for someone who is excited to:
* Work with XML transformations (technologies like XSLT) and transform non-XML documents into XML;
* Develop and refine a schema and information taxonomy;
* Design and contribute to a complex document storage, retrieval, and transformation system;
* Work at very large scale -- with tens to hundreds of millions of documents; and
* Learn new technologies and about the structure of the law.
Ideally, the Document Engineer will be able to code (or learn to code) in
some of the languages we use in our current back-end system,
including Python, Go, and Node.js. There will also be opportunities to work on
Machine Learning and Natural Language Processing problems if you're
Casetext's mission is to make all the world's laws free and understandable. We
have amassed an enormous database of legal texts, starting with nearly two
million U.S. judicial opinions. A community of law professors, lawyers, law
students, and citizens is adding insight and explanations. Casetext is
disrupting an $8 billion legal research market currently controlled by a
duopoly (Westlaw and LexisNexis) that has barricaded quality legal information
behind a paywall. (To see a sneak peek of the newest version of Casetext, go
to [http://beta.casetext.com](http://beta.casetext.com). If prompted, the
login and password are "casetext.")
You'll be working with engineers from Google and IBM, the president of the
Stanford Law Review, and former practicing attorneys from Yale's and
Stanford's law schools. We are a Y Combinator company (Summer 2013) and have
raised a seed round of over $1.8 million.
This is an opportunity to be an early employee at a rising start-up, take on a
lot of responsibility, and play a substantial role in the future of the
company. We are extremely selective about whom we hire, but we make sure that
our early team members are well compensated in equity, salary, benefits, and
quality of work.
* Health insurance is fully covered.
* Caltrain passes are covered.
* We eat lunch out together every day, covered by the company.
* Free snacks and coffee at the office.
* You will have opportunities to blog about your work, attend conferences, publish papers, and open source large parts of the code you work on.
Casetext HQ is located in Palo Alto, one street off California Ave., just a
few blocks from the Caltrain and next door to some of Palo Alto's best
**HOW TO APPLY**
E-mail us at [log in to unmask] Send a resume and links to examples of
projects you've worked on. Code examples are very helpful.
Brought to you by code4lib jobs: http://jobs.code4lib.org/job/15423/