We are looking for someone to join our technical team at NICTA to work on our
distributed search system, the Lens. The Lens provides a free 'Innovation
Cartography' service to the general public, allowing related innovation data
to be discovered and shared easily by anyone. Over the coming year we will be
adding many terabytes of new data, including global patents, scientific
literature, business and legal information. Your initial work in this role
will focus on improving and replacing our current data transform and import
systems. These systems must accommodate a variety of different data formats
and also be fast enough to process very large data sets. You will be a vital
factor in our future capability to provide massive amounts of data to the
public.

Although your initial role is concentrated on data processing and importing,
you will also be an integral part of the wider development team. We share
tasks and knowledge around the team so you'll get to work with others and on
other parts of the system. You'll be encouraged to learn, innovate and
continually make the high-performance, distributed systems that make up the
Lens even more awesome.
* Develop high-performance, deployable software for importing massive textual data sets.
* Develop high-performance, deployable software for transforming those data sets between different data formats and making them accessible to high-use, public-access data services.

**You will need the following in your skills toolkit:**
* XML parsing techniques and frameworks - StAX, SAX, DOM.
* Java and in particular concurrent/multi-core processing.
* Experience in distributed systems, from design through to deployment/administration.
* Robust scripting - especially using bash, ruby or python.
* Good knowledge of Linux development/administration, including utilities such as ssh, scp, rsync, grep, find, tar and awk.
* Specific technologies:
  * Lucene and/or Solr
  * Amazon EC2
* Cutting edge OCR at large scale (highly regarded)
* Patent data knowledge (highly regarded)

**About The Lens: Cambia and NICTA**

_Our goal is to greatly enhance the public good by creating an open and
inclusive innovation system, which melds many disparate information sources,
dramatically expanding the availability and discoverability of human
knowledge. We think of it as 'Innovation Cartography', maps which allow us to
discover otherwise unreachable knowledge._

_Working on the Lens is a lifestyle choice: we offer flexible work hours and a
casual, friendly environment in exchange for passion and dedication. Our team
is small, highly productive, open-minded about solutions and focused on
delivering high-impact public goods._

**NICTA (National ICT Australia Ltd)** is Australia's Information and Communications Technology Research Centre of Excellence. NICTA develops technologies that generate economic, social and environmental benefits for Australia. NICTA collaborates with industry on joint projects, creates new companies, and provides new talent to the ICT sector through a NICTA-enhanced PhD program. With four laboratories around Australia and over 700 people, NICTA is the largest organisation in Australia dedicated to ICT research.
NICTA is funded by the Australian Government through the Department of
Broadband, Communications and the Digital Economy and the Australian Research
Council through the ICT Centre of Excellence Program. NICTA is also funded and
supported by the Australian Capital Territory, the New South Wales and
Victorian Governments, the Australian National University, the University of
New South Wales, the University of Melbourne, the University of Queensland,
the University of Sydney, Griffith University, Queensland University of
Technology and Monash University.
**Cambia** is a globally prominent not-for-profit social enterprise and the leading provider of free patent and intellectual property search and analysis. Cambia's mission is the democratization of problem solving using science and technology. Cambia is the founding partner of the Lens, together with NICTA (National ICT Australia) and QUT (Queensland University of Technology).
Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6410/