About the company:
Ubiquity Press’s mission is to create an open future for research and knowledge by providing open access publishing and open source repository infrastructure and services to the academic sector. Headquartered in London, with a distributed workforce in North America and Europe, Ubiquity Press offers an open and supportive environment where staff are encouraged to grow and develop. Applicants are evaluated solely on their ability to contribute to the success of the mission, and the company recognises the strength and insight that diverse teams bring. To this end, we welcome applicants from all backgrounds, life experiences, and perspectives.
The Data Engineer’s primary responsibility is to migrate research repository metadata and files onto Ubiquity Press’s open source Hyku repository platform. This is done by: analysing the customer’s repository and data structure; working with the customer and the Account Manager to define the data models they will use after migration; working with the Product Manager and Ruby Development team to build the new data structure; creating mapping strategies for the data; and, lastly, implementing the mapping process to transform, load, and quality-check the data. As a secondary responsibility, the Data Engineer supports the wider company’s data needs with advice, data cleanup, data evaluation, and presentation. This position is UK-based, within the Operations team, with the option to work remotely.
Analyse customer legacy data models and repository structures, and work with customers to define the new data models they will migrate to.
Create and document mapping processes from the legacy to the new data models, then run those processes and test their output to ensure a high-quality migration.
Look for opportunities to add value to the customer data during migration, in particular cleaning and standardising data.
Continuously improve the migration process from the customer’s perspective.
Automate the migration process as far as possible to reduce time and the opportunity for human error.
Regularly communicate timelines and progress with customers and internal team members, including on calls with customers.
Overcome challenges as they become apparent, with help from colleagues and customers where necessary.
Contribute to an open and supportive company culture, by helping colleagues where needed.
Support the development team during project work, for example speccing data structures, converting data, and providing advice.
Develop an understanding of the different databases that Ubiquity Press uses.
Support the company’s reporting needs by creating reports, looking for insights and patterns, preparing visualisations of the data, and explaining the data, so that colleagues can make informed decisions and take appropriate action.
Knowledge / Skills / Experience:
Must have, at high level:
Data translation/manipulation methodologies using apps (eg OpenRefine) or programming languages (esp. Python [eg pandas], Ruby)
Spreadsheet applications (eg Google Sheets / Excel)
Project/work skills (structured thinking, problem solving, planning, prioritisation, time management, accuracy)
Communication skills (internal + external, remote + in person, written + video)
Process creation and optimisation (process design, continuous improvement)
Customer/stakeholder comms (internal + external, remote + in person, written + video)
Useful to have, at any level:
Database querying (eg using SQL)
XML and other data structuring languages and formats (eg XSL, DTD/XSD, XPath, JSON, CSV)
Grep (for quickly finding data amongst files)
Data presentation (statistical analysis, data visualisation, communication)
Data architecture / modelling
Communicating acquired data effectively
Library metadata standards (eg RDA, Dublin Core, EAD, DACS, MARC, MODS, METS, TEI etc)
Experience of managing or operating an institutional repository
You will have over a year’s experience in a Data Engineer / Migration Specialist role. You love interacting with data, tidying it up, identifying and communicating patterns/trends/anomalies/insights, and creating processing workflows. You have a methodical, analytical, and logical approach to problem solving. You enjoy working in, and contributing to, a collaborative and supportive environment, but you are also self-motivated and happy to work autonomously, taking pride in doing a great job and delivering excellent customer service. You are calm and resilient under pressure, but happy to ask for help if it gets too much. You have an open mind and respect those with different perspectives to your own.
Please send your CV and covering letter, including salary expectations and whether you prefer to be interviewed in person at our London office or via webcam, to [log in to unmask]. Shortlisted candidates will be sent a small Data Engineering test to complete in advance and discuss at the interview. Interviews will be with the COO plus one other team member. The intention is to offer the position after one round of interviews, but a follow-up call may be required before a decision is made.
Brought to you by code4lib jobs: https://jobs.code4lib.org/jobs/49782-data-engineer