From:    Carol Bean
Subject: Issue 30 of the Code4Lib Journal now available! [apologies for cross posting]

The Editorial Committee of the Code4Lib Journal is pleased to announce that its
30th issue is now available, with the
following lineup.  Please feel free to share!

Editorial Introduction: It’s All About Data, Except When It’s Not.

Carol Bean

Data capture and use is not new to libraries. We know data isn’t
everything, but it is ubiquitous in our work, enabling myriads of new ideas
and projects. Articles in this issue reflect the expansion of data
creation, capture, use, and analysis in library systems and services.

Collected Work Clustering in WorldCat

Janifer Gatenby, Gail Thornburg and Jay Weitz, OCLC

WorldCat records are clustered into works, and within works, into content
and manifestation clusters. A recent project revisited the clustering of
collected works that had been previously sidelined because of the
challenges posed by their complexity. Attention was given to both the
identification of collected works and to the determination of the component
works within them. By extensively analysing cast-list information,
performance notes, contents notes, titles, uniform titles and added
entries, the contents of collected works could be identified and
differentiated so that correct clustering was achieved. Further work is
envisaged in the form of refining the tests and weights and also in the
creation and use of name/title authority records and other knowledge cards
in clustering. There is a requirement to link collected works with their
component works for use in search and retrieval.

Data Munging Tools in Preparation for RDF: Catmandu and LODRefine

Christina Harlow

Data munging, or the work of remediating, enhancing and transforming
library datasets for new or improved uses, has become more important and
staff-inclusive in many library technology discussions and projects. Many
times we know how we want our data to look, as well as how we want our data
to act in discovery interfaces or when exposed, but we are uncertain how to
make the data we have into the data we want. This article introduces and
compares two library data munging tools that can help: LODRefine
(OpenRefine with the DERI RDF Extension) and Catmandu.
The strengths and best practices of each tool are discussed in the context
of metadata munging use cases for an institution’s metadata migration
workflow. There is a focus on Linked Open Data modeling and transformation
applications of each tool, in particular how metadataists, catalogers, and
programmers can create metadata quality reports, enhance existing data with
LOD sets, and transform that data to an RDF model. Integration of these
tools with other systems and projects, the use of domain specific
transformation languages, and the expansion of vocabulary reconciliation
services are mentioned.

Manifold: a Custom Analytics Platform to Visualize Research Impact

Steven Braun

The use of research impact metrics and analytics has become an integral
component of many aspects of institutional assessment. Many platforms
currently exist to provide such analytics, both proprietary and open
source; however, the functionality of these systems may not always overlap
to serve uniquely specific needs. In this paper, I describe a novel
web-based platform, named Manifold, that I built to serve custom research
impact assessment needs in the University of Minnesota Medical School.
Built on a standard LAMP architecture, Manifold automatically pulls
publication data for faculty from Scopus through APIs, calculates impact
metrics through automated analytics, and dynamically generates report-like
profiles that visualize those metrics. Work on this project has resulted in
many lessons learned about challenges to sustainability and scalability in
developing a system of such magnitude.

Open Journal Systems and Dataverse Integration– Helping Journals to Upgrade
Data Publication for Reusable Research

Micah Altman, Eleni Castro, Mercè Crosas, Philip Durbin, Alex Garnett, and
Jen Whitney

This article describes novel open source tools for open data
publication in open access journal workflows. These comprise a plugin for
Open Journal Systems that supports a data submission, citation, review, and
publication workflow; and an extension to the Dataverse system that
provides a standard deposit API. We describe the function and design of
these tools, provide examples of their use, and summarize their initial
reception. We conclude by discussing future plans and potential impact.

Collecting and Describing University-Generated Patents in an Institutional
Repository: A Case Study from Rice University

Linda Spiro and Scott Carlson

Providing an easy method of browsing a university’s patent output can free
up valuable research time for faculty, students, and external researchers.
This is especially true for Rice University’s Fondren Library, a
USPTO-designated Patent and Trademark Resource Center that serves an
academic community widely recognized for cutting edge science and
engineering research. In order to make Rice-generated patents easier to
find in the university’s community, a team of technical and public services
librarians from Fondren Library devised a method to identify, download, and
upload patents to the university’s institutional repository, starting with
a backlog of over 300. This article discusses the rationale behind the
project, its potential benefits, and challenges as new Rice-generated
patents are added to the repository on a monthly basis.

SierraDNA – Demonstrating the Usefulness of Direct ILS Database Access

James Padgett and Jonathan Hooper

Innovative Interfaces’ Sierra(™) Integrated Library System (ILS) brings
with it a Database Navigator Application (SierraDNA) – in layman’s terms
SierraDNA gives Sierra sites read access to their ILS database. Unlike the
closed use cases produced by vendor supplied APIs, which restrict Libraries
to limited development opportunities, SierraDNA enables sites to publish
their own APIs and scripts based upon custom SQL code to meet their own
needs and those of their users and processes.
In this article we give examples showing how SierraDNA can be utilized to
improve Library services. We highlight three example use cases which have
benefited our users, enhanced online security and improved our back office
processes. In the first use case we employ user access data from our electronic
resources proxy server (WAM) to detect hacked user accounts. Three scripts
are used in conjunction to flag user accounts which are being hijacked to
systematically steal content from our electronic resource providers’
websites. In the second we utilize the reading histories of our users to
augment our search experience with an Amazon style “People who borrowed
this book also borrowed…these books” feature. Two scripts are used together
to determine which other items were borrowed by borrowers of the item
currently of interest. And lastly, we use item holds data to improve our
acquisitions workflow through an automated demand based ordering process.
Our explanation and SQL code should be of direct use for adoption or as
examples for other Sierra customers willing to exploit their ILS data in
similar ways, but the principles may also be useful to non-Sierra sites
that also wish to enhance security, user services or improve back
office processes.
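
As a rough illustration of the second use case only, the "also borrowed" idea can be sketched in a few lines of Python. This is not the authors' SQL and does not touch the Sierra database: the `loans` list, the patron and item identifiers, and the `also_borrowed` function are all hypothetical stand-ins for reading-history rows, and ranking by number of shared borrowers is an assumption.

```python
from collections import Counter

def also_borrowed(loans, item_id, limit=3):
    """Rank other items borrowed by anyone who borrowed item_id.

    loans is an iterable of (patron_id, item_id) pairs; all names here
    are hypothetical and do not reflect the Sierra schema.
    """
    unique_loans = set(loans)  # count each patron/item pair once
    # Step 1: find every patron who borrowed the item of interest.
    borrowers = {patron for patron, item in unique_loans if item == item_id}
    # Step 2: count the other items those patrons borrowed.
    counts = Counter(
        item for patron, item in unique_loans
        if patron in borrowers and item != item_id
    )
    return [item for item, _ in counts.most_common(limit)]

# Made-up reading histories for three patrons.
loans = [
    ("p1", "b1"), ("p1", "b2"), ("p1", "b3"),
    ("p2", "b1"), ("p2", "b2"),
    ("p3", "b4"),
]
print(also_borrowed(loans, "b1"))
```

The two steps loosely mirror the two scripts mentioned in the abstract; a production version would run as queries against the ILS data and would need to respect patron privacy settings.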

Streamlining Book Requests with Chrome

Dr. Rachel Schulkins & Joseph Schulkins

This article starts by examining why a Chrome Extension was desired and how
we saw it making the workflow for requesting new items both easier and more
accurate. We then go on to outline how we constructed our extension,
looking at the folder structure, third party scripts and services that
combine to make this achievable. Finally, the article looks at how the
extension is regulated and plans for future development.

Generating Standardized Audio Technical Metadata: AES57

Jody L. DeRidder

Long-term access to digitized audio may be heavily dependent on the quality
of technical metadata captured during digitization. The AES57-2011 standard
offers a standardized method of documenting fairly comprehensive technical
information, but its complexity may be confusing. In an effort to lower the
barrier to use, we have developed software that generates valid AES57 files
for digitized audio, using output from FITS (File Information Tool Set)
and a few fields of information
from a tab-delimited spreadsheet. This article will describe the logic
used, the fields required, the basic process, applications, and options for
further development.

Topic Space: Rapid Prototyping a Mobile Augmented Reality Recommendation App

Jim Hahn, Ben Ryckman, and Maria Lux

With funding from an Institute of Museum and Library Services (IMLS)
Sparks! Ignition Grant, researchers from the University of Illinois Library
designed and tested a mobile recommender app with augmented reality
features. By embedding open source optical character recognition software
into a “Topic Space” module, the augmented reality app can recognize call
numbers on a book in the library and suggest relevant items that are not
shelved nearby. Topic Space can also show users items that are normally
shelved in the starting location but that are currently checked out. Using
formative UX methods, grant staff shaped app interface and functionality
through early user testing. This paper reports results of UX testing,
presents a redesigned mobile interface, and provides considerations on the future
development of personalized recommendation functionality.

Integration of Library Services with Internet of Things Technologies

Kyriakos Stefanidis & Giannis Tsakonas

The SELIDA framework is an integration layer of standardized services that
takes an Internet-of-Things approach for item traceability in the library
setting. The aim of the framework is to provide tracing of RFID tagged
physical items among or within various libraries. Using SELIDA we are able
to integrate typical library services—such as checking in or out items at
different libraries with different Integrated Library Systems—without
requiring substantial changes, code-wise, in their structural parts. To do
so, we employ the Object Naming Service mechanism that allows us to
retrieve and process information from the Electronic Product Code of an
item and its associated services through the use of distributed mapping
servers. We present two use case scenarios involving the Koha open source
ILS and we briefly discuss the potential of this framework in supporting
bibliographic Linked Data.
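
For readers unfamiliar with the Object Naming Service, the lookup starts by turning an Electronic Product Code into a DNS name. The sketch below follows the conversion steps of the EPCglobal ONS 1.0 convention as commonly described (drop the serial number, reverse the remaining fields, append a root domain). It is an assumption that SELIDA uses this exact form; the function name `epc_to_ons_domain`, the sample EPC, and the default root domain are illustrative only.

```python
def epc_to_ons_domain(epc_urn, root="onsepc.com"):
    """Convert an EPC pure-identity URN into an ONS-style domain name.

    Illustrative only: the root domain and the assumption that the last
    dotted field is the serial number may not match a given deployment.
    """
    prefix = "urn:epc:"
    if not epc_urn.startswith(prefix):
        raise ValueError("expected an EPC pure-identity URN")
    # e.g. "id:sgtin:0614141.000024.400" -> "id", "sgtin", "0614141.000024.400"
    namespace, scheme, body = epc_urn[len(prefix):].split(":")
    fields = body.split(".")[:-1]  # drop the trailing serial number
    labels = [namespace, scheme] + fields
    # DNS names read most-specific first, so reverse the order.
    return ".".join(reversed(labels)) + "." + root

print(epc_to_ons_domain("urn:epc:id:sgtin:0614141.000024.400"))
# prints "000024.0614141.sgtin.id.onsepc.com"
```

A resolver would then query that name for service records pointing at the mapping servers the abstract mentions.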

Carol Bean
Coordinating Editor, Issue 30
Code4Lib Journal