Sorry for the cross-posting, but I'd like to announce that Issue 7 of
The Code4Lib Journal was published this afternoon.

Code4Lib: Long May You Run [Editorial Introduction]
by Tom Keays
The Code4Lib Journal mirrors the diversity and depth of interests and
expertise of its readership. Our successes, indeed, are yours.

How Hard Can it Be? : Developing in Open Source
by Joann Ransom with Chris Cormack and Rosalie Blake
In 2000 a small public library system in New Zealand developed and
released Koha, the world’s first open source library management
system. This is the story of how that came to pass and why, and of the
lessons learnt in their first foray into developing in open source.

Extracting User Interaction Information from the Transaction Logs of a
Faceted Navigation OPAC
by Cory Lown and Brad Hemminger
This paper discusses the analysis of Apache web server logs from a
faceted catalog interface (OPAC) at North Carolina State University.
By grouping individual HTTP requests into user sessions and analyzing
in that context, requests can be understood as particular user
actions, with more specificity as to purpose and effect of an action.
Client IP address and time are used as a sufficient proxy for
determining user sessions from logs. Some initial exploratory findings
of user behavior in the NCSU OPAC are provided, including that users
make use of facets less than of text searching, and that some facet
groups are used significantly more than others. Links are provided to
the scripts used to make this session-based analysis, which could be
modified for use with other faceted OPACs which use an Apache web
server.
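
As a rough illustration of the session grouping described in this abstract, here is a minimal sketch in Python. It is not the authors' script: the function name group_sessions, the (ip, timestamp, path) tuple layout, and the 30-minute inactivity timeout are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def group_sessions(requests, timeout=timedelta(minutes=30)):
    """Group (ip, timestamp, path) tuples into sessions.

    A new session starts when a client IP is seen for the first time,
    or when the gap since that IP's previous request exceeds `timeout`.
    The 30-minute default is an assumption, not a value from the article.
    """
    sessions = []        # all sessions, in order of their first request
    open_sessions = {}   # ip -> the most recent session for that ip
    for ip, ts, path in sorted(requests, key=lambda r: r[1]):
        current = open_sessions.get(ip)
        if current is None or ts - current[-1][1] > timeout:
            current = []
            sessions.append(current)
            open_sessions[ip] = current
        current.append((ip, ts, path))
    return sessions


if __name__ == "__main__":
    start = datetime(2009, 6, 1, 9, 0)
    sample = [
        ("10.0.0.1", start, "/search?q=maps"),
        ("10.0.0.1", start + timedelta(minutes=5), "/search?q=maps&facet=format"),
        ("10.0.0.1", start + timedelta(hours=2), "/search?q=atlas"),
    ]
    for number, session in enumerate(group_sessions(sample), 1):
        print(number, [path for _, _, path in session])
```

Once requests are grouped this way, each request can be read in the context of what the same client did just before it, which is what lets a bare URL be interpreted as a user action.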

Using a Web Services Architecture with Me, Myself and I
by Stephen Meyer
The UW-Madison Libraries Library Course Page system is used to deliver
electronic reserves materials and course-focused library instruction
webpages to students. As part of a rewrite of our system we broke the
application into three component pieces: a file repository, a course
timetable data service, and an interface application for building and
viewing individual course pages. The new three-piece system was
written with an inward facing service-oriented architecture that
allowed us to choose the best technologies to solve each of the tasks
the entire system needs to accomplish.

Deciphering Journal Abbreviations with JAbbr
by Keith Jenkins
JAbbr is an online tool developed at Cornell University to help users
decipher journal title abbreviations. This article discusses why these
abbreviations are so problematic, and how traditional tools are often
insufficient, and then describes the novel approach used by JAbbr.
Given an abbreviation, JAbbr creates a regular expression for fuzzy
matching, tests it against a list of serial titles extracted from the
library catalog, and returns a list of possible matches to the user.
JAbbr is available as a web site and as a web service.
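
To give a feel for the regular-expression idea mentioned in this abstract, here is a minimal sketch in Python. It is not JAbbr's actual pattern or code: the function names, the rule that each abbreviated word must begin a word of the title, and the choice to allow skipped words such as "of the" are assumptions made for the example.

```python
import re


def abbreviation_to_regex(abbrev):
    """Build a loose pattern from an abbreviation (hypothetical rule set).

    Each alphabetic token of the abbreviation must start a word of the
    title, in order; other words may appear in between.
    """
    tokens = re.findall(r"[^\W\d_]+", abbrev)   # letters only, drops the dots
    if not tokens:
        raise ValueError("abbreviation contains no letters")
    parts = [re.escape(token) + r"\w*" for token in tokens]
    gap = r"\W+(?:\w+\W+)*?"                    # separator plus optional skipped words
    return re.compile(r"\b" + gap.join(parts) + r"\b", re.IGNORECASE)


def find_matches(abbrev, titles):
    """Return the titles that the abbreviation's pattern matches."""
    pattern = abbreviation_to_regex(abbrev)
    return [title for title in titles if pattern.search(title)]


if __name__ == "__main__":
    titles = [
        "Journal of the American Chemical Society",
        "Journal of Chemical Physics",
    ]
    print(find_matches("J. Am. Chem. Soc.", titles))
```

A pattern this loose will often return several candidates for a short abbreviation, which fits the abstract's description of returning a list of possible matches rather than a single answer.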

Repurposing ProQuest Metadata for Batch Ingesting ETDs into an
Institutional Repository
by Shawn Averkamp and Joanna Lee
This article describes the workflow used by the University of Iowa
Libraries to populate their institutional repository and their catalog
with the data collected by ProQuest UMI Dissertation Publishing during
the submission of students’ theses and dissertations. Re-purposing the
metadata from ProQuest allowed the University of Iowa Libraries to
streamline the process for ingesting theses and dissertations into
their institutional repository. The article includes a discussion of
the benefits and limitations of the workflow described.

Bibliographic Metadata Extraction from Theses
by Götz Hatop
This article presents the application of part-of-speech (POS) based
statistical text analysis to the task of bibliographic metadata
extraction from electronic dissertations. By using the approach
described here it is possible to detect the title of a Ph.D. paper
with an accuracy of about 80%. The accuracy measurements are done
using a conceptually simple approach and implementation.