** Apologies for cross-posting **

Please note that the deadline for hotel registration and early  
conference registration is December 22nd.  The user group meeting  
programs are now posted on the website at http://


Sayeed Choudhury
Associate Director for Library Digital Programs
Hodson Director of the Digital Knowledge Center
Sheridan Libraries
Johns Hopkins University

Preview of the Open Repositories Conference 2007, January 23-26, 2007,
San Antonio, Texas


Last January the Australian Partnership for Sustainable Repositories  
gathered visionaries for the first time in Sydney to share information about  
how DSpace, Fedora, and Eprints repositories were changing the nature  
of scholarly and commercial information communities of practice. The  
upcoming Open Repositories Conference will bring user communities and  
others a step closer to understanding the pivotal role that  
repositories play in the emerging information landscape.    
Institutions such as universities, research laboratories, publishers,  
libraries, and commercial organizations are creating innovative  
repository-based systems that address the entire lifecycle of  
information—from supporting the creation and management of digital  
content, to enabling use, re-use, and interconnection of information,  
to ultimately ensuring long-term preservation and archiving. Open  
Repositories 2007 (OR07) will bring global stakeholders together  
again to discuss the challenges inherent in the conference tagline,  
“Achieving Interoperability in an Open World.”  What are the policy  
issues that are implied in an open world?  What are the technical  
challenges in achieving interoperability across heterogeneous  
repositories and related services?  How can advanced repository-based  
systems enable the collaborative processes around “e-science” and  
scholarly communication?   What are the challenges in enabling users  
to discover and access information across distributed repositories?   
What does open access to content mean across cultures? These are just  
some of the questions that attendees will ponder during the  
three-and-a-half-day conference scheduled for January 23-26, 2007 in San  
Antonio, Texas.

DSpace, Fedora, and Eprints User Group meetings will be held on Jan.  
23 and 24, followed by combined conference plenary sessions on Jan.  
24, 25 and 26. The conference reception and poster session will take  
place on Jan. 24.

James Hilton, Vice President and Chief Information Officer at the  
University of Virginia, and Tony Hey, Corporate Vice President for  
Technical Computing, Microsoft, will discuss the opportunities and  
challenges in making human knowledge accessible and interoperable in  
an open world in keynote addresses on January 24 and January 26.

The Conference plenary program focuses on presentations in six  
categories that offer new ideas and solutions for online  
collaborative science and scholarship, along with insights into how  
to manage policy and decisions for the creation and preservation of  
distributed institutional knowledge.

Management Strategy and Policy

The ARROW Project at 3 years: Looking Backwards, Aiming Forwards.


Since 2003 ARROW has been funded by the Australian Commonwealth  
Department of Education, Science and Training to identify and test  
solutions for best institutional repository practices. Andrew  
Treloar, Monash University, will offer an analysis of how the  
project's objectives have evolved, views on repository technology  
then and now, software development issues, and implementation  
decisions culled from three years of practice using Fedora.

How the Principles and Activities of Digital Curation Guide  
Repository Management and Operations

Leslie Johnston, University of Virginia Library, will share four  
overarching principles of digital curation that have been successful  
in making it easier to build trusted discovery and delivery services  
and tools for the use of digital objects. Principles for Selection,  
Principles for the Use of Standards, Principles for Trustworthiness,  
and Principles for Preservation and Sustainability are local  
principles that have provided a model for the creation of collection  
development policies, the identification of service goals for a  
repository and related policies and activities.

CURATOR: Its Developmental Strategy


How do you enable indexing of Japanese character strings for  
searching? This presentation describes practical and strategic  
approaches adopted by Japan's first institutional repository launched  
by a university library–Chiba University's Repository for Access to  
Outcome from Research (CURATOR).
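
The abstract does not say how CURATOR indexes Japanese text. One common approach for languages written without spaces between words is to index overlapping character bigrams. The sketch below illustrates that general technique only; the function names and sample strings are hypothetical and are not taken from the CURATOR project.

```python
from collections import defaultdict


def bigrams(text):
    """Return the overlapping two-character substrings of text."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


def build_index(docs):
    """Map each bigram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in bigrams(text):
            index[gram].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every bigram of the query.

    These are candidate matches: the bigrams are not checked for
    adjacency, so a real system would verify the phrase afterwards.
    """
    grams = bigrams(query)
    if not grams:
        return set()
    result = set(index.get(grams[0], set()))
    for gram in grams[1:]:
        result &= index.get(gram, set())
    return result


# Hypothetical sample records, for illustration only.
docs = {
    1: "千葉大学学術成果リポジトリ",
    2: "大学図書館の機関リポジトリ",
}
index = build_index(docs)
print(sorted(search(index, "リポジトリ")))  # [1, 2]
print(sorted(search(index, "千葉大学")))    # [1]
```

Bigram indexing needs no dictionary or word segmenter, at the cost of a larger index and occasional false matches.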


Policy Frameworks for Institutional Repositories

As repositories begin to federate and interoperate at a large scale,  
the inability to express local policies as part of the context of the  
digital collections becomes more problematic. MacKenzie Smith, MIT,  
and Reagan Moore, SDSC, will report on work by the MIT Libraries and  
the University of California, San Diego Supercomputer Center on the  
PLEDGE project (PoLicy Enforcement in Data Grid Environments). The  
project is funded by the US National Archives and Records  

Using OAI-PMH Resource Harvesting and MPEG-21 DIDL for Digital

To successfully preserve a web site, its resources must be crawled  
and the structure and relationships among the resources must be  
maintained. Joan Smith and Michael Nelson, Old Dominion University,  
propose involving the web server in the preservation process through  
“mod_oai”, an Apache module to harvest a web site packaged with its  
associated metadata, thereby contributing to its long-term preservation.
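
For readers unfamiliar with OAI-PMH, a harvest is an ordinary HTTP GET whose query string carries a protocol verb and its arguments. The sketch below only builds such a request URL and does not send it. The base URL is a hypothetical placeholder, and the `oai_didl` metadata prefix is an assumption made for illustration, not a detail stated in the abstract.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Optional: harvest only records changed on or after this date.
        params["from"] = from_date
    if set_spec is not None:
        # Optional: restrict the harvest to one set.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and assumed metadata prefix.
url = list_records_url(
    "http://www.example.org/modoai", "oai_didl", from_date="2007-01-01"
)
print(url)
```

A harvester would issue this request, read the returned XML, and repeat with the `resumptionToken` argument while the response is split across several pages.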

CRiB: Preservation Services for Digital Repositories


The active lifespan of digital materials is much longer than the  
lifetime of individual storage media, hardware and software  
components, as well as the formats in which the information is  
encoded. As hardware and software become obsolete, digital materials  
become prisoners of their own encodings. Miguel Ferreira, Ana Alice  
Baptista, and Jose Carlos Ramalho from the University of Minho,  
Portugal, will present the CRiB recommendation service that is  
designed to help institutions determine optimal migration strategies  
within a range of choices to preserve authentic materials.

User Services and Workflow

Making Fedora Easier to Implement with Fez–A Free Open Source Content  
Model and Workflow Management Front-end to Fedora


The University of Queensland, Australia, has developed Fez, a  
world-leading user interface and management system for Fedora-based  
institutional repositories, which bridges the gap between a  
repository and users. Christiaan Kortekaas, Andrew Bennett and Keith  
Webster will review this open source software that gives institutions  
the power to create a comprehensive repository solution without the  

Real-time Duplicate and Plagiarism Detection

While electronic access to documents provides unprecedented  
opportunity for plagiarism, it also provides an unprecedented  
opportunity to automate the detection of plagiarism. Simeon Warner,  
Cornell University, will describe the implementation and the  
underlying algorithm of a service, used in managing the arXiv  
repository, that compares the full text of each new submission  
against all existing submissions in real time. arXiv contains over  
390,000 articles and will grow by more than 10% in the next year.
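
The abstract does not describe the algorithm itself. As a rough illustration of what comparing a new submission against existing ones can involve, the sketch below uses one generic technique: word shingles scored with Jaccard similarity. It is not the method presented in the talk; the function names, sample texts, and the 0.5 threshold are all hypothetical.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_similar(new_text, existing, threshold=0.5, k=3):
    """Return (doc_id, score) pairs at or above threshold, best first."""
    new_set = shingles(new_text, k)
    hits = []
    for doc_id, text in existing.items():
        score = jaccard(new_set, shingles(text, k))
        if score >= threshold:
            hits.append((doc_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical sample texts, for illustration only.
existing = {
    "a": "we study the spectrum of random matrices in detail",
    "b": "a survey of digital preservation strategies",
}
new = "we study the spectrum of random matrices in depth"
print(find_similar(new, existing))  # [('a', 0.75)]
```

This version scans every existing document, which would be too slow for hundreds of thousands of articles; a real-time service would need an index or hashing scheme to narrow the candidates first.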

An Ethnographic Study of Institutional Repository Librarians: Their  
Experiences of Usability


It is largely unknown whether current repository software and its  
tools are adequate and appropriate for the tasks performed by  
repository managers. Sally Jo  
Cunningham, Dave Nichols, Dana McKay and David Bainbridge from the  
University of Waikato, New Zealand, will share their observations  
based on their ethnographic study of local librarians who support the  
inclusion of new material in institutional repositories.

Semantic Web and Web 2.0

Realizing the Role of Digital Repositories in Educational  
Applications: Supporting Content and Context

DLESE Teaching Boxes are customizable, digital replicas of the  
traditional collections that most educators create, store (in boxes),  
re-use and improve on during their years of teaching. Huda Khan and  
Keith Maull from the Digital Library for Earth System Education  
(DLESE) will review development of the Teaching Box Builder  
application and  
discuss questions raised with respect to repository integration with  
real-time Web 2.0 technologies as well as how this application design  
provides support for educators’ creation and adaptation of  
pedagogical content and context.

Cross-Repository Semantic Interoperability: the MIT SIMILE Project


Many questions are raised as previously unreachable digital content  
is found in and among new repositories--is each repository an island  
or a separately searchable resource? SIMILE (Semantic  
Interoperability of Metadata and Information in Unlike Environments)  
has developed an extensive 'tool chain' for gathering and  
manipulating data assets. Richard Rodgers and MacKenzie Smith, MIT,  
will demonstrate how tools developed by the SIMILE project can be  
used as powerful instruments for the federation, discovery,  
exploration, and curation of metadata.

The BibApp–Enabling Rapid Repository Population

The University of Wisconsin-Madison Libraries recently launched the  
Office of Scholarly Communication and Publishing (OSCP) and uses  
BibApp to consolidate campus directory information with citation data  
gathered by librarians, departments and research centers into a  
single online interface. Eric Larson will describe how BibApp alerts  
OSCP to content that may be suitable for fast “mashup” repository  
ingest. OSCP has prepared 1,200+ papers for ingest using BibApp.


The OAI Object Re-Use and Exchange (ORE) Initiative

There are numerous examples of the need to re-use objects across  
repositories in scholarly communication. Carl Lagoze, Cornell  
University and Herbert Van de Sompel, Los Alamos National Laboratory,  
will discuss the ORE (Object Re-Use and Exchange) Initiative that  
seeks to implement an interoperable fabric consisting of service  
interfaces shared across repositories, and some shared  
infrastructure. Repository federation efforts such as aDORe, CORDRA,  
the Chinese DSpace Federation, DARE, and Pathways (NSF IIS-0430906)  
suggest that such object re-use is achievable and will create the  
building blocks of a global scholarly communication federation in  
which each individual digital object will fuel a variety of  

Repository Deposit Service Description


Rachel Heery, Julie Allinson, Jim Downing, Christopher Gutteridge and  
Martin Morrey, UKOLN, University of Bath, will update attendees on a  
three-year UK program that is developing repository infrastructure  
aimed at increasing open access to scholarly material, while  
improving management of assets in higher education institutions. This  
effort is designed to ensure that the emerging network of JISC (Joint  
Information Systems Committee) Digital Repositories is well  
populated with content. They will present their work towards defining  
a lightweight Common Repository Deposit Service Description.

An Analysis of Digital Repository Scenarios, Use Cases and Workflows

This presentation will set out the preliminary results of a study of  
a cross-section of the diverse repository developments ongoing in the  
United Kingdom. To date, over 80 scenarios and 20 use cases have been  
collected covering contexts such as: delineating the community  
dimensions of learning object repositories, depositing geospatial  
data, storing versions of content in a repository, developing  
metadata workflow in a laboratory repository holding research data,  
and adding digital rights information. Mahendra Mahey, Rachel Heery,  
Julie Allinson and Robert John Robertson, UKOLN, University of Bath,  
will present the methodology developed to collect, compare and  
analyze scenarios, use cases and workflows for the identification of  
common functional internal components and interactions with external  
services in the information landscape.

e-Science and e-Scholarship

The Eprints Application Profile: A FRBR Approach to Modeling  
Repository Metadata

Julie Allinson, Pete Johnston and Andy Powell, UKOLN, University of  
Bath, will present recent work on developing a Dublin Core Application  
Profile (DCAP) for describing 'scholarly publications' (eprints).  
They will explain why the Dublin Core Abstract Model is well suited  
to creating descriptions based on entity-relational models such as  
the FRBR-based (Functional Requirements for Bibliographic Records)  
Eprints data model. The Eprints DCAP highlights the relational nature  
of the model underpinning Dublin Core and illustrates that the Dublin  
Core Abstract Model can support the representation of complex data  
describing multiple entities and their relationships.

eSciDoc–a Scholarly Information and Communication Platform for the  
Max Planck Society

Digital libraries have become tools for everyday work. But are they  
ready for e-Scholarship? Scholarship produces additional types of  
information that are not curated by traditional libraries such as  
primary data, simulations, informal results, and annotations.  
Matthias Razum, FIZ Karlsruhe, will discuss eSciDoc, a joint project  
of the Max Planck Society and FIZ Karlsruhe that will create a  
next-generation platform for communication and publication in research  

ChemXSeer: A Chemistry Web Portal for Scientific Literature and Datasets

The ChemXSeer portal is designed to be a hub for research in chemistry  
by facilitating search and access to both scientific literature and  
experimental datasets, while bridging these information sources in a  
unified framework. Levent Bolelli, Xiaonan Lu, Ying Liu, Anuj  
Jaiswal, Kun Bai, Isaac Councill, Prasenjit Mitra, James Z. Wang,  
Karl Mueller, James Kubicki, Barbara Garrison, Joel Bandstra and C.  
Lee Giles, Pennsylvania State University, will present an overview of  
ChemXSeer, a portal for academic researchers in environmental  
chemistry that integrates scientific literature with experimental,  
analytical and simulation result datasets. The hybrid repository of  
ChemXSeer will comprise information crawled from the web,  
manual submissions of scientific documents, and user-submitted  
datasets, as well as scientific documents and metadata provided by  
major publishers.

Advance registration for the conference is open until December 22,  
2006. More information including an at-a-glance conference schedule  
and plenary, keynote and user group session descriptions is available  
on the conference website.