A few weeks ago on the Code4Lib Slack channel, a computer science student asked:

  Hi! I'm only a bachelor's student in CompSci but I'd be
  interested to learn more about library specific software. What
  are some small beginner friendly projects to get into? I learn
  better creating a product.

And I replied with the following. I'm sharing it here because I believe it is relevant to many of us here:

  More or less, libraries and librarianship are traditionally about
  the collection, organization, preservation, and dissemination of
  data, information, and knowledge. Beginner friendly projects?
  There are quite a few, listed below in no priority order:
  
  * Create a library catalog - Download and install a program
    called Koha. Bring together a large handful of your books. Use
    Koha to describe your books. Make the resulting catalog
    temporarily available on the Web. For extra credit, search sources
    such as the Library of Congress for descriptions of books (known
    as MARC records), and add them to your catalog. All of this can be
    done within Koha using both GUI and programmatic interfaces.
  
  * Build a collection of scholarly journals - Articulate a topic
    of personal interest, but don't be too specific. Peruse a
    directory of scholarly journals called the Directory of Open
    Access Journals (DOAJ). Look for titles matching your interest
    and note the URL pointing to each title's OAI-PMH data root. (This
    is the hardest part.) Use either Perl or Python toolkits
    implementing the OAI-PMH protocol, and collect the bibliographics
    of all the articles in a given title. For extra credit collect
    the actual articles, not only the bibliographics.
  
  * Index the content of a relational database - Draw an entity
    relationship diagram illustrating the layout of a set of
    data/information you want to collect. The data/information can be
    any number of things: your books, your CDs, your DVDs, cool
    websites, journals identified in the previous project, etc.
    Use SQLite to implement the layout, and fill the database with
    content. Finally, use SQLite's fulltext/freetext indexing feature
    to make the database searchable, and write a command-line shell
    tool to query the index. For extra credit, create a Web-based
    interface to the index.
  
  * Archive content - Identify websites of interest to yourself.
    Become familiar with the robots.txt convention. Use a
    command-line tool called wget to crawl the websites of interest,
    and use wget's WARC feature to create long-lasting snapshots of
    the sites. Use these WARC files as fodder for the library catalog
    or relational database project.
  
  * Create a website - Sign up for a free Amazon Web Services
    account. Spin up a tiny instance with two cores and the
    tiniest bits of RAM and disk space. All of this is still free.
    Install Apache -- an HTTP server -- on the instance. Write the
    tiniest of HTML pages and save it at the root of your Apache
    server. Finally, configure the instance to accept connections
    from the world. For extra credit, write a .htaccess file limiting
    access to the site via username/password combinations, and lock
    down access to the site to only your friends and family.
  
  * Practice with REST - Enumerate things of interest to yourself.
    Become familiar with the Internet Archive's REST interface for
    searching its collection. Articulate queries to search the
    Archive's collection using its REST implementation, and manifest
    the queries using a tool called curl. The results of the queries
    will be JSON streams. Use another tool -- jq -- to read, parse,
    and filter the results. In the end you will get links to PDF, plain
    text, image, and descriptive (MARC) files of the Archive's
    content. Use your new skills against other websites with REST
    interfaces. For extra credit, use this content as fodder for some
    of the other projects.
  
  * Bind a book - Identify a classic work of literature that piques
    your interest. Download a PDF version of the book. Print it. Use
    a binding technique called the Japanese stab stitch to bind the
    book. Read the book and while you do so, write in the margins.
    Alternatively, use a comb binder or something similar to bind the
    book. For extra credit, understand that a PDF version of a book
    is not a book. Instead, it is a file. To really make the PDF file
    a book, it behooves you to impose the pages to create signatures,
    bind the signatures into a book block, and finally encase the
    book block between covers. By far, the most difficult part of
    this process is the imposition, and a program called Fitplot works
    very well for me in this regard. Like the creation of WARC files,
    the binding of books is a type of preservation process.
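
To make the relational database project above a bit more concrete, here is a minimal sketch in Python. It assumes your Python's copy of SQLite was compiled with the FTS5 full-text extension (most are, but not all). The table names, the column names, and the two helper functions are hypothetical, invented only for illustration; your own entity relationship diagram will dictate something different.

```python
import sqlite3


def build_index(records):
    """Load (title, author, notes) tuples into a table and a full-text index."""
    connection = sqlite3.connect(":memory:")

    # An ordinary relational table; this layout is only an illustration.
    connection.execute(
        "CREATE TABLE books ("
        "id INTEGER PRIMARY KEY, title TEXT, author TEXT, notes TEXT)"
    )

    # A virtual table that holds the full-text index (requires FTS5).
    connection.execute(
        "CREATE VIRTUAL TABLE books_index USING fts5(title, author, notes)"
    )

    for title, author, notes in records:
        cursor = connection.execute(
            "INSERT INTO books (title, author, notes) VALUES (?, ?, ?)",
            (title, author, notes),
        )
        # Reuse the primary key as the index's rowid so the two can be joined.
        connection.execute(
            "INSERT INTO books_index (rowid, title, author, notes) "
            "VALUES (?, ?, ?, ?)",
            (cursor.lastrowid, title, author, notes),
        )

    connection.commit()
    return connection


def search(connection, query):
    """Return (title, author) pairs matching a full-text query, best first."""
    rows = connection.execute(
        "SELECT books.title, books.author "
        "FROM books_index JOIN books ON books.id = books_index.rowid "
        "WHERE books_index MATCH ? "
        "ORDER BY books_index.rank",
        (query,),
    )
    return rows.fetchall()
```

Wrapping search() in a script that reads its query from the command line gives you the shell tool, and pointing sqlite3.connect() at a file instead of ":memory:" makes the database persistent.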
  
  Underlying all of these projects is one thing: libraries are not
  about books. Books manifest the data, information, and knowledge,
  and nowadays, data, information, and knowledge are increasingly
  manifested in digital forms.
  
  Fun with librarianship!
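
The REST project above uses curl and jq, but the same idea can be sketched with nothing more than Python's standard library, which stands in for those two tools here. The sketch assumes the Internet Archive's advancedsearch.php endpoint and a JSON response shaped like {"response": {"docs": [...]}}, which is how I understand the service to behave; check the Archive's own documentation before relying on it. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed search endpoint of the Internet Archive.
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def build_search_url(query, rows=5):
    """Articulate a query as a URL, just as one would hand it to curl."""
    parameters = [
        ("q", query),
        ("fl[]", "identifier"),  # fields to return; repeated once per field
        ("fl[]", "title"),
        ("rows", rows),
        ("output", "json"),
    ]
    return SEARCH_ENDPOINT + "?" + urlencode(parameters)


def extract_links(payload):
    """Do the job of jq: pull titles and download links out of the JSON."""
    documents = json.loads(payload)["response"]["docs"]
    return [
        (
            document.get("title", ""),
            "https://archive.org/download/" + document["identifier"],
        )
        for document in documents
    ]


def search(query, rows=5):
    """Send the query over the network and return (title, link) pairs."""
    with urlopen(build_search_url(query, rows)) as response:
        return extract_links(response.read().decode("utf-8"))
```

Calling search("title:walden") requires a network connection; build_search_url() and extract_links() can be exercised offline, which makes them handy for practice.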

--
Eric Lease Morgan
Navari Family Center for Digital Scholarship
Hesburgh Libraries
University of Notre Dame