On Aug 9, 2019, at 4:28 PM, Eric Hanson wrote:

> The newest issue of code4Lib Journal is now available - https://journal.code4lib.org/issues/issues/issue45


It is nice to see our journal thrive.

Our journal is also great fodder for "distant reading", so I took it upon myself to "read it from afar". I was happy to see that the topic modeling seemed to work well:

  http://carrels.distantreader.org/library/code4lib-45/index.htm#topic-modeling

More specifically, I requested five topics from the issue, and the following "themes" were returned, each with its most-associated article title:

  1. ar, data, sh - Programming Poetry: Using a Poem Printer and
     Web Programming to Build Vandal Poem of the Day

  2. org, rightsstatements, copyright - Consortial
     RightsStatements.org Implementation and Faceted Search for Reuse
     Rights in Digital Library Materials

  3. terms, library, video - Generating Geographic Terms for
     Streaming Videos Using Python: A Comparative Analysis

  4. sinopia, react, component - Developing Sinopia’s Linked-Data
     Editor with React and Redux

  5. ohsu, search, publications - Building an institutional author
     search tool

The extraction of statistically significant keywords worked well too, but some of them ought to be denoted as stop words:

  http, libraries, library, coding, collecting, component,
  copyright, data, digitization, digitizing, falsc, functioning,
  like, metadata, method, ohsu, people, poems, poetry, policy,
  prints, publication, rdf, react, rightsstatement, searching,
  shacl, sinopia, syndrome, term, video, web, williamson, wordpress
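
Filtering such words out is a small step. What follows is only a minimal sketch in Python, not the code behind the Distant Reader: the keyword list is copied from above, while the set EXTRA_STOP_WORDS and the function drop_stop_words are hypothetical names, and the choice of which words to drop is a guess made for illustration.

```python
# Keywords reported for the issue, copied from the list above.
KEYWORDS = [
    "http", "libraries", "library", "coding", "collecting", "component",
    "copyright", "data", "digitization", "digitizing", "falsc", "functioning",
    "like", "metadata", "method", "ohsu", "people", "poems", "poetry", "policy",
    "prints", "publication", "rdf", "react", "rightsstatement", "searching",
    "shacl", "sinopia", "syndrome", "term", "video", "web", "williamson",
    "wordpress",
]

# Hypothetical stop words: an assumption for illustration only. The
# message does not say which keywords ought to be treated this way.
EXTRA_STOP_WORDS = {"http", "like", "people", "method", "functioning"}


def drop_stop_words(keywords, stop_words):
    """Return the keywords not found in stop_words, preserving their order."""
    blocked = {word.lower() for word in stop_words}
    return [word for word in keywords if word.lower() not in blocked]


if __name__ == "__main__":
    # Print the surviving keywords, one per line.
    for word in drop_stop_words(KEYWORDS, EXTRA_STOP_WORDS):
        print(word)
```

The comparison is case-insensitive, so a stop word entered as "HTTP" would still match "http". In practice the extra stop words would more likely be fed back into the keyword extraction itself rather than applied afterwards.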

Fun with indexing.

--
Eric Morgan
University of Notre Dame