I touch on researchers' needs for text mining and the like in two recent blog entries:

- FREE THE ARTICLES! (Full-text for researchers & scientists and their
- New Open Access Criterion: Support access by machines (m2m)
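
For a concrete sense of the machine processing at stake, here is a toy sketch that pulls acronym-like tokens (gene and protein symbols, say) out of full text and counts which ones share a sentence. The regular expression, the function names and the sample sentences are all assumptions made for illustration; real text mining would need far more careful entity recognition.

```python
import re
from collections import Counter
from itertools import combinations

# Assumption: an "acronym" is 2-10 capitals/digits starting with a capital.
ACRONYM = re.compile(r"\b[A-Z][A-Z0-9]{1,9}\b")


def acronyms_by_sentence(text):
    """Return the set of acronym-like tokens found in each sentence."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [set(ACRONYM.findall(s)) for s in sentences if s]


def cooccurrences(text):
    """Count how often two acronym-like tokens appear in the same sentence."""
    pairs = Counter()
    for found in acronyms_by_sentence(text):
        pairs.update(combinations(sorted(found), 2))
    return pairs


# Invented sample text, standing in for an article's full text.
sample = (
    "Loss of BRCA1 alters TP53 signalling. "
    "TP53 and BRCA1 are both tumour suppressors. "
    "EGFR was not affected."
)

for pair, count in cooccurrences(sample).most_common():
    print(pair, count)
```

None of this is possible at scale unless the full text is both retrievable by machines and licensed for this kind of reuse.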


Glen Newton | [log in to unmask]
Researcher, Information Science, CISTI Research
& NRC W3C Advisory Committee Representative
tel/tél: 613-990-9163 | facsimile/télécopieur 613-952-8246
Canada Institute for Scientific and Technical Information (CISTI)
National Research Council Canada (NRC)| M-55, 1200 Montreal Road
Institut canadien de l'information scientifique et technique (ICIST)
Conseil national de recherches Canada | M-55, 1200 chemin Montréal
Ottawa, Ontario K1A 0R6
Government of Canada | Gouvernement du Canada
>>>>> "Jonathan" == Jonathan Rochkind <[log in to unmask]> writes:

    Jonathan> An announcement from the DOAJ that we got at the
    Jonathan> Code4Lib Journal, since we're listed in the DOAJ.

    Jonathan> I forward it to you all because it's related to the
    Jonathan> on-going discussion some of us are having about "how the
    Jonathan> heck can we get our software to find open access
    Jonathan> versions of articles." Certainly not close to a fix-all
    Jonathan> even if their project is successful, but addresses one
    Jonathan> component of one subset of open access material.

    Jonathan> Jonathan

    Jonathan> doaj-team wrote:
    >> Lund, Sweden, 23 April 2008
    >> Important news for all publishers who have journals listed in
    >> the Directory of Open Access Journals (DOAJ)
    >> Dear publishers of journals listed in the Directory of Open
    >> Access Journals (DOAJ),
    >> We, the team behind the DOAJ, are approaching you to inform
    >> you about two important issues.
    >> Firstly, as you are probably aware, there is growing
    >> discussion of, and attention to, open access to scholarly
    >> information in the research community. The current discussion
    >> concentrates on open access in a broader sense than just
    >> free access to journal articles.
    >> In order for research to be really open, researchers need more
    >> than free access to the articles, that is, more than
    >> free-to-read. Researchers increasingly expect to be able not
    >> only to reuse the text in various ways but also to do text and
    >> data mining, in order to extract and discover fragments of
    >> the content more efficiently (e.g. acronyms for genes and
    >> proteins, abbreviations etc.) and to uncover hidden relations
    >> between such fragments by automated computing.
    >> In order for open access journals to be even more useful and
    >> thus receive more exposure and provide more value to the
    >> research community it is very important that open access
    >> journals offer standardized, easily retrievable information
    >> about what kinds of reuse are allowed.
    >> Creative Commons offers a number of standardized licenses that
    >> make it very easy for content providers to offer information
    >> about these issues. More information about this is given
    >> under Step 1 below.
    >> Secondly, SPARC Europe and The Directory of Open Access
    >> Journals (operated by Lund University, Sweden) have entered an
    >> agreement about introducing a certification scheme for Open
    >> Access journals, the SPARC Europe Seal for Open Access
    >> Journals.
    >> The intention of the scheme is to motivate open access journals
    >> to deliver metadata to DOAJ. The DOAJ team will then convert
    >> the metadata into standardized XML-format and OAI-compliant
    >> format, which will further increase the visibility of articles
    >> and provide means for the easiest possible dissemination thus
    >> reaching more readers, attracting more authors, gaining more
    >> prestige and impact.
    >> The team behind the DOAJ will offer various forms of assistance
    >> and guidance in this respect.
    >> What are the advantages of having the SPARC Europe Seal?
    >> - Improved information as to what users are allowed to do with
    >>   papers published in your journal(s).
    >> - Possible long-term archiving of your content, which makes
    >>   publishing in your journal more attractive to authors.
    >> - Better exposure as a high-quality journal based on
    >>   state-of-the-art dissemination technologies.
    >> - The DOAJ team converts your metadata and makes it
    >>   harvestable, which means the widest possible dissemination
    >>   and thus increased usage and impact.
    >> How to be approved:
    >> Step 1:
    >> Choose the Creative Commons CC-BY license.
    >> In order to qualify for the SPARC Europe Seal you must apply
    >> the CC-BY license, which is the most user-friendly license,
    >> allowing among other things for long-term preservation and
    >> text and data mining.
    >> How to choose the CC-BY license:
    >> Go to the Creative Commons (CC) web site
    >> ( and copy the CC-BY
    >> Icon -
    >> -- you might as well consult this:
    >> .
    >> Put the CC-BY icon on the homepage of your journal(s) and
    >> preferably on each article in your journal.
    >> Go to the DOAJ web site, log in to "For journal owners",
    >> click on "license info" and choose the CC license for your
    >> journal(s).
    >> The CC icon will be shown automatically in DOAJ.
    >> Step 2:
    >> Your journal(s) shall continuously provide DOAJ with metadata
    >> for all of your content.
    >> How to provide us with the metadata:
    >> Right now DOAJ tools allow you to do the following:
    >> - upload article by article by filling in a web form
    >> - upload files containing one or more records. The files must
    >>   conform to the DOAJ XML Schema specification (read more at:
    >> These two features can be found once you have logged in to "For
    >> journal owners" on DOAJ web site (
    >> Once we have your content (metadata on article level) in DOAJ,
    >> the content becomes OAI-harvestable and is distributed in an
    >> XML format to the rest of the world.
    >> Thanks in advance
    >> The DOAJ Team

    Jonathan> Jonathan Rochkind
    Jonathan> Digital Services Software Engineer
    Jonathan> The Sheridan Libraries
    Jonathan> Johns Hopkins University
    Jonathan> 410.516.8886
    Jonathan> rochkind (at)
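
On the "OAI harvestable" part of the announcement: as a rough illustration of what a harvester could do with such metadata, here is a minimal sketch that builds an OAI-PMH ListRecords request URL and pulls the identifier, title and rights statement out of a response. The base URL, the sample record and the function names are hypothetical, and the sketch assumes the records are exposed as unqualified Dublin Core (oai_dc); how DOAJ actually exposes license information is not stated in the announcement.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Standard OAI-PMH and Dublin Core namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for a (hypothetical) OAI-PMH endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Extract identifier, title and rights statements from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
            "rights": [r.text for r in rec.iterfind(".//dc:rights", NS) if r.text],
        })
    return records


# Invented response, trimmed to one record, standing in for a real harvest.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:article/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An invented article title</dc:title>
          <dc:rights>CC-BY</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://example.org/oai"))
for record in parse_records(SAMPLE):
    print(record["identifier"], record["title"], record["rights"])
```

A link resolver could use a rights field like this to decide whether an article it has found may be reused or mined, which is the piece of the puzzle the license requirement addresses.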