I'd spin that around and ask how you'd exploit a set of CSV files, or files
in any other format. Regardless of how the data is obtained, the structure
and methods you use should be driven by the problem you want to solve.
This is not to say that it's not sometimes fun to have a tool in your hand
and then try to find problems it might be good for -- especially if that
tool throws fire or potentially does a lot of destruction. But I
digress.... In general, lessons are more likely to stick if you're working
on something you actually care about.
If you're just looking to play around with this particular data, the
authors strike me as the most interesting part of the equation. It would be
interesting to know how consistently their names are entered (especially
across publishers), how many of the authors different publishers list, etc.
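
As a rough illustration of the name-consistency question, here is a minimal
Python sketch. It assumes you have already pulled (author name, publisher)
pairs out of the RDF by whatever means you like; the sample names and
publishers are made up, and the helpers normalize_name and find_inconsistent
are hypothetical, not part of any library. The matching key (surname plus
first initial) is deliberately crude and will produce false matches on real
data, so treat the output as a list of candidates to eyeball.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name):
    """Reduce a name to a crude key: lowercase ASCII surname + first initial."""
    # Strip accents so that e.g. "José" and "Jose" compare equal.
    plain = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
        .lower()
    )
    if "," in plain:
        # "Surname, Given Names" form
        surname, _, given = plain.partition(",")
    else:
        # "Given Names Surname" form: assume the last token is the surname
        parts = plain.split()
        surname = parts[-1] if parts else ""
        given = " ".join(parts[:-1])
    surname = "".join(ch for ch in surname if ch.isalpha() or ch in " -").strip()
    given = given.strip()
    initial = given[0] if given else ""
    return f"{surname} {initial}".strip()


def find_inconsistent(records):
    """Return {key: {variant: [publishers]}} for keys with more than one spelling."""
    groups = defaultdict(lambda: defaultdict(set))
    for name, publisher in records:
        groups[normalize_name(name)][name].add(publisher)
    return {
        key: {variant: sorted(pubs) for variant, pubs in variants.items()}
        for key, variants in groups.items()
        if len(variants) > 1
    }


# Made-up sample data, for illustration only.
records = [
    ("Doe, Jane Q.", "Publisher A"),
    ("Jane Doe", "Publisher B"),
    ("J. Doe", "Publisher B"),
    ("Roe, Richard", "Publisher A"),
]

for key, variants in find_inconsistent(records).items():
    print(key)
    for variant, publishers in variants.items():
        print(f"  {variant!r} listed by {', '.join(publishers)}")
```

Counting how many distinct keys each publisher lists would be a small
variation on the same grouping.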
On Tue, Jan 15, 2019 at 6:38 AM Eric Lease Morgan wrote:
> How might I exploit & learn from a set of RDF files harvested from DOI's?
> ...... How can I go beyond parsing the RDF with XPath, stuffing the
> results into a database, and applying SQL to the result? How can I truly
> exploit the nature of the RDF and possibly manifest it as linked data?