Matt Amory wrote:
> Is anyone involved with, or does anyone know of any project to extract and
> aggregate bibliography data from individual works to produce some kind of
> "most-cited" authors list across a collection? Local/Network/Digital/OCLC
> or historic?
>
> Sorry to be vague, but I'm trying to get my head around whether this is a
> tired old idea or worth pursuing...
>
>
Sounds like you're describing CiteSeer - http://citeseerx.ist.psu.edu/ -
a combined bibliographic and citation index for computer science
literature. It includes a good deal of citation analysis, and it's an
incredibly useful tool.
Funding comes from NSF, NASA, and Microsoft Research. It was initially
developed at NEC Research Institute, then moved to Penn State. The code
is available on SourceForge under an Apache license (find the link on
the page cited above).
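For what it's worth, the aggregation step you describe is small once the
references have been parsed into author names; extracting them reliably
from the works is the hard part. Below is a minimal sketch in Python, not
CiteSeer's actual code. The function name, the input layout (work id
mapped to a list of references, each a list of author strings), and the
naive name normalisation are all my own assumptions for illustration.

```python
from collections import Counter


def most_cited_authors(bibliographies, top_n=10):
    """Rank authors by how many cited references they appear in.

    `bibliographies` is a hypothetical, already-parsed structure:
    {work_id: [[author, author, ...], ...]} with one inner list per reference.
    """
    counts = Counter()
    for references in bibliographies.values():
        for authors in references:
            # Collapse whitespace and case so trivially different spellings
            # merge; the set avoids counting one author twice per reference.
            names = {" ".join(name.split()).casefold() for name in authors}
            counts.update(names)
    return counts.most_common(top_n)


# Made-up sample data, only to show the shape of the input.
collection = {
    "work-a": [["Lawrence, S.", "Giles, C. L."], ["Bollacker, K."]],
    "work-b": [["Giles, C. L."], ["lawrence, s.", "Bollacker, K."]],
    "work-c": [["Giles,  C. L."]],
}
ranking = most_cited_authors(collection, top_n=3)
```

Real bibliographies would need much stronger author disambiguation
(initials vs. full names, name order, misspellings) than the case and
whitespace folding used here.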
From the history page:
---
CiteSeer was the first digital library and search engine to provide
automated citation indexing and citation linking using the method of
autonomous citation indexing
<http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.17.1607>.
CiteSeer was developed in 1997 at the NEC Research Institute, Princeton,
New Jersey, by Steve Lawrence <http://labs.google.com/people/lawrence/>,
Lee Giles <http://clgiles.ist.psu.edu> and Kurt Bollacker
<http://en.wikipedia.org/wiki/Kurt_Bollacker>. The service transitioned
to the Pennsylvania State University's College of Information Sciences
and Technology in 2003. Since then, the project has been led by Lee
Giles with technical and administrative direction by Isaac Councill
<http://www.personal.psu.edu/%7Eigc2>.
After serving as a public search engine for nearly ten years, CiteSeer,
originally intended as a prototype only, began to scale beyond the
capabilities of its original architecture. Since its inception, the
original CiteSeer grew to index over 750,000 documents and served over
1.5 million requests daily, pushing the limits of the system's
capabilities. Based on an analysis of problems encountered by the
original system and the needs of the research community, a new
architecture and data model was developed for the "Next Generation
CiteSeer," or CiteSeerX, in order to continue the CiteSeer legacy into
the foreseeable future.
---
Miles Fidelman
--
In theory, there is no difference between theory and practice.
In<fnord> practice, there is. .... Yogi Berra