I use Geonames for this sort of thing a lot.  With cities and
administrative divisions being offered in a machine-readable format, it's
pretty easy to encode places in a format that adheres to AACR2 or other
cataloging rules.  There are of course problems disambiguating city names
when no country is given, but I get a pretty accurate response in general:
probably greater than 76% when I have both the city and country or city and
geographic region.
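
To make the disambiguation point concrete, here is a minimal sketch in Python. It is not my actual code and it does not call the Geonames service: the Place type, the resolve function, and the four-row GAZETTEER are all hypothetical stand-ins for a real gazetteer extract. It only shows why a bare city name is ambiguous and how a country or region narrows it down.

```python
from typing import NamedTuple, Optional


class Place(NamedTuple):
    name: str
    admin1: str   # first-level administrative division (state, region)
    country: str  # two-letter country code


# Hypothetical hand-made sample, NOT a real Geonames extract.
GAZETTEER = [
    Place("Paris", "Ile-de-France", "FR"),
    Place("Paris", "Texas", "US"),
    Place("Springfield", "Illinois", "US"),
    Place("Springfield", "Massachusetts", "US"),
]


def resolve(city: str,
            country: Optional[str] = None,
            admin1: Optional[str] = None) -> Optional[Place]:
    """Return the single matching place, or None if unknown or ambiguous."""
    # Step 1: every gazetteer entry with this name is a candidate.
    matches = [p for p in GAZETTEER if p.name.lower() == city.lower()]
    # Step 2: narrow the candidates with whatever context is available.
    if country:
        matches = [p for p in matches if p.country == country.upper()]
    if admin1:
        matches = [p for p in matches if p.admin1.lower() == admin1.lower()]
    # Step 3: accept only an unambiguous answer.
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    print(resolve("Paris", country="FR"))             # city + country
    print(resolve("Paris"))                           # no country given
    print(resolve("Springfield", country="US"))       # country is not enough
    print(resolve("Springfield", admin1="Illinois"))  # city + region
```

Returning None for an ambiguous name, rather than guessing, is a deliberate choice in this sketch: it leaves the hard cases for a person to review.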

Ethan

On Mon, Sep 17, 2012 at 3:16 PM, Eric Lease Morgan <[log in to unmask]> wrote:

> On Sep 17, 2012, at 3:12 PM, <[log in to unmask]> wrote:
>
> > But I'm having trouble coming up with an algorithm that can consistently
> > spit these out in the form we'd want to display given the data available in
> > TGN.
>
>
> A dense but rich, just-published article from D-Lib Magazine about
> geocoding -- Fulltext Geocoding Versus Spatial Metadata for Large Text
> Archives -- may give some guidance. From the conclusion:
>
>  Spatial information is playing an increasing role in the access
>  and mediation of information, driving interest in methods capable
>  of extracting spatial information from the textual contents of
>  large document archives. Automated approaches, even using fairly
>  basic algorithms, can achieve upwards of 76% accuracy when
>  recognizing, disambiguating, and converting to mappable
>  coordinates the references to individual cities and landmarks
>  buried deep within the text of a document. The workflow of a
>  typical geocoding system involves identifying potential
>  candidates from the text, checking those candidates for potential
>  matches in a gazetteer, and disambiguating and confirming those
>  candidates -- http://bit.ly/Ufl5k9
>
> --
> ELM
>