Aha, that's probably what I need. And now I remember that Ross probably pointed it out to me before.

I'm still having trouble figuring out how to get from the RDF triples it's got there to a hash of codes (as they appear in MARC records, not URIs) to labels. It looks like it will in fact be a lot more work than the scraping of the HTML page I'm doing now, but of course the problem with the HTML page is that its structure is not reliable; it changes. So the structured data from id.loc.gov is the way to go... but I'm still getting confused figuring out how to get what I want out of it. If anyone wants to give me any hints, they'd be appreciated.

It kind of looks like I FIRST have to get the complete list from one of the structured forms (RDF/XML, triples, etc.), and THEN make a separate HTTP request for _each_ term in the list to get the code as found in the MARC record and the label. That's a pretty slow process, and it requires writing more code than a task like this seems like it should take.

Is there anything on that site that can give me the code/label pairs in one single download?

On 6/22/2011 6:38 PM, Stephen Hearn wrote:
> Have you looked at id.loc.gov? One of its vocabularies defines URLs
> for each of the MARC geographic area codes.
>
> Stephen
>
>
> On Wed, Jun 22, 2011 at 4:44 PM, Jonathan Rochkind wrote:
>> Can anyone remind me if there's a machine readable copy of the MARC
>> geographic codes available at any persistent URL?
>>
>> They're in HTML at http://www.loc.gov/marc/geoareas/gacs_code.html . I
>> actually had a script that automatically downloaded from there and "scraped"
>> the HTML -- but sometime since I wrote the script, the HTML structure on the
>> page changed and it broke.
>>
>> (I kind of thought that was unlikely since that HTML page itself was machine
>> generated -- but I guess they changed the software that generated it.
>> Certainly I knew that scraping HTML was a bad thing to rely on... which is
>> why I hope LC provides this in some format less likely to change?)
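
For what it's worth, the "triples to a hash of code => label" step asked about above might look roughly like the sketch below, if a single triples download covering the whole vocabulary turns out to exist. Everything specific here is an assumption and not confirmed anywhere in this thread: the sample data is made up, the URI pattern and the use of skos:prefLabel for the label are guesses about how the vocabulary is published, and `codes_to_labels` is a hypothetical helper name. It also assumes the code in the MARC record is the last URI segment padded with hyphens to seven characters.

```python
import re

# Hypothetical excerpt in N-Triples form. The URI pattern and the
# skos:prefLabel predicate are assumptions, not taken from id.loc.gov output.
SAMPLE = """\
<http://id.loc.gov/vocabulary/geographicAreas/a-af> <http://www.w3.org/2004/02/skos/core#prefLabel> "Afghanistan"@en .
<http://id.loc.gov/vocabulary/geographicAreas/n-us> <http://www.w3.org/2004/02/skos/core#prefLabel> "United States"@en .
<http://id.loc.gov/vocabulary/geographicAreas/n-us> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2004/02/skos/core#Concept> .
"""

PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"

# Matches only triples whose object is a plain or language-tagged literal.
# Escape sequences inside the literal are not decoded; a real RDF parser
# would be the safer choice for production use.
TRIPLE = re.compile(
    r'^<([^>]+)>\s+<([^>]+)>\s+"((?:[^"\\]|\\.)*)"(?:@[\w-]+)?\s*\.\s*$'
)


def codes_to_labels(ntriples, pad_to=7):
    """Build {code: label} from N-Triples text.

    The code is taken from the last path segment of the subject URI and
    right-padded with hyphens to pad_to characters (assumed MARC form).
    Pass pad_to=0 to keep the unpadded segment.
    """
    result = {}
    for line in ntriples.splitlines():
        match = TRIPLE.match(line.strip())
        if not match:
            continue  # object is a URI, or the line is blank/unsupported
        subject, predicate, label = match.groups()
        if predicate != PREF_LABEL:
            continue
        code = subject.rstrip("/").rsplit("/", 1)[-1]
        result[code.ljust(pad_to, "-")] = label
    return result


if __name__ == "__main__":
    for code, label in sorted(codes_to_labels(SAMPLE).items()):
        print(code, label)
```

Whether the site actually offers such a single download is exactly the open question; if it does not, the same function could be run over the concatenated per-term responses instead.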