I would check with the developers of SNAC
(http://socialarchive.iath.virginia.edu/), as they've spent a lot of time
developing named entity recognition scripts for personal and corporate
names. They might have something you can reuse.
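
If nothing reusable turns up there, here is a rough sketch of a first pass you could run yourself. It assumes the id.loc.gov known-label lookup (a request to /authorities/names/label/ followed by the name) answers a match with a redirect carrying X-PrefLabel and X-URI headers, and answers a miss with a 404; please verify that against the id.loc.gov documentation before relying on it. The function names (clean_name, label_url, check_name) and the normalization step are my own illustration, not anything from SNAC or LC.

```python
import unicodedata
import urllib.error
import urllib.parse
import urllib.request

# Assumption: the id.loc.gov known-label lookup for the name authority file.
LABEL_BASE = "https://id.loc.gov/authorities/names/label/"


def clean_name(name):
    """Apply Unicode NFC and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", name).split())


def label_url(name):
    """Build the lookup URL for one candidate heading."""
    return LABEL_BASE + urllib.parse.quote(clean_name(name), safe="")


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Do not follow redirects, so the 3xx response headers stay readable."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def check_name(name, opener=None):
    """Look up one name; report whether it equals the preferred label."""
    opener = opener or urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(label_url(name), timeout=30) as resp:
            status, headers = resp.status, resp.headers
    except urllib.error.HTTPError as err:
        # Unfollowed redirects and 404s both arrive here.
        status, headers = err.code, err.headers
    pref = headers.get("X-PrefLabel") if status in (301, 302, 303) else None
    uri = headers.get("X-URI") if pref else None
    return {
        "name": name,
        "status": status,
        "pref_label": pref,
        "uri": uri,
        # True only when the input string is itself the preferred label.
        "authorized": pref is not None and clean_name(pref) == clean_name(name),
    }
```

For 40,000 names you would want to call check_name in a loop with a pause between requests and write each result to a CSV. Rows where a preferred label comes back but "authorized" is false are probably variant forms worth a manual look.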
On Fri, Sep 26, 2014 at 3:47 PM, Galligan, Patrick <[log in to unmask]> wrote:
> I'm looking to reconcile about 40,000 corporate names against LCNAF to see
> whether they are authorized strings or not, but I'm drawing a blank about
> how to get it done.
> I've used http://freeyourmetadata.org/ for reconciling subject headings
> before, but I can't get it to work for LCNAF. Has anyone had any experience
> in a project like this? I'd love to hear some ideas for automatically
> dealing with a large data set like this, which we did not create and whose
> names were formed by rules we do not know.
> -Patrick Galligan