As Kevin mentioned, there are in fact many possible patterns for names to appear in, so it's probably not possible to un-invert all the names in the NAF with a single RegEx.
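To illustrate, here is a minimal Python sketch that un-inverts only the simplest pattern, "Surname, Forenames" optionally followed by dates, and leaves everything else untouched for review. The function name uninvert and the rule that a dates part must contain a digit are my own assumptions, not anything defined for the NAF.

```python
import re

# Assumption: only "Surname, Forenames[, dates]" is handled. The dates part
# must contain at least one digit. Labels with no comma, with parentheses,
# or with extra comma-separated parts (e.g. "Jr.") do not match.
SIMPLE = re.compile(
    r"""^(?P<surname>[^,()]+),\s*
        (?P<forenames>[^,()]+?)
        (?:,\s*(?P<dates>[^,()]*\d[^,()]*))?$""",
    re.VERBOSE,
)


def uninvert(label):
    """Return the label in direct order, or unchanged if it is not simple."""
    m = SIMPLE.match(label.strip())
    if m is None:
        return label  # not the simple pattern: leave as-is for manual review
    name = f"{m['forenames'].strip()} {m['surname'].strip()}"
    dates = m["dates"]
    return f"{name}, {dates.strip()}" if dates else name
```

With this, "Mudge, Lewis Seymour, 1868-1945" comes back as "Lewis Seymour Mudge, 1868-1945", while a label such as "Smith, John, Jr., 1900-1980" is returned unchanged. Anything that falls through would need its own rule or a manual check.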
You mention that you've downloaded the records in bulk -- what format are the records in? Could you provide some examples?
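If the records are in the "nt" serialization Nate describes, with one authoritativeLabel line per name, pulling out the identifier and the quoted text could look roughly like the sketch below. The function name parse_label is hypothetical, and I am assuming each line has the shape of Nate's example; escape sequences inside the quotes are left undecoded.

```python
import re

# Assumption: lines look like Nate's example, i.e.
# <names URI> <mads authoritativeLabel predicate> "label" .
LABEL_LINE = re.compile(
    r'^<http://id\.loc\.gov/authorities/names/(?P<lccn>[^>]+)>\s+'
    r'<http://www\.loc\.gov/mads/rdf/v1#authoritativeLabel>\s+'
    r'"(?P<label>(?:[^"\\]|\\.)*)"'
)


def parse_label(line):
    """Return (identifier, raw label) for a label line, else None."""
    m = LABEL_LINE.match(line)
    if m is None:
        return None  # some other triple, or an unexpected layout
    # The label is returned raw: backslash escapes are not decoded here.
    return m["lccn"], m["label"]
```

Lines for other predicates simply return None, so the function can be used as a filter while streaming through the file.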
Thanks,
Mike Monaco
Head, Technical Services & Coordinator, Cataloging Services
Associate Professor of Bibliography
University Libraries Technical Services
261B Bierce Library
The University of Akron
Akron, Ohio 44325-1712
He/him/his
Office: 330-972-2446
[log in to unmask]
ORCID: 0000-0001-7244-5154
https://www.uakron.edu/libraries
-----Original Message-----
From: Code for Libraries <[log in to unmask]> On Behalf Of Stuart A. Yeates
Sent: Monday, May 4, 2026 3:07 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] Regexp for rewriting LoC LCCN authorised personal names
As it happens, I have already downloaded the records in bulk. What I need is a regexp to parse the "quoted text"
cheers
stuart
--
...let us be heard from red core to black sky
On Tue, 5 May 2026 at 06:33, Trail, Nate <[log in to unmask]> wrote:
> Stuart,
>
> You could download the entire Names file in "nt" serialization, then
> there's one line for each name you can filter on:
>
>
> <http://id.loc.gov/authorities/names/nr2001046558> <http://www.loc.gov/mads/rdf/v1#authoritativeLabel> "Smith, Jim, 1940 October 17-" .
>
> Then you can do what you want with the quoted text.
>
> Saves bandwidth for you and us.
>
> https://id.loc.gov/download/
>
> Good luck,
>
> Nate
>
>
> -----------------------------------------
> Nate Trail
> Network Development & MARC Standards Office LCSG/DPS/ABA/NDMSO Library
> of Congress Washington DC 20540
>
>
> -----Original Message-----
> From: Code for Libraries <[log in to unmask]> On Behalf Of Kevin
> Hawkins
> Sent: Monday, May 04, 2026 2:08 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] Regexp for rewriting LoC LCCN authorised
> personal names
>
> Hello Stuart,
>
> Do you mean that you want to convert LCNAF personal names from this
> sort of order:
>
> Mudge, Lewis Seymour, 1868-1945
>
> to something like this:
>
> Lewis Seymour Mudge, 1868-1945
>
> ? But then also deal with authorized forms containing no commas,
> forms with more than two commas, and occasional use of parentheses.
> So, as you know, it gets complicated.
>
> I wonder if a different approach might make more sense here:
>
> 1. Query the inverted LCNAF form at
> https://id.loc.gov/
>
> 2. Retrieve the URI, extracting the identifier (beginning with "n")
>
> 3. Query Wikidata using this identifier.
>
> 4. Retrieve Wikidata's form of the name, which is not inverted.
>
> --Kevin
>
> On 5/3/26 1:25 PM, Stuart A. Yeates wrote:
> > Does anyone know of somewhere that describes LCCN authorised
> > personal names as regexps? I want to be able to rewrite them at scale
> > to 'normal' order.
> >
> > AI appears to be actively undermining the functionality of search
> > engines.
> >
> > cheers
> > stuart
> > --
> > ...let us be heard from red core to black sky
>