Thanks for this, Owen. Obviously this will need a little work to develop
into something suitable for my particular use-case - but then, that's what
open source is all about ...
Thanks again,
Tim
On 13 January 2017 at 12:53, Owen Stephens <[log in to unmask]> wrote:
> So just out of curiosity I ran 2500 author names from the DOAJ through
> this library (I used a version someone has kindly wrapped as a webservice
> http://nameparse.herokuapp.com/?name=Firstname+Surname). The names were
> just some I had handy so no real attempt to challenge the software.
>
> In general it seemed to do pretty well, but it isn’t perfect. In
> particular, two-part given names or two-part family names where the parts
> are separated by a space end up with part of the name in the ‘middle name’
> field.
> This may not matter too much to you in cases where this affects the given
> name, because you’ll end up with the same output string if you format as
> {family name}, {first name} {middle name}. However in cases where the
> surname is split by a space (as it is for my kids) then you end up with a
> problem - e.g.:
>
> Jane Bloggs Doe - where the surname is ‘Bloggs Doe’, would end up being
> converted to: Doe, Jane Bloggs instead of Bloggs Doe, Jane
>
> I tend to use OpenRefine to do this kind of work and this allows you to do
> lookups on webservices such as the one I’ve used - so this is a pretty
> useful addition to my toolset - thanks for asking the question!
>
> Owen
>
>
> Owen Stephens
> Owen Stephens Consulting
> Web: http://www.ostephens.com
> Email: [log in to unmask]
> Telephone: 0121 288 6936
>
> > On 13 Jan 2017, at 11:34, Timothy Hill <[log in to unmask]> wrote:
> >
> > Please excuse the naive way this question is formulated: I'm sure the
> > Information & Library Science community has formal terms for what I'm
> > attempting to do, but unfortunately I don't know what they are.
> >
> > The problem I'm trying to solve is that I have a bunch of author names
> > (for example, 'Charles Dickens') that I need to reformat into standard
> > catalogue order ('Dickens, Charles'). Obviously the example given is
> > trivial, but of course this can get quite complex depending on the
> > addition of titles and honorifics.
> >
> > Is anyone aware of a software library to perform this kind of conversion?
> > The programming language used is not terribly important, though Java or
> > Python would be preferable.
> >
> > In an ideal world the library would deal with the different conventions used
> > in different languages and by different institutions - but anything would
> > be better than the current split-on-comma approach I'm using right now.
> >
> > Thanks,
> >
> > Timothy Hill
>
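
The surname problem Owen describes can be illustrated with a minimal Python sketch. This is not the code of the library or webservice discussed above; it is a naive "last token is the family name" heuristic written only to show why 'Jane Bloggs Doe' comes out wrong. The function name to_catalogue_order and the family_name_tokens parameter are hypothetical, introduced here for illustration, and titles and honorifics are not handled at all.

```python
def to_catalogue_order(name, family_name_tokens=1):
    """Reformat 'Given [Middle] Family' as 'Family, Given [Middle]'.

    family_name_tokens is a hypothetical override: the number of trailing
    whitespace-separated tokens that make up the family name. The default
    of 1 mimics the naive assumption that only the last token is the surname.
    """
    if family_name_tokens < 1:
        raise ValueError("family_name_tokens must be at least 1")

    tokens = name.split()
    if len(tokens) <= family_name_tokens:
        # Too few tokens to separate given and family names; return as-is.
        return " ".join(tokens)

    family = " ".join(tokens[-family_name_tokens:])
    given = " ".join(tokens[:-family_name_tokens])
    return f"{family}, {given}"


# The trivial case from the original question.
print(to_catalogue_order("Charles Dickens"))

# Naive default: 'Bloggs' is treated as a middle name.
print(to_catalogue_order("Jane Bloggs Doe"))

# With the caller supplying the knowledge that the surname has two parts.
print(to_catalogue_order("Jane Bloggs Doe", family_name_tokens=2))
```

The point of the sketch is that the string alone does not say where the family name starts, so a two-part surname needs outside knowledge (here, a manual override) to be handled correctly.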