> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> Eric Lease Morgan
> Sent: Friday, May 28, 2010 09:35 AM
> To: [log in to unmask]
> Subject: [CODE4LIB] generating unique integers
> 
> Given a list of unique strings, how can I generate a list of short,
> unique integers?
> 
> I have a list of about 250 unique author/title combinations, such as:
> 
>   Aeschylus / Prometheus Bound
>   Aeschylus / Suppliant Maidens
>   American State / Articles of confederation
>   American State / Declaration of Independence
>   Aquinas / Summa Theologica
>   Aristophanes / Achamians
>   Aristophanes / Clouds
>   Aristophanes / Ecclesiazusae
>   Aristotle / On Generation And Corruption
>   Aristotle / On The Gait Of Animals
>   Aristotle / On The Generation Of Animals
>   ...
> 
> From each author/title combination I want to create a file name (key).
> Specifically, I want a file name with the following form:
> author-firstwordofthetitle-integer.txt. Such a scheme will make it
> (relatively) easy for me to look at the file name and know what the
> title is and by whom.
> 
> Using Perl, how can I convert the author/title combination into some
> sort of integer, checksum, or unique value that is the same every time
> I run my script? I don't want to have to remember what was used before
> because I don't want to maintain a list of previously used keys. Should
> I use some form of the pack function? Should I sum the ASCII values of
> each character in the author/title combination?

You could MD5 hash the author/title combination, which would give you the
same hash every time as long as the author/title combination stays exactly
the same, including letter case, spelling, etc.  That doesn't meet your
requirement of a small integer, but if you are using the value for a
Perl hash it might not matter all that much.
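
Here is a minimal sketch of that idea, not a finished solution.  It
assumes every entry has the form "Author / Title" and contains only
ASCII text (md5_hex wants bytes, so anything else would need encoding
first).  The make_key helper and the choice to keep only the first 8
hex digits of the digest are my own illustration; truncating gives a
shorter integer but makes a collision possible, so the loop checks for
one.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: turn "Author / Title" into
# author-firstwordofthetitle-integer.txt
sub make_key {
    my ($combo) = @_;

    # Split on the first slash; assumes the "Author / Title" form.
    my ($author, $title) = split m{\s*/\s*}, $combo, 2;
    my ($first_word) = split ' ', $title;

    # Lowercase and drop anything that is not a letter or digit.
    for ($author, $first_word) {
        $_ = lc $_;
        s/[^a-z0-9]+//g;
    }

    # First 8 hex digits of the MD5 digest, read as an integer
    # in the range 0 .. 4294967295.
    my $number = hex(substr(md5_hex($combo), 0, 8));

    return "$author-$first_word-$number.txt";
}

my @combos = (
    'Aeschylus / Prometheus Bound',
    'Aristotle / On The Gait Of Animals',
    'Aristotle / On The Generation Of Animals',
);

my %seen;
for my $combo (@combos) {
    my $key = make_key($combo);
    # Truncated digests can collide, so fail loudly if one ever does.
    die "Key collision for '$combo': $key\n" if $seen{$key}++;
    print "$key\n";
}
```

The same input string always yields the same number, so there is no
list of previously used keys to maintain.  Any change in case or
spelling yields a different number, so normalize the string before
hashing if that matters.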

Andy.