> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> Eric Lease Morgan
> Sent: Friday, May 28, 2010 09:35 AM
> To: [log in to unmask]
> Subject: [CODE4LIB] generating unique integers
>
> Given a list of unique strings, how can I generate a list of short,
> unique integers?
>
> I have a list of about 250 unique author/title combinations, such as:
>
> Aeschylus / Prometheus Bound
> Aeschylus / Suppliant Maidens
> American State / Articles of confederation
> American State / Declaration of Independence
> Aquinas / Summa Theologica
> Aristophanes / Achamians
> Aristophanes / Clouds
> Aristophanes / Ecclesiazusae
> Aristotle / On Generation And Corruption
> Aristotle / On The Gait Of Animals
> Aristotle / On The Generation Of Animals
> ...
>
> From each author/title combination I want to create a file name (key).
> Specifically, I want a file name with the following form: author-
> firstwordofthetitle-integer.txt Such a scheme will make it
> (relatively) easy for me to look at the file name and know the what
> title is and by whom.
>
> Using Perl, how can I convert the author/title combination into some
> sort of integer, checksum, or unique value that is the same every time
> I run my script? I don't want to have to remember what was used before
> because I don't want to maintain a list of previously used keys. Should
> I use some form of the pack function? Should I sum the ASCII values of
> each character in the author/title combination?
You could MD5 hash the author/title combination, which would give you the
same hash every time so long as the author/title combination was exactly
the same, e.g., letter case, spelling, etc. That doesn't meet your
requirement of a small integer, but if you are using the value for a
Perl hash it might not matter all that much.
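
Something along these lines might work as a rough sketch. It assumes the
input always looks like "Author / Title" and is plain ASCII; make_key is
just a name I made up for illustration. To get closer to an integer it
keeps only the first 8 hex digits of the MD5 digest and reads them as a
number. That is my own shortcut, not anything MD5 gives you for free, and
truncating does make a collision possible (though very unlikely for ~250
items), so it would be worth checking the generated names for duplicates
once.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: build "author-firstword-integer.txt" from a string
# of the form "Author / Title". The same input yields the same key on
# every run, so no list of previously used keys is needed.
sub make_key {
    my ($combo) = @_;

    # Split on the first slash only; titles may contain further slashes.
    my ($author, $title) = split m{\s*/\s*}, $combo, 2;

    # First whitespace-delimited word of the title.
    my ($first_word) = split ' ', $title;

    # Lowercase and drop anything that is not a letter or digit.
    (my $author_part = lc $author)     =~ s/[^a-z0-9]+//g;
    (my $word_part   = lc $first_word) =~ s/[^a-z0-9]+//g;

    # Assumption: the first 8 hex digits (32 bits) of the digest are
    # "unique enough" for a small list; this is not guaranteed.
    my $int = hex( substr( md5_hex($combo), 0, 8 ) );

    return join( '-', $author_part, $word_part, $int ) . '.txt';
}

print make_key('Aristotle / On The Gait Of Animals'), "\n";
```

Note that the key changes if the case, spacing or spelling of the input
changes, so normalize the strings first if that could happen.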
Andy.