Alex,

Permuting the characters in a string does not produce the same checksum.
If it did, that would make checksums really weak. I don't know of any
checksum algorithm that produces the same checksum when you merely
permute the characters.

Here's an example on my iMac.

echo "The cat chases the dog" > foo1
echo "The dog chases the cat" > foo2
cksum foo1
414128224 23 foo1
cksum foo2
2453586855 23 foo2

Sol

On Fri, May 28, 2010 at 10:26 AM, Alex Bronstein wrote:

> Hi Eric,
>
> That's not ideal. checksums generate the same number if the letters in the
> string are moved. For example "The cat chases the dog" and "The dog chases
> the cat" would result in the same checksum.
>
> You'd be better off using md5(): http://perldoc.perl.org/Digest/MD5.html
>
> Something like:
> # If you want a short integer (2 bytes: 0 - 65535)
> my ($integer) = unpack('S', md5($author . $title));
>
> # If you want a long integer (4 bytes: 0 - 4 billion)
> my ($integer) = unpack('L', md5($author . $title));
>
> That would give you uniqueness to within the capability of a short or long
> int. If you have few enough items in the list that you're willing to
> increase the odds of non-uniqueness in exchange for a smaller maximum
> number, you can use the % operator as in:
>
> # If you want an integer between 0 and 9999
> my ($integer) = unpack('S', md5($author . $title));
> $integer = $integer % 10000;
>
> Alex.
>
>
> Eric Lease Morgan wrote:
>
>>> Using Perl, how can I convert the author/title combination into some sort
>>> of integer, checksum, or unique value that is the same every time I run my
>>> script? I don't want to have to remember what was used before because I
>>> don't want to maintain a list of previously used keys. Should I use some
>>> form of the pack function? Should I sum the ASCII values of each character
>>> in the author/title combination?
>>
>> Thank you for the prompt replies, and invariably I resolved my own
>> question. Using Perl's unpack function I can generate a checksum based on
>> the concatenation of the authors and titles:
>>
>> my $integer = unpack( "%32C*", "$author$title" ) % 65535;
>>
>> The result is a unique four-digit number that will be consistently
>> generated as my list of author/title combinations grows. At the same time,
>> my solution looks much like an incantation -- with magic. Perl-specific and
>> at a level of computing that is beyond my day-to-day understanding.
>>
>> TGIF
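
For anyone who wants to check the two claims in this thread side by side, here is a minimal sketch. It assumes the key is built exactly as in Eric's unpack( "%32C*", ... ) line and that Digest::MD5 is available, as in Alex's suggestion. As far as I understand the "%32C*" template, it simply adds up the byte values, so it should not care about character order, whereas cksum (a CRC) and md5 do. The helper names sum_key and md5_key are made up for illustration and do not come from anyone's script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5);    # same module Alex links to

# Hypothetical helper: Eric's additive checksum, i.e. the 32-bit sum of
# the byte values of the string, reduced modulo 65535.
sub sum_key {
    my ($text) = @_;
    return unpack( "%32C*", $text ) % 65535;
}

# Hypothetical helper: the first four bytes of the MD5 digest read as a
# native unsigned 32-bit integer, as in Alex's unpack('L', ...) example.
sub md5_key {
    my ($text) = @_;
    return unpack( 'L', md5($text) );
}

my $first  = "The cat chases the dog";
my $second = "The dog chases the cat";

# Same characters in a different order: the additive keys are equal.
printf "sum: %d %d\n", sum_key($first), sum_key($second);

# The MD5 digests differ, so the derived integers almost certainly differ.
printf "md5: %u %u\n", md5_key($first), md5_key($second);
```

If two author/title strings could ever contain the same characters in a different order, the MD5-based key looks like the safer choice. Note also that a value taken modulo 65535 ranges from 0 to 65534, so it is not always four digits, and neither approach guarantees uniqueness.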