Hello,

A checksum function can verify only data integrity, that is, only whether the data still matches the expected value (and even this is not perfect, since two different inputs can share a checksum). A mismatch could result from a malicious attack or from a simple write or transmission error. The checksum itself cannot determine whether the change was malicious.
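
To illustrate, here is a minimal Python sketch, assuming SHA-256 and in-memory data. The helper names (`sha256_hex`, `is_intact`) and the sample payloads are made up for this example. It shows that an accidental change and a deliberate one both produce the same outcome, a mismatch, and nothing more.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_hex: str) -> bool:
    """Return True if data still matches the digest recorded earlier."""
    return sha256_hex(data) == expected_hex


# Hypothetical payloads, for illustration only.
original = b"archival master file contents"
recorded = sha256_hex(original)  # digest stored when the file was ingested

bit_rot = b"archival master file contentz"   # accidental corruption
tampered = b"archival master file CONTENTS"  # deliberate edit

print(is_intact(original, recorded))  # True
print(is_intact(bit_rot, recorded))   # False
print(is_intact(tampered, recorded))  # False, and no hint as to the cause
```

The check only reports that the data no longer matches. Deciding why takes other evidence, such as logs or access controls.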

Thanks,

Cornel Darden Jr.  
MSLIS
Library Department Chair
South Suburban College
7087052945

"Our Mission is to Serve our Students and the Community through lifelong learning."

Sent from my iPhone

> On Oct 2, 2014, at 5:34 PM, Jonathan Rochkind <[log in to unmask]> wrote:
> 
> For checksums for ensuring archival integrity, are cryptographic flaws relevant? I'm not sure: is part of the point of a checksum to ensure against _malicious_ changes to files? I honestly don't know. (But in most systems, I'd guess anyone who had access to maliciously change the file would also have access to maliciously change the checksum!)
> 
> ROT13 is not suitable as a checksum for ensuring archival integrity, however, because its output is no smaller than its input, and a small, fixed-size output is kind of what you're looking for.
> 
> ________________________________________
> From: Code for Libraries [[log in to unmask]] on behalf of Cary Gordon [[log in to unmask]]
> Sent: Thursday, October 02, 2014 5:51 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] What is the real impact of SHA-256? - Updated
> 
> +1
> 
> MD5 is little better than ROT13. At least with ROT13, you have no illusions.
> 
> We use SHA-512 for most work. We don't do finance or national security, so it is a good fit for us.
> 
> Cary
> 
>> On Oct 2, 2014, at 12:30 PM, Simon Spero <[log in to unmask]> wrote:
>> 
>> Intel Skylake processors have dedicated SHA instructions.
>> See: https://software.intel.com/en-us/articles/intel-sha-extensions
>> 
>> Using a tree hash approach (which is inherently embarrassingly parallel)
>> will leave I/O time dominant. This approach is used by Amazon Glacier; see
>> http://docs.aws.amazon.com/amazonglacier/latest/dev/checksum-calculations.html
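
The tree hash idea can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not Amazon's implementation: it uses SHA-256 over 1 MiB chunks and combines adjacent digests pairwise, which is how the Glacier documentation describes its scheme. The function name `tree_hash` and the handling of empty input are choices made for this sketch.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1 MiB leaves (assumed, following the Glacier docs)


def tree_hash(data: bytes, chunk_size: int = CHUNK_SIZE) -> str:
    """Return the hex root of a SHA-256 tree hash over data."""
    # Leaf level: each chunk is hashed independently, so this step
    # could be spread across cores or machines.
    level = [
        hashlib.sha256(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]
    if not level:
        # Assumption for this sketch: empty input hashes as one empty leaf.
        level = [hashlib.sha256(b"").digest()]

    # Combine adjacent pairs until a single root digest remains.
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                pair = level[i] + level[i + 1]
                next_level.append(hashlib.sha256(pair).digest())
            else:
                # An odd digest at the end is carried up unchanged.
                next_level.append(level[i])
        level = next_level
    return level[0].hex()


print(tree_hash(b"x" * (3 * CHUNK_SIZE)))
```

Because the leaves are independent, the hashing work parallelizes easily, which is why reading the data tends to become the bottleneck.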
>> 
>> MD5 is broken, and cannot be used for any security purposes. It cannot be
>> used for deduplication if any of the files are in the directories of
>> security researchers!
>> 
>> If security is not a concern then there are many faster hashing algorithms
>> that avoid the costs imposed by the need to defend against adversaries.
>> See SipHash, MurmurHash, CityHash, etc.
>> 
>> Simon
>>> On Oct 2, 2014 11:18 AM, "Alex Duryee" <[log in to unmask]> wrote:
>>> 
>>> Despite some of its relative flaws, MD5 is frequently selected over SHA-256
>>> in archives as the checksum algorithm of choice. One of the primary factors
>>> here is the longer processing time required for SHA-256, though there have
>>> been no empirical studies calculating that time difference and its overall
>>> impact on checksum generation and verification in a preservation
>>> environment.
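
For readers who want a rough feel for the difference on their own hardware, a minimal Python timing sketch might look like the following. It is not the method used in the white paper. The helper name `time_hash`, the 8 MiB zero-filled buffer, and the repeat count are all assumptions made for illustration, and it measures in-memory hashing only, with no disk or network I/O.

```python
import hashlib
import time


def time_hash(name: str, data: bytes, repeats: int = 5) -> float:
    """Return the best-of-N wall-clock seconds to hash data with `name`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(name, data).digest()
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical stand-in for a file: 8 MiB of zero bytes held in memory.
payload = bytes(8 * 1024 * 1024)

for algo in ("md5", "sha256"):
    seconds = time_hash(algo, payload)
    print(f"{algo}: {seconds:.4f} s")
```

Results vary by CPU and Python build, and in a real preservation workflow the time spent reading files may outweigh the hashing itself.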
>>> 
>>> AVPreserve Consultant Alex Duryee recently ran a series of tests comparing
>>> the real time and CPU time used by each algorithm. His newly updated white
>>> paper "What Is the Real Impact of SHA-256?" presents the results and comes
>>> to some interesting conclusions regarding the actual time difference
>>> between the two and what other factors may have a greater impact on your
>>> selection decision and file monitoring workflow. The paper can be
>>> downloaded for free at
>>> 
>>> http://www.avpreserve.com/papers-and-presentations/whats-the-real-impact-of-sha-256/
>>> .
>>> ______________________________________
>>> 
>>> Alex Duryee
>>> *AVPreserve*
>>> 350 7th Ave., Suite 1605
>>> New York, NY 10001
>>> 
>>> office: 917-475-9630
>>> 
>>> http://www.avpreserve.com
>>> Facebook.com/AVPreserve <http://facebook.com/AVPreserve>
>>> twitter.com/AVPreserve
>>>