How can I create multiple hashes of a file using only one pass?

How can I get MD5, SHA, and other hashes of a file in a single pass? My files are about 100 MB each, so I'd hate to read each one multiple times.

Answers


Here's a modified version of @ʞɔıu's answer, using @Jason S's suggestion.

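from hashlib import md5, sha1

filename = 'hash_one-pass.py'  # file to hash; here, this script itself

# One hash object per algorithm; all of them are fed the same chunks.
hashes = md5(), sha1()
# Read at least 4096 bytes at a time, and never less than the largest block size.
chunksize = max(4096, max(h.block_size for h in hashes))
with open(filename, 'rb') as f:  # binary mode, so the raw bytes are hashed
    while True:
        chunk = f.read(chunksize)
        if not chunk:  # an empty read means end of file
            break
        for h in hashes:
            h.update(chunk)

for h in hashes:
    print(h.name, h.hexdigest())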

Something like this perhaps?

>>> import hashlib
>>> hashes = (hashlib.md5(), hashlib.sha1())
>>> f = open('some_file', 'rb')  # binary mode: hash the raw bytes
>>> for line in f:
...     for h in hashes:
...         h.update(line)
... 
>>> f.close()
>>> for h in hashes:
...     print(h.name, h.hexdigest())

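Alternatively, loop over f.read(1024) (or some other block size) to get fixed-length blocks. That avoids pulling an arbitrarily long "line" into memory when the file is binary.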
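The fixed-length-block variant could be sketched roughly as follows. The function name hash_file and its parameters are made up for illustration; the two-argument form of iter() keeps calling f.read(block_size) until it returns an empty bytes object.

```python
import hashlib
from functools import partial


def hash_file(path, algorithms=("md5", "sha1"), block_size=1024):
    """Return {algorithm name: hex digest}, reading the file only once.

    hash_file and its parameters are hypothetical names for this sketch.
    """
    hashes = [hashlib.new(name) for name in algorithms]
    with open(path, "rb") as f:
        # iter(callable, sentinel): call f.read(block_size) until it returns b"".
        for block in iter(partial(f.read, block_size), b""):
            for h in hashes:
                h.update(block)
    return {name: h.hexdigest() for name, h in zip(algorithms, hashes)}
```

Calling hash_file('some_file') should give a dict with one hex digest per requested algorithm.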


I don't know Python, but I am familiar with hash calculations.

If you handle the file reading manually, just read in one block at a time (256 bytes, 4096 bytes, or whatever) and pass each block of data to the update step of each hash algorithm. You'll have to initialize each algorithm's state at the beginning and finalize it at the end.
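In Python's hashlib, those three phases map onto creating a hash object, calling update(), and calling hexdigest(). Here is a small sketch on in-memory sample data (the data and block size are invented for the demo) that also checks that block-wise updates give the same digest as hashing everything at once:

```python
import hashlib

data = bytes(range(256)) * 40  # 10240 bytes of made-up sample data
block_size = 4096              # arbitrary; the last block will be shorter

# Initialize: one state object per algorithm.
states = {name: hashlib.new(name) for name in ("md5", "sha1")}

# Update: feed every block to every algorithm.
for start in range(0, len(data), block_size):
    block = data[start:start + block_size]
    for state in states.values():
        state.update(block)

# Finalize: read out each digest.
digests = {name: state.hexdigest() for name, state in states.items()}

# Block-wise hashing matches hashing the whole input in one call.
assert digests["md5"] == hashlib.md5(data).hexdigest()
assert digests["sha1"] == hashlib.sha1(data).hexdigest()
```

Because the result doesn't depend on how the input is split, the block size only affects speed and memory use, not the digests.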

