2007-08-02

 

Python Cookbook 2.30 计算CRC-64循环冗余校验

需求:

为了保证数据的正确性,你需要使用循环冗余校验(CRC)来检查数据,而且使用CRC-64算法,并满足ISO-3309标准.

讨论:

Python标准库没有实现CRC-64算法(只实现了CRC-32,在zlib.crc32),所以我们需要自己编写.幸运的是 ,Python的位运算和C一样简便,所以很融入将一个标准的算法转换为Python实现:

# prepare two auxiliary tables tables (using a function, for speed),
# then remove the function, since it's not needed any more:
CRCTableh = [0] * 256
CRCTablel = [0] * 256
def _inittables(CRCTableh, CRCTablel, POLY64REVh, BIT_TOGGLE):
    for i in xrange(256):
        partl = i
        parth = 0L
        for j in xrange(8):
            rflag = partl & 1L
            partl >>= 1L
            if parth & 1:
                partl ^= BIT_TOGGLE
            parth >>= 1L
            if rflag:
                parth ^= POLY64REVh
        CRCTableh[i] = parth
        CRCTablel[i] = partl
# first 32 bits of generator polynomial for CRC64 (the 32 lower bits are
# assumed to be zero) and bit-toggle mask used in _inittables
POLY64REVh = 0xd8000000L
BIT_TOGGLE = 1L << 31L
# run the function to prepare the tables
_inittables(CRCTableh, CRCTablel, POLY64REVh, BIT_TOGGLE)
# remove all names we don't need any more, including the function
del _inittables, POLY64REVh, BIT_TOGGLE
# this module exposes the following two functions: crc64, crc64digest
def crc64(bytes, (crch, crcl)=(0,0)):
    for byte in bytes:
        shr = (crch & 0xFF) << 24
        temp1h = crch >> 8L
        temp1l = (crcl >> 8L) | shr
        tableindex = (crcl ^ ord(byte)) & 0xFF
        crch = temp1h ^ CRCTableh[tableindex]
        crcl = temp1l ^ CRCTablel[tableindex]
    return crch, crcl
def crc64digest(aString):
    return "%08X%08X" % (crc64(bytes))
if _ _name_ _ == '_ _main_ _':
    # a little test/demo, for when this module runs as main-script
    assert crc64("IHATEMATH") == (3822890454, 2600578513)
    assert crc64digest("IHATEMATH") == "E3DCADD69B01ADD1"
    print 'crc64: dumb test successful'

循环冗余校验是验证文件正确性的常用算法,CRC可以很好的检查到数据损坏,但它不像其它较强的算法那样提供预防数据损坏的机制.CRC比其它的校验算法更快,在只关心数据是否破坏的场景下,CRC非常适用.
从数学上讲,CRC对我们要检测的数据进行按位多项式计算.在实践中,如本节所示,大多数计算可以一次完成并加入表中,如果表中的数据被正确的索引的话.在初始化后 (我们使用大量局部变量,因为在Python中,局部变量计算要比全局变量快).事实上CRC计算是非常快的,表中的运算和CRC算法都需要大量的位运算,还好Python的位运算速度可以和C想匹敌(其实语法也很类似).
计算CRC-64的算法在ISO-3309标准中描述了,本节所做的只是实现这个算法罢了,生成多项式是x64 + x4 + x3 + x + 1.我们使用一对Python的int来表示结果,它们分别保存结果的高32位和低32位.为了让CRC能持续的计算 ,比如一次只计算一小段数据,我们在调用CRC64前先对它进行初始化(crch,crcl),其值为前面计算的结果.对于一次性计算CRC的场景来说,我们将数据以字符串的形式传入方法即可,其初始化化值为(0,0),也是方法的默认值.

相关说明:

由于英语水平有限,本节内容不能做到信达雅了,不过一切尽在代码中.

Comments: 发表评论



<< Home

This page is powered by Blogger. Isn't yours?