2007-07-11

 

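Python Cookbook 2.9: Reading Data from Zip Files

Problem:

You want to read the contents of files stored in a zip archive without unpacking it.

Discussion:

Zip files are a popular, cross-platform way to archive and compress files. The Python standard library provides the zipfile module to simplify working with them: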

import zipfile

z = zipfile.ZipFile("zipfile.zip", "r")
for filename in z.namelist():
    # read() returns the whole content of the archived file as a byte string
    data = z.read(filename)
    print('File: %s has %d bytes' % (filename, len(data)))
z.close()

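Python can work directly with the data inside a zip file: you can get the list of archived files, or read the content of any of them. The example above prints the name of every file in zipfile.zip together with the length of its content.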
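To try this without a zipfile.zip on disk, the same idea can be sketched against an archive built in memory. This is only an illustration, assuming Python 3; the helper names build_sample_archive and member_sizes and the member names are made up for this example.

```python
import io
import zipfile

def build_sample_archive():
    """Return the bytes of a small zip archive built entirely in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("a.txt", "hello")
        archive.writestr("dir/b.txt", "zip files")
    return buffer.getvalue()

def member_sizes(zip_bytes):
    """Map each member name to the length of its content, without unpacking to disk."""
    sizes = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes), "r") as archive:
        for name in archive.namelist():
            sizes[name] = len(archive.read(name))
    return sizes

sizes = member_sizes(build_sample_archive())
for name, size in sizes.items():
    print("File: %s has %d bytes" % (name, size))
```

ZipFile accepts a file-like object as well as a path, which is why io.BytesIO works here.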
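As of this writing, the zipfile module cannot handle multi-disk zip files or zip files with appended comments. Note also that you must open a zip file with the mode 'r', not 'rb', even though the latter may look more natural (especially on Windows). Unlike open, ZipFile does not recognize 'rb'; for ZipFile, 'r' simply means that the archive is opened for reading.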
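The point about 'r' versus 'rb' can be checked with a short sketch. It assumes Python 3, and mode_is_accepted is a hypothetical helper written for this example. The exception type raised for a bad mode has differed between Python versions, so the sketch catches both ValueError and RuntimeError.

```python
import io
import zipfile

# Build a small valid archive in memory so that opening it for reading can succeed.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("a.txt", "hello")

def mode_is_accepted(mode):
    """Return True if ZipFile accepts the mode string, False if it rejects it."""
    try:
        zipfile.ZipFile(io.BytesIO(buffer.getvalue()), mode).close()
    except (ValueError, RuntimeError):
        return False
    return True

print(mode_is_accepted("r"))
print(mode_is_accepted("rb"))
```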
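If a zip file contains Python modules (.py or .pyc files), you can add the path of the zip file to sys.path and then use import to load modules from it. Here is a toy example, purely for illustration, that creates a zip file, imports a module from it, and finally deletes it: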

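import zipfile, tempfile, os, sys

# Create an empty temporary file to hold the archive.
handle, filename = tempfile.mkstemp('.zip')
os.close(handle)
# Add hello.py to the archive directly from a string.
z = zipfile.ZipFile(filename, 'w')
z.writestr('hello.py', 'def f(): return "hello world from " + __file__\n')
z.close()
# Put the archive itself on sys.path so that its modules can be imported.
sys.path.insert(0, filename)
import hello
print(hello.f())
os.unlink(filename)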

Running this example prints something like:

hello world from /tmp/tmpESVzeY.zip/hello.py

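Besides showing that Python can import modules from a zip file, this example also shows how to create (and delete) a temporary file, and how to use writestr to add a file to a zip archive without first creating that file on disk.
Note that the path of the imported module looks like a path inside a directory (here, /tmp/tmpESVzeY.zip/hello.py; because we use a temporary file, the result differs between runs and between systems). In particular, the module's global variable __file__, which holds /tmp/tmpESVzeY.zip/hello.py, looks like a path into a directory tree, but it is not one: you cannot pass it to open to get at the file inside the zip. To read a file stored in a zip archive, use zipfile.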
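The difference between the directory-like path and a real path can be seen in a small sketch. It assumes Python 3; the variable names pseudo_path, opened_directly and source are chosen for this example only.

```python
import os
import tempfile
import zipfile

# Create a temporary archive with a single member, written from a string.
handle, filename = tempfile.mkstemp(".zip")
os.close(handle)
with zipfile.ZipFile(filename, "w") as archive:
    archive.writestr("hello.py", "def f(): return 'hello'\n")

# This looks like a path into a directory, but the archive is a plain file.
pseudo_path = os.path.join(filename, "hello.py")

try:
    open(pseudo_path).close()
    opened_directly = True
except OSError:
    opened_directly = False

# Reading the member through zipfile works.
with zipfile.ZipFile(filename, "r") as archive:
    source = archive.read("hello.py").decode("utf-8")

os.unlink(filename)
print(opened_directly)
print(source)
```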

Reference notes:

z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=True)
 | 
 |  file: Either the path to the file, or a file-like object.
 |        If it is a path, the file will be opened and closed by ZipFile.
 |  mode: The mode can be either read "r", write "w" or append "a".
 |  compression: ZIP_STORED (no compression) or ZIP_DEFLATED (requires zlib).
 |  allowZip64: if True ZipFile will create files with ZIP64 extensions when
 |              needed, otherwise it will raise an exception when this would
 |              be necessary.
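As a rough illustration of the compression parameter described above, the following sketch writes the same data once with ZIP_STORED and once with ZIP_DEFLATED and compares the stored sizes. It assumes Python 3 with zlib available; payload and compressed_size are names invented for this example.

```python
import io
import zipfile

payload = "spam and eggs\n" * 1000  # highly repetitive, so it compresses well

def compressed_size(compression):
    """Write payload into an in-memory archive and return its size inside the archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as archive:
        archive.writestr("payload.txt", payload)
        return archive.getinfo("payload.txt").compress_size

stored = compressed_size(zipfile.ZIP_STORED)
deflated = compressed_size(zipfile.ZIP_DEFLATED)
print("stored: %d bytes, deflated: %d bytes" % (stored, deflated))
```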

mkstemp(suffix='', prefix='tmp', dir=None, text=False)
    mkstemp([suffix, [prefix, [dir, [text]]]])
    User-callable function to create and return a unique temporary
    file.  The return value is a pair (fd, name) where fd is the
    file descriptor returned by os.open, and name is the filename.
   
    If 'suffix' is specified, the file name will end with that suffix,
    otherwise there will be no suffix.
   
    If 'prefix' is specified, the file name will begin with that prefix,
    otherwise a default prefix is used.
   
    If 'dir' is specified, the file will be created in that directory,
    otherwise a default directory is used.
   
    If 'text' is specified and true, the file is opened in text
    mode.  Else (the default) the file is opened in binary mode.  On
    some operating systems, this makes no difference.
   
    The file is readable and writable only by the creating user ID.
    If the operating system uses permission bits to indicate whether a
    file is executable, the file is executable by no one. The file
    descriptor is not inherited by children of this process.
   
    Caller is responsible for deleting the file when done with it.

writestr(self, zinfo_or_arcname, bytes)
    unbound zipfile.ZipFile method
    Write a file into the archive.  The contents is the string
    'bytes'.  'zinfo_or_arcname' is either a ZipInfo instance or
    the name of the file in the archive.
