2007-08-14

 

Python Cookbook 3.7: Parsing Date and Time Data

Problem:

Your program needs to accept date and time data that is not organized in the standard datetime format "yyyy, mm, dd".

Discussion:

The parser module of the third-party dateutil library, dateutil.parser, provides a simple solution:

import dateutil.parser

def tryparse(date):
    # dateutil.parser needs a string argument: let's make one from our
    # `date' argument, according to a few reasonable conventions...:
    kwargs = {}                                    # assume no named-args
    if isinstance(date, (tuple, list)):
        date = ' '.join([str(x) for x in date])    # join up sequences
    elif isinstance(date, int):
        date = str(date)                           # stringify integers
    elif isinstance(date, dict):
        kwargs = dict(date)                        # copy the named-args dict
        date = kwargs.pop('date')                  # with a 'date' str
    try:
        try:
            # first attempt: strict parsing
            parsedate = dateutil.parser.parse(date, **kwargs)
            print('Sharp %r -> %s' % (date, parsedate))
        except ValueError:
            # second attempt: let the parser skip tokens it doesn't know
            parsedate = dateutil.parser.parse(date, fuzzy=True, **kwargs)
            print('Fuzzy %r -> %s' % (date, parsedate))
    except Exception as err:
        print('Try as I may, I cannot parse %r (%s)' % (date, err))

if __name__ == "__main__":
    tests = (
            "January 3, 2003",                     # a string
            (5, "Oct", 55),                        # a tuple
            "Thursday, November 18",               # longer string without year
            "7/24/04",                             # a string with slashes
            "24-7-2004",                           # European-format string
            {'date':"5-10-1955", "dayfirst":True}, # a dict including the kwarg
            "5-10-1955",                           # dayfirst, no kwarg
            19950317,                              # not a string
            "11AM on the 11th day of 11th month, in the year of our Lord 1945",
            )
    for test in tests:                             # testing date formats
        tryparse(test)                             # try to parse

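dateutil.parser works with many date formats, and this recipe demonstrates only some of them. The parser understands English month names and both 2-digit and 4-digit years. When you call it without named arguments, it first tries "mm-dd-yy". If that makes no sense, as with the "24-7-2004" string in the example, it tries "dd-mm-yy", and finally "yy-mm-dd". If a named argument is given (such as dayfirst, used in one of the tests), it parses according to that argument.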
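To make the ordering concrete, here is a minimal sketch that parses the same ambiguous string with and without dayfirst. It only uses dateutil.parser.parse and the dayfirst keyword already seen in the recipe; the variable names are my own.

```python
from dateutil import parser

ambiguous = "5-10-1955"

# No named argument: the month-first reading is tried first.
month_first = parser.parse(ambiguous)

# dayfirst=True: the same string is read as day-month-year.
day_first = parser.parse(ambiguous, dayfirst=True)

# 24 cannot be a month, so the parser falls back to day-month-year.
fallback = parser.parse("24-7-2004")

print(month_first.date())   # 1955-05-10
print(day_first.date())     # 1955-10-05
print(fallback.date())      # 2004-07-24
```

Note that "5-10-1955" parses without complaint either way, so if your input comes from a day-first locale you have to say so explicitly.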
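The test code in this recipe covers some edge cases you may run into, such as passing the date as a tuple, as an integer, or even as a phrase. To exercise keyword arguments, tryparse also accepts a dictionary: the value stored under the key 'date' is the string to parse, and the remaining items are passed to the parser as named arguments.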
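dateutil.parser can also do fuzzy parsing, provided the text gives it some hints about which piece is which, for example the hour (the AM in this recipe's test phrase). For production code, you should avoid relying on fuzzy parsing: do some preprocessing of the input, and check the parsed result.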
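One possible way to follow that advice is sketched below: tidy the input, parse it strictly (no fuzzy fallback), and reject results outside a plausible range. The helper name parse_checked and the EARLIEST/LATEST bounds are hypothetical, chosen only for this example, and the sketch assumes inputs without time zone information.

```python
from datetime import datetime
from dateutil import parser

# Hypothetical bounds for this sketch; pick limits that suit your data.
EARLIEST = datetime(1900, 1, 1)
LATEST = datetime(2100, 1, 1)

def parse_checked(text):
    """Parse without fuzzy matching and reject implausible results."""
    cleaned = " ".join(text.split())    # preprocessing: collapse whitespace
    parsed = parser.parse(cleaned)      # strict: raises ValueError on junk
    # Range check; assumes a naive (timezone-free) result.
    if not EARLIEST <= parsed < LATEST:
        raise ValueError("date out of expected range: %r" % text)
    return parsed

result = parse_checked("  January 3,   2003 ")
print(result.date())                    # 2003-01-03

# Surrounding prose is refused instead of being silently skipped.
try:
    parse_checked("call me on January 3, 2003")
    rejected = False
except ValueError:
    rejected = True
print(rejected)                         # True
```

Whether a range check is enough depends on your data; the point is only that the caller decides what counts as a valid date, instead of trusting whatever the fuzzy parser manages to extract.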
