2007-08-14
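Python Cookbook 3.7: Parsing Date and Time Data
Problem:
Your program needs to accept date and time data that is not organised in the standard datetime format "yyyy, mm, dd".
Discussion:
The third-party dateutil package offers a simple solution in its dateutil.parser module: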
import datetime
import dateutil.parser

def tryparse(date):
    # dateutil.parser needs a string argument: let's make one from our
    # `date' argument, according to a few reasonable conventions...:
    kwargs = {}                                    # assume no named-args
    if isinstance(date, (tuple, list)):
        date = ' '.join([str(x) for x in date])    # join up sequences
    elif isinstance(date, int):
        date = str(date)                           # stringify integers
    elif isinstance(date, dict):
        kwargs = dict(date)                        # copy, so the caller's dict is not modified
        date = kwargs.pop('date')                  # with a 'date' str
    try:
        try:
            # first attempt: strict parsing
            parsedate = dateutil.parser.parse(date, **kwargs)
            print('Sharp %r -> %s' % (date, parsedate))
        except ValueError:
            # second attempt: fuzzy parsing, which skips unknown tokens
            parsedate = dateutil.parser.parse(date, fuzzy=True, **kwargs)
            print('Fuzzy %r -> %s' % (date, parsedate))
    except Exception as err:
        print('Try as I may, I cannot parse %r (%s)' % (date, err))

if __name__ == "__main__":
    tests = (
        "January 3, 2003",                         # a string
        (5, "Oct", 55),                            # a tuple
        "Thursday, November 18",                   # longer string without year
        "7/24/04",                                 # a string with slashes
        "24-7-2004",                               # European-format string
        {'date': "5-10-1955", "dayfirst": True},   # a dict including the kwarg
        "5-10-1955",                               # dayfirst, no kwarg
        19950317,                                  # not a string
        "11AM on the 11th day of 11th month, in the year of our Lord 1945",
    )
    for test in tests:                             # testing date formats
        tryparse(test)                             # try to parse
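dateutil.parser handles many date formats; this recipe demonstrates only some of them. The parser understands English month names and two- or four-digit years. When you call it without keyword arguments, it first tries "mm-dd-yy". If that gives a meaningless result, as with "24-7-2004" in the example, it tries "dd-mm-yy", and finally "yy-mm-dd". If keyword arguments (such as dayfirst) are given, the parser follows them instead.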
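The ordering rules can be checked directly. The following is a minimal sketch, assuming dateutil is installed; the variable names are illustrative only and are not part of the recipe:

```python
from dateutil import parser

# Ambiguous input: both 5 and 10 could be a month, so month-first wins by default.
month_first = parser.parse("5-10-1955")

# The dayfirst keyword argument flips that interpretation.
day_first = parser.parse("5-10-1955", dayfirst=True)

# 24 cannot be a month, so the parser falls back to day-first on its own.
european = parser.parse("24-7-2004")

# A leading four-digit number can only be a year.
year_first = parser.parse("2004-07-24")

for label, value in [("month first", month_first), ("day first", day_first),
                     ("european", european), ("year first", year_first)]:
    print(label, value.date())
```

The point of the sketch is that "5-10-1955" silently changes meaning depending on dayfirst, so any input whose day and month are both 12 or less deserves an explicit keyword argument.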
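The test code in this recipe covers several edge cases you may run into, such as passing the date as a tuple, as an integer, or even as a phrase. To exercise keyword arguments, tryparse also accepts a dictionary: the value under the key date is the string to parse, and the remaining items are passed to the parser as keyword arguments.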
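dateutil.parser can also parse fuzzily: given a few hints, such as an hour marker (AM in this recipe's example), it picks a date out of the surrounding text. In production code you are better off avoiding fuzzy parsing: preprocess the input yourself, and check the parsed result.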
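One way to follow that advice is sketched below. The helper parse_checked and its date bounds are hypothetical, chosen only for illustration; fuzzy_with_tokens is a dateutil keyword argument that also returns the text the parser skipped, which makes it easier to see how much of the input was ignored:

```python
from datetime import datetime
from dateutil import parser

def parse_checked(text, earliest=datetime(1900, 1, 1), latest=datetime(2100, 1, 1)):
    """Hypothetical helper: strict parse followed by a plausibility check."""
    cleaned = " ".join(text.split())     # preprocessing: normalise whitespace
    parsed = parser.parse(cleaned)       # strict mode: raises ValueError on unknown text
    # Compare without tzinfo so the naive bounds also work for aware results.
    if not earliest <= parsed.replace(tzinfo=None) <= latest:
        raise ValueError("date out of expected range: %r" % (text,))
    return parsed

sentence = "Today is January 1, 2047 at 8:21:00AM"

# Fuzzy parsing accepts the sentence and reports what it skipped.
fuzzy_result, skipped = parser.parse(sentence, fuzzy_with_tokens=True)
print(fuzzy_result, skipped)

# The strict helper refuses the same sentence instead of guessing.
try:
    parse_checked(sentence)
except ValueError as err:
    print("strict parse rejected it:", err)

print(parse_checked("  January 3,   2003 "))
```

The range limits here are arbitrary assumptions; in real code they would come from what the application considers a plausible date.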
Tags: Python