json - Python and Pandas: UnicodeDecodeError: 'ascii' codec can't decode byte
After using pandas to read a JSON object, I want to print the first year in each row. E.g., if I have 2013-2014(2015), I want to print 2013.
Full code:
import pandas as pd

x = '{"0":"1985\\u2013present","1":"1985\\u2013present",......}'
a = pd.read_json(x, typ='series')
for i, row in a.iteritems():
    print row.split('-')[0].split('—')[0].split('(')[0]
The following error occurs:
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-1333-d8ef23860c53> in <module>()
      1 for i, row in a.iteritems():
----> 2     print row.split('-')[0].split('—')[0].split('(')[0]

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
Why is this happening? How can I fix the problem?
The strings in your JSON data are unicode strings. You can see this, for example, by printing one of the values:
In: a[0]
Out: u'1985\u2013present'
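Now you try to split this string at the unicode character \u2013 (en dash), but the string you give to split is not a unicode string. In Python 2, splitting a unicode string on a byte string makes Python decode the separator with the default ASCII codec first, and that decoding fails. Hence the error 'ascii' codec can't decode byte 0xe2: the en dash is not an ASCII character, and its UTF-8 encoding starts with the byte 0xe2.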
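The decoding failure described above can be reproduced in isolation. This is only a minimal sketch, assuming the dash was typed into a UTF-8 terminal or file, so the byte string holds the bytes 0xe2 0x80 0x93; the variable names are made up for illustration:

```python
# -*- coding: utf-8 -*-
# Assumption: the separator was entered as UTF-8, so a plain (byte) string
# containing an en dash (U+2013) holds these three bytes.
dash_bytes = b'\xe2\x80\x93'

# Decoding the bytes as ASCII fails on the very first byte, 0xe2.
# Python 2 attempts this decoding implicitly when a byte string is
# mixed with a unicode string.
try:
    dash_bytes.decode('ascii')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)

# Decoding with the codec that matches the bytes gives the en dash.
dash_text = dash_bytes.decode('utf-8')
```

The point of the sketch is only that the bytes themselves are fine; the failure comes from treating them as ASCII.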
To make your example work, use:
for i, row in a.iteritems():
    print row.split('-')[0].split(u'—')[0].split('(')[0]
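Notice the u in front of the unicode dash. You can also write u'\u2013' to split the string, which names the en dash explicitly and avoids confusion with visually similar dashes.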
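If the chained split calls get hard to read, one alternative is a small helper built on a regular expression. This is a sketch rather than the only way to do it: first_year and _SEPARATORS are hypothetical names, and including the em dash (\u2014) among the separators is an assumption about what the data might contain.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import re

# Separators: ASCII hyphen, en dash (U+2013), em dash (U+2014, an assumption),
# and an opening parenthesis. The pattern is a unicode string, so no implicit
# ASCII decoding is needed when it meets unicode data.
_SEPARATORS = re.compile(u'[-\u2013\u2014(]')


def first_year(value):
    """Return the text before the first separator (hypothetical helper)."""
    # Split at most once and keep the part before the separator.
    return _SEPARATORS.split(value, 1)[0].strip()


samples = [u'1985\u2013present', u'2013-2014(2015)']
results = [first_year(s) for s in samples]
print(results)
```

With the Series from the question, something like a.map(first_year) should apply the helper to every row.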
For details on unicode in Python, see https://docs.python.org/2/howto/unicode.html