Python numpy loadtxt fails with date time

I am trying to use numpy loadtxt to load a CSV file into an array, but I can't seem to get the date and time loaded correctly.

The session below shows what happens. Am I doing something wrong?

>>> from StringIO import StringIO
>>> import numpy as np
>>> s = StringIO("05/21/2007,03:27")
>>> np.loadtxt(s, delimiter=",", dtype={'names':('date','time'), 'formats':('datetime64[D]', 'datetime64[m]')})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/npyio.py", line 796, in loadtxt
    items = [conv(val) for (conv, val) in zip(converters, vals)]
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/npyio.py", line 573, in <lambda>
    return lambda x: int(float(x))
ValueError: invalid literal for float(): 05/21/2007

Answers


You also need to supply converters, for example:

from matplotlib.dates import strpdate2num
...
np.loadtxt(s, delimiter=",", converters={0:strpdate2num('%m/%d/%Y'), 1:...}, dtype= ...

When numpy sees a dtype format of datetime64, it prepares to output a column of type numpy.datetime64. In the numpy version shown in the traceback, numpy.datetime64 is a subclass of numpy.integer, so loadtxt prepares to treat that column as an integer, using the following:

def _getconv(dtype):
    typ = dtype.type
    if issubclass(typ, np.bool_):
        return lambda x: bool(int(x))
    if issubclass(typ, np.uint64):
        return np.uint64
    if issubclass(typ, np.int64):
        return np.int64
    if issubclass(typ, np.integer):
        return lambda x: int(float(x))

    ...

When it gets to the point of attempting the conversion at line 796 in npyio.py:

items = [conv(val) for (conv, val) in zip(converters, vals)]

it tries to use lambda x: int(float(x)) to handle the input. That means casting your date string (05/21/2007) to a float, which raises the ValueError shown in the traceback. The conversion function strpdate2num above converts the date to a number representation instead.
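To make the failure concrete, here is a minimal sketch that uses only the standard library. The name default_conv is hypothetical and simply stands in for the fallback lambda quoted from _getconv; datetime.strptime is used here as an illustrative parser, not as what loadtxt does internally.

```python
from datetime import datetime

# Stand-in for the fallback converter that _getconv returns for integer-like dtypes.
default_conv = lambda x: int(float(x))

# Numeric text converts without trouble.
numeric_result = default_conv("42.0")

# A date string is not a valid float literal, so float() raises ValueError.
try:
    default_conv("05/21/2007")
    date_failed = False
except ValueError:
    date_failed = True

# An explicit parser that knows the date format succeeds where float() cannot.
parsed = datetime.strptime("05/21/2007", "%m/%d/%Y")
```

This is why a converter has to be passed for the date column: without one, the text never reaches any code that understands the date format.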


Trying MichealJCox's solution did not work for me. My version of numpy (1.8) would not accept the date number returned by strpdate2num('%m/%d/%Y'); it would only accept a date string or a datetime object. I therefore used a more complex converter, which converts the date string to a number and then to a datetime object that numpy can use:

from matplotlib.dates import strpdate2num, num2date
...
convert = lambda x: num2date(strpdate2num('%m/%d/%Y')(x))
np.loadtxt(s, delimiter=",", converters={0:convert}, dtype= ...

This seems like a bulky solution though.
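If the matplotlib round trip feels heavy, one possible alternative is to skip loadtxt for the date parsing and build the array from datetime objects directly. The sketch below is not from either answer: load_timestamps is a hypothetical helper, it assumes Python 3 and two-column "date,time" rows, and it merges both columns into a single datetime64[m] value rather than keeping two separate fields.

```python
import csv
from datetime import datetime
from io import StringIO

import numpy as np


def load_timestamps(handle, fmt="%m/%d/%Y %H:%M"):
    """Parse 'date,time' rows into a 1-D datetime64[m] array (hypothetical helper)."""
    stamps = [
        # Join the two columns so one strptime call can parse the full timestamp.
        datetime.strptime(f"{date_text} {time_text}", fmt)
        for date_text, time_text in csv.reader(handle)
    ]
    # numpy converts datetime objects to datetime64 at the requested resolution.
    return np.array(stamps, dtype="datetime64[m]")


stamps = load_timestamps(StringIO("05/21/2007,03:27\n05/22/2007,14:05\n"))
```

The trade-off is that this only handles the date and time columns; any other columns in the file would still need to be read separately.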

