csv2rec handles dates differently to datetimes when dayfirst is specified.

If a test.csv file contains date-and-time entries (i.e. the file contains only "11/01/14 12:00:01 AM"), and you run the following script:

```
import datetime

import matplotlib.mlab as mlab

# test.csv contains the single entry: 11/01/14 12:00:01 AM
dataset = mlab.csv2rec('test.csv', delimiter=',', names=['datetime'], dayfirst=True)

# Workaround: pass an explicit converter for column 0 via converterd.
# %I (12-hour clock) is needed for %p (AM/PM) to take effect.
# dataset = mlab.csv2rec('test.csv', delimiter=',', names=['datetime'], dayfirst=True,
#                        converterd={0: lambda x: datetime.datetime.strptime(x, '%d/%m/%y %I:%M:%S %p')})

for elem in dataset:
    print(elem)
```

...then the printed result misinterprets the datetime string by swapping the day and the month.

The problem is that the candidate converters in mlab.csv2rec handle formats inconsistently. If the entry is a simple date ("11/01/14"), the date is echoed correctly by the print statement. If the column in the .csv file contains a date _and_ a time (for instance "11/01/14 12:00:01 AM"), the entry is interpreted as though dayfirst and yearfirst had _NOT_ been specified, and so is NOT echoed correctly.

This happens because the mydate converter (which uses dayfirst and yearfirst) raises an error if d.hour, d.minute or d.second is greater than zero. The alternative mydateparser converter, which does not raise an error, ignores dayfirst and yearfirst and so assumes month first. As a result, any datetime string in the .csv file that is not in month-first format and has a non-zero hour, minute or second is interpreted incorrectly.

In other words, mydateparser is _NOT_ consistent with mydate. mydateparser should follow the approach used for mydate and pass on the dayfirst and yearfirst arguments, so that datetime entries in the .csv file column are interpreted according to dayfirst and/or yearfirst when these are specified as True in the call.
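To make the fallback easier to follow, here is a minimal standalone sketch of the behaviour described above. It is not the matplotlib source: `parse_stub` is a hypothetical stand-in for the dateutil parser that only understands the two formats in this report, `convert` is a hypothetical helper, and the real csv2rec picks a converter per column rather than per value.

```python
import datetime

def parse_stub(text, dayfirst=False):
    """Hypothetical stand-in for the dateutil parser (two formats only)."""
    date_fmt = '%d/%m/%y' if dayfirst else '%m/%d/%y'
    for fmt in (date_fmt + ' %I:%M:%S %p', date_fmt):
        try:
            return datetime.datetime.strptime(text, fmt)
        except ValueError:
            pass
    raise ValueError('unrecognised date string: %r' % text)

def mydate(text, dayfirst):
    # Date-only converter: honours dayfirst but rejects any time component.
    d = parse_stub(text, dayfirst=dayfirst)
    if d.hour > 0 or d.minute > 0 or d.second > 0:
        raise ValueError('not a date')
    return d.date()

def mydateparser(text):
    # Datetime converter as reported: dayfirst is never passed on.
    return parse_stub(text)

def convert(text, dayfirst):
    # Try the date converter first, then fall back to the datetime converter.
    try:
        return mydate(text, dayfirst)
    except ValueError:
        return mydateparser(text)

date_only = convert('11/01/14', dayfirst=True)
with_time = convert('11/01/14 12:00:01 AM', dayfirst=True)
print(date_only)   # day-first reading: 11 January 2014
print(with_time)   # month-first reading: 1 November 2014
```

The same string prefix ends up as January or November depending only on whether a non-zero time follows it, which is the inconsistency reported here.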

A fix that works in my own C:\Python33\Lib\site-packages\matplotlib\mlab.py file is to replace this line:

```
mydateparser = with_default_value(dateparser, datetime.date(1,1,1))
```

with this block:

```
def mydateparser(x):
    # try and return a datetime object, honouring dayfirst and yearfirst
    return dateparser(x, dayfirst=dayfirst, yearfirst=yearfirst)
mydateparser = with_default_value(mydateparser, datetime.datetime(1, 1, 1, 0, 0, 0))
```

The problem also occurs under Python 3.4, and probably under other releases as well. It appears to be a design bug, and a dangerous one: the behaviour is inconsistent with the rest of the function, and it can silently leave the user with invalid data even though the correct format was specified in the function call.

The proposed fix shouldn't break any code that uses the workaround of explicitly specifying a converterd in the csv2rec arguments (which also works). That workaround is shown commented out in the script above.

I am assuming that the test for date objects in mydate should continue to check that hour, minute and second are all equal to 0, since I presume it was put in for a reason (which I don't understand). A simpler solution might be to modify that part of the code so that it returns a date in the correct format when hour, minute and second are all zero, and a datetime in the correct format otherwise.
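A rough sketch of that simpler alternative, written as standalone code rather than as a patch to mlab.py. `parse_stub` is a hypothetical stand-in for the dateutil parser that only understands the two formats in this report, and `date_or_datetime` is a hypothetical name for the combined converter.

```python
import datetime

def parse_stub(text, dayfirst=False):
    """Hypothetical stand-in for the dateutil parser (two formats only)."""
    date_fmt = '%d/%m/%y' if dayfirst else '%m/%d/%y'
    for fmt in (date_fmt + ' %I:%M:%S %p', date_fmt):
        try:
            return datetime.datetime.strptime(text, fmt)
        except ValueError:
            pass
    raise ValueError('unrecognised date string: %r' % text)

def date_or_datetime(text, dayfirst=False):
    # Parse once, always honouring dayfirst, then choose the return type.
    d = parse_stub(text, dayfirst=dayfirst)
    if d.hour == 0 and d.minute == 0 and d.second == 0:
        return d.date()   # no time component: plain date
    return d              # time component present: full datetime

as_date = date_or_datetime('11/01/14', dayfirst=True)
as_datetime = date_or_datetime('11/01/14 12:00:01 AM', dayfirst=True)
print(as_date, as_datetime)
```

Both values are now read day first. One open question this sketch does not settle is how a column should be typed if it mixes plain dates with datetimes (an entry at exactly midnight would come back as a date), which may be the reason the two converters were kept separate.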

I'm new to this bug reporting system, so apologies if I've done anything unconventional or improper. Please just point me in the right direction! I've had a crack at creating a pull request (best way to learn!), but I'm happy for people to suggest how to do the pull request better.


csv2rec handles dates differently to datetimes when dayfirst is specified. #6184
