Wide arg read csv excel (issue #574) #588

alixdamman · 2018-02-21T11:58:28Z

Documentation for the `wide` argument is not yet written. I will write it once the implementation is OK.

alixdamman · 2018-02-21T12:41:32Z

@gdementen add argument wide to Session.save and load (assuming all arrays are stored in wide/narrow format)?

gdementen · 2018-02-21T12:07:30Z

doc/source/changes/version_0_28.rst.inc

@@ -306,6 +317,8 @@ Miscellaneous improvements
  Closes :issue:`549`.


+


Isn't this a bit too many blank lines?

gdementen · 2018-02-21T12:07:53Z

larray/inout/array.py

@@ -205,6 +205,9 @@ def df_aslarray(df, sort_rows=False, sort_columns=False, raw=False, parse_header
    parse_header : bool, optional
        Whether or not to parse columns labels. Pandas treats column labels as strings.
        If True, column labels are converted into int, float or boolean when possible. Defaults to True.
+    wide: bool, optional
+        ...


gdementen · 2018-02-21T12:09:26Z

larray/inout/array.py

+    if not wide:
+        if nb_axes is not None:
+            nb_axes = None
+            warnings.warn("`nb_axes` argument cannot be used when `wide` argument is False")


shouldn't this be an exception instead of a warning?
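
To make the suggestion concrete, here is a minimal sketch of the exception variant. The helper name `check_narrow_args` is hypothetical and `ValueError` is an assumed choice of exception type; neither comes from the PR, and the messages are simply reused from the diff above.

```python
def check_narrow_args(wide, nb_axes=None, index_col=None):
    """Reject arguments that only make sense for the wide format.

    Hypothetical helper, not larray code: it raises instead of silently
    resetting the argument and emitting a warning.
    """
    if wide:
        # Both arguments are allowed in wide format: nothing to check.
        return
    if nb_axes is not None:
        raise ValueError("`nb_axes` argument cannot be used when `wide` argument is False")
    if index_col is not None:
        raise ValueError("`index_col` argument cannot be used when `wide` argument is False")
```

With this variant, a call such as `check_narrow_args(wide=False, nb_axes=2)` fails immediately, so the caller cannot miss that the argument would have been ignored.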

gdementen · 2018-02-21T12:09:42Z

larray/inout/array.py

+            warnings.warn("`nb_axes` argument cannot be used when `wide` argument is False")
+        if index_col is not None:
+            index_col = None
+            warnings.warn("`index_col` argument cannot be used when `wide` argument is False")


gdementen · 2018-02-21T12:09:59Z

larray/inout/array.py

@@ -288,6 +327,9 @@ def read_csv(filepath_or_buffer, nb_axes=None, index_col=None, sep=',', headerse
    sort_columns : bool, optional
        Whether or not to sort the columns alphabetically (sorting is more efficient than not sorting).
        Defaults to False.
+    wide : bool, optional
+        ...


gdementen · 2018-02-21T12:12:44Z

larray/inout/array.py

@@ -540,6 +607,9 @@ def from_lists(data, nb_axes=None, index_col=None, fill_value=np.nan, sort_rows=
    sort_columns : bool, optional
        Whether or not to sort the columns alphabetically (sorting is more efficient than not sorting).
        Defaults to False.
+    wide: bool, optional
+        ...


gdementen · 2018-02-21T12:14:34Z

larray/tests/data/test1d.csv

@@ -1,2 +1,2 @@
-time,2007,2010,2013
+a,a0,a1,a2


Using a0 etc. is probably a good idea, but then we need another test for int-like labels (to check that they are parsed to int correctly and do not stay strings). This could use from_string or from_lists instead of a file though.

Isn't that already the case for from_lists and from_string?

The from_lists test does not count, as the year labels already are of the correct type (it uses 1991, not '1991'). There is also no from_string doctest with int-like labels in the last axis, and the tests do not check the type of the labels.
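
A test of the kind asked for here could look like the sketch below. It is only an illustration in plain Python: `parse_label` is a hypothetical stand-in for the header parsing described in the `parse_header` docstring (conversion to int, float or boolean when possible), not larray's actual implementation. The point is that the test starts from string labels and asserts on the type of the result, not only on its value.

```python
def parse_label(label):
    """Convert a string label to bool, int or float when possible."""
    text = label.strip()
    if text in ("True", "False"):
        return text == "True"
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    # Not number-like: keep the original string.
    return label


def test_int_like_labels_are_parsed_to_int():
    # Explicitly checks the *type* of the parsed labels, not only their value.
    header = "a\\b,1991,1992,1993"
    labels = [parse_label(s) for s in header.split(",")[1:]]
    assert labels == [1991, 1992, 1993]
    assert all(type(label) is int for label in labels)
    # Labels that are not number-like must stay strings.
    assert parse_label("b0") == "b0"
```

Naming the test after what it checks also documents the intent, so the behaviour is less likely to be broken by accident.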

gdementen · 2018-02-21T12:17:09Z

larray/tests/data/test2d.csv

-a\b,0,1,2
-0,0,1,2
-1,3,4,5
+a\b,b0,b1,b2


wouldn't using a 2x3 (or 3x2) array instead of this 3x3 make all tests smaller?

gdementen · 2018-02-21T12:19:22Z

larray/tests/data/testmissing_values_narrow.csv

@@ -0,0 +1,22 @@
+a,b,c,value


would be nice if this test also included missing values for c
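
For readers unfamiliar with the two layouts: a narrow file stores one value per row (here with columns `a,b,c,value`), while a wide file spreads the last axis over the columns. The sketch below pivots narrow text into wide rows using only the standard library. The function name `narrow_to_wide` and the `fill_value` default are assumptions made for this illustration; it is not how `df_aslarray` is implemented.

```python
import csv
import io
import math


def narrow_to_wide(text, fill_value=math.nan):
    """Pivot narrow CSV text (one value per row) into a wide header and rows.

    Assumes at least two axes; the last axis column becomes the horizontal
    axis. Combinations absent from the input are filled with `fill_value`.
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    header, records = rows[0], rows[1:]
    *row_axes, col_axis, _value_name = header
    row_keys, col_labels, cells = [], [], {}
    for *key, col, value in records:
        key = tuple(key)
        # Keep labels in order of first appearance.
        if key not in row_keys:
            row_keys.append(key)
        if col not in col_labels:
            col_labels.append(col)
        cells[key, col] = float(value)
    # The last two axis names share one header cell, as in "a\b,b0,b1".
    wide_header = row_axes[:-1] + [row_axes[-1] + "\\" + col_axis] + col_labels
    wide_rows = [list(key) + [cells.get((key, col), fill_value) for col in col_labels]
                 for key in row_keys]
    return wide_header, wide_rows


narrow = "a,b,value\na0,b0,0\na0,b1,1\na1,b1,3\n"
header, rows = narrow_to_wide(narrow)
print(header)  # ['a\\b', 'b0', 'b1']
print(rows)    # [['a0', 0.0, 1.0], ['a1', nan, 3.0]]
```

The `(a1, b0)` combination is absent from the narrow input, so it shows up as a missing value in the wide output. This is the case a test file with missing labels on the last axis would exercise.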

gdementen · 2018-02-21T12:21:53Z

larray/tests/test_array.py

-        self.assertEqual(la.axes.names, ['arr', 'age', 'sex', 'nat', 'time'])
-        assert_array_equal(la[X.arr[1], 0, 'F', X.nat[1], :],
-                           [3722, 3395, 3347])
+        arr = read_excel(inputpath('test.xlsx'), '5d')


I'm unsure having a 5d test makes sense. I think it was just a relic of what we initially used as test data.

gdementen · 2018-02-21T14:01:55Z

> @gdementen add argument wide to Session.save and load (assuming all arrays are stored in wide/narrow format)?

Would make sense, but that's lower priority IMO => open an issue for this?

gdementen · 2018-02-21T14:09:01Z

btw: using ndtest to create the test files is a good idea and was long overdue. This partly solves #26.

gdementen · 2018-02-22T13:11:50Z

larray/tests/data/test2d.csv

-a1,3,4,5
-a2,6,7,8
+a\b,b0,b1
+1,0,1


My comment about int-like labels was meant for the "horizontal" dimension, which is not handled by pandas but parsed manually by us. Having a test for int-like labels in a column is interesting too (to make sure pandas does not change its handling of them) but less critical. In any case, we should have a test that states explicitly, in a comment (or in the test name?), that this is what it tests, so that we do not break it accidentally.

If I add a new test array with 'int' labels for all axes, is that OK?

alixdamman requested a review from gdementen February 21, 2018 11:58

gdementen added the in progress label Feb 21, 2018

gdementen reviewed Feb 21, 2018

gdementen removed the in progress label Feb 21, 2018

gdementen reviewed Feb 22, 2018

gdementen approved these changes Feb 22, 2018

alixdamman added 2 commits February 22, 2018 16:07

updated read/write unittests --> use ndtest to generate data:

37151a9

- updated CSV test files + test.xlsx
- updated unittests test_read_csv, test_read_excel_pandas, test_read_excel_xlwings and test_to_csv

fix larray-project#574 : added argument 'wide' to read_csv, read_excel, from_lists and from_strings functions and updated df_aslarray so as to be able to load arrays stored in narrow format

7964815

alixdamman force-pushed the wide_arg_read_csv_excel_574 branch from 6a9e5dc to 7964815 February 22, 2018 15:08

alixdamman merged commit 2f5b0d3 into larray-project:master Feb 22, 2018

alixdamman deleted the wide_arg_read_csv_excel_574 branch February 22, 2018 15:13
