better support for duplicate labels (on the same axis) #1126

We cannot load arrays with more than 2 dimensions when an axis has duplicate labels:
 
```python
from larray import ndtest, read_hdf, read_csv, read_excel

# 2 x 2 x 2 array whose "b" axis has the same label twice
arr = ndtest("a=a0,a1;b=x,x;c=c0,c1")

# each round trip below fails independently on the read_* call
arr.to_hdf('test.h5', 'arr')
arr = read_hdf('test.h5', 'arr')
# ValueError: cannot reshape array of size 8 into shape (2,1,2)

arr.to_csv('test.csv')
arr = read_csv('test.csv')
# ValueError: cannot handle a non-unique multi-index!

arr.to_excel('test.xlsx')
arr = read_excel('test.xlsx')
# ValueError: cannot handle a non-unique multi-index!
```

For HDF, this is clearly a limitation in larray's code. In `pandas.py/index_to_labels`, I used the following code:

```python
return [unique_list(idx.get_level_values(label)) for label in range(idx.nlevels)]
```

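where `unique_list` returns the unique labels for that index "level". That obviously breaks in the presence of duplicate labels: the duplicates are collapsed into a single label, so the reconstructed axis is shorter than the data it is supposed to describe (hence the `(2,1,2)` shape in the error above).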
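To make the failure mode concrete, here is a minimal sketch that reproduces the same reshape error without larray. It assumes a simplified layout where all three axes are stored as levels of one `MultiIndex` (the real HDF layout may differ), and `unique_list_sketch` is a hypothetical stand-in for larray's `unique_list`, not its actual implementation.

```python
import numpy as np
import pandas as pd


def unique_list_sketch(values):
    """Order-preserving de-duplication (hypothetical stand-in for unique_list)."""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


# Assumed simplified layout: one index level per axis, 2 x 2 x 2 = 8 rows,
# with both labels of axis "b" equal to 'x'.
idx = pd.MultiIndex.from_arrays(
    [
        ['a0', 'a0', 'a0', 'a0', 'a1', 'a1', 'a1', 'a1'],
        ['x', 'x', 'x', 'x', 'x', 'x', 'x', 'x'],
        ['c0', 'c1', 'c0', 'c1', 'c0', 'c1', 'c0', 'c1'],
    ],
    names=['a', 'b', 'c'],
)
data = np.arange(8)

# Same expression as in the issue, with the stand-in helper.
labels = [unique_list_sketch(idx.get_level_values(level)) for level in range(idx.nlevels)]

# Axis "b" collapses from 2 labels to 1, so the inferred shape only has 4 cells.
shape = tuple(len(axis_labels) for axis_labels in labels)

try:
    data.reshape(shape)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(shape)
print(error_message)
```

Under these assumptions, the inferred shape has fewer cells than the 8 stored values, which is why the reshape fails. Any fix on the larray side would need to recover the axis lengths without de-duplicating labels.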

For CSV and Excel, this is not so clear-cut. This seems to be a limitation of pandas' reindex, and I am unsure we can do anything about it (other than not going through pandas to load the data).
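The pandas side can be probed in isolation with a sketch like the one below. It is only an approximation of what happens during loading: the frame layout (axes `a` and `b` as row levels, `c` as columns) and the `target` index are assumptions made for illustration, and the exact error text may vary between pandas versions, so the message is captured rather than hard-coded.

```python
import numpy as np
import pandas as pd

# Assumed layout: axes "a" and "b" as row index levels, axis "c" as columns.
rows = pd.MultiIndex.from_arrays(
    [['a0', 'a0', 'a1', 'a1'], ['x', 'x', 'x', 'x']],
    names=['a', 'b'],
)
df = pd.DataFrame(np.arange(8).reshape(4, 2), index=rows, columns=['c0', 'c1'])

# The duplicate labels on "b" make the row index non-unique.
has_duplicates = not df.index.is_unique
n_duplicate_rows = int(df.index.duplicated().sum())

# Hypothetical target index, different from the current one, so that pandas
# has to compute an indexer instead of short-circuiting.
target = pd.MultiIndex.from_product([['a0', 'a1', 'a2'], ['x']], names=['a', 'b'])

try:
    df.reindex(target)
    reindex_error = None
except ValueError as exc:
    reindex_error = str(exc)

print(has_duplicates, n_duplicate_rows)
print(reindex_error)
```

Checking `is_unique` before reindexing at least allows a clearer error message to be raised on the larray side, even if the reindex itself cannot be made to work.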
