Allow to save and load all Axis and Group objects of a session in/from HDF, CSV and EXCEL files #578
Comments
@gdementen Implementing #81 (LFrame) and #6 (multiversion labels) would greatly help spread the use of LArray among our current and potential users, but in the meantime it is possible to mix pandas and LArray. Unfortunately, it is really easy to modify I suggest to adapt the argument |
I do not really like this option (but it might be the best option anyway, this needs more thought). At least I hope it would not be required. Can't we autodetect what we have? At least for .h5, we could store some extra metadata to make this possible. For Excel and .csv, we could return an array when \ is present and a DataFrame when \ is not present. That would be backward incompatible though, and it would break again when we support LFrame. Also, names are ordered, while the dict is not guaranteed to be ordered on Python < 3.7. |
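The header-based autodetection suggested above for .csv files could look roughly like this. It is a minimal sketch and not LArray code: `guess_kind` is a hypothetical helper, and it assumes that array files name their axes in a header cell such as `age\sex`, while plain tables never use a backslash in their header.

```python
import csv
import io


def guess_kind(csv_text, sep=","):
    """Guess what a CSV holds from its first row (hypothetical helper).

    Returns "array" if any header cell contains a backslash, else "frame".
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=sep)
    header = next(reader, [])  # an empty file yields an empty header
    return "array" if any("\\" in cell for cell in header) else "frame"


# Assumed layouts, for illustration only.
array_csv = "age\\sex,M,F\n0,10,12\n1,11,13\n"
frame_csv = "name,age\nAlice,30\nBob,25\n"

print(guess_kind(array_csv))
print(guess_kind(frame_csv))
```

Note that such a rule cannot tell a group or an axis from a 1D array, so it only covers the array versus DataFrame case.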
The whole point behind this is to be able to save Axis and Group objects. Maybe a special object to simulate #6 also. |
@gdementen I don't see any obvious way to guess if we are dealing with a group or an axis or a 1D array when reading data from a CSV file or Excel sheet. |
The @ idea is interesting, but I think we are looking at it from the wrong angle. There are potentially several different features involved here. First, if we are loading one axis (or one group) from a file, the user should know what s/he is loading and could use a specific function to load it, e.g. read_axis(), read_group(), or similar. It is probably valuable to have a custom format to save/load those one at a time, and in that case the @ solution you describe seems a good compromise between readability, simplicity and functionality. FWIW, I prefer to see "realistic" examples to gauge syntaxes. Your above proposals would be:
But, what I think users need more is
That was my first thought actually, but I see users coming. I'm pretty sure some of them will ask to export arrays with their associated groups in the same sheets and then to be able to reload arrays and groups in one operation. Imagine an array called Another thing I'm worried about: do we force Groups and Axis objects to be stored vertically or horizontally? |
No, no, no... Users cannot have it both ways. They might complain indeed, but IMO it is setting the bar too high to have a way to save/load them exactly like users want and also have them saved/loaded all together. These are two different features. The session thing should be seen as an internal format. If the internal format can be used directly by users, that's all the better, but this is not even required. However, we need to make it as easy as possible to define axes and groups from arbitrary .csv and Excel files (i.e., #155/use any format the user likes), but then this is a one-object-at-a-time process. In the mid- to long-term future we might want to let users define their own custom format/template for saving or loading many axes at once, but this is a lot less useful than the other two features. |
For 1, whatever is most convenient to implement, so I guess horizontal. |
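To make the "horizontal" option concrete, here is a minimal sketch using only the standard library. The layout (one row per axis, with the axis name in the first cell and the labels after it) and the helper names `dump_axes` and `load_axes` are assumptions for illustration; they are not an existing LArray format or API.

```python
import csv
import io


def dump_axes(axes):
    """Serialize {axis_name: labels} with one axis per row (horizontal layout)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for name, labels in axes.items():
        writer.writerow([name, *labels])
    return buf.getvalue()


def load_axes(text):
    """Read back the layout written by dump_axes; labels come back as strings."""
    axes = {}
    for row in csv.reader(io.StringIO(text)):
        if row:  # skip blank lines
            axes[row[0]] = row[1:]
    return axes


axes = {"sex": ["M", "F"], "age": ["0", "1", "2"]}
text = dump_axes(axes)
print(text)
print(load_axes(text) == axes)
```

One thing the sketch makes visible: label types are lost in a plain CSV round trip, so integer labels would need to be converted back on load.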
The name of the Sheet/CSV file/HDF group for axes and groups could be defined by two additional arguments with default values: |
…project#578) :
- added to_hdf method to Axis and Group
- updated read_hdf (inout/hdf.py)
- updated doctests of Session.load and Session.save
- added context manager LHDFStore (utils/misc.py)

refactored package inout: created one module per file extension or external object type like in pandas/io/:
new modules:
- common.py
- pandas.py
- csv.py
- excel.py
- hdf.py
- sas.py
- misc.py
- pickle.py
renamed modules:
- excel.py --> xw_excel.py
deleted modules:
- array.py
…project#578) :
- added to_hdf method to Axis and Group
- updated read_hdf (inout/hdf.py)
- updated documentation of Session's methods
- updated doctests of Session.load and Session.save
- added context manager LHDFStore (utils/misc.py)

refactored package inout: created one module per file extension or external object type like in pandas/io/:
new modules:
- common.py
- pandas.py
- csv.py
- excel.py
- hdf.py
- sas.py
- misc.py
- pickle.py
renamed modules:
- excel.py --> xw_excel.py
deleted modules:
- array.py
…project#578) updated FileHandler and its subclasses:
- renamed FileHandler.list as FileHandler.lists which returns 3 lists (axes, groups and arrays)
- updated FileHandler.read_items()
- updated FileHandler.dump_items()
- split _dump() into _dump_array(), _dump_axes() and _dump_groups()
- split _read_item() into _read_array(), _read_axes(), _read_groups()
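The split described in this commit message can be pictured with the following sketch. It is not the project's code: `SketchHandler`, the namedtuple stand-ins for Axis and Group, the in-memory store and all method signatures are assumptions made for illustration. Only the method names `lists`, `dump_items`, `read_items`, `_dump_array`, `_dump_axes` and `_dump_groups` come from the message above.

```python
from collections import namedtuple

# Stand-ins for the real classes (assumption: the real objects are richer).
Axis = namedtuple("Axis", ["name", "labels"])
Group = namedtuple("Group", ["name", "key", "axis"])


class SketchHandler:
    """In-memory handler keeping axes, groups and arrays in separate places."""

    def __init__(self):
        self.store = {"axes": {}, "groups": {}, "arrays": {}}

    def lists(self):
        """Return three lists of names: axes, groups and arrays."""
        return (list(self.store["axes"]),
                list(self.store["groups"]),
                list(self.store["arrays"]))

    def _dump_axes(self, axes):
        self.store["axes"].update(axes)

    def _dump_groups(self, groups):
        self.store["groups"].update(groups)

    def _dump_array(self, key, value):
        self.store["arrays"][key] = value

    def dump_items(self, items):
        """Dispatch each item to the dump method matching its type."""
        self._dump_axes({k: v for k, v in items.items() if isinstance(v, Axis)})
        self._dump_groups({k: v for k, v in items.items() if isinstance(v, Group)})
        for key, value in items.items():
            if not isinstance(value, (Axis, Group)):
                self._dump_array(key, value)

    def read_items(self):
        """Return everything that was stored as a single dict."""
        items = {}
        for kind in ("axes", "groups", "arrays"):
            items.update(self.store[kind])
        return items


handler = SketchHandler()
age = Axis("age", [0, 1, 2])
handler.dump_items({"age": age,
                    "young": Group("young", [0, 1], "age"),
                    "pop": [10, 20, 30]})
print(handler.lists())
```

Keeping the three kinds apart at dump time is what lets a file format give axes and groups their own sheet, file or HDF group while arrays keep one entry each.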
…objects of a session in/from HDF, CSV and EXCEL files
somewhat related to #153