Skip to content

BUG: warn when saving dtype with metadata #14994

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Dec 2, 2019
Merged

Conversation

mattip
Copy link
Member

@mattip mattip commented Nov 27, 2019

Address gh-14142 for the 1.18 release: warn when saving a dtype with metadata that cannot be loaded.

@eric-wieser
Copy link
Member

eric-wieser commented Nov 27, 2019

Would recommend a recursive strategy:

def _has_metadata(dt):
    if dt.metadata is not None:
        return True
    elif dt.names is not None:
        return any(_has_metadata(dt[k]) for k in dt.names)
    elif dt.subdtype is not None:
        return _has_metadata(dt.base)
    else:
        return False

@seberg
Copy link
Member

seberg commented Nov 27, 2019

My main question is whether we should give a warning, or rather just raise right away. The error/warning message could possibly include either use pickle directly or use `arr.view` to remove the metadata?

Erics recursive version looks good to me.

@eric-wieser
Copy link
Member

eric-wieser commented Nov 27, 2019

use arr.view to remove the metadata

Possibly with a comment saying "good luck building a suitable dtype from your nested structured type while preserving field order and padding"...

I think the only viable options available to users are:

  • Suppress the warning (perhaps we want a subclass of UserWarning to make this easy?) - if the resulting file is unreadable rather than just not preserving metadata, this isn't a viable choice either.
  • Eliminate the metadata at the source
  • Use pickle

@seberg
Copy link
Member

seberg commented Nov 27, 2019

Fair point, using view is only viable for unstructured dtypes... I was toying with the idea of adding dtype.drop_metadata() that is recursive, but I do not really want to rush that in at this time.

@mattip
Copy link
Member Author

mattip commented Nov 28, 2019

Suppress the warning (perhaps we want a subclass of UserWarning to make this easy?)

This seems like the best option for 1.18, adding a subclass MetadataWarning

Comment on lines +279 to +280
warnings.warn("metadata on a dtype may be saved or ignored, but will "
"raise if saved when read. Use another form of storage.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a tracking issue we can reference regarding making this work properly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened gh-15017

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last question - do we want to reference that issue in the warning message?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we fix the issue the test will fail, which should bring us back to the warning. I think that is sufficient for this corner case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant more for pointing users to more information. Fine as is, just curious if putting issue numbers in warnings is a path we should look at going down.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a good point, for now I think I will just copy the full warning text into the issue. Search engines will probably pick it up well.

@seberg
Copy link
Member

seberg commented Dec 2, 2019

Closing and reopen, to re-trigger failing tests.

@seberg seberg closed this Dec 2, 2019
@seberg seberg reopened this Dec 2, 2019
@charris
Copy link
Member

charris commented Dec 2, 2019

@seberg Ping.

@seberg
Copy link
Member

seberg commented Dec 2, 2019

Thanks Matti and Eric!

@seberg seberg merged commit 7b2d968 into numpy:master Dec 2, 2019
@mattip mattip deleted the issue-14142 branch November 2, 2020 08:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants