Skip to content

BUG: Fix metadata not roundtripping when pickling datetime #29555

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Aug 14, 2025

Conversation

MaanasArora
Copy link
Contributor

@MaanasArora MaanasArora commented Aug 13, 2025

Fixes #29344.

It seems that for datetime dtypes, arraydescr_reduce was initializing a new mapping instead of setting to Py_None, and then arraydescr_setstate just read it in. I set the metadata to and then checked for Py_None and refactored some of the logic to be clearer hopefully.

Adding a test. Thanks for reviewing!

@MaanasArora
Copy link
Contributor Author

Added test! Confirmed that it fails on main.

for proto in range(2, pickle.HIGHEST_PROTOCOL + 1):
res = pickle.loads(pickle.dumps(dt, protocol=proto))
assert_equal(res, dt)
assert res.metadata is None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this test already has 7 other assertions above, any objection to just making it its own regression test:

--- a/numpy/_core/tests/test_datetime.py
+++ b/numpy/_core/tests/test_datetime.py
@@ -889,8 +889,8 @@ def test_pickle(self):
                 b"I1\nI1\nI1\ntp7\ntp8\ntp9\nb."
             assert_equal(pickle.loads(pkl), np.dtype('>M8[us]'))
 
+    def test_gh_29555(self):
         # check that dtype metadata round-trips when none
-        # gh-29555
         dt = np.dtype('>M8[us]')

Beyond that, the regression test does indeed seem to fail before/pass after the patch in my hands locally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems nice, applied, thanks!

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The None reference is a bug but we can also just apply a suggestion even. Tylers test suggestion seems like a nice to have.

PyTuple_SET_ITEM(ret, 0, PyDict_New());
}
else {
PyTuple_SET_ITEM(ret, 0, Py_None);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This steals a reference to None! Otherwise LGTM, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, sorry - fixed. Thanks!

@@ -2877,6 +2878,7 @@ arraydescr_setstate(_PyArray_LegacyDescr *self, PyObject *args)
char endian;
PyObject *endian_obj;
PyObject *subarray, *fields, *names = NULL, *metadata=NULL;
PyObject *old_metadata, *new_metadata;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would probably move these down, but doesn't matter (and yeah, everything else is here so).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they seem more local state - I moved them down to be clearer.

@seberg seberg added the 09 - Backport-Candidate PRs tagged should be backported label Aug 14, 2025
@seberg
Copy link
Member

seberg commented Aug 14, 2025

Thanks @MaanasArora, let's put this in.

This has been around forever, but I can't really see that not including an empty metadatadict in such a niche use-case can break anything, so I think we could backport.

@seberg seberg merged commit 87e208c into numpy:main Aug 14, 2025
77 checks passed
charris pushed a commit to charris/numpy that referenced this pull request Aug 14, 2025
* BUG: fix metadata not roundtripping for datetime pickles

* STYLE: Remove unnecessary newline

* REF: Simplify metadata handling in arraydescr_setstate function

* TST: add test to ensure datetime dtype metadata round-trips on pickling

* REF: clearer operation order

* TST: add new regression test (numpygh-29555)

* BUG: don't steal reference to Py_None

* REF: move metadata variable declarations below for clarity
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Aug 14, 2025
charris added a commit that referenced this pull request Aug 14, 2025
BUG: Fix metadata not roundtripping when pickling datetime (#29555)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Pickling/unpickling datetime64 adds dtype metadata
4 participants