
[3.13] copyreg._reconstructor crashes when trying to pickle.load specific serialized data #135201


Open
moorchegue opened this issue Jun 6, 2025 · 9 comments
Labels
3.13 bugs and security fixes pending The issue will be closed if no feedback is provided stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@moorchegue

moorchegue commented Jun 6, 2025

Bug description:

In [25]: pickle.load(open('taskwarrior/f91b9e98-7586-4317-ae52-b516e97209e5', 'rb'))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[25], line 1
----> 1 pickle.load(open('taskwarrior/f91b9e98-7586-4317-ae52-b516e97209e5', 'rb'))

File /usr/lib/python3.13/copyreg.py:44, in _reconstructor(cls, base, state)
     41 def _reconstructor(cls, base, state):
     42     if base is object:
     43         #obj = cls.__new__(cls)
---> 44         obj = object.__new__(cls)
     45     else:
     46         obj = base.__new__(cls, state)

TypeError: object.__new__(tzutc) is not safe, use tzutc.__new__()

Isn't cls.__new__(cls) the correct way to create the instance for any type of object?

This issue was not present in 3.12, or at least the same code used to work on the same pickles.

I was not able to reproduce this with arbitrary data serialization/deserialization (e.g. pickle.dump(tzutc(), f); pickle.load(f) works), which suggests this is dependent on how exactly data is structured, and so I'm including the culprit pickle here: 74b9da0a-ab54-437f-9ab5-549e886d3c83.txt

CPython versions tested on:

3.13

Operating systems tested on:

Linux

@moorchegue moorchegue added the type-bug An unexpected behavior, bug, or error label Jun 6, 2025
@ZeroIntensity
Member

Hi, I suspect this is related to a library you're using. The repro fails with this:

ModuleNotFoundError: No module named 'taskw_ng'

I doubt copyreg._reconstructor is the culprit, because that hasn't been touched in over two decades.

@ZeroIntensity ZeroIntensity added stdlib Python modules in the Lib dir pending The issue will be closed if no feedback is provided labels Jun 6, 2025
@moorchegue
Author

moorchegue commented Jun 6, 2025

Yeah, it is: taskw-ng==0.2.7. The context is running https://github.com/bergercookie/syncall (data sync between Asana and taskwarrior) after upgrading from Python 3.12 to 3.13.

I fixed it at the level of copyreg after finding information that this way of using the __new__ method was deprecated a long time ago. That is why I'm suggesting changing this upstream regardless. But perhaps the serialization is the issue (as well).

With the above fix the result of a pickle.load() is this:

{'id': 0,
 'description': 'Set up a demo environment',
 'end': datetime.datetime(2024, 5, 28, 6, 15, 10, tzinfo=tzutc()),
 'entry': datetime.datetime(2024, 5, 28, 6, 13, 7, tzinfo=tzutc()),
 'modified': datetime.datetime(2024, 5, 28, 6, 15, 11, tzinfo=tzutc()),
 'project': 'integral/vxbos',
 'status': 'completed',
 'uuid': UUID('74b9da0a-ab54-437f-9ab5-549e886d3c83'),
 'urgency': 1.25205}

Strangely enough, it doesn't contain anything outside of the standard library.

I'm looking at a diff between the culprit file and a dump of the above data, and what immediately stands out is that _reconstructor is mentioned right at the top of the file:

diff -u 74b9da0a-ab54-437f-9ab5-549e886d3c83.txt tmp.pickle | head
--- 74b9da0a-ab54-437f-9ab5-549e886d3c83.txt    2025-06-06 18:06:37.516240292 +0800
+++ tmp.pickle  2025-06-06 18:44:29.165722062 +0800
@@ -1,445 +1,85 @@
-ccopy_reg
-_reconstructor
-p0
-(ctaskw_ng.task
-Task
+(dp0
+Vdescription

Update: scratch that, _reconstructor is mentioned later in the second file as well, so that's not the issue; the exact chain of calls probably is.

I don't fully understand what's going on, but it seems that some intermediate function calls are included in the pickle, and that's why loading fails with this file and doesn't with the resulting data structure.
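For context on why a callable such as _reconstructor appears inside the file: a pickle records a callable plus the arguments needed to rebuild each object, and which callable gets recorded depends on the pickle protocol. The sketch below inspects the reduce value of a hypothetical class named Plain (an assumption for illustration, not a class from taskw_ng) without writing any pickle data.

```python
import copyreg


class Plain:
    """Hypothetical class, used only to inspect its reduce value."""

    def __init__(self):
        self.x = 1


obj = Plain()

# Protocols 0 and 1 route through copyreg._reconstructor(cls, base, state).
old_style = obj.__reduce_ex__(0)

# Protocol 2 and later use copyreg.__newobj__(cls), which calls cls.__new__(cls).
new_style = obj.__reduce_ex__(2)

print(old_style[0] is copyreg._reconstructor)
print(old_style[1])
print(new_style[0] is copyreg.__newobj__)
print(new_style[1])
```

The callable and its arguments are stored in the pickle and invoked again at load time, which is how copyreg ends up in a traceback even though no user code calls it directly.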

@ZeroIntensity
Member

I fixed it at the level of copyreg, finding information that this way of using the __new__ method was deprecated long time ago.

It's a Python 2 compatibility shim, as far as I can tell. But I can't see why that would suddenly break things in 3.13.

@moorchegue
Author

@ZeroIntensity, do you think pickle should load the attached file without error?

@ZeroIntensity
Member

No idea, it's impossible to tell without seeing the code. It's most likely an issue with taskw_ng. I'd report it to them first, and then if they determine it's an issue on our end, they'll file a new report with a pure-stdlib repro.

@moorchegue
Author

The problem with that is that taskw_ng doesn't cause this error. Furthermore, it's not even in the stack trace:

Exception was raised during program execution.

  File "~/.virtualenvs/tw/bin/tw_asana_sync", line 8, in <module>
    sys.exit(main())
             ~~~~^^
  File "~/.virtualenvs/tw/lib/python3.13/site-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
           ~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "~/.virtualenvs/tw/lib/python3.13/site-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
  File "~/.virtualenvs/tw/lib/python3.13/site-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.virtualenvs/tw/lib/python3.13/site-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
  File "~/.virtualenvs/tw/lib/python3.13/site-packages/syncall/scripts/tw_asana_sync.py", line 233, in main
    aggregator.sync()
    ~~~~~~~~~~~~~~~^^
  File "~/.virtualenvs/tw/lib/python3.13/site-packages/syncall/aggregator.py", line 178, in sync
    changes_A = self.detect_changes(self._helper_A, items_A)
  File "~/.virtualenvs/tw/lib/python3.13/site-packages/syncall/aggregator.py", line 159, in detect_changes
    cached_item = pickle_load(serdes_dir / item_id)
  File "~/.virtualenvs/tw/lib/python3.13/site-packages/bubop/serial.py", line 20, in pickle_load
    return pickle.load(f)
           ~~~~~~~~~~~^^^
  File "/usr/lib/python3.13/copyreg.py", line 43, in _reconstructor
    obj = object.__new__(cls)


object.__new__(tzutc) is not safe, use tzutc.__new__()

I'm not sure when, or with which version of that library, this pickle file was created, but I am sure that it wasn't causing an error until the upgrade. I could report it to syncall or bubop, but looking at their code I don't see any misuse either. What could such misuse even look like?

At the same time, calling object.__new__(tzutc) does seem to cause an issue in both 3.13 and 2.7. Admittedly, I don't understand why that should throw an error while tzutc.__new__(tzutc) shouldn't.
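The difference between the two calls can be reproduced without dateutil. This is a minimal sketch, assuming a CPython build that uses the C implementation of datetime; PlainPython and FakeUtc are hypothetical stand-ins, with FakeUtc playing the role of a tzinfo subclass such as tzutc.

```python
import datetime


class PlainPython:
    """Hypothetical class whose only base is object."""


class FakeUtc(datetime.tzinfo):
    """Hypothetical stand-in for a tzinfo subclass such as tzutc."""


def try_object_new(cls):
    """Return the instance made by object.__new__(cls), or the TypeError raised."""
    try:
        return object.__new__(cls)
    except TypeError as exc:
        return exc


# Works: PlainPython inherits __new__ from object.
plain_result = try_object_new(PlainPython)

# Refused: FakeUtc inherits __new__ from the C type datetime.tzinfo,
# so object.__new__ would bypass that type's own allocation logic.
tz_result = try_object_new(FakeUtc)

# Going through the class's own __new__ is accepted.
tz_instance = FakeUtc.__new__(FakeUtc)

print(type(plain_result).__name__)
print(repr(tz_result))
print(type(tz_instance).__name__)
```

CPython refuses object.__new__(cls) when cls inherits a different __new__ from a built-in or extension base, because skipping that base's constructor could leave its internal state uninitialized.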

@ZeroIntensity
Member

Can you send the code that is creating the file?

@serhiy-storchaka
Member

What is tzutc?

It looks like in the version that created the pickle data, tzutc was implemented in Python, so its __new__ was inherited from object, and object.__new__(tzutc) created a "raw" instance of tzutc with an empty __dict__. In the version that loads the pickle data, tzutc is an extension type, so object.__new__(tzutc) no longer works, because it doesn't know how to initialize the internal state of tzutc.

Check which versions of the library that provides tzutc were installed at pickle time and at unpickle time.

This is an interesting problem. It cannot be solved on the user side (implementing __reduce__ or __reduce_ex__ for tzutc can help with future pickles, but not with unpickling old pickle data). Fixing it on the Python side requires additional investigation if we don't want to break other user code.
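As an illustration of the "future pickles" part, a class can define __reduce__ so that its pickles never go through copyreg._reconstructor, whatever the protocol. The sketch below uses a hypothetical UtcLike class; it is not dateutil's actual code, and it does nothing for data that was already pickled.

```python
import datetime


class UtcLike(datetime.tzinfo):
    """Hypothetical tzinfo subclass; not dateutil's actual implementation."""

    def utcoffset(self, dt):
        return datetime.timedelta(0)

    def __reduce__(self):
        # Rebuild by calling the class with no arguments, so unpickling
        # goes through UtcLike() rather than object.__new__(UtcLike).
        return (UtcLike, ())


# Even for protocol 0, the reduce value now names the class itself.
reduced = UtcLike().__reduce_ex__(0)
rebuild, args = reduced
rebuilt = rebuild(*args)

print(rebuild is UtcLike)
print(rebuilt.utcoffset(None))
```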

This all happens only with historical pickle protocols 0 and 1. The best chance to recover your data is to use an older version of the library to unpickle it and then pickle it using a newer pickle protocol.
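The recovery step described above could be scripted roughly as follows. This is a sketch: repickle is a hypothetical helper name, and it assumes it runs in an environment where the old file still loads (for example, with the older library version installed). Only load pickle files you trust.

```python
import os
import pickle
import tempfile


def repickle(src, dst, protocol=pickle.HIGHEST_PROTOCOL):
    """Load the pickle at src and write it to dst using a newer protocol."""
    with open(src, "rb") as f:
        obj = pickle.load(f)
    with open(dst, "wb") as f:
        pickle.dump(obj, f, protocol=protocol)
    return obj


# Round-trip demo with stand-in data written using protocol 0.
workdir = tempfile.mkdtemp()
old_path = os.path.join(workdir, "old.pickle")
new_path = os.path.join(workdir, "new.pickle")

with open(old_path, "wb") as f:
    pickle.dump({"id": 0, "status": "completed"}, f, protocol=0)

restored = repickle(old_path, new_path)

with open(new_path, "rb") as f:
    new_bytes = f.read()

print(restored)
print(new_bytes[:1] == b"\x80")  # protocol 2+ pickles start with the PROTO opcode
```

Writing to a separate destination path keeps the original file intact in case the conversion needs to be repeated.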

@picnixz picnixz added the 3.13 bugs and security fixes label Jun 7, 2025
@moorchegue
Author

moorchegue commented Jun 7, 2025

Can you send the code that is creating the file?

I think it's the code that calls pickle_dump here: https://github.com/bergercookie/syncall/blob/master/syncall/aggregator.py

What is tzutc?

It's dateutil.tz.tzutc, which inherits from datetime.tzinfo.

Thanks for breaking it down, @serhiy-storchaka, this roughly matches my timeline with these files and Python upgrades. Let me know if I can do more to investigate it!


4 participants