gh-99631: Add custom loads and dumps support for the shelve module #118065

Merged on Jul 12, 2025 (75 commits)
Changes from 19 commits
e269c09
Use custom loads & dumps instead of custom pickler & unpickler for Shelf
furkanonder Apr 18, 2024
3465149
Allow custom loads & dumps instead of custom pickler & unpickler for …
furkanonder Apr 18, 2024
44b9fa1
Update documentation for serializer and deserializred functions
furkanonder Apr 18, 2024
f2eed32
Update Doc/library/shelve.rst
furkanonder Apr 20, 2024
b3e5723
Update Doc/library/shelve.rst
furkanonder Apr 20, 2024
53d5557
Update documentation for serializer and deserializer functions
furkanonder Apr 21, 2024
d496eab
Merge branch 'main' into issue-99631-2
furkanonder Apr 21, 2024
1e295ba
Fix lines according to PEP-8
furkanonder Apr 21, 2024
2011baa
Merge branch 'issue-99631-2' of github.com:furkanonder/cpython into i…
furkanonder Apr 21, 2024
3cbabe9
Fix doc according to line 80
furkanonder Apr 21, 2024
67b7340
Merge branch 'main' into issue-99631-2
furkanonder Apr 22, 2024
c6b43e2
Fix inline emphasis issue in docs
furkanonder Apr 23, 2024
52a90f7
Merge branch 'issue-99631-2' of github.com:furkanonder/cpython into i…
furkanonder Apr 23, 2024
4f79cf6
Update the definition of the open function.
furkanonder Jul 13, 2024
798fdb2
Pass the serializer and serializer arguments of Shelf.__init__ of Bsd…
furkanonder Jul 14, 2024
bb1150d
Add unittests for BsdDbShelf
furkanonder Jul 14, 2024
98d841b
Merge branch 'issue-99631-2' of github.com:furkanonder/cpython into i…
furkanonder Jul 14, 2024
1159bb6
Update BsdDbShelf's set_location, last and first functions
furkanonder Jul 14, 2024
4b4f1b6
Update BsdDbShelf's next and previous functions
furkanonder Jul 14, 2024
bc399fa
Merge branch 'main' into issue-99631-2
furkanonder Jul 15, 2024
41448d3
Refer to shelve.open function for the deserializer and serializer arg…
furkanonder Jul 15, 2024
fbbe5ea
Refer to shelve.open function for the deserializer and serializer arg…
furkanonder Jul 15, 2024
fdd3e8e
Merge branch 'main' into issue-99631-2
furkanonder Jul 15, 2024
6823ef2
📜🤖 Added by blurb_it.
blurb-it[bot] Jul 16, 2024
2affece
Update the versionchanged statements
furkanonder Jul 16, 2024
da8bc91
Merge branch 'issue-99631-2' of github.com:furkanonder/cpython into i…
furkanonder Jul 16, 2024
6bfebee
change type of num2
furkanonder Jul 16, 2024
82d58a7
Add test_custom_incomplete_serializer_and_deserializer case
furkanonder Jul 17, 2024
7dca8b4
Merge branch 'main' into issue-99631-2
furkanonder Jul 17, 2024
5f97676
Specify that the Shelf, DbfilenameShelf and BsdDbShelf class's takes …
furkanonder Jul 24, 2024
048daee
And and update the versionchanged's text
furkanonder Jul 24, 2024
00837d0
Update the news entry
furkanonder Jul 24, 2024
1292963
Update the versionchanged's text
furkanonder Jul 24, 2024
3431920
Merge branch 'issue-99631-2' of github.com:furkanonder/cpython into i…
furkanonder Jul 24, 2024
97a6d7c
Add new testcases to other bytes objects
furkanonder Jul 24, 2024
3becbc8
Add new testcases to test custom serializer protocl
furkanonder Jul 24, 2024
3a5d6ed
Add new testcases to other bytes objects
furkanonder Jul 24, 2024
fb74832
Delete comma from document
furkanonder Jul 24, 2024
d670c95
Update the description of open function
furkanonder Jul 25, 2024
f2e22eb
sort the imports
furkanonder Jul 27, 2024
e00a52f
add white space
furkanonder Jul 27, 2024
3af3f97
Don't use f-string in type(obj).__name__
furkanonder Jul 27, 2024
9d232e5
Don't use f-string in type(obj).__name__
furkanonder Jul 27, 2024
26fc959
Don't use f-string in type(obj).__name__
furkanonder Jul 27, 2024
ab005aa
Set shelve class argument only serializer and deserializer
furkanonder Jul 27, 2024
6052309
Update shelveError message
furkanonder Jul 27, 2024
87b66d5
pass serializer and deserializer as keyword argument to DbfilenameShelf
furkanonder Jul 27, 2024
0c2f255
Remove unused import
furkanonder Jul 27, 2024
3db0c8e
Update shelve testcases
furkanonder Jul 27, 2024
5c39d94
Remove memoryview testcases
furkanonder Jul 28, 2024
1ca1801
Add ShelveError to shelve's __all__
furkanonder Jul 28, 2024
5a42de1
Add ShelveError to shelve documentation
furkanonder Jul 28, 2024
b0a5ee3
Add blank lines after versionadded and versionchanged
furkanonder Jul 29, 2024
4202ede
Remove white space in test_shelve
furkanonder Jul 29, 2024
2827eb4
Add test_custom_incomplete_serializer_and_deserializer_bsd_db_shelf
furkanonder Jul 29, 2024
9918531
Merge branch 'issue-99631-2' of github.com:furkanonder/cpython into i…
furkanonder Jul 29, 2024
54188bd
Update the serializer and deserializer functions
furkanonder Jul 29, 2024
786a248
Move os.mkdir and addCleanup functions beginning of the testcases
furkanonder Jul 29, 2024
20c2450
Use self.assertIsNone when checking None types
furkanonder Jul 29, 2024
b3770ae
change the test order
furkanonder Jul 29, 2024
588623a
Merge branch 'main' into issue-99631-2
furkanonder Apr 19, 2025
4d9599b
Update shelve module version references from 3.14 to 3.15
furkanonder May 30, 2025
9b204b7
Merge branch 'main' into issue-99631-2
furkanonder May 30, 2025
8b06918
Change shelve module version references from 3.15 to next
furkanonder May 30, 2025
b0f0bbc
Change shelve module version references from 3.15 to next
furkanonder May 30, 2025
d1bb227
refactor nested context managers for better readability
furkanonder May 30, 2025
791743b
simplify assertRaises calls in test_missing_custom_deserializer & tes…
furkanonder May 30, 2025
6b4be8b
refactor nested context managers for better readability
furkanonder May 30, 2025
34a32b9
Add type_name_len helper and use shorter variable names to reduce lin…
furkanonder May 30, 2025
bf6f3aa
Merge branch 'main' into issue-99631-2
furkanonder May 30, 2025
4b000cd
Improve the description of the open function
furkanonder Jun 2, 2025
2dcda2a
Update the description of ShelveError
furkanonder Jun 2, 2025
00bfb01
Simplify conditional branches in serializer and deserializer functions
furkanonder Jun 2, 2025
23ea842
Merge branch 'main' into issue-99631-2
furkanonder Jun 2, 2025
4df9b58
Merge branch 'main' into issue-99631-2
furkanonder Jun 11, 2025
41 changes: 32 additions & 9 deletions Doc/library/shelve.rst
@@ -17,7 +17,8 @@ This includes most class instances, recursive data types, and objects containing
 lots of shared sub-objects. The keys are ordinary strings.


-.. function:: open(filename, flag='c', protocol=None, writeback=False)
+.. function:: open(filename, flag='c', protocol=None, writeback=False, *, \
+                   serializer=None, deserializer=None)

    Open a persistent dictionary. The filename specified is the base filename for
    the underlying database. As a side-effect, an extension may be added to the
@@ -41,13 +42,24 @@ lots of shared sub-objects. The keys are ordinary strings.
    determine which accessed entries are mutable, nor which ones were actually
    mutated).

+   By default, :mod:`shelve` uses :func:`pickle.dumps` and :func:`pickle.loads`
+   for serializing and deserializing. This can be changed by supplying
+   *serializer* and *deserializer*, respectively. The *serializer* argument
+   should be a function that takes an object and the *protocol* and returns
+   its representation as a :term:`bytes-like object`; *deserializer* should be
+   a function that takes :class:`bytes` and returns the corresponding object.
+   If one of these is given, the other must be given as well.
+
    .. versionchanged:: 3.10
       :const:`pickle.DEFAULT_PROTOCOL` is now used as the default pickle
       protocol.

    .. versionchanged:: 3.11
       Accepts :term:`path-like object` for filename.

+   .. versionchanged:: 3.13
+      Accepts *serializer* and *deserializer* as parameters.
+
    .. note::

       Do not rely on the shelf being closed automatically; always call
@@ -117,7 +129,8 @@ Restrictions
   which can cause hard crashes when trying to read from the database.


-.. class:: Shelf(dict, protocol=None, writeback=False, keyencoding='utf-8')
+.. class:: Shelf(dict, protocol=None, writeback=False, keyencoding='utf-8', \
+                 *, serializer=None, deserializer=None)

    A subclass of :class:`collections.abc.MutableMapping` which stores pickled
    values in the *dict* object.
@@ -135,6 +148,11 @@ Restrictions
    The *keyencoding* parameter is the encoding used to encode keys before they
    are used with the underlying dict.

+   The *serializer* parameter is a function that takes the object and the
+   *protocol* parameter and returns :class:`bytes`. The *deserializer*
+   parameter is a function that takes a :term:`bytes-like object` and
+   returns the object.
+
    A :class:`Shelf` object can also be used as a context manager, in which
    case it will be automatically closed when the :keyword:`with` block ends.

@@ -149,8 +167,11 @@ Restrictions
       :const:`pickle.DEFAULT_PROTOCOL` is now used as the default pickle
       protocol.

+   .. versionchanged:: 3.13
+      Accepts *serializer* and *deserializer* as parameters.

-.. class:: BsdDbShelf(dict, protocol=None, writeback=False, keyencoding='utf-8')
+.. class:: BsdDbShelf(dict, protocol=None, writeback=False, \
+                      keyencoding='utf-8', *, serializer=None, deserializer=None)

    A subclass of :class:`Shelf` which exposes :meth:`!first`, :meth:`!next`,
    :meth:`!previous`, :meth:`!last` and :meth:`!set_location` methods.
@@ -160,18 +181,20 @@ Restrictions
    modules. The *dict* object passed to the constructor must support those
    methods. This is generally accomplished by calling one of
    :func:`!bsddb.hashopen`, :func:`!bsddb.btopen` or :func:`!bsddb.rnopen`. The
Contributor Author:
By the way, we also need to update this sentence (from bsddb to berkeleydb). bsddb is deprecated according to https://www.jcea.es/programacion/pybsddb.htm.

Member:
It looks like the change would be bigger: it’s a new module (although the docs are not very clear), possibly with a new API.
Updating or deprecating this should be discussed in its own ticket 🙂

Contributor Author:
Yes, I agree. It would be better to open a new ticket to discuss this.

Contributor:
... or consider opening a topic on Discourse instead; this may need a larger audience than what you'd get on the bug tracker.

-   optional *protocol*, *writeback*, and *keyencoding* parameters have the same
-   interpretation as for the :class:`Shelf` class.
+   optional *protocol*, *writeback*, *keyencoding*, *serializer* and *deserializer*
+   parameters have the same interpretation as for the :class:`Shelf` class.


-.. class:: DbfilenameShelf(filename, flag='c', protocol=None, writeback=False)
+.. class:: DbfilenameShelf(filename, flag='c', protocol=None, writeback=False, \
+                           serializer=None, deserializer=None)

    A subclass of :class:`Shelf` which accepts a *filename* instead of a dict-like
    object. The underlying file will be opened using :func:`dbm.open`. By
    default, the file will be created and opened for both read and write. The
-   optional *flag* parameter has the same interpretation as for the :func:`.open`
-   function. The optional *protocol* and *writeback* parameters have the same
-   interpretation as for the :class:`Shelf` class.
+   optional *flag* parameter has the same interpretation as for the
+   :func:`.open` function. The optional *protocol*, *writeback*, *serializer*
+   and *deserializer* parameters have the same interpretation as for the
Member:
Suggested change
-   and *deserializer* parameters have the same interpretation as for the
+   and *deserializer* parameters have the same interpretation as in

+   :class:`Shelf` class.


.. _shelve-example:
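
The serializer/deserializer contract described in the documentation hunks above can be sketched with a JSON-based pair. This is a minimal illustration, not code from the PR: the helper names `json_serializer` and `json_deserializer` are hypothetical, and the final block only runs on an interpreter whose `shelve.Shelf` already accepts the new keyword arguments.

```python
import inspect
import json
import shelve


def json_serializer(obj, protocol=None):
    # Hypothetical helper: the shelf passes its pickle protocol as the second
    # argument; JSON has no use for it, so it is ignored here.
    return json.dumps(obj).encode("utf-8")


def json_deserializer(data):
    # Hypothetical helper: accept any bytes-like object and decode it as JSON.
    return json.loads(bytes(data).decode("utf-8"))


# The pair should round-trip on its own before it is handed to shelve.
record = {"name": "spam", "tags": ["a", "b"]}
assert json_deserializer(json_serializer(record, 5)) == record

# Only exercise shelve itself when this interpreter has the new parameters.
if "serializer" in inspect.signature(shelve.Shelf.__init__).parameters:
    with shelve.Shelf({}, serializer=json_serializer,
                      deserializer=json_deserializer) as shelf:
        shelf["record"] = record
        assert shelf["record"] == record
```

Accepting `protocol` with a default keeps the serializer usable both on its own and when the shelf supplies the protocol as a second positional argument.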
62 changes: 37 additions & 25 deletions Lib/shelve.py
@@ -56,13 +56,18 @@
 the persistent dictionary on disk, if feasible).
 """

-from pickle import DEFAULT_PROTOCOL, Pickler, Unpickler
+from pickle import DEFAULT_PROTOCOL, Unpickler, dumps, loads
 from io import BytesIO

 import collections.abc

 __all__ = ["Shelf", "BsdDbShelf", "DbfilenameShelf", "open"]

+
+class ShelveError(Exception):
+    pass
+
+
 class _ClosedDict(collections.abc.MutableMapping):
     'Marker for a closed dict. Access attempts raise a ValueError.'

@@ -82,7 +87,7 @@ class Shelf(collections.abc.MutableMapping):
     """

     def __init__(self, dict, protocol=None, writeback=False,
-                 keyencoding="utf-8"):
+                 keyencoding="utf-8", *, serializer=None, deserializer=None):
         self.dict = dict
         if protocol is None:
             protocol = DEFAULT_PROTOCOL
@@ -91,6 +96,16 @@ def __init__(self, dict, protocol=None, writeback=False,
         self.cache = {}
         self.keyencoding = keyencoding

+        if serializer is None and deserializer is None:
+            self.serializer = dumps
+            self.deserializer = loads
+        elif (serializer is None) ^ (deserializer is None):
+            raise ShelveError("Serializer and deserializer must be "
+                              "defined together.")
+        else:
+            self.serializer = serializer
+            self.deserializer = deserializer
+
     def __iter__(self):
         for k in self.dict.keys():
             yield k.decode(self.keyencoding)
@@ -110,19 +125,17 @@ def __getitem__(self, key):
         try:
             value = self.cache[key]
         except KeyError:
-            f = BytesIO(self.dict[key.encode(self.keyencoding)])
-            value = Unpickler(f).load()
+            f = self.dict[key.encode(self.keyencoding)]
+            value = self.deserializer(f)
             if self.writeback:
                 self.cache[key] = value
         return value

     def __setitem__(self, key, value):
         if self.writeback:
             self.cache[key] = value
-        f = BytesIO()
-        p = Pickler(f, self._protocol)
-        p.dump(value)
-        self.dict[key.encode(self.keyencoding)] = f.getvalue()
+        serialized_value = self.serializer(value, self._protocol)
+        self.dict[key.encode(self.keyencoding)] = serialized_value

     def __delitem__(self, key):
         del self.dict[key.encode(self.keyencoding)]
@@ -186,33 +199,29 @@ class BsdDbShelf(Shelf):
     """

     def __init__(self, dict, protocol=None, writeback=False,
-                 keyencoding="utf-8"):
-        Shelf.__init__(self, dict, protocol, writeback, keyencoding)
+                 keyencoding="utf-8", *, serializer=None, deserializer=None):
+        Shelf.__init__(self, dict, protocol, writeback, keyencoding,
+                       serializer=serializer, deserializer=deserializer)

     def set_location(self, key):
         (key, value) = self.dict.set_location(key)
-        f = BytesIO(value)
-        return (key.decode(self.keyencoding), Unpickler(f).load())
+        return (key.decode(self.keyencoding), self.deserializer(value))

     def next(self):
         (key, value) = next(self.dict)
-        f = BytesIO(value)
-        return (key.decode(self.keyencoding), Unpickler(f).load())
+        return (key.decode(self.keyencoding), self.deserializer(value))

     def previous(self):
         (key, value) = self.dict.previous()
-        f = BytesIO(value)
-        return (key.decode(self.keyencoding), Unpickler(f).load())
+        return (key.decode(self.keyencoding), self.deserializer(value))

     def first(self):
         (key, value) = self.dict.first()
-        f = BytesIO(value)
-        return (key.decode(self.keyencoding), Unpickler(f).load())
+        return (key.decode(self.keyencoding), self.deserializer(value))

     def last(self):
         (key, value) = self.dict.last()
-        f = BytesIO(value)
-        return (key.decode(self.keyencoding), Unpickler(f).load())
+        return (key.decode(self.keyencoding), self.deserializer(value))


 class DbfilenameShelf(Shelf):
@@ -222,9 +231,11 @@ class DbfilenameShelf(Shelf):
     See the module's __doc__ string for an overview of the interface.
     """

-    def __init__(self, filename, flag='c', protocol=None, writeback=False):
+    def __init__(self, filename, flag='c', protocol=None, writeback=False,
+                 serializer=None, deserializer=None):
         import dbm
-        Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
+        Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback,
+                       serializer=serializer, deserializer=deserializer)

     def clear(self):
         """Remove all items from the shelf."""
@@ -233,8 +244,8 @@ def clear(self):
         self.cache.clear()
         self.dict.clear()

-
-def open(filename, flag='c', protocol=None, writeback=False):
+def open(filename, flag='c', protocol=None, writeback=False, *,
+         serializer=None, deserializer=None):
     """Open a persistent dictionary for reading and writing.

     The filename parameter is the base filename for the underlying
@@ -247,4 +258,5 @@ def open(filename, flag='c', protocol=None, writeback=False):
     See the module's __doc__ string for an overview of the interface.
     """

-    return DbfilenameShelf(filename, flag, protocol, writeback)
+    return DbfilenameShelf(filename, flag, protocol, writeback,
+                           serializer, deserializer)
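
The both-or-neither check in `Shelf.__init__` can be tried in isolation. The sketch below is a standalone approximation, not the PR's code: `resolve_codecs` is a hypothetical helper, and the local `ShelveError` class stands in for the exception the diff adds to `shelve`.

```python
import pickle


class ShelveError(Exception):
    """Local stand-in for the exception class added by the diff."""


def resolve_codecs(serializer=None, deserializer=None):
    # Hypothetical helper mirroring the branch structure in Shelf.__init__.
    if serializer is None and deserializer is None:
        # Neither supplied: fall back to pickle.
        return pickle.dumps, pickle.loads
    if (serializer is None) ^ (deserializer is None):
        # Exactly one supplied: reject the half-configured pair.
        raise ShelveError("Serializer and deserializer must be "
                          "defined together.")
    return serializer, deserializer


# Default pair round-trips through pickle with an explicit protocol.
dumps, loads = resolve_codecs()
assert loads(dumps({"x": 1}, pickle.DEFAULT_PROTOCOL)) == {"x": 1}

# Supplying only one side is rejected.
try:
    resolve_codecs(serializer=lambda obj, protocol=None: b"")
except ShelveError as exc:
    print("rejected:", exc)
```

Adjacent string literals are concatenated without a separator, so the first literal in the message carries a trailing space.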
143 changes: 142 additions & 1 deletion Lib/test/test_shelve.py
@@ -3,8 +3,10 @@
 import shelve
 import pickle
 import os
+from io import BytesIO
+from pydoc import locate

-from test.support import os_helper
+from test.support import os_helper, import_helper
 from collections.abc import MutableMapping
 from test.test_dbm import dbm_iterator

@@ -165,6 +167,145 @@ def test_default_protocol(self):
         with shelve.Shelf({}) as s:
             self.assertEqual(s._protocol, pickle.DEFAULT_PROTOCOL)

+    def test_custom_serializer_and_deserializer(self):
+        def serializer(obj, protocol=None):
+            return bytes(f"{type(obj).__name__}", 'utf-8')
+
+        def deserializer(data):
+            value = BytesIO(data).read()
+            return locate(value.decode("utf-8"))
+
+        os.mkdir(self.dirname)
+        self.addCleanup(os_helper.rmtree, self.dirname)
+
+        with shelve.open(self.fn,
+                         serializer=serializer,
+                         deserializer=deserializer) as s:
+            num = 1
+            s['number'] = num
+            self.assertEqual(s['number'], type(num))
+
+        with self.assertRaises(AssertionError):
+            def serializer(obj, protocol=None):
+                return bytes(f"{type(obj).__name__}", 'utf-8')
+
+            def deserializer(data):
+                pass
+
+            with shelve.open(self.fn,
+                             serializer=serializer,
+                             deserializer=deserializer) as s:
+                s['number'] = 100
+                self.assertEqual(s['number'], 100)
+
+        with self.assertRaises(dbm.sqlite3.error):
+            def serializer(obj, protocol=None):
+                pass
+
+            def deserializer(data):
+                return BytesIO(data).read().decode("utf-8")
+
+            with shelve.open(self.fn,
+                             serializer=serializer,
+                             deserializer=deserializer) as s:
+                s['number'] = 100
+                self.assertEqual(s['number'], 100)
+
+    def test_custom_serializer_and_deserializer_bsd_db_shelf(self):
+        berkeleydb = import_helper.import_module('berkeleydb')
+
+        def serializer(obj, protocol=None):
+            return bytes(f"{type(obj).__name__}", 'utf-8')
+
+        def deserializer(data):
+            value = BytesIO(data).read()
+            return locate(value.decode("utf-8"))
+
+        os.mkdir(self.dirname)
+        self.addCleanup(os_helper.rmtree, self.dirname)
+
+        with shelve.BsdDbShelf(berkeleydb.btopen(self.fn),
+                               serializer=serializer,
+                               deserializer=deserializer) as s:
+            num = 1
+            s['number'] = num
+            num2 = 2
+            s['number2'] = num2
+            self.assertEqual(s['number'], type(num))
+
+            key, value = s.previous()
+            self.assertEqual("number2", key)
+            self.assertEqual(value, type(num))
+
+            key, value = s.set_location(b'number')
+            self.assertEqual("number", key)
+            self.assertEqual(value, type(num))
+
+            key, value = s.next()
+            self.assertEqual("number2", key)
+            self.assertEqual(value, type(num))
+
+            key, value = s.first()
+            self.assertEqual("number", key)
+            self.assertEqual(s['number'], value)
+
+            key, value = s.last()
+            self.assertEqual("number2", key)
+            self.assertEqual(s['number2'], value)
+
+        with self.assertRaises(AssertionError):
+            def serializer(obj, protocol=None):
+                return bytes(f"{type(obj).__name__}", 'utf-8')
+
+            def deserializer(data):
+                pass
+
+            with shelve.BsdDbShelf(berkeleydb.btopen(self.fn),
+                                   serializer=serializer,
+                                   deserializer=deserializer) as s:
+                s['number'] = 100
+                self.assertEqual(s['number'], 100)
+
+        def serializer(obj, protocol=None):
+            pass
+
+        def deserializer(data):
+            return BytesIO(data).read().decode("utf-8")
+
+        with shelve.BsdDbShelf(berkeleydb.btopen(self.fn),
+                               serializer=serializer,
+                               deserializer=deserializer) as s:
+            s['number'] = 100
+            self.assertNotEqual(s['number'], 100)
+            self.assertEqual(s['number'], "")
+
+    def test_missing_custom_deserializer(self):
+        def serializer(obj, protocol=None):
+            pass
+
+        with self.assertRaises(shelve.ShelveError):
+            shelve.Shelf({},
+                         protocol=2, writeback=False, serializer=serializer)
+
+        with self.assertRaises(shelve.ShelveError):
+            shelve.BsdDbShelf({},
+                              protocol=2,
+                              writeback=False, serializer=serializer)
+
+    def test_missing_custom_serializer(self):
+        def deserializer(data):
+            pass
+
+        with self.assertRaises(shelve.ShelveError):
+            shelve.Shelf({},
+                         protocol=2,
+                         writeback=False, deserializer=deserializer)
+
+        with self.assertRaises(shelve.ShelveError):
+            shelve.BsdDbShelf({},
+                              protocol=2,
+                              writeback=False, deserializer=deserializer)
+


 class TestShelveBase:
     type2test = shelve.Shelf
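
The tests above assert that a stored integer reads back as `type(num)` rather than the number itself. A minimal sketch of why, assuming the same type-name scheme as the test helpers (simplified here to skip the `BytesIO` wrapper): the serializer keeps only the type's name, and `pydoc.locate` resolves that name back to the type object.

```python
from pydoc import locate


def serializer(obj, protocol=None):
    # Store only the name of the value's type, e.g. b"int" for 1.
    return type(obj).__name__.encode("utf-8")


def deserializer(data):
    # Resolve the stored name back to the type object via pydoc.locate.
    return locate(bytes(data).decode("utf-8"))


stored = serializer(1)
assert stored == b"int"
assert deserializer(stored) is int
```

The round trip is deliberately lossy, which lets the tests confirm that the custom pair, and not pickle, produced the value read back from the shelf.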