Skip to content

gh-110067: Make max heap methods public and add missing ones #130725

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
52 changes: 50 additions & 2 deletions Doc/library/heapq.rst
Original file line number Diff line number Diff line change
Expand Up @@ -16,8 +16,10 @@
This module provides an implementation of the heap queue algorithm, also known
as the priority queue algorithm.

Heaps are binary trees for which every parent node has a value less than or
equal to any of its children. We refer to this condition as the heap invariant.
Min-heaps (resp. max-heaps) are binary trees for which every parent node
has a value less than (resp. greater than) or equal to any of its children.
We refer to this condition as the heap invariant. Unless stated otherwise,
*heaps* refer to min-heaps.

This implementation uses arrays for which
``heap[k] <= heap[2*k+1]`` and ``heap[k] <= heap[2*k+2]`` for all *k*, counting
Expand Down Expand Up @@ -82,6 +84,52 @@ The following functions are provided:
on the heap.


For max-heaps, the following functions are provided:


.. function:: heapify_max(x)

Transform list *x* into a max-heap, in-place, in linear time.

.. versionadded:: next


.. function:: heappush_max(heap, item)

Push the value *item* onto the max-heap *heap*, maintaining the heap invariant.

.. versionadded:: next


.. function:: heappop_max(heap)

Pop and return the largest item from the max-heap *heap*, maintaining the heap
invariant. If the max-heap is empty, :exc:`IndexError` is raised. To access the
largest item without popping it, use ``heap[0]``.

.. versionadded:: next


.. function:: heappushpop_max(heap, item)

Push *item* on the max-heap *heap*, then pop and return the largest item from *heap*.
The combined action runs more efficiently than :func:`heappush_max`
followed by a separate call to :func:`heappop_max`.

.. versionadded:: next


.. function:: heapreplace_max(heap, item)

Pop and return the largest item from the max-heap *heap* and also push the new *item*.
The max-heap size doesn't change. If the max-heap is empty, :exc:`IndexError` is raised.

The value returned may be smaller than the *item* added. Refer to the analogous
function :func:`heapreplace` for detailed usage notes.

.. versionadded:: next


The module also offers three general purpose functions based on heaps.


Expand Down
12 changes: 12 additions & 0 deletions Doc/whatsnew/3.14.rst
Original file line number Diff line number Diff line change
Expand Up @@ -844,6 +844,18 @@ graphlib
(Contributed by Daniel Pope in :gh:`130914`)


heapq
-----

* Add functions for working with max-heaps:

* :func:`heapq.heapify_max`,
* :func:`heapq.heappush_max`,
* :func:`heapq.heappop_max`,
* :func:`heapq.heapreplace_max`
* :func:`heapq.heappushpop_max`


hmac
----

Expand Down
51 changes: 29 additions & 22 deletions Lib/heapq.py
Original file line number Diff line number Diff line change
Expand Up @@ -178,7 +178,7 @@ def heapify(x):
for i in reversed(range(n//2)):
_siftup(x, i)

def _heappop_max(heap):
def heappop_max(heap):
"""Maxheap version of a heappop."""
lastelt = heap.pop() # raises appropriate IndexError if heap is empty
if heap:
Expand All @@ -188,19 +188,32 @@ def _heappop_max(heap):
return returnitem
return lastelt

def _heapreplace_max(heap, item):
def heapreplace_max(heap, item):
"""Maxheap version of a heappop followed by a heappush."""
returnitem = heap[0] # raises appropriate IndexError if heap is empty
heap[0] = item
_siftup_max(heap, 0)
return returnitem

def _heapify_max(x):
def heappush_max(heap, item):
"""Maxheap version of a heappush."""
heap.append(item)
_siftdown_max(heap, 0, len(heap)-1)

def heappushpop_max(heap, item):
"""Maxheap fast version of a heappush followed by a heappop."""
if heap and heap[0] < item:
item, heap[0] = heap[0], item
_siftup_max(heap, 0)
return item

def heapify_max(x):
"""Transform list into a maxheap, in-place, in O(len(x)) time."""
n = len(x)
for i in reversed(range(n//2)):
_siftup_max(x, i)


# 'heap' is a heap at all indices >= startpos, except possibly for pos. pos
# is the index of a leaf with a possibly out-of-order value. Restore the
# heap invariant.
Expand Down Expand Up @@ -335,9 +348,9 @@ def merge(*iterables, key=None, reverse=False):
h_append = h.append

if reverse:
_heapify = _heapify_max
_heappop = _heappop_max
_heapreplace = _heapreplace_max
_heapify = heapify_max
_heappop = heappop_max
_heapreplace = heapreplace_max
direction = -1
else:
_heapify = heapify
Expand Down Expand Up @@ -490,10 +503,10 @@ def nsmallest(n, iterable, key=None):
result = [(elem, i) for i, elem in zip(range(n), it)]
if not result:
return result
_heapify_max(result)
heapify_max(result)
top = result[0][0]
order = n
_heapreplace = _heapreplace_max
_heapreplace = heapreplace_max
for elem in it:
if elem < top:
_heapreplace(result, (elem, order))
Expand All @@ -507,10 +520,10 @@ def nsmallest(n, iterable, key=None):
result = [(key(elem), i, elem) for i, elem in zip(range(n), it)]
if not result:
return result
_heapify_max(result)
heapify_max(result)
top = result[0][0]
order = n
_heapreplace = _heapreplace_max
_heapreplace = heapreplace_max
for elem in it:
k = key(elem)
if k < top:
Expand Down Expand Up @@ -583,19 +596,13 @@ def nlargest(n, iterable, key=None):
from _heapq import *
except ImportError:
pass
try:
from _heapq import _heapreplace_max
except ImportError:
pass
try:
from _heapq import _heapify_max
except ImportError:
pass
try:
from _heapq import _heappop_max
except ImportError:
pass

# For backwards compatibility
_heappop_max = heappop_max
_heapreplace_max = heapreplace_max
_heappush_max = heappush_max
_heappushpop_max = heappushpop_max
_heapify_max = heapify_max

if __name__ == "__main__":

Expand Down
105 changes: 97 additions & 8 deletions Lib/test/test_heapq.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@
# _heapq.nlargest/nsmallest are saved in heapq._nlargest/_smallest when
# _heapq is imported, so check them there
func_names = ['heapify', 'heappop', 'heappush', 'heappushpop', 'heapreplace',
'_heappop_max', '_heapreplace_max', '_heapify_max']
'heappop_max', 'heapreplace_max', 'heapify_max', 'heappushpop_max']

class TestModules(TestCase):
def test_py_functions(self):
Expand Down Expand Up @@ -74,13 +74,46 @@ def test_push_pop(self):
except AttributeError:
pass

def test_max_push_pop(self):
# 1) Push 256 random numbers and pop them off, verifying all's OK.
heap = []
data = []
self.check_max_invariant(heap)
for i in range(256):
item = random.random()
data.append(item)
self.module.heappush_max(heap, item)
self.check_max_invariant(heap)
results = []
while heap:
item = self.module.heappop_max(heap)
self.check_max_invariant(heap)
results.append(item)
data_sorted = data[:]
data_sorted.sort(reverse=True)

self.assertEqual(data_sorted, results)
# 2) Check that the invariant holds for a sorted array
self.check_max_invariant(results)

self.assertRaises(TypeError, self.module.heappush_max, [])

exc_types = (AttributeError, TypeError)
self.assertRaises(exc_types, self.module.heappush_max, None, None)
self.assertRaises(exc_types, self.module.heappop_max, None)

def check_invariant(self, heap):
# Check the heap invariant.
for pos, item in enumerate(heap):
if pos: # pos 0 has no parent
parentpos = (pos-1) >> 1
self.assertTrue(heap[parentpos] <= item)

def check_max_invariant(self, heap):
for pos, item in enumerate(heap[1:], start=1):
parentpos = (pos - 1) >> 1
self.assertGreaterEqual(heap[parentpos], item)

def test_heapify(self):
for size in list(range(30)) + [20000]:
heap = [random.random() for dummy in range(size)]
Expand All @@ -89,6 +122,14 @@ def test_heapify(self):

self.assertRaises(TypeError, self.module.heapify, None)

def test_heapify_max(self):
for size in list(range(30)) + [20000]:
heap = [random.random() for dummy in range(size)]
self.module.heapify_max(heap)
self.check_max_invariant(heap)

self.assertRaises(TypeError, self.module.heapify_max, None)

def test_naive_nbest(self):
data = [random.randrange(2000) for i in range(1000)]
heap = []
Expand Down Expand Up @@ -153,12 +194,31 @@ def test_heappushpop(self):
x = self.module.heappushpop(h, 11)
self.assertEqual((h, x), ([11], 10))

def test_heappushpop_max(self):
h = []
x = self.module.heappushpop_max(h, 10)
self.assertTupleEqual((h, x), ([], 10))

h = [10]
x = self.module.heappushpop_max(h, 10.0)
self.assertTupleEqual((h, x), ([10], 10.0))
self.assertIsInstance(h[0], int)
self.assertIsInstance(x, float)

h = [10]
x = self.module.heappushpop_max(h, 11)
self.assertTupleEqual((h, x), ([11], 10))

h = [10]
x = self.module.heappushpop_max(h, 9)
self.assertTupleEqual((h, x), ([10], 9))

def test_heappop_max(self):
# _heapop_max has an optimization for one-item lists which isn't
# heapop_max has an optimization for one-item lists which isn't
# covered in other tests, so test that case explicitly here
h = [3, 2]
self.assertEqual(self.module._heappop_max(h), 3)
self.assertEqual(self.module._heappop_max(h), 2)
self.assertEqual(self.module.heappop_max(h), 3)
self.assertEqual(self.module.heappop_max(h), 2)

def test_heapsort(self):
# Exercise everything with repeated heapsort checks
Expand Down Expand Up @@ -377,16 +437,20 @@ def __lt__(self, other):
class TestErrorHandling:

def test_non_sequence(self):
for f in (self.module.heapify, self.module.heappop):
for f in (self.module.heapify, self.module.heappop,
self.module.heapify_max, self.module.heappop_max):
self.assertRaises((TypeError, AttributeError), f, 10)
for f in (self.module.heappush, self.module.heapreplace,
self.module.heappush_max, self.module.heapreplace_max,
self.module.nlargest, self.module.nsmallest):
self.assertRaises((TypeError, AttributeError), f, 10, 10)

def test_len_only(self):
for f in (self.module.heapify, self.module.heappop):
for f in (self.module.heapify, self.module.heappop,
self.module.heapify_max, self.module.heappop_max):
self.assertRaises((TypeError, AttributeError), f, LenOnly())
for f in (self.module.heappush, self.module.heapreplace):
for f in (self.module.heappush, self.module.heapreplace,
self.module.heappush_max, self.module.heapreplace_max):
self.assertRaises((TypeError, AttributeError), f, LenOnly(), 10)
for f in (self.module.nlargest, self.module.nsmallest):
self.assertRaises(TypeError, f, 2, LenOnly())
Expand All @@ -395,14 +459,17 @@ def test_cmp_err(self):
seq = [CmpErr(), CmpErr(), CmpErr()]
for f in (self.module.heapify, self.module.heappop):
self.assertRaises(ZeroDivisionError, f, seq)
for f in (self.module.heappush, self.module.heapreplace):
for f in (self.module.heappush, self.module.heapreplace,
self.module.heappush_max, self.module.heapreplace_max):
self.assertRaises(ZeroDivisionError, f, seq, 10)
for f in (self.module.nlargest, self.module.nsmallest):
self.assertRaises(ZeroDivisionError, f, 2, seq)

def test_arg_parsing(self):
for f in (self.module.heapify, self.module.heappop,
self.module.heappush, self.module.heapreplace,
self.module.heapify_max, self.module.heappop_max,
self.module.heappush_max, self.module.heapreplace_max,
self.module.nlargest, self.module.nsmallest):
self.assertRaises((TypeError, AttributeError), f, 10)

Expand All @@ -424,13 +491,21 @@ def test_heappush_mutating_heap(self):
# Python version raises IndexError, C version RuntimeError
with self.assertRaises((IndexError, RuntimeError)):
self.module.heappush(heap, SideEffectLT(5, heap))
heap = []
heap.extend(SideEffectLT(i, heap) for i in range(200))
with self.assertRaises((IndexError, RuntimeError)):
self.module.heappush_max(heap, SideEffectLT(5, heap))

def test_heappop_mutating_heap(self):
heap = []
heap.extend(SideEffectLT(i, heap) for i in range(200))
# Python version raises IndexError, C version RuntimeError
with self.assertRaises((IndexError, RuntimeError)):
self.module.heappop(heap)
heap = []
heap.extend(SideEffectLT(i, heap) for i in range(200))
with self.assertRaises((IndexError, RuntimeError)):
self.module.heappop_max(heap)

def test_comparison_operator_modifiying_heap(self):
# See bpo-39421: Strong references need to be taken
Expand All @@ -443,6 +518,9 @@ def __lt__(self, o):
heap = []
self.module.heappush(heap, EvilClass(0))
self.assertRaises(IndexError, self.module.heappushpop, heap, 1)
heap = []
self.module.heappush_max(heap, EvilClass(0))
self.assertRaises(IndexError, self.module.heappushpop_max, heap, 1)

def test_comparison_operator_modifiying_heap_two_heaps(self):

Expand All @@ -464,6 +542,17 @@ def __lt__(self, o):
self.assertRaises((IndexError, RuntimeError), self.module.heappush, list1, g(1))
self.assertRaises((IndexError, RuntimeError), self.module.heappush, list2, h(1))

list1, list2 = [], []

self.module.heappush_max(list1, h(0))
self.module.heappush_max(list2, g(0))
self.module.heappush_max(list1, g(1))
self.module.heappush_max(list2, h(1))

TestHeap.check_max_invariant(self, list1)
TestHeap.check_max_invariant(self, list2)


class TestErrorHandlingPython(TestErrorHandling, TestCase):
module = py_heapq

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
Make :mod:`heapq` max-heap functions :func:`heapq.heapify_max`, :func:`heapq.heappush_max`,
:func:`heapq.heappop_max`, and :func:`heapq.heapreplace_max` public.
Previous underscored naming is kept for backwards compatibility.
Additionally, the missing function :func:`heapq.heappushpop_max` has been added
to both the C and Python implementations.
Loading
Loading