@@ -231,6 +231,42 @@ Heap elements can be tuples. This is useful for assigning comparison values
231
231
(1, 'write spec')
232
232
233
233
234
+ Other Applications
235
+ ------------------
236
+
237
+ `Medians <https://en.wikipedia.org/wiki/Median >`_ are a measure of
238
+ central tendency for a set of numbers. In distributions skewed by
239
+ outliers, the median provides a more stable estimate than an average
240
+ (arithmetic mean). A running median is an `online algorithm
241
+ <https://en.wikipedia.org/wiki/Online_algorithm> `_ that updates
242
+ continuously as new data arrives.
243
+
244
+ A running median can be efficiently implemented by balancing two heaps,
245
+ a max-heap for values at or below the midpoint and a min-heap for values
246
+ above the midpoint. When the two heaps have the same size, the new
247
+ median is the average of the tops of the two heaps; otherwise, the
248
+ median is at the top of the larger heap::
249
+
250
+ def running_median(iterable):
251
+ "Yields the cumulative median of values seen so far."
252
+
253
+ lo = [] # max-heap
254
+ hi = [] # min-heap (same size as or one smaller than lo)
255
+
256
+ for x in iterable:
257
+ if len(lo) == len(hi):
258
+ heappush_max(lo, heappushpop(hi, x))
259
+ yield lo[0]
260
+ else:
261
+ heappush(hi, heappushpop_max(lo, x))
262
+ yield (lo[0] + hi[0]) / 2
263
+
264
+ For example::
265
+
266
+ >>> list(running_median([5.0, 9.0, 4.0, 12.0, 8.0, 9.0]))
267
+ [5.0, 7.0, 5.0, 7.0, 8.0, 8.5]
268
+
269
+
234
270
Priority Queue Implementation Notes
235
271
-----------------------------------
236
272
0 commit comments