Commit bd06fe4

YA attempt at taming worst-case behavior of get_actual_variable_range.
We've made multiple attempts at preventing get_actual_variable_range from taking an unreasonable amount of time (3ca930f, fccebe4). But there's still an issue for the very first planning attempt after deletion of a large number of extremal-valued tuples. While that planning attempt will set "killed" bits on the tuples it visits and thereby reduce effort for next time, there's still a lot of work it has to do to visit the heap and then set those bits. It's (usually?) not worth it to do that much work at plan time to have a slightly better estimate, especially in a context like this where the table contents are known to be mutating rapidly.

Therefore, let's bound the amount of work to be done by giving up after we've visited 100 heap pages. Giving up just means we'll fall back on the extremal value recorded in pg_statistic, so it shouldn't mean that planner estimates suddenly become worthless.

Note that this means we'll still gradually whittle down the problem by setting a few more index "killed" bits in each planning attempt; so eventually we'll reach a good state (barring further deletions), even in the absence of VACUUM.

Simon Riggs, per a complaint from Jakub Wartak (with cosmetic adjustments by me). Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAKZiRmznOwi0oaV=4PHOCM4ygcH4MgSvt8=5cu_vNCfc8FSUug@mail.gmail.com
1 parent 870d621 commit bd06fe4
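
The page-budget heuristic described in the commit message can be sketched in isolation. The following is a simplified, standalone model, not the PostgreSQL code: the `Demo*` types, `demo_find_endpoint`, and the array of index entries are hypothetical stand-ins, and the visibility-map shortcut is left out so that every entry counts as a heap visit. Only the counting rule (charge a page when it differs from the previous one, stop once the count exceeds the limit) follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for BlockNumber / InvalidBlockNumber. */
typedef uint32_t DemoBlockNumber;
#define DEMO_INVALID_BLOCK ((DemoBlockNumber) 0xFFFFFFFF)

/* Same value as VISITED_PAGES_LIMIT in the patch. */
#define DEMO_VISITED_PAGES_LIMIT 100

/* Hypothetical model of one index entry, in index-scan order. */
typedef struct DemoIndexEntry
{
    DemoBlockNumber block;      /* heap page the entry points at */
    bool            visible;    /* would the heap visit find a visible tuple? */
} DemoIndexEntry;

/*
 * Return the position of the first entry with a visible tuple, or -1 if the
 * entries run out or more than "limit" distinct consecutive heap pages were
 * visited without success.
 */
static int
demo_find_endpoint(const DemoIndexEntry *entries, size_t n, int limit)
{
    DemoBlockNumber last_heap_block = DEMO_INVALID_BLOCK;
    int             n_visited_heap_pages = 0;

    assert(entries != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (entries[i].visible)
            return (int) i;     /* success: this entry is the endpoint */

        /* Dead entry: charge a page only if it differs from the last one. */
        if (entries[i].block != last_heap_block)
        {
            last_heap_block = entries[i].block;
            n_visited_heap_pages++;
            if (n_visited_heap_pages > limit)
                break;          /* give up; the caller falls back on stats */
        }
    }
    return -1;
}
```

In this model, a run of dead entries that all sit on one heap page is charged once, so the scan still reaches a visible tuple behind them, while dead entries spread over more than 100 distinct pages make the function give up.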

1 file changed, +40 -5 lines changed

src/backend/utils/adt/selfuncs.c

Lines changed: 40 additions & 5 deletions
@@ -5968,7 +5968,7 @@ get_stats_slot_range(AttStatsSlot *sslot, Oid opfuncoid, FmgrInfo *opproc,
  * and fetching its low and/or high values.
  * If successful, store values in *min and *max, and return true.
  * (Either pointer can be NULL if that endpoint isn't needed.)
- * If no data available, return false.
+ * If unsuccessful, return false.
  *
  * sortop is the "<" comparison operator to use.
  * collation is the required collation.
@@ -6097,11 +6097,11 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
             }
             else
             {
-                /* If min not requested, assume index is nonempty */
+                /* If min not requested, still want to fetch max */
                 have_data = true;
             }
 
-            /* If max is requested, and we didn't find the index is empty */
+            /* If max is requested, and we didn't already fail ... */
             if (max && have_data)
             {
                 /* scan in the opposite direction; all else is the same */
@@ -6135,7 +6135,7 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 
 /*
  * Get one endpoint datum (min or max depending on indexscandir) from the
- * specified index. Return true if successful, false if index is empty.
+ * specified index. Return true if successful, false if not.
  * On success, endpoint value is stored to *endpointDatum (and copied into
  * outercontext).
  *
@@ -6145,6 +6145,9 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
  * to probe the heap.
  * (We could compute these values locally, but that would mean computing them
  * twice when get_actual_variable_range needs both the min and the max.)
+ *
+ * Failure occurs either when the index is empty, or we decide that it's
+ * taking too long to find a suitable tuple.
  */
 static bool
 get_actual_variable_endpoint(Relation heapRel,
@@ -6161,6 +6164,8 @@ get_actual_variable_endpoint(Relation heapRel,
     SnapshotData SnapshotNonVacuumable;
     IndexScanDesc index_scan;
     Buffer vmbuffer = InvalidBuffer;
+    BlockNumber last_heap_block = InvalidBlockNumber;
+    int n_visited_heap_pages = 0;
     ItemPointer tid;
     Datum values[INDEX_MAX_KEYS];
     bool isnull[INDEX_MAX_KEYS];
@@ -6203,6 +6208,12 @@ get_actual_variable_endpoint(Relation heapRel,
      * might get a bogus answer that's not close to the index extremal value,
      * or could even be NULL. We avoid this hazard because we take the data
      * from the index entry not the heap.
+     *
+     * Despite all this care, there are situations where we might find many
+     * non-visible tuples near the end of the index. We don't want to expend
+     * a huge amount of time here, so we give up once we've read too many heap
+     * pages. When we fail for that reason, the caller will end up using
+     * whatever extremal value is recorded in pg_statistic.
      */
     InitNonVacuumableSnapshot(SnapshotNonVacuumable,
                               GlobalVisTestFor(heapRel));
@@ -6217,13 +6228,37 @@ get_actual_variable_endpoint(Relation heapRel,
     /* Fetch first/next tuple in specified direction */
     while ((tid = index_getnext_tid(index_scan, indexscandir)) != NULL)
     {
+        BlockNumber block = ItemPointerGetBlockNumber(tid);
+
         if (!VM_ALL_VISIBLE(heapRel,
-                            ItemPointerGetBlockNumber(tid),
+                            block,
                             &vmbuffer))
         {
             /* Rats, we have to visit the heap to check visibility */
             if (!index_fetch_heap(index_scan, tableslot))
+            {
+                /*
+                 * No visible tuple for this index entry, so we need to
+                 * advance to the next entry. Before doing so, count heap
+                 * page fetches and give up if we've done too many.
+                 *
+                 * We don't charge a page fetch if this is the same heap page
+                 * as the previous tuple. This is on the conservative side,
+                 * since other recently-accessed pages are probably still in
+                 * buffers too; but it's good enough for this heuristic.
+                 */
+#define VISITED_PAGES_LIMIT 100
+
+                if (block != last_heap_block)
+                {
+                    last_heap_block = block;
+                    n_visited_heap_pages++;
+                    if (n_visited_heap_pages > VISITED_PAGES_LIMIT)
+                        break;
+                }
+
                 continue; /* no visible tuple, try next index entry */
+            }
 
             /* We don't actually need the heap tuple for anything */
             ExecClearTuple(tableslot);
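
The changed comments also alter what a false return means: it no longer implies an empty index, only that no endpoint was obtained, and per the commit message the planner then relies on the extremal value in pg_statistic. A minimal sketch of that contract follows, under stated assumptions: `demo_probe_fn`, `demo_get_actual_range`, `demo_effective_max`, and the two probe functions are hypothetical names for illustration, values are plain doubles rather than Datums, and the fallback caller is a guess at the general shape, not a copy of any PostgreSQL caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe: fetch the min (want_max == false) or the max. */
typedef bool (*demo_probe_fn) (bool want_max, double *value);

/*
 * Mirrors the have_data flow shown in the diff: if the min probe fails,
 * the max probe is skipped and the whole call reports failure.
 */
static bool
demo_get_actual_range(demo_probe_fn probe, double *min, double *max)
{
    bool    have_data;
    double  tmin = 0.0;
    double  tmax = 0.0;

    assert(probe != NULL);

    if (min)
        have_data = probe(false, &tmin);
    else
        have_data = true;       /* min not requested, still want to fetch max */

    /* If max is requested, and we didn't already fail ... */
    if (max && have_data)
        have_data = probe(true, &tmax);

    /* Only store results on overall success. */
    if (have_data)
    {
        if (min)
            *min = tmin;
        if (max)
            *max = tmax;
    }
    return have_data;
}

/* Hypothetical caller: prefer the live index value, else keep the stats one. */
static double
demo_effective_max(demo_probe_fn probe, double stats_max)
{
    double  live_max;

    if (demo_get_actual_range(probe, NULL, &live_max))
        return live_max;
    return stats_max;           /* probe failed or gave up: use statistics */
}

/* Hypothetical probes: one that succeeds, one that always gives up. */
static bool
demo_probe_ok(bool want_max, double *value)
{
    *value = want_max ? 90.0 : 10.0;
    return true;
}

static bool
demo_probe_gave_up(bool want_max, double *value)
{
    (void) want_max;
    (void) value;
    return false;
}
```

With `demo_probe_gave_up`, `demo_effective_max` simply returns the statistics value it was handed, which is the sense in which giving up should not make the estimate worthless.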
