Commit 75f386d

Fix two ancient bugs in GiST code to re-find a parent after page split:
First, when following a right-link, we incorrectly marked the current page as the parent of the right sibling. In reality, the parent of the right page is the same as the parent of the current page (or some page to the right of it, gistFindCorrectParent() will sort that out).

Secondly, when we follow a right-link, we must prepend, not append, the right page to our list of pages to visit. That's because we assume that once we hit a leaf page in the list, all the rest are leaf pages too, and give up.

To hit these bugs, you need concurrent actions and several unlucky accidents. Another backend must split the root page, while you're in process of splitting a lower-level page. Furthermore, while you scan the internal nodes to re-find the parent, another backend needs to again split some more internal pages. Even then, the bugs don't necessarily manifest as user-visible errors or index corruption.

While we're at it, make the error reporting a bit better if gistFindPath() fails to re-find the parent. It used to be an assertion, but an elog() seems more appropriate.

Backpatch to all supported branches.
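The two list-handling fixes can be illustrated outside PostgreSQL. What follows is a minimal sketch, not the actual GiST code: `Node` is a hypothetical, simplified stand-in for `GISTInsertStack`, and `queue_right_sibling` is an illustrative helper name that does not exist in the source tree. It assumes a singly linked visit queue with a separately tracked tail, as in gistFindPath().

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for GISTInsertStack. */
typedef struct Node
{
    unsigned     blkno;   /* block number of the page */
    struct Node *parent;  /* page through which this page was reached */
    struct Node *next;    /* next page in the visit queue */
} Node;

/*
 * Queue the right sibling of 'top' so that it is visited immediately
 * after 'top', ahead of anything already queued behind it.
 *
 * Fix 1: the sibling shares the parent of 'top'; 'top' itself is not
 *        the sibling's parent.
 * Fix 2: the sibling is linked in right after 'top' rather than at the
 *        tail, so a non-leaf page never ends up behind leaf pages.
 */
Node *
queue_right_sibling(Node *top, Node **tail, unsigned rightlink)
{
    Node *ptr = calloc(1, sizeof(Node));

    if (ptr == NULL)
        abort();            /* sketch only: no real error handling */

    ptr->blkno = rightlink;
    ptr->parent = top->parent;
    ptr->next = top->next;
    top->next = ptr;

    /* If 'top' was the last entry, the new entry becomes the tail. */
    if (*tail == top)
        *tail = ptr;

    return ptr;
}
```

With a queue of root, internal page 1, leaf 10, calling `queue_right_sibling` on page 1 with right-link 2 yields the order root, 1, 2, 10. Appending at the tail would instead place page 2 behind leaf 10, where the scan would already have given up.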
1 parent 0dd46a7 commit 75f386d

1 file changed: +24 -9 lines changed

src/backend/access/gist/gist.c

Lines changed: 24 additions & 9 deletions
@@ -676,24 +676,38 @@ gistFindPath(Relation r, BlockNumber child)
 
         if (GistPageIsLeaf(page))
         {
-            /* we can safety go away, follows only leaf pages */
+            /*
+             * Because we scan the index top-down, all the rest of the pages
+             * in the queue must be leaf pages as well.
+             */
             UnlockReleaseBuffer(buffer);
-            return NULL;
+            break;
         }
 
         top->lsn = PageGetLSN(page);
 
         if (top->parent && XLByteLT(top->parent->lsn, GistPageGetOpaque(page)->nsn) &&
             GistPageGetOpaque(page)->rightlink != InvalidBlockNumber /* sanity check */ )
         {
-            /* page splited while we thinking of... */
+            /*
+             * Page was split while we looked elsewhere. We didn't see the
+             * downlink to the right page when we scanned the parent, so
+             * add it to the queue now.
+             *
+             * Put the right page ahead of the queue, so that we visit it
+             * next. That's important, because if this is the lowest internal
+             * level, just above leaves, we might already have queued up some
+             * leaf pages, and we assume that there can't be any non-leaf
+             * pages behind leaf pages.
+             */
             ptr = (GISTInsertStack *) palloc0(sizeof(GISTInsertStack));
             ptr->blkno = GistPageGetOpaque(page)->rightlink;
             ptr->childoffnum = InvalidOffsetNumber;
-            ptr->parent = top;
-            ptr->next = NULL;
-            tail->next = ptr;
-            tail = ptr;
+            ptr->parent = top->parent;
+            ptr->next = top->next;
+            top->next = ptr;
+            if (tail == top)
+                tail = ptr;
         }
 
         maxoff = PageGetMaxOffsetNumber(page);
@@ -751,7 +765,9 @@ gistFindPath(Relation r, BlockNumber child)
         top = top->next;
     }
 
-    return NULL;
+    elog(ERROR, "failed to re-find parent of a page in index \"%s\", block %u",
+         RelationGetRelationName(r), child);
+    return NULL;                /* keep compiler quiet */
 }
 
 
@@ -823,7 +839,6 @@ gistFindCorrectParent(Relation r, GISTInsertStack *child)
 
         /* ok, find new path */
         ptr = parent = gistFindPath(r, child->blkno);
-        Assert(ptr != NULL);
 
         /* read all buffers as expected by caller */
         /* note we don't lock them or gistcheckpage them here! */
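
The error-reporting change in the last two hunks replaces a caller-side assertion with a check made where the failure is detected. As a rough illustration only, the sketch below mimics that pattern in plain C. `PathEntry`, `find_path`, and `find_path_checked` are hypothetical names, and the toy lookup succeeds only for block 7. Unlike elog(ERROR), which does not return to the caller, this sketch returns an error code and writes the message to a buffer so the behaviour can be observed without PostgreSQL.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one entry of the path returned by the scan. */
typedef struct PathEntry
{
    unsigned blkno;
} PathEntry;

/* Toy lookup (assumption): only block 7 has a parent that can be found. */
PathEntry *
find_path(unsigned child)
{
    static PathEntry found = {3};

    return (child == 7) ? &found : NULL;
}

/*
 * Checked lookup: returns 0 and sets *out on success. On failure it
 * formats a message with the same wording as the commit's elog() call
 * and returns -1. An assertion in the caller would vanish in builds
 * without assertions, letting a NULL pointer flow onward; this check
 * is always present.
 */
int
find_path_checked(const char *index_name, unsigned child,
                  PathEntry **out, char *errbuf, size_t errlen)
{
    PathEntry *p = find_path(child);

    if (p == NULL)
    {
        snprintf(errbuf, errlen,
                 "failed to re-find parent of a page in index \"%s\", block %u",
                 index_name, child);
        return -1;
    }

    *out = p;
    return 0;
}
```

The design point is that the failure is reported by the function that detects it, with the index name and block number attached, instead of relying on every caller to check the result.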
