Commit 0f12905

Count individual SQL commands in pg_restore's --transaction-size mode.

The initial implementation in commit 959b38d counted one action per TOC entry (except for some special cases for multi-blob BLOBS entries). This assumes that TOC entries are all about equally complex, but it turns out that that assumption doesn't hold up very well in binary-upgrade mode. For example, even after the previous commit I was able to cause backend bloat with tables having many inherited constraints. There may be other cases too. (Since no serious problems have been reported with --single-transaction mode, we can conclude that the backend copes well with psql's regular restore scripts; but before 959b38d we never ran binary-upgrade restores with multi-command transactions.)

To fix, count multi-command TOC entries as N actions, allowing the transaction size to be scaled down when we hit a complex TOC entry. Rather than add a SQL parser to pg_restore, approximate "multi command" by counting semicolons in the TOC entry's defn string. This will be fooled by semicolons appearing in string literals --- but the error is in the conservative direction, so it doesn't seem worth working harder. The biggest risk is with function/procedure TOC entries, but we can just explicitly skip those.

(This is undoubtedly a hack, and maybe someday we'll be able to revert it after fixing the backend's bloat issues or rethinking what pg_dump emits in binary upgrade mode. But that surely isn't a project for v17.)

Thanks to Alexander Korotkov for the let's-count-semicolons idea.

Per report from Justin Pryzby. Back-patch to v17 where txn_size mode was introduced.

Discussion: https://postgr.es/m/ZqEND4ZcTDBmcv31@pryzbyj2023

1 parent b3f0e05 commit 0f12905

src/bin/pg_dump/pg_backup_archiver.c

Lines changed: 25 additions & 3 deletions
@@ -3827,10 +3827,32 @@ _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData)
     {
         IssueACLPerBlob(AH, te);
     }
-    else
+    else if (te->defn && strlen(te->defn) > 0)
     {
-        if (te->defn && strlen(te->defn) > 0)
-            ahprintf(AH, "%s\n\n", te->defn);
+        ahprintf(AH, "%s\n\n", te->defn);
+
+        /*
+         * If the defn string contains multiple SQL commands, txn_size mode
+         * should count it as N actions not one. But rather than build a full
+         * SQL parser, approximate this by counting semicolons. One case
+         * where that tends to be badly fooled is function definitions, so
+         * ignore them. (restore_toc_entry will count one action anyway.)
+         */
+        if (ropt->txn_size > 0 &&
+            strcmp(te->desc, "FUNCTION") != 0 &&
+            strcmp(te->desc, "PROCEDURE") != 0)
+        {
+            const char *p = te->defn;
+            int nsemis = 0;
+
+            while ((p = strchr(p, ';')) != NULL)
+            {
+                nsemis++;
+                p++;
+            }
+            if (nsemis > 1)
+                AH->txnCount += nsemis - 1;
+        }
     }
 
     /*
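
The counting rule in the hunk above can be tried in isolation. The following is a minimal sketch, not code from pg_restore: `extra_actions` is a hypothetical helper name, and the `desc` and `defn` arguments stand in for the `te->desc` and `te->defn` fields used by the patch. It returns the amount that would be added to the transaction counter on top of the one action that, per the patch's comment, restore_toc_entry counts anyway.

```c
#include <string.h>

/*
 * Hypothetical helper, not part of pg_restore: return the number of
 * extra actions a TOC entry's definition string would contribute,
 * following the "nsemis - 1" rule from the patch.
 */
static int
extra_actions(const char *desc, const char *defn)
{
    const char *p = defn;
    int nsemis = 0;

    /* Function and procedure bodies are skipped, as in the patch. */
    if (strcmp(desc, "FUNCTION") == 0 || strcmp(desc, "PROCEDURE") == 0)
        return 0;

    /* Count every ';' in the string, including any inside literals. */
    while ((p = strchr(p, ';')) != NULL)
    {
        nsemis++;
        p++;                    /* step past the semicolon just found */
    }

    /* The first command is already covered by the per-entry count. */
    return (nsemis > 1) ? nsemis - 1 : 0;
}
```

Under these assumptions, a definition holding one `CREATE TABLE` and two `ALTER TABLE` commands yields 2 extra actions, while a single-command definition yields 0. A definition such as `COMMENT ON TABLE t IS 'a; b';` yields 1 because the semicolon inside the literal is counted too; that over-count only makes the transaction end sooner, which is the conservative direction the commit message describes.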
