Skip to content

Commit 70d067c

Browse files
committed
Exclude parallel workers from connection privilege/limit checks.
Cause parallel workers to not check datallowconn, rolcanlogin, and ACL_CONNECT privileges. The leader already checked these things (except for rolcanlogin which might have been checked for a different role). Re-checking can accomplish little except to induce unexpected failures in applications that might not even be aware that their query has been parallelized. We already had the principle that parallel workers rely on their leader to pass a valid set of authorization information, so this change just extends that a bit further. Also, modify the ReservedConnections, datconnlimit and rolconnlimit logic so that these limits are only enforced against regular backends, and only regular backends are counted while checking if the limits were already reached. Previously, background processes that had an assigned database or role were subject to these limits (with rather random exclusions for autovac workers and walsenders), and the set of existing processes that counted against each limit was quite haphazard as well. The point of these limits, AFAICS, is to ensure the availability of PGPROC slots for regular backends. Since all other types of processes have their own separate pools of PGPROC slots, it makes no sense either to enforce these limits against them or to count them while enforcing the limit. While edge-case failures of these sorts have been possible for a long time, the problem got a good deal worse with commit 5a2fed9 (CVE-2024-10978), which caused parallel workers to make some of these checks using the leader's current role where before we had used its AuthenticatedUserId, thus allowing parallel queries to fail after SET ROLE. The previous behavior was fairly accidental and I have no desire to return to it. This patch includes reverting 73c9f91, which was an emergency hack to suppress these same checks in some cases. It wasn't complete, as shown by a recent bug report from Laurenz Albe. We can also revert fd4d93d and 4922173, which hacked around the same problems in one regression test. In passing, remove the special case for autovac workers in CheckMyDatabase; it seems cleaner to have AutoVacWorkerMain pass the INIT_PG_OVERRIDE_ALLOW_CONNS flag, now that that does what's needed. Like 5a2fed9, back-patch to supported branches (which sadly no longer includes v12). Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
1 parent 1025463 commit 70d067c

File tree

9 files changed

+68
-76
lines changed

9 files changed

+68
-76
lines changed

src/backend/access/transam/parallel.c

+7-2
Original file line numberDiff line numberDiff line change
@@ -1413,10 +1413,15 @@ ParallelWorkerMain(Datum main_arg)
14131413
fps->session_user_is_superuser);
14141414
SetCurrentRoleId(fps->outer_user_id, fps->role_is_superuser);
14151415

1416-
/* Restore database connection. */
1416+
/*
1417+
* Restore database connection. We skip connection authorization checks,
1418+
* reasoning that (a) the leader checked these things when it started, and
1419+
* (b) we do not want parallel mode to cause these failures, because that
1420+
* would make use of parallel query plans not transparent to applications.
1421+
*/
14171422
BackgroundWorkerInitializeConnectionByOid(fps->database_id,
14181423
fps->authenticated_user_id,
1419-
0);
1424+
BGWORKER_BYPASS_ALLOWCONN);
14201425

14211426
/*
14221427
* Set the client encoding to the database encoding, since that is what

src/backend/access/transam/twophase.c

+1-1
Original file line numberDiff line numberDiff line change
@@ -483,7 +483,7 @@ MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId xid, const char *gid,
483483
proc->databaseId = databaseid;
484484
proc->roleId = owner;
485485
proc->tempNamespaceId = InvalidOid;
486-
proc->isBackgroundWorker = false;
486+
proc->isBackgroundWorker = true;
487487
proc->lwWaiting = LW_WS_NOT_WAITING;
488488
proc->lwWaitMode = 0;
489489
proc->waitLock = NULL;

src/backend/postmaster/autovacuum.c

+4-2
Original file line numberDiff line numberDiff line change
@@ -1675,12 +1675,14 @@ AutoVacWorkerMain(int argc, char *argv[])
16751675
pgstat_report_autovac(dbid);
16761676

16771677
/*
1678-
* Connect to the selected database
1678+
* Connect to the selected database, specifying no particular user,
1679+
* and ignoring datallowconn. Collect the database's name for
1680+
* display.
16791681
*
16801682
* Note: if we have selected a just-deleted database (due to using
16811683
* stale stats info), we'll fail and exit here.
16821684
*/
1683-
InitPostgres(NULL, dbid, NULL, InvalidOid, dbname, false);
1685+
InitPostgres(NULL, dbid, NULL, InvalidOid, dbname, true);
16841686
SetProcessingMode(NormalProcessing);
16851687
set_ps_display(dbname);
16861688
ereport(DEBUG1,

src/backend/storage/ipc/procarray.c

+2-2
Original file line numberDiff line numberDiff line change
@@ -2902,8 +2902,7 @@ CountDBBackends(Oid databaseid)
29022902
}
29032903

29042904
/*
2905-
* CountDBConnections --- counts database backends ignoring any background
2906-
* worker processes
2905+
* CountDBConnections --- counts database backends (only regular backends)
29072906
*/
29082907
int
29092908
CountDBConnections(Oid databaseid)
@@ -2975,6 +2974,7 @@ CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conflictPending)
29752974

29762975
/*
29772976
* CountUserBackends --- count backends that are used by specified user
2977+
* (only regular backends, not any type of background worker)
29782978
*/
29792979
int
29802980
CountUserBackends(Oid roleid)

src/backend/storage/lmgr/proc.c

+2-2
Original file line numberDiff line numberDiff line change
@@ -395,7 +395,7 @@ InitProcess(void)
395395
MyProc->databaseId = InvalidOid;
396396
MyProc->roleId = InvalidOid;
397397
MyProc->tempNamespaceId = InvalidOid;
398-
MyProc->isBackgroundWorker = IsBackgroundWorker;
398+
MyProc->isBackgroundWorker = !AmRegularBackendProcess();
399399
MyProc->delayChkpt = false;
400400
MyProc->delayChkptEnd = false;
401401
MyPgXact->vacuumFlags = 0;
@@ -578,7 +578,7 @@ InitAuxiliaryProcess(void)
578578
MyProc->databaseId = InvalidOid;
579579
MyProc->roleId = InvalidOid;
580580
MyProc->tempNamespaceId = InvalidOid;
581-
MyProc->isBackgroundWorker = IsBackgroundWorker;
581+
MyProc->isBackgroundWorker = true;
582582
MyProc->delayChkpt = false;
583583
MyProc->delayChkptEnd = false;
584584
MyPgXact->vacuumFlags = 0;

src/backend/utils/init/miscinit.c

+36-39
Original file line numberDiff line numberDiff line change
@@ -690,6 +690,16 @@ InitializeSessionUserId(const char *rolename, Oid roleid)
690690
char *rname;
691691
bool is_superuser;
692692

693+
/*
694+
* In a parallel worker, we don't have to do anything here.
695+
* ParallelWorkerMain already set our output variables, and we aren't
696+
* going to enforce either rolcanlogin or rolconnlimit. Furthermore, we
697+
* don't really want to perform a catalog lookup for the role: we don't
698+
* want to fail if it's been dropped.
699+
*/
700+
if (InitializingParallelWorker)
701+
return;
702+
693703
/*
694704
* Don't do scans if we're bootstrapping, none of the system catalogs
695705
* exist yet, and they should be owned by postgres anyway.
@@ -705,68 +715,52 @@ InitializeSessionUserId(const char *rolename, Oid roleid)
705715

706716
/*
707717
* Look up the role, either by name if that's given or by OID if not.
708-
* Normally we have to fail if we don't find it, but in parallel workers
709-
* just return without doing anything: all the critical work has been done
710-
* already. The upshot of that is that if the role has been deleted, we
711-
* will not enforce its rolconnlimit against parallel workers anymore.
712718
*/
713719
if (rolename != NULL)
714720
{
715721
roleTup = SearchSysCache1(AUTHNAME, PointerGetDatum(rolename));
716722
if (!HeapTupleIsValid(roleTup))
717-
{
718-
if (InitializingParallelWorker)
719-
return;
720723
ereport(FATAL,
721724
(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION),
722725
errmsg("role \"%s\" does not exist", rolename)));
723-
}
724726
}
725727
else
726728
{
727729
roleTup = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid));
728730
if (!HeapTupleIsValid(roleTup))
729-
{
730-
if (InitializingParallelWorker)
731-
return;
732731
ereport(FATAL,
733732
(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION),
734733
errmsg("role with OID %u does not exist", roleid)));
735-
}
736734
}
737735

738736
rform = (Form_pg_authid) GETSTRUCT(roleTup);
739737
roleid = rform->oid;
740738
rname = NameStr(rform->rolname);
741739
is_superuser = rform->rolsuper;
742740

743-
/* In a parallel worker, ParallelWorkerMain already set these variables */
744-
if (!InitializingParallelWorker)
745-
{
746-
SetAuthenticatedUserId(roleid, is_superuser);
741+
SetAuthenticatedUserId(roleid, is_superuser);
747742

748-
/*
749-
* Set SessionUserId and related variables, including "role", via the
750-
* GUC mechanisms.
751-
*
752-
* Note: ideally we would use PGC_S_DYNAMIC_DEFAULT here, so that
753-
* session_authorization could subsequently be changed from
754-
* pg_db_role_setting entries. Instead, session_authorization in
755-
* pg_db_role_setting has no effect. Changing that would require
756-
* solving two problems:
757-
*
758-
* 1. If pg_db_role_setting has values for both session_authorization
759-
* and role, we could not be sure which order those would be applied
760-
* in, and it would matter.
761-
*
762-
* 2. Sites may have years-old session_authorization entries. There's
763-
* not been any particular reason to remove them. Ending the dormancy
764-
* of those entries could seriously change application behavior, so
765-
* only a major release should do that.
766-
*/
767-
SetConfigOption("session_authorization", rname,
768-
PGC_BACKEND, PGC_S_OVERRIDE);
769-
}
743+
/*
744+
* Set SessionUserId and related variables, including "role", via the GUC
745+
* mechanisms.
746+
*
747+
* Note: ideally we would use PGC_S_DYNAMIC_DEFAULT here, so that
748+
* session_authorization could subsequently be changed from
749+
* pg_db_role_setting entries. Instead, session_authorization in
750+
* pg_db_role_setting has no effect. Changing that would require solving
751+
* two problems:
752+
*
753+
* 1. If pg_db_role_setting has values for both session_authorization and
754+
* role, we could not be sure which order those would be applied in, and
755+
* it would matter.
756+
*
757+
* 2. Sites may have years-old session_authorization entries. There's not
758+
* been any particular reason to remove them. Ending the dormancy of
759+
* those entries could seriously change application behavior, so only a
760+
* major release should do that.
761+
*/
762+
SetConfigOption("session_authorization", rname,
763+
PGC_BACKEND, PGC_S_OVERRIDE);
770764

771765
/*
772766
* These next checks are not enforced when in standalone mode, so that
@@ -785,7 +779,9 @@ InitializeSessionUserId(const char *rolename, Oid roleid)
785779
rname)));
786780

787781
/*
788-
* Check connection limit for this role.
782+
* Check connection limit for this role. We enforce the limit only
783+
* for regular backends, since other process types have their own
784+
* PGPROC pools.
789785
*
790786
* There is a race condition here --- we create our PGPROC before
791787
* checking for other PGPROCs. If two backends did this at about the
@@ -795,6 +791,7 @@ InitializeSessionUserId(const char *rolename, Oid roleid)
795791
* just document that the connection limit is approximate.
796792
*/
797793
if (rform->rolconnlimit >= 0 &&
794+
AmRegularBackendProcess() &&
798795
!is_superuser &&
799796
CountUserBackends(roleid) > rform->rolconnlimit)
800797
ereport(FATAL,

src/backend/utils/init/postinit.c

+13-27
Original file line numberDiff line numberDiff line change
@@ -22,7 +22,6 @@
2222
#include "access/genam.h"
2323
#include "access/heapam.h"
2424
#include "access/htup_details.h"
25-
#include "access/parallel.h"
2625
#include "access/session.h"
2726
#include "access/sysattr.h"
2827
#include "access/tableam.h"
@@ -332,13 +331,13 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
332331
* These checks are not enforced when in standalone mode, so that there is
333332
* a way to recover from disabling all access to all databases, for
334333
* example "UPDATE pg_database SET datallowconn = false;".
335-
*
336-
* We do not enforce them for autovacuum worker processes either.
337334
*/
338-
if (IsUnderPostmaster && !IsAutoVacuumWorkerProcess())
335+
if (IsUnderPostmaster)
339336
{
340337
/*
341338
* Check that the database is currently allowing connections.
339+
* (Background processes can override this test and the next one by
340+
* setting override_allow_connections.)
342341
*/
343342
if (!dbform->datallowconn && !override_allow_connections)
344343
ereport(FATAL,
@@ -351,7 +350,7 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
351350
* is redundant, but since we have the flag, might as well check it
352351
* and save a few cycles.)
353352
*/
354-
if (!am_superuser &&
353+
if (!am_superuser && !override_allow_connections &&
355354
pg_database_aclcheck(MyDatabaseId, GetUserId(),
356355
ACL_CONNECT) != ACLCHECK_OK)
357356
ereport(FATAL,
@@ -360,7 +359,9 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
360359
errdetail("User does not have CONNECT privilege.")));
361360

362361
/*
363-
* Check connection limit for this database.
362+
* Check connection limit for this database. We enforce the limit
363+
* only for regular backends, since other process types have their own
364+
* PGPROC pools.
364365
*
365366
* There is a race condition here --- we create our PGPROC before
366367
* checking for other PGPROCs. If two backends did this at about the
@@ -370,6 +371,7 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
370371
* just document that the connection limit is approximate.
371372
*/
372373
if (dbform->datconnlimit >= 0 &&
374+
AmRegularBackendProcess() &&
373375
!am_superuser &&
374376
CountDBConnections(MyDatabaseId) > dbform->datconnlimit)
375377
ereport(FATAL,
@@ -763,23 +765,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
763765
else
764766
{
765767
InitializeSessionUserId(username, useroid);
766-
767-
/*
768-
* In a parallel worker, set am_superuser based on the
769-
* authenticated user ID, not the current role. This is pretty
770-
* dubious but it matches our historical behavior. Note that this
771-
* value of am_superuser is used only for connection-privilege
772-
* checks here and in CheckMyDatabase (we won't reach
773-
* process_startup_options in a background worker).
774-
*
775-
* In other cases, there's been no opportunity for the current
776-
* role to diverge from the authenticated user ID yet, so we can
777-
* just rely on superuser() and avoid an extra catalog lookup.
778-
*/
779-
if (InitializingParallelWorker)
780-
am_superuser = superuser_arg(GetAuthenticatedUserId());
781-
else
782-
am_superuser = superuser();
768+
am_superuser = superuser();
783769
}
784770
}
785771
else
@@ -820,11 +806,11 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
820806
}
821807

822808
/*
823-
* The last few connection slots are reserved for superusers. Replication
824-
* connections are drawn from slots reserved with max_wal_senders and not
825-
* limited by max_connections or superuser_reserved_connections.
809+
* The last few regular connection slots are reserved for superusers. We
810+
* do not apply this limit to background processes, since they all have
811+
* their own pools of PGPROC slots.
826812
*/
827-
if (!am_superuser && !am_walsender &&
813+
if (AmRegularBackendProcess() && !am_superuser &&
828814
ReservedBackends > 0 &&
829815
!HaveNFreeProcs(ReservedBackends))
830816
ereport(FATAL,

src/include/miscadmin.h

+2
Original file line numberDiff line numberDiff line change
@@ -339,6 +339,8 @@ typedef enum BackendType
339339

340340
extern BackendType MyBackendType;
341341

342+
#define AmRegularBackendProcess() (MyBackendType == B_BACKEND)
343+
342344
extern const char *GetBackendTypeDesc(BackendType backendType);
343345

344346
extern void SetDatabasePath(const char *path);

src/include/storage/proc.h

+1-1
Original file line numberDiff line numberDiff line change
@@ -130,7 +130,7 @@ struct PGPROC
130130
Oid tempNamespaceId; /* OID of temp schema this backend is
131131
* using */
132132

133-
bool isBackgroundWorker; /* true if background worker. */
133+
bool isBackgroundWorker; /* true if not a regular backend. */
134134

135135
/*
136136
* While in hot standby mode, shows that a conflict signal has been sent

0 commit comments

Comments
 (0)