Skip to content

Commit 787514b

Browse files
michaelpqsamimseih
andcommitted
Use relation name instead of OID in query jumbling for RangeTblEntry
custom_query_jumble (introduced in 5ac462e as a node field attribute) is now assigned to the expanded reference name "eref" of RangeTblEntry, adding in the query jumble computation the non-qualified aliased relation name, without the list of column names. The relation OID is removed from the query jumbling. The effects of this change can be seen in the tests added by 3430215, where pg_stat_statements (PGSS) entries are now grouped using the relation name, ignoring the relation search_path may point at. For example, these two relations are different, but are now grouped in a single PGSS entry as they are assigned the same query ID: CREATE TABLE foo1.tab (a int); CREATE TABLE foo2.tab (b int); SET search_path = 'foo1'; SELECT count(*) FROM tab; SET search_path = 'foo2'; SELECT count(*) FROM tab; SELECT count(*) FROM foo1.tab; SELECT count(*) FROM foo2.tab; SELECT query, calls FROM pg_stat_statements WHERE query ~ 'FROM tab'; query | calls --------------------------+------- SELECT count(*) FROM tab | 4 (1 row) It is still possible to use an alias in the FROM clause to split these. This behavior is useful for relations re-created with the same name, where queries based on such relations would be grouped in the same PGSS entry. For permanent schemas, it should not really matter in practice. The main benefit is for workloads that use a lot of temporary relations, which are usually re-created with the same name continuously. These can be a heavy source of bloat in PGSS depending on the workload. Such entries can now be grouped together, improving the user experience. The original idea from Christoph Berg used catalog lookups to find temporary relations, something that the query jumble has never done, and it could cause some performance regressions. The idea to use RangeTblEntry.eref and the relation name, applying the same rules for all relations, temporary and not temporary, has been proposed by Tom Lane. The documentation additions have been suggested by Sami Imseih. Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
1 parent d2028e9 commit 787514b

File tree

4 files changed

+42
-17
lines changed

4 files changed

+42
-17
lines changed

contrib/pg_stat_statements/expected/select.out

Lines changed: 8 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -433,11 +433,10 @@ COMMIT;
433433
SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
434434
calls | query
435435
-------+------------------------------------------------------------------------
436-
1 | SELECT * FROM temp_t
437-
1 | SELECT * FROM temp_t
436+
2 | SELECT * FROM temp_t
438437
0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
439438
1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
440-
(4 rows)
439+
(3 rows)
441440

442441
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
443442
t
@@ -623,18 +622,15 @@ SELECT a AS a1 FROM pgss_schema_2.tab_search_diff_2;
623622
SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
624623
calls | query
625624
-------+------------------------------------------------------------------------
626-
3 | SELECT a FROM pgss_schema_2.tab_search_diff_2 AS t1
627-
9 | SELECT a FROM tab_search_diff_2 AS t1
628-
1 | SELECT a, b FROM pgss_schema_1.tab_search_same
629-
3 | SELECT a, b FROM tab_search_same
625+
8 | SELECT a FROM tab_search_diff_2
626+
4 | SELECT a FROM tab_search_diff_2 AS t1
627+
4 | SELECT a, b FROM tab_search_same
630628
0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
631-
1 | SELECT count(*) FROM pgss_schema_1.tab_search_same
632-
1 | SELECT count(*) FROM pgss_schema_2.tab_search_diff_1
633-
3 | SELECT count(*) FROM tab_search_diff_1
629+
4 | SELECT count(*) FROM tab_search_diff_1
634630
4 | SELECT count(*) FROM tab_search_diff_2
635-
3 | SELECT count(*) FROM tab_search_same
631+
4 | SELECT count(*) FROM tab_search_same
636632
1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
637-
(11 rows)
633+
(8 rows)
638634

639635
DROP SCHEMA pgss_schema_1 CASCADE;
640636
NOTICE: drop cascades to 3 other objects

doc/src/sgml/pgstatstatements.sgml

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -675,8 +675,13 @@ calls | 2
675675
things, the internal object identifiers appearing in this representation.
676676
This has some counterintuitive implications. For example,
677677
<filename>pg_stat_statements</filename> will consider two apparently-identical
678-
queries to be distinct, if they reference a table that was dropped
679-
and recreated between the executions of the two queries.
678+
queries to be distinct, if they reference for example a function that was
679+
dropped and recreated between the executions of the two queries.
680+
Conversely, if a table is dropped and recreated between the
681+
executions of queries, two apparently-identical queries may be
682+
considered the same. However, if the alias for a table is different
683+
for otherwise-similar queries, these queries will be considered
684+
distinct.
680685
The hashing process is also sensitive to differences in
681686
machine architecture and other facets of the platform.
682687
Furthermore, it is not safe to assume that <structfield>queryid</structfield>

src/backend/nodes/queryjumblefuncs.c

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -67,6 +67,9 @@ static void _jumbleElements(JumbleState *jstate, List *elements);
6767
static void _jumbleA_Const(JumbleState *jstate, Node *node);
6868
static void _jumbleList(JumbleState *jstate, Node *node);
6969
static void _jumbleVariableSetStmt(JumbleState *jstate, Node *node);
70+
static void _jumbleRangeTblEntry_eref(JumbleState *jstate,
71+
RangeTblEntry *rte,
72+
Alias *expr);
7073

7174
/*
7275
* Given a possibly multi-statement source string, confine our attention to the
@@ -516,3 +519,19 @@ _jumbleVariableSetStmt(JumbleState *jstate, Node *node)
516519
JUMBLE_FIELD(is_local);
517520
JUMBLE_LOCATION(location);
518521
}
522+
523+
/*
524+
* Custom query jumble function for RangeTblEntry.eref.
525+
*/
526+
static void
527+
_jumbleRangeTblEntry_eref(JumbleState *jstate,
528+
RangeTblEntry *rte,
529+
Alias *expr)
530+
{
531+
JUMBLE_FIELD(type);
532+
533+
/*
534+
* This includes only the table name, the list of column names is ignored.
535+
*/
536+
JUMBLE_STRING(aliasname);
537+
}

src/include/nodes/parsenodes.h

Lines changed: 8 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1050,8 +1050,13 @@ typedef struct RangeTblEntry
10501050
*/
10511051
/* user-written alias clause, if any */
10521052
Alias *alias pg_node_attr(query_jumble_ignore);
1053-
/* expanded reference names */
1054-
Alias *eref pg_node_attr(query_jumble_ignore);
1053+
1054+
/*
1055+
* Expanded reference names. This uses a custom query jumble function so
1056+
* that the table name is included in the computation, but not its list of
1057+
* columns.
1058+
*/
1059+
Alias *eref pg_node_attr(custom_query_jumble);
10551060

10561061
RTEKind rtekind; /* see above */
10571062

@@ -1094,7 +1099,7 @@ typedef struct RangeTblEntry
10941099
* tables to be invalidated if the underlying table is altered.
10951100
*/
10961101
/* OID of the relation */
1097-
Oid relid;
1102+
Oid relid pg_node_attr(query_jumble_ignore);
10981103
/* inheritance requested? */
10991104
bool inh;
11001105
/* relation kind (see pg_class.relkind) */

0 commit comments

Comments
 (0)