Skip to content

Commit bc08520

Browse files
committed
Change CREATE STATISTICS syntax
Previously, we had the WITH clause in the middle of the command, where you'd specify both generic options as well as statistic types. Few people liked this, so this commit changes it to remove the WITH keyword from that clause and makes it accept statistic types only. (We currently don't have any generic options, but if we invent in the future, we will gain a new WITH clause, probably at the end of the command). Also, the column list is now specified without parens, which makes the whole command look more similar to a SELECT command. This change will let us expand the command to supporting expressions (not just columns names) as well as multiple tables and their join conditions. Tom added lots of code comments and fixed some parts of the CREATE STATISTICS reference page, too; more changes in this area are forthcoming. He also fixed a potential problem in the alter_generic regression test, reducing verbosity on a cascaded drop to avoid dependency on message ordering, as we do in other tests. Tom also closed a security bug: we documented that table ownership was required in order to create a statistics object on it, but didn't actually implement it. Implement tab-completion for statistics objects. This can stand some more improvement. Authors: Alvaro Herrera, with lots of cleanup by Tom Lane Discussion: https://postgr.es/m/20170420212426.ltvgyhnefvhixm6i@alvherre.pgsql
1 parent 46052d9 commit bc08520

21 files changed

+321
-286
lines changed

doc/src/sgml/perform.sgml

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1132,8 +1132,8 @@ WHERE tablename = 'road';
11321132
To inspect functional dependencies on a statistics
11331133
<literal>stts</literal>, you may do this:
11341134
<programlisting>
1135-
CREATE STATISTICS stts WITH (dependencies)
1136-
ON (zip, city) FROM zipcodes;
1135+
CREATE STATISTICS stts (dependencies)
1136+
ON zip, city FROM zipcodes;
11371137
ANALYZE zipcodes;
11381138
SELECT stxname, stxkeys, stxdependencies
11391139
FROM pg_statistic_ext
@@ -1219,8 +1219,8 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 10;
12191219
Continuing the above example, the n-distinct coefficients in a ZIP
12201220
code table may look like the following:
12211221
<programlisting>
1222-
CREATE STATISTICS stts2 WITH (ndistinct)
1223-
ON (zip, state, city) FROM zipcodes;
1222+
CREATE STATISTICS stts2 (ndistinct)
1223+
ON zip, state, city FROM zipcodes;
12241224
ANALYZE zipcodes;
12251225
SELECT stxkeys AS k, stxndistinct AS nd
12261226
FROM pg_statistic_ext

doc/src/sgml/planstats.sgml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -526,7 +526,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
526526
multivariate statistics on the two columns:
527527

528528
<programlisting>
529-
CREATE STATISTICS stts WITH (dependencies) ON (a, b) FROM t;
529+
CREATE STATISTICS stts (dependencies) ON a, b FROM t;
530530
ANALYZE t;
531531
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
532532
QUERY PLAN
@@ -569,7 +569,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
569569
calculation, the estimate is much improved:
570570
<programlisting>
571571
DROP STATISTICS stts;
572-
CREATE STATISTICS stts WITH (dependencies, ndistinct) ON (a, b) FROM t;
572+
CREATE STATISTICS stts (dependencies, ndistinct) ON a, b FROM t;
573573
ANALYZE t;
574574
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
575575
QUERY PLAN

doc/src/sgml/ref/alter_statistics.sgml

Lines changed: 15 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ PostgreSQL documentation
1717
<refnamediv>
1818
<refname>ALTER STATISTICS</refname>
1919
<refpurpose>
20-
change the definition of a extended statistics
20+
change the definition of an extended statistics object
2121
</refpurpose>
2222
</refnamediv>
2323

@@ -34,19 +34,20 @@ ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <r
3434

3535
<para>
3636
<command>ALTER STATISTICS</command> changes the parameters of an existing
37-
extended statistics. Any parameters not specifically set in the
37+
extended statistics object. Any parameters not specifically set in the
3838
<command>ALTER STATISTICS</command> command retain their prior settings.
3939
</para>
4040

4141
<para>
42-
You must own the statistics to use <command>ALTER STATISTICS</>.
43-
To change a statistics' schema, you must also have <literal>CREATE</>
44-
privilege on the new schema.
42+
You must own the statistics object to use <command>ALTER STATISTICS</>.
43+
To change a statistics object's schema, you must also
44+
have <literal>CREATE</> privilege on the new schema.
4545
To alter the owner, you must also be a direct or indirect member of the new
4646
owning role, and that role must have <literal>CREATE</literal> privilege on
47-
the statistics' schema. (These restrictions enforce that altering the owner
48-
doesn't do anything you couldn't do by dropping and recreating the statistics.
49-
However, a superuser can alter ownership of any statistics anyway.)
47+
the statistics object's schema. (These restrictions enforce that altering
48+
the owner doesn't do anything you couldn't do by dropping and recreating
49+
the statistics object. However, a superuser can alter ownership of any
50+
statistics object anyway.)
5051
</para>
5152
</refsect1>
5253

@@ -59,7 +60,8 @@ ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <r
5960
<term><replaceable class="parameter">name</replaceable></term>
6061
<listitem>
6162
<para>
62-
The name (optionally schema-qualified) of the statistics to be altered.
63+
The name (optionally schema-qualified) of the statistics object to be
64+
altered.
6365
</para>
6466
</listitem>
6567
</varlistentry>
@@ -68,7 +70,7 @@ ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <r
6870
<term><replaceable class="PARAMETER">new_owner</replaceable></term>
6971
<listitem>
7072
<para>
71-
The user name of the new owner of the statistics.
73+
The user name of the new owner of the statistics object.
7274
</para>
7375
</listitem>
7476
</varlistentry>
@@ -77,7 +79,7 @@ ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <r
7779
<term><replaceable class="parameter">new_name</replaceable></term>
7880
<listitem>
7981
<para>
80-
The new name for the statistics.
82+
The new name for the statistics object.
8183
</para>
8284
</listitem>
8385
</varlistentry>
@@ -86,7 +88,7 @@ ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <r
8688
<term><replaceable class="parameter">new_schema</replaceable></term>
8789
<listitem>
8890
<para>
89-
The new schema for the statistics.
91+
The new schema for the statistics object.
9092
</para>
9193
</listitem>
9294
</varlistentry>
@@ -99,7 +101,7 @@ ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <r
99101
<title>Compatibility</title>
100102

101103
<para>
102-
There's no <command>ALTER STATISTICS</command> command in the SQL standard.
104+
There is no <command>ALTER STATISTICS</command> command in the SQL standard.
103105
</para>
104106
</refsect1>
105107

doc/src/sgml/ref/create_statistics.sgml

Lines changed: 34 additions & 51 deletions
Original file line numberDiff line numberDiff line change
@@ -22,8 +22,8 @@ PostgreSQL documentation
2222
<refsynopsisdiv>
2323
<synopsis>
2424
CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_name</replaceable>
25-
WITH ( <replaceable class="PARAMETER">option</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] )
26-
ON ( <replaceable class="PARAMETER">column_name</replaceable>, <replaceable class="PARAMETER">column_name</replaceable> [, ...])
25+
[ ( <replaceable class="PARAMETER">statistic_type</replaceable> [, ... ] ) ]
26+
ON <replaceable class="PARAMETER">column_name</replaceable>, <replaceable class="PARAMETER">column_name</replaceable> [, ...]
2727
FROM <replaceable class="PARAMETER">table_name</replaceable>
2828
</synopsis>
2929

@@ -34,17 +34,17 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
3434

3535
<para>
3636
<command>CREATE STATISTICS</command> will create a new extended statistics
37-
object on the specified table, foreign table or materialized view.
38-
The statistics will be created in the current database and
39-
will be owned by the user issuing the command.
37+
object tracking data about the specified table, foreign table or
38+
materialized view. The statistics object will be created in the current
39+
database and will be owned by the user issuing the command.
4040
</para>
4141

4242
<para>
4343
If a schema name is given (for example, <literal>CREATE STATISTICS
44-
myschema.mystat ...</>) then the statistics is created in the specified
45-
schema. Otherwise it is created in the current schema. The name of
46-
the statistics must be distinct from the name of any other statistics in the
47-
same schema.
44+
myschema.mystat ...</>) then the statistics object is created in the
45+
specified schema. Otherwise it is created in the current schema.
46+
The name of the statistics object must be distinct from the name of any
47+
other statistics object in the same schema.
4848
</para>
4949
</refsect1>
5050

@@ -57,10 +57,10 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
5757
<term><literal>IF NOT EXISTS</></term>
5858
<listitem>
5959
<para>
60-
Do not throw an error if a statistics with the same name already exists.
61-
A notice is issued in this case. Note that only the name of the
62-
statistics object is considered here. The definition of the statistics is
63-
not considered.
60+
Do not throw an error if a statistics object with the same name already
61+
exists. A notice is issued in this case. Note that only the name of
62+
the statistics object is considered here, not the details of its
63+
definition.
6464
</para>
6565
</listitem>
6666
</varlistentry>
@@ -69,67 +69,45 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
6969
<term><replaceable class="PARAMETER">statistics_name</replaceable></term>
7070
<listitem>
7171
<para>
72-
The name (optionally schema-qualified) of the statistics to be created.
72+
The name (optionally schema-qualified) of the statistics object to be
73+
created.
7374
</para>
7475
</listitem>
7576
</varlistentry>
7677

7778
<varlistentry>
78-
<term><replaceable class="PARAMETER">column_name</replaceable></term>
79+
<term><replaceable class="PARAMETER">statistic_type</replaceable></term>
7980
<listitem>
8081
<para>
81-
The name of a column to be included in the statistics.
82+
A statistic type to be computed in this statistics object. Currently
83+
supported types are <literal>ndistinct</literal>, which enables
84+
n-distinct coefficient tracking,
85+
and <literal>dependencies</literal>, which enables functional
86+
dependencies.
8287
</para>
8388
</listitem>
8489
</varlistentry>
8590

8691
<varlistentry>
87-
<term><replaceable class="PARAMETER">table_name</replaceable></term>
88-
<listitem>
89-
<para>
90-
The name (optionally schema-qualified) of the table the statistics should
91-
be created on.
92-
</para>
93-
</listitem>
94-
</varlistentry>
95-
96-
</variablelist>
97-
98-
<refsect2 id="SQL-CREATESTATISTICS-parameters">
99-
<title id="SQL-CREATESTATISTICS-parameters-title">Parameters</title>
100-
101-
<indexterm zone="sql-createstatistics-parameters">
102-
<primary>statistics parameters</primary>
103-
</indexterm>
104-
105-
<para>
106-
The <literal>WITH</> clause can specify <firstterm>options</>
107-
for the statistics. Available options are listed below.
108-
</para>
109-
110-
<variablelist>
111-
112-
<varlistentry>
113-
<term><literal>dependencies</> (<type>boolean</>)</term>
92+
<term><replaceable class="PARAMETER">column_name</replaceable></term>
11493
<listitem>
11594
<para>
116-
Enables functional dependencies for the statistics.
95+
The name of a table column to be included in the statistics object.
11796
</para>
11897
</listitem>
11998
</varlistentry>
12099

121100
<varlistentry>
122-
<term><literal>ndistinct</> (<type>boolean</>)</term>
101+
<term><replaceable class="PARAMETER">table_name</replaceable></term>
123102
<listitem>
124103
<para>
125-
Enables ndistinct coefficients for the statistics.
104+
The name (optionally schema-qualified) of the table containing the
105+
column(s) the statistics are computed on.
126106
</para>
127107
</listitem>
128108
</varlistentry>
129109

130-
</variablelist>
131-
132-
</refsect2>
110+
</variablelist>
133111
</refsect1>
134112

135113
<refsect1>
@@ -158,7 +136,7 @@ CREATE TABLE t1 (
158136
INSERT INTO t1 SELECT i/100, i/500
159137
FROM generate_series(1,1000000) s(i);
160138

161-
CREATE STATISTICS s1 WITH (dependencies) ON (a, b) FROM t1;
139+
CREATE STATISTICS s1 (dependencies) ON a, b FROM t1;
162140

163141
ANALYZE t1;
164142

@@ -168,6 +146,11 @@ EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 0);
168146
-- invalid combination of values
169147
EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 1);
170148
</programlisting>
149+
150+
Without functional-dependency statistics, the planner would make the
151+
same estimate of the number of matching rows for these two queries.
152+
With such statistics, it is able to tell that one case has matches
153+
and the other does not.
171154
</para>
172155

173156
</refsect1>
@@ -176,7 +159,7 @@ EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 1);
176159
<title>Compatibility</title>
177160

178161
<para>
179-
There's no <command>CREATE STATISTICS</command> command in the SQL standard.
162+
There is no <command>CREATE STATISTICS</command> command in the SQL standard.
180163
</para>
181164
</refsect1>
182165

doc/src/sgml/ref/drop_statistics.sgml

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -29,9 +29,9 @@ DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
2929
<title>Description</title>
3030

3131
<para>
32-
<command>DROP STATISTICS</command> removes statistics from the database.
33-
Only the statistics owner, the schema owner, and superuser can drop a
34-
statistics.
32+
<command>DROP STATISTICS</command> removes statistics object(s) from the
33+
database. Only the statistics object's owner, the schema owner, or a
34+
superuser can drop a statistics object.
3535
</para>
3636

3737
</refsect1>
@@ -44,8 +44,8 @@ DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
4444
<term><literal>IF EXISTS</literal></term>
4545
<listitem>
4646
<para>
47-
Do not throw an error if the statistics do not exist. A notice is
48-
issued in this case.
47+
Do not throw an error if the statistics object does not exist. A notice
48+
is issued in this case.
4949
</para>
5050
</listitem>
5151
</varlistentry>
@@ -54,7 +54,7 @@ DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
5454
<term><replaceable class="PARAMETER">name</replaceable></term>
5555
<listitem>
5656
<para>
57-
The name (optionally schema-qualified) of the statistics to drop.
57+
The name (optionally schema-qualified) of the statistics object to drop.
5858
</para>
5959
</listitem>
6060
</varlistentry>
@@ -66,7 +66,7 @@ DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
6666
<title>Examples</title>
6767

6868
<para>
69-
To destroy two statistics objects on different schemas, without failing
69+
To destroy two statistics objects in different schemas, without failing
7070
if they don't exist:
7171

7272
<programlisting>
@@ -82,7 +82,7 @@ DROP STATISTICS IF EXISTS
8282
<title>Compatibility</title>
8383

8484
<para>
85-
There's no <command>DROP STATISTICS</command> command in the SQL standard.
85+
There is no <command>DROP STATISTICS</command> command in the SQL standard.
8686
</para>
8787
</refsect1>
8888

0 commit comments

Comments
 (0)