Skip to content

Commit e7728cf

Browse files
committed
Merge commit '44d88c6a4d2c792d66940c267a7c4d916b4bfdee' into PGPRO9_5
2 parents ba2f892 + 44d88c6 commit e7728cf

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

66 files changed

+13874
-4780
lines changed

contrib/pg_pathman/.gitignore

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,9 @@
11
.deps
2+
isolation_output
23
results/pg_pathman.out
34
regression.diffs
45
regression.out
56
*.o
67
*.so
8+
*.pyc
79
pg_pathman--*.sql

contrib/pg_pathman/.travis.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,8 @@ before_install:
1414
- sudo sh ./travis/apt.postgresql.org.sh
1515

1616
env:
17+
- PGVERSION=9.6 CHECK_CODE=true
18+
- PGVERSION=9.6 CHECK_CODE=false
1719
- PGVERSION=9.5 CHECK_CODE=true
1820
- PGVERSION=9.5 CHECK_CODE=false
1921

contrib/pg_pathman/Makefile

Lines changed: 15 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,15 +1,22 @@
11
# contrib/pg_pathman/Makefile
22

33
MODULE_big = pg_pathman
4-
OBJS = src/init.o src/utils.o src/runtimeappend.o src/runtime_merge_append.o src/pg_pathman.o src/dsm_array.o \
5-
src/rangeset.o src/pl_funcs.o src/worker.o src/hooks.o src/nodes_common.o $(WIN32RES)
4+
OBJS = src/init.o src/relation_info.o src/utils.o src/partition_filter.o \
5+
src/runtimeappend.o src/runtime_merge_append.o src/pg_pathman.o src/rangeset.o \
6+
src/pl_funcs.o src/pl_range_funcs.o src/pl_hash_funcs.o src/pathman_workers.o \
7+
src/hooks.o src/nodes_common.o src/xact_handling.o src/copy_stmt_hooking.o \
8+
src/pg_compat.o $(WIN32RES)
69

710
EXTENSION = pg_pathman
8-
EXTVERSION = 0.1
11+
EXTVERSION = 1.0
912
DATA_built = $(EXTENSION)--$(EXTVERSION).sql
1013
PGFILEDESC = "pg_pathman - partitioning tool"
1114

12-
REGRESS = pg_pathman
15+
REGRESS = pathman_basic \
16+
pathman_runtime_nodes \
17+
pathman_callbacks \
18+
pathman_domains \
19+
pathman_foreign_keys
1320
EXTRA_REGRESS_OPTS=--temp-config=$(top_srcdir)/$(subdir)/conf.add
1421
EXTRA_CLEAN = $(EXTENSION)--$(EXTVERSION).sql ./isolation_output
1522

@@ -27,14 +34,14 @@ endif
2734
$(EXTENSION)--$(EXTVERSION).sql: init.sql hash.sql range.sql
2835
cat $^ > $@
2936

30-
ISOLATIONCHECKS=insert_trigger rollback_on_create_partitions
37+
ISOLATIONCHECKS=insert_nodes for_update rollback_on_create_partitions
3138

3239
submake-isolation:
3340
$(MAKE) -C $(top_builddir)/src/test/isolation all
3441

3542
isolationcheck: | submake-isolation
3643
$(MKDIR_P) isolation_output
3744
$(pg_isolation_regress_check) \
38-
--temp-config=$(top_srcdir)/$(subdir)/conf.add \
39-
--outputdir=./isolation_output \
40-
$(ISOLATIONCHECKS)
45+
--temp-config=$(top_srcdir)/$(subdir)/conf.add \
46+
--outputdir=./isolation_output \
47+
$(ISOLATIONCHECKS)

contrib/pg_pathman/README.md

Lines changed: 121 additions & 50 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,12 @@
11
[![Build Status](https://travis-ci.org/postgrespro/pg_pathman.svg?branch=master)](https://travis-ci.org/postgrespro/pg_pathman)
2+
[![PGXN version](https://badge.fury.io/pg/pg_pathman.svg)](https://badge.fury.io/pg/pg_pathman)
23

34
# pg_pathman
45

56
The `pg_pathman` module provides optimized partitioning mechanism and functions to manage partitions.
67

8+
The extension is compatible with PostgreSQL 9.5 (9.6 support is coming soon).
9+
710
## Overview
811
**Partitioning** means splitting one large table into smaller pieces. Each row in such table is moved to a single partition according to the partitioning key. PostgreSQL supports partitioning via table inheritance: each partition must be created as a child table with CHECK CONSTRAINT. For example:
912

@@ -29,16 +32,15 @@ WHERE id = 150
2932
Based on the partitioning type and condition's operator, `pg_pathman` searches for the corresponding partitions and builds the plan. Currently `pg_pathman` supports two partitioning schemes:
3033

3134
* **RANGE** - maps rows to partitions using partitioning key ranges assigned to each partition. Optimization is achieved by using the binary search algorithm;
32-
* **HASH** - maps rows to partitions using a generic hash function (only *integer* attributes are supported at the moment).
35+
* **HASH** - maps rows to partitions using a generic hash function.
3336

3437
More interesting features are yet to come. Stay tuned!
3538

3639
## Roadmap
37-
* Replace INSERT triggers with a custom node (aka **PartitionFilter**)
38-
* Implement [concurrent partitioning](https://github.com/postgrespro/pg_pathman/tree/concurrent_part) (much more responsive)
39-
* Implement HASH partitioning for non-integer attributes
40-
* Optimize hash join (both tables are partitioned by join key)
41-
* Implement LIST partitioning scheme
40+
41+
* Provide a way to create user-defined partition creation\destruction callbacks (issue [#22](https://github.com/postgrespro/pg_pathman/issues/22))
42+
* Implement LIST partitioning scheme;
43+
* Optimize hash join (both tables are partitioned by join key).
4244

4345
## Installation guide
4446
To install `pg_pathman`, execute this in the module's directory:
@@ -62,77 +64,99 @@ Done! Now it's time to setup your partitioning schemes.
6264

6365
### Partition creation
6466
```plpgsql
65-
create_hash_partitions(relation TEXT,
67+
create_hash_partitions(relation REGCLASS,
6668
attribute TEXT,
67-
partitions_count INTEGER)
69+
partitions_count INTEGER,
70+
partition_name TEXT DEFAULT NULL)
6871
```
69-
Performs HASH partitioning for `relation` by integer key `attribute`. Creates `partitions_count` partitions and trigger on INSERT. All the data will be automatically copied from the parent to partitions.
72+
Performs HASH partitioning for `relation` by integer key `attribute`. The `partitions_count` parameter specifies the number of partitions to create; it cannot be changed afterwards. If `partition_data` is `true` then all the data will be automatically copied from the parent table to partitions. Note that data migration may took a while to finish and the table will be locked until transaction commits. See `partition_table_concurrently()` for a lock-free way to migrate data.
73+
74+
```plpgsql
75+
create_range_partitions(relation REGCLASS,
76+
attribute TEXT,
77+
start_value ANYELEMENT,
78+
interval ANYELEMENT,
79+
count INTEGER DEFAULT NULL
80+
partition_data BOOLEAN DEFAULT true)
81+
82+
create_range_partitions(relation REGCLASS,
83+
attribute TEXT,
84+
start_value ANYELEMENT,
85+
interval INTERVAL,
86+
count INTEGER DEFAULT NULL,
87+
partition_data BOOLEAN DEFAULT true)
88+
```
89+
Performs RANGE partitioning for `relation` by partitioning key `attribute`. `start_value` argument specifies initial value, `interval` sets the range of values in a single partition, `count` is the number of premade partitions (if not set then pathman tries to determine it based on attribute values).
7090

7191
```plpgsql
72-
create_range_partitions(relation TEXT,
73-
attribute TEXT,
74-
start_value ANYELEMENT,
75-
interval ANYELEMENT,
76-
premake INTEGER DEFAULT NULL)
92+
create_partitions_from_range(relation REGCLASS,
93+
attribute TEXT,
94+
start_value ANYELEMENT,
95+
end_value ANYELEMENT,
96+
interval ANYELEMENT,
97+
partition_data BOOLEAN DEFAULT true)
7798

78-
create_range_partitions(relation TEXT,
79-
attribute TEXT,
80-
start_value ANYELEMENT,
81-
interval INTERVAL,
82-
premake INTEGER DEFAULT NULL)
99+
create_partitions_from_range(relation REGCLASS,
100+
attribute TEXT,
101+
start_value ANYELEMENT,
102+
end_value ANYELEMENT,
103+
interval INTERVAL,
104+
partition_data BOOLEAN DEFAULT true)
83105
```
84-
Performs RANGE partitioning for `relation` by partitioning key `attribute`. `start_value` argument specifies initial value, `interval` sets the range of values in a single partition, `premake` is the number of premade partitions (if not set then pathman tries to determine it based on attribute values). All the data will be automatically copied from the parent to partitions.
106+
Performs RANGE-partitioning from specified range for `relation` by partitioning key `attribute`.
107+
108+
### Data migration
85109

86110
```plpgsql
87-
create_partitions_from_range(relation TEXT,
88-
attribute TEXT,
89-
start_value ANYELEMENT,
90-
end_value ANYELEMENT,
91-
interval ANYELEMENT)
111+
partition_table_concurrently(relation REGCLASS)
112+
```
113+
Starts a background worker to move data from parent table to partitions. The worker utilizes short transactions to copy small batches of data (up to 10K rows per transaction) and thus doesn't significantly interfere with user's activity.
92114

93-
create_partitions_from_range(relation TEXT,
94-
attribute TEXT,
95-
start_value ANYELEMENT,
96-
end_value ANYELEMENT,
97-
interval INTERVAL)
115+
```plpgsql
116+
stop_concurrent_part_task(relation REGCLASS)
98117
```
99-
Performs RANGE-partitioning from specified range for `relation` by partitioning key `attribute`. Data will be copied to partitions as well.
118+
Stops a background worker performing a concurrent partitioning task. Note: worker will exit after it finishes relocating a current batch.
100119

101120
### Triggers
102121
```plpgsql
103-
create_hash_update_trigger(parent TEXT)
122+
create_hash_update_trigger(parent REGCLASS)
104123
```
105124
Creates the trigger on UPDATE for HASH partitions. The UPDATE trigger isn't created by default because of the overhead. It's useful in cases when the key attribute might change.
106125
```plpgsql
107-
create_range_update_trigger(parent TEXT)
126+
create_range_update_trigger(parent REGCLASS)
108127
```
109128
Same as above, but for a RANGE-partitioned table.
110129

111130
### Post-creation partition management
112131
```plpgsql
113-
split_range_partition(partition TEXT, value ANYELEMENT)
132+
split_range_partition(partition REGCLASS,
133+
value ANYELEMENT,
134+
partition_name TEXT DEFAULT NULL,)
114135
```
115136
Split RANGE `partition` in two by `value`.
116137

117138
```plpgsql
118-
merge_range_partitions(partition1 TEXT, partition2 TEXT)
139+
merge_range_partitions(partition1 REGCLASS, partition2 REGCLASS)
119140
```
120141
Merge two adjacent RANGE partitions. First, data from `partition2` is copied to `partition1`, then `partition2` is removed.
121142

122143
```plpgsql
123-
append_range_partition(p_relation TEXT)
144+
append_range_partition(p_relation REGCLASS,
145+
partition_name TEXT DEFAULT NULL)
124146
```
125-
Append new RANGE partition.
147+
Append new RANGE partition with `pathman_config.range_interval` as interval.
126148

127149
```plpgsql
128-
prepend_range_partition(p_relation TEXT)
150+
prepend_range_partition(p_relation REGCLASS,
151+
partition_name TEXT DEFAULT NULL)
129152
```
130-
Prepend new RANGE partition.
153+
Prepend new RANGE partition with `pathman_config.range_interval` as interval.
131154

132155
```plpgsql
133-
add_range_partition(relation TEXT,
134-
start_value ANYELEMENT,
135-
end_value ANYELEMENT)
156+
add_range_partition(relation REGCLASS,
157+
start_value ANYELEMENT,
158+
end_value ANYELEMENT,
159+
partition_name TEXT DEFAULT NULL)
136160
```
137161
Create new RANGE partition for `relation` with specified range bounds.
138162

@@ -142,28 +166,65 @@ drop_range_partition(partition TEXT)
142166
Drop RANGE partition and all its data.
143167

144168
```plpgsql
145-
attach_range_partition(relation TEXT,
146-
partition TEXT,
169+
attach_range_partition(relation REGCLASS,
170+
partition REGCLASS,
147171
start_value ANYELEMENT,
148172
end_value ANYELEMENT)
149173
```
150174
Attach partition to the existing RANGE-partitioned relation. The attached table must have exactly the same structure as the parent table, including the dropped columns.
151175

152176
```plpgsql
153-
detach_range_partition(partition TEXT)
177+
detach_range_partition(partition REGCLASS)
154178
```
155179
Detach partition from the existing RANGE-partitioned relation.
156180

157181
```plpgsql
158-
disable_partitioning(relation TEXT)
182+
disable_pathman_for(relation TEXT)
159183
```
160184
Permanently disable `pg_pathman` partitioning mechanism for the specified parent table and remove the insert trigger if it exists. All partitions and data remain unchanged.
161185

186+
```plpgsql
187+
drop_partitions(parent REGCLASS,
188+
delete_data BOOLEAN DEFAULT FALSE)
189+
```
190+
Drop partitions of the `parent` table. If `delete_data` is `false` then the data is copied to the parent table first. Default is `false`.
191+
192+
193+
### Additional parameters
194+
195+
```plpgsql
196+
enable_parent(relation REGCLASS)
197+
disable_parent(relation REGCLASS)
198+
```
199+
Include/exclude parent table into/from query plan. In original PostgreSQL planner parent table is always included into query plan even if it's empty which can lead to additional overhead. You can use `disable_parent()` if you are never going to use parent table as a storage. Default value depends on the `partition_data` parameter that was specified during initial partitioning in `create_range_partitions()` or `create_partitions_from_range()` functions. If the `partition_data` parameter was `true` then all data have already been migrated to partitions and parent table disabled. Otherwise it is enabled.
200+
201+
```plpgsql
202+
enable_auto(relation REGCLASS)
203+
disable_auto(relation REGCLASS)
204+
```
205+
Enable/disable auto partition propagation (only for RANGE partitioning). It is enabled by default.
206+
162207
## Custom plan nodes
163208
`pg_pathman` provides a couple of [custom plan nodes](https://wiki.postgresql.org/wiki/CustomScanAPI) which aim to reduce execution time, namely:
164209

165210
- `RuntimeAppend` (overrides `Append` plan node)
166211
- `RuntimeMergeAppend` (overrides `MergeAppend` plan node)
212+
- `PartitionFilter` (drop-in replacement for INSERT triggers)
213+
214+
`PartitionFilter` acts as a *proxy node* for INSERT's child scan, which means it can redirect output tuples to the corresponding partition:
215+
216+
```
217+
EXPLAIN (COSTS OFF)
218+
INSERT INTO partitioned_table
219+
SELECT generate_series(1, 10), random();
220+
QUERY PLAN
221+
-----------------------------------------
222+
Insert on partitioned_table
223+
-> Custom Scan (PartitionFilter)
224+
-> Subquery Scan on "*SELECT*"
225+
-> Result
226+
(4 rows)
227+
```
167228

168229
`RuntimeAppend` and `RuntimeMergeAppend` have much in common: they come in handy in a case when WHERE condition takes form of:
169230
```
@@ -290,6 +351,15 @@ SELECT tableoid::regclass AS partition, * FROM partitioned_table;
290351

291352
- Though indices on a parent table aren't particularly useful (since it's empty), they act as prototypes for indices on partitions. For each index on the parent table, `pg_pathman` will create a similar index on every partition.
292353

354+
- All running concurrent partitioning tasks can be listed using the `pathman_concurrent_part_tasks` view:
355+
```plpgsql
356+
SELECT * FROM pathman_concurrent_part_tasks;
357+
userid | pid | dbid | relid | processed | status
358+
--------+------+-------+-------+-----------+---------
359+
dmitry | 7367 | 16384 | test | 472000 | working
360+
(1 row)
361+
```
362+
293363
### HASH partitioning
294364
Consider an example of HASH partitioning. First create a table with some integer column:
295365
```
@@ -419,17 +489,18 @@ There are several user-accessible [GUC](https://www.postgresql.org/docs/9.5/stat
419489
- `pg_pathman.enable` --- disable (or enable) `pg_pathman` completely
420490
- `pg_pathman.enable_runtimeappend` --- toggle `RuntimeAppend` custom node on\off
421491
- `pg_pathman.enable_runtimemergeappend` --- toggle `RuntimeMergeAppend` custom node on\off
492+
- `pg_pathman.enable_partitionfilter` --- toggle `PartitionFilter` custom node on\off
422493

423494
To **permanently** disable `pg_pathman` for some previously partitioned table, use the `disable_partitioning()` function:
424495
```
425-
SELECT disable_partitioning('range_rel');
496+
SELECT disable_pathman_for('range_rel');
426497
```
427498
All sections and data will remain unchanged and will be handled by the standard PostgreSQL inheritance mechanism.
428499

429500
##Feedback
430501
Do not hesitate to post your issues, questions and new ideas at the [issues](https://github.com/postgrespro/pg_pathman/issues) page.
431502

432503
## Authors
433-
Ildar Musin <i.musin@postgrespro.ru> Postgres Professional Ltd., Russia
434-
Alexander Korotkov <a.korotkov@postgrespro.ru> Postgres Professional Ltd., Russia
435-
Dmitry Ivanov <d.ivanov@postgrespro.ru> Postgres Professional Ltd., Russia
504+
Ildar Musin <i.musin@postgrespro.ru> Postgres Professional Ltd., Russia
505+
Alexander Korotkov <a.korotkov@postgrespro.ru> Postgres Professional Ltd., Russia
506+
Dmitry Ivanov <d.ivanov@postgrespro.ru> Postgres Professional Ltd., Russia

0 commit comments

Comments
 (0)