You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The `pg_pathman` module provides optimized partitioning mechanism and functions to manage partitions.
6
7
8
+
The extension is compatible with PostgreSQL 9.5 (9.6 support is coming soon).
9
+
7
10
## Overview
8
11
**Partitioning** means splitting one large table into smaller pieces. Each row in such table is moved to a single partition according to the partitioning key. PostgreSQL supports partitioning via table inheritance: each partition must be created as a child table with CHECK CONSTRAINT. For example:
9
12
@@ -29,16 +32,15 @@ WHERE id = 150
29
32
Based on the partitioning type and condition's operator, `pg_pathman` searches for the corresponding partitions and builds the plan. Currently `pg_pathman` supports two partitioning schemes:
30
33
31
34
***RANGE** - maps rows to partitions using partitioning key ranges assigned to each partition. Optimization is achieved by using the binary search algorithm;
32
-
***HASH** - maps rows to partitions using a generic hash function (only *integer* attributes are supported at the moment).
35
+
***HASH** - maps rows to partitions using a generic hash function.
33
36
34
37
More interesting features are yet to come. Stay tuned!
35
38
36
39
## Roadmap
37
-
* Replace INSERT triggers with a custom node (aka **PartitionFilter**)
38
-
* Implement [concurrent partitioning](https://github.com/postgrespro/pg_pathman/tree/concurrent_part) (much more responsive)
39
-
* Implement HASH partitioning for non-integer attributes
40
-
* Optimize hash join (both tables are partitioned by join key)
41
-
* Implement LIST partitioning scheme
40
+
41
+
* Provide a way to create user-defined partition creation\destruction callbacks (issue [#22](https://github.com/postgrespro/pg_pathman/issues/22))
42
+
* Implement LIST partitioning scheme;
43
+
* Optimize hash join (both tables are partitioned by join key).
42
44
43
45
## Installation guide
44
46
To install `pg_pathman`, execute this in the module's directory:
@@ -62,77 +64,99 @@ Done! Now it's time to setup your partitioning schemes.
62
64
63
65
### Partition creation
64
66
```plpgsql
65
-
create_hash_partitions(relation TEXT,
67
+
create_hash_partitions(relation REGCLASS,
66
68
attribute TEXT,
67
-
partitions_count INTEGER)
69
+
partitions_count INTEGER,
70
+
partition_name TEXT DEFAULT NULL)
68
71
```
69
-
Performs HASH partitioning for `relation` by integer key `attribute`. Creates `partitions_count` partitions and trigger on INSERT. All the data will be automatically copied from the parent to partitions.
72
+
Performs HASH partitioning for `relation` by integer key `attribute`. The `partitions_count` parameter specifies the number of partitions to create; it cannot be changed afterwards. If `partition_data` is `true` then all the data will be automatically copied from the parent table to partitions. Note that data migration may took a while to finish and the table will be locked until transaction commits. See `partition_table_concurrently()` for a lock-free way to migrate data.
73
+
74
+
```plpgsql
75
+
create_range_partitions(relation REGCLASS,
76
+
attribute TEXT,
77
+
start_value ANYELEMENT,
78
+
interval ANYELEMENT,
79
+
count INTEGER DEFAULT NULL
80
+
partition_data BOOLEAN DEFAULT true)
81
+
82
+
create_range_partitions(relation REGCLASS,
83
+
attribute TEXT,
84
+
start_value ANYELEMENT,
85
+
interval INTERVAL,
86
+
count INTEGER DEFAULT NULL,
87
+
partition_data BOOLEAN DEFAULT true)
88
+
```
89
+
Performs RANGE partitioning for `relation` by partitioning key `attribute`. `start_value` argument specifies initial value, `interval` sets the range of values in a single partition, `count` is the number of premade partitions (if not set then pathman tries to determine it based on attribute values).
70
90
71
91
```plpgsql
72
-
create_range_partitions(relation TEXT,
73
-
attribute TEXT,
74
-
start_value ANYELEMENT,
75
-
interval ANYELEMENT,
76
-
premake INTEGER DEFAULT NULL)
92
+
create_partitions_from_range(relation REGCLASS,
93
+
attribute TEXT,
94
+
start_value ANYELEMENT,
95
+
end_value ANYELEMENT,
96
+
interval ANYELEMENT,
97
+
partition_data BOOLEAN DEFAULT true)
77
98
78
-
create_range_partitions(relation TEXT,
79
-
attribute TEXT,
80
-
start_value ANYELEMENT,
81
-
interval INTERVAL,
82
-
premake INTEGER DEFAULT NULL)
99
+
create_partitions_from_range(relation REGCLASS,
100
+
attribute TEXT,
101
+
start_value ANYELEMENT,
102
+
end_value ANYELEMENT,
103
+
interval INTERVAL,
104
+
partition_data BOOLEAN DEFAULT true)
83
105
```
84
-
Performs RANGE partitioning for `relation` by partitioning key `attribute`. `start_value` argument specifies initial value, `interval` sets the range of values in a single partition, `premake` is the number of premade partitions (if not set then pathman tries to determine it based on attribute values). All the data will be automatically copied from the parent to partitions.
106
+
Performs RANGE-partitioning from specified range for `relation` by partitioning key `attribute`.
107
+
108
+
### Data migration
85
109
86
110
```plpgsql
87
-
create_partitions_from_range(relation TEXT,
88
-
attribute TEXT,
89
-
start_value ANYELEMENT,
90
-
end_value ANYELEMENT,
91
-
interval ANYELEMENT)
111
+
partition_table_concurrently(relation REGCLASS)
112
+
```
113
+
Starts a background worker to move data from parent table to partitions. The worker utilizes short transactions to copy small batches of data (up to 10K rows per transaction) and thus doesn't significantly interfere with user's activity.
92
114
93
-
create_partitions_from_range(relation TEXT,
94
-
attribute TEXT,
95
-
start_value ANYELEMENT,
96
-
end_value ANYELEMENT,
97
-
interval INTERVAL)
115
+
```plpgsql
116
+
stop_concurrent_part_task(relation REGCLASS)
98
117
```
99
-
Performs RANGE-partitioning from specified range for `relation` by partitioning key `attribute`. Data will be copied to partitions as well.
118
+
Stops a background worker performing a concurrent partitioning task. Note: worker will exit after it finishes relocating a current batch.
100
119
101
120
### Triggers
102
121
```plpgsql
103
-
create_hash_update_trigger(parent TEXT)
122
+
create_hash_update_trigger(parent REGCLASS)
104
123
```
105
124
Creates the trigger on UPDATE for HASH partitions. The UPDATE trigger isn't created by default because of the overhead. It's useful in cases when the key attribute might change.
106
125
```plpgsql
107
-
create_range_update_trigger(parent TEXT)
126
+
create_range_update_trigger(parent REGCLASS)
108
127
```
109
128
Same as above, but for a RANGE-partitioned table.
110
129
111
130
### Post-creation partition management
112
131
```plpgsql
113
-
split_range_partition(partition TEXT, value ANYELEMENT)
Attach partition to the existing RANGE-partitioned relation. The attached table must have exactly the same structure as the parent table, including the dropped columns.
151
175
152
176
```plpgsql
153
-
detach_range_partition(partition TEXT)
177
+
detach_range_partition(partition REGCLASS)
154
178
```
155
179
Detach partition from the existing RANGE-partitioned relation.
156
180
157
181
```plpgsql
158
-
disable_partitioning(relation TEXT)
182
+
disable_pathman_for(relation TEXT)
159
183
```
160
184
Permanently disable `pg_pathman` partitioning mechanism for the specified parent table and remove the insert trigger if it exists. All partitions and data remain unchanged.
161
185
186
+
```plpgsql
187
+
drop_partitions(parent REGCLASS,
188
+
delete_data BOOLEAN DEFAULT FALSE)
189
+
```
190
+
Drop partitions of the `parent` table. If `delete_data` is `false` then the data is copied to the parent table first. Default is `false`.
191
+
192
+
193
+
### Additional parameters
194
+
195
+
```plpgsql
196
+
enable_parent(relation REGCLASS)
197
+
disable_parent(relation REGCLASS)
198
+
```
199
+
Include/exclude parent table into/from query plan. In original PostgreSQL planner parent table is always included into query plan even if it's empty which can lead to additional overhead. You can use `disable_parent()` if you are never going to use parent table as a storage. Default value depends on the `partition_data` parameter that was specified during initial partitioning in `create_range_partitions()` or `create_partitions_from_range()` functions. If the `partition_data` parameter was `true` then all data have already been migrated to partitions and parent table disabled. Otherwise it is enabled.
200
+
201
+
```plpgsql
202
+
enable_auto(relation REGCLASS)
203
+
disable_auto(relation REGCLASS)
204
+
```
205
+
Enable/disable auto partition propagation (only for RANGE partitioning). It is enabled by default.
206
+
162
207
## Custom plan nodes
163
208
`pg_pathman` provides a couple of [custom plan nodes](https://wiki.postgresql.org/wiki/CustomScanAPI) which aim to reduce execution time, namely:
164
209
165
210
-`RuntimeAppend` (overrides `Append` plan node)
166
211
-`RuntimeMergeAppend` (overrides `MergeAppend` plan node)
212
+
-`PartitionFilter` (drop-in replacement for INSERT triggers)
213
+
214
+
`PartitionFilter` acts as a *proxy node* for INSERT's child scan, which means it can redirect output tuples to the corresponding partition:
215
+
216
+
```
217
+
EXPLAIN (COSTS OFF)
218
+
INSERT INTO partitioned_table
219
+
SELECT generate_series(1, 10), random();
220
+
QUERY PLAN
221
+
-----------------------------------------
222
+
Insert on partitioned_table
223
+
-> Custom Scan (PartitionFilter)
224
+
-> Subquery Scan on "*SELECT*"
225
+
-> Result
226
+
(4 rows)
227
+
```
167
228
168
229
`RuntimeAppend` and `RuntimeMergeAppend` have much in common: they come in handy in a case when WHERE condition takes form of:
169
230
```
@@ -290,6 +351,15 @@ SELECT tableoid::regclass AS partition, * FROM partitioned_table;
290
351
291
352
- Though indices on a parent table aren't particularly useful (since it's empty), they act as prototypes for indices on partitions. For each index on the parent table, `pg_pathman` will create a similar index on every partition.
292
353
354
+
- All running concurrent partitioning tasks can be listed using the `pathman_concurrent_part_tasks` view:
0 commit comments