|
| 1 | +<sect1 id="pathman"> |
| 2 | + <title>pathman</title> |
| 3 | + <para> |
| 4 | + The <literal>pathman</literal> module provides optimized |
| 5 | + partitioning mechanism and functions to manage partitions. |
| 6 | + </para> |
| 7 | + <sect2 id="pathman-concepts"> |
| 8 | + <title>pathman Concepts</title> |
| 9 | + <para> |
| 10 | + Partitioning refers to splitting one large table into smaller |
| 11 | + pieces. Each row in such table assigns to a single partition based |
| 12 | + on partitioning key. Common partitioning strategies are: |
| 13 | + </para> |
| 14 | + <itemizedlist spacing="compact"> |
| 15 | + <listitem> |
| 16 | + <para> |
| 17 | + HASH - maps rows to partitions based on hash function values; |
| 18 | + </para> |
| 19 | + </listitem> |
| 20 | + <listitem> |
| 21 | + <para> |
| 22 | + RANGE - maps data to partitions based on ranges that you |
| 23 | + establish for each partition; |
| 24 | + </para> |
| 25 | + </listitem> |
| 26 | + <listitem> |
| 27 | + <para> |
| 28 | + LIST - maps data to partitions based on explicitly specified |
| 29 | + values of partitioning key for each partition. |
| 30 | + </para> |
| 31 | + </listitem> |
| 32 | + </itemizedlist> |
| 33 | + <para> |
| 34 | + &productname; supports partitioning via table inheritance. Each |
| 35 | + partition must be created as child table with CHECK CONSTRAINT. |
| 36 | + For example: |
| 37 | + </para> |
| 38 | + <programlisting> |
| 39 | +CHECK ( id >= 100 AND id < 200 ) |
| 40 | +CHECK ( id >= 200 AND id < 300 ) |
| 41 | +</programlisting> |
| 42 | + <para> |
| 43 | + Despite the flexibility of this approach it has weakness. If query |
| 44 | + uses filtering the optimizer forced to perform an exhaustive |
| 45 | + search and check constraints for each partition to determine |
| 46 | + partitions from which it should select data. If the number of |
| 47 | + partitions is large the overhead may be significant. |
| 48 | + </para> |
| 49 | + <para> |
| 50 | + The <literal>pathman</literal> module provides functions to manage |
| 51 | + partitions and partitioning mechanism optimized based on knowledge |
| 52 | + of the partitions structure. It stores partitioning configuration |
| 53 | + in the <literal>pathman_config</literal> table, each row of which |
| 54 | + contains single entry for partitioned table (relation name, |
| 55 | + partitioning key and type). During initialization the |
| 56 | + <literal>pathman</literal> module caches information about child |
| 57 | + partitions in shared memory in form convenient to perform rapid |
| 58 | + search. When user executes SELECT query pathman analyzes |
| 59 | + conditions tree looking for conditions like: |
| 60 | + </para> |
| 61 | + <programlisting> |
| 62 | +VARIABLE OP CONST |
| 63 | +</programlisting> |
| 64 | + <para> |
| 65 | + where <literal>VARIABLE</literal> is partitioning key, |
| 66 | + <literal>OP</literal> is comparison operator (supported operators |
| 67 | + are =, <, <=, >, >=), <literal>CONST</literal> is |
| 68 | + scalar value. For example: |
| 69 | + </para> |
| 70 | + <programlisting> |
| 71 | +WHERE id = 150 |
| 72 | +</programlisting> |
| 73 | + <para> |
| 74 | + Based on partitioning type and operator the |
| 75 | + <literal>pathman</literal> searches corresponding partitions and |
| 76 | + builds the plan. |
| 77 | + </para> |
| 78 | + </sect2> |
| 79 | + <sect2 id="pathman-installation"> |
| 80 | + <title>Installation</title> |
| 81 | + <para> |
| 82 | + To install pathman run: |
| 83 | + </para> |
| 84 | + <programlisting> |
| 85 | +CREATE EXTENSION pathman; |
| 86 | +</programlisting> |
| 87 | + <para> |
| 88 | + in your database Then modify shared_preload_libraries parameter |
| 89 | + in postgresql.conf as following: |
| 90 | + </para> |
| 91 | + <programlisting> |
| 92 | +shared_preload_libraries = 'pathman' |
| 93 | +</programlisting> |
| 94 | + <para> |
| 95 | + Then restart the &productname; instance. |
| 96 | + </para> |
| 97 | + </sect2> |
| 98 | + <sect2 id="pathman-functions"> |
| 99 | + <title>FUNCTIONS</title> |
| 100 | + <sect3 id="partitions-creation"> |
| 101 | + <title>Partitions Creation</title> |
| 102 | + <programlisting> |
| 103 | +create_hash_partitions( |
| 104 | + relation TEXT, |
| 105 | + attribute TEXT, |
| 106 | + partitions_count INTEGER) |
| 107 | +</programlisting> |
| 108 | + <para> |
| 109 | + Performs HASH partitioning for <literal>relation</literal> by |
| 110 | + integer key <literal>attribute</literal>. Creates |
| 111 | + <literal>partitions_count</literal> partitions and trigger on |
| 112 | + INSERT. Data doesn't automatically copied from parent table to |
| 113 | + partitions. Use <literal>partition_data()</literal> function |
| 114 | + (see below) to migrate data. |
| 115 | + </para> |
| 116 | + <programlisting> |
| 117 | +create_range_partitions( |
| 118 | + relation TEXT, |
| 119 | + attribute TEXT, |
| 120 | + start_value ANYELEMENT, |
| 121 | + interval ANYELEMENT, |
| 122 | + premake INTEGER) |
| 123 | +</programlisting> |
| 124 | + <para> |
| 125 | + Performs RANGE partitioning for <literal>relation</literal> by |
| 126 | + partitioning key <literal>attribute</literal>. |
| 127 | + <literal>start_value</literal> argument specifies initial value, |
| 128 | + <literal>interval</literal> sets the range of values in a single |
| 129 | + partition, <literal>premake</literal> is the number of premade |
| 130 | + partitions (the only one partition will be created if |
| 131 | + <literal>premake</literal> is 0). |
| 132 | + </para> |
| 133 | + <programlisting> |
| 134 | +create_range_partitions( |
| 135 | + relation TEXT, |
| 136 | + attribute TEXT, |
| 137 | + start_value ANYELEMENT, |
| 138 | + interval INTERVAL, |
| 139 | + premake INTEGER) |
| 140 | +</programlisting> |
| 141 | + <para> |
| 142 | + Same as above but suitable for <literal>DATE</literal> and |
| 143 | + <literal>TIMESTAMP</literal> partitioning keys. |
| 144 | + </para> |
| 145 | + </sect3> |
| 146 | + <sect3 id="data-migration"> |
| 147 | + <title>Data migration</title> |
| 148 | + <programlisting> |
| 149 | +partition_data(parent text) |
| 150 | +</programlisting> |
| 151 | + <para> |
| 152 | + Copies data from parent table to its partitions. |
| 153 | + </para> |
| 154 | + </sect3> |
| 155 | + <sect3 id="partitions-management"> |
| 156 | + <title>Partitions management</title> |
| 157 | + <programlisting> |
| 158 | +split_range_partition(partition TEXT, value ANYELEMENT) |
| 159 | +</programlisting> |
| 160 | + <para> |
| 161 | + Splits RANGE <literal>partition</literal> in two by |
| 162 | + <literal>value</literal>. |
| 163 | + </para> |
| 164 | + <programlisting> |
| 165 | +merge_range_partitions(partition1 TEXT, partition2 TEXT) |
| 166 | +</programlisting> |
| 167 | + <para> |
| 168 | + Merge two adjacent RANGE partitions. Data from |
| 169 | + <literal>partition2</literal> is copied to |
| 170 | + <literal>partition1</literal>. Then the |
| 171 | + <literal>partition2</literal> is removed. |
| 172 | + </para> |
| 173 | + <programlisting> |
| 174 | +append_partition(p_relation TEXT) |
| 175 | +</programlisting> |
| 176 | + <para> |
| 177 | + Appends new partition with the range equal to the range of the |
| 178 | + previous partition. |
| 179 | + </para> |
| 180 | + <programlisting> |
| 181 | +prepend_partition(p_relation TEXT) |
| 182 | +</programlisting> |
| 183 | + <para> |
| 184 | + Prepends new partition with the range equal to the range of the |
| 185 | + first partition. |
| 186 | + </para> |
| 187 | + <programlisting> |
| 188 | +disable_partitioning(relation TEXT) |
| 189 | +</programlisting> |
| 190 | + <para> |
| 191 | + Disables <literal>pathman</literal> partitioning mechanism for |
| 192 | + the specified parent table and removes an insert trigger. |
| 193 | + Partitions itself remain unchanged. |
| 194 | + </para> |
| 195 | + </sect3> |
| 196 | + </sect2> |
| 197 | + <sect2 id="examples"> |
| 198 | + <title>Examples</title> |
| 199 | + <sect3 id="hash"> |
| 200 | + <title>HASH</title> |
| 201 | + <para> |
| 202 | + Consider an example of HASH partitioning. First create a table |
| 203 | + with some integer column: |
| 204 | + </para> |
| 205 | + <programlisting> |
| 206 | +CREATE TABLE hash_rel ( |
| 207 | + id SERIAL PRIMARY KEY, |
| 208 | + value INTEGER); |
| 209 | +INSERT INTO hash_rel (value) SELECT g FROM generate_series(1, 10000) as g; |
| 210 | +</programlisting> |
| 211 | + <para> |
| 212 | + Then run create_hash_partitions() function with appropriate |
| 213 | + arguments: |
| 214 | + </para> |
| 215 | + <programlisting> |
| 216 | +SELECT create_hash_partitions('hash_rel', 'value', 100); |
| 217 | +</programlisting> |
| 218 | + <para> |
| 219 | + This will create new partitions but data will still be in the |
| 220 | + parent table. To move data to the corresponding partitions use |
| 221 | + partition_data() function: |
| 222 | + </para> |
| 223 | + <programlisting> |
| 224 | +SELECT partition_data('hash_rel'); |
| 225 | +</programlisting> |
| 226 | + </sect3> |
| 227 | + <sect3 id="range"> |
| 228 | + <title>RANGE</title> |
| 229 | + <para> |
| 230 | + Consider an example of RANGE partitioning. Create a table with |
| 231 | + numerical or date or timestamp column: |
| 232 | + </para> |
| 233 | + <programlisting> |
| 234 | +CREATE TABLE range_rel ( |
| 235 | + id SERIAL PRIMARY KEY, |
| 236 | + dt TIMESTAMP); |
| 237 | +INSERT INTO range_rel (dt) SELECT g FROM generate_series('2010-01-01'::date, '2015-12-31'::date, '1 day') as g; |
| 238 | +</programlisting> |
| 239 | + <para> |
| 240 | + Run create_range_partitions() function to create partitions so |
| 241 | + that each partition would contain data for one month: |
| 242 | + </para> |
| 243 | + <programlisting> |
| 244 | +SELECT create_range_partitions('range_rel', 'dt', '2010-01-01'::date, '1 month'::interval, 59); |
| 245 | +</programlisting> |
| 246 | + <para> |
| 247 | + It will create 60 partitions (one partition is created |
| 248 | + regardless of <literal>premake</literal> parameter). Now move |
| 249 | + data from the parent to partitions. |
| 250 | + </para> |
| 251 | + <programlisting> |
| 252 | +SELECT partition_data('range_rel'); |
| 253 | +</programlisting> |
| 254 | + <para> |
| 255 | + To merge to adjacent partitions run merge_range_partitions() |
| 256 | + function: |
| 257 | + </para> |
| 258 | + <programlisting> |
| 259 | +SELECT merge_range_partitions('range_rel_1', 'range_rel_2'); |
| 260 | +</programlisting> |
| 261 | + <para> |
| 262 | + To split partition use split_range_partition() function: |
| 263 | + </para> |
| 264 | + <programlisting> |
| 265 | +SELECT split_range_partition('range_rel_1', '2010-02-15'::date); |
| 266 | +</programlisting> |
| 267 | + <para> |
| 268 | + Now let's create new partition. You can use append_partition() |
| 269 | + or prepend_partition() functions: |
| 270 | + </para> |
| 271 | + <programlisting> |
| 272 | +SELECT append_partition('range_rel'); |
| 273 | +SELECT append_partition('range_rel'); |
| 274 | +</programlisting> |
| 275 | + </sect3> |
| 276 | + </sect2> |
| 277 | + <sect2 id="author"> |
| 278 | + <title>Author</title> |
| 279 | + <para> |
| 280 | + Ildar Musin <email>i.musin@postgrespro.ru</email> Postgres |
| 281 | + Professional Ltd., Russia |
| 282 | + </para> |
| 283 | + </sect2> |
| 284 | +</sect1> |
0 commit comments