|
| 1 | +This directory contains the code for the user-defined type, |
| 2 | +CUBE, representing multidimensional cubes. |
| 3 | + |
| 4 | + |
| 5 | +FILES |
| 6 | +----- |
| 7 | + |
| 8 | +Makefile building instructions for the shared library |
| 9 | + |
| 10 | +README.cube the file you are now reading |
| 11 | + |
| 12 | +buffer.c globals and buffer access utilities shared between |
| 13 | + the parser (cubeparse.y) and the scanner (cubescan.l) |
| 14 | + |
| 15 | +buffer.h function prototypes for buffer.c |
| 16 | + |
| 17 | +cube.c the implementation of this data type in c |
| 18 | + |
| 19 | +cube.sql.in SQL code needed to register this type with postgres |
| 20 | + (transformed to cube.sql by make) |
| 21 | + |
| 22 | +cubedata.h the data structure used to store the cubes |
| 23 | + |
| 24 | +cubeparse.y the grammar file for the parser (used by cube_in() in cube.c) |
| 25 | + |
| 26 | +cubescan.l scanner rules (used by cube_yyparse() in cubeparse.y) |
| 27 | + |
| 28 | + |
| 29 | +INSTALLATION |
| 30 | +============ |
| 31 | + |
| 32 | +To install the type, run |
| 33 | + |
| 34 | + make |
| 35 | + make install |
| 36 | + |
| 37 | +For this to work, make sure that: |
| 38 | + |
| 39 | +. the cube source directory is in the postgres contrib directory |
| 40 | +. the user running "make install" has postgres administrative authority |
| 41 | +. this user's environment defines the PGLIB and PGDATA variables and has |
| 42 | + postgres binaries in the PATH. |
| 43 | + |
| 44 | +This only installs the type implementation and documentation. To make the |
| 45 | +type available in any particular database, do |
| 46 | + |
| 47 | + psql -d databasename < cube.sql |
| 48 | + |
| 49 | +If you install the type in the template1 database, all subsequently created |
| 50 | +databases will inherit it. |
| 51 | + |
| 52 | +To test the new type, after "make install" do |
| 53 | + |
| 54 | + make installcheck |
| 55 | + |
| 56 | +If it fails, examine the file regression.diffs to find out the reason (the |
| 57 | +test code is a direct adaptation of the regression tests from the main |
| 58 | +source tree). |
| 59 | + |
| 60 | + |
| 61 | +SYNTAX |
| 62 | +====== |
| 63 | + |
| 64 | +The following are valid external representations for the CUBE type: |
| 65 | + |
| 66 | +'x' A floating point value representing |
| 67 | + a one-dimensional point or one-dimensional |
| 68 | + zero length cubement |
| 69 | + |
| 70 | +'(x)' Same as above |
| 71 | + |
| 72 | +'x1,x2,x3,...,xn' A point in n-dimensional space, |
| 73 | + represented internally as a zero volume box |
| 74 | + |
| 75 | +'(x1,x2,x3,...,xn)' Same as above |
| 76 | + |
| 77 | +'(x),(y)' 1-D cubement starting at x and ending at y |
| 78 | + or vice versa; the order does not matter |
| 79 | + |
| 80 | +'(x1,...,xn),(y1,...,yn)' n-dimensional box represented by |
| 81 | + a pair of its opposite corners, no matter which. |
| 82 | + Functions take care of swapping to achieve |
| 83 | + "lower left -- upper right" representation |
| 84 | + before computing any values |
| 85 | + |
| 86 | +Grammar |
| 87 | +------- |
| 88 | + |
| 89 | +rule 1 box -> O_BRACKET paren_list COMMA paren_list C_BRACKET |
| 90 | +rule 2 box -> paren_list COMMA paren_list |
| 91 | +rule 3 box -> paren_list |
| 92 | +rule 4 box -> list |
| 93 | +rule 5 paren_list -> O_PAREN list C_PAREN |
| 94 | +rule 6 list -> FLOAT |
| 95 | +rule 7 list -> list COMMA FLOAT |
| 96 | + |
| 97 | +Tokens |
| 98 | +------ |
| 99 | + |
| 100 | +n [0-9]+ |
| 101 | +integer [+-]?{n} |
| 102 | +real [+-]?({n}\.{n}?)|(\.{n}) |
| 103 | +FLOAT ({integer}|{real})([eE]{integer})? |
| 104 | +O_BRACKET \[ |
| 105 | +C_BRACKET \] |
| 106 | +O_PAREN \( |
| 107 | +C_PAREN \) |
| 108 | +COMMA \, |
| 109 | + |
| 110 | + |
| 111 | +Examples of valid CUBE representations: |
| 112 | +-------------------------------------- |
| 113 | + |
| 114 | +'x' A floating point value representing |
| 115 | + a one-dimensional point (or, zero-length |
| 116 | + one-dimensional interval) |
| 117 | + |
| 118 | +'(x)' Same as above |
| 119 | + |
| 120 | +'x1,x2,x3,...,xn' A point in n-dimensional space, |
| 121 | + represented internally as a zero volume cube |
| 122 | + |
| 123 | +'(x1,x2,x3,...,xn)' Same as above |
| 124 | + |
| 125 | +'(x),(y)' A 1-D interval starting at x and ending at y |
| 126 | + or vice versa; the order does not matter |
| 127 | + |
| 128 | +'[(x),(y)]' Same as above |
| 129 | + |
| 130 | +'(x1,...,xn),(y1,...,yn)' An n-dimensional box represented by |
| 131 | + a pair of its diagonally opposite corners, |
| 132 | + regardless of order. Swapping is provided |
| 133 | + by all comarison routines to ensure the |
| 134 | + "lower left -- upper right" representation |
| 135 | + before actaul comparison takes place. |
| 136 | + |
| 137 | +'[(x1,...,xn),(y1,...,yn)]' Same as above |
| 138 | + |
| 139 | + |
| 140 | +White space is ignored, so '[(x),(y)]' can be: '[ ( x ), ( y ) ]' |
| 141 | + |
| 142 | + |
| 143 | +DEFAULTS |
| 144 | +======== |
| 145 | + |
| 146 | +I believe this union: |
| 147 | + |
| 148 | +select cube_union('(0,5,2),(2,3,1)','0'); |
| 149 | +cube_union |
| 150 | +------------------- |
| 151 | +(0, 0, 0),(2, 5, 2) |
| 152 | +(1 row) |
| 153 | + |
| 154 | +does not contradict to the common sense, neither does the intersection |
| 155 | + |
| 156 | +select cube_inter('(0,-1),(1,1)','(-2),(2)'); |
| 157 | +cube_inter |
| 158 | +------------- |
| 159 | +(0, 0),(1, 0) |
| 160 | +(1 row) |
| 161 | + |
| 162 | +In all binary operations on differently sized boxes, I assume the smaller |
| 163 | +one to be a cartesian projection, i. e., having zeroes in place of coordinates |
| 164 | +omitted in the string representation. The above examples are equivalent to: |
| 165 | + |
| 166 | +cube_union('(0,5,2),(2,3,1)','(0,0,0),(0,0,0)'); |
| 167 | +cube_inter('(0,-1),(1,1)','(-2,0),(2,0)'); |
| 168 | + |
| 169 | + |
| 170 | +The following containment predicate uses the point syntax, |
| 171 | +while in fact the second argument is internally represented by a box. |
| 172 | +This syntax makes it unnecessary to define the special Point type |
| 173 | +and functions for (box,point) predicates. |
| 174 | + |
| 175 | +select cube_contains('(0,0),(1,1)', '0.5,0.5'); |
| 176 | +cube_contains |
| 177 | +-------------- |
| 178 | +t |
| 179 | +(1 row) |
| 180 | + |
| 181 | + |
| 182 | +PRECISION |
| 183 | +========= |
| 184 | + |
| 185 | +Values are stored internally as 32-bit floating point numbers. This means that |
| 186 | +numbers with more than 7 significant digits will be truncated. |
| 187 | + |
| 188 | + |
| 189 | +USAGE |
| 190 | +===== |
| 191 | + |
| 192 | +The access method for CUBE is a GiST (gist_cube_ops), which is a |
| 193 | +generalization of R-tree. GiSTs allow the postgres implementation of |
| 194 | +R-tree, originally encoded to support 2-D geometric types such as |
| 195 | +boxes and polygons, to be used with any data type whose data domain |
| 196 | +can be partitioned using the concepts of containment, intersection and |
| 197 | +equality. In other words, everything that can intersect or contain |
| 198 | +its own kind can be indexed with a GiST. That includes, among other |
| 199 | +things, all geometric data types, regardless of their dimensionality |
| 200 | +(see also contrib/seg). |
| 201 | + |
| 202 | +The operators supported by the GiST access method include: |
| 203 | + |
| 204 | + |
| 205 | +[a, b] << [c, d] Is left of |
| 206 | + |
| 207 | + The left operand, [a, b], occurs entirely to the left of the |
| 208 | + right operand, [c, d], on the axis (-inf, inf). It means, |
| 209 | + [a, b] << [c, d] is true if b < c and false otherwise |
| 210 | + |
| 211 | +[a, b] >> [c, d] Is right of |
| 212 | + |
| 213 | + [a, b] is occurs entirely to the right of [c, d]. |
| 214 | + [a, b] >> [c, d] is true if b > c and false otherwise |
| 215 | + |
| 216 | +[a, b] &< [c, d] Over left |
| 217 | + |
| 218 | + The cubement [a, b] overlaps the cubement [c, d] in such a way |
| 219 | + that a <= c <= b and b <= d |
| 220 | + |
| 221 | +[a, b] &> [c, d] Over right |
| 222 | + |
| 223 | + The cubement [a, b] overlaps the cubement [c, d] in such a way |
| 224 | + that a > c and b <= c <= d |
| 225 | + |
| 226 | +[a, b] = [c, d] Same as |
| 227 | + |
| 228 | + The cubements [a, b] and [c, d] are identical, that is, a == b |
| 229 | + and c == d |
| 230 | + |
| 231 | +[a, b] @ [c, d] Contains |
| 232 | + |
| 233 | + The cubement [a, b] contains the cubement [c, d], that is, |
| 234 | + a <= c and b >= d |
| 235 | + |
| 236 | +[a, b] @ [c, d] Contained in |
| 237 | + |
| 238 | + The cubement [a, b] is contained in [c, d], that is, |
| 239 | + a >= c and b <= d |
| 240 | + |
| 241 | +Although the mnemonics of the following operators is questionable, I |
| 242 | +preserved them to maintain visual consistency with other geometric |
| 243 | +data types defined in Postgres. |
| 244 | + |
| 245 | +Other operators: |
| 246 | + |
| 247 | +[a, b] < [c, d] Less than |
| 248 | +[a, b] > [c, d] Greater than |
| 249 | + |
| 250 | + These operators do not make a lot of sense for any practical |
| 251 | + purpose but sorting. These operators first compare (a) to (c), |
| 252 | + and if these are equal, compare (b) to (d). That accounts for |
| 253 | + reasonably good sorting in most cases, which is useful if |
| 254 | + you want to use ORDER BY with this type |
| 255 | + |
| 256 | +There are a few other potentially useful functions defined in cube.c |
| 257 | +that vanished from the schema because I stopped using them. Some of |
| 258 | +these were meant to support type casting. Let me know if I was wrong: |
| 259 | +I will then add them back to the schema. I would also appreciate |
| 260 | +other ideas that would enhance the type and make it more useful. |
| 261 | + |
| 262 | +For examples of usage, see sql/cube.sql |
| 263 | + |
| 264 | + |
| 265 | +CREDITS |
| 266 | +======= |
| 267 | + |
| 268 | +This code is essentially based on the example written for |
| 269 | +Illustra, http://garcia.me.berkeley.edu/~adong/rtree |
| 270 | + |
| 271 | +My thanks are primarily to Prof. Joe Hellerstein |
| 272 | +(http://db.cs.berkeley.edu/~jmh/) for elucidating the gist of the GiST |
| 273 | +(http://gist.cs.berkeley.edu/), and to his former student, Andy Dong |
| 274 | +(http://best.me.berkeley.edu/~adong/), for his exemplar. |
| 275 | +I am also grateful to all postgres developers, present and past, for enabling |
| 276 | +myself to create my own world and live undisturbed in it. And I would like to |
| 277 | +acknowledge my gratitude to Argonne Lab and to the U.S. Department of Energy |
| 278 | +for the years of faithful support of my database research. |
| 279 | + |
| 280 | +------------------------------------------------------------------------ |
| 281 | +Gene Selkov, Jr. |
| 282 | +Computational Scientist |
| 283 | +Mathematics and Computer Science Division |
| 284 | +Argonne National Laboratory |
| 285 | +9700 S Cass Ave. |
| 286 | +Building 221 |
| 287 | +Argonne, IL 60439-4844 |
| 288 | + |
| 289 | +selkovjr@mcs.anl.gov |
0 commit comments