Skip to content

Commit 9892ddf

Browse files
committed
Gene Selkov's CUBE datatype (GiST example code)
1 parent 5bb4f72 commit 9892ddf

File tree

12 files changed

+6475
-0
lines changed

12 files changed

+6475
-0
lines changed

contrib/cube/Makefile

Lines changed: 83 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,83 @@
1+
#
2+
# $Header: /cvsroot/pgsql/contrib/cube/Makefile,v 1.1 2000/12/11 20:39:14 tgl Exp $
3+
#
4+
5+
subdir = contrib/cube
6+
top_builddir = ../..
7+
include $(top_builddir)/src/Makefile.global
8+
9+
# override libdir to install shlib in contrib not main directory
10+
libdir := $(libdir)/contrib
11+
12+
# shared library parameters
13+
NAME= cube
14+
SO_MAJOR_VERSION= 1
15+
SO_MINOR_VERSION= 0
16+
17+
override CPPFLAGS += -I$(srcdir)
18+
19+
OBJS= cube.o cubeparse.o cubescan.o buffer.o
20+
21+
all: all-lib $(NAME).sql
22+
23+
# Shared library stuff
24+
include $(top_srcdir)/src/Makefile.shlib
25+
26+
27+
cubeparse.c cubeparse.h: cubeparse.y
28+
$(YACC) -d $(YFLAGS) -p cube_yy $<
29+
mv -f y.tab.c cubeparse.c
30+
mv -f y.tab.h cubeparse.h
31+
32+
cubescan.c: cubescan.l
33+
ifdef FLEX
34+
$(FLEX) $(FLEXFLAGS) -Pcube_yy -o'$@' $<
35+
else
36+
@$(missing) flex $< $@
37+
endif
38+
39+
$(NAME).sql: $(NAME).sql.in
40+
sed -e 's:MODULE_PATHNAME:$(libdir)/$(shlib):g' < $< > $@
41+
42+
.PHONY: submake
43+
submake:
44+
$(MAKE) -C $(top_builddir)/src/test/regress pg_regress
45+
46+
# against installed postmaster
47+
installcheck: submake
48+
$(top_builddir)/src/test/regress/pg_regress cube
49+
50+
# in-tree test doesn't work yet (no way to install my shared library)
51+
#check: all submake
52+
# $(top_builddir)/src/test/regress/pg_regress --temp-install \
53+
# --top-builddir=$(top_builddir) seg
54+
check:
55+
@echo "'make check' is not supported."
56+
@echo "Do 'make install', then 'make installcheck' instead."
57+
58+
install: all installdirs install-lib
59+
$(INSTALL_DATA) $(srcdir)/README.$(NAME) $(docdir)/contrib
60+
$(INSTALL_DATA) $(NAME).sql $(datadir)/contrib
61+
62+
installdirs:
63+
$(mkinstalldirs) $(docdir)/contrib $(datadir)/contrib $(libdir)
64+
65+
uninstall: uninstall-lib
66+
rm -f $(docdir)/contrib/README.$(NAME) $(datadir)/contrib/$(NAME).sql
67+
68+
clean distclean maintainer-clean: clean-lib
69+
rm -f cubeparse.c cubeparse.h cubescan.c
70+
rm -f y.tab.c y.tab.h $(OBJS) $(NAME).sql
71+
# things created by various check targets
72+
rm -rf results tmp_check log
73+
rm -f regression.diffs regression.out regress.out run_check.out
74+
ifeq ($(PORTNAME), win)
75+
rm -f regress.def
76+
endif
77+
78+
depend dep:
79+
$(CC) -MM $(CFLAGS) *.c >depend
80+
81+
ifeq (depend,$(wildcard depend))
82+
include depend
83+
endif

contrib/cube/README.cube

Lines changed: 289 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,289 @@
1+
This directory contains the code for the user-defined type,
2+
CUBE, representing multidimensional cubes.
3+
4+
5+
FILES
6+
-----
7+
8+
Makefile building instructions for the shared library
9+
10+
README.cube the file you are now reading
11+
12+
buffer.c globals and buffer access utilities shared between
13+
the parser (cubeparse.y) and the scanner (cubescan.l)
14+
15+
buffer.h function prototypes for buffer.c
16+
17+
cube.c the implementation of this data type in c
18+
19+
cube.sql.in SQL code needed to register this type with postgres
20+
(transformed to cube.sql by make)
21+
22+
cubedata.h the data structure used to store the cubes
23+
24+
cubeparse.y the grammar file for the parser (used by cube_in() in cube.c)
25+
26+
cubescan.l scanner rules (used by cube_yyparse() in cubeparse.y)
27+
28+
29+
INSTALLATION
30+
============
31+
32+
To install the type, run
33+
34+
make
35+
make install
36+
37+
For this to work, make sure that:
38+
39+
. the cube source directory is in the postgres contrib directory
40+
. the user running "make install" has postgres administrative authority
41+
. this user's environment defines the PGLIB and PGDATA variables and has
42+
postgres binaries in the PATH.
43+
44+
This only installs the type implementation and documentation. To make the
45+
type available in any particular database, do
46+
47+
psql -d databasename < cube.sql
48+
49+
If you install the type in the template1 database, all subsequently created
50+
databases will inherit it.
51+
52+
To test the new type, after "make install" do
53+
54+
make installcheck
55+
56+
If it fails, examine the file regression.diffs to find out the reason (the
57+
test code is a direct adaptation of the regression tests from the main
58+
source tree).
59+
60+
61+
SYNTAX
62+
======
63+
64+
The following are valid external representations for the CUBE type:
65+
66+
'x' A floating point value representing
67+
a one-dimensional point or one-dimensional
68+
zero length cubement
69+
70+
'(x)' Same as above
71+
72+
'x1,x2,x3,...,xn' A point in n-dimensional space,
73+
represented internally as a zero volume box
74+
75+
'(x1,x2,x3,...,xn)' Same as above
76+
77+
'(x),(y)' 1-D cubement starting at x and ending at y
78+
or vice versa; the order does not matter
79+
80+
'(x1,...,xn),(y1,...,yn)' n-dimensional box represented by
81+
a pair of its opposite corners, no matter which.
82+
Functions take care of swapping to achieve
83+
"lower left -- upper right" representation
84+
before computing any values
85+
86+
Grammar
87+
-------
88+
89+
rule 1 box -> O_BRACKET paren_list COMMA paren_list C_BRACKET
90+
rule 2 box -> paren_list COMMA paren_list
91+
rule 3 box -> paren_list
92+
rule 4 box -> list
93+
rule 5 paren_list -> O_PAREN list C_PAREN
94+
rule 6 list -> FLOAT
95+
rule 7 list -> list COMMA FLOAT
96+
97+
Tokens
98+
------
99+
100+
n [0-9]+
101+
integer [+-]?{n}
102+
real [+-]?({n}\.{n}?)|(\.{n})
103+
FLOAT ({integer}|{real})([eE]{integer})?
104+
O_BRACKET \[
105+
C_BRACKET \]
106+
O_PAREN \(
107+
C_PAREN \)
108+
COMMA \,
109+
110+
111+
Examples of valid CUBE representations:
112+
--------------------------------------
113+
114+
'x' A floating point value representing
115+
a one-dimensional point (or, zero-length
116+
one-dimensional interval)
117+
118+
'(x)' Same as above
119+
120+
'x1,x2,x3,...,xn' A point in n-dimensional space,
121+
represented internally as a zero volume cube
122+
123+
'(x1,x2,x3,...,xn)' Same as above
124+
125+
'(x),(y)' A 1-D interval starting at x and ending at y
126+
or vice versa; the order does not matter
127+
128+
'[(x),(y)]' Same as above
129+
130+
'(x1,...,xn),(y1,...,yn)' An n-dimensional box represented by
131+
a pair of its diagonally opposite corners,
132+
regardless of order. Swapping is provided
133+
by all comarison routines to ensure the
134+
"lower left -- upper right" representation
135+
before actaul comparison takes place.
136+
137+
'[(x1,...,xn),(y1,...,yn)]' Same as above
138+
139+
140+
White space is ignored, so '[(x),(y)]' can be: '[ ( x ), ( y ) ]'
141+
142+
143+
DEFAULTS
144+
========
145+
146+
I believe this union:
147+
148+
select cube_union('(0,5,2),(2,3,1)','0');
149+
cube_union
150+
-------------------
151+
(0, 0, 0),(2, 5, 2)
152+
(1 row)
153+
154+
does not contradict to the common sense, neither does the intersection
155+
156+
select cube_inter('(0,-1),(1,1)','(-2),(2)');
157+
cube_inter
158+
-------------
159+
(0, 0),(1, 0)
160+
(1 row)
161+
162+
In all binary operations on differently sized boxes, I assume the smaller
163+
one to be a cartesian projection, i. e., having zeroes in place of coordinates
164+
omitted in the string representation. The above examples are equivalent to:
165+
166+
cube_union('(0,5,2),(2,3,1)','(0,0,0),(0,0,0)');
167+
cube_inter('(0,-1),(1,1)','(-2,0),(2,0)');
168+
169+
170+
The following containment predicate uses the point syntax,
171+
while in fact the second argument is internally represented by a box.
172+
This syntax makes it unnecessary to define the special Point type
173+
and functions for (box,point) predicates.
174+
175+
select cube_contains('(0,0),(1,1)', '0.5,0.5');
176+
cube_contains
177+
--------------
178+
t
179+
(1 row)
180+
181+
182+
PRECISION
183+
=========
184+
185+
Values are stored internally as 32-bit floating point numbers. This means that
186+
numbers with more than 7 significant digits will be truncated.
187+
188+
189+
USAGE
190+
=====
191+
192+
The access method for CUBE is a GiST (gist_cube_ops), which is a
193+
generalization of R-tree. GiSTs allow the postgres implementation of
194+
R-tree, originally encoded to support 2-D geometric types such as
195+
boxes and polygons, to be used with any data type whose data domain
196+
can be partitioned using the concepts of containment, intersection and
197+
equality. In other words, everything that can intersect or contain
198+
its own kind can be indexed with a GiST. That includes, among other
199+
things, all geometric data types, regardless of their dimensionality
200+
(see also contrib/seg).
201+
202+
The operators supported by the GiST access method include:
203+
204+
205+
[a, b] << [c, d] Is left of
206+
207+
The left operand, [a, b], occurs entirely to the left of the
208+
right operand, [c, d], on the axis (-inf, inf). It means,
209+
[a, b] << [c, d] is true if b < c and false otherwise
210+
211+
[a, b] >> [c, d] Is right of
212+
213+
[a, b] is occurs entirely to the right of [c, d].
214+
[a, b] >> [c, d] is true if b > c and false otherwise
215+
216+
[a, b] &< [c, d] Over left
217+
218+
The cubement [a, b] overlaps the cubement [c, d] in such a way
219+
that a <= c <= b and b <= d
220+
221+
[a, b] &> [c, d] Over right
222+
223+
The cubement [a, b] overlaps the cubement [c, d] in such a way
224+
that a > c and b <= c <= d
225+
226+
[a, b] = [c, d] Same as
227+
228+
The cubements [a, b] and [c, d] are identical, that is, a == b
229+
and c == d
230+
231+
[a, b] @ [c, d] Contains
232+
233+
The cubement [a, b] contains the cubement [c, d], that is,
234+
a <= c and b >= d
235+
236+
[a, b] @ [c, d] Contained in
237+
238+
The cubement [a, b] is contained in [c, d], that is,
239+
a >= c and b <= d
240+
241+
Although the mnemonics of the following operators is questionable, I
242+
preserved them to maintain visual consistency with other geometric
243+
data types defined in Postgres.
244+
245+
Other operators:
246+
247+
[a, b] < [c, d] Less than
248+
[a, b] > [c, d] Greater than
249+
250+
These operators do not make a lot of sense for any practical
251+
purpose but sorting. These operators first compare (a) to (c),
252+
and if these are equal, compare (b) to (d). That accounts for
253+
reasonably good sorting in most cases, which is useful if
254+
you want to use ORDER BY with this type
255+
256+
There are a few other potentially useful functions defined in cube.c
257+
that vanished from the schema because I stopped using them. Some of
258+
these were meant to support type casting. Let me know if I was wrong:
259+
I will then add them back to the schema. I would also appreciate
260+
other ideas that would enhance the type and make it more useful.
261+
262+
For examples of usage, see sql/cube.sql
263+
264+
265+
CREDITS
266+
=======
267+
268+
This code is essentially based on the example written for
269+
Illustra, http://garcia.me.berkeley.edu/~adong/rtree
270+
271+
My thanks are primarily to Prof. Joe Hellerstein
272+
(http://db.cs.berkeley.edu/~jmh/) for elucidating the gist of the GiST
273+
(http://gist.cs.berkeley.edu/), and to his former student, Andy Dong
274+
(http://best.me.berkeley.edu/~adong/), for his exemplar.
275+
I am also grateful to all postgres developers, present and past, for enabling
276+
myself to create my own world and live undisturbed in it. And I would like to
277+
acknowledge my gratitude to Argonne Lab and to the U.S. Department of Energy
278+
for the years of faithful support of my database research.
279+
280+
------------------------------------------------------------------------
281+
Gene Selkov, Jr.
282+
Computational Scientist
283+
Mathematics and Computer Science Division
284+
Argonne National Laboratory
285+
9700 S Cass Ave.
286+
Building 221
287+
Argonne, IL 60439-4844
288+
289+
selkovjr@mcs.anl.gov

0 commit comments

Comments
 (0)