R-Trees, Advanced Data Structures

R-Trees are a data structure used to store spatial data. They represent multidimensional data objects using minimum bounding rectangles (MBRs) and organize these rectangles hierarchically. Each node in an R-Tree bounds all of the MBRs of its child nodes. R-Trees allow efficient searching by only exploring nodes that intersect the search range. However, the original R-Tree splitting algorithm could result in non-optimal trees. The R*-Tree was developed to use additional criteria during splitting such as minimizing overlap and margin to improve search performance. Further variants like RC-Trees store the actual object geometries rather than just MBRs and use techniques like clipping and domain reduction to dynamically adapt the tree structure.

Uploaded by

lastname name


R-Trees

Accessing Spatial Data

In the beginning

The B-Tree provided a foundation for R-Trees. But what's a B-Tree?

A data structure for storing sorted data with logarithmic run times for search, insertion and deletion
Often used for data stored on long-latency I/O (filesystems and DBs) because each node holds many keys in sorted order, so they can be fetched together in one access

B-Tree

From Wikipedia

What's wrong with B-Trees?

B-Trees index one-dimensional, ordered keys, so they cannot store new types of data

Specifically, people wanted to store geometrical data and multi-dimensional data
The R-Tree provided a way to do that (thanks to Guttman, 1984)

R-Trees

R-Trees can organize data of any dimension by representing each object by its minimum bounding box.
Each node bounds its children. A node can have many objects in it
The leaves point to the actual objects (probably stored on disk)
The height is always O(log n) (the tree is height balanced)
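As a rough illustration of "each node bounds its children", here is a minimal Python sketch. The rectangle layout `(xmin, ymin, xmax, ymax)` and the `Node` class are assumptions made for this example, not something taken from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed layout for this sketch: (xmin, ymin, xmax, ymax)
Rect = Tuple[float, float, float, float]

def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle that encloses both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def mbr(rects: List[Rect]) -> Rect:
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    out = rects[0]
    for r in rects[1:]:
        out = union(out, r)
    return out

@dataclass
class Node:
    # Hypothetical node type: a leaf holds (rect, object_id) pairs,
    # an internal node holds (rect, child_node) pairs.
    is_leaf: bool
    entries: list = field(default_factory=list)

    def bounds(self) -> Rect:
        """The MBR of this node is the union of its entries' rectangles."""
        return mbr([rect for rect, _ in self.entries])

leaf = Node(True, [((0, 0, 1, 1), "a"), ((2, 3, 4, 5), "b")])
root = Node(False, [(leaf.bounds(), leaf)])
```

The parent stores only the child's bounding rectangle, which is what lets a search skip whole subtrees.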

R-Tree Example

From http://lacot.org/public/enst/bda/img/schema1.gif

Operations

Searching: look at all nodes that intersect the query, then recurse into those nodes. Many paths may lead nowhere
Insertion: locate the place to insert the new entry by searching, then insert.

If a node is full, then a split needs to be done

Deletion: if a node becomes underfull, reinsert its remaining entries to maintain balance.
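The search described above can be sketched as follows. The dictionary-based node layout and the function names are assumptions for illustration only.

```python
def intersects(a, b):
    """True if rectangles a and b, given as (xmin, ymin, xmax, ymax), share any point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def search(node, query, hits=None):
    """Collect payloads of all leaf entries whose rectangle intersects query.

    Assumed node layout: {"leaf": bool, "entries": [(rect, payload), ...]},
    where payload is an object id in a leaf and a child node otherwise.
    """
    if hits is None:
        hits = []
    for rect, payload in node["entries"]:
        if not intersects(rect, query):
            continue  # prune: nothing under this entry can match
        if node["leaf"]:
            hits.append(payload)
        else:
            search(payload, query, hits)
    return hits

leaf1 = {"leaf": True, "entries": [((0, 0, 1, 1), "a"), ((2, 2, 3, 3), "b")]}
leaf2 = {"leaf": True, "entries": [((10, 10, 11, 11), "c")]}
root = {"leaf": False, "entries": [((0, 0, 3, 3), leaf1), ((10, 10, 11, 11), leaf2)]}
```

A query such as `(1.5, 1.5, 1.8, 1.8)` intersects the bounding rectangle of `leaf1` but none of its objects, which is an example of a path that leads nowhere.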

Splitting Full Nodes

Linear: choose two far-apart entries as the seeds of the two groups. Then take the remaining entries in arbitrary order and assign each to the group that requires the smallest MBR enlargement
Quadratic: choose as seeds the two entries for which the dead space between them is maximized. Then assign the remaining entries so that area enlargement is minimized
Exponential: search all possible groupings
Note: the only criterion is MBR area enlargement
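A simplified sketch of the quadratic split follows. It keeps the seed selection (maximize dead space) and the least-enlargement assignment, but it processes the remaining entries in their given order and ignores minimum fill requirements, so treat it as an approximation rather than the full algorithm. All function names are made up for this example.

```python
from itertools import combinations

def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

def union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def pick_seeds(rects):
    """Indices of the pair that would waste the most area if put in one node."""
    def dead_space(pair):
        i, j = pair
        return area(union(rects[i], rects[j])) - area(rects[i]) - area(rects[j])
    return max(combinations(range(len(rects)), 2), key=dead_space)

def quadratic_split(rects):
    """Return two lists of indices into rects (simplified: no minimum fill check)."""
    i, j = pick_seeds(rects)
    groups = ([i], [j])
    mbrs = [rects[i], rects[j]]
    for k, r in enumerate(rects):
        if k in (i, j):
            continue
        # Area enlargement each group's MBR would need to take in r
        grow = [area(union(m, r)) - area(m) for m in mbrs]
        g = 0 if grow[0] <= grow[1] else 1
        groups[g].append(k)
        mbrs[g] = union(mbrs[g], r)
    return groups

rects = [(0, 0, 2, 2), (1, 1, 3, 3), (20, 20, 22, 22), (21, 20, 23, 22)]
left, right = quadratic_split(rects)
```

Trying every pair as seeds is where the quadratic cost comes from.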

Demo

How can we visualize the R-Tree

By clicking here

Variants - R+ Trees

Avoids multiple paths during searching.

Objects may be stored in multiple nodes

MBRs of nodes at same tree level do not overlap

On insertion/deletion the tree may change
downward or upward in order to maintain the
structure

R+ Tree

http://perso.enst.fr/~saglio/bdas/EPFL0525/sld041.htm

Variants: Hilbert R-Tree

Similar to other R-Trees except that the Hilbert value of each rectangle's centroid is calculated.
That value is used as the key to guide insertion
On an overflow, entries are divided evenly between two nodes
Experiments have shown that this scheme significantly improves performance and decreases insertion complexity.
The Hilbert R-tree achieves up to 28% savings in the number of pages touched compared to the R*-tree.

Hilbert Value??

The Hilbert value of an object is found by interleaving the bits of its x and y coordinates, and then chopping the binary string into 2-bit strings.
Then, for every 2-bit string, if its value is 0, we replace every 1 in the original string with a 3, and vice versa.
If the value of the 2-bit string is 3, we swap all 2s and 0s in the same way.
After this is done, put all the 2-bit strings back together and compute the decimal value of the binary string.
This is the Hilbert value of the object.
http://www-users.cs.umn.edu/research/shashi-group/CS8715/exercise_ans.doc
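For experimentation, the sketch below computes a Hilbert value with the widely used iterative quadrant-rotation method. Note that this is a different formulation from the bit-string recipe on the slide, and the orientation of the curve it produces may differ from the one that recipe yields. The function names and the integer-grid assumption are choices made for this example.

```python
def hilbert_index(order, x, y):
    """Position of cell (x, y) along a Hilbert curve over a 2^order x 2^order grid.

    Iterative quadrant-rotation method; x and y must be integers in [0, 2^order).
    """
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d

def hilbert_of_rect(order, rect):
    """Hilbert value of the (integer-rounded) centroid of rect = (xmin, ymin, xmax, ymax)."""
    cx = (rect[0] + rect[2]) // 2
    cy = (rect[1] + rect[3]) // 2
    return hilbert_index(order, cx, cy)

# Walk an 8 x 8 grid in Hilbert order
cells = sorted(((x, y) for x in range(8) for y in range(8)),
               key=lambda c: hilbert_index(3, c[0], c[1]))
```

Consecutive cells in `cells` are always grid neighbours, which is the locality property that makes the Hilbert value a useful one-dimensional key for rectangles.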

R*-Tree

The original R-Tree only uses minimized

MBR area to determine node splitting.
There are other factors to consider as well
that can have a great impact depending on
the data
By considering the other factors, R*-Trees
become faster for spatial and point access
queries.

Problems in original R-Tree

Because the only criterion is to minimize area:

Certain types of data may create small areas but large distances, which will lead to a bad split.
If one group reaches the maximum number of entries, the rest are assigned without consideration of their geometry.

Greene tried to solve this, but he only used the split axis; more criteria need to be used

Splitting overfilled nodes

Why is this overfull?

R*-Tree Parameters
1. Area covered by a rectangle should be minimized
2. Overlap should be minimized
3. The sum of the lengths of the edges (margins) should be minimized
4. Storage utilization should be maximized (resulting in smaller tree height)
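The three geometric criteria can be written down directly. This is a small sketch for 2-D rectangles given as `(xmin, ymin, xmax, ymax)`; the function names are illustrative.

```python
def area(r):
    """Area covered by rectangle r."""
    return (r[2] - r[0]) * (r[3] - r[1])

def margin(r):
    """Sum of the lengths of the edges of r (its perimeter in 2-D)."""
    return 2 * ((r[2] - r[0]) + (r[3] - r[1]))

def overlap(a, b):
    """Area of the intersection of a and b, or 0 if they do not overlap."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0
```

For example, `(0, 0, 4, 1)` and `(0, 0, 2, 2)` have the same area, but the square has the smaller margin, so minimizing margin favors square-like rectangles.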

Splitting in R*-Trees
1) For each axis, entries are sorted by the lower value, then by the upper value, of their rectangles. All possible distributions are determined
2) Compute the sum of the margin values and choose the axis with the minimum as the split axis
3) Along the split axis, choose the distribution with the minimum overlap
4) Distribute entries into these two groups
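The split steps above can be sketched for the 2-D case as follows. The parameter `m` (minimum number of entries per group) and the tie-break on total area are assumptions of this sketch.

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

def margin(r):
    return 2 * ((r[2] - r[0]) + (r[3] - r[1]))

def overlap(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0

def mbr(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def rstar_split(rects, m):
    """Split rects into two groups, each with at least m entries."""
    def distributions(axis):
        # Sort once by the lower value and once by the upper value on this axis
        for key in (lambda r: r[axis], lambda r: r[axis + 2]):
            order = sorted(rects, key=key)
            for k in range(m, len(order) - m + 1):
                yield order[:k], order[k:]

    def margin_sum(axis):
        return sum(margin(mbr(g1)) + margin(mbr(g2)) for g1, g2 in distributions(axis))

    # Step 2: the split axis is the one with the smallest total margin
    split_axis = min((0, 1), key=margin_sum)
    # Step 3: on that axis, take the distribution with the least overlap
    # (ties broken by total area, an assumption of this sketch)
    return min(distributions(split_axis),
               key=lambda d: (overlap(mbr(d[0]), mbr(d[1])),
                              area(mbr(d[0])) + area(mbr(d[1]))))

rects = [(0, 0, 1, 1), (0, 2, 1, 3), (0, 4, 1, 5), (10, 0, 11, 1), (10, 2, 11, 3)]
group1, group2 = rstar_split(rects, m=2)
```

With these five rectangles the x axis has the smaller margin sum, and the chosen distribution separates the three left rectangles from the two right ones with zero overlap.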

Deleting and Forced Re-insertion

Experimentally, it was shown that re-inserting data provided a large (20-50%) improvement in performance.
Thus, randomly deleting half the data and re-inserting it is a good way to keep the structure balanced.
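The slide does not say which entries get re-inserted when a node overflows. One commonly described policy for R*-Tree forced reinsertion is to remove the entries whose centers lie farthest from the center of the node's bounding rectangle. The sketch below shows only that selection step; the 30% fraction and the function names are assumptions for illustration.

```python
import math

def center(r):
    return ((r[0] + r[2]) / 2, (r[1] + r[3]) / 2)

def mbr(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def pick_reinsert(rects, fraction=0.3):
    """Split rects into (to_reinsert, to_keep).

    Entries are ranked by the distance of their center from the center of
    the node's MBR; the farthest `fraction` of them (at least one) is
    selected for re-insertion. The fraction is an assumed parameter.
    """
    node_center = center(mbr(rects))
    ranked = sorted(rects, key=lambda r: math.dist(center(r), node_center), reverse=True)
    p = max(1, int(len(rects) * fraction))
    return ranked[:p], ranked[p:]

rects = [(4, 4, 6, 6), (0, 0, 1, 1), (5, 5, 7, 7), (3, 4, 5, 6)]
removed, kept = pick_reinsert(rects)
```

Here the outlier `(0, 0, 1, 1)` is selected, and removing it lets the node's bounding rectangle shrink.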

Results

Lots of data sets and lots of query types.

One example: Real Data: MBRs of elevation lines, 100K objects
[Results table not recovered; column headers: disk accesses, query, storage util., on insert, after build up]

RC-Trees

Changing motivations:

Memory is large enough to store objects

It's possible to store the object geometry and not just the MBR representation.
Data is dynamic and transient
Spatial objects naturally overlap (e.g. stock market triggers)

RC-Trees
Take advantage of dynamic segmentation
If the original geometry is thrown away, then later on the MBR cannot be modified to represent new changes to the tree
The RC-Tree does:

1. Clipping
2. Domain Reduction
3. Rebalancing

Discriminators

A discriminator is used to decide (in binary) which direction a node should go in. (This means it's a binary tree, unlike other R-Trees)
It partitions the space
If an object intersects a discriminator, the object can be clipped into two parts
When an object is clipped, the space it takes up (in terms of its MBR) is reduced (aka domain reduction)
This allows for removal of dead space and faster point query lookups
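A small worked example of clipping and domain reduction: a diagonal line segment is clipped at a vertical discriminator, and the bounding rectangles of the two pieces together cover less area than the original MBR. The segment representation and function names are assumptions made for this sketch.

```python
def segment_mbr(p, q):
    """Bounding rectangle (xmin, ymin, xmax, ymax) of the segment from p to q."""
    return (min(p[0], q[0]), min(p[1], q[1]), max(p[0], q[0]), max(p[1], q[1]))

def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

def clip_segment(p, q, x_split):
    """Clip segment pq at the vertical discriminator x = x_split.

    Assumes p[0] < x_split < q[0], i.e. the segment really crosses the line.
    Returns the two pieces as ((p, mid), (mid, q)).
    """
    t = (x_split - p[0]) / (q[0] - p[0])
    mid = (x_split, p[1] + t * (q[1] - p[1]))
    return (p, mid), (mid, q)

p, q = (0, 0), (4, 4)
low, high = clip_segment(p, q, 2)
before = area(segment_mbr(p, q))                           # one MBR for the whole object
after = area(segment_mbr(*low)) + area(segment_mbr(*high)) # MBRs of the clipped parts
```

The original MBR covers 16 units of area while the two clipped MBRs cover 8 in total, so half of the dead space is gone. This only works because the object geometry is kept; from the MBR alone the pieces could not be tightened.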

Domain Reduction and Clipping

Operations

Insert, Delete and Search are straightforward

What happens to a node that has overflowed?
Choose a discriminator to partition the objects into balanced sets
How is a discriminator chosen?

Partitioning

Two methods for finding a discriminator for a partition
RC-MID: faster, but ignores balancing and clipping. Uses pre-computed data to determine an average discriminator.
Problems?

Different distributions greatly affect the partition

Space requirements can be huge

Partitioning Take 2

RC-SWEEP

Sorts the objects.
Candidates for discriminators are the boundaries of the MBRs
Assign a weight to each candidate using a formula not shown here
Choose the candidate with the minimum weight

Problems?

Slower, but space costs are much better than RC-MID (which keeps info about nodes)
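The shape of a sweep over candidate discriminators can be sketched as below. Since the actual weight formula is not given in these slides, the `weight` function here is a purely hypothetical stand-in (imbalance plus number of clipped objects), chosen only to make the example runnable.

```python
def sweep_discriminator(rects, axis=0):
    """Pick a split position on `axis` from the MBR boundaries.

    rects are (xmin, ymin, xmax, ymax); axis 0 is x and axis 1 is y.
    """
    # Candidates: every lower and upper MBR boundary on this axis, in sorted order
    candidates = sorted({r[axis] for r in rects} | {r[axis + 2] for r in rects})

    def weight(v):
        # HYPOTHETICAL weight, not the formula from the original work:
        # penalize unbalanced partitions and objects that would be clipped.
        left = sum(1 for r in rects if r[axis + 2] <= v)
        right = sum(1 for r in rects if r[axis] >= v)
        clipped = len(rects) - left - right
        return abs(left - right) + clipped

    return min(candidates, key=weight)

rects = [(0, 0, 1, 1), (2, 0, 3, 1), (4, 0, 5, 1), (6, 0, 7, 1)]
split_at = sweep_discriminator(rects)
```

With this stand-in weight the sweep settles on `x = 3`, which puts two rectangles on each side and clips none.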

Rebuilding

The tree has a certain degree of flexibility in its structure before it needs to be rebalanced
On an insert, check if the height is too imbalanced
If so, go to the imbalanced subtree, flush its items, sort them and call split on them to get a better balance

Experimentation

CPU execution time is not a good measure (although they still calculate it)
Instead, use the number of discriminators compared
Lots of results
Result summary:

Insertion is a little more expensive (because of possible rebalancing)
Querying for point or spatial data is faster (with fewer memory accesses) than all previous incarnations
Storage requirements are not that bad
Dynamic segmentation (i.e. recalculating MBRs) can help a lot
Controlling space with factor (by disallowing further
splitting) controls space costs
