0% found this document useful (0 votes)
10 views

L6 Query Optimization

Uploaded by

lci2022010
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views

L6 Query Optimization

Uploaded by

lci2022010
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 52

LET’S START WITH DBMS :)

Query Optimization
Whenever we want to improve the efficiency of a query, make it execute faster and
consume fewer resources we use query optimization techniques.

Optimizing SQL queries is essential for improving the performance of database


applications

1.Use Indexes Efficiently: Index the columns frequently used in WHERE, JOIN,
ORDER BY, and GROUP BY clauses to speed up data retrieval.

2.Select Only Necessary Columns : Avoid SELECT *, specify only the columns you
need in your query. This reduces the amount of data transferred and processed.
LET’S START WITH DBMS :)
Query Optimization
3. Optimize JOIN Operations : Choose the Right JOIN Type and use indexed
columns in Join conditions.

4. Partition Large Tables: For very large tables, consider partitioning them to
improve query performance. Partitioning allows the DBMS to scan only relevant
partitions, reducing I/O and improving response times.

5. Cache results : Cache the results of frequently executed queries to avoid


redundant calculations.
LET’S START WITH DBMS :)
Physical Storage And File Organization

Physical Storage

Storage device : For storing the data, there are different types of storage options
available.
Primary Storage -> Main memory and cache (fastest and expensive and as soon
as the system leads to a power cut or a crash, the data also get lost. )

Secondary Storage -> Flash Memory and Magnetic Disk Storage (non-
volatile)save and store data permanently.

Tertiary Storage-> Optical disk and Magnetic tape (used for data backup and
storing large data offline)
LET’S START WITH DBMS :)
Physical Storage And File Organization

Databases store data in files on disk. Each table or index may correspond to one or
more files. Each file has sequence of records.
Accessing data from RAM is much faster than accessing data from a hard disk or
SSD. This is because RAM is designed for high-speed read and write operations. Data
retrieval in RAM typically involves fetching data in nanoseconds.

RAM : Provides fast, volatile memory for quick data retrieval. Data is accessed via
addressable locations and cache memory enhances speed.

Hard Disk: Provides slower, non-volatile storage with mechanical components


that impact access speed. Data retrieval involves moving the read/write head and
waiting for disk rotation.
LET’S START WITH DBMS :)
Physical Storage And File Organization

When a program or application needs to access data, it first checks if the data is
available in RAM. If the data is not in RAM, it must be loaded from a slower storage
device like a hard disk or SSD. Index files in a Database Management System
(DBMS) are stored on disk but are also managed in RAM to optimize performance

Query

RAM
CPU Hardisk
(Volatile)
(non-volatile)
LET’S START WITH DBMS :)
Physical Storage And File Organization
How file is organized?

Snam
Rollno SAge
e
Files are allocated on the
hard disk in contiguous
blocks to reduce
fragmentation.

DB HARDISK
No of records stored in each block= Size of block/size of record
Records can be stored in 2 ways-> sorted(seraching is fast) or
unsorted(searching is slow, insertion is fast)
LET’S START WITH DBMS :)
Indexing and its types
Why do we need indexing?
Imagine you have a 1,000-page textbook and you want to serach a topic “xyz”.

When indexing is not present-> You need to manually search for each term in
the book, which could be frustrating and time-consuming.

When indexing is present->The index allows you to quickly locate each topic.
You can look up "xyz," find that it's covered on pages 448-490.This makes it
more efficient and less time-consuming.

Indexing is a critical technique in database management that significantly improves


the performance of query operations by minimizing the no of disk access. Index
table is always sorted
LET’S START WITH DBMS :)
Indexing and its types
Why do we need indexing?
In the same way how we have used indexing in book , it helps us to find specific
records or data entries in a large table or dataset.

Search key Data acccess

It contains the address


where the data item is
It is used to find the record
stored in memory for the
provided search key

It helps in Index table set of pointer


finding what a holding address
user is looking of a block
for
LET’S START WITH DBMS :)
Indexing and its types
How it helps?

Improved Query Performance: Indexes allow the database management


system (DBMS) to locate and retrieve data much more quickly than it could by
scanning the entire table.

Indexes can be used to optimize queries that sort data (ORDER BY) or group
data (GROUP BY) : When an index is available on the columns being sorted or
grouped, the DBMS can retrieve the data in the desired order directly from the
index, eliminating the need for additional sorting operations.

We have certain indexing methods like Dense(index entry for all the records)
and Sparse(index entries for only a subset of records) index.
LET’S START WITH DBMS :)
Indexing and its types

Sparse Index: A sparse index is a type of database indexing technique where the
index does not include an entry for every single record in the database. Instead, it
contains entries for only some of the records, typically one entry per block (or
page) of data in the storage. This makes sparse indexes smaller and faster to
search through compared to dense indexes, which have an entry for every record.

It is used when we have an ordered data set.

EX: If a table has 1,000 rows divided into 100 blocks, and you create a sparse
index, the index might only have 100 entries, with each entry pointing to the first
record in each block.
Index table
RollNo Name
1 Ram
Search key data ref
2 Shyam B1
3 1 Ram
1 B1

4 2 Shyam
4 B2 5 B2
6 3 Raghu

Sparse Index
4 Riti
no of records in index disk
file=no of blocks 5 Raj

6 Rahul

How Sparse Indexing Works:


Primary Index: Sparse indexes are often used with primary indexes, where the
table is sorted on the indexed column. The sparse index only includes an entry
for the first record in each block or page.

Searching: When searching for a specific value, the database uses the sparse
index to quickly locate the block where the record might reside. It then searches
within the block to find the exact record.
Index table
1 RollNo Name
Search key data ref
2 B1
3 1 Ram
1 B1

Disadvantages of Sparse Indexes 4 B2


4
5 B2
2 Shyam

Additional I/O Operations: After using the index Sparse Index


3 Raghu

no of records in index disk 4 Riti

to find the correct block, an additional search file=no of blocks 5 Raj

within the block is required to find the specific 6 Rahul

record.

Less Efficient for Random Access: If queries


often need to retrieve random, non-sequential
records, a dense index might be more efficient.

Use Cases:
Primary Indexing on Large Tables: Sparse indexes are ideal for primary
indexes on large tables where records are sequentially stored.
Range Queries: Sparse indexes can be effective for range queries where
the query needs to retrieve a range of records rather than a single record.
LET’S START WITH DBMS :)
Indexing and its types

Dense Index: A dense index is a type of indexing technique used in databases where
there is an index entry for every single record in the data file. This means that the index
contains all the search keys and corresponding pointers (or addresses) to the actual
data records, making it highly efficient for direct lookups.

It is used when we have an un-ordered data set.

EX : Consider a table with 1,000 rows, and you create a dense index on a non-primary
key column. The dense index will have no ofunique non key records, each pointing to a
specific row in the table.

These are useful in scenarios where random access to individual records is quiet
frequent .
18 Ram Age Name
Search key data ref 18. Shyam B1
19 18 Ram
18 B1
20 18 Shyam
20 B2
19 B1
21
19 Raghu
20 B2
Dense Index 20 Riti
21 B2
no of records in index file=no 20 Raj

Index table of blocks unique non key


21 Rahul
records

How Dense Indexing Works:


In a dense index, each index entry includes:
A search key (the value of the indexed column).
A pointer (or address) to the actual record in the data file.
When a query searches for a specific value, the database uses the dense index to
quickly locate the corresponding record.
18 Ram Age Name
Search key data ref 18. Shyam B1
19 18 Ram
18 B1
Disadvantages of Dense Indexes 19 B1
20
20 B2
18 Shyam

Storage Intensive: Requires significant storage


21
19 Raghu
20 B2
Dense Index
space because it maintains an entry for every
20 Riti
21 B2
no of records in index file=no 20 Raj
record in the table. Index table of blocks unique non key
21 Rahul
records

Maintenance Overhead: Inserting, updating, or


deleting records requires updating the dense
index, which can be costly in terms of
performance, especially for large tables.
Use Cases:
Primary Indexing: Dense indexes are often used for primary indexing when
the table is small to medium-sized, and quick access to individual records
is required.
Exact Match Queries: Ideal for situations where queries frequently request
individual records based on an exact key match.
LET’S START WITH DBMS :)
Indexing and its types
Index table
B1 B2
Search key data ref
1-10 11-20
1 B1
B3 B4
21-30 31-40
11 B2
Hardisk
21 B3

When a query is run, the database engine checks the index to find the pointers
to the rows that contain the desired data. It then retrieves these rows directly,
without scanning the entire table.
LET’S START WITH DBMS :)
Indexing and its types
How indexing works?
B1 B2
It takes a search key as input and then returns a collection of all the matching
1-10 11-20
records. Now in index there are 2 columns, the first one stores a duplicate of the
key attribute/ non-key attribute(serach key) from table while the second one
stores pointer which hold the disk block address of the corresponding key-value.
LET’S START WITH DBMS :)
Indexing and its types
Indexing

Clustered Index Multilevel Indexing Non clustered Index

Primary index When there is one more


level of indexing in the index
Secondary index
When key= PK table we use inner and outer
When key= non-key/key
order= sorted index.
Follows sparse indexing order= unsorted
Mostly used for large files
Follows dense indexing

Clustering Index
When key= non-key
order= sorted
Follows dense indexing
LET’S START WITH DBMS :)
Indexing and its types

Key data: Unique identifiers for each record. Often used to distinguish one
record from another.
Non-key data: All other data in the record besides the key.

Single-Level Indexing is straightforward and works well for smaller


databases, offering a direct approach to speeding up query performance.
Multi-Level Indexing is more complex but necessary for larger databases, as
it effectively manages large indexes by creating additional layers of indexing,
thereby improving data retrieval times.
LET’S START WITH DBMS :)
Primary Index
It enhances the efficiency of retrieving records by using their primary key values. It establishes
an index structure that associates primary key values with disk block addresses. This index is
composed of a sorted list of primary key values paired with their corresponding disk block
pointers, thereby accelerating data retrieval by reducing search time. It is especially beneficial
for queries based on primary keys. It is done on sorted data
1 RollNo Name
Search key data ref 2 B1
3 1 Ram
1 B1
4 2 Shyam
4 B2 5 B2
6
3 Raghu
It creates an index
structure that maps Sparse Index 4 Riti

primary key values to no of reocrds in index 5 Raj


disk block addresses. file=no of blocks
6 Rahul
Index table
LET’S START WITH DBMS :)
Primary Index

Example : In a Users table with a UserID column as the primary key, a


primary index on UserID allows for quick retrieval of user records when you
know the UserID

The primary index ensures that the database can immediately locate the
row associated with a given UserID, making operations like SELECT,
UPDATE, and DELETE highly efficient.
LET’S START WITH DBMS :)
Cluster Index
The cluster index is generally on a non- key attribute. The data in the table is typically
ordered according to the order defined by the clustered index. Mostly each index entry
points directly to a row. It is mostly used where we are using GROUP BY. It physically
order records in a table based on the indexed columns.
Age Name
Search key data ref 18
18 B1
19 18 Ram
18 B1

20 18 Shyam
19 B1
20 B2
21 19 Raghu
20 B2

21 B2 Dense Index 20 Riti

no of reocrds in index file=no 20 Raj

Index table of blocks unique non key


21 Rahul
records
LET’S START WITH DBMS :)
Cluster Index
Example : In an Employees table, if you often need to produce lists of employees
ordered by LastName, creating a clustered index on LastName can make these
queries faster.

The clustered index keeps the rows in LastName order, reducing the need for the
database to perform additional sorting when executing queries.
LET’S START WITH DBMS :)
Multilevel Index
A multilevel index is an advanced indexing technique used in database management
systems to manage large indexes more efficiently. When a single-level index
becomes too large to fit into memory, multilevel indexing helps by breaking down
the index into multiple levels, reducing the number of disk I/O operations needed to
search through the index

First Level (Primary Index): The first level is the original index, where each entry
points to a block of data or a page in the table.
Second Level (Secondary Index): If the first level index is too large to fit in
memory, a second-level index is created. This index points to blocks of the first-
level index, effectively indexing the index itself.
LET’S START WITH DBMS :)
Multilevel Index
Eid Pointer Eid Data

Eid Pointer E101 B1 E101


E102
E151 B2 .
E101
.

E201 Eid Pointer


E151
E201 B3 E152
E301 .
.
E251 B4

Outer index E201


Eid Pointer E202
.
.
E301 B5

E351 B6

Inner index
LET’S START WITH DBMS :)
Secondary Index
Secondary indexing speeds up searches for non-key columns in a database.
Unlike primary indexing, which uses the primary key, secondary indexing
focuses on other columns. It creates an index that maps column values to the
locations where the records are stored, making it faster to find specific records
without scanning the entire table.

The data is mostly unsorted and it can be performed on both key and non-key
attributes.
LET’S START WITH DBMS :)
Secondary Index
1. when serach key is key attribute and data is unsorted

Search key data ref Age


18
18 B1 20
B1
Dense Index 45
67
no of reocrds in index 20 B1 68

file=no of blocks unique 22 B2


22
non key records 24 B2
24 B2 30

.
.
.
.
.
.
.

Index table
LET’S START WITH DBMS :)
Secondary Index
2. when serach key is non- key attribute and data is unsorted

Search key data ref Age


18
18 B1 18
B1
Dense Index 20
67
no of reocrds in index 20 B1 22

file=no of blocks unique 22 B1->B2


22
non key records 24 B2
24 B2 30

.
.
.
.
.
.
.

Index table
LET’S START WITH DBMS :)
Indexing and its types
How to create indexes
Non-Clustered index: Clustered index:
(On one column) Automatically created with the primary
CREATE INDEX index_name key. MySQL does not support explicit
ON table_name (column_name); creation of additional clustered indexes.

(On multiple column)


CREATE INDEX index_name
ON table_name (column_name1, column_name2,.....)
Remove an index
DROP INDEX index_name ON table_name
LET’S START WITH DBMS :)
B and B+ trees
Clustered and non-clustered indexes are concepts that describe how data is stored and
accessed in a database, but B-trees (and B+ trees) are the data structures that actually
implement these indexes. Knowing B-trees gives you insight into how these indexes work under
the hood.

Understanding B-trees helps you understand why certain queries perform well or poorly based
on the structure of the index. For example, how a B-tree's balanced nature affects search times
or why range queries are efficient with B-tree indexes

B-trees provide the balanced, efficient structure that makes these types of indexes performant,
ensuring that operations like search, insert, delete, and update are done in logarithmic time
(O(log n)). This makes B-tree indexing crucial for optimizing database queries, whether in
clustered or non-clustered index scenarios.
LET’S START WITH DBMS :)
B trees(M-way tree)

B-trees are self-balancing tree data structures that maintain sorted data and allow
searches, sequential access, insertions, and deletions in logarithmic time. All leaf
nodes are at the same level.

In B trees you can have minimum 2 children and max x children.


Now B-tree is a generalisation of Binary Search trees. In BST every node can have
atmost 2 children(0,1,2) and only one key

key root key

key
key key

Binary tree
B-tree
Root node
LET’S START WITH DBMS :)
internal node
B trees
Dp-> record/data pointer(where the record is present
in secondary memory(disk) Leaf node
Tp-> block/tree pointer(links to the children nodes)

Structure of B-Tree:
Nodes: A B-tree is composed of nodes, each containing keys and pointers (references) to
child nodes. The keys within a node are sorted in ascending order.
Root, Internal Nodes, and Leaves:
keys Rp
1. The root node is the topmost node in the tree. Tp
2. Internal nodes contain keys and child pointers.
3. Leaf nodes contain keys and possibly pointers to records or other data.
In indexing, each key in a B-tree node typically represents a value or range of values, and
the associated pointer directs to a data block where records corresponding to the key(s)
can be found.
For example, in a database, the key might be a value in a column, and the pointer might
direct to the location of a row or a set of rows in a table.
Root node
LET’S START WITH DBMS :)
internal node
B trees
A B-tree of order x can have:
Leaf node
The max no. of children
For every node -> x i.e the order of tree.
The max no. of keys Ceiling-> upper value
For every node -> x-1 floor-> lower value
The min no of keys
a. root node -> 1 K1 K2
b. other nodes apart from root-> ceiling(m/2) -1
The min no. of children
a. root node -> 2
b. Leaf node ->0
c. internal node-> ceiling(m/2)

Insertion in B-tree always happens from leaf node.


Root node
LET’S START WITH DBMS :)
internal node
B trees
To determine the order of a B-tree when the
Leaf node
block size, block pointer size, and data pointer
size are given :
For a B-tree node with m children:
m×Pb​+(m−1)×(K+Pd​)≤B Number of Keys: A node with m children can
have a maximum of m-1 keys.
B- Block size Number of Block Pointers: Each node has m
m- Order of tree block pointers (pointers to child nodes).
Pb- Block pointer size Number of Data Pointers: Each key has an
K- Key size associated data pointer, so there are m-1
Pd- Data pointer size data pointers.
To determine the order of a B-tree when the

LET’S START WITH DBMS :) block size, block pointer size, and data pointer
size are given :
B trees
m×Pb​+(m−1)×(K+Pd​)≤B
Let's say you have the following:
Block size (B) = 1024 bytes B- Block size
Block pointer size (Pb) = 8 bytes m- Order of tree
Pb- Block pointer size
Data pointer size (Pd) = 12 bytes K- Key size
Key size (K) = 16 bytes Pd- Data pointer size
Find the order of the tree.

Formula to find order : m×Pb​+(m−1)×(K+Pd​)≤B


mx8+(m-1)x(16+12) <=1024

So, the order of the B-tree is m = 29


LET’S START WITH DBMS :)
Insertion in B-TREE
Steps to insert values in B-tree
1.Start at the root and recursively move down the tree to find the appropriate leaf node where the new
value should be inserted.

2. Insert the value into the leaf node in sorted order. If the leaf node has fewer than the maximum
allowed keys (order - 1), this step is simple.

3. If the leaf node contains the maximum number of keys after the insertion, it causes an overflow.
Split the Node:
Divide the node into two nodes. The middle key (median) is pushed up to the parent node.
The left half of the original node stays in place, while the right half forms a new node.
Insert the Median into the Parent:
If the parent node also overflows after this insertion, recursively split the parent node and
propagate the median up the tree.

4.If the root node overflows (which can happen if it already has the maximum number of keys), split it
into two nodes, and the median becomes the new root. This increases the height of the B-tree by one.
A B-tree of order x can have:

LET’S START WITH DBMS :)


Every node will have max x children i.e the
order of tree.
Every node can have max (x-1) keys
The min no of keys
Insertion in B-TREE a. root node -> 1
b. other nodes apart from root-> ceiling(m/2) -1
Create a B-tree of Order 3 and insert values from 1,4,6,8,10,12,14,16 Min children
a. root node -> 2

Max children= order of tree=3 b. Leaf node ->0


c. internal node-> ceiling(m/2)

Max keys node can have=order-1=3-1=2

Insertion in B-tree happens from leaf node and values are also inserted
in sorted order. The element in left of root would be less than root and the
element in right would be greater than root

Step 1: Start at the root and recursively move down the tree to find the appropriate
leaf node where the new value should be inserted. Insert the value into the leaf
node in sorted order. If the leaf node has fewer than the maximum allowed keys
(order - 1), this step is simple.
1 4
Max children= order of tree=3
LET’S START WITH DBMS :) Max keys node can have=order-1=3-1=2
Insertion in B-TREE
Create a B-tree of Order 3 and insert values from 1,4,6,8,10,12,14,16

Step 2: If the leaf node contains the maximum number of keys after the insertion, it
causes an overflow.
Split the Node: Divide the node into two nodes. The middle key (median) is pushed up
to the parent node.
The left half of the original node stays in place, while the right half forms a new
node.
Insert the Median into the Parent:
If the parent node also overflows after this insertion, recursively split the parent
node and propagate the median up the tree.
4 4 (left- 1 , right- 6)

1 6
Max children= order of tree=3
LET’S START WITH DBMS :) Max keys node can have=order-1=3-1=2
Insertion in B-TREE
Create a B-tree of Order 3 and insert values from 1,4,6,8,10,12,14,16
4 (left- 1 , right- 6,8)
4

1 6 8

Follow step-2 again

4 (left- 1 , right- 6)
4 8
8(right -10,12)
1 6 10 12
Max children= order of tree=3
LET’S START WITH DBMS :) Max keys node can have=order-1=3-1=2
Insertion in B-TREE
Create a B-tree of Order 3 and insert values from 1,4,6,8,10,12,14,16
Step 3:If the root node overflows (which can happen if it already has the maximum
number of keys), split it into two nodes, and the median becomes the new root. This
increases the height of the B-tree by one.
4 (left- 1 , right- 6)
8
8(left-4, right-12)
4 12 12(left -10,right -14)

1 6 10 14
maximum of (m-1)=3 keys
LET’S START WITH DBMS :) and minimum of (ceil(m/2)-1)=1 key
Deletion in B-TREE
Consider a B-tree of order 4
]
Step 1: Begin at the root and recursively move 40, 50
down the tree to find the node that contains
the key to be deleted. 10, 20 44, 47 52, 59

Step 2 :
Case 1: The Key is in a Leaf Node (Delete 10)
Simply remove the key from the leaf node. ]
If the node still has the minimum required number 40, 50
of keys (i.e., at least ceil(order/2) - 1 keys), the
deletion is complete. 20 44, 47 52, 59
maximum of (m-1)=3 keys
LET’S START WITH DBMS :) and minimum of (ceil(m/2)-1)=1 key
Deletion in B-TREE
Consider a B-tree of order 4 ]
40, 50
Step 2 :
Case 2: The Key is in a Leaf Node (Delete
20) 20 44, 47 52, 59
If the deletion causes the node to have
fewer than the minimum number of keys,
proceed to the Borrowing or Merging
step.

1. If the node has a sibling with more than ]


47, 50
the minimum number of keys, you can
borrow a key from this sibling. The
parent key between the node and the 40 44 52, 59
sibling moves down to the node, and a
key from the sibling moves up to the
parent.
maximum of (m-1)=3 keys
LET’S START WITH DBMS :) and minimum of (ceil(m/2)-1)=1 key
Deletion in B-TREE
Consider a B-tree of order 4
]
47, 50
Step 2 :
Case 2: The Key is in a Leaf Node (Delete 40)
40 44 52, 59
2. If borrowing is not possible (i.e., the sibling
also has the minimum number of keys), merge
the node with a sibling. The key from the
parent that separates the two nodes moves
]
down into the newly merged node.If this 50
causes the parent to have too few keys,
repeat the borrowing or merging process at 44 47 52, 59
the parent level.
]
40, 50
LET’S START WITH DBMS :)
Deletion in B-TREE 20 44, 47 52, 59
Steps on how we can delete elements in B-
tree
Step 2: 10,11 22,23 43 48 60
Case 2: The Key is in an Internal Node (Delete 20)
Replace the key with its predecessor (the largest key in the left subtree) or its
successor (the smallest key in the right subtree).
Delete the predecessor or successor key from the corresponding subtree.
If this causes an underflow (i.e., a node has too few keys), proceed to the
Borrowing or Merging step. 40, 50

11 44, 47 52, 59

10 22,23 43 48 60
LET’S START WITH DBMS :)
B+ Tree
A B+ tree is an extension of the B-tree and is commonly used in databases and
file systems to maintain sorted data and allow for efficient insertion, deletion, and
search operations. B+ tree is a balanced tree, meaning all leaf nodes are at the
same level

The key difference between a B+ tree and a B-tree lies in how they store data
and how leaf nodes are structured.
1. In a B+ tree, all actual data (or references to data) are stored in the leaf nodes.
Internal nodes only store keys

leaf node
LET’S START WITH DBMS :)
B+ Tree

2. In a B+ tree, Leaf nodes are linked together in a linked list fashion, allowing for
efficient sequential access, one leaf node will have only 1 Bp.

leaf nodes

During inserting value in B+ tree, a copy of the key is always stored in


the leaf node.
LET’S START WITH DBMS :)
B+ Tree

3. Internal nodes do not store data pointers, only keys and child pointers. This allows
more keys to be stored in each internal node, leading to a lower height and more
efficient operations.

internal nodes
LET’S START WITH DBMS :)
B+ Tree

Structure of a B+ Tree:
Root Node:
The top node of the B+ tree, which points to the first level of
internal nodes or directly to leaf nodes if the tree has only one
level.
Internal Nodes:
These nodes contain only keys and pointers to child nodes.
They guide the search process down to the correct leaf node.
Leaf Nodes:
These nodes contain keys and data pointers. Each leaf node
stores a pointer to the next leaf node, enabling quick traversal
of records.
LET’S START WITH DBMS :)
B+ Tree
Advantages :

1. B+ trees have a balanced structure, meaning all leaf nodes are at the same
level. This balance ensures that search operations require logarithmic time
relative to the number of keys, making it very efficient even for large
datasets.
2. The structure of the B+ tree allows for direct access to data. Since the
internal nodes contain only keys, searching for a specific value can be done
quickly by navigating through the tree down to the leaf node where the data
is stored.
B- Block size
LET’S START WITH DBMS :) m- Order of tree
B+ Tree Pb- Block pointer size
Order of B+ tree K- Key size
Pd- Data pointer size
Order of Leaf node :
leaf nodes
1xPb+M(k+Pd)<=B

Order of non-Leaf node(internal,root) :


internal nodes
mxPb+(m-1)k<=B
LET’S START WITH DBMS :)
Difference between B and B+ Tree

B-Tree Example: Think of a B-Tree as a directory structure on your computer


where folders (nodes) contain both names (keys) and pointers to subfolders
or files (child pointers).

B+ Tree Example: Imagine a library catalog where all book records are listed
in a sorted, linked list (leaf nodes), and the internal nodes only contain
pointers to guide you to the right part of the catalog.
LET’S START WITH DBMS :)
Difference between B and B+ Tree
Case B tree B+ tree

keys and associated pointers to data are all actual data is stored only in the leaf
Data Storage stored in all nodes (internal and leaf nodes.Internal nodes contain only keys
nodes) and pointers to child nodes

Leaf nodes are linked together in a linked


There is no inherent linked structure list. This linked structure allows for
Leaf Node Linking
between the leaf nodes efficient sequential access, making range
queries faster and easier

Access times can be slower for certain types


Leaf nodes are linked, allowing for efficient
Search Performance of queries because you may need to traverse
sequential access and range queries.
multiple levels of the tree to find the data.

B+ trees typically have better space utilization


Since data is stored throughout the tree, B-
because internal nodes are used only for
Space Utilization trees might have lower space utilization in the
keys, allowing more keys to be stored per
nodes.
node.

You might also like