Linear Data Structures
Definition:
Linear data structures are data structures where elements are arranged in a sequential manner,
and each element is connected to its previous and next element (except the first and last).
🔗 Key Characteristics:
Elements are stored one after another.
Memory is organized contiguously or with pointers.
Traversal is done in linear order (one element after another).
✅ Common Linear Data Structures:
Data Structure | Description | Key Operations
Array | Collection of elements stored in contiguous memory | Access, Insert, Delete
Linked List | Sequence of nodes where each node points to the next | Insert, Delete, Traverse
Stack | Follows LIFO (Last-In, First-Out) | push, pop, peek
Queue | Follows FIFO (First-In, First-Out) | enqueue, dequeue, peek
Deque | Double-ended queue; insert/remove from both ends | append, appendleft, pop, popleft
📊 Time Complexity Summary:
Operation Array Linked List Stack Queue
Access (by index) O(1) O(n) O(n) O(n)
Insert at end O(1) O(1) O(1) O(1)
Insert at start O(n) O(1) O(1) O(1)
Delete from end O(1) O(n) O(1) O(n)
Delete from start O(n) O(1) O(n) O(1)
Note: Actual complexity may vary depending on implementation (e.g., circular queue, doubly
linked list).
🧠 Use Cases
Array: Fast random access (e.g., image processing, dynamic programming).
Linked List: Frequent insert/delete (e.g., music playlists, undo feature).
Stack: Expression evaluation, function call stack, backtracking.
Queue: Scheduling, buffers, BFS (Breadth-First Search).
Examples in Python
# Array
arr = [1, 2, 3]
arr.append(4)

# Stack
stack = []
stack.append(5)
stack.pop()

# Queue
from collections import deque
queue = deque()
queue.append(1)
queue.popleft()

# Linked List (simplified)
class Node:
    def __init__(self, val):
        self.val = val
        self.next = None
Non-Linear Data Structures
Definition:
Non-linear data structures are structures where data is not arranged in a sequential manner.
Elements may connect to multiple other elements, forming hierarchical or networked
relationships.
🔗 Key Characteristics:
No single sequence of elements (unlike arrays or linked lists).
Elements may be arranged hierarchically (like in trees) or in complex networks (like in
graphs).
Ideal for representing real-world relationships such as maps, organization charts, and file
systems.
✅ Common Non-Linear Data Structures:
Structure | Description | Example Use Cases
Tree | Hierarchical structure with parent-child nodes | File systems, XML parsers, databases
Binary Tree | Each node has at most two children | Expression trees, Huffman coding
Binary Search Tree (BST) | Tree where left < root < right | Efficient search, insert, delete
Heap | Complete binary tree for priority management | Priority queue, scheduling
Trie | Tree for prefix matching | Dictionary/Autocomplete apps
Graph | Nodes (vertices) connected by edges | Networks, social media, route maps
🧠 Detailed Overview:
1. Tree
Root node + branches.
No cycles.
Types: N-ary tree, Binary tree, AVL tree, B-Trees, etc.
2. Binary Search Tree (BST)
Left subtree < root < right subtree.
Allows quick search, insertion, deletion in O(log n) (balanced).
3. Heap
Min-Heap: Parent ≤ Children
Max-Heap: Parent ≥ Children
Used in algorithms like Heap Sort and Dijkstra’s algorithm.
4. Trie
Efficient retrieval of strings.
Every node represents a character.
Common for autocomplete and spell-check.
5. Graph
Can be:
o Directed or Undirected
o Weighted or Unweighted
o Cyclic or Acyclic
Represented using:
o Adjacency Matrix or Adjacency List
📝 Example in Python
Tree Node (Binary Tree)
class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None
Graph (Adjacency List)
graph = {
    'A': ['B', 'C'],
    'B': ['D'],
    'C': ['E'],
    'D': [],
    'E': ['B']
}
🔍 Applications of Non-Linear Structures:
Data Structure Application
Tree File systems, XML parsing, AI game trees
BST Efficient searching and sorting
Heap Priority queues, task schedulers
Trie Text prediction, IP routing
Graph GPS navigation, social networks
Linked Lists
A Linked List is a linear data structure where each element (called a node) contains:
1. Data (value)
2. A pointer (reference) to the next node in the sequence
Unlike arrays, linked lists do not use contiguous memory, allowing dynamic memory allocation.
✅ Types of Linked Lists
Type | Description | Structure
Singly Linked List | Each node points to the next node only | A → B → C → None
Doubly Linked List | Each node points to both next and previous nodes | None ← A ⇄ B ⇄ C → None
Circular Linked List | Last node points to the first node | A → B → C → A
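The circular variant above can be sketched as follows; the Node class mirrors the one used later in these notes, while `build_circular` and `traverse_once` are illustrative names (the traversal must stop when it loops back to the head, since there is no `None` terminator):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def build_circular(values):
    """Build a circular singly linked list and return its head."""
    head = Node(values[0])
    current = head
    for v in values[1:]:
        current.next = Node(v)
        current = current.next
    current.next = head  # last node points back to the first
    return head

def traverse_once(head):
    """Visit every node exactly once, stopping when the cycle returns to head."""
    result = [head.data]
    current = head.next
    while current is not head:
        result.append(current.data)
        current = current.next
    return result

print(traverse_once(build_circular(['A', 'B', 'C'])))  # ['A', 'B', 'C']
```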
🧠 Basic Operations (Singly Linked List)
Operation Time Complexity
Insert at Head O(1)
Insert at Tail O(n)
Delete Node O(n)
Search O(n)
Traverse O(n)
📝 Python Example
Define Node and Singly Linked List:
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    # Insert at head
    def insert_at_head(self, data):
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node

    # Print list
    def print_list(self):
        current = self.head
        while current:
            print(current.data, end=" → ")
            current = current.next
        print("None")

# Example usage
ll = LinkedList()
ll.insert_at_head(3)
ll.insert_at_head(2)
ll.insert_at_head(1)
ll.print_list()  # Output: 1 → 2 → 3 → None
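The O(n) search and delete operations from the table above are not implemented in the class; a minimal sketch follows, with the Node and LinkedList definitions repeated so the snippet is self-contained (the method names `search` and `delete` are my own):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    def insert_at_head(self, data):
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node

    def search(self, target):
        # Walk the chain until the value is found or the list ends: O(n)
        current = self.head
        while current:
            if current.data == target:
                return True
            current = current.next
        return False

    def delete(self, target):
        # Unlink the first node holding target, if any: O(n)
        prev, current = None, self.head
        while current:
            if current.data == target:
                if prev:
                    prev.next = current.next
                else:
                    self.head = current.next  # deleting the head node
                return True
            prev, current = current, current.next
        return False

ll = LinkedList()
for x in (3, 2, 1):
    ll.insert_at_head(x)  # list is now 1 → 2 → 3
print(ll.search(2))       # True
print(ll.delete(2))       # True
print(ll.search(2))       # False
```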
Applications
Dynamic memory usage where size is unknown
Undo functionality in editors
Hash table chaining (separate chaining)
Real-time task scheduling
❗ Why Use Linked Lists?
Arrays Linked Lists
Fixed size Dynamic size
Insertion/Deletion = O(n) Insertion/Deletion = O(1) at head
Fast random access (O(1)) Slow access (O(n))
Stacks
A Stack is a linear data structure that follows the LIFO principle — Last In, First Out.
This means the last element inserted is the first one to be removed.
📦 Key Stack Operations:
Operation Description Time Complexity
push(x) Insert element x on top O(1)
pop() Remove and return the top element O(1)
peek() or top() View top element without removing O(1)
isEmpty() Check if the stack is empty O(1)
🧠 Real-World Analogies:
Stack of plates
Undo/Redo in text editors
Call stack in recursive programming
📝 Stack Implementation in Python
Using List:
stack = []

# Push elements
stack.append(10)
stack.append(20)

# Peek top
print(stack[-1])  # Output: 20

# Pop element
stack.pop()  # Removes 20

# Check empty
print(len(stack) == 0)  # False
Using collections.deque (more efficient):
from collections import deque
stack = deque()
stack.append(1)
stack.append(2)
stack.pop()
💡 Applications of Stacks
Use Case Description
Expression Evaluation Infix to postfix conversion, postfix eval
Balanced Parentheses Checking {[()]} using a stack
Backtracking Maze solving, DFS
Function Calls (Call Stack) Tracks active function calls (recursion)
Undo/Redo Track previous states
🔍 Example: Balanced Parentheses Checker
def is_balanced(expr):
    stack = []
    pairs = {')': '(', ']': '[', '}': '{'}
    for char in expr:
        if char in '([{':
            stack.append(char)
        elif char in ')]}':
            if not stack or stack.pop() != pairs[char]:
                return False
    return not stack

# Test
print(is_balanced("([{}])"))  # True
print(is_balanced("([)]"))    # False
1. Expression Evaluation Using Stack
Stacks are essential in evaluating arithmetic expressions, especially when converting between or
evaluating:
Infix (e.g., 3 + 4)
Postfix (Reverse Polish Notation - RPN) (e.g., 3 4 +)
Prefix (Polish Notation) (e.g., + 3 4)
✅ A. Postfix Evaluation (e.g., 2 3 1 * + 9 -)
Algorithm:
1. Scan the expression from left to right.
2. If it's a number, push onto the stack.
3. If it's an operator, pop two elements, apply the operator, and push the result.
Python Example:
def evaluate_postfix(expr):
    stack = []
    for token in expr.split():
        if token.isdigit():
            stack.append(int(token))
        else:
            b = stack.pop()
            a = stack.pop()
            if token == '+': stack.append(a + b)
            elif token == '-': stack.append(a - b)
            elif token == '*': stack.append(a * b)
            elif token == '/': stack.append(int(a / b))  # truncate toward zero
    return stack[0]

# Test
print(evaluate_postfix("2 3 1 * + 9 -"))  # Output: -4
✅ B. Infix to Postfix Conversion
Steps:
1. Maintain an operator stack and an output list.
2. Send operands straight to the output; before pushing an operator, pop operators of higher (or, for left-associative operators, equal) precedence to the output.
3. Push '(' onto the stack; on ')', pop operators to the output until the matching '(' is removed.
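A minimal shunting-yard-style sketch of the conversion, assuming space-separated tokens; the function name and the precedence table covering + - * / are illustrative choices:

```python
def infix_to_postfix(expr):
    precedence = {'+': 1, '-': 1, '*': 2, '/': 2}
    output, ops = [], []
    for token in expr.split():
        if token.isdigit():
            output.append(token)   # operands go straight to the output
        elif token == '(':
            ops.append(token)
        elif token == ')':
            while ops and ops[-1] != '(':
                output.append(ops.pop())
            ops.pop()              # discard the '('
        else:
            # pop operators of greater or equal precedence (left-associative)
            while ops and ops[-1] != '(' and precedence[ops[-1]] >= precedence[token]:
                output.append(ops.pop())
            ops.append(token)
    while ops:                     # flush remaining operators
        output.append(ops.pop())
    return ' '.join(output)

print(infix_to_postfix("3 + 4 * 2"))      # 3 4 2 * +
print(infix_to_postfix("( 3 + 4 ) * 2"))  # 3 4 + 2 *
```

The result can be fed directly into the `evaluate_postfix` function above.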
🧊 2. Sorting a Stack (Using Stack Only)
Goal: Sort a stack into ascending order (largest element on top) using only stack operations.
Approach: Use an auxiliary stack to insert elements in sorted order.
✅ Python Example
def sort_stack(input_stack):
    temp_stack = []
    while input_stack:
        tmp = input_stack.pop()
        while temp_stack and temp_stack[-1] > tmp:
            input_stack.append(temp_stack.pop())
        temp_stack.append(tmp)
    return temp_stack

# Test
s = [34, 3, 31, 98, 92, 23]
sorted_stack = sort_stack(s)
print(sorted_stack)  # Output: [3, 23, 31, 34, 92, 98]
💡 Sorting Stack Logic
Pop elements one by one from the original stack.
Use a temporary stack to hold sorted elements.
Maintain sorted order in the auxiliary stack.
Queues
A Queue is a linear data structure that follows the FIFO (First In, First Out) principle.
This means the first element added to the queue will be the first one to be removed.
🧱 Structure
Each queue operation happens at two ends:
Enqueue → Add to the rear
Dequeue → Remove from the front
✅ Basic Operations
Operation Description Time Complexity
enqueue(x) Add element x to the end O(1)
dequeue() Remove and return front element O(1)
peek() View front element without removing O(1)
isEmpty() Check if the queue is empty O(1)
🧠 Types of Queues
Type Description
Simple Queue FIFO structure using an array or linked list
Circular Queue Connects the end back to the front to use space efficiently
Priority Queue Elements are dequeued by priority, not order
Deque Double-ended queue — insert/delete from both ends
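The circular queue row above can be sketched with a fixed-size buffer and wrap-around indices; the class name, capacity handling, and error choices here are illustrative:

```python
class CircularQueue:
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.front = 0   # index of the current front element
        self.size = 0

    def enqueue(self, x):
        if self.size == self.capacity:
            raise OverflowError("queue is full")
        rear = (self.front + self.size) % self.capacity  # wrap around
        self.buf[rear] = x
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("queue is empty")
        x = self.buf[self.front]
        self.front = (self.front + 1) % self.capacity
        self.size -= 1
        return x

q = CircularQueue(3)
q.enqueue(1); q.enqueue(2); q.enqueue(3)
print(q.dequeue())  # 1
q.enqueue(4)        # reuses the slot freed by the dequeue
print(q.dequeue())  # 2
```

The modulo arithmetic is what lets the rear wrap past the end of the array and reuse freed slots, which a plain array-backed queue cannot do without shifting elements.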
📝 Queue Implementation in Python
Using collections.deque:
from collections import deque
q = deque()
# Enqueue
q.append(10)
q.append(20)
# Dequeue
print(q.popleft()) # Output: 10
# Peek
print(q[0]) # Output: 20
Using queue.Queue (Thread-safe):
from queue import Queue
q = Queue()
q.put(1) # enqueue
q.put(2)
print(q.get()) # dequeue → 1
print(q.qsize()) # Output: 1
💡 Applications of Queues
Use Case Description
CPU/IO scheduling Jobs processed in the order they arrive
Print queue Print tasks processed in order
Breadth-First Search BFS in trees/graphs uses a queue
Call center queue Customers served in order
🔍 Example: BFS Using Queue
from collections import deque

def bfs(graph, start):
    visited = set()
    q = deque([start])
    while q:
        node = q.popleft()
        if node not in visited:
            print(node, end=' ')
            visited.add(node)
            q.extend(graph[node])

# Sample graph (Adjacency List)
graph = {
    'A': ['B', 'C'],
    'B': ['D'],
    'C': ['E'],
    'D': [],
    'E': ['B']
}

bfs(graph, 'A')  # Output: A B C D E
🔄 Implement Stack Using Queues
The goal is to simulate LIFO (Last-In-First-Out) behavior of a Stack using FIFO (First-In-First-Out)
behavior of Queue(s).
✅ Approach 1: Using Two Queues (Costly pop())
🔧 Logic:
1. Push: Always enqueue to q1.
2. Pop:
o Move all elements from q1 to q2 except the last one.
o Pop and return the last one from q1.
o Swap q1 and q2.
🧪 Code:
from collections import deque

class StackUsingQueues:
    def __init__(self):
        self.q1 = deque()
        self.q2 = deque()

    def push(self, x):
        self.q1.append(x)

    def pop(self):
        while len(self.q1) > 1:
            self.q2.append(self.q1.popleft())
        popped = self.q1.popleft()
        self.q1, self.q2 = self.q2, self.q1  # swap queues
        return popped

    def top(self):
        while len(self.q1) > 1:
            self.q2.append(self.q1.popleft())
        top = self.q1.popleft()
        self.q2.append(top)
        self.q1, self.q2 = self.q2, self.q1
        return top

    def isEmpty(self):
        return not self.q1
✅ Approach 2: Using One Queue (Costly push())
🔧 Logic:
1. Push:
o Enqueue new element.
o Rotate the queue so the new element is at the front.
2. Pop: Simply dequeue from the front.
🧪 Code:
from collections import deque

class StackUsingOneQueue:
    def __init__(self):
        self.q = deque()

    def push(self, x):
        self.q.append(x)
        for _ in range(len(self.q) - 1):
            self.q.append(self.q.popleft())  # rotate so x is at the front

    def pop(self):
        return self.q.popleft()

    def top(self):
        return self.q[0]

    def isEmpty(self):
        return not self.q
⚖️ Comparison
Method Push Time Pop Time Space
Two Queues O(1) O(n) O(n)
One Queue (rotate) O(n) O(1) O(n)
Page Replacement Algorithms Using Queues
Page Replacement Algorithms can be implemented using queues to simulate memory frame usage.
This works well especially for FIFO, and with some modifications for LRU and Clock (Second
Chance).
✅ 1. FIFO Using Queue
FIFO (First-In First-Out) removes the page that was added first to memory.
🔧 Logic:
Use a queue to represent memory frames.
If a page is not in queue (i.e., not in memory), it causes a page fault.
If the queue is full, remove the front (oldest) page.
🧪 Python Code:
from collections import deque

def fifo_page_replacement(pages, capacity):
    memory = deque()
    page_faults = 0
    for page in pages:
        if page not in memory:
            page_faults += 1
            if len(memory) == capacity:
                memory.popleft()
            memory.append(page)
    return page_faults

# Test
pages = [7, 0, 1, 2, 0, 3, 0, 4]
capacity = 3
print("Page Faults (FIFO):", fifo_page_replacement(pages, capacity))
✅ 2. LRU Using Queue
Least Recently Used (LRU) evicts the page that hasn't been used for the longest time.
🔧 Logic:
Use a deque to maintain page usage order.
If the page is accessed again, move it to the rear (most recently used).
On page fault, evict the front (least recently used) if full.
🧪 Python Code:
from collections import deque

def lru_page_replacement(pages, capacity):
    memory = deque()
    page_faults = 0
    for page in pages:
        if page in memory:
            memory.remove(page)
            memory.append(page)  # move to most recently used
        else:
            page_faults += 1
            if len(memory) == capacity:
                memory.popleft()
            memory.append(page)
    return page_faults

# Test
pages = [7, 0, 1, 2, 0, 3, 0, 4]
capacity = 3
print("Page Faults (LRU):", lru_page_replacement(pages, capacity))
✅ 3. Clock Algorithm Using Circular Queue
The Clock (Second Chance) algorithm:
Maintains a circular queue of frames.
Each frame has a reference bit.
On page fault, if reference bit = 0, replace page.
If reference bit = 1, give a "second chance" and set it to 0.
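The steps above can be sketched with a frame list, a parallel list of reference bits, and a clock hand. All names are illustrative, and note that presentations differ on whether a newly loaded page's reference bit starts at 1; this sketch sets it to 1:

```python
def clock_page_replacement(pages, capacity):
    frames = [None] * capacity   # pages currently in memory
    ref_bits = [0] * capacity    # second-chance bits
    hand = 0                     # clock hand position
    page_faults = 0
    for page in pages:
        if page in frames:
            ref_bits[frames.index(page)] = 1   # referenced again: second chance
            continue
        page_faults += 1
        # advance the hand, clearing bits, until a frame with bit 0 is found
        while ref_bits[hand] == 1:
            ref_bits[hand] = 0
            hand = (hand + 1) % capacity
        frames[hand] = page
        ref_bits[hand] = 1
        hand = (hand + 1) % capacity
    return page_faults

pages = [7, 0, 1, 2, 0, 3, 0, 4]
print("Page Faults (Clock):", clock_page_replacement(pages, 3))
```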
Deques (Double-Ended Queues)
A Deque (pronounced "deck") is a linear data structure that allows insertion and deletion from
both ends — front and rear.
🧠 Key Features:
Operation Front Rear
Insertion ✅ ✅
Deletion ✅ ✅
Access (peek) ✅ ✅
This makes it more powerful than stacks (LIFO) and queues (FIFO).
🔧 Types of Deques:
1. Input-restricted deque – insertion at one end only.
2. Output-restricted deque – deletion at one end only.
✅ Python Deque (Using collections.deque)
from collections import deque
dq = deque()
# Insert at rear
dq.append(10)
dq.append(20)
# Insert at front
dq.appendleft(5)
# Remove from rear
dq.pop() # 20
# Remove from front
dq.popleft() # 5
# Peek front and rear
print("Front:", dq[0])
print("Rear:", dq[-1])
📦 Use Cases:
Sliding window problems
Palindrome checking
Undo/Redo functionality
Job scheduling
Maximum/Minimum in a subarray
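As an example of the palindrome use case above (the function name is illustrative), a deque lets us compare and discard characters from both ends at once:

```python
from collections import deque

def is_palindrome(s):
    dq = deque(s)
    # Compare and discard from both ends until they meet in the middle
    while len(dq) > 1:
        if dq.popleft() != dq.pop():
            return False
    return True

print(is_palindrome("racecar"))  # True
print(is_palindrome("hello"))    # False
```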
🧪 Example: Sliding Window Maximum (Using Deque)
from collections import deque

def max_sliding_window(nums, k):
    q = deque()
    result = []
    for i in range(len(nums)):
        # Remove out-of-window indices
        if q and q[0] <= i - k:
            q.popleft()
        # Remove smaller values
        while q and nums[q[-1]] < nums[i]:
            q.pop()
        q.append(i)
        if i >= k - 1:
            result.append(nums[q[0]])
    return result

# Example:
print(max_sliding_window([1, 3, -1, -3, 5, 3, 6, 7], 3))
🆚 Comparison with Other Structures
Structure Insertion Deletion Ends Used
Stack One end One end Rear
Queue Rear Front FIFO
Deque Both ends Both ends Versatile
Implementing a Stack Using a Deque
A deque (double-ended queue) allows insertion and deletion at both ends, which makes it easy to
simulate a stack's LIFO behavior.
✅ Approach
Use one end of the deque for both push and pop operations.
For a stack, typically use the rear (right side) of the deque.
push(x) → append(x) at the rear.
pop() → pop() from the rear.
top() → peek the rear element.
🧪 Python Code
from collections import deque

class StackUsingDeque:
    def __init__(self):
        self.dq = deque()

    def push(self, x):
        self.dq.append(x)

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty stack")
        return self.dq.pop()

    def top(self):
        if self.is_empty():
            raise IndexError("top from empty stack")
        return self.dq[-1]

    def is_empty(self):
        return len(self.dq) == 0

# Usage example:
stack = StackUsingDeque()
stack.push(10)
stack.push(20)
print(stack.top())  # Output: 20
print(stack.pop())  # Output: 20
print(stack.pop())  # Output: 10
⚡ Explanation
append(x) adds to the right end.
pop() removes from the right end, behaving exactly like a stack.
This implementation has O(1) time complexity for push, pop, and top.
Heap Data Structure
A Heap is a specialized complete binary tree that satisfies the heap property:
Max-Heap: Parent node is greater than or equal to its children.
Min-Heap: Parent node is less than or equal to its children.
Heaps are commonly used to implement priority queues.
🧠 Key Characteristics:
Complete binary tree: all levels filled except possibly the last, which is filled from left to
right.
Efficient for finding the maximum (max-heap) or minimum (min-heap) element.
Supports efficient insertion, deletion, and access to the root element.
Operations
Operation Description Time Complexity
insert Add a new element and maintain heap property O(log n)
extract_max / extract_min Remove and return root element (max or min) O(log n)
peek Return the root element without removing it O(1)
heapify Convert an arbitrary array into a heap O(n)
📊 Array Representation of Heap
Parent at index i
Left child at index 2*i + 1
Right child at index 2*i + 2
🧪 Python Example: Min-Heap Using heapq
import heapq
min_heap = []
heapq.heappush(min_heap, 5)
heapq.heappush(min_heap, 3)
heapq.heappush(min_heap, 8)
print(heapq.heappop(min_heap)) # Output: 3 (smallest element)
print(min_heap[0]) # Peek smallest element (5)
Manual Heap Implementation (Max-Heap Insert and Extract)
class MaxHeap:
    def __init__(self):
        self.heap = []

    def insert(self, val):
        self.heap.append(val)
        self._heapify_up(len(self.heap) - 1)

    def _heapify_up(self, index):
        parent = (index - 1) // 2
        if index > 0 and self.heap[index] > self.heap[parent]:
            self.heap[index], self.heap[parent] = self.heap[parent], self.heap[index]
            self._heapify_up(parent)

    def extract_max(self):
        if not self.heap:
            return None
        if len(self.heap) == 1:
            return self.heap.pop()
        root = self.heap[0]
        self.heap[0] = self.heap.pop()
        self._heapify_down(0)
        return root

    def _heapify_down(self, index):
        largest = index
        left = 2 * index + 1
        right = 2 * index + 2
        if left < len(self.heap) and self.heap[left] > self.heap[largest]:
            largest = left
        if right < len(self.heap) and self.heap[right] > self.heap[largest]:
            largest = right
        if largest != index:
            self.heap[index], self.heap[largest] = self.heap[largest], self.heap[index]
            self._heapify_down(largest)

# Usage
h = MaxHeap()
h.insert(10)
h.insert(20)
h.insert(5)
print(h.extract_max())  # Output: 20
Min-Heap Data Structure
A Min-Heap is a complete binary tree where each node’s value is less than or equal to the values of
its children.
The minimum element is always at the root (top).
🧠 Properties:
Complete binary tree (all levels fully filled except possibly the last, filled left to right).
Parent ≤ Children
Efficient access to minimum element (root).
Operations & Time Complexity
Operation Description Time Complexity
insert Add new element & maintain heap O(log n)
extract_min Remove & return root (minimum) O(log n)
peek Get minimum element without removal O(1)
heapify Build min-heap from array O(n)
🔢 Array Representation
For node at index i:
o Left child at 2*i + 1
o Right child at 2*i + 2
o Parent at (i-1)//2
🧪 Python Example: Min-Heap using heapq module
import heapq
min_heap = []
# Insert elements
heapq.heappush(min_heap, 10)
heapq.heappush(min_heap, 5)
heapq.heappush(min_heap, 20)
print(min_heap[0]) # Peek min element → 5
print(heapq.heappop(min_heap)) # Extract min → 5
print(heapq.heappop(min_heap)) # Extract min → 10
Manual Min-Heap Implementation
class MinHeap:
    def __init__(self):
        self.heap = []

    def insert(self, val):
        self.heap.append(val)
        self._heapify_up(len(self.heap) - 1)

    def _heapify_up(self, index):
        parent = (index - 1) // 2
        if index > 0 and self.heap[index] < self.heap[parent]:
            self.heap[index], self.heap[parent] = self.heap[parent], self.heap[index]
            self._heapify_up(parent)

    def extract_min(self):
        if not self.heap:
            return None
        if len(self.heap) == 1:
            return self.heap.pop()
        root = self.heap[0]
        self.heap[0] = self.heap.pop()
        self._heapify_down(0)
        return root

    def _heapify_down(self, index):
        smallest = index
        left = 2 * index + 1
        right = 2 * index + 2
        if left < len(self.heap) and self.heap[left] < self.heap[smallest]:
            smallest = left
        if right < len(self.heap) and self.heap[right] < self.heap[smallest]:
            smallest = right
        if smallest != index:
            self.heap[index], self.heap[smallest] = self.heap[smallest], self.heap[index]
            self._heapify_down(smallest)

# Example Usage
h = MinHeap()
h.insert(15)
h.insert(10)
h.insert(20)
print(h.extract_min())  # Output: 10
print(h.extract_min())  # Output: 15
Max-Heap Data Structure
A Max-Heap is a complete binary tree where each node’s value is greater than or equal to the
values of its children.
The maximum element is always at the root (top).
🧠 Properties:
Complete binary tree.
Parent ≥ Children.
Efficient access to maximum element (root).
Operations & Time Complexity
Operation Description Time Complexity
insert Add new element & maintain heap O(log n)
extract_max Remove & return root (maximum) O(log n)
peek Get maximum element without removal O(1)
heapify Build max-heap from array O(n)
🔢 Array Representation
For node at index i:
o Left child at 2*i + 1
o Right child at 2*i + 2
o Parent at (i-1)//2
Manual Max-Heap Implementation in Python
class MaxHeap:
    def __init__(self):
        self.heap = []

    def insert(self, val):
        self.heap.append(val)
        self._heapify_up(len(self.heap) - 1)

    def _heapify_up(self, index):
        parent = (index - 1) // 2
        if index > 0 and self.heap[index] > self.heap[parent]:
            self.heap[index], self.heap[parent] = self.heap[parent], self.heap[index]
            self._heapify_up(parent)

    def extract_max(self):
        if not self.heap:
            return None
        if len(self.heap) == 1:
            return self.heap.pop()
        root = self.heap[0]
        self.heap[0] = self.heap.pop()
        self._heapify_down(0)
        return root

    def _heapify_down(self, index):
        largest = index
        left = 2 * index + 1
        right = 2 * index + 2
        if left < len(self.heap) and self.heap[left] > self.heap[largest]:
            largest = left
        if right < len(self.heap) and self.heap[right] > self.heap[largest]:
            largest = right
        if largest != index:
            self.heap[index], self.heap[largest] = self.heap[largest], self.heap[index]
            self._heapify_down(largest)

# Example Usage
h = MaxHeap()
h.insert(10)
h.insert(20)
h.insert(5)
print(h.extract_max())  # Output: 20
print(h.extract_max())  # Output: 10
🧪 Python heapq module note:
The built-in heapq module implements a min-heap by default.
To simulate max-heap with heapq, insert negative values and negate them back when
popping.
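That negation trick can be sketched as:

```python
import heapq

max_heap = []
for val in (10, 5, 20):
    heapq.heappush(max_heap, -val)   # store negatives: smallest negative = largest original

print(-max_heap[0])              # peek max → 20
print(-heapq.heappop(max_heap))  # extract max → 20
print(-heapq.heappop(max_heap))  # extract max → 10
```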
Binomial Heap
A Binomial Heap is a collection of binomial trees that are linked together, supporting efficient
priority queue operations. It generalizes a heap by allowing fast merge (union) operations.
🧠 Key Concepts:
Binomial Tree B_k:
A binomial tree of order k has:
o 2^k nodes
o Height = k
o Constructed recursively by linking two B_(k-1) trees, where the root of one
becomes the leftmost child of the other.
A Binomial Heap is a set of binomial trees that satisfies:
1. Each binomial tree in the heap is min-heap ordered (parent ≤ children).
2. There is at most one binomial tree of any order in the heap (like binary
representation of number of nodes).
🔧 Operations & Complexity
Operation Description Time Complexity
insert Insert new element O(log n)
find_min Find minimum element O(log n)
extract_min Remove minimum element O(log n)
union (merge) Merge two binomial heaps O(log n)
decrease_key Decrease the key value of an element O(log n)
delete Delete an element O(log n)
Structure Summary
Binomial trees are linked by root list, sorted by increasing order.
Each node stores: key, pointers to child, sibling, and parent.
🧪 Basic Idea of union operation:
Merge two root lists (like merging two sorted linked lists by order).
Combine trees of the same order by linking one tree under the other (min-heap order).
Resulting heap has unique orders.
Simple conceptual illustration:
Suppose H1 and H2 are two binomial heaps.
To union them:
1. Merge their root lists by increasing order of degrees.
2. Traverse merged list, link binomial trees of the same degree:
o Attach one tree as child of the other based on key order.
3. Resulting heap satisfies binomial heap properties.
🔥 Why Binomial Heap?
Supports efficient merge operation, unlike binary heaps.
Useful in applications requiring priority queue merging, like network optimization and
graph algorithms (e.g., Prim's MST algorithm).
Brief code is complex to write fully here, but core operations revolve around:
Linking trees (make one tree a subtree of another).
Maintaining root list as a linked list sorted by degree.
Implementing heap operations based on tree properties.
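The tree-linking step at the heart of those operations can be sketched as follows. This is an illustrative fragment, not a full binomial heap; the class and field names are my own, following the key/child/sibling/parent layout described above:

```python
class BinomialNode:
    def __init__(self, key):
        self.key = key
        self.degree = 0
        self.parent = None
        self.child = None    # leftmost child
        self.sibling = None  # next sibling (or next root in the root list)

def link(t1, t2):
    """Link two binomial trees of the same degree into one tree of degree + 1.

    The tree with the larger root key becomes a child of the other,
    preserving min-heap order.
    """
    if t1.key > t2.key:
        t1, t2 = t2, t1          # ensure t1 has the smaller root
    t2.parent = t1
    t2.sibling = t1.child        # t2 becomes the leftmost child of t1
    t1.child = t2
    t1.degree += 1
    return t1

a, b = BinomialNode(3), BinomialNode(7)  # two B_0 trees
root = link(a, b)                        # one B_1 tree
print(root.key, root.degree)             # 3 1
```

Repeatedly applying `link` while merging two root lists by degree is exactly the union operation outlined above.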
Heap Sort — Sorting Using a Heap
Heap sort is a comparison-based sorting algorithm that uses a heap data structure (usually a max-heap) to sort elements in ascending order.
How Heap Sort Works:
1. Build a max-heap from the unsorted array.
2. The largest element is now at the root (top) of the heap.
3. Swap the root with the last element in the heap.
4. Reduce the heap size by one (exclude the last element from the heap).
5. Heapify the root to maintain the max-heap property.
6. Repeat steps 3-5 until the heap size is 1.
Complexity:
Operation Time Complexity
Building heap O(n)
Sorting O(n log n)
Overall O(n log n)
Heap sort is in-place (requires no extra space) and not stable (does not preserve order of equal
elements).
Visual Explanation:
Build max-heap → largest at root.
Swap root with last → largest element moves to the sorted position.
Heapify root → fix heap.
Repeat until sorted.
Python Implementation:
def heapify(arr, n, i):
    largest = i            # Initialize largest as root
    left = 2 * i + 1       # left child
    right = 2 * i + 2      # right child

    # Check if left child is larger than root
    if left < n and arr[left] > arr[largest]:
        largest = left

    # Check if right child is larger than largest so far
    if right < n and arr[right] > arr[largest]:
        largest = right

    # Change root if needed
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]  # Swap
        heapify(arr, n, largest)                     # Heapify subtree

def heap_sort(arr):
    n = len(arr)
    # Build max heap
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    # One by one extract elements
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]  # Swap root with end
        heapify(arr, i, 0)               # Heapify reduced heap

# Example usage
arr = [12, 11, 13, 5, 6, 7]
heap_sort(arr)
print("Sorted array is:", arr)
Output:
Sorted array is: [5, 6, 7, 11, 12, 13]
Summary:
Heap sort uses a max-heap to sort an array in ascending order.
Build the heap → extract max repeatedly → fix heap.
Efficient, in-place, with guaranteed O(n log n) time.
Trees — Data Structure Overview
A tree is a hierarchical, non-linear data structure consisting of nodes connected by edges,
resembling an upside-down tree with a root at the top and leaves at the bottom.
Key Terminology
Term Meaning
Node Basic unit holding data
Root Topmost node in a tree (no parent)
Edge Connection between two nodes
Parent Node directly above another node
Child Node directly below another node
Leaf Node with no children
Subtree A tree formed by a node and its descendants
Height Longest path from root to leaf (edges)
Depth Distance from root to a node (edges)
Types of Trees
Binary Tree: Each node has at most 2 children (left and right).
Binary Search Tree (BST): Binary tree with ordered property: left < root < right.
Balanced Trees: Height balanced for efficiency (e.g., AVL tree, Red-Black tree).
Heap: Complete binary tree with heap property (max-heap or min-heap).
Trie: Prefix tree, used for efficient string retrieval.
N-ary Tree: Nodes can have N children.
B-Trees: Balanced multi-way trees used in databases/file systems.
Basic Operations
Operation Description Time Complexity (BST avg)
Insertion Add node while maintaining properties O(log n)
Deletion Remove node, restructure tree O(log n)
Search Find a node with given value O(log n)
Traversal Visit nodes in specific order O(n)
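The BST insertion and search rows above can be sketched recursively. The node class and function names here are illustrative, and the O(log n) bound assumes the tree stays roughly balanced (a sorted insertion order degrades it to O(n)):

```python
class BSTNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def bst_insert(root, val):
    # Walk left for smaller keys, right for larger ones
    if root is None:
        return BSTNode(val)
    if val < root.val:
        root.left = bst_insert(root.left, val)
    elif val > root.val:
        root.right = bst_insert(root.right, val)
    return root  # duplicates are ignored in this sketch

def bst_search(root, val):
    if root is None:
        return False
    if val == root.val:
        return True
    return bst_search(root.left if val < root.val else root.right, val)

root = None
for v in (8, 3, 10, 1, 6):
    root = bst_insert(root, v)
print(bst_search(root, 6))  # True
print(bst_search(root, 7))  # False
```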
Common Tree Traversals
Traversal Type Order Description Example Use
Preorder Root → Left → Right Copy tree, prefix expression
Inorder Left → Root → Right Sorted output (BST)
Postorder Left → Right → Root Delete tree, postfix expression
Level Order By levels, top to bottom Breadth-first search (BFS)
Simple Example: Binary Tree Node in Python
class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

# Example: Create simple tree
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
Applications of Trees
File systems (directory structure)
Databases and indexing (B-Trees)
Expression parsing (syntax trees)
Network routing algorithms
Artificial intelligence (decision trees)
Autocomplete and spell-check (tries)
Binary Trees — Overview
A Binary Tree is a hierarchical data structure in which each node has at most two children called:
Left child
Right child
Key Characteristics:
Each node can have 0, 1, or 2 children.
The top node is called the root.
Nodes with no children are called leaf nodes.
The height of a binary tree is the number of edges on the longest path from the root to a
leaf.
Types of Binary Trees
Type | Description
Full Binary Tree | Every node has 0 or 2 children
Complete Binary Tree | All levels completely filled except possibly the last, which is filled from left to right
Perfect Binary Tree | All internal nodes have 2 children and all leaves are at the same level
Balanced Binary Tree | Height difference between left and right subtree of any node is ≤ 1
Degenerate (Pathological) Tree | Each parent has only one child (like a linked list)
Binary Tree Traversals
Traversal Order of visiting nodes Use Case
Preorder Root → Left → Right Copying tree, expression parsing
Inorder Left → Root → Right Get sorted order in BST
Postorder Left → Right → Root Deleting tree, postfix notation
Level Order Level by level (BFS) Breadth-first traversal
Basic Binary Tree Node (Python)
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

# Example: Creating a simple binary tree
root = Node(1)
root.left = Node(2)
root.right = Node(3)
root.left.left = Node(4)
root.left.right = Node(5)
Traversal Example (Inorder):
def inorder(root):
    if root:
        inorder(root.left)
        print(root.data, end=' ')
        inorder(root.right)

# Usage:
inorder(root)  # Output: 4 2 5 1 3
Applications of Binary Trees:
Expression parsing and expression trees
Binary Search Trees (BST) for efficient search, insertion, deletion
Heaps for priority queues
Huffman coding trees
Syntax trees in compilers
Tree Traversal Techniques
Tree traversal means visiting all the nodes in a tree in a specific order. Traversal is key to many tree
operations like searching, printing, or modifying the tree.
1. Depth-First Traversal (DFS)
DFS explores as far as possible along each branch before backtracking. It has 3 common types:
a) Preorder Traversal (Root → Left → Right)
Visit the root node first.
Traverse the left subtree.
Traverse the right subtree.
Use: Copying trees, prefix expression evaluation.
def preorder(root):
    if root:
        print(root.data, end=' ')
        preorder(root.left)
        preorder(root.right)
b) Inorder Traversal (Left → Root → Right)
Traverse the left subtree.
Visit the root node.
Traverse the right subtree.
Use: Produces sorted order for binary search trees (BST).
def inorder(root):
    if root:
        inorder(root.left)
        print(root.data, end=' ')
        inorder(root.right)
c) Postorder Traversal (Left → Right → Root)
Traverse the left subtree.
Traverse the right subtree.
Visit the root node last.
Use: Deleting trees, postfix expression evaluation.
def postorder(root):
    if root:
        postorder(root.left)
        postorder(root.right)
        print(root.data, end=' ')
2. Breadth-First Traversal (BFS) / Level Order Traversal
Visit nodes level by level, starting from the root.
Use a queue data structure.
Use: Shortest path, serialization/deserialization of trees.
from collections import deque

def level_order(root):
    if not root:
        return
    queue = deque([root])
    while queue:
        node = queue.popleft()
        print(node.data, end=' ')
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
Summary Table
Traversal Type Order of Nodes Visited Data Structure Used Common Uses
Preorder (DFS) Root → Left → Right Recursion/Stack Copy tree, prefix notation
Inorder (DFS) Left → Root → Right Recursion/Stack BST sorted order
Postorder (DFS) Left → Right → Root Recursion/Stack Tree deletion, postfix
Level Order (BFS) Level by level (top to bottom) Queue Shortest path, tree levels
Expression Evaluation Using Stacks
Expression evaluation is the process of computing the value of a mathematical expression, which
can be given in:
Infix notation: Operators between operands (e.g., 3 + 4)
Prefix notation: Operators before operands (e.g., + 3 4)
Postfix notation: Operators after operands (e.g., 3 4 +)
Why Stacks?
Stacks help manage operators and operands in the correct order.
They efficiently handle operator precedence and associativity.
Used for converting between notations and for evaluating postfix/prefix expressions.
1. Evaluating Postfix Expression (Reverse Polish Notation)
Scan expression from left to right.
If operand, push onto stack.
If operator, pop two operands, apply operator, push result back.
At end, stack has the final result.
Example: Evaluate 6 2 3 + - 3 8 2 / + *
Step-by-step:
Token Action Stack
6 Push operand [6]
2 Push operand [6, 2]
3 Push operand [6, 2, 3]
+ Pop 3, 2; compute 2+3=5; push 5 [6, 5]
- Pop 5, 6; compute 6-5=1; push 1 [1]
3 Push operand [1, 3]
8 Push operand [1, 3, 8]
2 Push operand [1, 3, 8, 2]
/ Pop 2,8; compute 8/2=4; push 4 [1, 3, 4]
+ Pop 4,3; compute 3+4=7; push 7 [1, 7]
* Pop 7,1; compute 1*7=7; push 7 [7]
Result = 7
Python Code for Postfix Evaluation
def evaluate_postfix(expression):
    stack = []
    operators = set(['+', '-', '*', '/'])
    for token in expression.split():
        if token not in operators:
            stack.append(int(token))
        else:
            b = stack.pop()
            a = stack.pop()
            if token == '+':
                stack.append(a + b)
            elif token == '-':
                stack.append(a - b)
            elif token == '*':
                stack.append(a * b)
            elif token == '/':
                stack.append(int(a / b))  # division truncated toward zero
    return stack.pop()

expr = "6 2 3 + - 3 8 2 / + *"
print("Result:", evaluate_postfix(expr))
2. Evaluating Infix Expression (Using Two Stacks)
One stack for operands (numbers)
One stack for operators
Process tokens left to right:
o If operand, push onto operand stack.
o If operator, pop operators with higher or equal precedence and apply them.
o Push current operator onto operator stack.
At the end, apply remaining operators.
Operator Precedence
Operator Precedence
( )       Highest
*, /      2
+, -      1
Simplified Example of Infix Evaluation:
Expression: (3 + 5) * 2
Push '('
Push 3
Push '+'
Push 5
Encounter ')', evaluate inside parentheses → 8
Push 8
Push '*'
Push 2
Apply '*' → 16
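The walkthrough above can be sketched as a small two-stack evaluator. This is a minimal sketch that expects space-separated tokens; `evaluate_infix` and its input format are illustrative choices, not a standard API:

```python
def evaluate_infix(tokens):
    # tokens: space-separated infix expression, e.g. "( 3 + 5 ) * 2"
    prec = {'+': 1, '-': 1, '*': 2, '/': 2}
    nums, ops = [], []  # operand stack and operator stack

    def apply_op():
        op = ops.pop()
        b, a = nums.pop(), nums.pop()
        if op == '+':
            nums.append(a + b)
        elif op == '-':
            nums.append(a - b)
        elif op == '*':
            nums.append(a * b)
        elif op == '/':
            nums.append(int(a / b))  # division truncated toward zero

    for tok in tokens.split():
        if tok == '(':
            ops.append(tok)
        elif tok == ')':
            while ops[-1] != '(':  # evaluate everything inside the parentheses
                apply_op()
            ops.pop()  # discard '('
        elif tok in prec:
            # pop operators with higher or equal precedence first
            while ops and ops[-1] != '(' and prec[ops[-1]] >= prec[tok]:
                apply_op()
            ops.append(tok)
        else:
            nums.append(int(tok))
    while ops:  # apply remaining operators
        apply_op()
    return nums[-1]
```

For example, `evaluate_infix("( 3 + 5 ) * 2")` follows exactly the steps listed above and yields 16.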
Binary Search Trees (BST) — Overview
A Binary Search Tree (BST) is a special kind of binary tree with a sorted structure that allows
efficient search, insertion, and deletion operations.
Properties of BST
For each node:
o All values in its left subtree are less than the node’s value.
o All values in its right subtree are greater than the node’s value.
No duplicate nodes (usually; some variants allow duplicates on one side).
Why BST?
Enables efficient searching (average O(log n)) if the tree is balanced.
Ordered data structure useful for:
o Searching
o Range queries
o Sorted data traversal
Basic Operations
Operation Description Average Time Complexity
Search Find a node with given value O(log n)
Insert Add a node in correct position O(log n)
Delete Remove a node, restructuring tree O(log n)
Traversal Inorder traversal gives sorted data O(n)
BST Node Class (Python)
class BSTNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None
Insert Operation
def insert(root, key):
    if root is None:
        return BSTNode(key)
    if key < root.val:
        root.left = insert(root.left, key)
    elif key > root.val:
        root.right = insert(root.right, key)
    return root
Search Operation
def search(root, key):
    if root is None or root.val == key:
        return root
    if key < root.val:
        return search(root.left, key)
    return search(root.right, key)
Delete Operation
1. Find the node to delete.
2. Cases:
o Node is leaf → just remove it.
o Node has one child → replace node with child.
o Node has two children → replace node with inorder successor (smallest node in
right subtree) or inorder predecessor (largest in left subtree), then delete
successor/predecessor.
def min_value_node(node):
    current = node
    while current.left:
        current = current.left
    return current

def delete(root, key):
    if not root:
        return root
    if key < root.val:
        root.left = delete(root.left, key)
    elif key > root.val:
        root.right = delete(root.right, key)
    else:
        # Node with one or no child
        if not root.left:
            return root.right
        elif not root.right:
            return root.left
        # Node with two children
        temp = min_value_node(root.right)
        root.val = temp.val
        root.right = delete(root.right, temp.val)
    return root
Inorder Traversal (BST → sorted order)
def inorder(root):
    if root:
        inorder(root.left)
        print(root.val, end=' ')
        inorder(root.right)
Example Usage
root = None
for key in [50, 30, 20, 40, 70, 60, 80]:
    root = insert(root, key)
print("Inorder traversal of BST:")
inorder(root)  # Output: 20 30 40 50 60 70 80

# Search for 40
found = search(root, 40)
print("\nFound:", found.val if found else "Not Found")

# Delete 20
root = delete(root, 20)
print("After deleting 20:")
inorder(root)  # Output: 30 40 50 60 70 80
AVL Trees — Overview
An AVL Tree is a self-balancing Binary Search Tree (BST) named after its inventors Adelson-Velsky
and Landis. It maintains a strict balance to ensure operations like search, insert, and delete run in
O(log n) time, even in the worst case.
Why AVL Tree?
Normal BST can become skewed (like a linked list), leading to O(n) operations.
AVL Tree keeps the tree balanced by enforcing a balance factor on every node.
Balance ensures logarithmic height.
AVL Tree Properties
For every node, balance factor = height(left subtree) - height(right subtree)
Allowed balance factor values: -1, 0, +1
If balance factor goes outside this range after insertion or deletion, rotations are done to
restore balance.
Height of node
Height of a node = max(height of left subtree, height of right subtree) + 1
Height of empty subtree = -1 or 0 depending on convention
Rotations to Rebalance
1. Right Rotation (LL Rotation)
Used when a node is left-heavy due to insertion in left subtree of left child.
2. Left Rotation (RR Rotation)
Used when a node is right-heavy due to insertion in right subtree of right child.
3. Left-Right Rotation (LR Rotation)
Used when a node is left-heavy due to insertion in right subtree of left child.
(First left rotate left child, then right rotate node)
4. Right-Left Rotation (RL Rotation)
Used when a node is right-heavy due to insertion in left subtree of right child.
(First right rotate right child, then left rotate node)
AVL Tree Node (Python)
class AVLNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # Height of node
Key Functions
1. Get Height
def height(node):
    if not node:
        return 0
    return node.height
2. Get Balance Factor
def get_balance(node):
    if not node:
        return 0
    return height(node.left) - height(node.right)
3. Right Rotation
def right_rotate(y):
    x = y.left
    T2 = x.right
    # Perform rotation
    x.right = y
    y.left = T2
    # Update heights
    y.height = 1 + max(height(y.left), height(y.right))
    x.height = 1 + max(height(x.left), height(x.right))
    return x
4. Left Rotation
def left_rotate(x):
    y = x.right
    T2 = y.left
    # Perform rotation
    y.left = x
    x.right = T2
    # Update heights
    x.height = 1 + max(height(x.left), height(x.right))
    y.height = 1 + max(height(y.left), height(y.right))
    return y
5. Insert with Rebalancing
def insert(node, key):
    # Normal BST insert
    if not node:
        return AVLNode(key)
    elif key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    # Update height
    node.height = 1 + max(height(node.left), height(node.right))
    # Check balance
    balance = get_balance(node)
    # Left Left Case
    if balance > 1 and key < node.left.key:
        return right_rotate(node)
    # Right Right Case
    if balance < -1 and key > node.right.key:
        return left_rotate(node)
    # Left Right Case
    if balance > 1 and key > node.left.key:
        node.left = left_rotate(node.left)
        return right_rotate(node)
    # Right Left Case
    if balance < -1 and key < node.right.key:
        node.right = right_rotate(node.right)
        return left_rotate(node)
    return node
6. Inorder Traversal (Sorted order)
def inorder(node):
    if node:
        inorder(node.left)
        print(node.key, end=' ')
        inorder(node.right)
Example Usage
root = None
for key in [10, 20, 30, 40, 50, 25]:
    root = insert(root, key)
print("Inorder traversal of AVL tree:")
inorder(root)  # Output: 10 20 25 30 40 50
Summary
Operation Time Complexity
Search O(log n)
Insertion O(log n)
Deletion O(log n)
AVL trees maintain a balanced tree structure automatically after insertions/deletions by rotations,
ensuring logarithmic height.
Red-Black Trees — Overview
A Red-Black Tree is a kind of self-balancing binary search tree that ensures the tree remains
approximately balanced during insertions and deletions, guaranteeing O(log n) time for search,
insert, and delete operations.
Why Red-Black Tree?
Like AVL trees, it maintains balance but with less strict rules, leading to faster insertion and
deletion in practice.
It guarantees the tree height is no more than 2 × log₂(n + 1).
Used widely in system libraries (e.g., C++ STL map, Java TreeMap).
Properties of Red-Black Trees
1. Every node is either red or black.
2. The root is always black.
3. All leaves (NIL nodes) are black. (These are sentinel leaves.)
4. If a node is red, then both its children are black. (No two reds in a row.)
5. For each node, every path from the node to descendant leaves contains the same number
of black nodes. (Black-height property)
Terminology
Red node: Colored red
Black node: Colored black
NIL node: Null leaves, considered black
Black-height: Number of black nodes from a node to leaves (excluding node itself)
Why These Properties?
They enforce balance by limiting the longest path to be no more than twice the shortest path,
keeping the tree approximately balanced.
Basic Operations
Rotations
Same as in AVL trees (left rotate and right rotate).
Used to maintain tree structure during insert/delete.
Recoloring
Changing node colors (red/black) to restore properties after insert/delete.
Insertion in Red-Black Tree
Insert node like in a regular BST, color it red.
Fix violations of properties by recoloring and rotations.
The main fix involves ensuring no two consecutive red nodes and black-height consistency.
Insertion Fix-Up Cases (after inserting a red node)
1. New node is root: Color it black.
2. Parent is black: Tree is still valid.
3. Parent and uncle are red: Recolor parent and uncle black, grandparent red, recurse up.
4. Parent is red, uncle is black (or NIL): Rotate and recolor depending on the node's position (left-left, left-right, right-left, right-right).
Deletion in Red-Black Tree
More complicated than insertion.
Delete like in BST.
Fix "double black" violations with rotations and recoloring.
Simplified Node Structure (Python)
class RBNode:
    def __init__(self, key, color='red'):
        self.key = key
        self.color = color  # 'red' or 'black'
        self.left = None
        self.right = None
        self.parent = None
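The color properties can be checked mechanically. The sketch below (redefining a minimal `RBNode` so it runs standalone) computes the black-height of a subtree and reports a violation of property 4 (no two reds in a row) or property 5 (equal black-heights); `black_height` is an illustrative helper, not part of any standard red-black implementation:

```python
class RBNode:
    def __init__(self, key, color='red'):
        self.key = key
        self.color = color  # 'red' or 'black'
        self.left = None
        self.right = None

def black_height(node):
    # Returns the black-height if the subtree satisfies properties 4 and 5,
    # or None if a violation is found. NIL leaves count as one black node.
    if node is None:
        return 1
    left, right = black_height(node.left), black_height(node.right)
    if left is None or right is None or left != right:
        return None  # black-heights of the two subtrees must match
    if node.color == 'red':
        for child in (node.left, node.right):
            if child is not None and child.color == 'red':
                return None  # two reds in a row violate property 4
    return left + (1 if node.color == 'black' else 0)
```

For instance, a black root with two red children has black-height 2, while a red node with a red child is reported as invalid.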
Summary Comparison with AVL Trees
Feature               AVL Tree                      Red-Black Tree
Balancing strictness  More strict                   Less strict
Height guarantee      Strictly balanced (O(log n))  Approximately balanced (≤ 2 × log n)
Insert/Delete speed   Slower (more rotations)       Faster (fewer rotations)
Use cases             Read-intensive systems        Systems with many inserts/deletes (e.g., OS schedulers, STL)
B-Tree — Overview
A B-Tree is a self-balancing multi-way search tree designed to work efficiently on disk storage or
large databases. Unlike binary trees, each node can have more than two children, allowing it to
store large blocks of sorted data and minimize disk reads/writes.
Why B-Tree?
Optimized for systems that read/write large blocks of data (like databases, file systems).
Reduces the number of disk accesses.
Maintains balance by ensuring all leaves are at the same depth.
Key Properties of a B-Tree of order t (minimum degree)
1. Every node has at most 2t - 1 keys.
2. Every node (except root) has at least t - 1 keys.
3. The root has at least 1 key (unless tree is empty).
4. Every internal node has at most 2t children.
5. All leaves appear at the same level.
6. Keys in each node are sorted.
7. Child subtrees have keys that fall between keys in the parent node.
Terminology
Order (t): Minimum degree of the tree (defines the minimum and maximum number of
keys/children).
Keys: Sorted data elements inside each node.
Children: Pointers to subtrees.
Leaf nodes: Nodes without children.
B-Tree Node Structure
Each node stores:
An array of keys (up to 2t-1 keys).
An array of child pointers (up to 2t children).
A flag indicating if it is a leaf.
Operations
Search
Similar to BST search but keys are in a node array.
At each node, do a binary or linear search among keys.
If key is found, return.
Else, recurse into the child subtree that fits the key’s value.
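A minimal sketch of this search, with the node class redefined (matching the simplified class later in this section) so the snippet is self-contained; `btree_search` is an illustrative name:

```python
class BTreeNode:
    def __init__(self, t, leaf=False):
        self.t = t            # Minimum degree
        self.leaf = leaf
        self.keys = []        # Sorted keys in this node
        self.children = []    # Child pointers (len(keys) + 1 of them)

def btree_search(node, key):
    # Linear scan for the first key >= search key (binary search also works).
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1
    if i < len(node.keys) and node.keys[i] == key:
        return node, i        # Found: (node, index within that node)
    if node.leaf:
        return None           # Nowhere left to descend
    return btree_search(node.children[i], key)  # Descend into the fitting subtree
```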
Insertion
1. Insert key in a leaf node.
2. If the leaf node is full (has 2t - 1 keys), split it.
3. Promote the median key to the parent node.
4. If parent node also full, split recursively up to the root.
5. If root splits, tree height increases by 1.
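The split in steps 2-3 can be sketched as follows (node class redefined so the snippet runs standalone; `split_child` is an illustrative helper):

```python
class BTreeNode:
    def __init__(self, t, leaf=False):
        self.t = t
        self.leaf = leaf
        self.keys = []
        self.children = []

def split_child(parent, i, t):
    # Split parent.children[i], which is full (2t - 1 keys), around its median.
    full = parent.children[i]
    mid = full.keys[t - 1]              # Median key moves up to the parent
    right = BTreeNode(t, leaf=full.leaf)
    right.keys = full.keys[t:]          # Upper t - 1 keys move to new right node
    full.keys = full.keys[:t - 1]       # Lower t - 1 keys stay in the left node
    if not full.leaf:
        right.children = full.children[t:]
        full.children = full.children[:t]
    parent.keys.insert(i, mid)
    parent.children.insert(i + 1, right)
```

For t = 2, splitting a full leaf [1, 2, 3] promotes the median 2 and leaves siblings [1] and [3].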
Deletion
More complex, involves merging and redistribution of keys to maintain minimum keys
property.
Deletion is done carefully to maintain tree balance.
Visualization Example (Order t=3)
Max keys per node: 5 (2×3 - 1)
Min keys per node (except root): 2 (3 - 1)
[10 | 20 | 30]
/ | | \
... ... ... ...
Keys are sorted inside each node, and children cover key ranges accordingly.
B-Tree Node Class (Simplified Python)
class BTreeNode:
    def __init__(self, t, leaf=False):
        self.t = t              # Minimum degree
        self.leaf = leaf
        self.keys = []          # List of keys
        self.children = []      # List of child pointers
Advantages
Minimizes disk I/O by storing multiple keys per node.
Keeps tree height low.
Efficient for databases and filesystems.
Summary
Operation Time Complexity
Search O(log n)
Insert O(log n) (with splits)
Delete O(log n) (with merges)
B+ Tree — Overview
A B+ Tree is an extension of the B-Tree designed specifically for database and file systems to
optimize range queries and sequential access. It maintains all data in the leaf nodes and uses an
internal node structure primarily for indexing.
Key Differences from B-Tree
Feature         B-Tree                                      B+ Tree
Data Storage    Keys and data stored in all nodes           Data stored only in leaf nodes
Leaf Nodes      Not linked                                  Linked via a linked list for fast sequential access
Internal Nodes  Store keys and data                         Store only keys (index values)
Searching       Search may end at internal nodes or leaves  Search always ends at leaf nodes
Range Queries   Less efficient                              Efficient due to linked leaves
Structure of B+ Tree
Internal nodes contain only keys to guide search.
Leaf nodes contain keys and actual data (or pointers to data).
Leaf nodes are linked using a doubly linked list for fast ordered traversal.
Order (minimum degree) t controls max/min keys and children as in B-Tree.
B+ Tree Node Types
Internal nodes: Only keys, up to 2t - 1 keys, and up to 2t children.
Leaf nodes: Keys + data pointers; linked together sequentially.
Operations
Search
Traverse from root to leaf using keys in internal nodes.
Final key search in leaf nodes.
Insertion
Insert key and data into leaf node.
If leaf overflows (more than 2t - 1 keys), split leaf node.
Copy the smallest key of the new right sibling up into the parent node (in a B+ tree the key also remains in the leaf).
If parent overflows, split recursively up to root.
Leaf nodes remain linked.
Deletion
Delete key from leaf node.
If underflow occurs (less than t - 1 keys), redistribute or merge nodes.
Update parent keys accordingly.
Maintains linked leaf nodes for range queries.
Example Visualization (Order t=3)
Internal nodes:
[10 | 20 | 30]
/ | | \
... ... ... ...
Leaf nodes (linked list):
[5, 7, 9] <-> [12, 15, 18] <-> [22, 25, 28] <-> [35, 40]
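With leaves linked this way, a range query is just a walk along the leaf list. A minimal sketch, assuming a bare leaf class (`BPlusLeaf` and `range_query` are illustrative names, not standard APIs):

```python
class BPlusLeaf:
    def __init__(self, keys):
        self.keys = keys   # Sorted keys stored in this leaf
        self.next = None   # Link to the next leaf in key order

def range_query(first_leaf, lo, hi):
    # Walk the linked leaves, collecting every key in [lo, hi].
    out = []
    leaf = first_leaf
    while leaf:
        for k in leaf.keys:
            if lo <= k <= hi:
                out.append(k)
            elif k > hi:
                return out  # Keys are sorted, so we can stop early
        leaf = leaf.next
    return out
```

On the example leaves above, a query for the range [10, 26] walks only the leaves it needs and returns [12, 15, 18, 22, 25].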
Advantages of B+ Tree
Efficient range queries by traversing linked leaf nodes.
Faster sequential access for operations like full table scans.
Internal nodes are smaller (no data), so more keys fit per node.
Widely used in database indexing and filesystems.
Summary Comparison
Feature B-Tree B+ Tree
Data in nodes Internal + leaf nodes Leaf nodes only
Leaf linkage None Linked list linkage
Range queries Less efficient Highly efficient
Node size Larger (with data) Smaller (index only)
Typical use case General balanced tree Database/file indexing
Segment Tree — Overview
A Segment Tree is a versatile data structure used for answering range queries and performing
updates efficiently on an array or sequence of elements.
Why Segment Tree?
Ideal for problems like range sum, range minimum/maximum, range gcd, etc.
Supports both query and update operations in O(log n) time.
Useful when the array is mutable (updates happen between queries).
Basic Idea
The tree represents intervals (segments) of the array.
Each node stores information about a segment (e.g., sum, min, max).
The root represents the whole array segment.
The children represent the left and right halves of the parent segment.
Leaf nodes correspond to individual elements.
Structure
For an array of size n, segment tree size is roughly 4n nodes.
Binary tree structure where:
o node stores info about [start, end] segment of array.
o Left child covers [start, mid].
o Right child covers [mid+1, end].
Operations
Build
Recursively build the tree from the array bottom-up.
Time complexity: O(n).
Query (Range Query)
Query over a segment [L, R].
Recursively check if current node segment is fully inside, outside, or partially overlaps [L,
R].
Combine results from children as needed.
Time complexity: O(log n).
Update
Update an element at position pos.
Recursively update nodes covering pos.
Time complexity: O(log n).
Example: Range Sum Segment Tree (Python)
class SegmentTree:
    def __init__(self, data):
        self.n = len(data)
        self.tree = [0] * (4 * self.n)
        self._build(data, 0, 0, self.n - 1)

    def _build(self, data, node, start, end):
        if start == end:
            self.tree[node] = data[start]
        else:
            mid = (start + end) // 2
            self._build(data, 2*node+1, start, mid)
            self._build(data, 2*node+2, mid+1, end)
            self.tree[node] = self.tree[2*node+1] + self.tree[2*node+2]

    def update(self, idx, val):
        self._update(0, 0, self.n-1, idx, val)

    def _update(self, node, start, end, idx, val):
        if start == end:
            self.tree[node] = val
        else:
            mid = (start + end) // 2
            if idx <= mid:
                self._update(2*node+1, start, mid, idx, val)
            else:
                self._update(2*node+2, mid+1, end, idx, val)
            self.tree[node] = self.tree[2*node+1] + self.tree[2*node+2]

    def query(self, L, R):
        return self._query(0, 0, self.n-1, L, R)

    def _query(self, node, start, end, L, R):
        if R < start or L > end:
            return 0
        if L <= start and end <= R:
            return self.tree[node]
        mid = (start + end) // 2
        left_sum = self._query(2*node+1, start, mid, L, R)
        right_sum = self._query(2*node+2, mid+1, end, L, R)
        return left_sum + right_sum
Use Cases
Range sum queries.
Range minimum/maximum queries.
Frequency counting.
Inversion count.
Interval queries with updates.
Summary
Operation Time Complexity
Build O(n)
Query O(log n)
Update O(log n)
Graphs — Overview
A graph is a fundamental data structure used to model pairwise relationships between objects. It
consists of vertices (nodes) and edges (connections).
Basic Terminology
Vertices (V): The entities or points in the graph.
Edges (E): The connections between vertices.
Directed Graph (Digraph): Edges have direction (from one vertex to another).
Undirected Graph: Edges have no direction; connection is two-way.
Weighted Graph: Edges carry weights/costs.
Unweighted Graph: Edges have no weights.
Degree: Number of edges incident on a vertex.
Path: Sequence of edges connecting vertices.
Cycle: Path where first and last vertices are the same.
Connected Graph: There is a path between any two vertices.
Disconnected Graph: Some vertices are unreachable from others.
Graph Representation
1. Adjacency Matrix
2D matrix of size V x V.
matrix[i][j] = 1 if edge from i to j exists, else 0.
Easy to implement.
Space: O(V²).
Good for dense graphs.
2. Adjacency List
Array or list where each index corresponds to a vertex.
Stores a list of adjacent vertices.
Space: O(V + E).
Efficient for sparse graphs.
Graph Types
Type Description Example Use Cases
Directed Graph Edges have direction Web page links, workflows
Undirected Graph Edges bidirectional Social networks, roads
Weighted Graph Edges have weights Maps with distances, costs
Unweighted Graph No weights on edges Simple networks
Common Graph Algorithms
Traversal
Depth-First Search (DFS): Explore as far as possible along a branch before backtracking.
Breadth-First Search (BFS): Explore neighbors level by level.
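Both traversals can be sketched over an adjacency-list dictionary (the `graph` format matches the example later in this section):

```python
from collections import deque

def bfs(graph, start):
    # Visit neighbors level by level using a queue.
    visited, order = {start}, []
    q = deque([start])
    while q:
        u = q.popleft()
        order.append(u)
        for v in graph[u]:
            if v not in visited:
                visited.add(v)
                q.append(v)
    return order

def dfs(graph, start, visited=None):
    # Explore as far as possible along each branch before backtracking.
    if visited is None:
        visited = set()
    visited.add(start)
    order = [start]
    for v in graph[start]:
        if v not in visited:
            order.extend(dfs(graph, v, visited))
    return order
```

On the graph `{0: [1, 2], 1: [2], 2: [0, 3], 3: [3]}`, both traversals starting at 0 reach all four vertices.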
Shortest Path
Dijkstra’s Algorithm: For weighted graphs with non-negative weights.
Bellman-Ford: Works with negative weights (detects negative cycles).
Floyd-Warshall: All pairs shortest paths.
A* Search: Heuristic-based shortest path.
Minimum Spanning Tree (MST)
Prim’s Algorithm: Greedy MST starting from one vertex.
Kruskal’s Algorithm: MST by sorting edges and union-find.
Cycle Detection
Using DFS or Union-Find.
Example: Adjacency List Representation in Python
graph = {
    0: [1, 2],
    1: [2],
    2: [0, 3],
    3: [3]
}
Summary
Graphs model complex relationships and support many important algorithms for traversal,
optimization, and connectivity.
Hashing — Overview
Hashing is a technique to uniquely identify a specific object from a collection of objects. It converts
input (keys) into a fixed-size integer value called a hash code using a hash function.
Key Concepts
Hash Function: Maps input data of arbitrary size to fixed-size hash values (indices).
Hash Table: Data structure that uses hash functions to index data for fast access.
Collision: When two inputs produce the same hash value.
Load Factor (α): Ratio of number of elements to table size (n / m), affects performance.
Hash Table Operations
Operation Average Time Complexity
Insertion O(1)
Search O(1)
Deletion O(1)
Collision Resolution Techniques
1. Chaining:
o Each bucket stores a linked list (or another data structure) of elements.
o Collisions handled by appending to list.
2. Open Addressing:
o Find another free bucket by probing.
o Types of probing:
Linear probing: Check next slots sequentially.
Quadratic probing: Use quadratic function for next slot.
Double hashing: Use a second hash function.
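Open addressing with linear probing can be sketched as follows; the table is a plain Python list of slots, and `insert_linear_probe` is an illustrative helper, not a library function:

```python
def insert_linear_probe(table, key, value):
    # table: fixed-size list of slots, each None or a (key, value) pair.
    m = len(table)
    idx = hash(key) % m
    for step in range(m):
        probe = (idx + step) % m  # check the next slot sequentially
        if table[probe] is None or table[probe][0] == key:
            table[probe] = (key, value)
            return probe          # slot where the pair was stored
    raise RuntimeError("hash table is full")
```

If two keys hash to the same slot, the second simply lands in the next free slot.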
Common Hash Functions
Division Method: h(k) = k mod m (m = table size)
Multiplication Method: h(k) = floor(m * (k * A mod 1)), A is a constant (0 < A < 1)
Universal Hashing: Randomized hash functions for better average case.
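The division and multiplication methods can be sketched directly; the default constant A below is Knuth's suggested (√5 − 1)/2, an illustrative (and common) choice:

```python
import math

def h_division(k, m):
    # Division method: h(k) = k mod m
    return k % m

def h_multiplication(k, m, A=0.6180339887):
    # Multiplication method: h(k) = floor(m * (k * A mod 1)), 0 < A < 1
    return math.floor(m * ((k * A) % 1))
```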
Applications of Hashing
Implementing dictionaries and maps.
Caching data.
Database indexing.
Detecting duplicate data.
Password storage (with cryptographic hashing).
Example: Simple Hash Table with Chaining (Python)
class HashTable:
    def __init__(self, size=10):
        self.size = size
        self.table = [[] for _ in range(size)]

    def _hash(self, key):
        return hash(key) % self.size

    def insert(self, key, value):
        idx = self._hash(key)
        # Check if key exists, update
        for i, (k, v) in enumerate(self.table[idx]):
            if k == key:
                self.table[idx][i] = (key, value)
                return
        # Else insert
        self.table[idx].append((key, value))

    def search(self, key):
        idx = self._hash(key)
        for k, v in self.table[idx]:
            if k == key:
                return v
        return None

    def delete(self, key):
        idx = self._hash(key)
        for i, (k, v) in enumerate(self.table[idx]):
            if k == key:
                del self.table[idx][i]
                return True
        return False
Two Pointer Approach — Overview
The Two Pointer Approach is a popular technique used to solve problems involving arrays or linked
lists efficiently. It uses two pointers (indices) that traverse the data structure simultaneously to
reduce time complexity, often from quadratic to linear time.
When to Use?
When dealing with sorted arrays or sequences.
When you need to find pairs, triplets, or subarrays matching certain conditions.
Useful in sliding window problems, partitioning, and searching.
Key Idea
Maintain two pointers, often called left and right.
Move pointers based on conditions:
o Increment left or right depending on the problem.
o Adjust pointers to narrow down or expand the search space.
Avoid nested loops by moving pointers intelligently.
Common Patterns
Problem Type Two Pointer Strategy
Pair sum in sorted array Move pointers inward to find sum
Remove duplicates in sorted array Use one pointer to track unique items
Partitioning arrays (e.g., evens/odds) One pointer scans, other swaps
Sliding window problems Expand and shrink window using pointers
Example 1: Two Sum (Sorted Array)
Find two numbers in sorted array that add up to target.
def two_sum(arr, target):
    left, right = 0, len(arr) - 1
    while left < right:
        s = arr[left] + arr[right]
        if s == target:
            return (left, right)
        elif s < target:
            left += 1
        else:
            right -= 1
    return None
Example 2: Remove Duplicates from Sorted Array
def remove_duplicates(arr):
    if not arr:
        return 0
    write = 1
    for read in range(1, len(arr)):
        if arr[read] != arr[read-1]:
            arr[write] = arr[read]
            write += 1
    return write  # New length
Benefits
Efficient: Reduces complexity from O(n²) to O(n) in many problems.
Simple implementation.
Minimal extra space: Usually O(1).
Sliding Window — Overview
The Sliding Window technique is a powerful approach for solving problems involving subarrays or
substrings in arrays or strings, especially where you need to find a contiguous segment meeting
certain criteria.
What is Sliding Window?
Imagine a window (a range of indices) that “slides” over the data.
The window can expand or shrink as needed to maintain the condition.
Avoids re-computing results for overlapping parts, improving efficiency.
When to Use?
Problems involving contiguous subarrays or substrings.
Finding maximum/minimum sum or length of subarray meeting criteria.
Examples: max sum subarray of size k, longest substring without repeating characters.
Types of Sliding Windows
Type Description
Fixed Size Window size is fixed (e.g., size k). Slide by one.
Variable Size Window size varies; expand/shrink based on conditions.
Basic Approach
Maintain two pointers: start and end defining the current window.
Expand end to add elements.
Shrink start to remove elements if the window violates conditions.
Update result during expansion/shrinking.
Example 1: Fixed Size Window — Maximum Sum Subarray of Size k
def max_sum_subarray(arr, k):
    window_sum = sum(arr[:k])
    max_sum = window_sum
    for i in range(k, len(arr)):
        window_sum += arr[i] - arr[i - k]  # slide window forward
        max_sum = max(max_sum, window_sum)
    return max_sum
Example 2: Variable Size Window — Longest Substring Without Repeating Characters
def length_of_longest_substring(s):
    char_index = {}
    start = 0
    max_len = 0
    for end, char in enumerate(s):
        if char in char_index and char_index[char] >= start:
            start = char_index[char] + 1  # shrink window from left
        char_index[char] = end
        max_len = max(max_len, end - start + 1)
    return max_len
Benefits
Linear time complexity (O(n)) for many problems.
Efficiently handles overlapping subproblems.
Can be combined with other techniques like hashing.
Tries — Overview
A Trie (pronounced "try"), also called a prefix tree, is a specialized tree used to store associative
data structures, typically strings. It is especially useful for efficient retrieval of keys in datasets like
dictionaries or word lists.
Key Characteristics
Each node represents a character of a string.
The root represents an empty string.
Paths from root to leaf nodes represent stored words.
Common prefixes are shared, saving space.
Supports fast search, insert, and prefix queries.
Trie Structure
Each node contains:
o An array or map of child nodes for next characters.
o A boolean flag isEndOfWord to indicate if a node completes a valid word.
Common Operations
Operation Time Complexity
Insert O(L), where L = length of word
Search O(L)
StartsWith (prefix check) O(L)
Use Cases
Auto-complete and auto-suggest.
Spell checking.
IP routing (Longest prefix matching).
Word games.
Searching prefixes in large dictionaries.
Example: Trie Implementation in Python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.isEndOfWord = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.isEndOfWord = True

    def search(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                return False
            node = node.children[char]
        return node.isEndOfWord

    def startsWith(self, prefix):
        node = self.root
        for char in prefix:
            if char not in node.children:
                return False
            node = node.children[char]
        return True
Suffix Trees — Overview
A Suffix Tree is a compressed trie (prefix tree) containing all the suffixes of a given string. It’s a
powerful data structure for string processing tasks and allows many substring-related queries to be
answered efficiently.
Key Characteristics
Built from all suffixes of a string.
Each edge is labeled with a substring of the original string.
No two edges starting from a node begin with the same character (compact
representation).
Allows fast pattern matching, substring checks, longest repeated substring, etc.
Properties
For a string of length n, the suffix tree has O(n) nodes.
Construction can be done in O(n) time using advanced algorithms (Ukkonen’s algorithm).
Supports queries like substring search, pattern matching, longest repeated substring in
O(m) time, where m is pattern length.
Why Use Suffix Trees?
Fast substring search.
Find longest repeated substring.
Find longest common substring between strings.
Solve problems like string matching, palindrome detection efficiently.
Example
For the string "banana$" (where $ is a unique end symbol), the suffix tree contains all suffixes:
banana$
anana$
nana$
ana$
na$
a$
$
Each suffix is represented as a path from root to a leaf.
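The connection between suffixes and substrings can be illustrated naively. This is not a suffix tree (a real one compresses these suffixes into shared paths and answers the same query in O(m) time), but it shows the underlying idea that every substring of s is a prefix of some suffix:

```python
def suffixes(s):
    # All suffixes of s, e.g. "banana$" -> ["banana$", "anana$", ..., "$"]
    return [s[i:] for i in range(len(s))]

def contains(s, pattern):
    # A pattern occurs in s iff it is a prefix of some suffix of s --
    # the property a suffix tree exploits, checked naively here.
    return any(suf.startswith(pattern) for suf in suffixes(s))
```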
Basic Structure
Each internal node has two or more children.
Each leaf node corresponds to a suffix.
Edges labeled with substrings (not single characters) — compressed.
Simplified Example (Not Full Implementation)
banana$
├─ b
│ └─ anana$
├─ a
│ ├─ na
│ │ └─ na$
│ └─ $
├─ n
└─ ana$
Applications
Pattern matching: Check if a pattern exists in O(m).
Longest repeated substring.
Longest common substring between strings.
Finding palindromes or other string analysis problems.
Related Structures
Suffix Array: A space-efficient alternative to suffix tree.
Generalized Suffix Tree: For multiple strings.
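Real suffix trees are built in O(n) with Ukkonen's algorithm, which is involved; as a hedged illustration of the same idea, a naive suffix *trie* (insert every suffix, O(n²) construction, uncompressed edges) already supports the O(m) substring query described above:

```python
def build_suffix_trie(s):
    """Naively insert every suffix of s into a nested-dict trie.
    O(n^2) time/space -- for illustration only, not Ukkonen's algorithm."""
    root = {}
    for i in range(len(s)):
        node = root
        for char in s[i:]:
            node = node.setdefault(char, {})
    return root

def contains_substring(root, pattern):
    """Walk edge by edge: O(m) for a pattern of length m."""
    node = root
    for char in pattern:
        if char not in node:
            return False
        node = node[char]
    return True

trie = build_suffix_trie("banana$")
print(contains_substring(trie, "nan"))  # True: "nan" occurs in "banana"
print(contains_substring(trie, "nab"))  # False
```

Every substring of s is a prefix of some suffix, which is why walking the trie from the root answers the query.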
Ternary Search Trees (TST) — Overview
A Ternary Search Tree is a type of trie (prefix tree) where nodes are arranged like a binary search
tree but with three children:
Left child: Nodes with characters less than the current node's character.
Middle child: Nodes with the same character (continue the word).
Right child: Nodes with characters greater than the current node's character.
Key Characteristics
Combines the compactness of tries with binary search tree ordering.
Each node stores a character and has three pointers.
Efficient for storing strings and supporting fast search, insert, and prefix queries.
Useful when the alphabet is large or memory is constrained (compared to tries with large
arrays).
Benefits
More space-efficient than tries with large alphabets.
Supports prefix searches.
Balances trie search speed against binary search tree space efficiency.
Node Structure
Each node stores:
A character.
Three children: left, middle, right.
A flag indicating if the node marks the end of a word.
Operations and Complexity
Operation Time Complexity
Insert O(L) on average, L = length of word
Search O(L) on average
Prefix Search O(L) on average
(Worst case adds a factor for the left/right comparisons when the tree is unbalanced.)
Example: Inserting and Searching in TST
python
class TSTNode:
    def __init__(self, char):
        self.char = char
        self.left = None
        self.eq = None
        self.right = None
        self.isEnd = False

class TernarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, word):
        if word:  # guard: inserting "" would index past the end
            self.root = self._insert(self.root, word, 0)

    def _insert(self, node, word, index):
        char = word[index]
        if not node:
            node = TSTNode(char)
        if char < node.char:
            node.left = self._insert(node.left, word, index)
        elif char > node.char:
            node.right = self._insert(node.right, word, index)
        else:
            if index + 1 == len(word):
                node.isEnd = True
            else:
                node.eq = self._insert(node.eq, word, index + 1)
        return node

    def search(self, word):
        if not word:
            return False
        return self._search(self.root, word, 0)

    def _search(self, node, word, index):
        if not node:
            return False
        char = word[index]
        if char < node.char:
            return self._search(node.left, word, index)
        elif char > node.char:
            return self._search(node.right, word, index)
        else:
            if index == len(word) - 1:
                return node.isEnd
            return self._search(node.eq, word, index + 1)
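The complexity table above lists prefix search, but the sketch only implements insert and search. A minimal self-contained illustration of startsWith, written as free functions over the same node shape (function names here are assumptions, not part of the original class):

```python
class TSTNode:
    def __init__(self, char):
        self.char = char
        self.left = self.eq = self.right = None
        self.isEnd = False

def insert(node, word, index=0):
    # Same insertion logic as the class version, as a free function.
    char = word[index]
    if node is None:
        node = TSTNode(char)
    if char < node.char:
        node.left = insert(node.left, word, index)
    elif char > node.char:
        node.right = insert(node.right, word, index)
    elif index + 1 == len(word):
        node.isEnd = True
    else:
        node.eq = insert(node.eq, word, index + 1)
    return node

def starts_with(node, prefix, index=0):
    """True if any stored word begins with prefix (O(L) on average)."""
    if node is None:
        return False
    char = prefix[index]
    if char < node.char:
        return starts_with(node.left, prefix, index)
    if char > node.char:
        return starts_with(node.right, prefix, index)
    if index == len(prefix) - 1:
        return True  # whole prefix matched along middle links
    return starts_with(node.eq, prefix, index + 1)

root = None
for w in ["cat", "cap", "car", "dog"]:
    root = insert(root, w)
print(starts_with(root, "ca"))  # True
print(starts_with(root, "cu"))  # False
```

Unlike search, a match does not need isEnd to be set: reaching the last prefix character proves at least one stored word continues below it.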
Use Cases
Auto-completion and spell checking.
When you want a compromise between space usage and fast prefix queries.
Efficient dictionary implementations for large alphabets.