Hash Data Structure
CS1-401
PRESENTED BY: NEHA ANWAR (BSCS – 3) (2020-GCUF-072287)
Hashing
The process of storing and retrieving data in O(1) time is called hashing.
Hash Table
Terms used in Hashing
Search keys= ( 24 , 52 , 91 , 67 , 48 , 83 )
Hash Table (To make hash table: 0 – n-1) 91
52
Hash Functions ( K Mod 10 , K Mod n , 83
Mid Square , Folding Method) 24
n = number of indexes
• K Mod 10: 24 Mod 10 = 4 (Hash Value)
• K Mod n: 24 Mod 10 = 4 (Hash Value) 67
48
Other Hash Functions
• Mid Square: 123 Mid value = 2 then
2² = 4 (index)
• Folding Method: 123456 ( 0 – 999 indexes)
123 + 456 = 579 (index)
Finding a Key
To find a key use the same hash function you used to insert the key
in the hash table.
By using the hash functions we can perform the following functions
in the hash table:
1. Traverse
2. Insertion
3. Deletion
4. Searching a key
Collision in Hashing
Techniques
Chaining (Open Hashing)
Open Addressing (Closed Hashing)
1. Linear Probing
2. Quadratic Probing
3. Double Hashing
Chaining (Open Hashing)
External chains
Like a linked list, we insert keys in a chain at the same index.
Keys: (42 , 19 , 10 , 12)
Hash Function : K Mod 5
Chaining (Open Hashing)
node
º 10 º 10
º 42 12 º 12 42
º 19 º 19
To insert text in a chain we store the address of the key in the index and make a singly linked list in
which each node has the address of the next node also. It means that the index of the hash table will
act like a head of the first node in chaining.
Continued…
Advantages:
Deletion is easy
Load Factor:
O(1) time for insertion
𝑡𝑜𝑡𝑎𝑙 𝑛𝑜.𝑜𝑓 𝑒𝑙𝑒𝑚𝑒𝑛𝑡𝑠/𝑘𝑒𝑦𝑠
Disadvantages: =
𝑠𝑙𝑜𝑡𝑠
Searching O(n)
Extra space
Linear Probing
R(K) = K Mod 10 19
R’(K , i) = (h(K) + i)(mod 10)
Keys (43 , 135 , 72 , 23 , 99 , 19 , 82) 72
{for 23} 43
23
R’(K , i) = 3 + 1 mod 10 = 4 mod 10 = 4
135
Hash value Collision number/prob. number
82
Similarly, for other collisions.
99
Continued…
Advantages:
No extra space
Disadvantages:
Search time O(n)
Deletion difficult
Primary clustering (If the primary hash index is x, subsequent probes go to x+1 , x+2, x+3 and so on, this
results in Primary Clustering. Once the primary cluster forms, the bigger the cluster gets, the faster it grows.)
Secondary clustering (If the primary hash index is x, probes go to x+1, x+4, x+9 , x+16, x+25 and so on,
this results in Secondary Clustering.)
Quadratic Probing
R(K) = K Mod 10 36
R’(K , i) = (h(K) + i²)(mod 10) 91
Keys (42 , 16 , 91 , 33 , 18 , 27 , 36 , 62) 42
{for 36} 33
R’(K , i) = 6 + 1² mod 10 = 7 mod 10 = 7
6 + 2² mod 10 = 10 mod 10 = 0
16
Similarly, for other collisions.
27
18
Continued…
Advantages:
No extra space
Primary clustering resolved
Disadvantages:
Searching in O(n) time
Secondary clustering
No guarantee of finding slot
Double Hashing
R1(K) = K mod 11
R2(K) = 8 – (K mod 8) 34
R’(K , i) = (R1(K) + iR2(K)) mod 11
Keys (20 , 34 , 45 , 70 , 56) 56
45
{for 70}
R’(K , i) = (4 + (1*3)) mod 11 = 4
70
Similarly , for other collisions.
20
Continued…
Advantages:
No extra space
No primary clustering
No secondary clustering
Disadvantages:
Search time O(n)
References
Channel name: Gate Smashers
https://youtu.be/W5q0xgxmRd8
https://youtu.be/j612Fj-mgCY
https://youtu.be/hmMYPZ5wLX0
https://youtu.be/ZEyPqqRTO00
https://youtu.be/-yQ2Kj-Jn0A
https://youtu.be/1P7ygNSe9lY
THE END
THANK YOU FOR WATCHING