Hashing
Hashing
• In all search techniques like linear search, binary search and search
trees, the time required to search an element depends on the total
number of elements present in that data structure.
• In all these search techniques, as the number of elements increases
the time required to search an element also increases linearly.
• Hashing is another approach in which time required to search an
element doesn't depend on the total number of elements.
• Using hashing data structure, a given element is searched with
constant time complexity.
• Hashing is an effective way to reduce the number of comparisons to
search an element in a data structure.
• Hashing is the process of indexing and retrieving element (data) in a
data structure to provide a faster way of finding the element using a
hash key.
• In this data structure, we use a concept called Hash table to store
data.
● All the data values are inserted into the hash table based on the hash
key value.
● The hash key value is used to map the data with an index in the hash
table. And the hash key is generated for every data using a hash
function.
● That means every entry in the hash table is based on the hash key
value generated using the hash function.
● Hash function : Hash function is a function which takes a piece of
data (i.e. key) as input and produces an integer (i.e. hash value) as
output which maps the data to a particular index in the hash table.
• A Has Table can be implemented usually using an ordinary array
H[0………..m-1]
• The size of the hash table is limited and so it is necessary to map the
given data into this fairly restricted set of integers.
• If the size of the hash table is fixed and size of the table is smaller
than the total number of keys generated then two or more keys will
have the same hash address
• This phenomenon of two or more keys being hashed to the same
location of hash table is called Collision.
How to calculate the hash key?
Let's take hash table size as 7.
size = 7
arr[size];
If the hash index already has some value, check for next index.
If the next index is available hashTable[key], store the value. Otherwise try for next index.
● Division method
● Mid square method
● Folding method
● Converting keys to integers
Division method
This is the easiest method to create a hash function. The hash function can be
described as
h(k) = k mod n
Here, h(k) is the hash value obtained by dividing the key value k by size of hash
table n using the remainder. It is best that n is a prime number as that makes
sure the keys are distributed with more uniformity.
The mid square method is a very good hash function. It involves squaring the value of the key and then
extracting the middle r digits as the hash value. The value of r can be decided according to the size of the
hash table.
Suppose the hash table has 100 memory locations. So r=2 because two digits are required to map the key to
memory location.
k = 50
k*k = 2500
h(50) = 50
The key k is partitioned into a number of parts k 1, k2.... kn where each part
except possibly the last, has the same number of digits as the required
address.
Then the parts are added together, ignoring the last carry.
H (k) = k1+ k2+.....+kn
Example:
key=123987234876
k1=12, k2=39, k3=87, k4=23, k5=48, k6=76
Now the hash value can be obtained by adding all the partitioned keys as
shown below:
h(k)= 12+39+87+23+48+76=285