0% found this document useful (0 votes)
60 views34 pages

Hashing

Hashing is a technique for storing and retrieving data in an array-based data structure called a hash table. It works by using a hash function to generate an index into the hash table from a key associated with the data, allowing for fast lookup. The time required to search for data using hashing is constant, regardless of the number of elements, unlike other search techniques where time increases with more elements. Collisions, where two keys hash to the same index, are resolved using techniques like separate chaining or open addressing. Dynamic hashing allows the hash table size to grow or shrink as needed. Common hash functions aim to evenly distribute keys across the table and be simple to compute.

Uploaded by

Amisha Shetty
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
60 views34 pages

Hashing

Hashing is a technique for storing and retrieving data in an array-based data structure called a hash table. It works by using a hash function to generate an index into the hash table from a key associated with the data, allowing for fast lookup. The time required to search for data using hashing is constant, regardless of the number of elements, unlike other search techniques where time increases with more elements. Collisions, where two keys hash to the same index, are resolved using techniques like separate chaining or open addressing. Dynamic hashing allows the hash table size to grow or shrink as needed. Common hash functions aim to evenly distribute keys across the table and be simple to compute.

Uploaded by

Amisha Shetty
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 34

Hashing

• In all search techniques like linear search, binary search and search
trees, the time required to search an element depends on the total
number of elements present in that data structure.
• In all these search techniques, as the number of elements increases
the time required to search an element also increases linearly.
• Hashing is another approach in which time required to search an
element doesn't depend on the total number of elements.
• Using hashing data structure, a given element is searched with
constant time complexity.
• Hashing is an effective way to reduce the number of comparisons to
search an element in a data structure.
• Hashing is the process of indexing and retrieving element (data) in a
data structure to provide a faster way of finding the element using a
hash key.
• In this data structure, we use a concept called Hash table to store
data.
● All the data values are inserted into the hash table based on the hash
key value.
● The hash key value is used to map the data with an index in the hash
table. And the hash key is generated for every data using a hash
function.
● That means every entry in the hash table is based on the hash key
value generated using the hash function.
● Hash function : Hash function is a function which takes a piece of
data (i.e. key) as input and produces an integer (i.e. hash value) as
output which maps the data to a particular index in the hash table.
• A Has Table can be implemented usually using an ordinary array
H[0………..m-1]
• The size of the hash table is limited and so it is necessary to map the
given data into this fairly restricted set of integers.
• If the size of the hash table is fixed and size of the table is smaller
than the total number of keys generated then two or more keys will
have the same hash address
• This phenomenon of two or more keys being hashed to the same
location of hash table is called Collision.
How to calculate the hash key?
Let's take hash table size as 7.
size = 7
arr[size];

Formula to calculate key is,

key = element % size

Initialize the Hash Bucket


• Before inserting elements into
array. Let’s make array default
value as -1.
• -1 indicates element not
present or the particular index
is available to insert.
Inserting elements in the hash table
i) insert 24
ii)insert 8
iii) insert 14
Searching elements from the hash table
i) search 8
ii) search 19
What if we insert an element say 15 to existing hash table?
Insert : 15
Key = element % key
Key = 15 % 7 = 1
But already arr[1] has element 8 !
Here, two or more different elements pointing to the same index under
modulo size. This is called collision.
Collision Avoidance
• The Collision can be avoided by using the following techniques
1. Open Hashing : (separate chaining)
- In open hashing, keys are stored in linked lists attached to cells
of a hash table. Each list contains all the keys hashed to its cell.
2. Closed Hashing:
- In closed hashing, all keys are stored in the hash table itself
without the use of linked lists.
- Different strategies can be employed for collision resolution.
- The simplest one—called linear probing—checks the cell
following the one where the collision occurs. If that cell is empty, the new
key is installed there; if the next cell is already occupied, the availability of
that cell’s immediate successor is checked, and so on.
Open Addressing is done in the following
ways:

a) Linear Probing: In linear probing, we linearly probe for


next slot
b) Quadratic Probing We look for i2‘th slot in i’th iteration.
c) Double Hashing We use another hash function hash2(x)
and look for i*hash2(x) slot in i’th rotation.
Open hashing or separate chaining
linear probing

Calculate the hash key. key = data % size;

If hashTable[key] is empty, store the value directly. hashTable[key] = data.

If the hash index already has some value, check for next index.

key = (key+1) % size;

If the next index is available hashTable[key], store the value. Otherwise try for next index.

Do the above process till we find the space.


Static hashing and Dynamic hashing

● The drawback of static hashing is that that it does not expand


or shrink dynamically as the size of the database grows or
shrinks.

● In Dynamic hashing, data buckets grows or shrinks (added or


removed dynamically) as the records increases or decreases.

● Dynamic hashing is also known as extended hashing.


● In dynamic hashing, the hash function is made to produce a large
number of values.

● For Example, there are three data records D1, D2 and D3 .

● The hash function generates three addresses 1001, 0101 and


1010 respectively.

● This method of storing considers only part of this address –


especially only first one bit to store the data.

● So it tries to load three of them at address 0 and 1.


● But the problem is that No bucket address is remaining for D3.
● The bucket has to grow dynamically to accommodate D3.
● So it changes the address have 2 bits rather than 1 bit, and then it
updates the existing data to have 2 bit address.
● Then it tries to accommodate D3.
Hash Functions

A good hash function should satisfy 2 criteria:

● A hash function should generate the hash addresses


such that all the key are distributed as evenly as
possible among the various cells of the hash table.

● Computation of a key by hash function should be


simple
Types of hash function

● Division method
● Mid square method
● Folding method
● Converting keys to integers
Division method
This is the easiest method to create a hash function. The hash function can be
described as

h(k) = k mod n

Here, h(k) is the hash value obtained by dividing the key value k by size of hash
table n using the remainder. It is best that n is a prime number as that makes
sure the keys are distributed with more uniformity.

An example of the Division Method is as follows −


k=1276
n=10
h(1276) = 1276 mod 10
=6

The hash value obtained is 6


Mid Square Method

The mid square method is a very good hash function. It involves squaring the value of the key and then
extracting the middle r digits as the hash value. The value of r can be decided according to the size of the
hash table.

An example of the Mid Square Method is as follows −

Suppose the hash table has 100 memory locations. So r=2 because two digits are required to map the key to
memory location.

k = 50

k*k = 2500

h(50) = 50

The hash value obtained is 50


Folding Method:

The key k is partitioned into a number of parts k 1, k2.... kn where each part
except possibly the last, has the same number of digits as the required
address.

Then the parts are added together, ignoring the last carry.
H (k) = k1+ k2+.....+kn

Example:
key=123987234876
k1=12, k2=39, k3=87, k4=23, k5=48, k6=76
Now the hash value can be obtained by adding all the partitioned keys as
shown below:
h(k)= 12+39+87+23+48+76=285

You might also like