Adp Huffman Coding
Adp Huffman Coding
Adp Huffman Coding
KABILAN D - 19IT043
GOPINATH K S - 19IT030
Problem definition
Data Transfer Speed : Even a system that can handle large
amounts of data transfer is still slowed down when a lot of users
connect to it at once.
Storage Space: File compression is intended to reduce the
storage requirements of data that provide no additional
information, such as white space on a page.
Solution methodology
MESSAGE - BCCABBDDAECCBBAEDDCC
LENGTH OF MESSAGE - 20
A 3 001 3*3 = 9
B 5 10 5*2 = 10
C 6 11 6*2 = 12
D 4 01 4*2 = 8
E 2 000 2*3 = 6
TOTAL 45 BITS
BITS
FLOWCHART
code
import heapq
import os
class HuffmanCoding:
def __init__(self, path):
self.path = path
self.heap = []
self.codes = {}
self.reverse_mapping = {}
class HeapNode:
def __init__(self, char, freq):
self.char = char
self.freq = freq
self.left = None
self.right = None
def merge_nodes(self):
while(len(self.heap)>1):
node1 = heapq.heappop(self.heap)
node2 = heapq.heappop(self.heap)
heapq.heappush(self.heap, merged)
def make_codes(self):
root = heapq.heappop(self.heap)
current_code = ""
self.make_codes_helper(root, current_code)
padded_info = "{0:08b}".format(extra_padding)
encoded_text = padded_info + encoded_text
return encoded_text
code
def get_byte_array(self, padded_encoded_text):
if(len(padded_encoded_text) % 8 != 0):
print("Encoded text not padded properly")
exit(0)
b = bytearray()
for i in range(0, len(padded_encoded_text), 8):
byte = padded_encoded_text[i:i+8]
b.append(int(byte, 2))
return b
def compress(self):
filename, file_extension = os.path.splitext(self.path)
output_path = filename + ".bin"
with open(self.path, 'r+') as file, open(output_path, 'wb') as output:
text = file.read()
text = text.rstrip()
frequency = self.make_frequency_dict(text)
self.make_heap(frequency)
self.merge_nodes()
self.make_codes()
encoded_text = self.get_encoded_text(text)
padded_encoded_text = self.pad_encoded_text(encoded_text)
b = self.get_byte_array(padded_encoded_text)
output.write(bytes(b))
print("Compressed")
return output_path
Result and discussion
Original text
We can see that size of the file is reduced to the possible lowes size
using huffman coding
SPACE COMPLEXITY : If we have n symbol then we need to store each Symbol in Array so