Data Compression

Data compression is essential for reducing the storage space and bandwidth required for large volumes of graphics, audio, and video data. It involves coding techniques that transform input data into a compressed format, which can later be decompressed using a codec. Key methods include entropy encoding, such as Run Length Encoding and Huffman coding, which aim to eliminate redundant information while preserving useful content.

Uploaded by Snehasis
Copyright © All Rights Reserved

DATA COMPRESSION

WHY COMPRESS DATA?


• Uncompressed graphics, audio and video data
require
– large storage space and
– high bandwidth.
• Compression of digital audio and video data is
therefore required. Data compression and coding
are almost synonymous.
• Text data requires the least storage space,
images require more, and audio and video data
require still more.
• Data compression is needed because:
• Large data volumes on secondary storage may
make the system too expensive and sometimes
infeasible.
• Relatively slow storage media may not allow
data to be transferred from secondary storage
devices to output devices in real time.
• Bandwidth limitations may not allow real-time
audio/video (A/V) transmission over networks.
Data compression technology
• It is an example of a coding technique.
• Coding means transforming the input bit stream
into a new bit stream according to some
principle. The source data is passed through a
compressor (coder) and then stored on a storage
medium such as disk, CD or tape. It is later
decompressed, or expanded, by an expander
(decoder).
• The combination of coder and decoder is called
a codec or compander.
• The compression ratio (CR) = size of the
uncompressed data set : size of the compressed
data set, with both sizes in the same unit
(bits or bytes).
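As a small illustration of the CR definition, the ratio can be computed and reduced to lowest terms. This is only a sketch: the function name compression_ratio is hypothetical, and the sizes 64 and 36 are the character counts from the run-length example in this document.

```python
from fractions import Fraction


def compression_ratio(uncompressed_size: int, compressed_size: int) -> Fraction:
    """Return CR = uncompressed size : compressed size, in lowest terms.

    Both sizes must be in the same unit (bits or bytes).
    """
    if uncompressed_size < 0 or compressed_size <= 0:
        raise ValueError("sizes must be non-negative, compressed size non-zero")
    return Fraction(uncompressed_size, compressed_size)


cr = compression_ratio(64, 36)
print(f"CR = {cr.numerator}:{cr.denominator}")  # 64:36 reduces to 16:9
print(f"as a number: {float(cr):.2f}")
```

A CR greater than 1 means the compressed data set is smaller than the original.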
Entropy encoding principle
Entropy:
• The total information content (TIC) of any
information object
= useful information content (entropy) + redundant
information content (RIC)
• In ideal compression, all redundant information is
removed and all useful information is retained. This
is the aim of entropy encoding, which is lossless
compression: some RIC may be left, but no
entropy should be lost. Source encoding may be
lossless or lossy. Lossless encoding is reversible.
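The text gives no formula for entropy. One common way to quantify it, assumed here and not taken from the source, is Shannon entropy measured in bits per symbol. The sketch below also assumes that the uncompressed data uses a fixed 8 bits per symbol, so that the TIC per symbol is 8 bits and the RIC is whatever remains after the entropy is subtracted. The function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy_bits(data: str) -> float:
    """Average useful information per symbol, in bits (Shannon entropy)."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def redundancy_bits(data: str, bits_per_symbol: int = 8) -> float:
    """RIC per symbol = TIC per symbol - entropy per symbol.

    Assumes a fixed-width encoding of bits_per_symbol bits (8 by default).
    """
    return bits_per_symbol - shannon_entropy_bits(data)


sample = "AABB"  # two equally likely symbols
print(shannon_entropy_bits(sample))  # 1 bit of useful information per symbol
print(redundancy_bits(sample))       # the other 7 bits per symbol are redundant
```

Under these assumptions, an ideal lossless coder could store this sample in about 1 bit per symbol instead of 8.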

• Data compression techniques:
• Entropy encoding is of two types: repetitive
sequence encoding and statistical encoding.
• a) Repetitive sequence: the oldest technique of data
compression. A sequence of repeated bits or bytes
is represented by the number of repetitions and a special
character. The general form of this technique is called
Run Length Encoding (RLE); a special case is zero/blank
character replacement.
• If a character C is repeated r times in the input data,
the run is represented by the character C, followed by a
special character, followed by the number of
repetitions r. RLE is beneficial only for runs of 4 or more
characters, where a considerable CR is achieved.
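• Ex: The sequence of characters A BB CCC DDDD EEEEE
……J can be replaced by A BB CCC D4 E5 F6 …….. I9
J10.
• Counting the separating blanks, the original sequence
is 64 characters and the compressed one is 36 characters
(each run of 4 or more is stored as three symbols: the
character, the special character, which is not shown
above, and the count). So, CR = 64:36.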
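The run-length scheme described above can be sketched as follows. This is a minimal illustration, not the source's implementation: the flag character "!" is a hypothetical choice (the text does not name the special character), the count is written as decimal text, and the input is assumed to contain neither the flag nor any digits.

```python
from itertools import groupby

FLAG = "!"   # hypothetical special character; the source does not name one
MIN_RUN = 4  # shorter runs are copied unchanged, as the text recommends


def rle_encode(text: str) -> str:
    """Replace each run of MIN_RUN or more equal characters by C + FLAG + r."""
    out = []
    for ch, group in groupby(text):
        run = len(list(group))
        if run >= MIN_RUN:
            out.append(f"{ch}{FLAG}{run}")
        else:
            out.append(ch * run)
    return "".join(out)


def rle_decode(encoded: str) -> str:
    """Invert rle_encode; assumes the original text had no FLAG and no digits."""
    out = []
    i = 0
    while i < len(encoded):
        ch = encoded[i]
        if i + 1 < len(encoded) and encoded[i + 1] == FLAG:
            j = i + 2
            while j < len(encoded) and encoded[j].isdigit():
                j += 1
            out.append(ch * int(encoded[i + 2:j]))
            i = j
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# The A..J example from the text: "A BB CCC DDDD ... JJJJJJJJJJ"
sequence = " ".join(chr(ord("A") + k) * (k + 1) for k in range(10))
encoded = rle_encode(sequence)
print(encoded)
print(len(sequence), len(encoded))
assert rle_decode(encoded) == sequence  # lossless: decoding restores the input
```

Because this sketch writes the count as decimal text, the run of ten J characters takes four characters rather than three, so the encoded length here is 37 rather than the 36 quoted in the text.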
• Blanking is a special case of RLE. Blank spaces
are often encountered in text documents. In some data
streams the repeated character is a zero-amplitude
signal, e.g. in a communication channel an audio
signal may have periods of silence. Blanking works by
specifying the number of zeros/blanks.
• Ex: S.P. = 000052.00 C.P. = 000034.00 Profit = 000018.00
• becomes S.P. 452.00 C.P. 434.00 Profit 418.00,
where the leading 4 in each amount is the count of the
four zeros it replaces.
Huffman coding is a method of data compression that is
independent of the data type, that is, the data could
represent an image, audio or spreadsheet. This compression
scheme is used in JPEG and MPEG-2. Huffman coding works by
looking at the data stream that makes up the file to be
compressed.
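The description above stops before the algorithm itself. As a minimal sketch of the standard Huffman construction (not taken from the source): count how often each symbol occurs in the data stream, then repeatedly merge the two least frequent groups, so that frequent symbols end up with short bit codes. The function name huffman_codes is hypothetical.

```python
import heapq
from collections import Counter


def huffman_codes(data: str) -> dict:
    """Build a prefix-free bit code in which frequent symbols get short codes."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code built so far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # least frequent group
        w2, _, right = heapq.heappop(heap)  # second least frequent group
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


data = "AAAABBC"
codes = huffman_codes(data)
encoded_bits = sum(len(codes[ch]) for ch in data)
print(codes)
print(encoded_bits, "bits instead of", 8 * len(data))
```

Here the most frequent symbol, A, receives a one-bit code and the rarer B and C receive two-bit codes, which is the statistical encoding idea: code length follows symbol frequency rather than being fixed.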
