Data compression is essential for reducing the storage space and bandwidth required for large volumes of graphics, audio, and video data. It involves coding techniques that transform input data into a compressed format, which can later be decompressed using a codec. Key methods include entropy encoding, such as Run Length Encoding and Huffman coding, which aim to eliminate redundant information while preserving useful content.
Data Compression
WHY COMPRESS DATA?
• Uncompressed graphics, audio & video data require
  – large storage space and
  – huge bandwidth.
So compression of digital video & audio data is required. Data compression & coding are almost synonymous. Text data requires the least storage space, images require more, and audio & video data require still more. Data compression is needed because:
• Large data volumes on secondary storage may make a system too expensive & sometimes infeasible.
• Relatively slow storage media may not allow data transfer from secondary storage devices to output devices in real time.
• Bandwidth (BW) limitations may not allow real-time A/V transmission over networks.
Data compression technology
• It is an example of a coding technique.
• Coding means transforming the input bit stream into a new bit stream based on some principles. The source data is passed through a compressor (coder) & then stored on a storage medium such as disk, CD or tape. It is later decompressed, or expanded, in an expander (decoder).
• The combination of coder and decoder is called a codec or compander.
• The compression ratio (CR) = size of uncompressed data set in bits/bytes : size of compressed data set in bits/bytes
Entropy encoding principle
• Entropy is the useful part of an information object's content: total information content (TIC) = useful information content (entropy) + redundant information content (RIC)
• In ideal compression, all redundant information is removed & all useful information is retained. This happens in ENTROPY ENCODING, which is lossless compression. Here, some RIC may be left but no entropy should be lost. Source encoding may be lossless or lossy. Lossless encoding is reversible.
Data compression technique
• Entropy encoding: it is of two types – repetitive sequence & statistical encoding.
• a) Repetitive sequence – the oldest technique of data compression. A sequence of repeated bits or bytes is represented by the number of repetitions & a special character. The general form of this technique is called run-length encoding (RLE); a special case is zero/blank character replacement.
• If a character C is repeated r times in the input data, the run is represented by the character C, followed by a special character, followed by the number of repetitions r. RLE is beneficial for runs of 4 or more characters, where a considerable CR is achieved.
• Ex: the sequence of characters A BB CCC DDDD EEEEE …… J can be replaced by A BB CCC D4 E5 F6 …… I9 J10.
• The original sequence is 64 characters, the compressed one is 36 characters. So, CR = 64:36.
• Blanking is a special case of RLE. Blank spaces are often encountered in a text document. In some data streams the repeated character may represent zero signal amplitude, e.g. an audio signal on a communication channel may have periods of silence. Blanking works by specifying the number of zeros/blanks.
• Before: S.P. = 000052.00 C.P. = 000034.00 Profit = 000018.00
• After: S.P. 452.00 C.P. 434.00 Profit 418.00 (each run of four leading zeros is replaced by the count 4)
Huffman coding is a method of data compression that is independent of the data type; that is, the data could represent an image, audio or a spreadsheet. This compression scheme is used in JPEG and MPEG-2. Huffman coding works by looking at the data stream that makes up the file to be compressed.
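The run-length scheme described earlier in this section (character, special character, repetition count, applied only to runs of 4 or more) can be sketched in Python. This is a minimal illustration and not the slides' own implementation. The flag character "!", the names rle_encode and rle_decode, and the sample string are all assumptions made for the example. The sketch also assumes the input contains neither the flag character nor any digits, since either would make the encoded stream ambiguous.

```python
from itertools import groupby

FLAG = "!"    # hypothetical flag character; assumed never to occur in the input
MIN_RUN = 4   # runs shorter than this are copied through unchanged


def rle_encode(text: str) -> str:
    """Replace each run of MIN_RUN or more identical characters with C + FLAG + count."""
    out = []
    for char, group in groupby(text):
        run = len(list(group))
        if run >= MIN_RUN:
            out.append(f"{char}{FLAG}{run}")
        else:
            out.append(char * run)
    return "".join(out)


def rle_decode(encoded: str) -> str:
    """Invert rle_encode; assumes the original text held no FLAG and no digits."""
    out = []
    i = 0
    while i < len(encoded):
        char = encoded[i]
        if i + 1 < len(encoded) and encoded[i + 1] == FLAG:
            # Read the decimal repetition count that follows the flag.
            j = i + 2
            while j < len(encoded) and encoded[j].isdigit():
                j += 1
            out.append(char * int(encoded[i + 2:j]))
            i = j
        else:
            out.append(char)
            i += 1
    return "".join(out)


data = "ABBCCCDDDDEEEEEEEE"                # illustrative input, not from the slides
packed = rle_encode(data)
print(packed)                              # ABBCCCD!4E!8
print(f"CR = {len(data)}:{len(packed)}")   # CR = 18:12
print(rle_decode(packed) == data)          # True, the encoding is lossless
```

Runs of three or fewer characters are left alone because the three-symbol code (character, flag, count) would not make them any shorter, which is consistent with the 4-character threshold mentioned above.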