MMC Unit III-1
By
M. C. Aralimarad
Introduction to Audio and Video Compression
• Key Characteristics:
• Unlike text and images, audio and video signals are continuously varying analog
signals.
• Digitization involves a continuous stream of digital values representing sampled
analog signals.
• Compression Differences:
• Algorithms differ for digitized audio/video compared to text/image data.
• Audio Compression (Section 4.2)
• Digitization Process:
• Performed using Pulse Code Modulation (PCM).
• Sampling rate: At least twice the maximum frequency (Nyquist rate).
• Sampling Examples:
• Speech signal: Max frequency = 10 kHz → Sampling rate = 20 kHz.
• Music: Max frequency = 20 kHz → Sampling rate = 40 kHz.
• Bit Requirements:
• Speech: Typically 12 bits per sample.
• General audio: 16 bits per sample.
• Stereo signal: Two channels are digitized.
• Resulting bit rates:
• Speech (mono): 20 kHz × 12 bits = 240 kbps.
• Music (stereo): 2 × 40 kHz × 16 bits = 1.28 Mbps.
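The bit rates follow directly from the sampling rates and sample sizes listed above. A minimal sketch of that arithmetic is given here; the function name `pcm_bit_rate` is illustrative and not taken from the course text.

```python
def pcm_bit_rate(max_freq_hz, bits_per_sample, channels=1):
    """Return the PCM bit rate in bits per second at the Nyquist sampling rate."""
    sampling_rate_hz = 2 * max_freq_hz  # Nyquist: at least twice the maximum frequency
    return sampling_rate_hz * bits_per_sample * channels


speech_bps = pcm_bit_rate(10_000, 12)            # mono speech, 12 bits per sample
music_stereo_bps = pcm_bit_rate(20_000, 16, 2)   # stereo music, 16 bits per sample

print(speech_bps)        # 240000  (240 kbps)
print(music_stereo_bps)  # 1280000 (1.28 Mbps)
```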
• Compression Methods
• Challenges:
• High bit rates exceed available channel bandwidth.
• Solutions:
• Lower Sampling Rate:
• Reduces quality due to loss of high-frequency components.
• Compression Algorithms:
• Efficiently reduce data rates while preserving acceptable quality.
• Differential Pulse Code Modulation (DPCM) - Section 4.2.1
• Overview:
• Derived from PCM.
• Encodes differences between successive audio signal samples.
• Advantages:
• Reduces required bits per sample:
• Standard PCM for voice: 64 kbps (8 bits per sample at 8 ksps).
• DPCM reduces it to 56 kbps, a saving of about 1 bit per sample.
• DPCM Encoder
• Components:
• Bandlimiting Filter: Limits input signal's frequency bandwidth.
• Analog-to-Digital Converter (ADC): Digitizes input samples.
• Subtractor & Register (R):
• Computes the difference between current and previous sample values.
• Adder: Updates the register for future operations.
• Parallel-to-Serial Converter: Outputs DPCM signal.
• Figure Reference: Fig. 4.1(a).
• DPCM Decoder
• Components:
• Serial-to-Parallel Converter: Reads DPCM data stream.
• Adder: Reconstructs the signal using stored values in register (R).
• Digital-to-Analog Converter (DAC): Converts reconstructed signal back to
analog.
• Low-Pass Filter: Smoothens signal for playback.
• Figure Reference: Fig. 4.1(a).
• Timing Diagram of DPCM
• Process Breakdown:
• R0 = current register value.
• DPCM = PCM − R0.
• R1 = R0 + DPCM.
• Timing Considerations:
• T0: Time for encoding PCM to DPCM.
• T1: Time for updating the register.
• Figure Reference: Fig. 4.1(b).
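The register equations above (DPCM = PCM − R0, then R1 = R0 + DPCM) can be sketched in a few lines. This is a simplified illustration, not the circuit of Fig. 4.1: it assumes integer samples, a register that starts at 0, and no limit on the size of the difference value, whereas a real DPCM encoder restricts the difference to fewer bits than the PCM sample. The sample values are made up for the example.

```python
def dpcm_encode(pcm_samples, initial_register=0):
    """Encode PCM samples as differences from the register contents."""
    register = initial_register            # R0
    dpcm_values = []
    for pcm in pcm_samples:
        diff = pcm - register              # DPCM = PCM - R0
        dpcm_values.append(diff)
        register = register + diff         # R1 = R0 + DPCM
    return dpcm_values


def dpcm_decode(dpcm_values, initial_register=0):
    """Rebuild PCM samples by accumulating the differences in a register."""
    register = initial_register
    pcm_samples = []
    for diff in dpcm_values:
        register = register + diff         # same register update as the encoder
        pcm_samples.append(register)
    return pcm_samples


samples = [100, 102, 105, 103, 103, 98]    # hypothetical PCM values
encoded = dpcm_encode(samples)
decoded = dpcm_decode(encoded)

print(encoded)             # [100, 2, 3, -2, 0, -5]
print(decoded == samples)  # True
```

After the first sample the differences are small, which is why they can be represented with fewer bits than the full PCM values.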
Predictive DPCM Signal Encoder and Decoder (schematic figure)
• Code-Excited LPC (CELP):
• Combines LPC with a codebook of excitation signals to improve quality.
• Used in mobile and VoIP applications.
• Balances bit rate and quality.
• Perceptual Coding:
• Removes inaudible parts of the signal based on psychoacoustic principles.
• Foundation for modern codecs (e.g., MP3).
• High compression with minimal perceptual loss.
• MPEG Audio Coders:
• Use subband filtering, psychoacoustic models, and perceptual coding for efficient audio compression.
• Widely used in multimedia and streaming.
• Support multiple layers with varying complexity.
• Dolby Audio Coders:
• Specialize in high-quality surround sound using adaptive transform coding.
• Used in cinema and home theaters.
• Provide immersive audio experiences.
Dolby Audio Coders: Forward Adaptive Bit Allocation
• Key Features:
• Bits allocated dynamically for each subband.
• Bit allocation data included with encoded samples.
• Advantages: Psychoacoustic model needed only in the encoder.
• Disadvantages:
• Bit allocation information increases overhead in the bitstream.
• Reduced efficiency of the available bit rate.
• Example: Illustrated in Figure 4.9(a).
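The forward adaptive scheme described above can be sketched as follows. This is a toy illustration under stated assumptions, not the method of any particular standard: the "psychoacoustic model" is replaced by the rule of thumb of roughly 6 dB of signal-to-noise ratio per bit, and the signal-to-mask ratios, the 4-bit allocation field, and the 12 samples per subband are all hypothetical values.

```python
import math

BITS_PER_ALLOCATION_FIELD = 4   # hypothetical width of each per-subband allocation field
SAMPLES_PER_SUBBAND = 12        # hypothetical number of samples per subband in a frame


def encoder_allocate(signal_to_mask_db):
    """Toy stand-in for the encoder's psychoacoustic model (about 6 dB per bit)."""
    allocation = []
    for smr in signal_to_mask_db:
        needed = math.ceil(smr / 6.0) if smr > 0 else 0   # masked subbands get 0 bits
        allocation.append(min(needed, 15))                # must fit in the 4-bit field
    return allocation


def side_info_bits(allocation):
    """Bits spent sending the allocation itself in the bitstream."""
    return len(allocation) * BITS_PER_ALLOCATION_FIELD


smr_db = [20.0, 11.0, -3.0, 5.0]            # hypothetical signal-to-mask ratios
allocation = encoder_allocate(smr_db)       # computed in the encoder only
overhead_bits = side_info_bits(allocation)  # transmitted alongside the samples
payload_bits = sum(allocation) * SAMPLES_PER_SUBBAND
overhead = overhead_bits / (overhead_bits + payload_bits)

print(allocation)          # [4, 2, 0, 1]
print(overhead_bits)       # 16
print(f"{overhead:.2f}")   # 0.16
```

The decoder simply reads the allocation from the frame, so it needs no psychoacoustic model; the cost is the share of the bitstream spent on allocation data.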
• Fixed Bit Allocation Strategy
• Description:
• Fixed bits assigned to subbands based on ear sensitivity.
• No need to transmit bit allocation data in the bitstream.
• Advantages: Simple and efficient.
• Example:
• Dolby AC-1 (Acoustic Coder 1) Standard
• 40 subbands at 32 ksps.
• Typical stereo bit rate: 512 kbps.
Dolby Audio Coders: Backward Adaptive Bit Allocation
• Key Features:
• Psychoacoustic model present in both encoder and decoder.
• Decoder computes bit allocation using subband samples.
• Advantages:
• Reduces overhead in the bitstream.
• Disadvantages:
• Dependency on the same psychoacoustic model in encoder and decoder.
• Encoder modification requires decoder updates.
• Example:
• Dolby AC-2 Standard
• Used in PC sound cards.
• Typical stereo bit rate: 256 kbps.
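The key property of backward adaptive allocation is that encoder and decoder run the same rule on data that is already in the bitstream, so no allocation fields need to be sent. A minimal sketch, assuming the shared input is a per-subband spectral envelope in dB; the allocation rule, the noise floor, and the envelope values are hypothetical and serve only to illustrate the dependency between the two sides.

```python
def allocation_from_envelope(envelope_db, noise_floor_db=30.0):
    """Toy shared rule: one bit per full 6 dB that a subband exceeds the floor."""
    return [max(0, int((level - noise_floor_db) // 6)) for level in envelope_db]


envelope_db = [60.0, 48.0, 33.0, 25.0]   # hypothetical transmitted spectral envelope

encoder_alloc = allocation_from_envelope(envelope_db)
decoder_alloc = allocation_from_envelope(envelope_db)   # same model, same result
print(encoder_alloc)                    # [5, 3, 0, 0]
print(encoder_alloc == decoder_alloc)   # True

# If the encoder's model changes but the decoder's does not, the two sides disagree.
modified_encoder_alloc = allocation_from_envelope(envelope_db, noise_floor_db=36.0)
print(modified_encoder_alloc)                    # [4, 2, 0, 0]
print(modified_encoder_alloc == decoder_alloc)   # False
```

The mismatch in the last lines is the disadvantage listed above: any change to the encoder's model requires matching decoder updates.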
Backward Adaptive vs. Forward Adaptive
Feature | Forward Adaptive | Backward Adaptive
Where Allocation Is Computed | Encoder only | Both encoder and decoder
Bit Allocation Info | Explicitly sent in the bitstream | Computed by decoder based on spectral envelope
Advantages | Simple decoder (no psychoacoustic model), no dependency on a decoder model | Reduced bitstream overhead, more efficient use of bits
Disadvantages | Increased bitstream overhead | Requires psychoacoustic model in decoder
Hybrid Backward/Forward Adaptive Bit Allocation
• Description:
• Combines forward and backward adaptive methods.
• Key Components:
• PMB: Backward psychoacoustic model.
• PMF: Forward psychoacoustic model (complex, used only in the encoder).
• Process:
• PMF computes differences between forward and backward allocations.
• Differences improve quantization accuracy and are transmitted to the decoder.
• Advantages:
• Flexible encoder modifications.
• Improved audio quality.
• Example:
• Dolby AC-3 Standard
• Used in HDTV and ATV applications.
• Comparable to MPEG audio standards.
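The hybrid process above can be sketched as a backward allocation computed on both sides plus a small correction sent by the encoder. This is an illustrative sketch only: the function names `pmb_allocation` and `pmf_allocation`, the allocation rules, and the envelope values are hypothetical and do not reproduce the AC-3 models.

```python
def pmb_allocation(envelope_db):
    """Backward model (PMB): simple rule run by both encoder and decoder."""
    return [max(0, int((level - 30.0) // 6)) for level in envelope_db]


def pmf_allocation(envelope_db):
    """Forward model (PMF): hypothetical refinement, present only in the encoder."""
    refined = pmb_allocation(envelope_db)
    loudest = envelope_db.index(max(envelope_db))
    refined[loudest] += 1            # toy refinement: one extra bit for the loudest subband
    return refined


envelope_db = [60.0, 48.0, 33.0, 25.0]   # hypothetical spectral envelope

# Encoder side: compute both allocations and transmit only their difference.
backward = pmb_allocation(envelope_db)
forward = pmf_allocation(envelope_db)
delta = [f - b for f, b in zip(forward, backward)]

# Decoder side: run PMB locally, then apply the received differences.
decoder_final = [b + d for b, d in zip(pmb_allocation(envelope_db), delta)]

print(backward)                  # [5, 3, 0, 0]
print(delta)                     # [1, 0, 0, 0]
print(decoder_final == forward)  # True
```

Because only the differences are transmitted, the encoder's forward model can be changed without touching the decoder, which is the flexibility advantage listed above.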
Block Design in Dolby AC-3
• Block Structure:
• Each block: 512 subband samples.
• Continuity: Last 256 samples repeated in the next block.
• PCM sampling rate: 32 ksps.
• Subband Details:
• Audio bandwidth: 15 kHz.
• Subband width: 62.5 Hz.
• Block duration: 16 ms (512 samples at 32 ksps), of which 256 samples (8 ms) are new.
• Bit Rate: Typical stereo rate: 192 kbps.
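The block figures above are related by simple arithmetic. The short check below assumes that the subband width is the sampling rate divided by the block size, which reproduces the 62.5 Hz quoted above; the variable names are illustrative.

```python
sampling_rate_sps = 32_000   # PCM sampling rate (32 ksps)
block_size = 512             # subband samples per block
new_samples = 256            # samples not repeated from the previous block

subband_width_hz = sampling_rate_sps / block_size          # 32000 / 512
block_duration_ms = 1000 * block_size / sampling_rate_sps  # whole block
new_audio_ms = 1000 * new_samples / sampling_rate_sps      # new audio per block

print(subband_width_hz)    # 62.5
print(block_duration_ms)   # 16.0
print(new_audio_ms)        # 8.0
```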
Comparison of Techniques
Feature | Forward Adaptive | Fixed Allocation | Backward Adaptive | Hybrid Approach
Bitstream Overhead | High | Low | Medium | Medium
Psychoacoustic Model | Encoder only | None in decoder | Both encoder/decoder | Encoder dominant
Flexibility | Moderate | Low | Low | High
Audio Quality | Good | Limited by fixed allocation | Good | Excellent
Applications
• Dolby AC-1:
• FM radio and TV audio.
• Low complexity, high efficiency.
• Dolby AC-2:
• Hi-fi quality for PC sound cards.
• Used in professional audio compression.
• Dolby AC-3:
• Advanced Television (ATV) and HDTV.
• Comparable to MPEG audio standards.
Summary
• Terminology:
• Video is also referred to as moving pictures; the terms "frame" and "picture" are used interchangeably.
• These notes use the term "frame."
• Compression via JPEG:
• Applying JPEG independently to each frame is called MJPEG (Moving JPEG).
• Compression ratios: 10:1 to 20:1.
• These ratios are insufficient for most video applications.
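To see why 10:1 to 20:1 is often not enough, it helps to work out the resulting bit rate for a concrete video format. The frame size, colour depth, and frame rate below are assumptions chosen for illustration (720 × 576 pixels, 24 bits per pixel, 25 frames per second); they are not taken from the course text.

```python
def raw_video_bit_rate(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second


# Hypothetical source format, for illustration only.
raw_bps = raw_video_bit_rate(720, 576, 24, 25)
print(f"raw  -> {raw_bps / 1e6:.1f} Mbps")        # raw  -> 248.8 Mbps

for ratio in (10, 20):                            # MJPEG compression ratios
    mjpeg_bps = raw_bps / ratio
    print(f"{ratio}:1 -> {mjpeg_bps / 1e6:.1f} Mbps")
# 10:1 -> 24.9 Mbps
# 20:1 -> 12.4 Mbps
```

Even at 20:1 the stream in this example still needs over 12 Mbps, which motivates exploiting redundancy between frames as well as within them.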
• Spatial and Temporal Redundancy
• Spatial Redundancy: Exists within individual frames.
• Temporal Redundancy: Exists between successive frames:
• Example:
• Minor movements like lips or eyes in video telephony.
• Larger movements like a person or vehicle in movies.
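Temporal redundancy can be made concrete by comparing two successive frames pixel by pixel. The sketch below uses two tiny hypothetical 4 × 4 greyscale frames in which only a small region changes, as in the video telephony example; it measures redundancy only and is not a compression algorithm.

```python
def changed_fraction(previous_frame, current_frame):
    """Fraction of pixels whose value differs between two equally sized frames."""
    total = 0
    changed = 0
    for prev_row, cur_row in zip(previous_frame, current_frame):
        for prev_pixel, cur_pixel in zip(prev_row, cur_row):
            total += 1
            if prev_pixel != cur_pixel:
                changed += 1
    return changed / total


# Hypothetical frames: only two pixels in the middle change.
frame_1 = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
frame_2 = [
    [10, 10, 10, 10],
    [10, 60, 50, 10],
    [10, 50, 40, 10],
    [10, 10, 10, 10],
]

print(changed_fraction(frame_1, frame_2))  # 0.125
```

Here 87.5% of the pixels are identical in both frames, so encoding only the changes would need far less data than encoding the second frame in full.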