Redundancy (information theory)

In information theory, redundancy measures the fractional difference between the entropy $H(X)$ of an ensemble $X$ and its maximum possible value $\log(|\mathcal{A}_X|)$.[1][2] Informally, it is the amount of wasted "space" used to transmit certain data. Data compression is a way to reduce or eliminate unwanted redundancy, while checksums are a way of adding desired redundancy for purposes of error detection when communicating over a noisy channel of limited capacity.

Quantitative definition
In describing the redundancy of raw data, the rate of a source of information is the average entropy per symbol. For memoryless sources, this is merely the entropy of each symbol, while, in the most general case of a stochastic process, it is

$$r = \lim_{n \to \infty} \frac{1}{n} H(M_1, M_2, \dots, M_n),$$

the limit, as $n$ goes to infinity, of the joint entropy of the first $n$ symbols divided by $n$. It is common in information theory to speak of the "rate" or "entropy" of a language. This is appropriate, for example, when the source of information is English prose. The rate of a memoryless source is simply $H(M)$, since by definition there is no interdependence of the successive messages of a memoryless source.
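
As a minimal sketch, the rate of a memoryless source can be computed directly as the entropy of its symbol distribution. The four-symbol distribution below is an illustrative assumption, not taken from the article:

```python
# Rate of a memoryless source: the per-symbol entropy H(M).
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Hypothetical 4-symbol source (probabilities chosen for illustration).
probs = [0.5, 0.25, 0.125, 0.125]
r = entropy(probs)  # rate of the memoryless source = H(M)
print(f"rate r = {r:.2f} bits/symbol")  # -> 1.75 bits/symbol
```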

The absolute rate of a language or source is simply

$$R = \log |\mathbb{M}|,$$

the logarithm of the cardinality of the message space, or alphabet. (This formula is sometimes called the Hartley function.) This is the maximum possible rate of information that can be transmitted with that alphabet. (The logarithm should be taken to a base appropriate for the unit of measurement in use.) The absolute rate is equal to the actual rate if the source is memoryless and has a uniform distribution.

The absolute redundancy can then be defined as

$$D = R - r,$$

the difference between the absolute rate and the rate.
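
For a concrete (illustrative) example not drawn from the article: a binary memoryless source with $P(0) = 0.9$ and $P(1) = 0.1$ has rate $r = H(0.9, 0.1) \approx 0.469$ bits per symbol and absolute rate $R = \log_2 2 = 1$ bit per symbol, giving an absolute redundancy of $D = R - r \approx 0.531$ bits per symbol.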

The quantity $\frac{D}{R}$ is called the relative redundancy and gives the maximum possible data compression ratio, when expressed as the percentage by which a file size can be decreased. (When expressed as a ratio of original file size to compressed file size, the quantity $R : r$ gives the maximum compression ratio that can be achieved.) Complementary to the concept of relative redundancy is efficiency, defined as $\frac{r}{R}$, so that $\frac{r}{R} = 1 - \frac{D}{R}$. A memoryless source with a uniform distribution has zero redundancy (and thus 100% efficiency), and cannot be compressed.
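
Continuing the illustrative biased-coin source above, a short sketch (the probabilities are assumptions, not from the article) computes these quantities:

```python
# Absolute redundancy D = R - r, relative redundancy D/R, and
# efficiency r/R for a hypothetical binary memoryless source.
from math import log2

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

probs = [0.9, 0.1]    # hypothetical biased-coin source
r = entropy(probs)    # rate
R = log2(len(probs))  # absolute rate (Hartley function)
D = R - r             # absolute redundancy

print(f"relative redundancy D/R  = {D / R:.3f}")  # ~0.531
print(f"efficiency r/R           = {r / R:.3f}")  # ~0.469
print(f"max compression ratio R:r = {R / r:.3f}") # ~2.13
```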

Other notions
A measure of redundancy between two
variables is the mutual information or a
normalized variant. A measure of
redundancy among many variables is
given by the total correlation.
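
As a sketch of mutual information used as a two-variable redundancy measure, the following computes $I(X;Y)$ for a made-up joint distribution (all numbers here are illustrative assumptions):

```python
# Mutual information I(X;Y) of two binary variables from a joint
# distribution p(x, y); positive I(X;Y) means X and Y share information.
from math import log2

# Hypothetical joint distribution p(x, y).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Marginals p(x) and p(y).
px = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
py = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)}

# I(X;Y) = sum over (x,y) of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
mi = sum(p * log2(p / (px[x] * py[y])) for (x, y), p in joint.items() if p > 0)
print(f"I(X;Y) = {mi:.3f} bits")  # ~0.278 bits for this distribution
```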

Redundancy of compressed data refers to the difference between the expected compressed data length of $n$ messages, $L(M^n)$ (or expected data rate $L(M^n)/n$), and the entropy $nr$ (or entropy rate $r$). (Here we assume the data is ergodic and stationary, e.g., a memoryless source.) Although the rate difference $L(M^n)/n - r$ can be arbitrarily small as $n$ is increased, the actual difference $L(M^n) - nr$ cannot, although it can be theoretically upper-bounded by 1 in the case of finite-entropy memoryless sources.
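
As an illustration of this bound, the sketch below builds a Huffman code for a hypothetical memoryless source (the distribution is an assumption) and checks that the expected codeword length $L$ satisfies $r \le L < r + 1$, so the per-symbol redundancy stays below one bit:

```python
# Huffman coding of a memoryless source: expected codeword length L
# satisfies r <= L < r + 1, so the redundancy L - r is below 1 bit.
import heapq
from math import log2

def huffman_lengths(probs):
    """Return Huffman codeword lengths for the given probabilities."""
    # Heap items: (probability, tie-breaker, symbol indices in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:  # each merge adds one bit to these symbols
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths

probs = [0.6, 0.2, 0.1, 0.1]          # hypothetical source distribution
r = -sum(p * log2(p) for p in probs)  # entropy rate
L = sum(p * l for p, l in zip(probs, huffman_lengths(probs)))
print(f"entropy r = {r:.3f}, expected length L = {L:.3f}")
print(f"redundancy L - r = {L - r:.3f} bits (< 1)")
```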

See also
Minimum redundancy coding
Huffman encoding
Data compression
Hartley function
Negentropy
Source coding theorem
Overcompleteness

References
1. Here it is assumed $\mathcal{A}_X$ are the sets on which the probability distributions are defined.
2. MacKay, David J.C. (2003). "2.4 Definition of entropy and related functions". Information Theory, Inference, and Learning Algorithms. Cambridge University Press. p. 33. ISBN 0-521-64298-1. "The redundancy measures the fractional difference between H(X) and its maximum possible value, $\log(|\mathcal{A}_X|)$."

Reza, Fazlollah M. (1994) [1961]. An Introduction to Information Theory. New York: Dover [McGraw-Hill]. ISBN 0-486-68210-2.
Schneier, Bruce (1996). Applied Cryptography: Protocols, Algorithms, and Source Code in C. New York: John Wiley & Sons, Inc. ISBN 0-471-12845-7.
Auffarth, B.; Lopez-Sanchez, M.; Cerquides, J. (2010). "Comparison of Redundancy and Relevance Measures for Feature Selection in Tissue Classification of CT images". Advances in Data Mining. Applications and Theoretical Aspects. Springer. pp. 248–262. CiteSeerX 10.1.1.170.1528.
