
PMIT–6214: Information Coding

Instructor: M. Shamim Kaiser


Email: shamimkaiser@juniv.edu
Text phone: 01511000555

1
Information
• Quantifying information is the foundation of the field of
information theory.
• The intuition behind quantifying information is the idea of
measuring how much surprise there is in an event. Events that
are rare (low probability) are more surprising and therefore
carry more information than events that are common (high
probability).
• Low Probability Event: High Information (surprising).
• High Probability Event: Low Information (unsurprising).
• Rare events are more uncertain or more surprising and
require more information to represent them than common
events.

2
• We can calculate the amount of information there is in an event
using the probability of the event. This is called “Shannon
information,” “self-information,” or simply the “information,” and can
be calculated for a discrete event x as follows:
information(x) = -log( p(x) )
• Where log() is the base-2 logarithm and p(x) is the probability of the
event x.
• The choice of the base-2 logarithm means that the units of the
information measure are bits (binary digits). This can be directly
interpreted in the information-processing sense as the number of
bits required to represent the event.
• The negative sign ensures that the result is always positive or zero.
• Information is zero when the probability of an event is 1.0, i.e. a
certainty: there is no surprise.
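As an illustration, a minimal Python sketch of the formula above (the probabilities used are arbitrary example values, not from the slides):

import math

def self_information(p):
    # Shannon self-information I(x) = -lg p(x), in bits
    # (adding 0.0 avoids displaying -0.0 when p == 1)
    return -math.log2(p) + 0.0

# Arbitrary example probabilities (not from the slides)
for p in (1.0, 0.5, 0.1, 0.01):
    print(f"p(x) = {p:<4}  ->  information = {self_information(p):.3f} bits")
# p(x) = 1.0 gives 0 bits (no surprise); rarer events need more bits.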

3
• Entropy is the measurement of the average uncertainty of
information
• We will skip the proofs and background that lead us to the
formula for entropy, but it was derived from a set of required properties
• Also, keep in mind that this is a simplified explanation (a short
numerical sketch follows the formula below)
• H – entropy
• P – probability
• X – random variable with a discrete set of possible
outcomes
• (X0, X1, X2, … Xn-1) where n is the total number of possibilities
n 1 n 1
1
Entropy  H ( X )   Pj lg Pj   Pj lg
j 0 j 0 Pj
4
(Slides 5-7: "In short" summary and a worked example, shown as figures; not captured in this text export.)
• Entropy is greatest when the probabilities of the
outcomes are equal
• Let’s consider our fair coin experiment again
• The entropy H = ½ lg 2 + ½ lg 2 = 1
• Since each outcome has self-information of 1, the
average of 2 outcomes is (1+1)/2 = 1
• Consider a biased coin, P(H) = 0.98, P(T) = 0.02
• H = 0.98 * lg(1/0.98) + 0.02 * lg(1/0.02)
    = 0.98 * 0.029 + 0.02 * 5.643 = 0.0285 + 0.1129 = 0.1414
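A quick numerical check of the two coin calculations above (a minimal Python sketch, not part of the original slides):

import math

def entropy(probs):
    # H = sum(p * lg(1/p)), in bits
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))     # fair coin   -> 1.0 bit
print(entropy([0.98, 0.02]))   # biased coin -> ~0.1414 bits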
8
• In general, we must estimate the entropy
• The estimate depends on our assumptions about the structure
(read: the pattern) of the source of information
• Consider the following sequence:
1 2 3 2 3 4 5 4 5 6 7 8 9 8 9 10
• Obtaining the probabilities from the sequence:
• 16 digits; 1, 6, 7 and 10 each appear once, the rest appear twice
• The entropy H = 3.25 bits
• Since there are 16 symbols, we theoretically would
need 16 * 3.25 = 52 bits to transmit the information
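The estimate above can be reproduced with a short Python sketch (not part of the original slides):

import math
from collections import Counter

sequence = "1 2 3 2 3 4 5 4 5 6 7 8 9 8 9 10".split()
counts = Counter(sequence)    # 1, 6, 7, 10 appear once; the rest twice
n = len(sequence)             # 16 symbols

# Estimate probabilities as relative frequencies, then compute H
H = sum((c / n) * math.log2(n / c) for c in counts.values())
print(H)       # 3.25 bits per symbol
print(n * H)   # 52 bits for the whole sequence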

9
• Consider the following sequence:
1212441244444412444444

• Obtaining the probabilities from the sequence:
• 1 and 2 each appear four times (4/22 each)
• 4 appears fourteen times (14/22)

• The entropy H = 0.447 + 0.447 + 0.415 = 1.309 bits


• Since there are 22 symbols, we theoretically would need 22 *
1.309 = 28.798 (29) bits to transmit the information
• However, consider the two-symbol blocks 12 and 44
• Grouped into 11 pairs, 12 appears with probability 4/11 and 44 with probability 7/11
• H = 0.530 + 0.415 = 0.945 bits per pair
• 11 * 0.945 = 10.395 (11) bits to transmit the information, only about 38 % of the 29 bits needed above
• We might possibly be able to find patterns with even less entropy
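A sketch of the comparison above (Python, not part of the original slides): estimating the entropy first per single symbol and then per two-symbol block:

import math
from collections import Counter

def entropy(counts):
    # Empirical entropy in bits, from a table of symbol counts
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())

seq = "1212441244444412444444"

H1 = entropy(Counter(seq))                               # single symbols: ~1.309 bits
pairs = [seq[i:i + 2] for i in range(0, len(seq), 2)]    # "12", "12", "44", ...
H2 = entropy(Counter(pairs))                             # two-symbol blocks: ~0.95 bits

print(len(seq) * H1)      # ~28.8 bits with the per-symbol model
print(len(pairs) * H2)    # ~10.4 bits with the per-pair model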

10
(Slides 11-19: worked examples, shown as figures; not captured in this text export.)
Discrete Memoryless Channel

20
Lossless Channel

23
Deterministic Channel

24
Noiseless

25
BSC (Binary Symmetric Channel)

26
(Slides 27-32: worked examples, shown as figures; not captured in this text export.)
Mutual Information

33
Channel Capacity

36
Special Channel

37
(Slides 38-57: worked examples and supporting figures; not captured in this text export.)
AWGN

58
(Slides 59-62: worked examples, shown as figures; not captured in this text export.)
Entropy in Long Independent Sequences
• Consider a zero-memory source producing independent
symbols from the source alphabet S = {s1, s2, …, sq} with
probabilities P = {P1, P2, …, Pq} respectively, and consider a
long independent sequence of length L emitted by this source.
• On average, this long sequence contains
• P1·L symbols of type s1
• P2·L symbols of type s2
•…
• Pq·L symbols of type sq
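A numerical sketch of this idea in Python (the three-symbol source below is an assumed example, not from the slides): in a long sequence of length L each symbol si appears roughly Pi·L times, and the average information per symbol approaches the source entropy H(S):

import math
import random
from collections import Counter

# Assumed example source (alphabet and probabilities are not from the slides)
symbols = ["s1", "s2", "s3"]
probs   = [0.5, 0.3, 0.2]
L = 100_000

random.seed(0)
sequence = random.choices(symbols, weights=probs, k=L)

# Each symbol si occurs roughly Pi * L times in the long sequence
counts = Counter(sequence)
for s, p in zip(symbols, probs):
    print(s, counts[s], "observed vs.", p * L, "expected")

# Average information per symbol approaches the source entropy H(S)
H = sum(p * math.log2(1 / p) for p in probs)
print("H(S) =", H, "bits/symbol;  total ~", L * H, "bits for the sequence")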

63
Exercise
• A source emits one of four possible symbols x0, x1, x2, x3 during
each signaling interval. The symbols occur with probabilities
(0.4, 0.3, 0.2, 0.1). Find the amount of information gained by
observing the source emitting each of these symbols, and also
find the source entropy.
• A discrete message source S emits two independent
symbols X and Y with probabilities 0.55 and 0.45 respectively.
Calculate the efficiency of the source and its redundancy.
• A pair of dice is thrown simultaneously. The outcome of the
first die is X1 and that of the second die is X2. Two events are
defined as follows: A = {(X1, X2) such that X1 + X2 = 7} and
B = {(X1, X2) such that X1 > X2}. Which event conveys more
information? Comment.
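A short Python sketch for these exercises (it assumes the standard formulas from the earlier slides, I(x) = lg(1/p(x)) and H = Σ p·lg(1/p), and takes efficiency as H/Hmax):

import math

def lg(x):
    return math.log2(x)

# Exercise 1: information per symbol and source entropy
probs = [0.4, 0.3, 0.2, 0.1]
for i, p in enumerate(probs):
    print(f"I(x{i}) = {lg(1 / p):.3f} bits")
H1 = sum(p * lg(1 / p) for p in probs)
print(f"H = {H1:.3f} bits/symbol")                 # ~1.846 bits

# Exercise 2: efficiency = H / Hmax (Hmax = lg 2 for two symbols), redundancy = 1 - efficiency
pX, pY = 0.55, 0.45
H2 = pX * lg(1 / pX) + pY * lg(1 / pY)
print(f"efficiency = {H2 / lg(2):.4f}, redundancy = {1 - H2 / lg(2):.4f}")

# Exercise 3: compare the self-information of the two events
pA = 6 / 36     # X1 + X2 = 7: 6 of the 36 equally likely outcomes
pB = 15 / 36    # X1 > X2: 15 of the 36 outcomes
print(f"I(A) = {lg(1 / pA):.3f} bits, I(B) = {lg(1 / pB):.3f} bits")   # A is rarer, hence more informative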

64
Markoff Statistics Model
• In a real source there is inter-symbol influence: the occurrence
of a symbol xi at the current position of the message depends
on the q previously emitted symbols {s1, s2, …, sq}. Such a
source is called a qth-order Markoff (Markov) source.
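A minimal first-order (q = 1) sketch in Python (the two-symbol transition matrix and its stationary distribution below are assumed example values, not from the slides); one standard way to compute the entropy of such a source is to average the per-state conditional entropies, weighted by the stationary state probabilities:

import math

# Assumed first-order (q = 1) Markov source over two symbols A and B
# (the transition probabilities are example values, not from the slides)
# transition[i][j] = P(next symbol = j | current symbol = i)
transition = {
    "A": {"A": 0.9, "B": 0.1},
    "B": {"A": 0.4, "B": 0.6},
}
# Stationary state probabilities for this matrix (solution of p = p * P)
stationary = {"A": 0.8, "B": 0.2}

# Entropy of the source: per-state conditional entropies, averaged over the stationary distribution
H = sum(
    stationary[i] * sum(pij * math.log2(1 / pij) for pij in transition[i].values() if pij > 0)
    for i in transition
)
print(f"H = {H:.3f} bits/symbol")   # ~0.569, lower than 1 bit for an equiprobable memoryless source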

65
(Slides 66-76: figure-only slides; content not captured in this text export.)
