Interval Coding
Modified Cumulative Distribution Function
Given a discrete probability distribution , we define the modified cumulative distribution function as: where is the usual cumulative distribution function.
Shannon-Fano-Elias Coding
For each outcome ,
- Calculate the value of modified cumulative distribution function .
- Calculate the length of the required binary string .
- Convert the value to binary .
- Trim the binary string to the required length , assign the trimmed binary string to the outcome as its codeword.
e.g. Suppose has outcomes and probabilities . Then we have the following table:
| 4 | |||||
| 5 | |||||
| 3 | |||||
| 3 |
Therefore we have the code .
Lossless Property
Proposition
The Shannon-Fano-Elias code is lossless.
Proof Denote the Shannon-Fano-Elias code as for the th outcome . Then we have That is the codeword lies entirely in the interval between and , and these intervals are disjoint. Therefore, the code is lossless.
Prefix Property
Proposition
Shannon-Fano-Elias code is prefix free.
Proof According to the previous argument, we know that . Additionally, Hence The intervals for each codeword are thus trivially disjoint, since we know each of the interval is disjoint, thus the code is prefix free.
Shannon-Fano-Elias Decoding
Shannon-Fano-Elias Decoding
To decode a given bitstring,
- start with the first bit, compute the corresponding binary interval;
- if the interval is strictly contained within that of a codeword:
- output the corresponding outcome
- skip over any redundant bits for this codeword
- repeat the process with the remaining bits
- otherwise include next bit and compute the corresponding interval.
e.g. Suppose we have with and we want to decode :
We could actually stop the process once we see since .
Expected Length
Theorem
The expected length of a Shannon-Fano-Elias code on ensemble satisfies:
Proof The lower bound is similarly obtained.