5.9 Shannon-Fano-Elias Coding

发布时间 2023-10-17 12:50:15作者: 李斯赛特

  • Define the modified cdf \(\overline{F}(x)\) based on the standard cdf \(F(x)\):

\[\overline{F}(x)=\sum_{a<x}p(a)+\frac{1}{2}p(x)=F(x-1)-\frac{1}{2}p(x) \]

  • Assume that \(p(x)>0\) for all \(x\in S_X\), then \(\overline{F}(a)\neq \overline{F}(b)\) if \(a\neq b\).

  • So we can determine \(x\) if we know \(\overline{F}(x)\), and we can use the value of \(\overline{F}(x)\) as a code for \(x\).

  • But, in general, \(\overline{F}(x)\) is a infinite decimal. So, we have consider using an approximate value of \(\overline{F}(x)\) and consider the required accuracy.

  • Assume that we truncate \(\overline{F}(x)\) yp \(l(x)\) bits, denoted by \(\lfloor\overline{F}(x)\rfloor_{l(x)}\), i.e. \(\lfloor 0.34\overline{8125}\rfloor_{4}=0.3481\).

  • Get some bounds:

    \[\overline{F}(x)-\lfloor\overline{F}(x)\rfloor_{l(x)} \leq \frac{1}{2^{l(x)}} \]

    it is very straightforward that \(0.34\overline{8125}-\lfloor 0.34\overline{8125}\rfloor_{4}\leq \frac{1}{10^4}\leq \frac{1}{2^4}\), and it is a very loose bound.

    Let \(l(x)=\lceil \log\frac{1}{p(x)}\rceil+1\), then

    \[\frac{1}{2^{l(x)}} < \frac{1}{2^{\log\frac{1}{p(x)}+1}}=\frac{p(x)}{2}=\overline{F}(x)-F(x-1) \]

    Combining above inequalities, we have

    \[\overline{F}(x)-\lfloor\overline{F}(x)\rfloor_{l(x)} < \overline{F}(x)-F(x-1) \]

    That is, $\lfloor\overline{F}(x)\rfloor_{l(x)} $ lies within the step corresponding to \(x\), i.e. \(\lfloor\overline{F}(x)\rfloor_{l(x)} \in (F(x-1), \overline{F}(x)]\), so \(l(x)\) bits suffice to describe \(x\).

  • Then, the code is also required to be prefix-free. The algorithm converts \(\lfloor\overline{F}(x)\rfloor_{l(x)}\) in binary, and uses the first \(l(x)\) bits after the decimal point as the codeword for \(x\).

  • The way to converts a decimal (i.e. \(0.34\overline{8125}\)) in binary:

    1. multiply this decimal by 2 (\(0.34\overline{8125}\times 2=0.69625...\))
    2. record the number before the decimal point (\(0\)), and set it to be 0 (\(0.69625...\rightarrow 0.69625...\)).
    3. multiply the new decimal by 2 (\(0.69625...\times 2=1.3925...\))
    4. record the number before the decimal point (\(1\)), and set it to be 0 (\(1.3925...\rightarrow 0.3925...\)).
    5. multiply the new decimal by 2 (\(0.3925...\times 2=0.785...\))
    6. record the number before the decimal point (\(0\)), and set it to be 0 (\(0.785...\rightarrow 0.785...\)).
    7. repeat above procedure \(l(x)\) times, the binary code is the digits we recorded (i.e. \(010...\))
  • The expected length of Shannon-Fano-Elias Code is:

    \[L=\sum_x p(x) l(x)=\sum_x p(x)(\lceil \log\frac{1}{p(x)}\rceil+1)<H(X)+2 \]

Some examples: