"Fossies" - the Fresh Open Source Software Archive

Member "quicktime4linux-2.3/thirdparty/libvorbis-1.1.1/doc/xml/03-codebook.xml" (31 May 2008, 15533 Bytes) of package /linux/privat/old/quicktime4linux-2.3-src.tar.gz:


    1 <?xml version="1.0" standalone="no"?>
    2 <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
    3                 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
    4 
    5 ]>
    6 
    7 <section id="vorbis-spec-codebook">
    8 <sectioninfo>
    9 <releaseinfo>
   10  $Id: 03-codebook.xml 7186 2004-07-20 07:19:25Z xiphmont $
   11 </releaseinfo>
   12 </sectioninfo>
   13 <title>Probability Model and Codebooks</title>
   14 
   15 <section>
   16 <title>Overview</title>
   17 
   18 <para>
   19 Unlike practically every other mainstream audio codec, Vorbis has no
   20 statically configured probability model, instead packing all entropy
   21 decoding configuration, VQ and Huffman, into the bitstream itself in
   22 the third header, the codec setup header.  This packed configuration
   23 consists of multiple 'codebooks', each containing a specific
   24 Huffman-equivalent representation for decoding compressed codewords as
   25 well as an optional lookup table of output vector values to which a
   26 decoded Huffman value is applied as an offset, generating the final
   27 decoded output corresponding to a given compressed codeword.</para>
   28 
   29 <section><title>Bitwise operation</title>
   30 <para>
   31 The codebook mechanism is built on top of the Vorbis bitpacker. Both
   32 the codebooks themselves and the codewords they decode are unrolled 
   33 from a packet as a series of arbitrary-width values read from the 
   34 stream according to <xref linkend="vorbis-spec-bitpacking"/>.</para>
   35 </section>
   36 
   37 </section>
   38 
   39 <section>
   40 <title>Packed codebook format</title>
   41 
   42 <para>
   43 For purposes of the examples below, we assume that the storage
   44 system's native byte width is eight bits.  This is not universally
   45 true; see <xref linkend="vorbis-spec-bitpacking"/> for discussion 
   46 relating to non-eight-bit bytes.</para>
   47 
   48 <section><title>codebook decode</title>
   49 
   50 <para>
   51 A codebook begins with a 24 bit sync pattern, 0x564342:
   52 
   53 <screen>
   54 byte 0: [ 0 1 0 0 0 0 1 0 ] (0x42)
   55 byte 1: [ 0 1 0 0 0 0 1 1 ] (0x43)
   56 byte 2: [ 0 1 0 1 0 1 1 0 ] (0x56)
   57 </screen></para>
   58 
   59 <para>
   60 16 bit <varname>[codebook_dimensions]</varname> and 24 bit <varname>[codebook_entries]</varname> fields:
   61 
   62 <screen>
   63 
   64 byte 3: [ X X X X X X X X ] 
   65 byte 4: [ X X X X X X X X ] [codebook_dimensions] (16 bit unsigned)
   66 
   67 byte 5: [ X X X X X X X X ] 
   68 byte 6: [ X X X X X X X X ] 
   69 byte 7: [ X X X X X X X X ] [codebook_entries] (24 bit unsigned)
   70 
   71 </screen></para>
   72 
   73 <para>
   74 Next is the <varname>[ordered]</varname> bit flag:
   75 
   76 <screen>
   77 
   78 byte 8: [               X ] [ordered] (1 bit)
   79 
   80 </screen></para>
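<para>
As an illustrative sketch of the fields read so far (not part of the
specification), the following Python reads the sync pattern,
<varname>[codebook_dimensions]</varname>, <varname>[codebook_entries]</varname>
and the <varname>[ordered]</varname> flag.  The names <literal>BitReader</literal>
and <literal>read_codebook_header</literal> are hypothetical; the reader assumes
eight-bit bytes and the least-significant-bit-first packing described in
<xref linkend="vorbis-spec-bitpacking"/>.
</para>
<programlisting><![CDATA[
```python
class BitReader:
    """Minimal LSb-first bit reader (hypothetical helper, not a libvorbis API)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # absolute bit position within the packet

    def read(self, nbits: int) -> int:
        """Read nbits as an unsigned integer; the first bit read is the LSb."""
        value = 0
        for i in range(nbits):
            byte_index, bit_index = divmod(self.pos, 8)
            if byte_index >= len(self.data):
                raise EOFError("end of packet")
            bit = (self.data[byte_index] >> bit_index) & 1
            value |= bit << i
            self.pos += 1
        return value


def read_codebook_header(reader: BitReader) -> dict:
    """Read the sync pattern, dimensions, entries and the ordered flag."""
    if reader.read(24) != 0x564342:
        raise ValueError("bad codebook sync pattern")
    return {
        "dimensions": reader.read(16),
        "entries": reader.read(24),
        "ordered": bool(reader.read(1)),
    }
```
]]></programlisting>
<para>
Raising an exception on a short read mirrors the rule that an 'end of
packet' during codebook decode leaves the stream undecodable.
</para>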
   81 
   82 <para>
   83 Each entry, numbering a
   84 total of <varname>[codebook_entries]</varname>, is assigned a codeword length.
   85 We now read the list of codeword lengths and store these lengths in
   86 the array <varname>[codebook_codeword_lengths]</varname>. Decode of lengths is
   87 according to whether the <varname>[ordered]</varname> flag is set or unset.
   88 
   89 <itemizedlist>
   90 <listitem>
   91   <para>If the <varname>[ordered]</varname> flag is unset, the codeword list is not
   92   length ordered and the decoder needs to read each codeword length
   93   one-by-one.</para> 
   94 
   95   <para>The decoder first reads one additional bit flag, the
   96   <varname>[sparse]</varname> flag.  This flag determines whether or not the
   97   codebook contains unused entries that are not to be included in the
   98   codeword decode tree:
   99 
  100 <screen>
  101 byte 8: [             X 1 ] [sparse] flag (1 bit)
  102 </screen></para>
  103 
  104 <para>
  105   The decoder now performs for each of the <varname>[codebook_entries]</varname>
  106   codebook entries:
  107 
  108 <screen>
  109   
  110   1) if([sparse] is set){
  111 
  112          2) [flag] = read one bit;
  113          3) if([flag] is set){
  114 
  115               4) [length] = read a five bit unsigned integer;
  116               5) codeword length for this entry is [length]+1;
  117 
  118             } else {
  119 
  120               6) this entry is unused.  mark it as such.
  121 
  122             }
  123 
  124      } else the sparse flag is not set {
  125 
  126         7) [length] = read a five bit unsigned integer;
  127         8) the codeword length for this entry is [length]+1;
  128         
  129      }
  130 
  131 </screen></para>
  132 </listitem>
  133 <listitem>
  134   <para>If the <varname>[ordered]</varname> flag is set, the codeword list for this
  135   codebook is encoded in ascending length order.  Rather than reading
  136   a length for every codeword, the decoder reads the number of
  137   codewords per length.  That is, beginning at entry zero:
  138 
  139 <screen>
  140   1) [current_entry] = 0;
  141   2) [current_length] = read a five bit unsigned integer and add 1;
  142   3) [number] = read <link linkend="vorbis-spec-ilog">ilog</link>([codebook_entries] - [current_entry]) bits as an unsigned integer
  143   4) set the entries [current_entry] through [current_entry]+[number]-1, inclusive, 
  144     of the [codebook_codeword_lengths] array to [current_length]
  145   5) set [current_entry] to [number] + [current_entry]
  146   6) increment [current_length] by 1
  147   7) if [current_entry] is greater than [codebook_entries] ERROR CONDITION; 
  148     the decoder will not be able to read this stream.
  149   8) if [current_entry] is less than [codebook_entries], repeat process starting at 3)
  150   9) done.
  151 </screen></para>
  152 </listitem>
  153 </itemizedlist>
  154 
  155 After all codeword lengths have been decoded, the decoder reads the
  156 vector lookup table.  Vorbis I supports three lookup types:
  157 <orderedlist>
  158 <listitem>
  159 <simpara>No lookup</simpara>
  160 </listitem><listitem>
  161 <simpara>Implicitly populated value mapping (lattice VQ)</simpara>
  162 </listitem><listitem>
  163 <simpara>Explicitly populated value mapping (tessellated or 'foam'
  164 VQ)</simpara>
  165 </listitem>
  166 </orderedlist>
  167 </para>
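<para>
The codeword length decode described above, covering both the unordered
and the ordered case, might be sketched in Python as follows.  This is
an illustration under stated assumptions, not reference code:
<literal>read</literal> is an assumed callable that returns the next
<literal>n</literal> bits as an unsigned integer, unused entries are
represented as <literal>None</literal> (a choice made for this sketch), and
<literal>ilog</literal> follows the helper linked from the ordered-case steps.
</para>
<programlisting><![CDATA[
```python
def ilog(x: int) -> int:
    """Position of the highest set bit; zero for x <= 0."""
    return x.bit_length() if x > 0 else 0


def read_codeword_lengths(read, entries: int, ordered: bool) -> list:
    """Return one codeword length per entry; None marks an unused entry."""
    lengths = [None] * entries

    if not ordered:
        sparse = bool(read(1))
        for i in range(entries):
            # In sparse mode a one-bit flag precedes each used entry.
            if not sparse or read(1):
                lengths[i] = read(5) + 1
        return lengths

    current_entry = 0
    current_length = read(5) + 1
    while current_entry < entries:
        number = read(ilog(entries - current_entry))
        if current_entry + number > entries:
            raise ValueError("length list overruns [codebook_entries]")
        for i in range(current_entry, current_entry + number):
            lengths[i] = current_length
        current_entry += number
        current_length += 1
    return lengths
```
]]></programlisting>
<para>
The overrun check is made before the array is written rather than after,
which rejects the same streams as step 7 of the ordered case.
</para>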
  168 
  169 <para>
  170 The lookup table type is read as a four bit unsigned integer:
  171 <screen>
  172   1) [codebook_lookup_type] = read four bits as an unsigned integer
  173 </screen></para>
  174 
  175 <para>
  176 Codebook decode proceeds according to <varname>[codebook_lookup_type]</varname>:
  177 <itemizedlist>
  178 <listitem>
  179 <para>Lookup type zero indicates no lookup to be read.  Proceed past
  180 lookup decode.</para>
  181 </listitem><listitem>
  182 <para>Lookup types one and two are similar, differing only in the
  183 number of lookup values to be read.  Lookup type one reads a list of
  184 values that are permuted in a set pattern to build a list of vectors,
  185 each vector of order <varname>[codebook_dimensions]</varname> scalars.  Lookup
  186 type two builds the same vector list, but reads each scalar for each
  187 vector explicitly, rather than building vectors from a smaller list of
  188 possible scalar values.  Lookup decode proceeds as follows:
  189 
  190 <screen>
  191   1) [codebook_minimum_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer) 
  192   2) [codebook_delta_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer) 
  193   3) [codebook_value_bits] = read 4 bits as an unsigned integer and add 1
  194   4) [codebook_sequence_p] = read 1 bit as a boolean flag
  195 
  196   if ( [codebook_lookup_type] is 1 ) {
  197    
  198      5) [codebook_lookup_values] = <link linkend="vorbis-spec-lookup1_values">lookup1_values</link>(<varname>[codebook_entries]</varname>, <varname>[codebook_dimensions]</varname> )
  199 
  200   } else {
  201 
  202      6) [codebook_lookup_values] = <varname>[codebook_entries]</varname> * <varname>[codebook_dimensions]</varname>
  203 
  204   }
  205 
  206   7) read a total of [codebook_lookup_values] unsigned integers of [codebook_value_bits] each; 
  207      store these in order in the array [codebook_multiplicands]
  208 </screen></para>
  209 </listitem><listitem>
  210 <para>A <varname>[codebook_lookup_type]</varname> of greater than two is reserved
  211 and indicates a stream that is not decodable by the specification in this
  212 document.</para>
  213 </listitem>
  214 </itemizedlist>
  215 </para>
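<para>
A minimal Python sketch of the lookup table decode follows.  The bodies of
<literal>float32_unpack</literal> and <literal>lookup1_values</literal> are
written from the helper definitions linked above and should be checked
against them; <literal>read</literal> is again an assumed callable returning
the next <literal>n</literal> bits as an unsigned integer, and the returned
dictionary layout is purely illustrative.
</para>
<programlisting><![CDATA[
```python
import math


def float32_unpack(x: int) -> float:
    """Unpack the 32 bit Vorbis float layout: sign, 10 bit exponent, 21 bit mantissa."""
    mantissa = x & 0x1FFFFF
    exponent = (x & 0x7FE00000) >> 21
    if x & 0x80000000:
        mantissa = -mantissa
    return math.ldexp(mantissa, exponent - 788)


def lookup1_values(entries: int, dimensions: int) -> int:
    """Greatest integer r such that r ** dimensions <= entries."""
    if dimensions < 1:
        raise ValueError("dimensions must be at least 1")
    r = 0
    while (r + 1) ** dimensions <= entries:
        r += 1
    return r


def read_lookup(read, entries: int, dimensions: int) -> dict:
    """Read the lookup type and, for types 1 and 2, the lookup parameters."""
    lookup_type = read(4)
    if lookup_type == 0:
        return {"type": 0}
    if lookup_type > 2:
        raise ValueError("reserved lookup type; stream is not decodable")
    minimum = float32_unpack(read(32))
    delta = float32_unpack(read(32))
    value_bits = read(4) + 1
    sequence_p = bool(read(1))
    if lookup_type == 1:
        count = lookup1_values(entries, dimensions)
    else:
        count = entries * dimensions
    multiplicands = [read(value_bits) for _ in range(count)]
    return {"type": lookup_type, "minimum": minimum, "delta": delta,
            "value_bits": value_bits, "sequence_p": sequence_p,
            "multiplicands": multiplicands}
```
]]></programlisting>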
  216 
  217 <para>
  218 An 'end of packet' during any read operation in the above steps is
  219 considered an error condition rendering the stream undecodable.</para>
  220 
  221 <section><title>Huffman decision tree representation</title>
  222 
  223 <para>
  224 The <varname>[codebook_codeword_lengths]</varname> array and
  225 <varname>[codebook_entries]</varname> value uniquely define the Huffman decision
  226 tree used for entropy decoding.</para>
  227 
  228 <para>
  229 Briefly, each used codebook entry (recall that length-unordered
  230 codebooks support unused codeword entries) is assigned, in order, the
  231 lowest valued unused binary Huffman codeword possible.  Assume the
  232 following codeword length list:
  233 
  234 <screen>
  235 entry 0: length 2
  236 entry 1: length 4
  237 entry 2: length 4
  238 entry 3: length 4
  239 entry 4: length 4
  240 entry 5: length 2
  241 entry 6: length 3
  242 entry 7: length 3
  243 </screen></para>
  244 
  245 <para>
  246 Assigning codewords in order (lowest possible value of the appropriate
  247 length to highest) results in the following codeword list:
  248 
  249 <screen>
  250 entry 0: length 2 codeword 00
  251 entry 1: length 4 codeword 0100
  252 entry 2: length 4 codeword 0101
  253 entry 3: length 4 codeword 0110
  254 entry 4: length 4 codeword 0111
  255 entry 5: length 2 codeword 10
  256 entry 6: length 3 codeword 110
  257 entry 7: length 3 codeword 111
  258 </screen></para>
  259 
  260 
  261 <note>
  262 <para>
  263 Unlike most binary numerical values in this document, we
  264 intend the above codewords to be read and used bit by bit from left to
  265 right, thus the codeword '001' is the bit string 'zero, zero, one'.
  266 When determining 'lowest possible value' in the assignment definition
  267 above, the leftmost bit is the MSb.</para>
  268 </note>
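<para>
The assignment rule can be sketched in Python as below.  This is one
possible formulation rather than the method used by any particular
decoder: it tracks the unused subtrees of the decision tree by their bit
prefix and, for each entry, takes the subtree that yields the lowest
codeword of the requested length.  The function name and the use of
<literal>None</literal> for unused entries are assumptions of this sketch.
</para>
<programlisting><![CDATA[
```python
def assign_codewords(lengths: list) -> list:
    """Return each entry's codeword as a '0'/'1' string, or None if unused."""
    free = [""]  # unused subtrees, identified by bit prefix; "" is the root
    codewords = []
    for length in lengths:
        if length is None:
            codewords.append(None)  # unused entries are skipped
            continue
        candidates = [p for p in free if len(p) <= length]
        if not candidates:
            raise ValueError("overspecified tree: no codeword available")
        # Padding with zeros gives the lowest codeword inside each subtree;
        # equal-length bit strings compare in numeric order.
        prefix = min(candidates, key=lambda p: p.ljust(length, "0"))
        free.remove(prefix)
        code = prefix.ljust(length, "0")
        # Every zero appended leaves the sibling 1 branch unused.
        for depth in range(len(prefix), length):
            free.append(code[:depth] + "1")
        codewords.append(code)
    return codewords
```
]]></programlisting>
<para>
Applied to the length list above, <literal>assign_codewords([2, 4, 4, 4, 4, 2, 3, 3])</literal>
reproduces the codeword list shown in the example.
</para>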
  269 
  270 <para>
  271 It is clear that the codeword length list represents a Huffman
  272 decision tree with the entry numbers equivalent to the leaves numbered
  273 left-to-right:
  274 
  275 <mediaobject>
  276 <imageobject>
  277  <imagedata fileref="hufftree.png" format="PNG"/>
  278 </imageobject>
  279 <textobject>
  280  <phrase>[huffman tree illustration]</phrase>
  281 </textobject>
  282 </mediaobject>
  283 </para>
  284 
  285 <para>
  286 As we assign codewords in order, we see that each choice constructs a
  287 new leaf in the leftmost possible position.</para>
  288 
  289 <para>
  290 Note that it's possible to underspecify or overspecify a Huffman tree
  291 via the length list.  In the above example, if codeword seven were
  292 eliminated, it's clear that the tree is unfinished:
  293 
  294 <mediaobject>
  295 <imageobject>
  296  <imagedata fileref="hufftree-under.png" format="PNG"/>
  297 </imageobject>
  298 <textobject>
  299  <phrase>[underspecified huffman tree illustration]</phrase>
  300 </textobject>
  301 </mediaobject>
  302 </para>
  303 
  304 <para>
  305 Similarly, in the original codebook, it's clear that the tree is fully
  306 populated and a ninth codeword is impossible.  Both underspecified and
  307 overspecified trees are an error condition rendering the stream
  308 undecodable.</para>
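<para>
One way to detect both conditions without building the tree is to sum
2^(-length) over the used entries (the Kraft sum): a fully populated
binary tree gives exactly one, an unfinished tree gives less, and a list
demanding more codewords than the tree can hold gives more.  The sketch
below assumes the in-order assignment described above and uses
<literal>None</literal> for unused entries; the function name and return
strings are illustrative only.
</para>
<programlisting><![CDATA[
```python
from fractions import Fraction


def classify_tree(lengths: list) -> str:
    """Classify a codeword length list by its Kraft sum."""
    # Fractions keep the sum exact, so the comparison with 1 is reliable.
    total = sum(Fraction(1, 2 ** n) for n in lengths if n is not None)
    if total == 1:
        return "complete"
    return "underspecified" if total < 1 else "overspecified"
```
]]></programlisting>
<para>
For the example codebook the sum is 2/4 + 4/16 + 2/8 = 1; removing
entry seven lowers it to 7/8, and any ninth codeword pushes it above one.
</para>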
  309 
  310 <para>
  311 Codebook entries marked 'unused' are simply skipped in the assigning
  312 process.  They have no codeword and do not appear in the decision
  313 tree, thus it's impossible for any bit pattern read from the stream to
  314 decode to that entry number.</para>
  315 
  316 </section>
  317 
  318 <section><title>VQ lookup table vector representation</title>
  319 
  320 <para>
  321 Unpacking the VQ lookup table vectors relies on the following values:
  322 <programlisting>
  323 the [codebook_multiplicands] array
  324 [codebook_minimum_value]
  325 [codebook_delta_value]
  326 [codebook_sequence_p]
  327 [codebook_lookup_type]
  328 [codebook_entries]
  329 [codebook_dimensions]
  330 [codebook_lookup_values]
  331 </programlisting>
  332 </para>
  333 
  334 <para>
  335 Decoding (unpacking) a specific vector in the vector lookup table
  336 proceeds according to <varname>[codebook_lookup_type]</varname>.  The unpacked
  337 vector values are what a codebook would return during audio packet
  338 decode in a VQ context.</para>
  339 
  340 <section><title>Vector value decode: Lookup type 1</title>
  341 
  342 <para>
  343 Lookup type one specifies a lattice VQ lookup table built
  344 algorithmically from a list of scalar values.  Calculate (unpack) the
  345 final values of a codebook entry vector from the entries in
  346 <varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname>
  347 is the output vector representing the vector of values for entry number
  348 <varname>[lookup_offset]</varname> in this codebook):
  349 
  350 <screen>
  351   1) [last] = 0;
  352   2) [index_divisor] = 1;
  353   3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) {
  354        
  355        4) [multiplicand_offset] = ( [lookup_offset] divided by [index_divisor] using integer 
  356           division ) integer modulo [codebook_lookup_values]
  357 
  358        5) vector [value_vector] element [i] = 
  359             ( [codebook_multiplicands] array element number [multiplicand_offset] ) *
  360             [codebook_delta_value] + [codebook_minimum_value] + [last];
  361 
  362        6) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i]
  363 
  364        7) [index_divisor] = [index_divisor] * [codebook_lookup_values]
  365 
  366      }
  367  
  368   8) vector calculation completed.
  369 </screen></para>
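<para>
The type 1 steps translate almost line for line into Python.  In this
sketch the function name and argument order are assumptions, and
<varname>[codebook_lookup_values]</varname> is taken to be the length of the
<varname>[codebook_multiplicands]</varname> list.
</para>
<programlisting><![CDATA[
```python
def unpack_vector_type1(lookup_offset, multiplicands, minimum, delta,
                        sequence_p, dimensions):
    """Build the value vector for one entry of a lookup type 1 codebook."""
    lookup_values = len(multiplicands)
    last = 0.0
    index_divisor = 1
    value_vector = []
    for _ in range(dimensions):
        # Treat lookup_offset as a number written in base lookup_values;
        # each digit selects one multiplicand.
        multiplicand_offset = (lookup_offset // index_divisor) % lookup_values
        value = multiplicands[multiplicand_offset] * delta + minimum + last
        if sequence_p:
            last = value
        value_vector.append(value)
        index_divisor *= lookup_values
    return value_vector
```
]]></programlisting>
<para>
With multiplicands <literal>[0, 1, 2]</literal>, a minimum of -1.0, a delta of
1.0 and two dimensions, entry 5 selects digits 2 and 1 and so unpacks to
<literal>[1.0, 0.0]</literal> when <varname>[codebook_sequence_p]</varname> is unset.
</para>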
  370 
  371 </section>
  372 
  373 <section><title>Vector value decode: Lookup type 2</title>
  374 
  375 <para>
  376 Lookup type two specifies a VQ lookup table in which each scalar in
  377 each vector is explicitly set by the <varname>[codebook_multiplicands]</varname>
  378 array in a one-to-one mapping.  Calculate (unpack) the
  379 final values of a codebook entry vector from the entries in
  380 <varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname>
  381 is the output vector representing the vector of values for entry number
  382 <varname>[lookup_offset]</varname> in this codebook):
  383 
  384 <screen>
  385   1) [last] = 0;
  386   2) [multiplicand_offset] = [lookup_offset] * [codebook_dimensions]
  387   3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) {
  388 
  389        4) vector [value_vector] element [i] = 
  390             ( [codebook_multiplicands] array element number [multiplicand_offset] ) *
  391             [codebook_delta_value] + [codebook_minimum_value] + [last];
  392 
  393        5) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i] 
  394 
  395        6) increment [multiplicand_offset]
  396 
  397      }
  398  
  399   7) vector calculation completed.
  400 </screen></para>
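<para>
A corresponding Python sketch for lookup type 2 follows; the function
name and argument order are again assumptions.  Each entry simply owns a
run of <varname>[codebook_dimensions]</varname> consecutive multiplicands.
</para>
<programlisting><![CDATA[
```python
def unpack_vector_type2(lookup_offset, multiplicands, minimum, delta,
                        sequence_p, dimensions):
    """Build the value vector for one entry of a lookup type 2 codebook."""
    start = lookup_offset * dimensions
    last = 0.0
    value_vector = []
    for multiplicand in multiplicands[start:start + dimensions]:
        value = multiplicand * delta + minimum + last
        if sequence_p:
            last = value  # later scalars accumulate on earlier ones
        value_vector.append(value)
    return value_vector
```
]]></programlisting>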
  401 
  402 </section>
  403 
  404 </section>
  405 
  406 </section>
  407 
  408 </section>
  409 
  410 <section>
  411 <title>Use of the codebook abstraction</title>
  412 
  413 <para>
  414 The decoder uses the codebook abstraction much as it does the
  415 bit-unpacking convention; a specific codebook reads a
  416 codeword from the bitstream, decoding it into an entry number, and then
  417 returns that entry number to the decoder (when used in a scalar
  418 entropy coding context), or uses that entry number as an offset into
  419 the VQ lookup table, returning a vector of values (when used in a context
  420 desiring a VQ value). Scalar or VQ context is always explicit; any call
  421 to the codebook mechanism requests either a scalar entry number or a
  422 lookup vector.</para>
  423 
  424 <para>
  425 Note that VQ lookup type zero indicates that there is no lookup table;
  426 requesting decode using a codebook of lookup type 0 in any context
  427 expecting a vector return value (even a vector of dimension one) is
  428 forbidden.  If decoder setup or decode requests such
  429 an action, that is an error condition rendering the packet
  430 undecodable.</para>
  431 
  432 <para>
  433 Using a codebook to read from the packet bitstream consists first of
  434 reading and decoding the next codeword in the bitstream. The decoder
  435 reads bits until the accumulated bits match a codeword in the
  436 codebook.  This process can be thought of as logically walking the
  437 Huffman decode tree by reading one bit at a time from the bitstream,
  438 and using the bit as a decision boolean to take the 0 branch (left in
  439 the above examples) or the 1 branch (right in the above examples).
  440 Walking the tree finishes when the decode process hits a leaf in the
  441 decision tree; the result is the entry number corresponding to that
  442 leaf.  Reading past the end of a packet propagates the 'end-of-stream'
  443 condition to the decoder.</para>
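<para>
The bit-by-bit walk can be sketched in Python as below.  Instead of an
explicit tree, this illustration accumulates bits and looks them up in a
table of codewords, which is equivalent for a prefix-free code.  The
assumptions are that <literal>read_bit</literal> is a callable returning the
next bit as 0 or 1, and that <literal>codewords</literal> holds one '0'/'1'
string per entry with <literal>None</literal> for unused entries.
</para>
<programlisting><![CDATA[
```python
def decode_entry(read_bit, codewords: list) -> int:
    """Read bits until they match a codeword; return that entry number."""
    table = {code: entry for entry, code in enumerate(codewords)
             if code is not None}
    longest = max(len(code) for code in table)
    bits = ""
    while len(bits) < longest:
        bits += str(read_bit())  # take the 0 or 1 branch
        if bits in table:        # reached a leaf
            return table[bits]
    raise ValueError("bit pattern matches no codeword")
```
]]></programlisting>
<para>
A real decoder would build the table (or tree) once per codebook rather
than on every call; it is rebuilt here only to keep the example short.
</para>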
  444 
  445 <para>
  446 When used in a scalar context, the resulting codeword entry is the
  447 desired return value.</para>
  448 
  449 <para>
  450 When used in a VQ context, the codeword entry number is used as an
  451 offset into the VQ lookup table.  The value returned to the decoder is
  452 the vector of scalars corresponding to this offset.</para>
  453 
  454 </section>
  455 
  456 </section>
  457 
  458 <!-- end section of probability model and codebooks -->