"Fossies" - the Fresh Open Source Software Archive

Member "quicktime4linux-2.3/thirdparty/libvorbis-1.1.1/doc/xml/08-residue.xml" (31 May 2008, 17030 Bytes) of package /linux/privat/old/quicktime4linux-2.3-src.tar.gz:


    1 <?xml version="1.0" standalone="no"?>
    2 <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
    3                 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
    4 
    5 ]>
    6 
    7 <section id="vorbis-spec-residue">
    8 <sectioninfo>
    9  <releaseinfo>
   10   $Id: 08-residue.xml 7186 2004-07-20 07:19:25Z xiphmont $
   11  </releaseinfo>
   12 </sectioninfo>
   13 <title>Residue setup and decode</title>
   14 
   15 
   16 <section>
   17 <title>Overview</title>
   18 
   19 <para>
   20 A residue vector represents the fine detail of the audio spectrum of
   21 one channel in an audio frame after the encoder subtracts the floor
   22 curve and performs any channel coupling.  A residue vector may
   23 represent spectral lines, spectral magnitude, spectral phase or
   24 hybrids as mixed by channel coupling.  The exact semantic content of
   25 the vector does not matter to the residue abstraction.</para>
   26 
   27 <para>
   28 Whatever the exact qualities, the Vorbis residue abstraction codes the
   29 residue vectors into the bitstream packet, and then reconstructs the
   30 vectors during decode.  Vorbis makes use of three different encoding
   31 variants (numbered 0, 1 and 2) of the same basic vector encoding
   32 abstraction.</para>
   33 
   34 </section>
   35 
   36 <section>
   37 <title>Residue format</title>
   38 
   39 <para>
   40 Residue format partitions each vector in the vector bundle into chunks,
   41 classifies each chunk, encodes the chunk classifications and finally
    42 encodes the chunks themselves using the specific VQ arrangement
    43 defined for each selected classification.
    44 The exact interleaving and partitioning vary by residue encoding number,
    45 but the high-level process used to classify and encode the residue
    46 vector is the same in all three variants.</para>
   47 
   48 <para>
    49 All residue vectors in a coded set have the same length.  The high-level
    50 coding structure, ignoring for the moment exactly how a partition is
    51 encoded and simply trusting that it is, is as follows:</para>
   52 
   53 <itemizedlist>
    54 <listitem><para>Each vector is partitioned into multiple equal-sized chunks
    55 according to the configured partition size.  If we have a vector size of
    56 <emphasis>n</emphasis>, a partition size <emphasis>residue_partition_size</emphasis>, and a total
    57 of <emphasis>ch</emphasis> residue vectors, the total number of partitioned chunks
    58 coded is <emphasis>n</emphasis>/<emphasis>residue_partition_size</emphasis>*<emphasis>ch</emphasis>.  It is
    59 important to note that the integer division truncates.  In the
    60 example below, we assume a <emphasis>residue_partition_size</emphasis> of 8.</para></listitem>
   61 
   62 <listitem><para>Each partition in each vector has a classification number that
   63 specifies which of multiple configured VQ codebook setups are used to
   64 decode that partition.  The classification numbers of each partition
   65 can be thought of as forming a vector in their own right, as in the
   66 illustration below.  Just as the residue vectors are coded in grouped
   67 partitions to increase encoding efficiency, the classification vector
    68 is also partitioned into chunks.  The integer classification numbers
    69 in a classification chunk are packed into a single scalar that
    70 represents all the classification numbers in that chunk.  In the
    71 example below, the classification codeword encodes two classification
    72 numbers.</para></listitem>
   73 
   74 <listitem><para>The values in a residue vector may be encoded monolithically in a
   75 single pass through the residue vector, but more often efficient
   76 codebook design dictates that each vector is encoded as the additive
   77 sum of several passes through the residue vector using more than one
   78 VQ codebook.  Thus, each residue value potentially accumulates values
   79 from multiple decode passes.  The classification value associated with
   80 a partition is the same in each pass, thus the classification codeword
   81 is coded only in the first pass.</para></listitem>
   82 
   83 </itemizedlist>
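
<para>
The partition count and the packing of classification numbers described
above can be sketched as follows.  This is an illustrative Python fragment
rather than normative text; the function names are hypothetical, and the
packing order (first classification number most significant) is inferred
from the packet decode procedure given later in this document.</para>

<programlisting><![CDATA[
```python
def partition_count(n, residue_partition_size, ch):
    """Total number of partitions coded for ch vectors of length n.

    The integer division truncates, so trailing values that do not
    fill a whole partition are not counted.
    """
    return (n // residue_partition_size) * ch


def pack_classifications(class_numbers, residue_classifications):
    """Pack one chunk of classification numbers into a single scalar.

    Each number is treated as a digit in base residue_classifications,
    with the first number in the chunk as the most significant digit.
    """
    scalar = 0
    for c in class_numbers:
        if not 0 <= c < residue_classifications:
            raise ValueError("classification number out of range")
        scalar = scalar * residue_classifications + c
    return scalar
```
]]></programlisting>

<para>
For instance, with two vectors of length 20 and a partition size of 8,
only two partitions per vector (four in total) are coded.</para>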
   84 
   85 <mediaobject>
   86 <imageobject>
   87  <imagedata fileref="residue-pack.png" format="PNG"/>
   88 </imageobject>
   89 <textobject>
   90  <phrase>[illustration of residue vector format]</phrase>
   91 </textobject>
   92 </mediaobject>
   93 
   94 </section>
   95 
   96 <section><title>residue 0</title>
   97 
   98 <para>
   99 Residue 0 and 1 differ only in the way the values within a residue
  100 partition are interleaved during partition encoding (visually treated
  101 as a black box--or cyan box or brown box--in the above figure).</para>
  102 
  103 <para>
  104 Residue encoding 0 interleaves VQ encoding according to the
  105 dimension of the codebook used to encode a partition in a specific
   106 pass.  The dimension of the codebook need not be the same in multiple
   107 passes; however, the partition size must be an integer multiple of the
   108 codebook dimension.</para>
  109 
  110 <para>
   111 As an example, assume a partition vector of size eight, to be encoded
   112 by residue 0 using codebook dimensions of 8, 4, 2 and 1:</para>
  113 
  114 <programlisting>
  115 
  116             original residue vector: [ 0 1 2 3 4 5 6 7 ]
  117 
  118 codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]
  119 
  120 codebook dimensions = 4  encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]
  121 
  122 codebook dimensions = 2  encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]
  123 
  124 codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
  125 
  126 </programlisting>
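
<para>
The interleave pattern in the listing above can be reproduced with a short
sketch.  This is an illustrative Python fragment, not part of the
specification; the function name <varname>residue0_groups</varname> is hypothetical,
and the values are simply element positions, as in the example.</para>

<programlisting><![CDATA[
```python
def residue0_groups(partition, dim):
    """Return the vectors residue 0 codes for one partition.

    dim is the codebook dimension.  Element j of vector i is taken
    from position i + j * step, where step = len(partition) / dim.
    """
    n = len(partition)
    if n % dim != 0:
        raise ValueError("partition size must be an integer multiple "
                         "of the codebook dimension")
    step = n // dim
    return [[partition[i + j * step] for j in range(dim)]
            for i in range(step)]
```
]]></programlisting>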
  127 
  128 <para>
  129 It is worth mentioning at this point that no configurable value in the
  130 residue coding setup is restricted to a power of two.</para>
  131 
  132 </section>
  133 
  134 <section><title>residue 1</title>
  135 
  136 <para>
  137 Residue 1 does not interleave VQ encoding.  It represents partition
  138 vector scalars in order.  As with residue 0, however, partition length
  139 must be an integer multiple of the codebook dimension, although
  140 dimension may vary from pass to pass.</para>
  141 
  142 <para>
   143 As an example, assume a partition vector of size eight, to be encoded
   144 by residue 1 using codebook dimensions of 8, 4, 2 and 1:</para>
  145 
  146 <programlisting>
  147 
  148             original residue vector: [ 0 1 2 3 4 5 6 7 ]
  149 
  150 codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]
  151 
  152 codebook dimensions = 4  encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]
  153 
  154 codebook dimensions = 2  encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]
  155 
  156 codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
  157 
  158 </programlisting>
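
<para>
For comparison with residue 0, the in-order grouping shown above amounts to
slicing the partition into consecutive runs.  The following Python fragment
is a minimal sketch under that reading; the function name
<varname>residue1_groups</varname> is hypothetical.</para>

<programlisting><![CDATA[
```python
def residue1_groups(partition, dim):
    """Return the vectors residue 1 codes for one partition.

    dim is the codebook dimension.  Scalars are kept in order, so
    each vector is simply the next dim consecutive elements.
    """
    n = len(partition)
    if n % dim != 0:
        raise ValueError("partition size must be an integer multiple "
                         "of the codebook dimension")
    return [partition[i:i + dim] for i in range(0, n, dim)]
```
]]></programlisting>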
  159 
  160 </section>
  161 
  162 <section><title>residue 2</title>
  163 
  164 <para>
   165 Residue type 2 can be thought of as a variant of residue type 1.
  166 Rather than encoding multiple passed-in vectors as in residue type 1,
  167 the <emphasis>ch</emphasis> passed in vectors of length <emphasis>n</emphasis> are first
  168 interleaved and flattened into a single vector of length
  169 <emphasis>ch</emphasis>*<emphasis>n</emphasis>.  Encoding then proceeds as in type 1. Decoding is
  170 as in type 1 with decode interleave reversed. If operating on a single
  171 vector to begin with, residue type 1 and type 2 are equivalent.</para>
  172 
  173 <mediaobject>
  174 <imageobject>
  175  <imagedata fileref="residue2.png" format="PNG"/>
  176 </imageobject>
  177 <textobject>
  178  <phrase>[illustration of residue type 2]</phrase>
  179 </textobject>
  180 </mediaobject>
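
<para>
The encode-side interleave can be sketched as below.  This Python fragment
is illustrative only; the function name <varname>interleave</varname> is hypothetical,
and the element ordering is chosen to be the inverse of the deinterleave
step given under the format 2 decode specifics.</para>

<programlisting><![CDATA[
```python
def interleave(vectors):
    """Flatten ch vectors of length n into one vector of length ch*n.

    Element i of vector j lands at position i * ch + j, so the
    channels alternate element by element.
    """
    ch = len(vectors)
    n = len(vectors[0])
    if any(len(v) != n for v in vectors):
        raise ValueError("all vectors must have the same length")
    return [vectors[j][i] for i in range(n) for j in range(ch)]
```
]]></programlisting>

<para>
With a single input vector the result equals the input, which is why
residue types 1 and 2 coincide in that case.</para>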
  181 
  182 </section>
  183 
  184 <section>
  185 <title>Residue decode</title>
  186 
  187 <section><title>header decode</title>
  188 
  189 <para>
  190 Header decode for all three residue types is identical.</para>
  191 <programlisting>
  192   1) [residue_begin] = read 24 bits as unsigned integer
  193   2) [residue_end] = read 24 bits as unsigned integer
  194   3) [residue_partition_size] = read 24 bits as unsigned integer and add one
  195   4) [residue_classifications] = read 6 bits as unsigned integer and add one
  196   5) [residue_classbook] = read 8 bits as unsigned integer
  197 </programlisting>
  198 
  199 <para>
   200 <varname>[residue_begin]</varname> and <varname>[residue_end]</varname> select the specific
   201 sub-portion of each vector that is actually coded; this implements something akin
   202 to a bandpass where, for coding purposes, the vector effectively
   203 begins at element <varname>[residue_begin]</varname> and ends at
   204 <varname>[residue_end]</varname>.  Preceding and following values in the unpacked
   205 vectors are zeroed.  Note that for residue type 2, these values as
   206 well as <varname>[residue_partition_size]</varname> apply to the interleaved
   207 vector, not the individual vectors before interleave.
   208 <varname>[residue_partition_size]</varname> is as explained above,
   209 <varname>[residue_classifications]</varname> is the number of possible
   210 classifications to which a partition can belong, and
   211 <varname>[residue_classbook]</varname> is the codebook number used to code
   212 classification codewords.  The number of dimensions in book
   213 <varname>[residue_classbook]</varname> determines how many classification values
   214 are grouped into a single classification codeword.</para>
  215 
  216 <para>
  217 Next we read a bitmap pattern that specifies which partition classes
  218 code values in which passes.</para>
  219 
  220 <programlisting>
  221   1) iterate [i] over the range 0 ... [residue_classifications]-1 {
  222   
  223        2) [high_bits] = 0
  224        3) [low_bits] = read 3 bits as unsigned integer
  225        4) [bitflag] = read one bit as boolean
  226        5) if ( [bitflag] is set ) then [high_bits] = read five bits as unsigned integer
  227        6) vector [residue_cascade] element [i] = [high_bits] * 8 + [low_bits]
  228      }
  229   7) done
  230 </programlisting>
  231 
  232 <para>
   233 Finally, we read in a list of book numbers, each corresponding to
   234 a specific bit set in the cascade bitmap.  We loop over the possible
  235 codebook classifications and the maximum possible number of encoding
  236 stages (8 in Vorbis I, as constrained by the elements of the cascade
  237 bitmap being eight bits):</para>
  238 
  239 <programlisting>
  240   1) iterate [i] over the range 0 ... [residue_classifications]-1 {
  241   
  242        2) iterate [j] over the range 0 ... 7 {
  243   
  244             3) if ( vector [residue_cascade] element [i] bit [j] is set ) {
  245 
  246                  4) array [residue_books] element [i][j] = read 8 bits as unsigned integer
  247 
  248                } else {
  249 
  250                  5) array [residue_books] element [i][j] = unused
  251 
  252                }
  253           }
  254       }
  255 
  256   6) done
  257 </programlisting>
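
<para>
The three header decode steps above can be sketched together as follows.
This Python fragment is illustrative, not normative.  The
<varname>BitReader</varname> class and the dictionary layout are hypothetical; the
reader assumes the Vorbis bitpacking convention of consuming bits from the
least significant bit of each byte first.  Unused book slots are
represented by <varname>None</varname>.  Validation of codebook numbers against the
codebooks configured in the stream is left out.</para>

<programlisting><![CDATA[
```python
class BitReader:
    """Minimal LSb-first bit reader over a bytes object (assumed helper)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read(self, nbits):
        value = 0
        for k in range(nbits):
            byte_index, bit_index = divmod(self.pos, 8)
            if byte_index >= len(self.data):
                # End of packet in header decode: stream is undecodable.
                raise EOFError("end of packet during header decode")
            value |= ((self.data[byte_index] >> bit_index) & 1) << k
            self.pos += 1
        return value


def read_residue_header(r):
    """Read one residue header; identical for residue types 0, 1 and 2."""
    hdr = {
        "residue_begin": r.read(24),
        "residue_end": r.read(24),
        "residue_partition_size": r.read(24) + 1,
        "residue_classifications": r.read(6) + 1,
        "residue_classbook": r.read(8),
    }
    # Cascade bitmap: 3 low bits, a flag, then 5 optional high bits.
    cascade = []
    for _ in range(hdr["residue_classifications"]):
        low_bits = r.read(3)
        high_bits = r.read(5) if r.read(1) else 0
        cascade.append(high_bits * 8 + low_bits)
    # One book number per set cascade bit; None marks 'unused'.
    books = [[r.read(8) if (c >> j) & 1 else None for j in range(8)]
             for c in cascade]
    hdr["residue_cascade"] = cascade
    hdr["residue_books"] = books
    return hdr
```
]]></programlisting>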
  258 
  259 <para>
  260 An end-of-packet condition at any point in header decode renders the
  261 stream undecodable.  In addition, any codebook number greater than the
  262 maximum numbered codebook set up in this stream also renders the
  263 stream undecodable.</para>
  264 
  265 </section>
  266 
  267 <section><title>packet decode</title>
  268 
  269 <para>
   270 Format 0 and 1 packet decode is identical except for specific
   271 partition interleave.  Format 2 packet decode can be built out of the
   272 format 1 decode process.  Thus we first describe the decode
   273 infrastructure common to all three formats.</para>
  274 
  275 <para>
  276 In addition to configuration information, the residue decode process
  277 is passed the number of vectors in the submap bundle and a vector of
  278 flags indicating if any of the vectors are not to be decoded.  If the
  279 passed in number of vectors is 3 and vector number 1 is marked 'do not
  280 decode', decode skips vector 1 during the decode loop.  However, even
  281 'do not decode' vectors are allocated and zeroed.</para>
  282 
  283 <para>
   284 The following convenience values are useful for clarifying
   285 the decode process:</para>
  286 
  287 <programlisting>
  288   1) [classwords_per_codeword] = [codebook_dimensions] value of codebook [residue_classbook]
  289   2) [n_to_read] = [residue_end] - [residue_begin]
  290   3) [partitions_to_read] = [n_to_read] / [residue_partition_size]
  291 </programlisting>
  292 
  293 <para>
   294 Packet decode proceeds as follows, matching the description offered earlier in the document.  We assume that the number of vectors being encoded, <varname>[ch]</varname>, is provided by the higher-level decoding process.</para>
  295 <programlisting>
   296   1) allocate and zero all vectors that will be returned.
   297   2) iterate [pass] over the range 0 ... 7 {
   298 
   299        3) [partition_count] = 0
   300 
   301        4) while [partition_count] is less than [partitions_to_read] {
   302 
   303             5) if ([pass] is zero) {
   304 
   305                  6) iterate [j] over the range 0 .. [ch]-1 {
   306 
   307                       7) if vector [j] is not marked 'do not decode' {
   308 
   309                            8) [temp] = read from packet using codebook [residue_classbook] in scalar context
   310                            9) iterate [i] descending over the range [classwords_per_codeword]-1 ... 0 {
   311 
   312                                10) array [classifications] element [j],([i]+[partition_count]) =
   313                                    [temp] integer modulo [residue_classifications]
   314                                11) [temp] = [temp] / [residue_classifications] using integer division
   315 
   316                               }
   317                          }
   318                     }
   319                }
   320 
   321            12) iterate [i] over the range 0 .. ([classwords_per_codeword] - 1) while [partition_count]
   322                is also less than [partitions_to_read] {
   323 
   324                 13) iterate [j] over the range 0 .. [ch]-1 {
   325 
   326                      14) if vector [j] is not marked 'do not decode' {
   327 
   328                           15) [vqclass] = array [classifications] element [j],[partition_count]
   329                           16) [vqbook] = array [residue_books] element [vqclass],[pass]
   330                           17) if ([vqbook] is not 'unused') {
   331 
   332                                18) decode partition into output vector number [j], starting at scalar
   333                                    offset [residue_begin]+[partition_count]*[residue_partition_size] using
   334                                    codebook number [vqbook] in VQ context
   335                              }
   336                         }
   337                    }
   338 
   339                 19) increment [partition_count] by one
   340 
   341               }
   342           }
   343      }
   344 
   345  20) done
  346 </programlisting>
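
<para>
The control flow of packet decode can also be sketched in Python.  This is
an illustrative fragment under stated assumptions, not a reference
implementation: the header is a dictionary keyed by the setup names,
<varname>None</varname> stands for an 'unused' book, and the two callables
<varname>read_classword</varname> and <varname>decode_partition</varname> are hypothetical hooks
standing in for the scalar codebook read and the format-specific partition
decode.  The sketch makes explicit an outer loop that reads a fresh
classification codeword each time a group of
<varname>[classwords_per_codeword]</varname> partitions has been consumed, until all
<varname>[partitions_to_read]</varname> partitions of the pass are visited.</para>

<programlisting><![CDATA[
```python
def residue_decode(ch, do_not_decode, hdr, classwords_per_codeword,
                   read_classword, decode_partition):
    """Drive the common residue packet decode loop (sketch)."""
    begin = hdr["residue_begin"]
    psize = hdr["residue_partition_size"]
    classes = hdr["residue_classifications"]
    partitions_to_read = (hdr["residue_end"] - begin) // psize

    # Padded so that the last, possibly partial, codeword still fits.
    slots = partitions_to_read + classwords_per_codeword
    classifications = [[0] * slots for _ in range(ch)]

    for pass_ in range(8):
        partition_count = 0
        while partition_count < partitions_to_read:
            if pass_ == 0:
                # Classification codewords are coded in the first pass only.
                for j in range(ch):
                    if not do_not_decode[j]:
                        temp = read_classword()
                        for i in range(classwords_per_codeword - 1, -1, -1):
                            classifications[j][i + partition_count] = (
                                temp % classes)
                            temp //= classes
            for _ in range(classwords_per_codeword):
                if partition_count >= partitions_to_read:
                    break
                for j in range(ch):
                    if not do_not_decode[j]:
                        vqclass = classifications[j][partition_count]
                        vqbook = hdr["residue_books"][vqclass][pass_]
                        if vqbook is not None:
                            offset = begin + partition_count * psize
                            decode_partition(j, offset, vqbook)
                partition_count += 1
```
]]></programlisting>

<para>
In this sketch an end-of-packet condition would surface as an exception
raised by one of the hooks; a caller would catch it and keep whatever has
been accumulated in the output vectors so far.</para>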
  347 
  348 <para>
  349 An end-of-packet condition during packet decode is to be considered a
  350 nominal occurrence.  Decode returns the result of vector decode up to
  351 that point.</para>
  352 
  353 </section>
  354 
  355 <section><title>format 0 specifics</title>
  356 
  357 <para>
   358 Format 0 decodes partitions exactly as described earlier in the
  359 'Residue Format: residue 0' section.  The following pseudocode
  360 presents the same algorithm. Assume:</para>
  361 
  362 <itemizedlist>
  363 <listitem><simpara> <varname>[n]</varname> is the value in <varname>[residue_partition_size]</varname></simpara></listitem>
  364 <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
  365 <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
  366 </itemizedlist>
  367 
  368 <programlisting>
  369  1) [step] = [n] / [codebook_dimensions]
  370  2) iterate [i] over the range 0 ... [step]-1 {
  371 
  372       3) vector [entry_temp] = read vector from packet using current codebook in VQ context
  373       4) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
  374 
  375            5) vector [v] element ([offset]+[i]+[j]*[step]) =
  376             vector [v] element ([offset]+[i]+[j]*[step]) +
  377                 vector [entry_temp] element [j]
  378 
  379          }
  380 
  381     }
  382 
  383   6) done
  384 
  385 </programlisting>
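
<para>
The format 0 pseudocode above maps directly onto the following Python
sketch.  It is illustrative only; <varname>read_vector</varname> is a hypothetical
callable standing in for "read vector from packet using current codebook
in VQ context" and is assumed to return a sequence of
<varname>[codebook_dimensions]</varname> values.</para>

<programlisting><![CDATA[
```python
def decode_partition_format0(v, offset, n, codebook_dimensions, read_vector):
    """Accumulate one interleaved (format 0) partition into v in place."""
    step = n // codebook_dimensions
    for i in range(step):
        entry_temp = read_vector()
        for j in range(codebook_dimensions):
            # Values are added, so several passes sum into the same slot.
            v[offset + i + j * step] += entry_temp[j]
```
]]></programlisting>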
  386 
  387 </section>
  388 
  389 <section><title>format 1 specifics</title>
  390 
  391 <para>
  392 Format 1 decodes partitions exactly as described earlier in the
  393 'Residue Format: residue 1' section.  The following pseudocode
  394 presents the same algorithm. Assume:</para>
  395 
  396 <itemizedlist>
  397 <listitem><simpara> <varname>[n]</varname> is the value in
  398 <varname>[residue_partition_size]</varname></simpara></listitem>
  399 <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
  400 <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
  401 </itemizedlist>
  402 
  403 <programlisting>
  404  1) [i] = 0
  405  2) vector [entry_temp] = read vector from packet using current codebook in VQ context
  406  3) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
  407 
  408       4) vector [v] element ([offset]+[i]) =
  409       vector [v] element ([offset]+[i]) +
  410           vector [entry_temp] element [j]
  411       5) increment [i]
  412 
  413     }
  414  
  415   6) if ( [i] is less than [n] ) continue at step 2
  416   7) done
  417 </programlisting>
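
<para>
The format 1 pseudocode above can be sketched in Python as follows.  As
with the format 0 sketch, this is illustrative only, and
<varname>read_vector</varname> is a hypothetical callable that returns the next decoded
VQ vector of <varname>[codebook_dimensions]</varname> values.</para>

<programlisting><![CDATA[
```python
def decode_partition_format1(v, offset, n, codebook_dimensions, read_vector):
    """Accumulate one in-order (format 1) partition into v in place."""
    i = 0
    while i < n:
        entry_temp = read_vector()
        for j in range(codebook_dimensions):
            # Consecutive output positions receive consecutive VQ values.
            v[offset + i] += entry_temp[j]
            i += 1
```
]]></programlisting>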
  418 
  419 </section>
  420 
  421 <section><title>format 2 specifics</title>
  422  
  423 <para>
  424 Format 2 is reducible to format 1.  It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode.
  425 </para>
  426 
  427 <para>
   428 Format 2 handles 'do not decode' vectors differently than residue 0 or
   429 1; if all vectors are marked 'do not decode', no decode occurs.
  430 However, if at least one vector is to be decoded, all the vectors are
  431 decoded.  We then request normal format 1 to decode a single vector
  432 representing all output channels, rather than a vector for each
  433 channel.  After decode, deinterleave the vector into independent vectors, one for each output channel.  That is:</para>
  434 
  435 <orderedlist>
   436  <listitem><simpara>If all vectors 0 through <emphasis>ch</emphasis>-1 are marked 'do not decode', allocate and clear a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis> and skip step 2 below; proceed directly to the post-decode step.</simpara></listitem>
  437  <listitem><simpara>Rather than performing format 1 decode to produce <emphasis>ch</emphasis> vectors of length <emphasis>n</emphasis> each, call format 1 decode to produce a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis>. </simpara></listitem>
   438  <listitem><para>Post decode: Deinterleave the single vector <varname>[v]</varname> returned by format 1 decode as described above into <emphasis>ch</emphasis> independent vectors, one for each output channel, according to:
  439   <programlisting>
  440   1) iterate [i] over the range 0 ... [n]-1 {
  441 
  442        2) iterate [j] over the range 0 ... [ch]-1 {
  443 
  444             3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])
  445 
  446           }
  447      }
  448 
  449   4) done
  450   </programlisting>
  451  </para></listitem>
  452 </orderedlist>
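
<para>
The post-decode deinterleave can be sketched in Python as below.  The
fragment is illustrative; the function name <varname>deinterleave</varname> is
hypothetical, and the vector length <emphasis>n</emphasis> is derived from the length
of the interleaved vector rather than passed in.</para>

<programlisting><![CDATA[
```python
def deinterleave(v, ch):
    """Split an interleaved vector of length ch*n into ch vectors.

    Output vector j, element i is taken from v[i * ch + j].
    """
    n = len(v) // ch
    return [[v[i * ch + j] for i in range(n)] for j in range(ch)]
```
]]></programlisting>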
  453 
  454 </section>
  455 
  456 </section>
  457 
  458 </section>
  459