Member "gnash-0.8.10/cygnal/libamf/README" (19 Jan 2012, 8130 Bytes) of package /linux/www/old/gnash-0.8.10.tar.gz:


*** Please *** read this in its entirety before making any changes to
the code in this directory!
===========================

All of the information in this document has been figured out through
reverse engineering, so it's entirely possible there are some minor
misconceptions about concepts. These explanations are based on many
months of staring at hex dumps of files, memory segments, and network
traffic. Many thanks to the volunteers that use the proprietary player
for allowing their data to be captured, and their disk drives
examined.

AMF is the lowest level representation of an SWF data type. Up until
swf version 8, the format is referred to as AMF0, and is widely used in
swf files for the SharedObject and LocalConnection ActionScript
classes, as well as for remoting and RTMP based streaming.

As of swf version 9, a new version called AMF3 was created, which
only works with the new ActionScript 3 classes. The main reason for
this is performance. For example, AMF0 has only a single data type for
numbers, which is a double (8 bytes). AMF3 introduced a new integer
data type, which uses a variable-length packing scheme: the high bit
of each byte flags whether another byte follows, so small values take
fewer than the usual 4 bytes for an integer.

Currently the AMF implementation in Gnash only supports AMF0. Although
there are bits of AMF3 implemented for the various constants used for
handling AMF objects, none of the code will use them until we
actually see some AMF3 based swf files out in the wild. Most of the
time AMF0 is used by swf version 9 anyway, which uses a special data
type to switch to AMF3 for that particular piece of data.

Another big difference is that AMF3 has more optimizations, supporting
a simple caching scheme where after an object has been sent, it can be
referred to by an index number. Where AMF0 had multiple types for the
various commonly used ActionScript classes, AMF3 instead has a single
object type, which can be defined to be an existing or custom
class. Other optimizations include replacing the AMF0 Boolean data
type, which was 3 bytes long, with a single byte which is either true
or false.

AMF is used in two main ways. One is a simple encoding of basic data
types, like string and number. The other is for the properties of
ActionScript class objects. A basic AMF object has a 3 byte header
field. The first byte is the type, followed by 2 bytes to hold the
length. A property uses the same format to hold the data, but is
preceded by the name of the property, which is preceded by two bytes
that contain the length of the name.

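The byte layout described above can be sketched as follows. This is
only an illustration, not the libamf code: the function names are made
up for this example, std::vector is used for brevity even though the
real code avoids it, and the sketch assumes a string value (AMF0 type
0x02) with lengths stored big-endian.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// AMF0 type marker for a string.
constexpr std::uint8_t kAmf0String = 0x02;

// Append a 16-bit length in big-endian (network) byte order.
void appendLength(std::vector<std::uint8_t>& out, std::uint16_t len) {
    out.push_back(static_cast<std::uint8_t>(len >> 8));
    out.push_back(static_cast<std::uint8_t>(len & 0xff));
}

// Encode a bare string object: type byte, 2-byte length, then the data.
// Assumes the value is shorter than 65536 bytes.
std::vector<std::uint8_t> encodeString(const std::string& value) {
    std::vector<std::uint8_t> out;
    out.push_back(kAmf0String);
    appendLength(out, static_cast<std::uint16_t>(value.size()));
    out.insert(out.end(), value.begin(), value.end());
    return out;
}

// Encode a property: 2-byte name length, the name, then the same
// type/length/data layout used for a bare object.
std::vector<std::uint8_t> encodeStringProperty(const std::string& name,
                                               const std::string& value) {
    std::vector<std::uint8_t> out;
    appendLength(out, static_cast<std::uint16_t>(name.size()));
    out.insert(out.end(), name.begin(), name.end());
    const std::vector<std::uint8_t> body = encodeString(value);
    out.insert(out.end(), body.begin(), body.end());
    return out;
}
```

With this sketch, the string "hi" takes 5 bytes on its own (3 header
bytes plus 2 data bytes), and 9 bytes as a property named "ab".
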
As an AMF object doesn't do much but hold data, a header is used to
signify the number of objects, file size, etc. The SharedObject and
LocalConnection ActionScript classes have different headers, as
SharedObjects store AMF objects in a disk based file, while
LocalConnection stores AMF objects in a shared memory segment.

The basic lowdown on the classes is as follows. The Buffer class is
used to hold all data. We specifically don't use a std::vector because
this class is so heavily used. We want to avoid memory fragmentation,
which often happens when using classes from libstdc++. This class also
has special methods for handling the data types we use, so this class
would need to exist anyway. In its simplest form, this is merely an
array of unsigned bytes with a length.

As a raw buffer is pretty useless for higher level processing, the
Element class is used to represent an AMF object. After a buffer is
read, its data is extracted into a series of Elements. An Element
still uses the Buffer class to hold the data, but often this is a much
smaller buffer than the one used to read data. Most Elements are
simply a numeric value or string, but Elements can also hold a higher
level ActionScript class, including all of its properties. The main
internal difference is that the properties of an ActionScript class
have a name, which is the name of the property, in addition to the
data. Only an Element of the data type *OBJECT* can have properties.
Properties of an object are stored as an array of more Elements, each
one representing a single AMF data type. Note that when allocated, the
memory the Buffer points to is *not* memset() to 0. All Buffers are
the exact size they need to be. Setting the memory to zero is nice for
debugging in GDB, as you get nice clean hex dumps that way, but
imagine the performance hit if every single time a Buffer is
allocated, each byte must be set to zero. If you want to set a
Buffer's data to zero for debugging, use the Buffer::clear() method,
and don't forget to remove it later so it doesn't become a subtle
performance problem.

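To make the allocation behaviour concrete, here is a minimal sketch of
a buffer that is exactly the requested size and is not zeroed until
clear() is called. SketchBuffer is a made-up stand-in for this
example; it is not the Gnash Buffer class and shares only the clear()
name with it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the Buffer described above.
class SketchBuffer {
public:
    // new[] without an initializer leaves the bytes uninitialized,
    // so allocation costs nothing beyond the allocation itself.
    explicit SketchBuffer(std::size_t size)
        : _data(new std::uint8_t[size]), _size(size) {}
    ~SketchBuffer() { delete[] _data; }

    // Owning a raw array, so copying is disabled to avoid double frees.
    SketchBuffer(const SketchBuffer&) = delete;
    SketchBuffer& operator=(const SketchBuffer&) = delete;

    // Zero the whole buffer. Meant for debugging only, as every call
    // touches every byte.
    void clear() { std::memset(_data, 0, _size); }

    std::uint8_t* data() { return _data; }
    std::size_t size() const { return _size; }

private:
    std::uint8_t* _data;
    std::size_t _size;
};
```

The contents must be written (or cleared) before they are read, which
is the price paid for skipping the memset() on every allocation.
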
The AMF class is used to encode and decode data between Buffers and
Elements. When encoding, all the methods are static, as no data needs
to be retained between usages of the data. Note that all the
AMF::encode*() methods allocate a Buffer, which then later needs to be
freed after usage. Once again, smart pointers, while useful, are
avoided because of the memory fragmentation issue for heavily used
code. While this sort of defeats the purpose of both C++ and object
oriented programming, that's life when working with high-performance,
data-driven code. All decoding is handled by the non-static
AMF::extract*() methods. These are not static as they must retain the
current amount of data that has been parsed so subsequent decoding
starts in the right place.

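The reason decoding needs state can be shown with a small sketch.
SketchDecoder and its method names are hypothetical and are not the
AMF::extract*() signatures. The example assumes string objects only
(type 0x02, 2-byte big-endian length) and keeps a running count of
consumed bytes so each call picks up where the last one stopped.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical decoder that remembers how much input has been parsed.
class SketchDecoder {
public:
    // Decode the next string object from data[0..size). Returns false
    // if the remaining input is truncated or is not a string.
    bool extractString(const std::uint8_t* data, std::size_t size,
                       std::string& out) {
        if (_consumed > size || size - _consumed < 3) return false;
        const std::uint8_t* p = data + _consumed;
        if (p[0] != 0x02) return false;
        const std::size_t len = (static_cast<std::size_t>(p[1]) << 8) | p[2];
        if (size - _consumed - 3 < len) return false;
        out.assign(reinterpret_cast<const char*>(p + 3), len);
        _consumed += 3 + len;  // the next call starts after this object
        return true;
    }

    // Total number of bytes parsed so far.
    std::size_t consumed() const { return _consumed; }

private:
    std::size_t _consumed = 0;
};
```

Calling extractString() repeatedly on the same input walks through the
objects one at a time, which a static method could not do without the
caller tracking the offset itself.
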
The only difference between the two higher level classes, SOL and
LcShm, is where the data is stored (disk or memory), and the
appropriate headers for the data.

LocalConnection, on unix based systems, uses the older SYSV style
shared memory segments. These are always the same size, 64528
bytes. There are two sections in the shared memory segment. One I call
"Listeners", not to be confused with ActionScript object Listeners,
although the concept is similar. LocalConnection is used as a
bi-directional way to transfer AMF objects between swf movies, instead
of using a network connection. When a swf movie attaches to the
LocalConnection shared memory segment, it registers itself by writing
its name into the Listener section.

This registration step turns out to be optional, as it is possible to
send and receive data by polling for changes. This is of course a huge
security problem, as it allows any client to secretly monitor or inject
into the communication between multiple swf files in an untraceable
way. Some web sites, YouTube in particular, exploit this feature by
never registering themselves as a Listener, so beware.


-------
Note to developers. Please be very careful making any changes to this
code without seriously understanding how the code works. Byte
manipulation is very easy to screw up, and minor changes can often
cause major problems. Anyone making changes here should run the
libamf.all test cases to make sure they haven't introduced breakage.

As a further note, valgrind gets confused with type casting sometimes,
displaying errors where there are none. As all data is stored as
unsigned bytes, extracting numeric values like the length often causes
valgrind to assume there are errors with word alignment. Eliminating
valgrind errors is a good thing though, so sometimes we have to jump
through hoops to keep it quiet. Often this requires playing silly
games with local variables and multiple type casts. This makes the
code a bit convoluted at times, but that's life if you want solid code.

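As an illustration of the kind of hoop-jumping involved, the sketch
below reads numeric values out of a byte array without casting the
byte pointer to a wider type. This is a common general technique and
is not claimed to be what the libamf code does; the function names are
made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Read a big-endian 16-bit length by assembling it from single bytes.
// No alignment is assumed, as only byte loads are used.
std::uint16_t readLength(const std::uint8_t* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Read a native-endian 32-bit value by copying into a local variable
// instead of dereferencing a casted pointer.
std::uint32_t readNative32(const std::uint8_t* p) {
    std::uint32_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}
```

Both functions work at any offset into the buffer, including odd
addresses where a direct pointer cast would be misaligned.
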
As this code does much allocation and deleting of memory blocks, make
sure after any changes that there are no memory leaks. This can be
done with valgrind (eliminating the stupid valgrind errors makes it
obvious). Optionally, the Memory class in libbase/gmemory.h contains
support for a valgrind like API that is under programmer control. Look
at the test cases in libamf.all for usage examples. Memory::analyze()
will do the same thing as valgrind to check to make sure all allocated
memory is properly deleted when the program exits. To use the Memory
class, you have to configure with --with-statistics=all or
--with-statistics=mem.