Member "fasm/tools/fas.txt" (21 Feb 2022, 23496 Bytes) of package /linux/misc/fasm-1.73.30.tgz:


    1 
    2                                 flat assembler
    3                        Symbolic information file format
    4 
    5 
    6    Table 1  Header
    7   /-------------------------------------------------------------------------\
    8   | Offset | Size    | Description                                          |
    9   |========|=========|======================================================|
   10   |   +0   |  dword  | Signature 1A736166h (little-endian).                 |
   11   |--------|---------|------------------------------------------------------|
   12   |   +4   |  byte   | Major version of flat assembler.                     |
   13   |--------|---------|------------------------------------------------------|
   14   |   +5   |  byte   | Minor version of flat assembler.                     |
   15   |--------|---------|------------------------------------------------------|
   16   |   +6   |  word   | Length of header.                                    |
   17   |--------|---------|------------------------------------------------------|
   18   |   +8   |  dword  | Offset of input file name in the strings table.      |
   19   |--------|---------|------------------------------------------------------|
   20   |  +12   |  dword  | Offset of output file name in the strings table.     |
   21   |--------|---------|------------------------------------------------------|
   22   |  +16   |  dword  | Offset of strings table.                             |
   23   |--------|---------|------------------------------------------------------|
   24   |  +20   |  dword  | Length of strings table.                             |
   25   |--------|---------|------------------------------------------------------|
   26   |  +24   |  dword  | Offset of symbols table.                             |
   27   |--------|---------|------------------------------------------------------|
   28   |  +28   |  dword  | Length of symbols table.                             |
   29   |--------|---------|------------------------------------------------------|
   30   |  +32   |  dword  | Offset of preprocessed source.                       |
   31   |--------|---------|------------------------------------------------------|
   32   |  +36   |  dword  | Length of preprocessed source.                       |
   33   |--------|---------|------------------------------------------------------|
   34   |  +40   |  dword  | Offset of assembly dump.                             |
   35   |--------|---------|------------------------------------------------------|
   36   |  +44   |  dword  | Length of assembly dump.                             |
   37   |--------|---------|------------------------------------------------------|
   38   |  +48   |  dword  | Offset of section names table.                       |
   39   |--------|---------|------------------------------------------------------|
   40   |  +52   |  dword  | Length of section names table.                       |
   41   |--------|---------|------------------------------------------------------|
   42   |  +56   |  dword  | Offset of symbol references dump.                    |
   43   |--------|---------|------------------------------------------------------|
   44   |  +60   |  dword  | Length of symbol references dump.                    |
   45   \-------------------------------------------------------------------------/
   46 
   47   Notes:
   48 
   49     If the header is shorter than 64 bytes, it comes from a version that does
   50     not support dumping some of the structures. This should be interpreted
   51     as the data for the missing structures being unavailable, not as the
   52     size of that data being zero.
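
A minimal sketch in Python of reading the header from table 1, assuming the whole file has already been read into a bytes object. The names parse_header and BLOCKS are chosen for this illustration only and do not come from the flat assembler sources. Fields that lie beyond the declared header length are reported as None, so that "not provided" stays distinct from "empty".

```python
import struct

FAS_SIGNATURE = 0x1A736166

# Hypothetical names for this sketch; each block is an (offset, length) pair
# of dwords, stored in this order starting at header offset +16.
BLOCKS = ("strings", "symbols", "source", "assembly", "sections", "references")

def parse_header(data):
    signature, major, minor, header_len = struct.unpack_from("<IBBH", data, 0)
    if signature != FAS_SIGNATURE:
        raise ValueError("signature 1A736166h not found")
    input_name, output_name = struct.unpack_from("<II", data, 8)
    header = {
        "version": (major, minor),
        "length": header_len,
        "input_name": input_name,    # offset inside the strings table
        "output_name": output_name,  # offset inside the strings table
    }
    for i, name in enumerate(BLOCKS):
        pos = 16 + 8 * i
        # A field beyond the declared header length was not written by this
        # version: report it as missing (None), not as an empty block.
        if pos + 8 <= header_len:
            header[name] = struct.unpack_from("<II", data, pos)
        else:
            header[name] = None
    return header
```

With a 48-byte header, for example, the "sections" and "references" entries come back as None while the first four blocks are decoded normally.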
   53 
   54     Offsets given in header generally mean positions in the file, however
   55     input and output file names are specified by offsets in the strings table,
   56     so you have to add their offset to the offset of strings table to obtain
   57     the positions of those strings in the file.
   58 
   59     The strings table contains just a sequence of ASCIIZ strings, which may
   60     be referred to by other parts of the file. It contains the names of
   61     main input file, the output file, and the names of the sections and
   62     external symbols if there were any.
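
As a small illustration of the two notes above, the following Python sketch resolves a name stored in the strings table. It assumes the file is held in a bytes object; read_asciiz and string_at are hypothetical helper names, and the ASCII decoding with replacement of invalid bytes is a choice made for this sketch.

```python
def read_asciiz(data, pos):
    """Return the zero-ended string that starts at file position pos."""
    end = data.index(b"\x00", pos)
    return data[pos:end].decode("ascii", errors="replace")

def string_at(data, strings_offset, offset_in_table):
    """Resolve an offset given relative to the start of the strings table."""
    # The header stores name offsets relative to the strings table, so the
    # table's own file offset has to be added first.
    return read_asciiz(data, strings_offset + offset_in_table)
```

The input file name would then be obtained as string_at(data, strings_offset, input_name_offset), using the dwords at header offsets +16 and +8.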
   63 
   64     The symbols table is an array of 32-byte structures, each one in format
   65     specified by table 2.
   66 
   67     The preprocessed source is a sequence of preprocessed lines, each one
   68     in format as defined in table 3.
   69 
   70     The assembly dump contains an array of 28-byte structures, each one in
   71     format specified by table 4, and at the end of this array an additional
   72     double word containing the offset in output file at which the assembly
   73     was ended.
   74 
   75     It is possible that the file contains no assembly dump at all. This
   76     happens when an error occurred and only the preprocessed source was
   77     dumped. If the error occurred during preprocessing, only the source up
   78     to the point of the error is provided. In that case (and only then) the
   79     field at offset 44 contains zero.
   80 
   81     The section names table exists only when the output format was an object
   82     file (ELF or COFF), and it is an array of 4-byte entries, each being an
   83     offset of the name of the section in the strings table.
   84     The index of a section in this table is the same as the index of that
   85     section in the generated object file.
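
A sketch of listing the section names, assuming the file is in a bytes object and that the three arguments were taken from the header (offsets +48, +52 and +16). The function name section_names is hypothetical.

```python
import struct

def section_names(data, sections_offset, sections_length, strings_offset):
    """Return the section names in the order they appear in the table."""
    names = []
    for pos in range(sections_offset, sections_offset + sections_length, 4):
        # Each 4-byte entry is an offset into the strings table.
        (name_offset,) = struct.unpack_from("<I", data, pos)
        start = strings_offset + name_offset
        end = data.index(b"\x00", start)
        names.append(data[start:end].decode("ascii", errors="replace"))
    return names
```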
   86 
   87     The symbol references dump contains an array of 8-byte structures, each
   88     one describes an event of some symbol being used. The first double word
   89     of such structure contains an offset of symbol in the symbols table,
   90     and the second double word is an offset of structure in assembly dump,
   91     which specifies at what moment the symbol was referenced.
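
The following Python sketch walks the symbol references dump. It assumes that both offsets are relative to the start of their respective tables, so that dividing by the structure sizes (32 and 28 bytes) yields array indexes; that reading is an interpretation of the text above, not something it states outright. The name symbol_references is hypothetical.

```python
import struct

SYMBOL_SIZE = 32   # size of one symbol structure (table 2)
ROW_SIZE = 28      # size of one assembly dump row (table 4)

def symbol_references(data, refs_offset, refs_length):
    """Return a list of (symbol_index, assembly_row_index) pairs."""
    refs = []
    for pos in range(refs_offset, refs_offset + refs_length, 8):
        symbol_offset, row_offset = struct.unpack_from("<II", data, pos)
        # Assumption: offsets count from the start of each table.
        refs.append((symbol_offset // SYMBOL_SIZE, row_offset // ROW_SIZE))
    return refs
```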
   92 
   93 
   94    Table 2  Symbol structure
   95   /-------------------------------------------------------------------------\
   96   | Offset | Size  | Description                                            |
   97   |========|=======|========================================================|
   98   |   +0   | qword | Value of symbol.                                       |
   99   |--------|-------|--------------------------------------------------------|
  100   |   +8   | word  | Flags (table 2.1).                                     |
  101   |--------|-------|--------------------------------------------------------|
  102   |  +10   | byte  | Size of data labelled by this symbol (zero means plain |
  103   |        |       | label without size attached).                          |
  104   |--------|-------|--------------------------------------------------------|
  105   |  +11   | byte  | Type of value (table 2.2). Any value other than zero   |
  106   |        |       | means some kind of relocatable symbol.                 |
  107   |--------|-------|--------------------------------------------------------|
  108   |  +12   | dword | Extended SIB, the first two bytes are register codes   |
  109   |        |       | and the second two bytes are corresponding scales.     |
  110   |--------|-------|--------------------------------------------------------|
  111   |  +16   | word  | Number of pass in which symbol was defined last time.  |
  112   |--------|-------|--------------------------------------------------------|
  113   |  +18   | word  | Number of pass in which symbol was used last time.     |
  114   |--------|-------|--------------------------------------------------------|
  115   |  +20   | dword | If the symbol is relocatable, this field contains      |
  116   |        |       | information about section or external symbol, to which |
  117   |        |       | it is relative - otherwise this field has no meaning.  |
  118   |        |       | When the highest bit is cleared, the symbol is         |
  119   |        |       | relative to a section, and the bits 0-30 contain       |
  120   |        |       | the index (starting from 1) in the table of sections.  |
  121   |        |       | When the highest bit is set, the symbol is relative to |
  122   |        |       | an external symbol, and the bits 0-30 contain the      |
  123   |        |       | offset of the name of this symbol in the strings       |
  124   |        |       | table.                                                 |
  125   |--------|-------|--------------------------------------------------------|
  126   |  +24   | dword | If the highest bit is cleared, the bits 0-30 contain   |
  127   |        |       | the offset of symbol name in the preprocessed source.  |
  128   |        |       | This name is a pascal-style string (byte length        |
  129   |        |       | followed by string data).                              |
  130   |        |       | Zero in this field means an anonymous symbol.          |
  131   |        |       | If the highest bit is set, the bits 0-30 contain the   |
  132   |        |       | offset of the symbol name in the strings table, and    |
  133   |        |       | this name is a zero-ended string in this case (as are  |
  134   |        |       | all the strings there).                                |
  135   |--------|-------|--------------------------------------------------------|
  136   |  +28   | dword | Offset in the preprocessed source of line that defined |
  137   |        |       | this symbol (see table 3).                             |
  138   \-------------------------------------------------------------------------/
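
A sketch of decoding the 32-byte symbol structures and looking up symbol names, assuming the file is in a bytes object. The field names of Symbol and the helper names are invented for this illustration. The type field is read as a signed byte because its value may be negative.

```python
import struct
from collections import namedtuple

SYMBOL = struct.Struct("<QHBbIHHIII")   # 32 bytes, fields in table 2 order
Symbol = namedtuple(
    "Symbol",
    "value flags size type ext_sib pass_defined pass_used relative name line",
)
HIGH_BIT = 0x80000000

def parse_symbols(data, offset, length):
    """Decode the symbols table found at the given file offset."""
    return [Symbol(*SYMBOL.unpack_from(data, pos))
            for pos in range(offset, offset + length, SYMBOL.size)]

def symbol_name(data, sym, strings_offset, source_offset):
    """Return the name of a symbol, or None for an anonymous one."""
    ref = sym.name & 0x7FFFFFFF
    if sym.name & HIGH_BIT:
        # Zero-ended string in the strings table.
        start = strings_offset + ref
        end = data.index(b"\x00", start)
        return data[start:end].decode("ascii", errors="replace")
    if ref == 0:
        return None
    # Pascal-style string in the preprocessed source: length byte, then data.
    start = source_offset + ref
    count = data[start]
    return data[start + 1:start + 1 + count].decode("ascii", errors="replace")
```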
  139 
  140 
  141    Table 2.1  Symbol flags
  142   /-----------------------------------------------------------------\
  143   | Bit | Value | Description                                       |
  144   |=====|=======|===================================================|
  145   |  0  |     1 | Symbol was defined.                               |
  146   |-----|-------|---------------------------------------------------|
  147   |  1  |     2 | Symbol is an assembly-time variable.              |
  148   |-----|-------|---------------------------------------------------|
  149   |  2  |     4 | Symbol cannot be forward-referenced.              |
  150   |-----|-------|---------------------------------------------------|
  151   |  3  |     8 | Symbol was used.                                  |
  152   |-----|-------|---------------------------------------------------|
  153   |  4  |   10h | The prediction was needed when checking           |
  154   |     |       | whether the symbol was used.                      |
  155   |-----|-------|---------------------------------------------------|
  156   |  5  |   20h | Result of last predicted check for being used.    |
  157   |-----|-------|---------------------------------------------------|
  158   |  6  |   40h | The prediction was needed when checking           |
  159   |     |       | whether the symbol was defined.                   |
  160   |-----|-------|---------------------------------------------------|
  161   |  7  |   80h | Result of last predicted check for being defined. |
  162   |-----|-------|---------------------------------------------------|
  163   |  8  |  100h | The optimization adjustment is applied to         |
  164   |     |       | the value of this symbol.                         |
  165   |-----|-------|---------------------------------------------------|
  166   |  9  |  200h | The value of symbol is negative number encoded    |
  167   |     |       | as two's complement.                              |
  168   |-----|-------|---------------------------------------------------|
  169   | 10  |  400h | Symbol is a special marker and has no value.      |
  170   \-----------------------------------------------------------------/
  171 
  172   Notes:
  173 
  174     Some of those flags are listed here just for completeness, as they
  175     have little use outside of the flat assembler. However, bit 0 is
  176     important, because the symbols table contains all the labels that
  177     occurred in the source, even if some of them were in conditional
  178     blocks that did not get assembled.
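
The flags from table 2.1 can be modelled as a bit set, as in this Python sketch. The member names are invented here. The helper symbol_value assumes that flag 200h acts as a sign bit above the 64-bit value field, so that a set flag means the stored qword minus 2^64; this is one reading of "encoded as two's complement" and should be checked against real dumps.

```python
from enum import IntFlag

class SymbolFlags(IntFlag):
    DEFINED = 0x001
    ASSEMBLY_TIME_VARIABLE = 0x002
    NO_FORWARD_REFERENCE = 0x004
    USED = 0x008
    USED_PREDICTION_NEEDED = 0x010
    USED_PREDICTED = 0x020
    DEFINED_PREDICTION_NEEDED = 0x040
    DEFINED_PREDICTED = 0x080
    OPTIMIZATION_ADJUSTED = 0x100
    NEGATIVE = 0x200
    SPECIAL_MARKER = 0x400

def symbol_value(raw_value, flags):
    """Return the value as a Python int, or None if there is no value."""
    flags = SymbolFlags(flags)
    # Undefined symbols (bit 0 clear) and special markers carry no value.
    if not flags & SymbolFlags.DEFINED or flags & SymbolFlags.SPECIAL_MARKER:
        return None
    if flags & SymbolFlags.NEGATIVE:
        # Assumption: the flag extends the qword with a sign bit.
        return raw_value - (1 << 64)
    return raw_value
```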
  179 
  180 
  181    Table 2.2  Symbol value types
  182   /-------------------------------------------------------------------\
  183   | Value | Description                                               |
  184   |=======|===========================================================|
  185   |   0   | Absolute value.                                           |
  186   |-------|-----------------------------------------------------------|
  187   |   1   | Relocatable segment address (only with MZ output).        |
  188   |-------|-----------------------------------------------------------|
  189   |   2   | Relocatable 32-bit address.                               |
  190   |-------|-----------------------------------------------------------|
  191   |   3   | Relocatable relative 32-bit address (value valid only for |
  192   |       | symbol used in the same place where it was calculated,    |
  193   |       | it should not occur in the symbol structure).             |
  194   |-------|-----------------------------------------------------------|
  195   |   4   | Relocatable 64-bit address.                               |
  196   |-------|-----------------------------------------------------------|
  197   |   5   | [ELF only] GOT-relative 32-bit address.                   |
  198   |-------|-----------------------------------------------------------|
  199   |   6   | [ELF only] 32-bit address of PLT entry.                   |
  200   |-------|-----------------------------------------------------------|
  201   |   7   | [ELF only] Relative 32-bit address of PLT entry (value    |
  202   |       | valid only for symbol used in the same place where it     |
  203   |       | was calculated, it should not occur in the symbol         |
  204   |       | structure).                                               |
  205   \-------------------------------------------------------------------/
  206 
  207   Notes:
  208 
  209     The types 3 and 7 should never be encountered in the symbols dump;
  210     they are only used internally by the flat assembler.
  211 
  212     If the type value is a negative number, it is the negation of a value
  213     from this table and it means that the symbol of the given type has
  214     been negated.
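
A short Python sketch of interpreting the type byte, including the negated case. It assumes the byte is to be read as a signed 8-bit integer; describe_type and the wording of the descriptions are chosen for this illustration.

```python
SYMBOL_TYPES = {
    0: "absolute value",
    1: "relocatable segment address",
    2: "relocatable 32-bit address",
    3: "relocatable relative 32-bit address",
    4: "relocatable 64-bit address",
    5: "GOT-relative 32-bit address",
    6: "32-bit address of PLT entry",
    7: "relative 32-bit address of PLT entry",
}

def describe_type(type_byte):
    """Return (description, negated) for a type byte from the dump."""
    # Assumption: the byte is a signed 8-bit integer (FEh means -2).
    signed = type_byte - 256 if type_byte > 127 else type_byte
    return SYMBOL_TYPES.get(abs(signed), "unknown"), signed < 0
```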
  215 
  216 
  217    Table 2.3  Register codes for extended SIB
  218   /------------------\
  219   | Value | Register |
  220   |=======|==========|
  221   |  23h  | BX       |
  222   |-------|----------|
  223   |  25h  | BP       |
  224   |-------|----------|
  225   |  26h  | SI       |
  226   |-------|----------|
  227   |  27h  | DI       |
  228   |-------|----------|
  229   |  40h  | EAX      |
  230   |-------|----------|
  231   |  41h  | ECX      |
  232   |-------|----------|
  233   |  42h  | EDX      |
  234   |-------|----------|
  235   |  43h  | EBX      |
  236   |-------|----------|
  237   |  44h  | ESP      |
  238   |-------|----------|
  239   |  45h  | EBP      |
  240   |-------|----------|
  241   |  46h  | ESI      |
  242   |-------|----------|
  243   |  47h  | EDI      |
  244   |-------|----------|
  245   |  48h  | R8D      |
  246   |-------|----------|
  247   |  49h  | R9D      |
  248   |-------|----------|
  249   |  4Ah  | R10D     |
  250   |-------|----------|
  251   |  4Bh  | R11D     |
  252   |-------|----------|
  253   |  4Ch  | R12D     |
  254   |-------|----------|
  255   |  4Dh  | R13D     |
  256   |-------|----------|
  257   |  4Eh  | R14D     |
  258   |-------|----------|
  259   |  4Fh  | R15D     |
  260   |-------|----------|
  261   |  80h  | RAX      |
  262   |-------|----------|
  263   |  81h  | RCX      |
  264   |-------|----------|
  265   |  82h  | RDX      |
  266   |-------|----------|
  267   |  83h  | RBX      |
  268   |-------|----------|
  269   |  84h  | RSP      |
  270   |-------|----------|
  271   |  85h  | RBP      |
  272   |-------|----------|
  273   |  86h  | RSI      |
  274   |-------|----------|
  275   |  87h  | RDI      |
  276   |-------|----------|
  277   |  88h  | R8       |
  278   |-------|----------|
  279   |  89h  | R9       |
  280   |-------|----------|
  281   |  8Ah  | R10      |
  282   |-------|----------|
  283   |  8Bh  | R11      |
  284   |-------|----------|
  285   |  8Ch  | R12      |
  286   |-------|----------|
  287   |  8Dh  | R13      |
  288   |-------|----------|
  289   |  8Eh  | R14      |
  290   |-------|----------|
  291   |  8Fh  | R15      |
  292   |-------|----------|
  293   |  94h  | EIP      |
  294   |-------|----------|
  295   |  98h  | RIP      |
  296   \------------------/
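
The extended SIB dword (used in tables 2 and 4) can be split into register and scale pairs as sketched below in Python. The sketch assumes that "first" means the lowest-addressed byte of the little-endian dword and that a register code of zero marks an unused slot; neither detail is stated in the tables. The function name is hypothetical.

```python
# Register codes from table 2.3.
REGISTERS = {0x23: "BX", 0x25: "BP", 0x26: "SI", 0x27: "DI",
             0x94: "EIP", 0x98: "RIP"}
for i, name in enumerate(["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]):
    REGISTERS[0x40 + i] = "E" + name
    REGISTERS[0x80 + i] = "R" + name
for i in range(8, 16):
    REGISTERS[0x40 + i] = f"R{i}D"
    REGISTERS[0x80 + i] = f"R{i}"

def decode_extended_sib(sib):
    """Return a list of (register_name, scale) pairs from the SIB dword."""
    reg1, reg2, scale1, scale2 = sib.to_bytes(4, "little")
    terms = []
    for reg, scale in ((reg1, scale1), (reg2, scale2)):
        if reg:  # assumption: zero means the slot is unused
            terms.append((REGISTERS.get(reg, f"?{reg:02X}h"), scale))
    return terms
```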
  297 
  298 
  299    Table 3  Preprocessed line
  300   /--------------------------------------------------------------------------\
  301   | Offset | Size  | Value                                                   |
  302   |========|=======|=========================================================|
  303   |   +0   | dword | When the line was loaded from source, this field        |
  304   |        |       | contains either zero (if it is the line from the main   |
  305   |        |       | input file), or an offset inside the preprocessed       |
  306   |        |       | source to the name of file, from which this line was    |
  307   |        |       | loaded (the name of file is zero-ended string).         |
  308   |        |       | When the line was generated by macroinstruction, this   |
  309   |        |       | field contains offset inside the preprocessed source to |
  310   |        |       | the pascal-style string specifying the name of          |
  311   |        |       | macroinstruction, which generated this line.            |
  312   |--------|-------|---------------------------------------------------------|
  313   |   +4   | dword | Bits 0-30 contain the number of this line.              |
  314   |        |       | If the highest bit is zeroed, this line was loaded from |
  315   |        |       | source.                                                 |
  316   |        |       | If the highest bit is set, this line was generated by   |
  317   |        |       | macroinstruction.                                       |
  318   |--------|-------|---------------------------------------------------------|
  319   |   +8   | dword | If the line was loaded from source, this field contains |
  320   |        |       | the position of the line inside the source file, from   |
  321   |        |       | which it was loaded.                                    |
  322   |        |       | If line was generated by macroinstruction, this field   |
  323   |        |       | contains the offset of preprocessed line, which invoked |
  324   |        |       | the macroinstruction.                                   |
  325   |        |       | If line was generated by instantaneous macro, this      |
  326   |        |       | field is equal to the next one.                         |
  327   |--------|-------|---------------------------------------------------------|
  328   |  +12   | dword | If the line was generated by macroinstruction, this     |
  329   |        |       | field contains offset of the preprocessed line inside   |
  330   |        |       | the definition of macro, from which this one was        |
  331   |        |       | generated.                                              |
  332   |--------|-------|---------------------------------------------------------|
  333   |  +16   | ?     | The tokenized contents of line.                         |
  334   \--------------------------------------------------------------------------/
  335 
  336   Notes:
  337 
  338     To determine whether a line was loaded from source or generated by a
  339     macroinstruction, you need to check the highest bit of the second double
  340     word.
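
A sketch of reading the fixed 16-byte part of a preprocessed line (table 3), assuming the file is in a bytes object and pos is the file position of the line. The dictionary keys are names invented for this illustration; the offsets are returned as stored, without further resolution.

```python
import struct

HIGH_BIT = 0x80000000

def parse_line_header(data, pos):
    """Decode the four dwords that precede the tokenized line contents."""
    first, second, third, fourth = struct.unpack_from("<IIII", data, pos)
    info = {"line_number": second & 0x7FFFFFFF, "tokens_at": pos + 16}
    if second & HIGH_BIT:
        # Line generated by a macroinstruction.
        info.update(from_macro=True,
                    macro_name_offset=first,
                    invoking_line_offset=third,
                    definition_line_offset=fourth)
    else:
        # Line loaded from source; a zero file name offset means the main
        # input file.
        info.update(from_macro=False,
                    file_name_offset=first,
                    position_in_file=third)
    return info
```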
  341 
  342     The contents of a line are no longer the text that was in the source
  343     file, but a sequence of tokens ended with a zero byte.
  344     Any chain of characters that aren't special ones, separated from other
  345     similar chains with spaces or some other special characters, is converted
  346     into symbol token. The first byte of this element has the value of 1Ah,
  347     the second byte is the count of characters, followed by this amount of
  348     bytes, which build the symbol.
  349     Some characters have a special meaning, and cannot occur inside the
  350     symbol, they split the symbols and are converted into separate tokens.
  351     For example, if source contains this line of text:
  352 
  353       mov ax,4
  354 
  355     preprocessor converts it into the chain of bytes, shown here with their
  356     hexadecimal values (characters corresponding to some of those values are
  357     placed below the hexadecimal codes):
  358 
  359       1A 03 6D 6F 76 1A 02 61 78 2C 1A 01 34 00
  360             m  o  v        a  x  ,        4
  361 
  362     The third type of token that can be found in preprocessed line is the
  363     quoted text. This element is created from chain of any bytes other than
  364     line breaks that are placed between the single or double quotes in the
  365     original text. First byte of such element is always 22h, it is followed
  366     by double word which specifies the number of bytes that follow, and the
  367     value of quoted text comes next. For example, this line from source:
  368 
  369       mov eax,'ABCD'
  370 
  371     is converted into (the notation used is the same as in previous sample):
  372 
  373       1A 03 6D 6F 76 1A 03 65 61 78 2C 22 04 00 00 00 41 42 43 44 00
  374             m  o  v        e  a  x  ,                 A  B  C  D
  375 
  376     This data defines two symbols followed by a special character, a quoted
  377     text and the zero byte that marks the end of line.
  378     There is also a special case of symbol token with first byte having the
  379     value 3Bh instead of 1Ah, such symbol means that all the line elements
  380     that follow, including this one, have already been interpreted by
  381     preprocessor and are ignored by assembler.
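
The token layout described above can be decoded with a loop like the following Python sketch. The function name and the token labels ("symbol", "ignored", "char", "quoted") are invented for this illustration, and decoding symbol bytes as Latin-1 is a choice made here so that any byte value is accepted.

```python
import struct

def decode_tokens(data, pos=0):
    """Decode one tokenized line; return (tokens, position after the line)."""
    tokens = []
    while data[pos] != 0:
        kind = data[pos]
        if kind in (0x1A, 0x3B):
            # Symbol token: marker byte, length byte, then the characters.
            # 3Bh marks a symbol from which the rest of the line has already
            # been interpreted by the preprocessor.
            length = data[pos + 1]
            text = data[pos + 2:pos + 2 + length].decode("latin-1")
            tokens.append(("symbol" if kind == 0x1A else "ignored", text))
            pos += 2 + length
        elif kind == 0x22:
            # Quoted text: marker byte, dword length, then the raw bytes.
            (length,) = struct.unpack_from("<I", data, pos + 1)
            tokens.append(("quoted", data[pos + 5:pos + 5 + length]))
            pos += 5 + length
        else:
            # Any other byte is a special character standing on its own.
            tokens.append(("char", chr(kind)))
            pos += 1
    return tokens, pos + 1
```

Applied to the two byte sequences shown above, this yields the tokens mov, ax, a comma and 4 for the first line, and mov, eax, a comma and the quoted bytes ABCD for the second.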
  382 
  383 
  384    Table 4  Row of the assembly dump
  385   /-------------------------------------------------------------------------\
  386   | Offset | Size  | Description                                            |
  387   |========|=======|========================================================|
  388   |   +0   | dword | Offset in output file.                                 |
  389   |--------|-------|--------------------------------------------------------|
  390   |   +4   | dword | Offset of line in preprocessed source.                 |
  391   |--------|-------|--------------------------------------------------------|
  392   |   +8   | qword | Value of $ address.                                    |
  393   |--------|-------|--------------------------------------------------------|
  394   |  +16   | dword | Extended SIB for the $ address, the first two bytes    |
  395   |        |       | are register codes and the second two bytes are        |
  396   |        |       | corresponding scales.                                  |
  397   |--------|-------|--------------------------------------------------------|
  398   |  +20   | dword | If the $ address is relocatable, this field contains   |
  399   |        |       | information about section or external symbol, to which |
  400   |        |       | it is relative - otherwise this field is zero.         |
  401   |        |       | When the highest bit is cleared, the address is        |
  402   |        |       | relative to a section, and the bits 0-30 contain       |
  403   |        |       | the index (starting from 1) in the table of sections.  |
  404   |        |       | When the highest bit is set, the address is relative   |
  405   |        |       | to an external symbol, and the bits 0-30 contain the   |
  406   |        |       | offset of the name of this symbol in the strings       |
  407   |        |       | table.                                                 |
  408   |--------|-------|--------------------------------------------------------|
  409   |  +24   | byte  | Type of $ address value (as in table 2.2).             |
  410   |--------|-------|--------------------------------------------------------|
  411   |  +25   | byte  | Type of code - possible values are 16, 32, and 64.     |
  412   |--------|-------|--------------------------------------------------------|
  413   |  +26   | byte  | If the bit 0 is set, then at this point the assembly   |
  414   |        |       | was taking place inside the virtual block, and the     |
  415   |        |       | offset in output file has no meaning here.             |
  416   |        |       | If the bit 1 is set, the line was assembled at the     |
  417   |        |       | point, which was not included in the output file for   |
  418   |        |       | some other reasons (like inside the reserved data at   |
  419   |        |       | the end of section).                                   |
  420   |--------|-------|--------------------------------------------------------|
  421   |  +27   | byte  | The higher bits of value of $ address.                 |
  422   \-------------------------------------------------------------------------/
  423 
  424 
  425   Notes:
  426 
  427     Each row of the assembly dump states that the given line of preprocessed
  428     source was assembled at the specified address (defined by its type, value
  429     and the extended SIB) and at the specified position in the output file.
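
A sketch of reading the assembly dump, assuming the file is in a bytes object and that the length from header offset +44 covers both the 28-byte rows and the final double word. The names AsmRow, parse_assembly_dump and in_output_file are hypothetical. A zero length is treated as "no assembly dump", as described in the notes to table 1.

```python
import struct
from collections import namedtuple

ROW = struct.Struct("<IIQIIbBBB")   # 28 bytes, fields in table 4 order
AsmRow = namedtuple(
    "AsmRow",
    "output_offset line_offset address ext_sib relative "
    "address_type code_bits flags address_high",
)

def parse_assembly_dump(data, offset, length):
    """Return (rows, end_offset); end_offset is None if there is no dump."""
    if length == 0:
        return [], None
    # Assumption: the trailing dword is counted in the block length.
    rows_end = offset + length - 4
    rows = [AsmRow(*ROW.unpack_from(data, pos))
            for pos in range(offset, rows_end, ROW.size)]
    (end_offset,) = struct.unpack_from("<I", data, rows_end)
    return rows, end_offset

def in_output_file(row):
    """True if neither bit 0 (virtual block) nor bit 1 of +26 is set."""
    return not (row.flags & 0b11)
```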