"Fossies" - the Fresh Open Source Software Archive

Member "nasm-2.15.05/doc/html/nasmdo11.html" (28 Aug 2020, 8595 Bytes) of package /linux/misc/nasm-2.15.05-xdoc.tar.xz:

    1 <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
    2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
    3 <html xmlns="http://www.w3.org/1999/xhtml">
    4 <head>
    5 <title>NASM - The Netwide Assembler</title>
    6 <link href="nasmdoc.css" rel="stylesheet" type="text/css" />
    7 <link href="local.css" rel="stylesheet" type="text/css" />
    8 </head>
    9 <body>
   10 <ul class="navbar">
   11 <li class="first"><a class="prev" href="nasmdo10.html">Chapter 10</a></li>
   12 <li><a class="next" href="nasmdo12.html">Chapter 12</a></li>
   13 <li><a class="toc" href="nasmdoc0.html">Contents</a></li>
   14 <li class="last"><a class="index" href="nasmdoci.html">Index</a></li>
   15 </ul>
   16 <div class="title">
   17 <h1>NASM - The Netwide Assembler</h1>
   18 <span class="subtitle">version 2.15.05</span>
   19 </div>
   20 <div class="contents"
   21 >
   22 <h2 id="chapter-11">Chapter 11: Mixing 16- and 32-bit Code</h2>
   23 <p>This chapter covers issues that arise when writing operating system
   24 code, such as protected-mode initialization routines, that must operate
   25 across segments of different sizes. Most of them involve unusual forms of
   26 addressing and jump instructions: for example, code in a 16-bit segment
   27 that modifies data in a 32-bit one, or jumps between different-size
   28 segments.</p>
   29 <h3 id="section-11.1">11.1 Mixed-Size Jumps</h3>
   30 <p>The most common form of mixed-size instruction is the one used when
   31 writing a 32-bit OS: having done your setup in 16-bit mode, such as loading
   32 the kernel, you then have to boot it by switching into protected mode and
   33 jumping to the 32-bit kernel start address. In a fully 32-bit OS, this
   34 tends to be the <em>only</em> mixed-size instruction you need, since
   35 everything before it can be done in pure 16-bit code, and everything after
   36 it can be pure 32-bit.</p>
   37 <p>This jump must specify a 48-bit far address, since the target segment is
   38 a 32-bit one. However, it must be assembled in a 16-bit segment, so just
   39 coding, for example,</p>
   40 <pre>
   41         jmp     0x1234:0x56789ABC       ; wrong!
   42 </pre>
   43 <p>will not work, since the offset part of the address will be truncated to
   44 <code>0x9ABC</code> and the jump will be an ordinary 16-bit far one.</p>
   45 <p>The Linux kernel setup code gets round the inability of
   46 <code>as86</code> to generate the required instruction by coding it
   47 manually, using <code>DB</code> instructions. NASM can go one better than
   48 that, by actually generating the right instruction itself. Here's how to do
   49 it right:</p>
   50 <pre>
   51         jmp     dword 0x1234:0x56789ABC         ; right
   52 </pre>
   53 <p>The <code>DWORD</code> prefix forces the offset part to be treated as a
   54 32-bit quantity, on the assumption that you are deliberately writing a jump
   55 from a 16-bit segment to a 32-bit one. (Strictly speaking, the prefix should
   56 come <em>after</em> the colon, since it declares the <em>offset</em> field
   57 to be a doubleword; but NASM will accept either form, since both are
   58 unambiguous.)</p>
   59 <p>You can do the reverse operation, jumping from a 32-bit segment to a
   60 16-bit one, by means of the <code>WORD</code> prefix:</p>
   61 <pre>
   62         jmp     word 0x8765:0x4321      ; 32 to 16 bit
   63 </pre>
   64 <p>If the <code>WORD</code> prefix is specified in 16-bit mode, or the
   65 <code>DWORD</code> prefix in 32-bit mode, it is ignored, since it merely
   66 forces NASM into the mode it is already in.</p>
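      <p>As a rough illustration of the 16-to-32-bit jump described in this
      section, the following sketch shows one way the switch into protected mode
      might be arranged. It assumes a flat binary loaded at <code>0x7C00</code>
      and a minimal GDT holding a single flat code descriptor; the labels
      <code>start</code>, <code>start32</code>, <code>gdt</code> and
      <code>gdt_desc</code> and the selector name <code>CODE32_SEL</code> are
      hypothetical names chosen for this example. A real loader would also have
      to enable the A20 line and reload the data segment registers.</p>
      <pre>
              bits    16
              org     0x7C00                  ; assumed load address

      CODE32_SEL equ  0x08                    ; assumed: first descriptor after the null one

      start:
              cli                             ; no interrupts during the switch
              xor     ax,ax
              mov     ds,ax                   ; DS = 0 so that [gdt_desc] is found
              lgdt    [gdt_desc]
              mov     eax,cr0
              or      al,1                    ; set CR0.PE
              mov     cr0,eax
              jmp     dword CODE32_SEL:start32 ; the one mixed-size instruction

              bits    32
      start32:
              jmp     start32                 ; placeholder for the 32-bit kernel entry

      gdt:
              dq      0                       ; mandatory null descriptor
              dq      0x00CF9A000000FFFF      ; flat 32-bit code segment, base 0
      gdt_end:

      gdt_desc:
              dw      gdt_end - gdt - 1       ; GDT limit
              dd      gdt                     ; GDT linear base address
      </pre>
      <p>Everything before the far jump is pure 16-bit code and everything after
      the <code>bits 32</code> directive is pure 32-bit code, so the jump is the
      only instruction in this sketch that needs a size prefix.</p>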
   67 <h3 id="section-11.2">11.2 Addressing Between Different-Size Segments</h3>
   68 <p>If your OS is mixed 16- and 32-bit, or if you are writing a DOS extender,
   69 you are likely to have to deal with some 16-bit segments and some 32-bit
   70 ones. At some point, you will probably end up writing code in a 16-bit
   71 segment which has to access data in a 32-bit segment, or vice versa.</p>
   72 <p>If the data you are trying to access in a 32-bit segment lies within the
   73 first 64K of the segment, you may be able to get away with using an
   74 ordinary 16-bit addressing operation for the purpose; but sooner or later,
   75 you will want to do 32-bit addressing from 16-bit mode.</p>
   76 <p>The easiest way to do this is to make sure you use a register for the
   77 address, since any effective address containing a 32-bit register is forced
   78 to be a 32-bit address. So you can do</p>
   79 <pre>
   80         mov     eax,offset_into_32_bit_segment_specified_by_fs
   81         mov     dword [fs:eax],0x11223344
   82 </pre>
   83 <p>This is fine, but slightly cumbersome (since it wastes an instruction
   84 and a register) if you already know the precise offset you are aiming at.
   85 The x86 architecture does allow 32-bit effective addresses to specify
   86 nothing but a 4-byte offset, so why shouldn't NASM be able to generate the
   87 best instruction for the purpose?</p>
   88 <p>It can. As in <a href="#section-11.1">section 11.1</a>, you need only
   89 prefix the address with the <code>DWORD</code> keyword, and it will be
   90 forced to be a 32-bit address:</p>
   91 <pre>
   92         mov     dword [fs:dword my_offset],0x11223344
   93 </pre>
   94 <p>Also as in <a href="#section-11.1">section 11.1</a>, NASM is not fussy
   95 about whether the <code>DWORD</code> prefix comes before or after the
   96 segment override, so arguably a nicer-looking way to code the above
   97 instruction is</p>
   98 <pre>
   99         mov     dword [dword fs:my_offset],0x11223344
  100 </pre>
  101 <p>Don't confuse the <code>DWORD</code> prefix <em>outside</em> the square
  102 brackets, which controls the size of the data stored at the address, with
  103 the one <em>inside</em> the square brackets, which controls the length
  104 of the address itself. The two can quite easily be different:</p>
  105 <pre>
  106         mov     word [dword 0x12345678],0x9ABC
  107 </pre>
  108 <p>This moves 16 bits of data to an address specified by a 32-bit offset.</p>
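      <p>To illustrate that the two size keywords act independently, the
      following sketch, assuming a 16-bit segment, lists the four combinations
      together with the prefix bytes each one requires (<code>0x66</code> for
      operand size, <code>0x67</code> for address size). The addresses and
      values are arbitrary constants chosen for the example.</p>
      <pre>
              bits    16
              mov     word  [word  0x1234],0x9ABC             ; 16-bit data, 16-bit address: no prefix
              mov     dword [word  0x1234],0x11223344         ; 32-bit data, 16-bit address: 0x66 prefix
              mov     word  [dword 0x12345678],0x9ABC         ; 16-bit data, 32-bit address: 0x67 prefix
              mov     dword [dword 0x12345678],0x11223344     ; 32-bit data, 32-bit address: both prefixes
      </pre>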
  109 <p>You can also specify <code>WORD</code> or <code>DWORD</code> prefixes
  110 along with the <code>FAR</code> prefix to indirect far jumps or calls. For
  111 example:</p>
  112 <pre>
  113         call    dword far [fs:word 0x4321]
  114 </pre>
  115 <p>This instruction contains an address specified by a 16-bit offset; it
  116 loads a 48-bit far pointer from that (16-bit segment and 32-bit offset),
  117 and calls that address.</p>
  118 <h3 id="section-11.3">11.3 Other Mixed-Size Instructions</h3>
  119 <p>You might also want to access data using the string instructions
  120 (<code>LODSx</code>, <code>STOSx</code> and so on) or the
  121 <code>XLATB</code> instruction. Since these instructions take no
  122 parameters, there might seem to be no easy way to make them perform 32-bit
  123 addressing when assembled in a 16-bit segment.</p>
  124 <p>This is the purpose of NASM's <code>a16</code>, <code>a32</code> and
  125 <code>a64</code> prefixes. If you are coding <code>LODSB</code> in a 16-bit
  126 segment but it is supposed to be accessing a string in a 32-bit segment,
  127 you should load the desired address into <code>ESI</code> and then code</p>
  128 <pre>
  129         a32     lodsb
  130 </pre>
  131 <p>The prefix forces the addressing size to 32 bits, meaning that
  132 <code>LODSB</code> loads from <code>[DS:ESI]</code> instead of
  133 <code>[DS:SI]</code>. To access a string in a 16-bit segment when coding in
  134 a 32-bit one, the corresponding <code>a16</code> prefix can be used.</p>
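      <p>The same idea extends to repeated string instructions. The sketch below
      copies a block between two locations that lie beyond the first 64K of
      their segments, from code assembled in a 16-bit segment. It assumes that
      <code>DS</code> and <code>ES</code> already refer to suitable 32-bit
      segments; <code>SRC_OFFSET</code>, <code>DST_OFFSET</code> and
      <code>BYTE_COUNT</code> are hypothetical constants chosen for the
      example.</p>
      <pre>
              bits    16

      SRC_OFFSET equ  0x00100000              ; hypothetical source offset, beyond 64K
      DST_OFFSET equ  0x00200000              ; hypothetical destination offset
      BYTE_COUNT equ  0x00020000              ; hypothetical length (128K)

              mov     esi,SRC_OFFSET          ; source is [DS:ESI]
              mov     edi,DST_OFFSET          ; destination is [ES:EDI]
              mov     ecx,BYTE_COUNT          ; with a 32-bit address size, REP counts in ECX
              cld                             ; copy upwards
              a32 rep movsb
      </pre>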
  135 <p>The <code>a16</code>, <code>a32</code> and <code>a64</code> prefixes can
  136 be applied to any instruction in NASM's instruction table, but most
  137 instructions can generate all their useful forms without them. The prefixes
  138 are necessary only for instructions with implicit addressing: <code>CMPSx</code>,
  139 <code>SCASx</code>, <code>LODSx</code>, <code>STOSx</code>,
  140 <code>MOVSx</code>, <code>INSx</code>, <code>OUTSx</code>, and
  141 <code>XLATB</code>. Also, the various push and pop instructions
  142 (<code>PUSHA</code> and <code>POPF</code> as well as the more usual
  143 <code>PUSH</code> and <code>POP</code>) can accept <code>a16</code>,
  144 <code>a32</code> or <code>a64</code> prefixes to force a particular one of
  145 <code>SP</code>, <code>ESP</code> or <code>RSP</code> to be used as a stack
  146 pointer, in case the stack segment in use is a different size from the code
  147 segment.</p>
  148 <p><code>PUSH</code> and <code>POP</code>, when applied to segment
  149 registers in 32-bit mode, also have the slightly odd behaviour that they
  150 push and pop 4 bytes at a time, of which the top two are ignored and the
  151 bottom two give the value of the segment register being manipulated. To
  152 force the 16-bit behaviour of segment-register push and pop instructions,
  153 you can use the operand-size prefix <code>o16</code>:</p>
  154 <pre>
  155         o16 push    ss
  156         o16 push    ds
  157 </pre>
  158 <p>This code saves a doubleword of stack space by fitting two segment
  159 registers into the space which would normally be consumed by pushing one.</p>
  160 <p>(You can also use the <code>o32</code> prefix to force the 32-bit
  161 behaviour when in 16-bit mode, but this seems less useful.)</p>
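      <p>A matching save-and-restore sequence might look like the following
      sketch, which assumes a 32-bit segment and uses <code>DS</code> and
      <code>ES</code> purely as an example. The pops must use the same
      <code>o16</code> prefix and run in reverse order; note that between the
      two pushes the stack pointer is only word-aligned.</p>
      <pre>
              bits    32
              o16 push    ds                  ; pushes 2 bytes
              o16 push    es                  ; pushes 2 bytes: one doubleword in total
              ; ... code that changes DS and ES goes here ...
              o16 pop     es                  ; restore in reverse order
              o16 pop     ds
      </pre>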
  162 </div>
  163 </body>
  164 </html>