How is a relative JMP (x86) implemented in an Assembler?

Posted by Pindatjuh on Stack Overflow See other posts from Stack Overflow or by Pindatjuh
Published on 2010-05-11T21:23:54Z Indexed on 2010/05/11 21:24 UTC
Read the original article Hit count: 378

Filed under:

encoding

While building my assembler for the x86 platform I encountered some problems with encoding the JMP instruction:

enc     inst        size in bytes
EB cb   JMP rel8    2
E9 cw   JMP rel16   4 (because of 0x66 16-bit prefix)
E9 cd   JMP rel32   5
...

(from my favourite x86 instruction website, http://siyobik.info/index.php?module=x86&id=147)

All are relative jumps, where the size of each encoding (operation + operand) is in the third column.

Now my original (and thus fault because of this) design reserved the maximum (5 bytes) space for each instruction. The operand is not yet known, because it's a jump to a yet unknown location. So I've implemented a "rewrite" mechanism, that rewrites the operands in the correct location in memory, if the location of the jump is known, and fills the rest with NOPs. This is a somewhat serious concern in tight-loops.

Now my problem is with the following situation:

b: XXX
c: JMP a
e: XXX
   ...
   XXX
d: JMP b
a: XXX      (where XXX is any instruction, depending
             on the to-be assembled program)

The problem is that I want the smallest possible encoding for a JMP instruction (and no NOP filling).

I have to know the size of the instruction at c before I can calculate the relative distance between a and b for the operand at d. The same applies for the JMP at c: it needs to know the size of d before it can calculate the relative distance between e and a.

How do existing assemblers implement this, or how would you implement this?

This is what I am thinking which solves the problem:

First encode all the instructions to opcodes between the JMP and it's target, and if this region contains a variable-sized opcode, use the maximum size, i.e. 5 for JMP. Then in some conditions, the JMP is oversized (because it may fit in a smaller encoding): so another pass will search for oversized JMPs, shrink them, and move all instructions ahead), and set absolute branching instructions (i.e. external CALLs) after this pass is completed.

I wonder, perhaps this is an over-engineered solution, that's why I ask this question.

Developer IT