Run-time assemblers - Using and porting GNU lightning


3.3 Creating the run-time assembler

The run-time assembler is a set of macros whose purpose is to assemble instructions for the target machine's assembly language, translating mnemonics to machine language together with their operands. While a run-time assembler is not, strictly speaking, part of gnu lightning (it is a private layer to be used while implementing the standard macros that are ultimately used by clients), designing a run-time assembler first allows you to think in terms of assembly language rather than binary code (ouch!...), making it considerably easier to write the standard macros.

Creating a run-time assembler is a tedious process rather than a difficult one, because most of the time will be spent collecting and copying information from the architecture's manual.

Macros defined by a run-time assembler are conventionally named after the mnemonic and the types of its operands. Examples taken from the SPARC's run-time assembler are ADDrrr, a macro that assembles an ADD instruction with three register operands, and SUBCCrir, which assembles a SUBCC instruction whose second operand is an immediate and the remaining two are registers.

The first step in creating the assembler is to pick a convention for operand specifiers (r and i in the example above) and for register names. On the SPARC, this convention is as follows:

r: A register name. For every r in the macro name, a numeric parameter RR is passed to the macro, and the operand is assembled as %rRR.
i: An immediate, usually a 13-bit signed integer (with exceptions for instructions such as SETHI and branches). The macros check the size of the passed parameter if gnu lightning is configured with --enable-assertions.
x: A combination of two r parameters, which are summed to determine the effective address in a memory load/store operation.
m: A combination of an r and i parameter, which are summed to determine the effective address in a memory load/store operation.

Additional macros can be defined that provide easier access to register names. For example, on the SPARC, _Ro(3) and _Rg(5) map respectively to %o3 and %g5; on the x86, instead, symbolic representations of the register names are provided (for example, _EAX and _EBX).
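As a minimal sketch of such register-name helpers, the SPARC register windows map onto 5-bit register numbers as %g = 0-7, %o = 8-15, %l = 16-23 and %i = 24-31. The macro bodies below, the _Rl and _Ri names, and the reg_operand helper are assumptions made for illustration, not necessarily what the gnu lightning headers contain.

```c
#include <assert.h>

/* Hypothetical register-name helpers.  The numbering is the SPARC
   architecture's; the macro bodies are an assumption. */
#define _Rg(N)  ((N) + 0)     /* %g0..%g7 -> 0..7   */
#define _Ro(N)  ((N) + 8)     /* %o0..%o7 -> 8..15  */
#define _Rl(N)  ((N) + 16)    /* %l0..%l7 -> 16..23 */
#define _Ri(N)  ((N) + 24)    /* %i0..%i7 -> 24..31 */

/* Hypothetical check: a register operand must fit in a 5-bit field. */
static int reg_operand(int n)
{
    assert(n >= 0 && n < 32);
    return n;
}
```

With these definitions, _Ro(3) evaluates to 11 and _Rg(5) to 5, which are the numbers the hardware uses for %o3 and %g5.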

CISC architectures sometimes have registers of different sizes; this is the case on the x86, where %ax is a 16-bit register while %esp is a 32-bit one. In this case, it can be useful to embed information on the size in the definition of register names. The x86 machine language, for example, represents all three of %bh, %di and %edi as 7, but the x86 run-time assembler defines them with different numbers, putting the register's size in the upper nybble (for example, `17h' for %bh and `27h' for %di) so that consistency checks can be made on the operands' sizes when --enable-assertions is used.
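The size-tagging idea can be sketched as follows. The values 0x17 and 0x27 come from the text above; the value 0x47 for %edi, the constant names and the helper functions are hypothetical and only illustrate the technique.

```c
#include <assert.h>

/* Low nybble: hardware register number.  High nybble: operand size tag.
   REG_EDI's value and all names here are assumptions. */
enum { REG_BH = 0x17, REG_DI = 0x27, REG_EDI = 0x47 };

static int reg_number(int reg)   { return reg & 0x0F; }
static int reg_size_tag(int reg) { return reg >> 4;   }

/* Return the number to place in the instruction, after checking that
   the register has the size the instruction expects. */
static int encode_reg(int reg, int expected_size_tag)
{
    assert(reg_size_tag(reg) == expected_size_tag);
    return reg_number(reg);
}
```

All three registers still encode as 7, but passing REG_BH where a 16-bit register is expected would trip the assertion instead of silently assembling the wrong instruction.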

The next important part defines the native architecture's instruction formats. These can be as few as ten on RISC architectures, and as many as fifty on CISC architectures. In the latter case it can be useful to define more macros for sub-formats (such as macros for different addressing modes) or even for sub-fields in an instruction. Let's see an example of these macros.

     #define _2i( OP, RD, OP2, IMM)                                         \
             _I((_u2 (OP )<<30)  |  (_u5(RD)<<25)  |  (_u3(OP2)<<22)  |     \
                 _u22(IMM)                                            )

The name of the macro, _2i, indicates a two-operand instruction comprising an immediate operand. The instruction format is:

      .------.---------.------.-------------------------------------------.
      |  OP  |   RD    | OP2  |               IMM                         |
      |------+---------+------+-------------------------------------------|
      |2 bits|  5 bits |3 bits|             22 bits                       |
      |31-30 |  29-25  |24-22 |              21-0                         |
      '------'---------'------'-------------------------------------------'

gnu lightning provides macros named _sXX(OP) and _uXX(OP), where XX is a number between 1 and 31, which test¹ whether OP can be represented as (respectively) a signed or unsigned integer of the given size. What the macro above does, then, is to shift and or together the different fields, ensuring that each of them fits the field.
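A minimal sketch of this checking-and-packing step, written as C functions rather than macros. The names field_u, field_s and encode_2i are hypothetical; unlike the real _I-based macro, encode_2i returns the assembled word instead of storing it, and the checks are always on rather than depending on --enable-assertions.

```c
#include <assert.h>
#include <stdint.h>

/* Check that v fits in an unsigned field of the given width. */
static uint32_t field_u(unsigned bits, uint32_t v)
{
    assert(bits >= 1 && bits <= 31);
    assert(v < (UINT32_C(1) << bits));
    return v;
}

/* Check that v fits in a signed field of the given width, and return
   its two's-complement bit pattern truncated to that width. */
static uint32_t field_s(unsigned bits, int32_t v)
{
    int32_t limit;
    assert(bits >= 1 && bits <= 31);
    limit = INT32_C(1) << (bits - 1);
    assert(v >= -limit && v < limit);
    return (uint32_t)v & ((UINT32_C(1) << bits) - 1);
}

/* Pack the four fields of the two-operand immediate format. */
static uint32_t encode_2i(uint32_t op, uint32_t rd, uint32_t op2, uint32_t imm)
{
    return (field_u(2, op) << 30) | (field_u(5, rd) << 25)
         | (field_u(3, op2) << 22) | field_u(22, imm);
}
```

For instance, encode_2i(0, 1, 4, 0x12345678 >> 10) yields 0x03048D15, the encoding of sethi %hi(0x12345678), %g1.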

Here is another definition, this time for the PowerPC architecture.

     #define _X(OP,RD,RA,RB,XO,RC)                                          \
             _I((_u6 (OP)<<26)  |  (_u5(RD)<<21)  |  (_u5(RA)<<16)  |       \
                ( _u5(RB)<<11)  |  (_u10(XO)<<1)  |   _u1(RC)       )

Here is the bit layout corresponding to this instruction format:

      .--------.--------.--------.--------.---------------------.-------.
      |   OP   |   RD   |   RA   |   RB   |         XO          |  RC   |
      |--------+--------+--------+--------+---------------------+-------|
      | 6 bits | 5 bits | 5 bits | 5 bits |       10 bits       | 1 bit |
      | 31-26  | 25-21  | 20-16  | 15-11  |        10-1         |   0   |
      '--------'--------'--------'--------'---------------------'-------'
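To make the layout concrete, here is a hedged sketch that packs the six fields of this format into a word. The function names chk and encode_x are hypothetical, and the function returns the word rather than storing it as the _I-based macro would.

```c
#include <assert.h>
#include <stdint.h>

/* Check that v fits in an unsigned field of the given width. */
static uint32_t chk(unsigned bits, uint32_t v)
{
    assert(v < (UINT32_C(1) << bits));
    return v;
}

/* Pack the fields of the PowerPC X instruction format. */
static uint32_t encode_x(uint32_t op, uint32_t rd, uint32_t ra,
                         uint32_t rb, uint32_t xo, uint32_t rc)
{
    return (chk(6, op) << 26) | (chk(5, rd) << 21) | (chk(5, ra) << 16)
         | (chk(5, rb) << 11) | (chk(10, xo) << 1) | chk(1, rc);
}
```

As a worked example, and r3,r4,r5 uses primary opcode 31 and extended opcode 28, with the source register r4 in the RD slot: encode_x(31, 4, 3, 5, 28, 0) gives 0x7C832838.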

How do these macros actually generate code? The secret lies in the _I macro, one of four predefined macros that store machine language instructions in memory. They are _B, _W, _I and _L, respectively for 8-bit, 16-bit, 32-bit, and long (either 32-bit or 64-bit, depending on the architecture) values.
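The storing step can be pictured with the sketch below, which appends values of each width at a moving cursor. The emit_pc and emit_end variables and the emit_* function names are hypothetical; the real macros keep their state inside gnu lightning and need not be implemented this way.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical emitter state: cursor and end of a caller-provided buffer. */
static unsigned char *emit_pc;
static unsigned char *emit_end;

/* Copy n bytes to the cursor in host byte order and advance it. */
static void emit_bytes(const void *src, size_t n)
{
    assert((size_t)(emit_end - emit_pc) >= n);
    memcpy(emit_pc, src, n);
    emit_pc += n;
}

static void emit_b(uint8_t v)       { emit_bytes(&v, sizeof v); } /* like _B */
static void emit_w(uint16_t v)      { emit_bytes(&v, sizeof v); } /* like _W */
static void emit_i(uint32_t v)      { emit_bytes(&v, sizeof v); } /* like _I */
static void emit_l(unsigned long v) { emit_bytes(&v, sizeof v); } /* like _L */
```

Using memcpy avoids alignment problems when instructions of different widths are mixed, as happens on the x86.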

Next comes another set of macros (usually the biggest) which represents the actual mnemonics; macros such as ADDrrr and SUBCCrir, which were cited earlier in this section, belong to this set. Most of the time, all these macros do is invoke the “instruction format” macros, specifying the values of the fields in the different instruction formats. Let's see a few of these definitions, again taken from the SPARC assembler:

     #define BAi(DISP)                       _2   (0, 0,  8, 2, DISP)
     #define BA_Ai(DISP)                     _2   (0, 1,  8, 2, DISP)
     
     #define SETHIir(IMM, RD)                _2i  (0, RD, 4, IMM)
     
     #define ADDrrr(RS1, RS2, RD)            _3   (2, RD,  0, RS1, 0, 0, RS2)
     #define ADDrir(RS1, IMM, RD)            _3i  (2, RD,  0, RS1, 1,    IMM)
     #define ADDCCrrr(RS1, RS2, RD)          _3   (2, RD, 16, RS1, 0, 0, RS2)
     #define ADDCCrir(RS1, IMM, RD)          _3i  (2, RD, 16, RS1, 1,    IMM)
     #define ANDrrr(RS1, RS2, RD)            _3   (2, RD,  1, RS1, 0, 0, RS2)
     #define ANDrir(RS1, IMM, RD)            _3i  (2, RD,  1, RS1, 1,    IMM)
     #define ANDCCrrr(RS1, RS2, RD)          _3   (2, RD, 17, RS1, 0, 0, RS2)
     #define ANDCCrir(RS1, IMM, RD)          _3i  (2, RD, 17, RS1, 1,    IMM)

A few things have to be noted. For example:

The SPARC assembly language sometimes uses a comma inside a mnemonic (for example, ba,a). This symbol is not allowed inside a cpp macro name, so it is replaced with an underscore; the same is done with the dots found in the PowerPC assembly language (for example, andi. is defined as ANDI_rri).
It can be useful to group together instructions with the same instruction format, as doing this tends to make the source code more readable (numbers are put in the same columns).
Using an editor without automatic wrap at end of line can be useful, since run-time assemblers tend to have very long lines.
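The way the mnemonic macros above sit on top of the instruction formats can be sketched as follows. The fmt3 and fmt3i functions are hypothetical stand-ins for _3 and _3i: their bit positions follow the SPARC format 3 layout, but they return the word instead of storing it, and their argument checks are an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* SPARC format 3, register form: op | rd | op3 | rs1 | i | asi | rs2 */
static uint32_t fmt3(uint32_t op, uint32_t rd, uint32_t op3, uint32_t rs1,
                     uint32_t i, uint32_t asi, uint32_t rs2)
{
    assert(op < 4 && rd < 32 && op3 < 64 && rs1 < 32);
    assert(i < 2 && asi < 256 && rs2 < 32);
    return (op << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14)
         | (i << 13) | (asi << 5) | rs2;
}

/* SPARC format 3, immediate form: op | rd | op3 | rs1 | i | simm13 */
static uint32_t fmt3i(uint32_t op, uint32_t rd, uint32_t op3, uint32_t rs1,
                      uint32_t i, int32_t simm13)
{
    assert(op < 4 && rd < 32 && op3 < 64 && rs1 < 32 && i < 2);
    assert(simm13 >= -4096 && simm13 <= 4095);
    return (op << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14)
         | (i << 13) | ((uint32_t)simm13 & 0x1FFF);
}

/* Mnemonic macros, with the same field values as in the listing above. */
#define ADDrrr(RS1, RS2, RD)    fmt3 (2, RD,  0, RS1, 0, 0, RS2)
#define ADDrir(RS1, IMM, RD)    fmt3i(2, RD,  0, RS1, 1,    IMM)
#define ADDCCrrr(RS1, RS2, RD)  fmt3 (2, RD, 16, RS1, 0, 0, RS2)
```

With these definitions, ADDrrr(1, 2, 3) evaluates to 0x86004002, the encoding of add %g1, %g2, %g3.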

A final touch is to define the synthetic instructions, which are usually found on RISC machines. For example, on the SPARC, the LD instruction has two synonyms (LDUW and LDSW) which are defined thus:

     #define LDUWxr(RS1, RS2, RD)            LDxr(RS1, RS2, RD)
     #define LDUWmr(RS1, IMM, RD)            LDmr(RS1, IMM, RD)
     #define LDSWxr(RS1, RS2, RD)            LDxr(RS1, RS2, RD)
     #define LDSWmr(RS1, IMM, RD)            LDmr(RS1, IMM, RD)

Other common cases are instructions which take advantage of registers whose value is hard-wired to zero, and short-cut instructions which hard-code some or all of the operands:

     /* Destination is %g0, which the processor never overwrites. */
     #define CMPrr(R1, R2)   SUBCCrrr(R1, R2, 0) /* subcc %r1, %r2, %g0 */
     
     /* One of the source registers is hard-coded to be %g0. */
     #define NEGrr(R,S)      SUBrrr(0, R, S)     /* sub %g0, %rR, %rS */
     
     /* All of the operands are hard-coded. */
     #define RET()           JMPLmr(31,8 ,0)     /* jmpl [%r31+8], %g0  */
     
     /* One of the operands acts as both source and destination */
     #define BSETrr(R,S)     ORrrr(R, S, S)      /* or %rR, %rS, %rS */
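A hedged sketch showing that these synthetic instructions expand to the expected machine words. The op3 values (SUB = 4, SUBCC = 20, OR = 2, JMPL = 56) are the SPARC architecture's; the fmt3 and fmt3i helpers are hypothetical simplifications that return the word rather than storing it.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified SPARC format 3 encoders (register and immediate forms). */
static uint32_t fmt3(uint32_t rd, uint32_t op3, uint32_t rs1, uint32_t rs2)
{
    assert(rd < 32 && op3 < 64 && rs1 < 32 && rs2 < 32);
    return (UINT32_C(2) << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14) | rs2;
}

static uint32_t fmt3i(uint32_t rd, uint32_t op3, uint32_t rs1, int32_t simm13)
{
    assert(rd < 32 && op3 < 64 && rs1 < 32);
    assert(simm13 >= -4096 && simm13 <= 4095);
    return (UINT32_C(2) << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14)
         | (UINT32_C(1) << 13) | ((uint32_t)simm13 & 0x1FFF);
}

/* Basic instructions. */
#define SUBrrr(RS1, RS2, RD)    fmt3 (RD,  4, RS1, RS2)
#define SUBCCrrr(RS1, RS2, RD)  fmt3 (RD, 20, RS1, RS2)
#define ORrrr(RS1, RS2, RD)     fmt3 (RD,  2, RS1, RS2)
#define JMPLmr(RS1, IMM, RD)    fmt3i(RD, 56, RS1, IMM)

/* Synthetic instructions, defined as in the listing above. */
#define CMPrr(R1, R2)   SUBCCrrr(R1, R2, 0)
#define NEGrr(R, S)     SUBrrr(0, R, S)
#define RET()           JMPLmr(31, 8, 0)
#define BSETrr(R, S)    ORrrr(R, S, S)
```

For example, RET() evaluates to 0x81C7E008 and CMPrr(1, 2) to 0x80A04002, the encodings of ret and cmp %g1, %g2.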

Specific to RISC computers, finally, is the instruction to load an arbitrarily sized immediate into a register. This instruction is usually implemented as one or two basic instructions:

If the number is small enough, a single instruction is sufficient (LI or ORI on the PowerPC, MOV on the SPARC).
If the lowest bits are all zero, a single instruction is also sufficient (LIS on the PowerPC, SETHI on the SPARC).
Otherwise, the high bits are set first (with LIS or SETHI), and the result is then ORed with the low bits.

Here is the definition of such an instruction for the PowerPC:

     #define MOVEIri(R,I)      (_siP(16,I) ? LIri(R,I) :        /* case 1    */ \
                               (_uiP(16,I) ? ORIrri(R,0,I) :    /* case 1    */ \
                               _MOVEIri(R, _HI(I), _LO(I)) ))   /* case 2/3  */
     
     #define _MOVEIri(R,H,L)  (LISri(R,H), (L ? ORIrri(R,R,L) : 0))

and for the SPARC:

     #define SETir(I,R)      (_siP(13,I) ? MOVir(I,R) : \
                             _SETir(_HI(I), _LO(I), R))
     
     #define _SETir(H,L,R)   (SETHIir(H,R), (L ? ORrir(R,L,R) : 0))

In both cases, _HI and _LO are macros for internal use that extract different parts of the immediate operand.
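The three cases can be sketched for the SPARC as follows. The split into the upper 22 bits (for SETHI) and the lower 10 bits (for OR) follows from the SETHI instruction format; that _HI and _LO are defined exactly this way is an assumption, and sethi, or_imm and set_imm are hypothetical names. The sketch writes the instruction words to an array and returns how many were needed.

```c
#include <assert.h>
#include <stdint.h>

/* sethi imm22, %rd */
static uint32_t sethi(uint32_t imm22, uint32_t rd)
{
    assert(imm22 < (UINT32_C(1) << 22) && rd < 32);
    return (rd << 25) | (UINT32_C(4) << 22) | imm22;
}

/* or %rs1, simm13, %rd  (simm13 is passed already masked to 13 bits) */
static uint32_t or_imm(uint32_t rs1, uint32_t simm13, uint32_t rd)
{
    assert(rs1 < 32 && rd < 32 && simm13 < (UINT32_C(1) << 13));
    return (UINT32_C(2) << 30) | (rd << 25) | (UINT32_C(2) << 19)
         | (rs1 << 14) | (UINT32_C(1) << 13) | simm13;
}

/* Load a 32-bit immediate into %rd; returns the instruction count. */
static int set_imm(int32_t imm, uint32_t rd, uint32_t out[2])
{
    uint32_t u = (uint32_t)imm;

    if (imm >= -4096 && imm <= 4095) {       /* case 1: mov imm, %rd */
        out[0] = or_imm(0, u & 0x1FFF, rd);
        return 1;
    }
    out[0] = sethi(u >> 10, rd);             /* set the high 22 bits */
    if ((u & 0x3FF) == 0)                    /* case 2: low bits zero */
        return 1;
    out[1] = or_imm(rd, u & 0x3FF, rd);      /* case 3: OR in the low bits */
    return 2;
}
```

Loading 0x12345678 into %g1 takes two instructions, sethi %hi(0x12345678), %g1 followed by or %g1, 0x278, %g1, whereas 5 and 0x12345400 each take one.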

You should take a look at the run-time assemblers distributed with gnu lightning before trying to craft your own. In particular, make sure you understand the RISC run-time assemblers (the SPARC's is the simplest) before trying to decipher the x86 run-time assembler, which is significantly more complex.

Footnotes

[1] Only when --enable-assertions is used.