Fibonacci - Using and porting GNU lightning

Previous: RPN calculator, Up: GNU lightning macros

2.3.4 Fibonacci numbers

The code in this section calculates a variant of the Fibonacci sequence. While the traditional Fibonacci sequence is modeled by the recurrence relation:

          f(0) = f(1) = 1
          f(n) = f(n-1) + f(n-2)

the functions in this section calculates the following sequence, which is more interesting as a benchmark¹:

          nfibs(0) = nfibs(1) = 1
          nfibs(n) = nfibs(n-1) + nfibs(n-2) + 1

The purpose of this example is to introduce branches. There are two kind of branches: backward branches and forward branches. We'll present the calculation in a recursive and iterative form; the former only uses forward branches, while the latter uses both.

     #include <stdio.h>
     #include "lightning.h"
     
     static jit_insn codeBuffer[1024];
     
     typedef int (*pifi)(int);       /* Pointer to Int Function of Int */
     
     int main()
     {
       pifi      nfibs = (pifi) (jit_set_ip(codeBuffer).iptr);
       int       in;                 /* offset of the argument */
       jit_insn  *ref;               /* to patch the forward reference */
     
             jit_prolog   (1);
       in =  jit_arg_ui   ();
             jit_getarg_ui(JIT_V0, in);              /* V0 = n */
       ref = jit_blti_ui  (jit_forward(), JIT_V0, 2);
             jit_subi_ui  (JIT_V1, JIT_V0, 1);       /* V1 = n-1 */
             jit_subi_ui  (JIT_V2, JIT_V0, 2);       /* V2 = n-2 */
             jit_prepare(1);
               jit_pusharg_ui(JIT_V1);
             jit_finish(nfibs);
             jit_retval(JIT_V1);                     /* V1 = nfibs(n-1) */
             jit_prepare(1);
               jit_pusharg_ui(JIT_V2);
             jit_finish(nfibs);
             jit_retval(JIT_V2);                     /* V2 = nfibs(n-2) */
             jit_addi_ui(JIT_V1,  JIT_V1,  1);
             jit_addr_ui(JIT_RET, JIT_V1, JIT_V2);   /* RET = V1 + V2 + 1 */
             jit_ret();
     
       jit_patch(ref);                               /* patch jump */
             jit_movi_i(JIT_RET, 1);                 /* RET = 1 */
             jit_ret();
     
       /* call the generated code, passing 32 as an argument */
       jit_flush_code(codeBuffer, jit_get_ip().ptr);
       printf("nfibs(%d) = %d", 32, nfibs(32));
       return 0;
     }

As said above, this is the first example of dynamically compiling branches. Branch instructions have three operands: two contains the values to be compared, while the first is a label; gnu lightning label's are represented as jit_insn * values. Unlike other instructions (apart from arg, which is actually a directive rather than an instruction), branch instructions also return a value which, as we see in the example above, can be used to compile forward references.

Compiling a forward reference is a two-step operation. First, a branch is compiled with a dummy label, since the actual destination of the jump is not yet known; the dummy label is returned by the jit_forward() macro. The value returned by the branch instruction is saved to be used later.

Then, when the destination of the jump is reached, another macro is used, jit_patch(). This macro must be called once for every point in which the code had a forward branch to the instruction following jit_patch (in this case a movi_i instruction).

Now, here is the iterative version:

     #include <stdio.h>
     #include "lightning.h"
     
     static jit_insn codeBuffer[1024];
     
     typedef int (*pifi)(int);       /* Pointer to Int Function of Int */
     
     int main()
     {
       pifi     nfibs = (pifi) (jit_set_ip(codeBuffer).iptr);
       int      in;                  /* offset of the argument */
       jit_insn *ref;                /* to patch the forward reference */
       jit_insn *loop;               /* start of the loop */
     
             jit_leaf     (1);
       in =  jit_arg_ui   ();
             jit_getarg_ui(JIT_R2, in);              /* R2 = n */
             jit_movi_ui  (JIT_R1, 1);
       ref = jit_blti_ui  (jit_forward(), JIT_R2, 2);
             jit_subi_ui  (JIT_R2, JIT_R2, 1);
             jit_movi_ui  (JIT_R0, 1);
     
       loop= jit_get_label();
             jit_subi_ui  (JIT_R2, JIT_R2, 1);       /* decr. counter */
             jit_addr_ui  (JIT_V0, JIT_R0, JIT_R1);  /* V0 = R0 + R1 */
             jit_movr_ui  (JIT_R0, JIT_R1);          /* R0 = R1 */
             jit_addi_ui  (JIT_R1, JIT_V0, 1);       /* R1 = V0 + 1 */
             jit_bnei_ui  (loop, JIT_R2, 0);         /* if (R2) goto loop; */
     
       jit_patch(ref);                               /* patch forward jump */
             jit_movr_ui  (JIT_RET, JIT_R1);         /* RET = R1 */
             jit_ret      ();
     
       /* call the generated code, passing 36 as an argument */
       jit_flush_code(codeBuffer, jit_get_ip().ptr);
       printf("nfibs(%d) = %d", 36, nfibs(36));
       return 0;
     }

This code calculates the recurrence relation using iteration (a for loop in high-level languages). There is still a forward reference (indicated by the jit_forward/jit_patch pair); there are no function calls anymore: instead, there is a backward jump (the bnei at the end of the loop).

In this case, the destination address should be known, because the jumps lands on an instruction that has already been compiled. However the program must make a provision and remember the address where the jump will land. This is achieved with jit_get_label, yet another macro that is much similar to jit_get_ip but, instead of a jit_code union, it answers an jit_insn * that the branch macros accept.

Now, let's make one more change: let's rewrite the loop like this:

       ...
     
       jit_delay(
             jit_movi_ui  (JIT_R1, 1),
       ref = jit_blti_ui  (jit_forward(), JIT_R2, 2));
             jit_subi_ui  (JIT_R2, JIT_R2, 1);
     
       loop= jit_get_label();
             jit_subi_ui  (JIT_R2, JIT_R2, 1);       /* decr. counter */
             jit_addr_ui  (JIT_V0, JIT_R0, JIT_R1);  /* V0 = R0 + R1 */
             jit_movr_ui  (JIT_R0, JIT_R1);          /* R0 = R1 */
       jit_delay(
             jit_addi_ui  (JIT_R1, JIT_V0, 1),       /* R1 = V0 + 1 */
             jit_bnei_ui  (loop, JIT_R2, 0));        /* if (R2) goto loop; */
     
       ...

The jit_delay macro is used to schedule delay slots in jumps and branches. This is optional, but might lead to performance improvements in tight inner loops (of course not in a loop that is executed 35 times, but this is just an example).

jit_delay takes two gnu lightning instructions, a delay instruction and a branch instruction. Note that the two instructions must be written in execution order (first the delay instruction, then the branch instruction), not with the branch first. If the current machine has a delay slot, the delay instruction (or part of it) is placed in the delay slot after the branch instruction; otherwise, it emits the delay instruction before the branch instruction. The delay instruction must not depend on being executed before or after the branch.

Instead of jit_patch, you can use jit_patch_at, which takes two arguments: the first is the same as for jit_patch, and the second is the valued to be patched in. In other words, these two invocations have the same effect:

       jit_patch (jump_pc);
       jit_patch_at (jump_pc, jit_get_ip ());

Dual to branches and jit_patch_at are jit_movi_p and jit_patch_movi, which can also be used to implement forward references. jit_movi_p is carefully implemented to use an encoding that is as long as possible, so that it can always be patched; in addition, like branches, it will return an address which is then passed to jit_patch_movi. The usage of jit_patch_movi is similar to jit_patch_at.

Footnotes

[1] That's because, as is easily seen, the sequence represents the number of activations of the nfibs procedure that are needed to compute its value through recursion.