This perl5 jitter is super-simple. The compiled optree is a linked list in memory that is not laid out in execution order, so running it means widely scattered jumps. Additionally, the calls are indirect. The jitter lays out the run-time calls linearly, in linked-list "exec" order, so that the CPU can prefetch the next instructions, and it inlines some simple ops.

IT DOES NOT WORK YET!

Faster jitted execution path without the runops loop, selected with -MJit or (later) with perl -j.

All ops are unrolled in execution order for the CPU cache; prefetching is the main advantage of this approach. The ASYNC check should be done only when necessary. (TODO)

For now it is only implemented for x86 with certain hardcoded my_perl offsets.

C pseudocode

threaded:

    my_perl->Iop = PL_op->op_ppaddr(my_perl);
    if (my_perl->Isig_pending) Perl_despatch_signals(my_perl);

=> asm:

x86 thr: my_perl in ebx, my_perl->Iop in eax (ebx+4)

prolog: my_perl is passed on the stack, but force 16-byte alignment for the stack. core2/opteron just love that.

    8D 4C 24 04          leal 4(%esp), %ecx
    83 E4 F0             andl $-16, %esp
    FF 71 FC             pushl -4(%ecx)

call:

    89 1c 24             mov %ebx,(%esp)        ; push my_perl
    FF 25 xx xx xx xx    jmp $PL_op->op_ppaddr  ; call far 0x5214a4c5
    89 43 04             mov %eax,0x4(%ebx)     ; save new PL_op into my_perl

PERL_ASYNC_CHECK:

    movl %ebx, (%esi)               ;891e
    movl %eax, 4(%esi)              ;894604
    movl 900(%esi), %eax            ;8b8684030000
    testl %eax, %eax                ;85C0
    je +8                           ;7408
    movl %esi, (%esp)               ;893424
    call _Perl_despatch_signals     ;FF25xxxxxxxx

after calling Perl_despatch_signals, restore my_perl into ebx and push for next:

    83 c4 10             add $0x10,%esp
    83 ec 0c             sub $0xc,%esp
    31 db                xor %ebx,%ebx
    53                   push %ebx

epilog after final Perl_despatch_signals:

    83 c4 10             add $0x10,%esp
    8d 65 f8             lea -0x8(%ebp),%esp
    59                   pop %ecx
    5b                   pop %ebx
    5d                   pop %ebp
    8d 61 fc             lea -0x4(%ecx),%esp
    c3                   ret

not-threaded:

    PL_op = PL_op->op_ppaddr();
    if (PL_sig_pending) Perl_despatch_signals();

PL_op in eax, PL_sig_pending in ebx

Note: It looks like gcc can inline some pp calls better than the jitter. enter/nextstate/leave are inlined pretty well.
prolog:

    55                   push %ebp
    89 e5                mov %esp,%ebp
    83 ec 08             sub $0x8,%esp

call:

    FF 25 xx xx xx xx    jmp $PL_op->op_ppaddr  ; call far
    a3 xx xx xx xx       mov %eax,$PL_op        ; 0x4061c4

PERL_ASYNC_CHECK:

    a1 xx xx xx xx       mov $PL_sig_pending,%eax
    85 c0                test %eax,%eax
    74 05                je +5
    e8 xx xx xx xx       call Perl_despatch_signals

epilog:

    b8 00 00 00 00       mov $0x0,%eax
    c9                   leave
    c3                   ret

problems

Far calls to the pp ops break code prefetching, so we have to inline as much as possible, similar to B::CC. Only nextstate, enter, and skipping null ops are easy to jit. The best jitter would be a B::CC-to-assembler backend, but this is hard to get right.

porting

I created the asm with cc_main and cc_main_nt; see the Makefile for the objdump and cc_harness rules for gcc assembly.

asm links

    http://www.lxhp.in-berlin.de/lhplinks.html
    http://blogs.msdn.com/freik/archive/2005/03/17/398200.aspx
    http://msdn.microsoft.com/en-us/library/7kcdt6fy.aspx
    http://asm.sourceforge.net//resources.html
    http://www.intel.com/design/itanium/manuals/iiasdmanual.htm
    http://www.heyrick.co.uk/assembler/qfinder.html

HL jitters

    parrot
    luajit
    psyco / pypy
    tracemonkey
    ruby
    clisp

jit libs

    lightning - c macros only
    libjit    - c lib
    llvm      - compiler framework + lib
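
To make the core idea from the C pseudocode concrete, here is a minimal C model that contrasts a runops-style loop with a sequence laid out in "exec" order. It is only a sketch: the structs, the toy pp functions, and the names interp, linearize, run_linear and demo are all hypothetical and are not perl's real data structures. It also does not generate machine code; it only models the layout order, and it handles straight-line op chains only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the interpreter state; NOT perl's real structs. */
typedef struct op OP;
typedef struct interp {
    OP  *cur_op;        /* stands in for PL_op / my_perl->Iop */
    int  sig_pending;   /* stands in for PL_sig_pending */
    int  signals_seen;  /* how often signals were dispatched */
    long acc;           /* toy result modified by the ops */
} interp;

typedef OP *(*ppaddr_t)(interp *);
struct op { ppaddr_t op_ppaddr; OP *op_next; };

/* Stand-in for Perl_despatch_signals: count the call, clear the flag. */
static void despatch_signals(interp *ip) { ip->signals_seen++; ip->sig_pending = 0; }

/* Toy pp functions: each does some work and returns the next op. */
static OP *pp_inc(interp *ip)    { ip->acc += 1;        return ip->cur_op->op_next; }
static OP *pp_double(interp *ip) { ip->acc *= 2;        return ip->cur_op->op_next; }
static OP *pp_raise(interp *ip)  { ip->sig_pending = 1; return ip->cur_op->op_next; }

/* Runops-style loop: chase op_next pointers, one indirect call per op. */
static void runops_loop(interp *ip, OP *start) {
    ip->cur_op = start;
    while ((ip->cur_op = ip->cur_op->op_ppaddr(ip)) != NULL) {
        if (ip->sig_pending) despatch_signals(ip);   /* PERL_ASYNC_CHECK */
    }
}

/* Flat copy of the op chain in execution order (straight-line code only). */
#define MAX_OPS 16
typedef struct { ppaddr_t fn[MAX_OPS]; OP *op[MAX_OPS]; size_t n; } linear_code;

static void linearize(OP *start, linear_code *lc) {
    lc->n = 0;
    for (OP *o = start; o != NULL; o = o->op_next) {
        assert(lc->n < MAX_OPS);
        lc->op[lc->n] = o;
        lc->fn[lc->n] = o->op_ppaddr;
        lc->n++;
    }
}

/* Run the calls back to back in exec order, with the check after each call. */
static void run_linear(interp *ip, const linear_code *lc) {
    for (size_t i = 0; i < lc->n; i++) {
        ip->cur_op = lc->op[i];
        ip->cur_op = lc->fn[i](ip);                  /* PL_op = op_ppaddr() */
        if (ip->sig_pending) despatch_signals(ip);
    }
}

/* Build four ops whose memory order differs from their execution order. */
static long demo(int use_linear, int *signals_seen) {
    OP ops[4];
    /* execution order: ops[2] -> ops[0] -> ops[3] -> ops[1] */
    ops[2] = (OP){ pp_inc,    &ops[0] };
    ops[0] = (OP){ pp_raise,  &ops[3] };
    ops[3] = (OP){ pp_inc,    &ops[1] };
    ops[1] = (OP){ pp_double, NULL    };
    interp ip = { NULL, 0, 0, 0 };
    if (use_linear) {
        linear_code lc;
        linearize(&ops[2], &lc);
        run_linear(&ip, &lc);
    } else {
        runops_loop(&ip, &ops[2]);
    }
    *signals_seen = ip.signals_seen;
    return ip.acc;
}
```

Calling demo(0, &n) and demo(1, &n) should give the same accumulator value and the same number of signal dispatches; only the order in which the call targets are stored differs. A real jitter would go further and emit the calls themselves as consecutive machine instructions, which this model does not attempt.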