Re: [openrisc] Re: OR1000 16 bit instruction set
> I have worked on several CPUs that support "duality" of opcode
> lengths (i.e. supporting both 16bit and 32bit ISAs), both ARM and
> MIPs. In theory 16bit instructions are great, in practice they
> usually turn out to be a big fat pain in the rear side. To me it's
> another "me too" feature that all the embedded CPUs had to support to
> get customer attention.
Right. I've never worked with Thumb code, but IMHO it is not worth
using unless you don't have a cache.
> However, the issues in the tools with 16bit instructions have never
> been properly addressed. The idea I had while considering this some
> time ago is adding a "multi op" instruction that has a 4 bit
> pre-amble and can do 3 register to register arithmetic operations.
> It couldn't access all the registers clearly, but the way it would
> work is you would use the "normal" instructions to load the
> arithmetic registers (say 0-3), and then use the "multop" instruction
> to do the heavy lifting of those calculations. This gets rid of all
> the issues involved with having multiple ISAs and gives you most of
> the advantages of 16bit instructions in terms of code size based on
> my calculations.
If you are considering a new architecture, you should also consider
the stack model and software conventions. When I was studying
different models, it turned out that software really depends on classical
RISC/CISC architectures. If you want to make your processor
useful, you must make it easy for the software guys.
I did not quite understand your ISA, but my impression is that you
are trying to make a nearly 0-operand CPU. You should also consider
how your loops would be optimized.
The 'theoretical' code minimum is about 20% of OR32 code size, but this
would involve a real compressor (e.g. gzip) with long latencies. If you want
a solution like ARM Thumb, you end up with 50% of OR32 code size.
It is really hard to go below this without significantly reducing
performance (e.g. a stack computer at 30% of OR32 code size).
BTW, in case anybody is interested, here are some statistics gathered
recently on uClinux code (for or32).
The table below shows how long immediates should be. The left column is the
number of bits, the middle column shows the cumulative coverage in % for that
number of bits, and the right column lists the number of instructions whose
immediate needs exactly that many bits.
'/' separates arithmetic and jump instructions (arithm / jump).
(All immediates are treated as signed.)
bits   coverage        instructions
       arithm / jump   arithm / jump
 0:      0% /  14%         0 / 11443
 1:      0% /  15%         7 /  1428
 2:      1% /  17%       657 /  1764
 3:      3% /  19%       838 /  1367
 4:      7% /  22%      1760 /  2568
 5:     10% /  26%      1475 /  3594
 6:     13% /  31%      1153 /  3787
 7:     15% /  34%       893 /  2777
 8:     16% /  36%       682 /  2036
 9:     17% /  37%       496 /   861
10:     19% /  38%       579 /   855
11:     22% /  39%      1329 /   380
12:     24% /  39%       959 /   136
13:     26% /  39%       817 /   334
14:     28% /  40%      1034 /    86
15:     31% /  40%      1212 /   485
16:     34% /  41%      1423 /   930
17:     37% /  42%      1633 /  1016
18:     71% /  66%     15499 / 20144
19:     72% /  67%       320 /   334
20:     72% /  67%        53 /   219
21:     75% /  67%      1362 /   407
22:     75% /  68%        71 /   157
23:     75% /  68%        61 /    29
24:     77% /  68%       728 /   136
25:     81% /  74%      1850 /  5156
26:     81% /  74%        49 /    66
27:     82% /  75%       188 /   783
28:     89% /  92%      3449 / 14466
29:     89% /  93%        76 /   376
30:     97% /  93%      3611 /   526
31:     98% /  95%       262 /  1387
32:    100% / 100%       996 /  4381
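For anyone who wants to re-derive the coverage column or pick an immediate width for a target coverage, here is a minimal Python sketch. It assumes the right-hand counts are per-width (instructions needing exactly that many bits) and that coverage is their running sum divided by the total, which matches the arithmetic figures above. The names ARITH_COUNTS, cumulative_coverage and min_bits_for are made up for this illustration; this is not the script that produced the table.

```python
from itertools import accumulate

# Arithmetic-instruction counts copied from the table above;
# the list index is the number of immediate bits (0..32).
ARITH_COUNTS = [
    0, 7, 657, 838, 1760, 1475, 1153, 893, 682, 496, 579,
    1329, 959, 817, 1034, 1212, 1423, 1633, 15499, 320, 53,
    1362, 71, 61, 728, 1850, 49, 188, 3449, 76, 3611, 262, 996,
]


def cumulative_coverage(counts):
    """Return the percentage of instructions covered by each bit width."""
    total = sum(counts)
    return [100.0 * running / total for running in accumulate(counts)]


def min_bits_for(counts, target_percent):
    """Return the smallest bit width whose coverage reaches target_percent."""
    for bits, percent in enumerate(cumulative_coverage(counts)):
        if percent >= target_percent:
            return bits
    return len(counts) - 1  # fall back to the widest immediate


if __name__ == "__main__":
    coverage = cumulative_coverage(ARITH_COUNTS)
    for bits in (8, 16, 17, 18):
        print(f"{bits:2d} bits: {coverage[bits]:.0f}% of arithmetic immediates")
    print("bits needed for 70% coverage:", min_bits_for(ARITH_COUNTS, 70))
```

The same two functions can be applied to the jump column by substituting its counts. The large step between 17 and 18 bits shows up directly in the output.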
best regards,
Marko