Re: [oc] Beyond Transmeta...
And have you ever seen the website http://www.alphaprocessors.com ?
----- Original Message -----
From: Suboner@a...
To: cores@o...
Date: Wed, 7 Jun 2000 04:18:49 EDT
Subject: Re: [oc] Beyond Transmeta...
>
>
> > Yes, that's true - I haven't looked at it this way...
> > So if I understand you correctly, you are trying to calculate each
> > 'bit plane' as fast as possible (from its point of view). I suppose
> > this could theoretically be done faster than calculating 32-bit
> > operands, even for sequential programs. But I am worried that
> > programs would be extremely large.
>
> Yeah, that is probably one big issue, except when the network is
> configured to work like an x86 or RISC processor. Then what you have
> is a large chunk of memory being used to create the necessary
> hardware in the network itself, so that it can run the software in a
> normal serial manner. This would decrease memory usage but also
> decrease performance (sound familiar? :) the old fight between
> memory and performance).
>
> > In how many cycles can you execute this? How many instructions do
> > you need?
> > c = add32(a,b)
> > e = add32(c,d)
>
> Well, in a network manner with a minimum of 8 1-bit processors, it
> would be like 32 clocks; fewer processors increase that. But the
> minimum required could be only 1 clock (depending on which bits
> change) if only the first bit changes, or 2 clocks if only 1 bit
> changes. Of course, if the bit causes a turnover with many carries,
> that will increase it. The number of instructions is 1 for the first
> bit, 3 for the second bit, and 4 for every following bit. Of course
> there are other ways to arrange a network; I believe that one is the
> most parallel I could create. I may be able to create an even more
> parallel one, but it may not gain much in performance. It would be a
> kind of temporal difference, in that the initial pass could have 63
> instructions to be done in parallel (2 for every bit except the
> last), and every pass after that is 2 instructions. There are many
> ways to configure a network of bits, and some may have more benefits
> than others.
>
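
To make the carry argument above concrete, here is a minimal sketch in
Python. It is not Leyland's network, only an ordinary ripple-carry
adder built from 1-bit steps. The function name add32 comes from the
question; the width parameter and the "longest carry chain" measure
are my own assumptions, used as a rough stand-in for how many clocks a
network that only reacts to changing bits might need.

```python
def add32(a, b, width=32):
    """Ripple-carry add from 1-bit steps.

    Returns (sum modulo 2**width, longest carry chain). The chain
    length is an assumed proxy for clock count, not a measured figure.
    """
    carry = 0
    result = 0
    chain = 0      # length of the carry chain currently running
    longest = 0    # longest carry chain seen so far
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        propagate = x ^ y   # this bit passes an incoming carry along
        generate = x & y    # this bit creates a carry by itself
        result |= (propagate ^ carry) << i
        if generate:
            chain = 1       # a new chain starts here
        elif propagate and carry:
            chain += 1      # the incoming carry ripples one bit further
        else:
            chain = 0       # no carry leaves this bit
        longest = max(longest, chain)
        carry = generate | (propagate & carry)
    return result, longest


# The two dependent additions from the question, with hypothetical inputs.
a, b, d = 5, 2, 1
c, chain_c = add32(a, b)
e, chain_e = add32(c, d)
```

With these inputs the first add has no carries at all, while the
second (7 + 1) ripples a carry through three bits, which matches the
point that the cost depends on which bits turn over.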
> > BTW: I am not such a pessimistic guy trying to criticise
> > everything. When we were developing or2k such 'comments' were very
> > welcome.
>
> Well, I was not so sure you were... Actually this conversation has
> been good; there are some things I did not realize about this that
> were brought up through this discussion, like the multiprocessor way
> of viewing the network. If it were not for the questions, I would not
> have tried to look at things in different ways.
>
> > I suppose you can link data back to the loop start, can't you? In
> > parallel you can detect whether the loop should be finished. Of
> > course this isn't a normal equation anymore...
> > But otherwise I don't see a problem here.
>
> Oh, yeah... ha... I did not think of that. :)
>
> > Yes, that is true for basic blocks. But not for functions. The
> > compiler would create a separate network for each function (there
> > are too many problems otherwise). You cannot link them dynamically
> > together. I won't go into a detailed explanation; even when
> > detecting parallelism between functions there are certain problems,
> > unless the program (the meaning of the program) itself is modified.
> > ILP here stands for inductive logic programming.
>
> What I'm getting at, though, is not to modify the program's source
> code, but to compile it into a network and to shift the network
> around into a more parallel program. The way I think it would work
> is to shift parts of the network which compose a function into other
> functions, so that as you shift them around they sort of lose
> themselves as a discrete and separate function and instead become an
> integrated part.
>
> I'm not exactly sure what you mean by creating separate networks for
> each function. You could mean creating a separate network for each
> function call/usage, which I think you do. For that, I would say you
> do not necessarily have to do it that way: you could create a
> network that acts like a mini high-level processor (high-level
> instructions), which would reduce the redundant creation of function
> networks by allowing the function to be called as a high-level
> instruction. Instead of being directly connected to each other, they
> would act like many mini processors connected together. If you want
> to think about it differently, try thinking of it as though you
> could either have billions of 1-bit processors working
> simultaneously (virtually of course, since they only work on bits
> that change), or a few 8-bit processors (faked by the network) that
> do various tasks within the system, or even fewer 32-bit processors
> doing various tasks, or one 64-bit processor that does everything
> (like a normal CPU), the latter taking up the least amount of
> memory, while the 1-bit network takes up the most. It's a really
> scalable environment that allows you to create any kind of processor
> that is necessary; it will turn your functions into a processor if
> it needs to. It will basically balance between consuming a lot of
> memory and resources and taking very little memory or resources. If
> it were not for this discussion I don't think I would have realized
> that.
>
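
The 1-bit / 8-bit / 32-bit / 64-bit picture above can be sketched as
the same addition carried out by "processors" of different widths. This
is only an illustration under my own assumptions: add_sliced and its
parameters are hypothetical names, each slice is handled one after the
other, and the step count says nothing about the memory cost discussed
above.

```python
def add_sliced(a, b, slice_bits, width=64):
    """Add two width-bit values using slice_bits-wide adders.

    Returns (sum modulo 2**width, number of slice steps). The slices
    are chained through a carry, lowest slice first.
    """
    if slice_bits <= 0 or width % slice_bits != 0:
        raise ValueError("slice_bits must evenly divide width")
    mask = (1 << slice_bits) - 1
    carry = 0
    result = 0
    steps = 0
    for shift in range(0, width, slice_bits):
        # One slice-wide addition, including the carry from the slice below.
        total = ((a >> shift) & mask) + ((b >> shift) & mask) + carry
        result |= (total & mask) << shift
        carry = total >> slice_bits
        steps += 1
    return result, steps


# Same operands, four different "processor" widths.
for bits in (1, 8, 32, 64):
    total, steps = add_sliced(0x0123456789ABCDEF, 0x1111111111111111, bits)
    print(bits, hex(total), steps)
```

Every width gives the same sum; only the number of steps changes, from
64 one-bit steps down to a single 64-bit step.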
> Leyland Needham
>