big-compute-energy

It seems like the big obstacle to using all the ALUs is that we can't control the mix of trues and falses: we could get all true or all false, which is optimal, or anything in between. In theory, though, we could first compute the conditional for each piece of data, then place every element that yields true in one bucket and every element that yields false in another (the buckets could presumably be kept in low-latency on-chip storage). Once a bucket holds 8 values (or however many ALUs there are), we pull those values from the bucket into registers and have the 8 ALUs process them in parallel. Every batch then follows a single branch, which would guarantee 100% ALU utilization.
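
To make the idea concrete, here is a rough sketch of the bucketing scheme as a plain Python simulation. Nothing in it comes from the lecture: `SIMD_WIDTH`, `bucketed_batches`, and `utilization` are names made up for illustration, and "utilization" is simply assumed to mean the fraction of ALU lanes that do useful work across all dispatched batches. It ignores the cost of evaluating the conditional and of moving data in and out of the buckets.

```python
from collections import deque

SIMD_WIDTH = 8  # assumed number of ALUs (lanes) per batch


def bucketed_batches(data, predicate, width=SIMD_WIDTH):
    """Group elements by branch outcome and emit a batch whenever a bucket fills.

    Returns a list of (branch, items) pairs, where every item in a batch
    took the same branch.
    """
    buckets = {True: deque(), False: deque()}
    batches = []
    for x in data:
        branch = bool(predicate(x))
        bucket = buckets[branch]
        bucket.append(x)
        if len(bucket) == width:
            # Bucket is full: dispatch it as one batch with no divergence.
            batches.append((branch, list(bucket)))
            bucket.clear()
    # Whatever is left at the end goes out as partially filled batches.
    for branch in (True, False):
        if buckets[branch]:
            batches.append((branch, list(buckets[branch])))
    return batches


def utilization(batches, width=SIMD_WIDTH):
    """Fraction of lanes doing useful work across all dispatched batches."""
    if not batches:
        return 1.0
    used = sum(len(items) for _, items in batches)
    return used / (len(batches) * width)


if __name__ == "__main__":
    batches = bucketed_batches(range(32), lambda x: x % 2 == 0)
    print(len(batches), utilization(batches))
```

One thing the sketch makes visible: full batches always use every lane, but whatever is left in the buckets when the input runs out has to be flushed as partial batches, so the overall figure only reaches 100% when each bucket's count is a multiple of the width.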