Slide 21 of 55

lonelymoon

From my understanding, the purpose of this program is to calculate the inner product of vec1 and vec2: the sum of vec1[i] * vec2[i]. First, Tile1 and Tile2 each load a tileSize-element portion of vec1 and vec2. Then Reduce(accum) calculates that tile's share of the inner product: the sum of Tile1[i] * Tile2[i]. Each partial sum is accumulated into 'output'. To increase the speed, we might add parallelism with "par" or increase tileSize.

As an additional comment, I think Reduce(x)(A by B) can be understood as for(int i=0; i<A; i+=B).
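
To make that reading concrete, here is a minimal software sketch in plain Python (not Spatial). The names reduce_range and tiled_dot are made up for illustration, and it assumes the combine step of the Reduce is addition, which the comment above implies but does not state.

```python
def reduce_range(accum, end, step, body):
    """Loose analogue of Reduce(accum)(end by step): a for loop that adds up body(i)."""
    for i in range(0, end, step):  # same as for (int i = 0; i < end; i += step)
        accum += body(i)
    return accum


def tiled_dot(vec1, vec2, tile_size):
    """Inner product computed one tile at a time, accumulating into output."""
    output = 0
    for start in range(0, len(vec1), tile_size):
        # Stand-ins for loading Tile1 and Tile2 from vec1 and vec2.
        tile1 = vec1[start:start + tile_size]
        tile2 = vec2[start:start + tile_size]
        # Partial inner product over the current tile.
        partial = reduce_range(0, len(tile1), 1, lambda i: tile1[i] * tile2[i])
        output += partial
    return output
```

Calling tiled_dot(vec1, vec2, tile_size) with different tile sizes should give the same result as a plain dot product; only the grouping of the work changes.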

nanoxh

Does "par 2" mean that 2 operations are done in parallel here? In the last slide, it was "par 1". Is that equivalent to no parallelism, i.e. just "tileSize by 1"?

nickbowman

@nanoxh Yes, my understanding matches what you've described: the par 2 here causes the circuit for the inner controller to be duplicated, so that two copies run on two different data inputs at the same time. Specifying par 1 is basically the same as saying that no data parallelism should happen through that controller.
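
As a rough software analogy only (plain Python, not Spatial, and not how the compiler actually lays out the hardware), the effect of the par factor on the inner reduction can be pictured like this. The function name reduce_with_par and the step counter are made up for illustration; the assumption is that each step handles par elements at once and folds their products into the accumulator.

```python
def reduce_with_par(tile1, tile2, par):
    """Model an inner reduction that consumes `par` elements per step."""
    accum = 0
    steps = 0
    for base in range(0, len(tile1), par):
        # In hardware, these `par` multiplies would be separate duplicated units
        # working at the same time; here they are just computed in a list.
        end = min(base + par, len(tile1))
        products = [tile1[i] * tile2[i] for i in range(base, end)]
        accum += sum(products)
        steps += 1
    return accum, steps
```

With par set to 1 the loop takes one step per element, and with par set to 2 it takes half as many steps for the same result, which is the sense in which par 1 means no parallelism.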

nanoxh

@nickbowman Thanks for confirming!
