Self-test: partial starts as a zero-vector and x is loaded vector by vector. The accumulation of each number within a vector is independent - this is the key why it compiles. In this way, partial is a vector of several partial sums. After the looping, the implementation gets the sum of the partial sums.
kayvonf
@orz. Correct! At the end of the loop, the element in the ith position in 8-wide vector partial is the partial sum of input array elements x[i] + x[i+8] + x[i+16]...
Self-test:
partial
starts as a zero-vector andx
is loaded vector by vector. The accumulation of each number within a vector is independent - this is the key why it compiles. In this way,partial
is a vector of several partial sums. After the looping, the implementation gets the sum of the partial sums.