Previous | Next --- Slide 8 of 63

kevtan

What happens when programCount doesn't exactly line up with the width of the vector instructions that the CPU supports?

nickbowman

I think in this case you would have to be a little more careful with the memory access instructions in the ispc_sinx implementation to make sure that you're not trying to load memory beyond the bounds of x. However, as long as that's taken care of I think the program would still work, with just lower-than-usual vector lane utilization for the last set of vector instructions instructions to handle the last bit of the array that didn't divide nicely.

wanze

When we say for example "8-wide AVX2", the 8-wide basically means the ProgramCount=8, and it can have at most 8 independent instructions executed in parallel right?

user1234

@wanze I agree with your first statement that 8-wide means ProgramCount=8. However, they generate SIMD vector instructions so that there should be the same one instruction for all 8 instances executed in parallel.

zecheng

@user1234 Agree. The instruction might still from one instruction stream but only the data parallel.

Please log in to leave a comment.