Slide 9 of 63
harrymellsop

Is there a reason here why programIndex is not included in the loop condition? I.e. for (uniform int i=programIndex; i < N; i += programCount) { ... } ? Does the for loop itself spawn the members of the gang, or is this just a stylistic choice?

cyb

@harrymellsop I think it's a stylistic choice. But we would also need to change i to a non-uniform variable, since it would then hold a different value in each program instance.

wooloo

"Does the for loop itself spawn the members of the gang, or is this just a stylistic choice?"

I think the ispc execution model is such that the program instances start when ispc_sinx is called. Each program instance in the gang runs the for loop.

kayvonf

@wooloo. You are correct. The ISPC instances are created at the call to ispc_sinx and then exist until the function's return. All instances execute the body of the function. Therefore, all the code you see on the right-hand side of the slide is executed serially by each of the programCount program instances in a gang.
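To make the execution model above concrete, here is a minimal plain-C sketch that emulates a gang. It is not ISPC and not the code on the slide: PROGRAM_COUNT, instance_body, and gang_call are hypothetical names, the gang width of 8 is an assumption, and the loop body (doubling each element) is a stand-in for the real computation. A real gang runs its instances in lockstep. This sketch runs them one after another, which gives the same result only because each instance writes a disjoint set of elements.

```c
enum { PROGRAM_COUNT = 8 };  /* assumed gang width (hypothetical value) */

/* Stand-in for the function body: every instance runs this same code,
   and only programIndex differs from one instance to the next. */
static void instance_body(int programIndex, int N, const float *x, float *result)
{
    for (int i = 0; i < N; i += PROGRAM_COUNT) {  /* i is identical in every instance */
        int idx = i + programIndex;               /* idx differs per instance */
        if (idx < N)                              /* guard added here for N not a multiple of the gang width */
            result[idx] = 2.0f * x[idx];
    }
}

/* Emulates the call site: the gang exists from entry to return,
   and all PROGRAM_COUNT instances execute the whole body. */
static void gang_call(int N, const float *x, float *result)
{
    for (int programIndex = 0; programIndex < PROGRAM_COUNT; programIndex++)
        instance_body(programIndex, N, x, result);
}
```

Calling gang_call(N, x, result) once fills every element of result, even though no single instance touches more than every PROGRAM_COUNT-th element.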

kayvonf

@harrymellsop. There's actually a good reason. As written, all the loop-control expressions (including the check of the loop termination condition) are independent of programIndex, so the loop control variables hold the same values in every instance. Notice that all these expressions operate on uniform values, which means that a good implementation of the ISPC compiler can implement them with scalar instructions instead of vector instructions. (Recall that modern CPUs can execute both scalar and SIMD vector instructions.) Your suggestion would make i hold a different value in each program instance, so the control of the loop would have to be implemented using vector ops, much like your implementation of Assignment 1 program 2.
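The difference between the two loop forms can be sketched in plain C by emulating a gang in lockstep. This is an illustration under stated assumptions, not what the ISPC compiler emits: PROGRAM_COUNT, visit_uniform, and visit_varying are hypothetical names, the gang width of 8 is assumed, and the returned comparison counts only suggest how much loop-control work is shared versus repeated per lane.

```c
enum { PROGRAM_COUNT = 8 };  /* assumed gang width (hypothetical value) */

/* One shared counter i for the whole gang: the termination test is a
   single scalar comparison per gang iteration. Returns the number of tests. */
static int visit_uniform(int N, int *visits)
{
    int scalar_tests = 0;
    for (int i = 0; ; i += PROGRAM_COUNT) {
        scalar_tests++;                       /* one test covers every lane */
        if (!(i < N)) break;
        for (int lane = 0; lane < PROGRAM_COUNT; lane++) {
            int idx = i + lane;               /* only the index is per-lane */
            if (idx < N) visits[idx]++;
        }
    }
    return scalar_tests;
}

/* Each lane keeps its own i starting at its lane number: the termination
   test becomes one comparison per lane, and the results form a mask. */
static int visit_varying(int N, int *visits)
{
    int i[PROGRAM_COUNT];
    int active[PROGRAM_COUNT];
    int lane_tests = 0;

    for (int lane = 0; lane < PROGRAM_COUNT; lane++) i[lane] = lane;

    for (;;) {
        int any_active = 0;
        for (int lane = 0; lane < PROGRAM_COUNT; lane++) {
            active[lane] = i[lane] < N;       /* per-lane comparison */
            lane_tests++;
            any_active |= active[lane];
        }
        if (!any_active) break;               /* loop ends when every lane is done */
        for (int lane = 0; lane < PROGRAM_COUNT; lane++) {
            if (active[lane]) visits[i[lane]]++;
            i[lane] += PROGRAM_COUNT;         /* per-lane increment */
        }
    }
    return lane_tests;
}
```

Both functions visit each index below N exactly once. They differ only in how the loop control is computed, which is the distinction between scalar and vector loop control drawn above.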

kevtan

Is it convention to put ISPC subroutines in their own files with the .ispc file extension?

zecheng

So, programIndex will always start from 0 and end with programCount - 1?

a7hu

@zecheng Yes, programIndex always ranges from 0 to programCount - 1. I believe this maps to the lane ID of a SIMD instruction. For an 8-wide SIMD instruction, the lane IDs are lane 0 ... lane 7. See the _mm256_add_ps intrinsics reference as an example: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=AVX&expand=5669,136,136&text=_add_ps
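As a rough picture of that mapping, the sketch below emulates an 8-wide, lane-by-lane add in plain C. It does not call the real _mm256_add_ps intrinsic: LANES, add_lanes, and fill_program_index are hypothetical names chosen for illustration. Treating programIndex as the lane number is the assumption made in this reply, not something the sketch proves about any particular ISPC target.

```c
enum { LANES = 8 };  /* assumed SIMD width, matching the 8-wide example */

/* Plain-C emulation of a lane-wise add: lane k of dst depends only on
   lane k of a and lane k of b. */
static void add_lanes(const float a[LANES], const float b[LANES], float dst[LANES])
{
    for (int lane = 0; lane < LANES; lane++)
        dst[lane] = a[lane] + b[lane];
}

/* Under the assumed mapping, the gang's programIndex values are simply
   the lane numbers 0 .. LANES - 1. */
static void fill_program_index(int programIndex[LANES])
{
    for (int lane = 0; lane < LANES; lane++)
        programIndex[lane] = lane;
}
```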
