ufxela

Could one work around the ISPC task abstraction by using std::thread to call regular ISPC code? If so, what's the purpose of also including the task abstraction? Ease of use, or is there something more fundamental?

wzz

It seems the task abstraction is much simpler code-wise, but I also wonder whether the two approaches end up with meaningfully different implementations.

lblankem

I wonder why it is difficult to have the task abstraction work under the hood when the programmer has already indicated which parts of the program are independent.

tspint

The launch keyword in ISPC provides a way for a "task" to execute asynchronously. These tasks can run concurrently on multiple cores, and there is no guarantee about the order in which they will execute, especially if more tasks are launched than there are available threads. A good way of thinking about a task is as a "thing to do," and threads as workers that "do the things." If there are N available threads, then at any given time at most N tasks can be processed at once. In general, you should launch more tasks than there are threads to ensure good load balancing, but not so many that the overhead of scheduling the tasks dominates.

Further reading: https://ispc.github.io/ispc.html#tasking-model
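
To make this concrete, here is a minimal ISPC sketch of the pattern described above (the kernel, its parameters, and the task count of 32 are hypothetical, not taken from the lecture):

```
// Hypothetical task: each task shades a contiguous block of rows.
task void shade_rows(uniform float image[], uniform int width,
                     uniform int height) {
    uniform int rowsPerTask = height / taskCount;
    uniform int yStart = taskIndex * rowsPerTask;
    // The last task picks up any leftover rows.
    uniform int yEnd = (taskIndex == taskCount - 1) ? height
                                                    : yStart + rowsPerTask;
    foreach (y = yStart ... yEnd, x = 0 ... width) {
        image[y * width + x] = 0.5;  // placeholder per-pixel work
    }
}

export void shade_image(uniform float image[], uniform int width,
                        uniform int height) {
    // Launch more tasks than cores; the runtime maps them onto its
    // worker threads in no guaranteed order.
    launch[32] shade_rows(image, width, height);
    sync;  // block here until every launched task has completed
}
```

Note that ISPC also inserts an implicit sync at the end of any function that launches tasks, so the explicit sync above only matters if you want to do other work between the launch and the barrier. A program that uses launch must also be linked against a task-system implementation, such as the tasksys.cpp shipped with the ISPC examples.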

haofeng

Tasks are independent pieces of work that can be executed on different cores. Unlike threads, they do not carry their own execution context; they are just units of work. The ISPC task system takes the launched tasks and maps them onto however many worker threads it decides to create. Breaking a large amount of work into more tasks than there are cores results in better load balancing. However, one should also weigh the overhead of task scheduling when deciding how many tasks to launch, as sketched below.
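
A sketch of that granularity trade-off, assuming a hypothetical array-sum kernel where numTasks is the tuning knob:

```
// Hypothetical reduction: each task sums a 1/taskCount slice of a[].
task void sum_chunk(uniform float a[], uniform float partial[],
                    uniform int n) {
    uniform int chunk = n / taskCount;
    uniform int begin = taskIndex * chunk;
    uniform int end = (taskIndex == taskCount - 1) ? n : begin + chunk;
    float acc = 0;
    foreach (i = begin ... end) {
        acc += a[i];
    }
    // Collapse the gang's per-lane sums into one value per task.
    partial[taskIndex] = reduce_add(acc);
}

export void sum_array(uniform float a[], uniform float partial[],
                      uniform int n, uniform int numTasks) {
    // numTasks is the knob: a handful of tasks per core usually
    // balances load well without drowning in scheduling overhead.
    launch[numTasks] sum_chunk(a, partial, n);
    sync;
}
```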

donquixote

On the first programming assignment, a question asked why ISPC introduced the task abstraction instead of automatically splitting the iterations of a foreach loop across multiple cores. One answer I would give is that programmer control over multicore execution matters: a separate abstraction for tasks lets the programmer choose things like the number of tasks to launch, and whether/when to force synchronization with the sync keyword described here. A foreach loop, on the other hand, parallelizes across SIMD lanes within one thread on one core and doesn't really need parallelization or synchronization options exposed to the programmer; control over the target SIMD width is left to a compiler flag, so a foreach loop reads much like a straightforward for loop.
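
A sketch of that contrast, with hypothetical kernels (the target flag named in the comment is just one example):

```
// foreach alone: SIMD lanes on a single core; the compiler picks the
// vector width from its target flag (e.g. --target=avx2-i32x8).
export void scale_simd_only(uniform float v[], uniform int n,
                            uniform float s) {
    foreach (i = 0 ... n) {
        v[i] *= s;
    }
}

// launch + foreach: tasks span cores, and each task still vectorizes
// its own slice with foreach.
task void scale_chunk(uniform float v[], uniform int n, uniform float s) {
    uniform int chunk = n / taskCount;
    uniform int begin = taskIndex * chunk;
    uniform int end = (taskIndex == taskCount - 1) ? n : begin + chunk;
    foreach (i = begin ... end) {
        v[i] *= s;
    }
}

export void scale_multicore(uniform float v[], uniform int n,
                            uniform float s) {
    launch[16] scale_chunk(v, n, s);  // task count chosen by programmer
    sync;                             // programmer-controlled barrier
}
```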

fizzbuzz

I like the term "Gangs": it really hammers home the independent-threads-of-control model, which is both a really useful framing and not how I intuitively think about SIMD instructions.
