tspint

ISPC tasks get dynamically assigned to threads that have completed their current tasks. Does this mean that there is never any thread idling as long as there are incomplete tasks (until maybe the very end)?

Also, the ISPC User Guide says that if there are too many tasks, then the overhead of scheduling them dominates the computation. Where does this overhead occur, since it appears that the threads simply need to look at the next task pointer?

haiyuem

1) Yes, threads will actively look for work to do as long as there are still tasks left. 2) Examples of overhead: (a) a thread has to indicate when it becomes idle to let the control unit know; (b) the control unit needs to make sure a task is assigned to only one thread, not to multiple threads at the same time.
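To make point (b) concrete, here is a minimal C++ sketch of dynamic assignment through a shared atomic counter standing in for the "next task ptr". It is only an illustration, not ISPC's actual runtime: `run_dynamic` and `times_run` are hypothetical names, and the task body is a placeholder.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper (not part of ISPC): runs num_tasks tasks on
// num_threads workers and returns how many times each task was executed.
std::vector<int> run_dynamic(int num_tasks, int num_threads) {
    assert(num_tasks >= 0 && num_threads > 0);
    std::vector<int> times_run(num_tasks, 0);
    std::atomic<int> next_task{0};  // plays the role of the "next task ptr"

    auto worker = [&]() {
        while (true) {
            // fetch_add returns each index to exactly one caller, so no
            // task can be claimed by two threads at the same time.
            int id = next_task.fetch_add(1);
            if (id >= num_tasks) break;  // no work left: this worker exits
            times_run[id] += 1;          // stand-in for the real task body
        }
    };

    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) threads.emplace_back(worker);
    for (auto& th : threads) th.join();
    return times_run;
}
```

Even in this stripped-down form, every task costs one atomic read-modify-write on a cache line shared by all workers, so very small tasks spend a large fraction of their time just claiming work.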

wzz

To follow up on the overhead, threads will probably be working on tasks that are not contiguous in memory, so there's some cache cost to unnecessarily partitioning the problem space into too many tasks.

pslui88

To add to the (great) discussion of ISPC task overhead: even though there is overhead, it is still significantly less than the overhead of typical multi-threading. With multi-threading, if the user wants to run more software threads than there are hardware threads available, the kernel still wants to maintain the illusion of all threads running in parallel. To do this, it runs as many threads as the hardware can support for a small time slice, then swaps them with the waiting threads, and keeps switching until all threads are done. This overhead is much higher than for ISPC tasks, which are also assigned to hardware threads as they become available, but the key difference is that the tasks are run to completion.

kostun

questions about this slide:

is the assignment from the list of tasks linear, or in order? is the order of the tasks decided by the compiler?

or is the "next task ptr" an abstraction for how the state is kept by ISPC?

weimin

I think user code adds the tasks with the launch but there is no guarantee of the execution order.

trip

This system of dynamic task assignment among a collection of threads sounds like a threadpool whose scheduler is the ISPC runtime -- is that an okay way to think about this?

zecheng

So the programmer actually still decides the size and number of tasks. But since the ISPC runtime decides which tasks are assigned to which threads (not in a static order), we call ISPC task assignment dynamic.

marwan

So does ISPC use something similar to a threadpool to notice when a thread is available and then utilize it to finish each task?

a7hu

@marwan Yes, ISPC maintains a threadpool and adds a task to it via keyword "launch". This is described as: "So I added a launch keyword to volta; it used Cilk’s semantics. Put it before a function call and the function goes off to a thread pool: launch foo(a, 6.3);" Reference: The story of ispc: C's influence and implementing SPMD on SIMD (part 4)
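As a rough picture of what "the function goes off to a thread pool" could look like, here is a minimal C++ sketch of a sleeping thread pool. It is an assumption-laden illustration, not ISPC's implementation: `TinyPool`, `launch`, and `sync` are hypothetical names chosen to echo the ISPC keywords.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical pool (not ISPC's runtime): workers sleep until a task arrives.
class TinyPool {
public:
    explicit TinyPool(int num_threads) {
        assert(num_threads > 0);
        for (int i = 0; i < num_threads; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }

    ~TinyPool() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            shutting_down_ = true;
        }
        work_cv_.notify_all();
        for (auto& w : workers_) w.join();
    }

    // Enqueue a task; some idle worker will pick it up.
    void launch(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            queue_.push(std::move(task));
            ++pending_;
        }
        work_cv_.notify_one();
    }

    // Block until every launched task has finished.
    void sync() {
        std::unique_lock<std::mutex> lock(mu_);
        done_cv_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void worker_loop() {
        while (true) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mu_);
                work_cv_.wait(lock, [this] {
                    return shutting_down_ || !queue_.empty();
                });
                if (queue_.empty()) return;  // shutting down, nothing left
                task = std::move(queue_.front());
                queue_.pop();
            }
            task();  // run to completion, outside the lock
            {
                std::lock_guard<std::mutex> lock(mu_);
                if (--pending_ == 0) done_cv_.notify_all();
            }
        }
    }

    std::mutex mu_;
    std::condition_variable work_cv_, done_cv_;
    std::queue<std::function<void()>> queue_;
    std::vector<std::thread> workers_;
    int pending_ = 0;             // tasks launched but not yet finished
    bool shutting_down_ = false;
};
```

A caller would construct `TinyPool pool(4);`, call `pool.launch(...)` once per task, and then `pool.sync()` before reading the results. Tasks are taken from the queue by whichever worker wakes first, so, as noted above, there is no guarantee about execution order.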

kayvonf

@marwan, @a7hu -- and in programming assignment 2, you'll implement your own version of the same idea.

Ethan

So does ISPC automatically determine how many threads to use? If so does it depend on the number of tasks created? If not then does it default to the number of existing cores?

jchen

@Ethan, I think it's a combination of both of those factors. I would guess that ISPC is smart enough to only spawn one thread if you only launch one task, 2 threads if you launch 2 tasks, etc. However, it will probably use knowledge of the hardware to cap the number of threads, so if you launch 10000 tasks and your processor can only support 8 threads, the compiler will probably only launch 8 threads.

Something I'm not sure about: can ISPC adaptively spawn more threads if that would be helpful, or is the number of threads fixed at compile time? For example, if the processor only supports 8 threads, but the tasks are IO-bound and often blocking with disk or network reads/writes, will ISPC decide to create more threads to take advantage of that fact?
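The capping policy guessed at above can be written down in a few lines of C++. This is only a sketch of that guess, not documented ISPC behavior: `choose_worker_count` is a hypothetical function, and the only real library call is `std::thread::hardware_concurrency()`.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Hypothetical policy (an assumption, not ISPC's documented behavior):
// never more workers than tasks, and never more than the hardware supports.
int choose_worker_count(
        int num_tasks,
        unsigned hw_threads = std::thread::hardware_concurrency()) {
    assert(num_tasks >= 0);
    if (num_tasks == 0) return 0;
    // hardware_concurrency() may return 0 when the value is unknown,
    // so fall back to a single worker in that case.
    int cap = (hw_threads == 0) ? 1 : static_cast<int>(hw_threads);
    return std::min(num_tasks, cap);
}
```

With `hw_threads` set to 8, this returns 1 for one task, 2 for two tasks, and 8 for 10000 tasks. A fixed cap like this would not help with the IO-bound case raised above, since blocked workers still count against the limit.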

x2020

In assignment 2 we implemented several versions of threadpools. From the above discussion, ISPC also maintains a threadpool for asynchronous execution. Is the threadpool a spinning one or a sleeping one? Do we reuse the same threadpool across multiple ISPC calls?
