Performance Optimization I: Work Distribution and Scheduling

Previous | Next --- Slide 14 of 64

steliosr

Assuming there are no dependencies between tasks, a Longest Task First greedy approach would likely perform near-optimally in terms of time of completion of the last task remaining. Such an approach would immediately assign the long task to the first worker thread, while smaller tasks would execute concurrently in other workers. Even if the long task is still the bottleneck, overall utilization is maximized.

Drew

I wonder what an optimal parallel algorithm looks like for tasks like this. For instance, if we had an array of numbers to do some computation on, where the computation time is proportional to the value of each number. Would we want to take more time to exactly sort the numbers first? Would we want to take some lesser amount of time to approximately sort the numbers first? I imagine it depends on the situation, and I also imagine there are some data/computations for which you would want to not sort at all.

Jonathan

One question I have with this slide is that if we know that these 16 tasks will take this much time, why would we want to use dynamic scheduling? It seems like in this case we would be better served using static assignment to ensure that each thread is allocated the same amount of work.

steliosr

@Jonathan I don't think we are assuming (the very special case) that we know these runtimes in advance. The plot just shows possible task runtimes to make the point (probably in the next slide) that if we scheduled statically, we would guarantee that the thread that is assigned the long task will run for much longer. Although this might still happen in the dynamic case (worst case scenario the long task is randomly scheduled last), it's certain to happen if the allocation is static. So in the average and best case, the dynamic allocation will perform much better, "masking" the long runtime of the big task by running all the small ones on other threads.

harrymellsop

Is there a general system out there to generate heuristics that we could use to improve this assignment? Perhaps heuristic is too strong a term? Perhaps we could instead provide the system with some ranking ahead of time that gives the system an idea of how complex the task will be ex ante?

rmjones

If the distribution of task lengths is that uneven then it seems like a random assignment would do pretty darn well as a general system. Given enough tasks, as steliosr said it's unlikely that the extremely long tasks end up near the very end, allowing for the other threads to continue working while the unlucky thread is saddled with the long task.

Please log in to leave a comment.