Slide 79 of 82
suninhouse

It appears that the flavor of CUDA programs we've studied so far is closer to the style of multi-threaded programs that use barriers instead of mutexes, with accesses and modifications to global shared memory made atomic.
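That barrier-plus-atomics style might look like the following sketch (illustrative names; assumes a power-of-two block size of 256): threads in a block coordinate with `__syncthreads()` barriers, and the only contended update to global memory goes through `atomicAdd`.

```cuda
// Each block reduces its portion of the input into shared memory using
// barriers, then one thread atomically adds the block's partial sum to a
// global total. Assumes blockDim.x == 256 (a power of two).
__global__ void block_sum(const int* in, int* global_total, int n) {
    __shared__ int partial[256];              // one slot per thread in the block

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    partial[threadIdx.x] = (i < n) ? in[i] : 0;
    __syncthreads();                          // barrier: all slots written

    // Tree reduction within the block; a barrier separates each round.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            partial[threadIdx.x] += partial[threadIdx.x + stride];
        __syncthreads();
    }

    if (threadIdx.x == 0)
        atomicAdd(global_total, partial[0]);  // atomic update to global memory
}
```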

haiyuem

Is this somewhat similar to Assignment 2, where we implemented a thread pool that persists across rounds of bulk task launches?

andykhuu

It seems to me that CUDA programming will require a lot of generalization with respect to the different possible specifications a GPU may have. Do GPUs provide a software interface that lets you just query specifications like cache size, number of warps, number of SMs, etc.? I'm thinking it would just be a global macro?

haiyuem

@andykhuu My understanding is that CUDA programming usually requires knowledge of the specific GPU configuration (and you're most likely writing CUDA for NVIDIA GPUs). In my own experience writing CUDA code, I would always know the exact GPU config beforehand.
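For what it's worth, the CUDA runtime does expose device properties, though at run time rather than as a compile-time macro: `cudaGetDeviceProperties` fills in a `cudaDeviceProp` struct with the SM count, warp size, shared-memory size, and so on. A minimal host-side sketch:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp p;
        cudaGetDeviceProperties(&p, d);
        printf("%s: %d SMs, warp size %d, %zu bytes shared mem/block, "
               "compute capability %d.%d\n",
               p.name, p.multiProcessorCount, p.warpSize,
               p.sharedMemPerBlock, p.major, p.minor);
    }
    return 0;
}
```

Code compiled with `nvcc` can also branch on `__CUDA_ARCH__` at compile time, so in practice you often still target a known architecture, as described above.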

danieljm

How is __syncthreads() implemented? Is it the same as a typical memory barrier, or are there other considerations that must be made since we are dealing with a GPU?
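Not an answer about the hardware implementation, but semantically `__syncthreads()` is more than a memory fence: it is an execution barrier for all threads of the block *and* a fence making their shared/global writes visible to each other. A fence alone (e.g. `__threadfence_block()`) orders one thread's memory operations but does not make other threads wait. A small sketch (assumes a 2-thread block; names are illustrative):

```cuda
__global__ void exchange(int* out) {
    __shared__ int buf[2];

    buf[threadIdx.x] = threadIdx.x * 10;  // each of the 2 threads writes a slot
    __syncthreads();                      // wait for the partner AND publish writes

    // A __threadfence_block() here would only order this thread's own memory
    // operations; it would not guarantee the partner has written its slot yet.
    out[threadIdx.x] = buf[1 - threadIdx.x];  // now safe to read the partner's slot
}
```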
