I wonder how much optimization you could achieve in parallel programs if you could explicitly mark certain memory as private to one processor, so that it doesn't need to be synced for a while (or if a compiler could do this!). It seems like a tricky thing to detect automatically. I guess it doesn't make sense for a lot of problems because it kind of defeats the point of being parallel, but maybe the compiler or programmer could reorder operations so that this syncing happens as rarely as possible.
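A software-level version of this idea can be sketched as follows: each worker accumulates into a variable that only it touches, and shared state is synced exactly once per worker at the end. This is only an illustration of the pattern, not a hardware or compiler hint. The name `parallel_sum` and its parameters are made up for this example, and CPython's global interpreter lock means it will not show a real speedup.

```python
import threading

def parallel_sum(values, num_workers=4):
    """Sum `values` using private partial sums and one sync per worker."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = 0            # private to this worker: no sharing, no syncing
        for v in chunk:
            partial += v
        with lock:             # the only point where shared state is touched
            total += partial

    # Hypothetical strided split of the input across workers.
    chunks = [values[i::num_workers] for i in range(num_workers)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

Updating `total` under the lock on every element would give the same result but sync once per element rather than once per worker, which is the cost the comment above is hoping to avoid.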
wanze
Why is "P3 load x" a miss with x = 0? Didn't P1 update the value stored at the address of x to 1 instead of 0 on the previous line (i.e. the previous cycle)?
So when P3 grabs the value of x from memory, shouldn't it get a value of 1 instead?
weimin
P3 load x is a miss because it's the first time x is fetched into P3's cache.
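To see how a first-time miss can also return the old value, here is a toy model. It assumes the slide's caches are write-back and have no coherence protocol, which is an assumption about the slide rather than something stated in this thread. The `Cache` class and `run_trace` function are hypothetical names for this sketch.

```python
class Cache:
    """Toy private write-back cache with no coherence (assumed model)."""

    def __init__(self, memory):
        self.memory = memory   # shared main memory, a dict of addr -> value
        self.lines = {}        # this processor's private cached copies

    def load(self, addr):
        if addr in self.lines:
            return "hit", self.lines[addr]
        value = self.memory[addr]      # cold miss: fetch from main memory
        self.lines[addr] = value
        return "miss", value

    def store(self, addr, value):
        # Write-back: only the local copy changes; memory is not updated
        # until the line is evicted (eviction is not modeled here).
        self.lines[addr] = value


def run_trace():
    memory = {"x": 0}
    p1, p3 = Cache(memory), Cache(memory)
    trace = []
    trace.append(("P1 load x",) + p1.load("x"))
    p1.store("x", 1)                   # P1's cache holds 1, memory holds 0
    trace.append(("P3 load x",) + p3.load("x"))
    trace.append(("P1 load x",) + p1.load("x"))
    return trace, memory
```

Under these assumptions, P3's load is a miss because x has never been in P3's cache, and it reads 0 because P1's store of 1 is still sitting in P1's cache and has not been written back to memory.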