Exploit parallelism across the whole workload by splitting it into separate threads, rather than trying to extract ever more ILP from a single instruction stream.
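To make the "split it into separate threads" idea concrete, here is a minimal sketch, not anything from the slides. The function names (`chunk_bounds`, `sum_of_squares`, `parallel_sum_of_squares`) and the sum-of-squares workload are made up for illustration. It assumes the work divides into independent contiguous chunks. Note that in standard CPython the global interpreter lock keeps CPU-bound threads from actually running at the same time, so this shows the decomposition pattern rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, num_workers):
    """Split range(n) into num_workers contiguous [start, end) pieces."""
    base, extra = divmod(n, num_workers)
    bounds = []
    start = 0
    for i in range(num_workers):
        # The first `extra` chunks take one additional element each.
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def sum_of_squares(data, start, end):
    """Sequential work done by one thread on its own slice of the data."""
    total = 0
    for i in range(start, end):
        total += data[i] * data[i]
    return total


def parallel_sum_of_squares(data, num_workers=4):
    """Hypothetical helper: one task per chunk, then combine partial sums."""
    bounds = chunk_bounds(len(data), num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(sum_of_squares, data, s, e) for s, e in bounds]
        # Each chunk is independent, so the only coordination is this final sum.
        return sum(f.result() for f in futures)
```

Each thread runs plain sequential code on its own chunk. The parallelism comes from having several such streams, not from hardware digging more ILP out of any one of them.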
kevtan
I find this slide slightly misleading because it is titled "multi-core era" but only shows a single core. The point, however, is that each core in a modern multi-core processor is much simpler. Unlike the diagram on slide 15, we no longer have:
A big data cache
An out-of-order control logic unit
A fancy branch predictor
A memory pre-fetcher
Two fetch / decode units
The chip real estate those features used to occupy has instead been dedicated to more "vanilla" cores. I think this is one of those rare cases when quantity trumps quality.