Parallel Graph Processing Frameworks + How DRAM Works

Slide 41 of 81

haiyuem

The source nodes are contiguous to maximize cache locality when we load from memory.

suninhouse

Thinking about cache locality and false sharing, does this mean the layout is best optimized when 1) from the memory controller's perspective, consecutive memory access requests are as close to each other as possible, while 2) each thread touches cache lines that are as disjoint as possible from those touched by other threads?

danieljm

@suninhouse, I think you describe these goals really well! One way sharding could potentially be improved further is to make sure that each "sub-shard" of contiguous information is cache-line-aligned. Then no cache line holds data from two sub-shards, so we can be assured that there is no false sharing when constructing subgraphs containing vertices from a single shard.
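To make the alignment idea concrete, here is a rough sketch of how the starting offsets could be computed. The 64-byte line size and the 12-byte edge record are my own assumptions (not from the slide), `aligned_offsets` is a hypothetical helper, and the buffer itself would also need to start on a cache-line boundary for this to hold.

```python
CACHE_LINE_BYTES = 64  # assumed line size (typical on x86), not from the slide
EDGE_BYTES = 12        # assumed record: 4-byte src, 4-byte dst, 4-byte value


def aligned_offsets(edge_counts, line=CACHE_LINE_BYTES, edge_bytes=EDGE_BYTES):
    """Return (start offsets in bytes, total padded size in bytes).

    Each sub-shard is padded up to the next cache-line boundary, so no
    cache line ever holds edges from two different sub-shards.
    """
    offsets = []
    cursor = 0
    for count in edge_counts:
        offsets.append(cursor)
        size = count * edge_bytes
        # Round the sub-shard size up to a whole number of cache lines.
        cursor += -(-size // line) * line
    return offsets, cursor


# Three hypothetical sub-shards holding 3, 10, and 1 edges.
offsets, total = aligned_offsets([3, 10, 1])
print(offsets, total)
```

The cost is padding: here 168 bytes of edge data occupy 256 bytes, so very small sub-shards waste proportionally more space.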

Another interesting pair of opposing forces here relates to the number of shards: The fewer shards we have, the fewer loads of contiguous data from other shards we need to perform, but the more shards we have, the more granular shards become, allowing for more flexible subgraph generation using the above approach.

Claire

I believe that the memory pulled from each of the other shards will be contiguous, since everything highlighted in yellow has either a src or a dst among the vertices we are looking at (i.e., 1 and 2). Because the source nodes within a shard are stored contiguously, all the data relating to those vertices sits in one place, which lets us pull it as a contiguous block of memory.
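A quick way to check this is a toy version of the layout. This is only a sketch under my own assumptions: the edge list is made up (it is not the graph on the slide), each shard holds the edges whose dst falls in its vertex interval, and each shard is kept sorted by src. `build_shards` and `out_edge_window` are hypothetical helper names.

```python
from bisect import bisect_left, bisect_right

# Hypothetical toy graph as (src, dst) pairs; not the edges from the slide.
edges = [(1, 2), (1, 3), (2, 5), (3, 1), (3, 4), (4, 2),
         (4, 5), (5, 3), (5, 6), (6, 1), (6, 4), (2, 3)]
intervals = [(1, 2), (3, 4), (5, 6)]  # inclusive vertex ranges, one per shard


def build_shards(edges, intervals):
    """Shard i holds every edge whose dst lies in interval i, sorted by src."""
    return [sorted(e for e in edges if lo <= e[1] <= hi)
            for lo, hi in intervals]


def out_edge_window(shard, lo, hi):
    """Index range [start, stop) of edges in `shard` whose src is in [lo, hi]."""
    srcs = [src for src, _ in shard]
    return bisect_left(srcs, lo), bisect_right(srcs, hi)


shards = build_shards(edges, intervals)
lo, hi = intervals[0]  # the vertices we are processing: 1 and 2
for shard in shards[1:]:
    start, stop = out_edge_window(shard, lo, hi)
    window = shard[start:stop]
    # Every edge in the slice has src in [lo, hi], and none outside it does,
    # so the edges we need from this shard form one contiguous run.
    assert all(lo <= s <= hi for s, _ in window)
    assert not any(lo <= s <= hi for s, _ in shard[:start] + shard[stop:])
    print(window)
```

Because the shard is sorted by src, two binary searches are enough to locate the block, and the read itself is a single sequential slice.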
