In the written assignment, we saw that the groupByKey operation not only increases communication between partitions but can also have a larger memory footprint, because all elements with matching keys must be gathered from across the entire data set.
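The effect can be illustrated with a small pure-Python simulation. This is only a sketch under stated assumptions, not Spark code: the functions `partition_of`, `group_by_key_shuffle`, and `combine_then_shuffle` are hypothetical names, the partitioner is a toy one, and the "combine first" variant is included only as a point of comparison for counting how many records cross partition boundaries.

```python
from collections import defaultdict


def partition_of(key, num_partitions):
    # Toy deterministic partitioner (an assumption; not Spark's partitioner).
    return sum(key.encode("utf-8")) % num_partitions


def group_by_key_shuffle(partitions):
    """Send every (key, value) record to the partition that owns its key."""
    n = len(partitions)
    grouped = [defaultdict(list) for _ in range(n)]
    moved = 0  # records that had to cross a partition boundary
    for src, records in enumerate(partitions):
        for key, value in records:
            dst = partition_of(key, n)
            if dst != src:
                moved += 1
            # Every individual value is kept in memory on the owning partition.
            grouped[dst][key].append(value)
    return grouped, moved


def combine_then_shuffle(partitions):
    """Sum values per key inside each partition, then move only the partial sums."""
    n = len(partitions)
    totals = [defaultdict(int) for _ in range(n)]
    moved = 0
    for src, records in enumerate(partitions):
        local = defaultdict(int)
        for key, value in records:
            local[key] += value
        # At most one record per key leaves each partition.
        for key, partial in local.items():
            dst = partition_of(key, n)
            if dst != src:
                moved += 1
            totals[dst][key] += partial
    return totals, moved


# Two toy partitions of (key, count) records.
partitions = [
    [("a", 1), ("b", 1), ("a", 1), ("a", 1)],
    [("a", 1), ("b", 1), ("b", 1), ("a", 1)],
]

grouped, moved_group = group_by_key_shuffle(partitions)
totals, moved_combine = combine_then_shuffle(partitions)

# Size of the largest per-key list that one partition must hold after grouping.
largest_group = max(len(values) for part in grouped for values in part.values())

print("records moved when grouping:", moved_group)
print("records moved when combining first:", moved_combine)
print("largest in-memory group:", largest_group)
```

In this toy data set, grouping moves more records between partitions than combining first, and the partition that owns a key ends up holding every value for that key at once. That is the communication and memory cost described above, on a very small scale.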