Slide 36 of 49
trip

The benefit of this "streaming" solution is that it processes a single line of data at a time, so the working set fits in memory. In the larger, chained example above, by contrast, keeping every intermediate result in memory would be virtually impossible.
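To make the contrast concrete, here is a minimal sketch in plain Python (not Spark). The function names and the "error" filter are made up for illustration; the slide's actual operations may differ. The chained version builds a full intermediate list at every step, while the streaming version carries one line through all the steps before touching the next.

```python
from typing import Iterable, Iterator, List


def chained(lines: List[str]) -> int:
    """Each step materializes a complete intermediate list."""
    lowered = [line.lower() for line in lines]                 # full copy 1
    matching = [line for line in lowered if "error" in line]   # full copy 2
    lengths = [len(line) for line in matching]                 # full copy 3
    return sum(lengths)


def streaming(lines: Iterable[str]) -> int:
    """All steps are applied to one line before the next line is read."""
    total = 0
    for line in lines:
        line = line.lower()
        if "error" in line:
            total += len(line)
    return total


def generate_lines(n: int) -> Iterator[str]:
    """Hypothetical input source that yields lines lazily."""
    for i in range(n):
        yield f"ERROR record {i}" if i % 3 == 0 else f"ok record {i}"


# Both versions compute the same answer; only the memory footprint differs.
result_chained = chained(list(generate_lines(10)))
result_streaming = streaming(generate_lines(10))
```

Note that `streaming` never needs the whole input as a list, so it can consume a generator or a file handle directly.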

zecheng

I think Spark also has something like the query optimization done in databases?

jessiexu

My understanding is that the compiler is the key to fusing these operations.

teapot

Yes, Spark has a query optimizer that improves the query plan for a given set of operators, just as most databases do.

msere

This streaming is like the RDD question on the last written assignment. It's a nice solution in that it is cache efficient (for RDDs with narrow dependencies), and it still lets the programmer write efficient solutions without having to manually think about grouping work together.
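A rough sketch of why narrow dependencies make this possible, assuming a toy, hypothetical `ToyRDD` class (this is not Spark's API or its implementation): because each output partition depends on only one input partition, `map` and `filter` can just be recorded, then applied element by element within each partition when a result is requested.

```python
from typing import Callable, List, Tuple


class ToyRDD:
    """Hypothetical stand-in for an RDD, for illustration only."""

    def __init__(self, partitions: List[List[int]], stages: Tuple = ()):
        self.partitions = partitions
        self.stages = tuple(stages)  # recorded, not yet executed

    def map(self, f: Callable[[int], int]) -> "ToyRDD":
        # Record a lazy per-element transformation.
        return ToyRDD(self.partitions,
                      self.stages + (lambda it: (f(x) for x in it),))

    def filter(self, pred: Callable[[int], bool]) -> "ToyRDD":
        # Record a lazy per-element filter.
        return ToyRDD(self.partitions,
                      self.stages + (lambda it: (x for x in it if pred(x)),))

    def collect(self) -> List[int]:
        out: List[int] = []
        for part in self.partitions:
            it = iter(part)
            for stage in self.stages:
                it = stage(it)  # chain generators: no intermediate lists
            out.extend(it)      # each element flows through every stage
        return out


rdd = ToyRDD([[1, 2, 3], [4, 5, 6]])
fused = rdd.map(lambda x: x * 10).filter(lambda x: x % 20 == 0).collect()
# fused == [20, 40, 60]
```

The programmer writes separate `map` and `filter` calls, yet no intermediate collection is built between them. An operation with a wide dependency (a group-by, for example) would not fit this scheme, since it needs data from multiple partitions.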
