
MasonLlewellyn

Would a lineage need to be stored in distributed memory in case the node containing its RDD failed?

l-henken

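A big benefit of RDDs and the lineage abstraction is idempotence with respect to failures. The lineage itself is small (just a record of the transformations), so it can be kept in persistent storage along with the initial raw data/RDD. If a node holding a partition fails, the lost data can be recomputed by replaying the lineage, and this can be done as many times as needed. The result is the same each time as long as the lineage is respected (i.e., start from the same initial source and apply each transformation in lineage order).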
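To make the replay idea concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: `ToyRDD`, `transform`, `compute`, and the two step functions are hypothetical names made up for illustration, and the `source` tuple stands in for raw data held in durable storage. It assumes every transformation is deterministic and side-effect free.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

Step = Callable[[Iterable[int]], Iterable[int]]


@dataclass(frozen=True)
class ToyRDD:
    """Hypothetical stand-in for an RDD: immutable source plus ordered lineage."""
    source: Tuple[int, ...]          # assumed to live in durable storage
    lineage: Tuple[Step, ...] = ()   # ordered record of transformations

    def transform(self, step: Step) -> "ToyRDD":
        # Record the step instead of running it; the source is never mutated.
        return ToyRDD(self.source, self.lineage + (step,))

    def compute(self) -> List[int]:
        # Replay every recorded step, in lineage order, from the source.
        data: Iterable[int] = self.source
        for step in self.lineage:
            data = step(data)
        return list(data)


def double(xs: Iterable[int]) -> Iterable[int]:
    return (x * 2 for x in xs)


def keep_multiples_of_four(xs: Iterable[int]) -> Iterable[int]:
    return (x for x in xs if x % 4 == 0)


rdd = ToyRDD(source=(1, 2, 3, 4)).transform(double).transform(keep_multiples_of_four)

first = rdd.compute()        # materialized result held in a node's memory
first_copy = list(first)     # kept only so we can compare after the "failure"
first = None                 # simulate the node failing and losing the result
recomputed = rdd.compute()   # rebuild from source + lineage

print(recomputed == first_copy)
```

Because the source is immutable and the steps are replayed in the same order, recomputing after a loss gives the same result as the first run, which is the idempotence property described above.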
