Posted to user@spark.apache.org by kant kodali <ka...@gmail.com> on 2018/09/25 10:51:28 UTC

Can I model any arbitrary data structure as an RDD?

Hi All,

I am wondering whether any arbitrary data structure can be modeled as an RDD.
For example, can I model red-black trees, suffix trees, radix trees, splay
trees, Fibonacci heaps, tries, linked lists, etc. as RDDs? If so, how?

To implement a custom RDD, I have to implement the compute and getPartitions
functions. Does this mean that as long as I can put the above data
structures into some storage and implement compute and getPartitions,
I am good? I also wonder whether every data structure is parallelizable in
the first place.
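
To make the question concrete, here is a rough sketch of what I have in mind,
written in plain Python rather than against Spark itself. ShardedTrie,
get_partitions, and the first-character sharding rule are all made up for
illustration; they only mimic the shape of the getPartitions/compute contract
and are not Spark's API.

```python
class Trie:
    """A minimal trie that stores whole words."""

    def __init__(self):
        self.children = {}
        self.terminal = False

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.terminal = True

    def words(self, prefix=""):
        # Yield every stored word below this node, in sorted order.
        if self.terminal:
            yield prefix
        for ch in sorted(self.children):
            yield from self.children[ch].words(prefix + ch)


class ShardedTrie:
    """Hypothetical stand-in for a custom RDD (not a Spark class).

    Assumption: words are assigned to a shard by their first character,
    so each shard is an independent sub-trie.
    """

    def __init__(self, words, num_partitions):
        self.num_partitions = num_partitions
        self.shards = [Trie() for _ in range(num_partitions)]
        for w in words:
            key = ord(w[0]) if w else 0
            self.shards[key % num_partitions].insert(w)

    def get_partitions(self):
        # Analogue of getPartitions: describe the independent pieces.
        return list(range(self.num_partitions))

    def compute(self, partition):
        # Analogue of compute: an iterator over one piece's elements.
        return self.shards[partition].words()


if __name__ == "__main__":
    st = ShardedTrie(["car", "cat", "dog", "do"], num_partitions=2)
    for p in st.get_partitions():
        print(p, list(st.compute(p)))
```

A trie happens to split cleanly by prefix, which is the only reason this
sketch works. I do not see an equally obvious split for something like a
linked list or a splay tree, which is why I am asking.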

Thanks!