Posted to dev@tinkerpop.apache.org by Marko Rodriguez <ok...@gmail.com> on 2016/01/08 15:25:11 UTC

Storage API and HDFS + Spark Persisted RDDs

Hi everyone,

In 3.1.1-SNAPSHOT, I made it so that persisted RDDs in Spark can be managed like files on a filesystem. To do this, I took the methods from HDFS (ls(), rm(), cp(), etc.) and created a Storage interface that providers can implement to give filesystem semantics to their storage systems. For OLTP databases like Titan, OrientDB, etc., this doesn't make much sense, but for OLAP systems like Spark, Giraph, and Hadoop, it does. Please check out the SNAPSHOT docs to see how Spark RDDs can be managed:

	http://tinkerpop.incubator.apache.org/docs/3.1.1-SNAPSHOT/reference/#_storage_systems
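
To make the idea concrete, here is a rough sketch of what filesystem semantics over a non-file store could look like. It is not the actual TinkerPop Storage interface: the names SketchStorage, InMemoryStorage, and put(), along with all method signatures, are assumptions made for illustration, and a plain in-memory map stands in for a provider's named, persisted datasets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical interface for illustration only; not the real TinkerPop Storage API.
interface SketchStorage {
    List<String> ls();                        // list every named location
    boolean cp(String source, String target); // copy source to target
    boolean rm(String location);              // remove a location
}

// In-memory stand-in for a provider, e.g. a registry of named, persisted datasets.
class InMemoryStorage implements SketchStorage {
    // TreeMap keeps locations sorted, so ls() returns a stable order.
    private final Map<String, List<String>> data = new TreeMap<>();

    // Hypothetical helper that registers a dataset under a name.
    void put(String location, List<String> records) {
        data.put(location, new ArrayList<>(records));
    }

    @Override
    public List<String> ls() {
        return new ArrayList<>(data.keySet());
    }

    @Override
    public boolean cp(String source, String target) {
        List<String> records = data.get(source);
        if (records == null) {
            return false; // nothing to copy
        }
        data.put(target, new ArrayList<>(records));
        return true;
    }

    @Override
    public boolean rm(String location) {
        return data.remove(location) != null;
    }
}
```

With something like this, a caller could register a dataset under a name, copy it with cp(), and drop the original with rm(), the same way it would handle files, without knowing what backs the store.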

Enjoy,
Marko.

http://markorodriguez.com