Posted to dev@tinkerpop.apache.org by "Russell Alexander Spitzer (JIRA)" <ji...@apache.org> on 2015/12/07 19:59:11 UTC
[jira] [Commented] (TINKERPOP-1017) Get InputRDDFormat to work with Multiple Splits
[ https://issues.apache.org/jira/browse/TINKERPOP-1017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15045489#comment-15045489 ]
Russell Alexander Spitzer commented on TINKERPOP-1017:
------------------------------------------------------
Is the goal here to draw from a remote RDD into an Iterator on the client machine? Because I think you've done that in about the best way it can be done currently with Spark :)
The underlying code for that is:
{code}
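// Run a job over a single partition and return its contents to the driver as an array.
def collectPartition(p: Int): Array[T] = {
  sc.runJob(this, (iter: Iterator[T]) => iter.toArray, Seq(p)).head
}
// Fetch partitions lazily, one at a time, as the local iterator is consumed.
(0 until partitions.length).iterator.flatMap(i => collectPartition(i))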
{code}
This pulls an entire partition into memory at once (it doesn't stream individual records; it only "streams" one task's worth of data at a time), so for many use cases this means filling a giant block of memory, one partition after another. I'm not sure there is a much better way to do this in Spark ATM.
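To make the memory behavior concrete, here is a minimal sketch in plain Scala that needs no Spark dependency. It is not Spark's code: the object name LocalIteratorSketch, the nested Vector standing in for an RDD's partitions, and the fetched counter are all hypothetical and exist only for illustration. It mimics the pattern above, where each partition is materialized whole, but only when the consumer reaches it.

```scala
import scala.reflect.ClassTag

// Hypothetical stand-in for an RDD: each inner Vector plays the role of one partition.
object LocalIteratorSketch {
  // Counts how many partitions have been materialized so far (illustration only).
  var fetched = 0

  // Stand-in for sc.runJob on one partition: copies the whole partition into an Array.
  def collectPartition[T: ClassTag](partitions: Vector[Vector[T]], p: Int): Array[T] = {
    fetched += 1
    partitions(p).toArray
  }

  // Walk the partitions lazily; partition i is only fetched once the consumer
  // has exhausted partitions 0 until i.
  def toLocalIterator[T: ClassTag](partitions: Vector[Vector[T]]): Iterator[T] =
    partitions.indices.iterator.flatMap(i => collectPartition(partitions, i).iterator)
}
```

Reading a single element from the resulting iterator materializes only the first partition, but it materializes all of that partition. Peak memory on the client is therefore bounded by the largest partition rather than by the whole data set, which is why smaller partitions help if client memory is tight.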
> Get InputRDDFormat to work with Multiple Splits
> -----------------------------------------------
>
> Key: TINKERPOP-1017
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1017
> Project: TinkerPop
> Issue Type: Improvement
> Components: hadoop
> Affects Versions: 3.1.1-incubating
> Reporter: Marko A. Rodriguez
>
> {{InputFormatRDD}} was recently added to enable {{HadoopGraph}} to OLTP stream in {{InputRDD}} data. It is currently single threaded. I tried to make it parallel, but ran into some {{Exceptions}} I didn't understand. For OLTP it doesn't matter; however, it would be good to make it work with multiple Hadoop {{InputSplits}}, and then Hadoop could read from Spark in OLAP too :). I don't know why that would ever be used... ? But if it's easy enough to do, just do it.
> [~rspitzer] --- When https://issues.apache.org/jira/browse/TINKERPOP-1011 you will see {{InputFormatRDD}}. You might have an idea on how to do this. If you care -- no worries though.