Posted to user@spark.apache.org by Vijayasarathy Kannan <kv...@vt.edu> on 2015/03/17 20:18:30 UTC

Question on RDD groupBy and executors

Hi,

I am doing a groupBy on an EdgeRDD like this:

val groupedEdges = graph.edges.groupBy[VertexId](func0)
while (true) {
  // foreach returns Unit, so there is no result worth binding to a val
  groupedEdges.flatMap(func1).collect().foreach(func2)
}

The groupBy distributes the data to different executors on different nodes
in the cluster.
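
To make the key-to-partition part concrete, here is a minimal Spark-free sketch, assuming the groupBy ends up using hash partitioning (key hash code modulo the number of partitions). The names HashPartitionSketch, nonNegativeMod and partitionFor, the partition count, and the sample keys are all made up for illustration and are not Spark APIs. It only yields a partition index; which executor holds that partition is decided at runtime and is not shown here.

```scala
// Spark-free sketch of hash partitioning. Everything here is hypothetical
// illustration code, not part of the Spark API.
object HashPartitionSketch {
  // Non-negative modulo, so negative hash codes still map to a valid index.
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val raw = x % mod
    if (raw < 0) raw + mod else raw
  }

  // Partition index for a key, ASSUMING hash partitioning over numPartitions.
  // A null key is sent to partition 0.
  def partitionFor(key: Any, numPartitions: Int): Int =
    if (key == null) 0 else nonNegativeMod(key.hashCode, numPartitions)

  def main(args: Array[String]): Unit = {
    val numPartitions = 4            // assumed value, for illustration only
    val vertexIds = Seq(1L, 5L, 42L) // made-up VertexId keys
    vertexIds.foreach { k =>
      println(s"key $k -> partition ${partitionFor(k, numPartitions)}")
    }
  }
}
```

If that assumption holds, the partition index for K can be computed on the driver, but that still leaves the mapping from partition to executor and node open.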

Given a key K (a VertexId identifying a particular group in *groupedEdges*),
is there a way to
- find which executor is responsible for K?
- find which node in the cluster that executor resides on?
- access that specific executor (and possibly assign a task to it) from the
driver?

Thanks.