Posted to user@spark.apache.org by Luis Guerra <lu...@gmail.com> on 2014/07/23 11:03:30 UTC

Spark execution plan

Hi all,

I was wondering how Spark deals with an execution plan. Taking Pig and
its DAG execution as an example, I would like to achieve a similar
solution with Spark.

For instance, suppose my code has 3 different "parts", where A and B are
self-contained:

Part A:
......
.....
.....
var output_a
Part B:
.....
......
.....
var output_b
Part C:
....
...using output_a and output_b

What would the execution plan look like in Spark? Could parts A and B
somehow be executed in parallel?

Related to this, are there thread implementations in Scala? Could that be
a solution for this scenario?
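For context: transformations in Spark are lazy and only build a lineage
DAG, comparable to Pig's plan; nothing executes until an action runs, and
actions issued one after another from a single driver thread trigger
their jobs sequentially. A minimal sketch, assuming two independent
inputs (the paths and names here are illustrative, not from the thread):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("plan-demo"))

    // Transformations are lazy: these lines only record the lineage
    // (the "plan"); no cluster work happens yet.
    val outputA = sc.textFile("hdfs:///data/a").map(_.toUpperCase)  // part A
    val outputB = sc.textFile("hdfs:///data/b").map(_.toLowerCase)  // part B

    // Each action triggers a separate job. Issued from the same driver
    // thread, these two jobs run one after the other, not in parallel.
    outputA.saveAsTextFile("hdfs:///out/a")
    outputB.saveAsTextFile("hdfs:///out/b")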

Regards

Re: Spark execution plan

Posted by Luis Guerra <lu...@gmail.com>.
Thanks for your answer. However, there has been a misunderstanding here.
My question is about controlling the parallel execution of different
parts of the code, similarly to Pig, where there is a planning phase
before the execution.
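One way to get that control is to submit each part from its own
driver-side thread: SparkContext is thread-safe, and jobs submitted
concurrently can run concurrently when executors have free slots. A
minimal sketch using plain Scala Futures; the inputs and names are
illustrative assumptions, not from the thread:

    import org.apache.spark.{SparkConf, SparkContext}

    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration.Duration

    val sc = new SparkContext(new SparkConf().setAppName("parallel-parts"))

    // Parts A and B: independent lineages (illustrative inputs).
    val outputA = sc.textFile("hdfs:///data/a").map(_.split("\t"))
    val outputB = sc.textFile("hdfs:///data/b").map(_.split("\t"))

    // SparkContext is thread-safe: submitting each action from its own
    // driver-side thread lets the scheduler run both jobs concurrently.
    val futureA = Future { outputA.count() }
    val futureB = Future { outputB.count() }

    // Part C: block until both independent results are available.
    val countA = Await.result(futureA, Duration.Inf)
    val countB = Await.result(futureB, Duration.Inf)
    println(s"part A: $countA records, part B: $countB records")

With the default FIFO scheduler a large first job can still delay the
second; setting spark.scheduler.mode to FAIR makes concurrent jobs share
executors more evenly.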


On Wed, Jul 23, 2014 at 1:46 PM, chutium <te...@gmail.com> wrote:

> It seems union should work for this scenario.
>
> In part C, try to use: output_a union output_b

Re: Spark execution plan

Posted by chutium <te...@gmail.com>.
It seems union should work for this scenario.

In part C, try to use: output_a union output_b
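
A minimal sketch of this suggestion, assuming output_a and output_b are
RDDs with the same element type; the output path is an illustrative
assumption:

    // union is a narrow transformation: no shuffle, and the result
    // simply concatenates the partitions of both parent RDDs.
    val outputC = output_a.union(output_b)

    // The single action below triggers one job, and the tasks for
    // output_a's and output_b's partitions run in parallel within it.
    outputC.saveAsTextFile("hdfs:///out/c")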


