Posted to user@spark.apache.org by Nipun Arora <ni...@gmail.com> on 2017/05/28 13:59:51 UTC
[Spark Streaming] DAG Output Processing mechanism
I would like some clarification on the execution model of Spark Streaming.
Broadly, I am trying to understand whether output operations in a DAG are
processed only after all intermediate operations have finished for all parts
of the DAG.
Let me give an example:
I have a DStream A. I apply map operations to it and create two different
DStreams, B and C, such that
A ---> B ---> (some operations) ---> kafka output 1
 \---> C ---> (some operations) ---> kafka output 2
Will kafka output 1 and kafka output 2 wait for all operations on both B and
C to finish before sending any output, or will each one send its output as
soon as the operations on its own branch are done?
What kind of synchronization guarantees are there?
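For concreteness, here is a minimal sketch of the topology described above. It assumes the PySpark DStream API (`map`, `foreachRDD`) and the RDD method `foreachPartition`. The names `to_b`, `to_c`, `build_pipeline`, `send_to_kafka_1` and `send_to_kafka_2` are placeholders for illustration, not taken from the actual job, and the Kafka producer code is left out. The sketch only shows how the two branches and their output operations are wired. It does not say anything about the order in which Spark runs them.

```python
def to_b(record):
    # Placeholder for the map that derives branch B from A (assumption).
    return ("B", record)


def to_c(record):
    # Placeholder for the map that derives branch C from A (assumption).
    return ("C", record)


def build_pipeline(a, send_to_kafka_1, send_to_kafka_2):
    """Wire DStream `a` into two branches, each ending in its own output.

    `a` is expected to behave like a PySpark DStream. Each sender is a
    hypothetical callback that receives an iterator over the records of
    one partition and would publish them to Kafka.
    """
    b = a.map(to_b)  # A ---> B
    c = a.map(to_c)  # A ---> C

    # Each foreachRDD call registers a separate output operation.
    b.foreachRDD(lambda rdd: rdd.foreachPartition(send_to_kafka_1))
    c.foreachRDD(lambda rdd: rdd.foreachPartition(send_to_kafka_2))
    return b, c
```

In a real job, `a` would come from a StreamingContext source, and the "some operations" steps of each branch would sit between the `map` and the `foreachRDD` call.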
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-DAG-Output-Processing-mechanism-tp28713.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org
Re: [Spark Streaming] DAG Output Processing mechanism
Posted by Nipun Arora <ni...@gmail.com>.
Sending out the message again. Hopefully someone can clarify :)
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-DAG-Output-Processing-mechanism-tp28713p28715.html
Re: [Spark Streaming] DAG Output Processing mechanism
Posted by Nipun Arora <ni...@gmail.com>.
Apologies, resending as the previous mail went out with some unnecessary
copy-paste.