Posted to user@spark.apache.org by Yan Fang <ya...@gmail.com> on 2014/07/09 20:22:14 UTC

Spark Streaming - two questions about the StreamingContext

I am using Spark Streaming and have the following two questions:

1. If more than one output operation is registered on the same StreamingContext
(basically, I put all the output operations in the same class), are they
processed one by one, in the order they appear in the class? Or are they
actually processed in parallel? (A rough sketch of this setup follows below.)

2. If processing one batch of a DStream takes longer than the batch interval,
does the next batch wait in the queue until the previous one has been fully
processed? Or is there any parallelism that may process the two batches at the
same time?
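
Something like this, with a placeholder socket source and two print() outputs
(purely illustrative):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object TwoOutputsExample {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("TwoOutputsExample")
        // 10-second batch interval, chosen only for illustration
        val ssc = new StreamingContext(conf, Seconds(10))

        val lines = ssc.socketTextStream("localhost", 9999)

        // Output operation #1: defined first in the class
        lines.print()

        // Output operation #2: defined second in the class
        lines.count().print()

        ssc.start()
        ssc.awaitTermination()
      }
    }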

Thank you.

Cheers,

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108

Re: Spark Streaming - two questions about the StreamingContext

Posted by Yan Fang <ya...@gmail.com>.
Great. Thank you!

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108

Re: Spark Streaming - two questions about the StreamingContext

Posted by Tathagata Das <ta...@gmail.com>.
1. Multiple output operations are processed in the order they are defined.
That is because, by default, only one output operation is processed at a time.
This *can* be parallelized using an undocumented config parameter,
"spark.streaming.concurrentJobs", which is set to 1 by default (see the
sketch below).

2. Yes, the output operations (and the Spark jobs that are involved with
them) get queued up.
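
For reference, a minimal sketch of setting that parameter on the SparkConf
before the StreamingContext is created (the value 2 is only an example):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf()
      .setAppName("StreamingApp")
      // undocumented: allow up to 2 streaming jobs (output operations)
      // to run concurrently instead of the default 1
      .set("spark.streaming.concurrentJobs", "2")

    val ssc = new StreamingContext(conf, Seconds(10))

Depending on the Spark version, the same property can also be passed via
spark-submit's --conf flag instead of being hard-coded.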

TD

