Posted to user@spark.apache.org by SamyaMaiti <sa...@gmail.com> on 2014/12/29 17:49:51 UTC

Can we say 1 RDD is generated every batch interval?

Hi All, 

Please clarify.

Can we say 1 RDD is generated every batch interval?

If the above is true, then is the foreachRDD() operator executed once and only
once for each batch processed?

Regards,
Sam



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-we-say-1-RDD-is-generated-every-batch-interval-tp20885.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Can we say 1 RDD is generated every batch interval?

Posted by "Maiti, Samya" <sa...@philips.com>.
Thanks, Sean.

That was helpful.

Regards,
Sam
On Dec 30, 2014, at 4:12 PM, Sean Owen <so...@cloudera.com> wrote:

> The DStream model is one RDD of data per interval, yes. foreachRDD
> performs an operation on each RDD in the stream, which means it is
> executed once* for the one RDD in each interval.
>
> * ignoring the possibility here of failure and retry of course


________________________________
The information contained in this message may be confidential and legally protected under applicable law. The message is intended solely for the addressee(s). If you are not the intended recipient, you are hereby notified that any use, forwarding, dissemination, or reproduction of this message is strictly prohibited and may be unlawful. If you are not the intended recipient, please contact the sender by return e-mail and destroy all copies of the original message.



Re: Can we say 1 RDD is generated every batch interval?

Posted by Sean Owen <so...@cloudera.com>.
The DStream model is one RDD of data per interval, yes. foreachRDD
performs an operation on each RDD in the stream, which means it is
executed once* for the one RDD in each interval.

* ignoring the possibility here of failure and retry of course
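
To make the "one RDD per interval, one call per RDD" model concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: run_toy_stream, for_each_batch and the sample events are hypothetical names made up for illustration, and it assumes that an interval with no data still produces one (empty) batch.

```python
from collections import defaultdict

def run_toy_stream(events, batch_interval, num_batches, for_each_batch):
    """Toy model of a DStream: group (timestamp, value) events into one
    batch per interval and call for_each_batch exactly once per batch."""
    batches = defaultdict(list)
    for timestamp, value in events:
        batches[int(timestamp // batch_interval)].append(value)
    for batch_index in range(num_batches):
        # Assumption: an interval with no data still yields one empty batch.
        for_each_batch(batch_index, batches.get(batch_index, []))

calls = []
events = [(0.2, "a"), (0.7, "b"), (1.1, "c"), (3.5, "d")]
run_toy_stream(events, batch_interval=1.0, num_batches=4,
               for_each_batch=lambda i, batch: calls.append((i, batch)))
```

With four intervals the callback runs four times, once per batch, including the interval that received no events. Failure and retry are not modelled here.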



Re: Can we say 1 RDD is generated every batch interval?

Posted by "Jahagirdar, Madhu" <ma...@philips.com>.
I guess foreach iterates through the partitions in the RDD and executes the operation for each partition.
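
A rough sketch of how the two levels relate, in plain Python rather than Spark's API: the function handed to foreachRDD is called once per batch, and any per-partition work happens inside it (in Spark, through an action such as foreachPartition on the RDD). ToyRDD, foreach_partition and handle_batch are hypothetical names used only for illustration.

```python
class ToyRDD:
    """Hypothetical stand-in for an RDD: just a list of partitions."""

    def __init__(self, partitions):
        self.partitions = partitions

    def foreach_partition(self, func):
        # Mimics a per-partition action: func sees each partition once.
        for partition in self.partitions:
            func(iter(partition))

batch_log = []        # one entry per call of the batch-level function
partition_sizes = []  # one entry per partition processed

def handle_batch(rdd):
    """Plays the role of the function passed to foreachRDD."""
    batch_log.append(len(rdd.partitions))
    rdd.foreach_partition(
        lambda records: partition_sizes.append(len(list(records))))

# One ToyRDD per batch interval; each has two partitions.
stream = [ToyRDD([[1, 2], [3]]), ToyRDD([[4], [5, 6, 7]])]
for rdd in stream:
    handle_batch(rdd)
```

Here handle_batch runs twice (once per batch), while the inner function runs four times (once per partition).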
