Posted to dev@spark.apache.org by anshu shukla <an...@gmail.com> on 2015/06/20 16:27:32 UTC

Fwd: Verifying number of workers in Spark Streaming

Any suggestions please ..!!
How do I know that, in stream processing over a cluster of 8 machines, all
the machines/worker nodes are being used (my cluster has 8 slaves)?
I am submitting the job from the master itself, over the EC2 cluster created
by the ec2 scripts available with Spark. But I am not able to figure out
whether my job is using all the workers or not.





-- 
Thanks & Regards,
Anshu Shukla
SERC-IISC

Re: Verifying number of workers in Spark Streaming

Posted by Silvio Fiorito <si...@granturing.com>.
If you look at your streaming app UI you should see how many tasks are executed each batch and on how many executors. This depends on the batch duration and the block interval, the latter of which defaults to 200 ms. Every block interval, a new partition is generated, so you can control the parallelism by adjusting the block interval and the batch duration. As described in the docs, with the default block interval and a 2 second batch duration you'd get 10 partitions.

http://spark.apache.org/docs/latest/streaming-programming-guide.html#reducing-the-batch-processing-times
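To make the arithmetic concrete, here is a minimal sketch of the relationship described above (partitions per batch = batch duration / block interval, per receiver). The function name is hypothetical, for illustration only; the config key in the comment, spark.streaming.blockInterval, is the real Spark property that controls the block interval.

```python
# Sketch of how Spark Streaming's receiver-based input derives the number
# of RDD partitions per batch: one block (hence one partition) is created
# every block interval. The block interval is set via the Spark config
# property "spark.streaming.blockInterval" (default 200 ms).

def partitions_per_batch(batch_duration_ms, block_interval_ms=200):
    """Estimate partitions generated per batch for a single receiver.

    Hypothetical helper illustrating the docs' formula; not a Spark API.
    """
    return batch_duration_ms // block_interval_ms

# Default 200 ms block interval with a 2 second batch duration:
print(partitions_per_batch(2000))   # 10 partitions

# Halving the block interval doubles the parallelism:
print(partitions_per_batch(2000, block_interval_ms=100))   # 20 partitions
```

So if each batch shows fewer tasks than you have cores across your 8 workers, lowering spark.streaming.blockInterval (or repartitioning the input DStream) is the usual way to spread the work over more executors.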

From: anshu shukla<ma...@gmail.com>
Sent: Saturday, June 20, 2015 10:27 AM
To: dev@spark.apache.org<ma...@spark.apache.org>, Tathagata Das<ma...@databricks.com>, user@spark.apache.org<ma...@spark.apache.org>
