Posted to user@spark.apache.org by bsikander <be...@gmail.com> on 2018/11/07 10:08:09 UTC

[Spark-Core] Long scheduling delays (1+ hour)

We are facing an issue with very long scheduling delays in Spark (up to 1+
hours).
We are using Spark-standalone. The data is being pulled from Kafka.

Any help would be much appreciated.

I have attached the screenshots.
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/1-stats.png> 
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/4.png> 
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/3.png> 
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8018/2.png> 







--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: [Spark-Core] Long scheduling delays (1+ hour)

Posted by bsikander <be...@gmail.com>.
Forgot to add the link
https://jira.apache.org/jira/browse/KAFKA-5649





Re: [Spark-Core] Long scheduling delays (1+ hour)

Posted by bsikander <be...@gmail.com>.
Actually, our job runs fine for 17-18 hours, and then this behavior suddenly
starts happening.

We found the following ticket, which describes exactly what is happening in
our Kafka cluster as well:
WARN Failed to send SSL Close message 
(org.apache.kafka.common.network.SslTransportLayer)

You also replied to this ticket with a problem very similar to ours.

What fix did you apply to avoid these SSL Close exceptions and the long
delays in the Spark job?





Re: [Spark-Core] Long scheduling delays (1+ hour)

Posted by bsikander <be...@gmail.com>.
Could you please give some feedback?





Re: [Spark-Core] Long scheduling delays (1+ hour)

Posted by Biplob Biswas <re...@gmail.com>.
Hi,

This has to do with your batch duration and processing time. As a rule, the
processing time of a batch should be lower than the batch duration. As I can
see from your screenshots, your batch duration is 10 seconds but your
processing time is mostly more than a minute. Each batch therefore waits
behind the previous ones, the backlog adds up, and you end up with a lot of
scheduling delay.

Maybe check why it takes 1 min to process 100 records and fix the logic.
Also, I see that some batches with a higher number of events take less
processing time. Fix the code logic and this should be resolved.
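
To illustrate how the delay builds up, here is a minimal sketch in plain
Python (not Spark code). It assumes batches are processed strictly one at a
time and that every batch takes the same 60 seconds; the function name
scheduling_delays is hypothetical and the numbers are only the rough figures
mentioned above, not measurements.

```python
def scheduling_delays(batch_interval_s, processing_times_s):
    """Return the scheduling delay (in seconds) of each batch.

    Assumption: batch i becomes ready at (i + 1) * batch_interval_s and
    can only start once the previous batch has finished processing.
    """
    delays = []
    free_at = 0  # time at which the previous batch finishes
    for i, processing_time in enumerate(processing_times_s):
        ready_at = (i + 1) * batch_interval_s
        start_at = max(ready_at, free_at)
        delays.append(start_at - ready_at)  # time spent waiting in the queue
        free_at = start_at + processing_time
    return delays


# 10 s batch duration, each batch taking 60 s to process
print(scheduling_delays(10, [60] * 6))  # [0, 50, 100, 150, 200, 250]
```

In this simplified model the delay grows by 50 seconds with every batch (60 s
of processing minus the 10 s interval), so it never recovers unless the
processing time drops below the batch duration.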

Thanks & Regards
Biplob Biswas

