Posted to user@spark.apache.org by Sarthak Dash <da...@gmail.com> on 2014/07/28 02:20:19 UTC

Most tasks finish quickly, but a few take much longer.

Hello everyone,

I am trying out Spark for the first time, and after a month of work I am
stuck on an issue. I have a very simple program that, given a directed
graph with node/edge attributes and a particular node, finds all the
siblings (in the traditional sense: nodes sharing a parent) of the given node.
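In case it helps, the logic is roughly the following. This is a simplified
sketch, not my exact code; the input path, file format, and target node id
below are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}

    object Siblings {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("Siblings"))

        val target = 42L  // placeholder node id

        // Edge list as (src, dst) pairs across 1200 partitions;
        // the path and whitespace-separated format are placeholders.
        val edges = sc.textFile("hdfs:///graph/edges.txt", 1200).map { line =>
          val Array(src, dst) = line.split("\\s+")
          (src.toLong, dst.toLong)
        }

        // Parents of the target node: every u with an edge u -> target.
        val parents = sc.broadcast(
          edges.filter { case (_, dst) => dst == target }.map(_._1).collect().toSet)

        // Siblings: children of any parent, excluding the target itself.
        val siblings = edges
          .filter { case (src, dst) => parents.value.contains(src) && dst != target }
          .values
          .distinct()

        siblings.collect().foreach(println)
        sc.stop()
      }
    }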

Right now I have 1200 partitions, and I see that while most of the tasks
(1190-1195 on average) finish within 500 ms, a few of them (5-10) take
about 1-2 seconds. I am aiming for a scenario in which every task finishes
in under a second, so I am trying to figure out why those few tasks
(5-10 of them) take longer to complete than the remaining (1190-1195) tasks.

Also, please let me know whether it is possible to change some settings to
achieve my target scenario.
Any help would be much appreciated.

My configuration (a sketch of the relevant settings follows the list):
1. Tried both the FAIR and FIFO schedulers.
2. Tried tuning spark.locality.wait. Currently I see a maximum scheduler
delay of 300 ms.
3. Version: Apache Spark 1.0.0 on a 50-node cluster, 14 GB RAM and 8
cores per node.
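For concreteness, this is roughly how I set the two settings from points 1
and 2. The values shown are illustrative, not my exact ones; note that in
Spark 1.0.x, spark.locality.wait is given in milliseconds (default 3000):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch of the scheduler and locality settings mentioned above.
    val conf = new SparkConf()
      .setAppName("Siblings")
      .set("spark.scheduler.mode", "FAIR")  // also tried "FIFO"
      .set("spark.locality.wait", "100")    // lower values shorten locality-induced waits
    val sc = new SparkContext(conf)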

Thanks,
Sarthak


