Posted to user@spark.apache.org by "Livni, Dana" <da...@intel.com> on 2014/03/06 20:49:29 UTC

major Spark performance problem

Hi all,

We have a big issue and would appreciate any insights or ideas.
The problem is composed of two connected problems.

1.       Run time of a single application.

2.       Run time of multiple applications in parallel scales almost linearly with the number of applications (as if they ran sequentially).

We have written a Spark application fetching its data from HBase.
We are running the application on YARN in client mode.
The cluster has 2 nodes (both used as HBase data nodes and as Spark/YARN processing nodes).

We have a few Spark steps in our app; the heaviest and longest of them is described by this flow (a rough sketch in code follows the list):

1.       flatMap - converting the HBase RDD to an RDD of our objects.

2.       Group by key.

3.       Map - making the calculations we need (checking a set of basic mathematical conditions).
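
Roughly, the flow looks like the sketch below (simplified; sc is the shared SparkContext, and parseRow/checkConditions stand in for our real parsing and condition-checking code):

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.spark.SparkContext._   // pair-RDD operations such as groupByKey

    val hbaseConf = HBaseConfiguration.create()
    hbaseConf.set(TableInputFormat.INPUT_TABLE, "our_table")   // table name is illustrative

    // HBase rows come in as an RDD of (row key, Result) pairs
    val hbaseRdd = sc.newAPIHadoopRDD(hbaseConf, classOf[TableInputFormat],
      classOf[ImmutableBytesWritable], classOf[Result])

    val calculated = hbaseRdd
      .flatMap { case (_, result) => parseRow(result) }              // 1. HBase rows -> (key, object) pairs
      .groupByKey()                                                  // 2. group by key
      .map { case (key, records) => checkConditions(key, records) }  // 3. check the mathematical conditions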

When running a single instance of this step on only 2000 records, it takes around 13s (all the records are related to one key).
The HBase table we fetch the data from has 5 regions.

Our implementation uses a REST service which creates one SparkContext.
Each request we make to this service runs an instance of the application (but again, all requests use the same SparkContext).
Each request creates multiple threads which run all the application steps.
When running one request (with 10 parallel threads) the relevant stage takes about 40s for all the threads - each one of them takes 40s itself, but they run almost completely in parallel, so the total run time of one request is also 40s.
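
To make the setup concrete, here is a rough sketch of the service side (handleRequest and runApplicationSteps are placeholders for our real code):

    import java.util.concurrent.Executors
    import org.apache.spark.{SparkConf, SparkContext}

    // One SparkContext shared by every request the service receives
    val sc = new SparkContext(new SparkConf().setMaster("yarn-client").setAppName("rest-service"))
    val pool = Executors.newFixedThreadPool(10)

    // Each incoming request is handled on its own thread but submits jobs to the same context
    def handleRequest(requestId: String): Unit = {
      pool.submit(new Runnable {
        override def run(): Unit = runApplicationSteps(sc, requestId)
      })
    }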

We have allocated 10 workers, each with 512MB of memory (no need for more; it looks like the entire RDD is cached).

So the first question:
Does this run time make sense? To us it seems too long. Do you have an idea of what we are doing wrong?

The second problem, and the more serious one:
We need to run multiple parallel requests of this kind.
When doing so the run time spikes again: instead of one request that runs in about 1m (40s of which is the main stage),
we get 2 applications, both running almost in parallel, and both running for 2m.
This also happens if we use 2 different services and send each of them 1 request.
These run times grow as we send more requests.

We have also monitored the CPU usage of the node and each request makes it jump to 90%.

If we reduce the number of workers to 2, the CPU usage only jumps to about 35%, but the run time increases significantly.

This behavior seems very odd to us.
Are there any spark parameters we should consider to change?
Any other ideas? We are quite stuck on this.

Thanks in advance
Dana





---------------------------------------------------------------------
Intel Electronics Ltd.

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

Re: major Spark performance problem

Posted by Matei Zaharia <ma...@gmail.com>.
Hi Dana,

It’s hard to tell exactly what is consuming time, but I’d suggest starting by profiling the single application first. Three things to look at there:

1) How many stages and how many tasks per stage is Spark launching (in the application web UI at http://<driver>:4040)? If you have hundreds of tasks for this small a file, just the task launching time might be a problem. You can use RDD.coalesce() to have fewer data partitions.
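
For example, assuming the HBase RDD comes back with many small partitions, something like:

    // Check how many partitions the stage is running over
    println(hbaseRdd.partitions.size)

    // Merge them into fewer partitions; 10 is just an illustrative target
    val coalesced = hbaseRdd.coalesce(10)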

2) If you run a Java profiler (e.g. YourKit or hprof) on the workers while the application is executing, where is time being spent? Maybe some of your code is more expensive than it seems. One other thing you might find is that some code you use requires synchronization and is therefore not scaling properly to multiple cores (e.g. Java’s Math.random() actually does that).
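
As a hypothetical illustration of that last point, in case any of your per-record math draws random numbers:

    import java.util.concurrent.ThreadLocalRandom

    // Math.random() funnels all threads through one shared Random instance;
    // a per-thread generator (Java 7+) avoids that contention
    val x = ThreadLocalRandom.current().nextDouble()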

3) Are there any RDDs that are used over and over but not cached? In that case they’ll be recomputed on each use.
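
For example, if the parsed-objects RDD feeds more than one computation, caching it avoids redoing the HBase read and flatMap on every use (parseRow again stands in for your parsing code):

    import org.apache.spark.SparkContext._   // pair-RDD operations such as groupByKey

    val objects = hbaseRdd.flatMap { case (_, result) => parseRow(result) }.cache()

    val grouped = objects.groupByKey().map { case (key, records) => (key, records.size) }
    grouped.count()   // first action: computes objects and materializes the cache
    objects.count()   // later uses read from the cache instead of HBase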

Once you look into these it might be easier to improve the multiple-job case. In that case as others have pointed out, running the jobs in the same SparkContext and using the fair scheduler (http://spark.apache.org/docs/latest/job-scheduling.html) should work.
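
Concretely, enabling it looks roughly like this (the pool name is just an example):

    import org.apache.spark.{SparkConf, SparkContext}

    // Turn on fair scheduling of jobs inside the shared SparkContext
    val conf = new SparkConf()
      .setAppName("rest-service")
      .set("spark.scheduler.mode", "FAIR")
    val sc = new SparkContext(conf)

    // Optionally, each request thread can direct its jobs into a named pool
    sc.setLocalProperty("spark.scheduler.pool", "request-1")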

Matei


RE: major Spark performance problem

Posted by "Livni, Dana" <da...@intel.com>.
YARN also has this scheduling option.
The problem is that all of our applications have the same flow, where the first stage is the heaviest and the rest are very small.
The problem is that when several requests (applications) start to run at the same time, the first stage of each is scheduled in parallel, and for some reason they delay each other,
and a stage that alone would take around 13s can reach up to 2m when running in parallel with other identical stages (around 15 stages).





Re: major Spark performance problem

Posted by elyast <lu...@gmail.com>.
Hi,

There is also an option to run Spark applications on top of Mesos in
fine-grained mode; then fair scheduling becomes possible (applications run in
parallel and Mesos is responsible for scheduling all tasks), so in a sense all
applications will progress in parallel. In total it may not be faster, but the
benefit is the fair scheduling (small jobs will not get stuck behind the big
ones).
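
For example, something along these lines (the master URL is illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("mesos://mesos-master:5050")   // illustrative Mesos master URL
      .setAppName("rest-service")
      .set("spark.mesos.coarse", "false")       // fine-grained mode (the default)
    val sc = new SparkContext(conf)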

Best regards
Lukasz Jastrzebski



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/major-Spark-performance-problem-tp2364p2403.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: major Spark performance problem

Posted by Christopher Nguyen <ct...@adatao.com>.
Dana,

When you run multiple "applications" under Spark, and each application
takes up the entire cluster's resources, it is expected that one will block
the other completely, so you're seeing the wall times add up as if the
applications ran sequentially. In addition there is some overhead associated
with starting up a new application/SparkContext.
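
One way to keep a single application from claiming the whole cluster, on standalone or coarse-grained Mesos deployments, is to cap its cores; for example:

    import org.apache.spark.SparkConf

    // Cap how many cores this application may claim (standalone / coarse-grained Mesos);
    // on YARN the equivalent is limiting the number of executors you request
    val conf = new SparkConf()
      .setAppName("request-app")
      .set("spark.cores.max", "4")   // value is illustrative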

Your other mode of sharing a single SparkContext, if your use case allows
it, is more promising in that workers are available to work on tasks in
parallel (but ultimately still subject to maximum resource limits). Without
knowing what your actual workload is, it's hard to tell in absolute terms
whether 12 seconds is reasonable or not.

One reason for the jump from 12s in local mode to 40s in cluster mode would
be the HBase bottleneck---you apparently would have 2x10=20 clients going
against the HBase data source instead of 1 (or however many local threads
you have). Assuming this is an increase of useful work output by a factor
of 20x, a jump from 12s to 40s wall time is actually quite attractive.

NB: given my assumption that the HBase data source is not parallelized
along with the Spark cluster, you would run into sublinear performance
issues (HBase-perf-limited or network-bandwidth-limited) as you scale out
your cluster size.
--
Christopher T. Nguyen
Co-founder & CEO, Adatao <http://adatao.com>
linkedin.com/in/ctnguyen


