Posted to dev@spark.apache.org by Enno Shioji <es...@gmail.com> on 2014/12/31 17:12:40 UTC

Big performance difference between "client" and "cluster" deployment mode; is this expected?

Hi,

I have a very, very simple streaming job. When I deploy it on the exact
same cluster, with the exact same parameters, I see a big (40%) performance
difference between "client" and "cluster" deployment mode. This seems a bit
surprising. Is this expected?

The streaming job is:

    val msgStream = kafkaStream
      .map { case (k, v) => v}
      .map(DatatypeConverter.printBase64Binary)
      .foreachRDD(save)
      .saveAsTextFile("s3n://some.bucket/path", classOf[LzoCodec])
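For reference, the per-record transformation above (drop the Kafka key, Base64-encode the value) can be sketched in plain Scala outside Spark. This is an illustrative sketch, not the job itself; `java.util.Base64` is used here because `javax.xml.bind.DatatypeConverter` is no longer bundled with newer JDKs:

```scala
import java.util.Base64

object EncodeSketch {
  // Equivalent of .map { case (k, v) => v }.map(printBase64Binary):
  // discard the Kafka key, Base64-encode the value bytes into a String.
  def encode(records: Seq[(String, Array[Byte])]): Seq[String] =
    records.map { case (_, v) => Base64.getEncoder.encodeToString(v) }
}
```

For example, a record whose value is the UTF-8 bytes of "hi" encodes to "aGk=".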

I tried several times, but the job deployed in "client" mode can only
write at 60% of the throughput of the job deployed in "cluster" mode, and this
happens consistently. I'm logging at INFO level, but my application code
doesn't log anything, so only Spark logs appear. The volume of logs I see in
"client" mode doesn't seem excessive.

The setup is:
spark-ec2 [...] \
  --copy-aws-credentials \
  --instance-type=m3.2xlarge \
  -s 2 launch test_cluster

And all the deployment was done from the master machine.


Re: Big performance difference between "client" and "cluster" deployment mode; is this expected?

Posted by Enno Shioji <es...@gmail.com>.
Hi Tathagata,

It's a standalone cluster. The submit commands are:

== CLIENT
spark-submit --class com.fake.Test \
--deploy-mode client --master spark://fake.com:7077 \
fake.jar <arguments>

== CLUSTER
spark-submit --class com.fake.Test \
--deploy-mode cluster --master spark://fake.com:7077 \
s3n://fake.jar <arguments>

Both occupy all available slots (8 cores × 2 machines = 16 slots).
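In standalone cluster mode the driver itself runs on one of the workers and consumes resources there, so one sanity check is to pin the executor footprint explicitly in both submits so the two modes request identical executors. A hypothetical sketch, reusing the placeholder names above (the flag values are illustrative):

```shell
# Same resource request in both modes; only --deploy-mode differs.
spark-submit --class com.fake.Test \
  --deploy-mode client \
  --master spark://fake.com:7077 \
  --total-executor-cores 16 \
  --executor-memory 4g \
  fake.jar <arguments>
```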



On Thu, Jan 1, 2015 at 12:21 AM, Tathagata Das <ta...@gmail.com>
wrote:

> Whats your spark-submit commands in both cases? Is it Spark Standalone or
> YARN (both support client and cluster)? Accordingly what is the number of
> executors/cores requested?
>
> TD

Re: Big performance difference between "client" and "cluster" deployment mode; is this expected?

Posted by Tathagata Das <ta...@gmail.com>.
What are your spark-submit commands in both cases? Is it Spark Standalone or
YARN (both support client and cluster mode)? Accordingly, how many
executors/cores are requested?

TD

On Wed, Dec 31, 2014 at 10:36 AM, Enno Shioji <es...@gmail.com> wrote:

> Also the job was deployed from the master machine in the cluster.

Re: Big performance difference between "client" and "cluster" deployment mode; is this expected?

Posted by Enno Shioji <es...@gmail.com>.
Also the job was deployed from the master machine in the cluster.

On Wed, Dec 31, 2014 at 6:35 PM, Enno Shioji <es...@gmail.com> wrote:

> Oh sorry, that was an edit mistake. The code is essentially:
>
>      val msgStream = kafkaStream
>        .map { case (k, v) => v}
>        .map(DatatypeConverter.printBase64Binary)
>        .saveAsTextFile("s3n://some.bucket/path", classOf[LzoCodec])
>
> I.e. there is essentially no original code (I was calling saveAsTextFile
> in a "save" function but that was just a remnant from previous debugging).

Re: Big performance difference between "client" and "cluster" deployment mode; is this expected?

Posted by Enno Shioji <es...@gmail.com>.
Oh sorry, that was an edit mistake. The code is essentially:

     val msgStream = kafkaStream
       .map { case (k, v) => v}
       .map(DatatypeConverter.printBase64Binary)
       .saveAsTextFile("s3n://some.bucket/path", classOf[LzoCodec])

In other words, there is essentially no original code (I was calling
saveAsTextFile in a "save" function, but that was just a remnant from earlier
debugging).



On Wed, Dec 31, 2014 at 6:21 PM, Sean Owen <so...@cloudera.com> wrote:

> -dev, +user
>
> A decent guess: Does your 'save' function entail collecting data back
> to the driver? and are you running this from a machine that's not in
> your Spark cluster? Then in client mode you're shipping data back to a
> less-nearby machine, compared to with cluster mode. That could explain
> the bottleneck.

Re: Big performance difference between "client" and "cluster" deployment mode; is this expected?

Posted by Sean Owen <so...@cloudera.com>.
-dev, +user

A decent guess: does your 'save' function entail collecting data back
to the driver? And are you running this from a machine that's not in
your Spark cluster? If so, in client mode you're shipping data back to a
less-nearby machine compared to cluster mode. That could explain
the bottleneck.
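The guess above can be made concrete with a toy model in plain Scala (not the Spark API): under a collect-then-save pattern every record crosses the network to the driver before being written, while a distributed saveAsTextFile lets each executor write its own partition, so essentially no record data transits the driver:

```scala
object DriverTraffic {
  // Bytes shipped to the driver if the job calls rdd.collect() before saving:
  // every record from every partition is pulled back to the driver node.
  def collectThenSave(partitions: Seq[Seq[String]]): Long =
    partitions.flatten.map(_.length.toLong).sum

  // Bytes of record data shipped to the driver under a distributed
  // saveAsTextFile: each executor writes its own partition directly to
  // storage, so (to first order) none.
  def distributedSave(partitions: Seq[Seq[String]]): Long = 0L
}
```

If the submitting machine has a slower link to the cluster than the workers have to each other, a collect-style pattern magnifies that gap in client mode.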


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

