Posted to user@spark.apache.org by Shuai Zheng <sz...@gmail.com> on 2015/03/31 22:27:16 UTC

--driver-memory parameter doesn't work for spark-submit on yarn?

Hi All,

 

Below is my shell script:

 

/home/hadoop/spark/bin/spark-submit --driver-memory=5G --executor-memory=40G
--master yarn-client --class com.***.FinancialEngineExecutor
/home/hadoop/lib/my.jar s3://bucket/vriscBatchConf.properties 
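
(One thing I still need to rule out, in case the option syntax matters: the spark-submit docs show the space-separated form rather than "=", i.e. something like

/home/hadoop/spark/bin/spark-submit --driver-memory 5G --executor-memory 40G \
  --master yarn-client --class com.***.FinancialEngineExecutor \
  /home/hadoop/lib/my.jar s3://bucket/vriscBatchConf.properties

but I haven't confirmed whether the "=" form is the actual problem.)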

 

My driver will load some resources and then broadcast them to all executors.

 

That resource is only 600MB in serialized form, but I always get an out of
memory exception; it looks like the right amount of memory isn't being
allocated to my driver.

 

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.lang.reflect.Array.newArray(Native Method)
        at java.lang.reflect.Array.newInstance(Array.java:70)
        at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1670)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        at com.***.executor.support.S3FileUtils.loadCache(S3FileUtils.java:68)

 

Am I doing anything wrong here?

 

And no matter what value I set for --driver-memory (from 512M to 20G), it
always gives me an error on the same line (the line that tries to load a 600MB
Java serialization file). So it looks like the script doesn't allocate the
right memory to the driver in my case? (That would be consistent with the
driver actually running on the default 512m heap, since a 600MB array could
never fit in it.)
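
(For completeness, the other ways I know of to pass the same setting, which I still need to try:

/home/hadoop/spark/bin/spark-submit --conf spark.driver.memory=5g \
  --master yarn-client --class com.***.FinancialEngineExecutor \
  /home/hadoop/lib/my.jar s3://bucket/vriscBatchConf.properties

or a line in conf/spark-defaults.conf before launching:

spark.driver.memory    5g

As far as I understand, both should be equivalent to --driver-memory in yarn-client mode.)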

 

Regards,

 

Shuai


RE: --driver-memory parameter doesn't work for spark-submit on yarn?

Posted by Shuai Zheng <sz...@gmail.com>.
Sorry for the late reply.

I bypassed this by setting _JAVA_OPTIONS.

And here is the output of ps aux | grep spark:

hadoop   14442  0.6  0.2 34334552 128560 pts/0 Sl+  14:37   0:01 /usr/java/latest/bin/java org.apache.spark.deploy.SparkSubmitDriverBootstrapper --driver-memory=5G --executor-memory=10G --master yarn-client --class com.***.FinancialEngineExecutor /home/hadoop/lib/Engine-2.0-jar-with-dependencies.jar 
hadoop   14544  158 13.4 37206420 8472272 pts/0 Sl+ 14:37   4:21 /usr/java/latest/bin/java -cp /home/hadoop/spark/conf:/home/hadoop/conf:/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar::/home/hadoop/spark/conf:/home/hadoop/spark/lib/spark-assembly-1.3.0-hadoop2.4.0.jar:/home/hadoop/spark/lib/datanucleus-core-3.2.10.jar:/home/hadoop/spark/lib/datanucleus-rdbms-3.2.9.jar:/home/hadoop/spark/lib/datanucleus-api-jdo-3.2.6.jar:/home/hadoop/conf:/home/hadoop/conf -XX:MaxPermSize=128m -Dspark.driver.log.level=INFO -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --driver-memory=5G --executor-memory=10G --master yarn-client --class com.*executor.FinancialEngineExecutor /home/hadoop/lib/MiddlewareEngine-2.0-jar-with-dependencies.jar 

The above was run with _JAVA_OPTIONS="-Xmx30g" already set, but it doesn't show up in the command line. I guess SparkSubmit reads _JAVA_OPTIONS, but I would have expected it to be overridden by the command-line params. Not sure what happens here, and I have no time to dig into it, but if you want me to provide more information, I will be happy to do that.
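
(If it helps, a quick way to double-check what heap a running driver JVM actually got:

jps -lvm | grep SparkSubmit

or, given the pid from the ps output above,

jinfo -flag MaxHeapSize 14544

The SparkSubmit process above clearly ended up with -Xms512m -Xmx512m, i.e. the defaults, even though --driver-memory=5G was passed through on its command line.)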

Regards,

Shuai


-----Original Message-----
From: Bozeman, Christopher [mailto:bozemanc@amazon.com] 
Sent: Wednesday, April 01, 2015 4:59 PM
To: Shuai Zheng; 'Sean Owen'
Cc: 'Akhil Das'; user@spark.apache.org
Subject: RE: --driver-memory parameter doesn't work for spark-submit on yarn?

Shuai,

What did " ps aux | grep spark-submit" reveal?

When you compare using _JAVA_OPTIONS and without using it, where do you see the difference?

Thanks
Christopher






RE: --driver-memory parameter doesn't work for spark-submit on yarn?

Posted by "Bozeman, Christopher" <bo...@amazon.com>.
Shuai,

What did " ps aux | grep spark-submit" reveal?

When you compare using _JAVA_OPTIONS and without using it, where do you see the difference?

Thanks
Christopher






RE: --driver-memory parameter doesn't work for spark-submit on yarn?

Posted by Shuai Zheng <sz...@gmail.com>.
Nice.

But my case shows that even when I use yarn-client, I have the same issue. I verified it several times.

And I am running 1.3.0 on EMR (using the version distributed by the installSpark script from AWS).

I agree _JAVA_OPTIONS is not the right solution, but I will use it until 1.4.0 is out :)

Regards,

Shuai



Re: --driver-memory parameter doesn't work for spark-submit on yarn?

Posted by Sean Owen <so...@cloudera.com>.
I feel like I recognize that problem, and it's almost the inverse of
https://issues.apache.org/jira/browse/SPARK-3884 which I was looking
at today. The spark-class script didn't seem to handle all the ways
that driver memory can be set.

I think this is also something fixed by the new launcher library in 1.4.0.

_JAVA_OPTIONS is not a good solution since it's global.
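
(For context, driver memory can come from several places that the scripts have to reconcile: the --driver-memory flag on the spark-submit command line, spark.driver.memory in conf/spark-defaults.conf, and, if memory serves, a SPARK_DRIVER_MEMORY environment variable read by the launch scripts. Plenty of paths for one of them to get missed.)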



RE: --driver-memory parameter doesn't work for spark-submit on yarn?

Posted by Shuai Zheng <sz...@gmail.com>.
Hi Akhil,

 

Thanks a lot!

 

After setting export _JAVA_OPTIONS="-Xmx5g", the OutOfMemory exception disappeared. But this leaves me confused: does the --driver-memory option really not work for spark-submit to YARN (I haven't checked other clusters)? Is it a bug?

 

Regards,

 

Shuai

 

 



Re: --driver-memory parameter doesn't work for spark-submit on yarn?

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Once you submit the job, do a ps aux | grep spark-submit and see how much heap
space is allocated to the process (the -Xmx param); if you see a lower value,
you could try increasing it yourself with:

export _JAVA_OPTIONS="-Xmx5g"
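
Just note that _JAVA_OPTIONS is picked up by every JVM launched from that shell, not only the Spark driver, so it is worth unsetting it once the job has been submitted:

unset _JAVA_OPTIONS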

Thanks
Best Regards
