Posted to user@spark.apache.org by sankarmittapally <sa...@creditvidya.com> on 2016/10/24 11:19:10 UTC

JAVA heap space issue

Hi,

 I have a three-node cluster with 30G of memory. I am trying to analyze
200MB of data and I run out of memory every time. This is the command
I am using:

Driver memory = 10G
Executor memory = 10G

sc <- sparkR.session(master = "spark://ip-172-31-6-116:7077",
  sparkConfig = list(spark.executor.memory = "10g",
    spark.app.name = "Testing", spark.driver.memory = "14g",
    spark.executor.extraJavaOption = "-Xms2g -Xmx5g -XX:MaxPermSize=1024M",
    spark.driver.extraJavaOption = "-Xms2g -Xmx5g -XX:MaxPermSize=1024M",
    spark.cores.max = "2"))
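Editor's note: Spark's documented property names are plural ("spark.executor.extraJavaOptions", "spark.driver.extraJavaOptions"), so the singular keys above are simply stored and never read; Spark also expects heap sizes to come from spark.driver.memory / spark.executor.memory rather than -Xms/-Xmx flags (as of Spark 2.x it rejects -Xmx in the executor's extraJavaOptions), and -XX:MaxPermSize is only meaningful on Java 7 and earlier. A hedged sketch of the same session call using only documented keys, with values taken from the thread:

```r
# Sketch only: same cluster master and app name as in the thread;
# heap is sized through the Spark memory properties instead of JVM flags.
sc <- sparkR.session(
  master = "spark://ip-172-31-6-116:7077",
  sparkConfig = list(
    spark.app.name        = "Testing",
    spark.driver.memory   = "10g",
    spark.executor.memory = "10g",
    spark.cores.max       = "2"))
```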


[D 16:43:51.437 NotebookApp] 200 GET /api/contents?type=directory&_=1477289197671 (123.176.38.226) 7.96ms
Exception in thread "broadcast-exchange-0" java.lang.OutOfMemoryError: Java heap space
        at org.apache.spark.sql.execution.joins.LongToUnsafeRowMap.append(HashedRelation.scala:539)
        at org.apache.spark.sql.execution.joins.LongHashedRelation$.apply(HashedRelation.scala:803)
        at org.apache.spark.sql.execution.joins.HashedRelation$.apply(HashedRelation.scala:105)
        at org.apache.spark.sql.execution.joins.HashedRelationBroadcastMode.transform(HashedRelation.scala:816)
        at org.apache.spark.sql.execution.joins.HashedRelationBroadcastMode.transform(HashedRelation.scala:812)
        at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1$$anonfun$apply$1.apply(BroadcastExchangeExec.scala:90)
        at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1$$anonfun$apply$1.apply(BroadcastExchangeExec.scala:72)
        at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:94)
        at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1.apply(BroadcastExchangeExec.scala:72)
        at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1.apply(BroadcastExchangeExec.scala:72)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/JAVA-heap-space-issue-tp27950.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org
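Editor's note on the trace above: the OutOfMemoryError is raised in LongToUnsafeRowMap.append while BroadcastExchangeExec builds a hashed relation, i.e. while Spark materializes one side of a join in memory in order to broadcast it. A minimal SparkR sketch of the join shape that exercises this code path (local master and column names are illustrative, not from the thread):

```r
library(SparkR)
sparkR.session(master = "local[2]")

# Joining two data frames; when the smaller side is estimated below
# spark.sql.autoBroadcastJoinThreshold (10 MB by default), Spark builds
# the in-memory hash relation seen in the stack trace above.
big   <- createDataFrame(data.frame(id = 1:1000, v = runif(1000)))
small <- createDataFrame(data.frame(id = 1:100,  w = runif(100)))
head(join(big, small, big$id == small$id))

sparkR.session.stop()
```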


Re: JAVA heap space issue

Posted by Sankar Mittapally <sa...@creditvidya.com>.
I have a lot of SQL join operations, which are blocking me from writing the
data, and I unpersist the data when it is no longer useful.
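Editor's note: if unpersisting intermediate results between the joins is the intent, a hedged SparkR sketch (the data frame `df` and the storage level are illustrative):

```r
# Cache an intermediate join result, reuse it, then free executor memory.
df <- persist(df, "MEMORY_AND_DISK")
# ... further joins / writes that reuse df ...
unpersist(df)
```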

On Oct 24, 2016 7:50 PM, "Mich Talebzadeh" <mi...@gmail.com>
wrote:

> OK so you are disabling broadcasting although it is not obvious how this
> helps in this case!

Re: JAVA heap space issue

Posted by Mich Talebzadeh <mi...@gmail.com>.
OK, so you are disabling broadcasting, although it is not obvious how this
helps in this case!

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 24 October 2016 at 15:08, Sankar Mittapally <
sankar.mittapally@creditvidya.com> wrote:

> sc <- sparkR.session(master = "spark://ip-172-31-6-116:7077",
>   sparkConfig = list(spark.executor.memory = "10g",
>     spark.app.name = "Testing", spark.driver.memory = "14g",
>     spark.executor.extraJavaOption = "-Xms2g -Xmx5g -XX:-UseGCOverheadLimit",
>     spark.driver.extraJavaOption = "-Xms2g -Xmx5g -XX:-UseGCOverheadLimit",
>     spark.cores.max = "2", spark.sql.autoBroadcastJoinThreshold = "-1"))

Re: JAVA heap space issue

Posted by Mich Talebzadeh <mi...@gmail.com>.
OK, so what is your full launch code now? I mean the equivalent of spark-submit.




Re: JAVA heap space issue

Posted by Sankar Mittapally <sa...@creditvidya.com>.
Hi Mich,

 I am able to write the files to storage after adding an extra parameter.

FYI..

This one I used.

spark.sql.autoBroadcastJoinThreshold="-1"
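Editor's note for readers following along: setting this to -1 disables the size threshold, so Spark stops auto-broadcasting small join inputs and falls back to a shuffle-based join, avoiding the in-memory hash relation that overflowed in the original trace. A sketch of applying the same setting to an already-running SparkR session:

```r
# Disable automatic broadcast joins for subsequent queries in this session.
sql("SET spark.sql.autoBroadcastJoinThreshold=-1")
```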



On Mon, Oct 24, 2016 at 7:22 PM, Mich Talebzadeh <mi...@gmail.com>
wrote:

> Rather strange, as you have plenty of free memory there.
>
> Try reducing driver memory to 2GB and executor memory to 2GB and run it
> again:
>
> ${SPARK_HOME}/bin/spark-submit \
>                 --driver-memory 2G \
>                 --num-executors 2 \
>                 --executor-cores 1 \
>                 --executor-memory 2G \
>                 --master spark://IPAddress:7077 \
>
> HTH
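Editor's note: the quoted fragment ends with a trailing backslash, so the application to run is elided in the original. For completeness, a hedged example of a full invocation; only the flags come from the thread, and `analysis.R` is a placeholder script name:

```shell
# Placeholder end-to-end submit following the settings suggested above.
${SPARK_HOME}/bin/spark-submit \
  --driver-memory 2G \
  --num-executors 2 \
  --executor-cores 1 \
  --executor-memory 2G \
  --master spark://IPAddress:7077 \
  analysis.R
```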
>
>
> On 24 October 2016 at 13:15, Sankar Mittapally <sankar.mittapally@
> creditvidya.com> wrote:
>
>> Hi Mich,
>>
>> Yes, I am using a standalone-mode cluster. We have two executors with
>> 10G of memory each, and two workers.
>>
>> FYI..
>>
>>
>>
>> On Mon, Oct 24, 2016 at 5:22 PM, Mich Talebzadeh <
>> mich.talebzadeh@gmail.com> wrote:
>>
>>> Sounds like you are running in standalone mode.
>>>
>>> Have you checked the UI on port 4040 (the default) to see where the
>>> memory is going? Why do you need executor memory of 10GB?
>>>
>>> How many executors are running, and how many slaves have been started?
>>>
>>> In standalone mode, executors run on workers (UI on port 8080).
>>>
>>>
>>> HTH
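Editor's note: the UIs mentioned above can also be polled from a shell on the driver/master host; a sketch assuming default ports and that the REST endpoints are reachable:

```shell
# Application UI (driver, port 4040): list running applications.
curl -s http://localhost:4040/api/v1/applications
# Standalone master UI (port 8080): cluster status as JSON.
curl -s http://localhost:8080/json
```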
>>>
>>>
>>
>

Re: JAVA heap space issue

Posted by Mich Talebzadeh <mi...@gmail.com>.
Rather strange, as you have plenty of free memory there.

Try reducing driver memory to 2GB and executor memory to 2GB and run it
again:

${SPARK_HOME}/bin/spark-submit \
                --driver-memory 2G \
                --num-executors 2 \
                --executor-cores 1 \
                --executor-memory 2G \
                --master spark://IPAddress:7077 \
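[Editor's note] Since the session in the original post was created from SparkR rather than spark-submit, the equivalent reduced-memory settings can be sketched as below. Two caveats: the correct property name is spark.executor.extraJavaOptions (plural; the original post used extraJavaOption), and Spark does not allow setting heap size (-Xms/-Xmx) through extraJavaOptions anyway, so the JVM flags are dropped in favour of spark.driver.memory / spark.executor.memory. The master URL is the one from the original post; the sizes are starting points, not recommendations:

```r
library(SparkR)

# Start a SparkR session against the standalone master with modest memory.
# Heap size must be configured via spark.driver.memory and
# spark.executor.memory; Spark rejects -Xmx set via extraJavaOptions.
sparkR.session(
  master = "spark://ip-172-31-6-116:7077",
  appName = "Testing",
  sparkConfig = list(
    spark.driver.memory   = "2g",
    spark.executor.memory = "2g",
    spark.cores.max       = "2"
  )
)
```

Note also that -XX:MaxPermSize is obsolete on Java 8+, where the permanent generation no longer exists.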

HTH



Dr Mich Talebzadeh






On 24 October 2016 at 13:15, Sankar Mittapally <
sankar.mittapally@creditvidya.com> wrote:

> Hi Mich,
>
>  Yes, I am using a standalone-mode cluster. We have two executors with 10G
> memory each, and two workers.
>
> FYI..
>
>
>
> On Mon, Oct 24, 2016 at 5:22 PM, Mich Talebzadeh <
> mich.talebzadeh@gmail.com> wrote:
>
>> Sounds like you are running in standalone mode.
>>
>> Have you checked the UI on port 4040 (default) to see where memory is
>> going? Why do you need executor memory of 10GB?
>>
>> How many executors are running, and how many slaves have been started?
>>
>> In standalone mode, executors run on workers (UI 8080)
>>
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>>
>>
>>
>> On 24 October 2016 at 12:19, sankarmittapally <
>> sankar.mittapally@creditvidya.com> wrote:
>>
>>> [original message and stack trace quoted in full; snipped]
>>
>

Re: JAVA heap space issue

Posted by Sankar Mittapally <sa...@creditvidya.com>.
Hi Mich,

 Yes, I am using a standalone-mode cluster. We have two executors with 10G
memory each, and two workers.

FYI..



On Mon, Oct 24, 2016 at 5:22 PM, Mich Talebzadeh <mi...@gmail.com>
wrote:

> Sounds like you are running in standalone mode.
>
> Have you checked the UI on port 4040 (default) to see where memory is
> going? Why do you need executor memory of 10GB?
>
> How many executors are running, and how many slaves have been started?
>
> In standalone mode, executors run on workers (UI 8080)
>
>
> HTH
>
> Dr Mich Talebzadeh
>
>
>
>
>
>
> On 24 October 2016 at 12:19, sankarmittapally <sankar.mittapally@
> creditvidya.com> wrote:
>
>> [original message and stack trace quoted in full; snipped]
>>
>

Re: JAVA heap space issue

Posted by Mich Talebzadeh <mi...@gmail.com>.
Sounds like you are running in standalone mode.

Have you checked the UI on port 4040 (default) to see where memory is
going? Why do you need executor memory of 10GB?

How many executors are running, and how many slaves have been started?

In standalone mode, executors run on workers (UI 8080)
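[Editor's note] The stack trace itself is informative: the OOM happens inside BroadcastExchangeExec while building a HashedRelation, i.e. Spark is materializing one side of a join in memory in order to broadcast it. If that side is larger than expected, disabling automatic broadcast joins may avoid the error. A sketch, reusing the master URL from the original post (the -1 threshold is Spark's standard way to turn auto-broadcast off):

```r
library(SparkR)

# -1 disables automatic broadcast joins; Spark falls back to a shuffle-based
# join instead of building the in-memory hash relation that overflowed here.
sparkR.session(
  master = "spark://ip-172-31-6-116:7077",
  sparkConfig = list(spark.sql.autoBroadcastJoinThreshold = "-1")
)
```

The same property can also be changed on an existing session with sql("SET spark.sql.autoBroadcastJoinThreshold=-1").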


HTH

Dr Mich Talebzadeh






On 24 October 2016 at 12:19, sankarmittapally <
sankar.mittapally@creditvidya.com> wrote:

> [original message and stack trace quoted in full; snipped]
>