Posted to user@spark.apache.org by hansen <ha...@neusoft.com> on 2014/06/30 09:58:18 UTC
How to control the memory used by a Spark application (executor) per node?
Hi,
When I run the following statements in spark-shell:

val file = sc.textFile("hdfs://nameservice1/user/study/spark/data/soc-LiveJournal1.txt")
val count = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
println(count.count())

it throws an exception:
......
14/06/30 15:50:53 WARN TaskSetManager: Loss was due to java.lang.OutOfMemoryError
java.lang.OutOfMemoryError: Java heap space
    at java.io.ObjectOutputStream$HandleTable.growEntries(ObjectOutputStream.java:2346)
    at java.io.ObjectOutputStream$HandleTable.assign(ObjectOutputStream.java:2275)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1427)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:28)
    at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:176)
    at org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:164)
    at org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:161)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at org.apache.spark.util.collection.ExternalAppendOnlyMap$ExternalIterator.foreach(ExternalAppendOnlyMap.scala:239)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
    at org.apache.spark.scheduler.Task.run(Task.scala:53)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Then I set the following configuration in spark-env.sh:

export SPARK_EXECUTOR_MEMORY=1G

but it did not help.
spark.png <http://apache-spark-user-list.1001560.n3.nabble.com/file/n8521/spark.png>
I also noticed that when I start spark-shell, the console prints:

SparkDeploySchedulerBackend: Granted executor ID app-20140630144110-0002/0 on hostPort dlx8:7078 with 8 cores, *512.0 MB RAM*

How can I increase this 512.0 MB RAM to more memory? Please help!
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-control-a-spark-application-executor-using-memory-amount-per-node-tp8521.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: How to control the memory used by a Spark application (executor) per node?
Posted by MEETHU MATHEW <me...@yahoo.co.in>.
Hi,
Try setting driver-java-options with spark-submit, or set spark.executor.extraJavaOptions in spark-defaults.conf.
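For the executor heap size specifically, a minimal sketch of the relevant settings (assuming Spark 1.0+ launched through spark-submit/spark-shell; the 2g value is only an example, adjust to your cluster):

```
# conf/spark-defaults.conf
spark.executor.memory   2g

# or equivalently on the command line:
#   ./bin/spark-shell --executor-memory 2g
#   ./bin/spark-submit --executor-memory 2g ...
```

If the setting takes effect, the "Granted executor ID ... RAM" line printed at startup should report the new value instead of the 512.0 MB default.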
Thanks & Regards,
Meethu M
On Monday, 30 June 2014 1:28 PM, hansen <ha...@neusoft.com> wrote: