Posted to issues@spark.apache.org by "Kazuaki Ishizaki (JIRA)" <ji...@apache.org> on 2018/02/19 00:49:00 UTC

[jira] [Commented] (SPARK-23427) spark.sql.autoBroadcastJoinThreshold causing OOM exception in the driver

    [ https://issues.apache.org/jira/browse/SPARK-23427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368728#comment-16368728 ] 

Kazuaki Ishizaki commented on SPARK-23427:
------------------------------------------

Thank you. I ran this program several times with a 64GB heap size. I saw the following OOM in both cases, `-1` (disabled) and the default (`10 * 1024 * 1024`, i.e. 10MB). I am running the program with other heap sizes as well.
Is this the OOM that you are seeing? If not, I would appreciate it if you could upload the stack trace from when the OOM occurred.

{code:java}
[info] org.apache.spark.sql.MyTest *** ABORTED *** (2 hours, 14 minutes, 36 seconds)
[info] java.lang.OutOfMemoryError:
[info] at java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:161)
[info] at java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:155)
[info] at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:125)
[info] at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
[info] at java.lang.StringBuilder.append(StringBuilder.java:136)
[info] at java.lang.StringBuilder.append(StringBuilder.java:131)
[info] at scala.StringContext.standardInterpolator(StringContext.scala:125)
[info] at scala.StringContext.s(StringContext.scala:95)
[info] at org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:199)
[info] at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:74)
[info] at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252)
[info] at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
[info] at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
[info] at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:3295)
[info] at org.apache.spark.sql.Dataset.createOrReplaceTempView(Dataset.scala:3033)
[info] at org.apache.spark.sql.MyTest$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(MyTest.scala:87)
[info] at org.apache.spark.sql.catalyst.plans.PlanTestBase$class.withSQLConf(PlanTest.scala:176)
[info] at org.apache.spark.sql.MyTest.org$apache$spark$sql$test$SQLTestUtilsBase$$super$withSQLConf(MyTest.scala:27)
[info] at org.apache.spark.sql.test.SQLTestUtilsBase$class.withSQLConf(SQLTestUtils.scala:167)
[info] at org.apache.spark.sql.MyTest.withSQLConf(MyTest.scala:27)
[info] at org.apache.spark.sql.MyTest$$anonfun$1.apply$mcV$sp(MyTest.scala:65)
[info] at org.apache.spark.sql.MyTest$$anonfun$1.apply(MyTest.scala:65)
[info] at org.apache.spark.sql.MyTest$$anonfun$1.apply(MyTest.scala:65)
...
{code}


> spark.sql.autoBroadcastJoinThreshold causing OOM exception in the driver 
> -------------------------------------------------------------------------
>
>                 Key: SPARK-23427
>                 URL: https://issues.apache.org/jira/browse/SPARK-23427
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: SPARK 2.0 version
>            Reporter: Dhiraj
>            Priority: Critical
>
> We are facing an issue around the value of spark.sql.autoBroadcastJoinThreshold.
> With spark.sql.autoBroadcastJoinThreshold set to -1 (disabled), driver memory usage stays flat.
> With any other value (10MB, 5MB, 2MB, 1MB, 10K, 1K) we see driver memory usage grow at a rate that depends on the size of the autoBroadcastThreshold, until we get an OOM exception. The problem is that the memory used by the auto-broadcast is not being freed up in the driver.
> The application imports Oracle tables as master DataFrames, which are persisted. Each job applies filters to these tables and then registers them as temp views. SQL queries are then used to process the data further. At the end, all the intermediate DataFrames are unpersisted.
>  
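For reference, the load/persist/filter/query/unpersist cycle described above can be sketched roughly as follows. This is a minimal sketch, not the reporter's actual application; the JDBC URL, table name, and filter predicate are placeholders:

```scala
import org.apache.spark.sql.SparkSession

object BroadcastThresholdRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("autoBroadcastJoinThreshold repro")
      // -1 disables auto-broadcast joins; any positive byte size enables them
      .config("spark.sql.autoBroadcastJoinThreshold", "-1")
      .getOrCreate()

    // Master DataFrame imported from Oracle and persisted
    // (connection details below are placeholders)
    val master = spark.read.format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/service")
      .option("dbtable", "MASTER_TABLE")
      .load()
      .persist()

    // Per-job: filter, register as a temp view, run SQL, then unpersist
    val filtered = master.filter("STATUS = 'ACTIVE'").persist()
    filtered.createOrReplaceTempView("active_master")
    spark.sql("SELECT COUNT(*) FROM active_master").show()
    filtered.unpersist()

    master.unpersist()
    spark.stop()
  }
}
```

With a positive threshold, any join against a view whose estimated size falls under it is planned as a broadcast join, which is where the reported driver-side memory growth is observed.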



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org