Posted to issues@spark.apache.org by "Steve Johnston (JIRA)" <ji...@apache.org> on 2016/04/05 01:31:25 UTC
[jira] [Commented] (SPARK-14389) OOM during BroadcastNestedLoopJoin
[ https://issues.apache.org/jira/browse/SPARK-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15225265#comment-15225265 ]
Steve Johnston commented on SPARK-14389:
----------------------------------------
The sample script, data, etc. are contrived to demonstrate the problem. This OOM occurs at various data sizes, query complexities, and cluster configurations. We mostly see it as a deviation in our experiments.
> OOM during BroadcastNestedLoopJoin
> ----------------------------------
>
> Key: SPARK-14389
> URL: https://issues.apache.org/jira/browse/SPARK-14389
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.0
> Environment: OS: Amazon Linux AMI 2015.09
> EMR: 4.3.0
> Hadoop: Amazon 2.7.1
> Spark 1.6.0
> Ganglia 3.7.2
> Master: m3.xlarge
> Core: m3.xlarge
> m3.xlarge: 4 CPU, 15GB mem, 2x40GB SSD
> Reporter: Steve Johnston
> Attachments: lineitem.tbl, sample_script.py, stdout.txt
>
>
> When executing the attached sample_script.py in client mode with a single executor, a "java.lang.OutOfMemoryError: Java heap space" exception occurs during the self join of a small table, TPC-H lineitem generated for a 1M dataset. Also see the attached execution log, stdout.txt.