Posted to issues@spark.apache.org by "Cheng Lian (JIRA)" <ji...@apache.org> on 2016/11/22 06:54:59 UTC
[jira] [Comment Edited] (SPARK-18403) ObjectHashAggregateSuite is being flaky (occasional OOM errors)
[ https://issues.apache.org/jira/browse/SPARK-18403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15684659#comment-15684659 ]
Cheng Lian edited comment on SPARK-18403 at 11/22/16 6:54 AM:
--------------------------------------------------------------
Here is a minimal test case (add it to {{ObjectHashAggregateSuite}}) that reproduces this issue reliably:
{code}
test("oom") {
  withSQLConf(
    SQLConf.USE_OBJECT_HASH_AGG.key -> "true",
    SQLConf.OBJECT_AGG_SORT_BASED_FALLBACK_THRESHOLD.key -> "1"
  ) {
    Seq(Tuple1(Seq.empty[Int]))
      .toDF("c0")
      .groupBy(lit(1))
      .agg(typed_count($"c0"), max($"c0"))
      .show()
  }
}
{code}
What I observed is that the partial aggregation phase produces a malformed {{UnsafeRow}} after applying the {{resultProjection}} [here|https://github.com/apache/spark/blob/07beb5d21c6803e80733149f1560c71cd3cacc86/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggregationIterator.scala#L254].
When printed, the malformed {{UnsafeRow}} is always
{noformat}
[0,0,2000000008,2800000008,100000000000000,5a5a5a5a5a5a5a5a]
{noformat}
The {{5a5a5a5a5a5a5a5a}} word is interpreted as the length of an {{ArrayData}}. Therefore, the JVM blows up when trying to allocate a huge array while deep copying this {{ArrayData}} in a later phase.
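For context, an all-{{0x5a}} word is a strong hint that the row is reading freed memory: if I recall correctly, {{0x5a}} is the byte pattern Spark's {{MemoryAllocator}} fills freed pages with when memory debugging is enabled. A back-of-the-envelope sketch (a hypothetical standalone snippet, not part of the suite) of what happens when such a word is misread as an element count:

```scala
// Hypothetical sketch: what a 0x5a-filled word looks like when it is
// misread as an ArrayData element count.
object PoisonedLength {
  def main(args: Array[String]): Unit = {
    // A freed-memory fill word (every byte 0x5a), as seen in the printed UnsafeRow.
    val word = 0x5a5a5a5a5a5a5a5aL
    // Read the lower 32 bits back as a length field.
    val numElements = (word & 0xffffffffL).toInt
    println(numElements)  // 1515870810, i.e. ~1.5 billion elements
    // Deep copying an Int array of that size needs ~4 bytes per element:
    val gib = numElements.toLong * 4 / (1L << 30)
    println(s"~$gib GiB")  // ~5 GiB -- more than enough to OOM a test JVM
  }
}
```

So even though the input array is empty, a single poisoned length word is enough to trigger the occasional OOM the suite is seeing.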
[~sameer] and [~davies], would you mind taking a look at this issue? Thanks!
> ObjectHashAggregateSuite is being flaky (occasional OOM errors)
> ---------------------------------------------------------------
>
> Key: SPARK-18403
> URL: https://issues.apache.org/jira/browse/SPARK-18403
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Cheng Lian
> Assignee: Cheng Lian
> Fix For: 2.2.0
>
>
> This test suite fails occasionally on Jenkins due to OOM errors. I've already reproduced it locally but haven't figured out the root cause.
> We should probably disable it temporarily before getting it fixed so that it doesn't break the PR build too often.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org