Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2019/12/02 02:17:00 UTC

[jira] [Updated] (SPARK-30081) StreamingAggregationSuite failure on zLinux(big endian)

     [ https://issues.apache.org/jira/browse/SPARK-30081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-30081:
---------------------------------
    Description: 
The tests fail in 3 instances; the first two fail with:

{code}
[info] - SPARK-23004: Ensure that TypedImperativeAggregate functions do not throw errors - state format version 1 *** FAILED *** (760 milliseconds)
[info] Assert on query failed: : Query [id = 065b66ad-227a-46a4-9d9d-50d27672f02a, runId = 99c001b7-45df-4977-89b6-f68970378f4b] terminated with exception: Job aborted due to stage failure: Task 0 in stage 192.0 failed 1 times, most recent failure: Lost task 0.0 in stage 192.0 (TID 518, localhost, executor driver): java.lang.AssertionError: sizeInBytes (76) should be a multiple of 8
[info] at org.apache.spark.sql.catalyst.expressions.UnsafeRow.pointTo(UnsafeRow.java:168)
[info] at org.apache.spark.sql.execution.UnsafeKVExternalSorter$KVSorterIterator.next(UnsafeKVExternalSorter.java:297)
[info] at org.apache.spark.sql.execution.aggregate.SortBasedAggregator$$anon$1.<init>(ObjectAggregationIterator.scala:242)
[info] at org.apache.spark.sql.execution.aggregate.SortBasedAggregator.destructiveIterator(ObjectAggregationIterator.scala:239)
[info] at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.processInputs(ObjectAggregationIterator.scala:198)
[info] at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.<init>(ObjectAggregationIterator.scala:78)
[info] at org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec$$anonfun$doExecute$1$$anonfun$2.apply(ObjectHashAggregateExec.scala:114)
[info] at org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec$$anonfun$doExecute$1$$anonfun$2.apply(ObjectHashAggregateExec.scala:105)
[info] at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:823)
[info] at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:823)
[info] at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
[info] at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
{code}
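
For reference, the assertion firing here is the word-alignment guard in UnsafeRow.pointTo (UnsafeRow.java:168 in the trace). A simplified sketch of that guard, reconstructed from the assertion message; the class skeleton and field assignments are illustrative, not the exact Spark source:

{code}
// Simplified sketch of the guard that throws above, reconstructed from the
// assertion message and the UnsafeRow.pointTo frame in the trace; the class
// skeleton and field assignments are illustrative, not the exact Spark source.
public final class UnsafeRowSketch {
  private Object baseObject;
  private long baseOffset;
  private int sizeInBytes;

  public void pointTo(Object baseObject, long baseOffset, int sizeInBytes) {
    // Every UnsafeRow is expected to occupy a whole number of 8-byte words,
    // so a sizeInBytes of 76 (as in the failure above) trips this check.
    assert sizeInBytes % 8 == 0 : "sizeInBytes (" + sizeInBytes + ") should be a multiple of 8";
    this.baseObject = baseObject;
    this.baseOffset = baseOffset;
    this.sizeInBytes = sizeInBytes;
  }
}
{code}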

 
and the third one is:

{code}
[info] - simple count, update mode - recovery from checkpoint uses state format version 1 *** FAILED *** (1 second, 21 milliseconds)
[info] == Results ==
[info] !== Correct Answer - 3 == == Spark Answer - 3 ==
[info] !struct<_1:int,_2:int> struct<value:int,count(1):bigint>
[info] [1,1] [1,1]
[info] ![2,2] [2,1]
[info] ![3,3] [3,1]
[info]
[info]
[info] == Progress ==
[info] StartStream(ProcessingTime(0),org.apache.spark.util.SystemClock@f12c12fb,Map(spark.sql.streaming.aggregation.stateFormatVersion -> 2),/scratch/devleish/spark/target/tmp/spark-5a533a9c-da17-41f9-a7d4-c3309d1c2b6f)
[info] AddData to MemoryStream[value#1713]: 3,2,1
[info] => CheckLastBatch: [3,3],[2,2],[1,1]
[info] AssertOnQuery(<condition>, name)
[info] AddData to MemoryStream[value#1713]: 4,4,4,4
[info] CheckLastBatch: [4,4]
{code}


This commit [https://github.com/apache/spark/commit/ebbe589d12434bc108672268bee05a7b7e571ee6] ensures that the value is a multiple of 8, but it looks like that does not hold on big-endian platforms.
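
For illustration, word alignment in Spark is normally done by rounding a byte count up to the next multiple of 8, in the spirit of org.apache.spark.unsafe.array.ByteArrayMethods.roundNumberOfBytesToNearestWord. A minimal, self-contained sketch of that rounding (illustrative only, not the exact code changed by the commit above):

{code}
// Minimal sketch of 8-byte word rounding in the spirit of
// org.apache.spark.unsafe.array.ByteArrayMethods.roundNumberOfBytesToNearestWord;
// illustrative only, not the exact code changed by the commit above.
public final class WordAlignment {
  static int roundNumberOfBytesToNearestWord(int numBytes) {
    int remainder = numBytes & 0x07; // equivalent to numBytes % 8
    return remainder == 0 ? numBytes : numBytes + (8 - remainder);
  }

  public static void main(String[] args) {
    // A 76-byte payload should be padded to 80 before UnsafeRow.pointTo sees it,
    // so a raw 76 reaching the assertion suggests the recorded length (or the
    // padding step) goes wrong somewhere on the big-endian path.
    System.out.println(roundNumberOfBytesToNearestWord(76)); // 80
    System.out.println(roundNumberOfBytesToNearestWord(80)); // 80
  }
}
{code}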



> StreamingAggregationSuite failure on zLinux(big endian)
> -------------------------------------------------------
>
>                 Key: SPARK-30081
>                 URL: https://issues.apache.org/jira/browse/SPARK-30081
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL, Tests
>    Affects Versions: 2.4.4
>            Reporter: Dev Leishangthem
>            Priority: Major
>



