Posted to issues@spark.apache.org by "Saisai Shao (JIRA)" <ji...@apache.org> on 2014/08/11 12:13:11 UTC

[jira] [Created] (SPARK-2967) Several SQL unit test failed when sort-based shuffle is enabled

Saisai Shao created SPARK-2967:
----------------------------------

             Summary: Several SQL unit test failed when sort-based shuffle is enabled
                 Key: SPARK-2967
                 URL: https://issues.apache.org/jira/browse/SPARK-2967
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.1.0
            Reporter: Saisai Shao


Several SQLQuerySuite unit tests failed when sort-based shuffle is enabled. It seems the SQL tests use GenericMutableRow, whose mutability causes every entry in ExternalSorter's internal buffer to end up referring to the same object. The row should probably be copied before being fed into ExternalSorter.
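A minimal sketch of the aliasing pitfall (illustrative only, not the actual Spark code; the MutableRow class here is a hypothetical stand-in for GenericMutableRow): if a sorter buffers references to a mutable row that the producer keeps reusing, every buffered entry ends up holding the last value written, so a defensive copy is needed.

```java
import java.util.ArrayList;
import java.util.List;

public class MutableRowAliasing {
    // Tiny stand-in for a mutable row (e.g. GenericMutableRow): one mutable field.
    static final class MutableRow {
        int value;
        MutableRow copy() {
            MutableRow r = new MutableRow();
            r.value = value;
            return r;
        }
    }

    public static void main(String[] args) {
        MutableRow reused = new MutableRow();
        List<MutableRow> buggyBuffer = new ArrayList<>();
        List<MutableRow> safeBuffer = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            reused.value = i;              // producer mutates the same object in place
            buggyBuffer.add(reused);       // buffers a reference: all entries alias one row
            safeBuffer.add(reused.copy()); // defensive copy preserves each value
        }
        StringBuilder buggy = new StringBuilder("buggy:");
        for (MutableRow r : buggyBuffer) buggy.append(' ').append(r.value);
        StringBuilder safe = new StringBuilder("safe:");
        for (MutableRow r : safeBuffer) safe.append(' ').append(r.value);
        System.out.println(buggy);  // buggy: 3 3 3
        System.out.println(safe);   // safe: 1 2 3
    }
}
```

This matches the failure pattern in the log below, where every row of the Spark answer is the same tuple repeated.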

The errors are shown below; there are many failures, but I have only pasted part of them:

{noformat}
 SQLQuerySuite:
 - SPARK-2041 column name equals tablename
 - SPARK-2407 Added Parser of SQL SUBSTR()
 - index into array
 - left semi greater than predicate
 - index into array of arrays
 - agg *** FAILED ***
   Results do not match for query:
   Aggregate ['a], ['a,SUM('b) AS c1#38]
    UnresolvedRelation None, testData2, None
   
   == Analyzed Plan ==
   Aggregate [a#4], [a#4,SUM(CAST(b#5, LongType)) AS c1#38L]
    SparkLogicalPlan (ExistingRdd [a#4,b#5], MapPartitionsRDD[7] at mapPartitions at basicOperators.scala:215)
   
   == Physical Plan ==
   Aggregate false, [a#4], [a#4,SUM(PartialSum#40L) AS c1#38L]
    Exchange (HashPartitioning [a#4], 200)
     Aggregate true, [a#4], [a#4,SUM(CAST(b#5, LongType)) AS PartialSum#40L]
      ExistingRdd [a#4,b#5], MapPartitionsRDD[7] at mapPartitions at basicOperators.scala:215
   
   == Results ==
   !== Correct Answer - 3 ==   == Spark Answer - 3 ==
   !Vector(1, 3)               [1,3]
   !Vector(2, 3)               [1,3]
   !Vector(3, 3)               [1,3] (QueryTest.scala:53)
 - aggregates with nulls
 - select *
 - simple select
 - sorting *** FAILED ***
   Results do not match for query:
   Sort ['a ASC,'b ASC]
    Project [*]
     UnresolvedRelation None, testData2, None
   
   == Analyzed Plan ==
   Sort [a#4 ASC,b#5 ASC]
    Project [a#4,b#5]
     SparkLogicalPlan (ExistingRdd [a#4,b#5], MapPartitionsRDD[7] at mapPartitions at basicOperators.scala:215)
   
   == Physical Plan ==
   Sort [a#4 ASC,b#5 ASC], true
    Exchange (RangePartitioning [a#4 ASC,b#5 ASC], 200)
     ExistingRdd [a#4,b#5], MapPartitionsRDD[7] at mapPartitions at basicOperators.scala:215
   
   == Results ==
   !== Correct Answer - 6 ==   == Spark Answer - 6 ==
   !Vector(1, 1)               [3,2]
   !Vector(1, 2)               [3,2]
   !Vector(2, 1)               [3,2]
   !Vector(2, 2)               [3,2]
   !Vector(3, 1)               [3,2]
   !Vector(3, 2)               [3,2] (QueryTest.scala:53)
 - limit
 - average
 - average overflow *** FAILED ***
   Results do not match for query:
   Aggregate ['b], [AVG('a) AS c0#90,'b]
    UnresolvedRelation None, largeAndSmallInts, None
   
   == Analyzed Plan ==
   Aggregate [b#3], [AVG(CAST(a#2, LongType)) AS c0#90,b#3]
    SparkLogicalPlan (ExistingRdd [a#2,b#3], MapPartitionsRDD[4] at mapPartitions at basicOperators.scala:215)
   
   == Physical Plan ==
   Aggregate false, [b#3], [(CAST(SUM(PartialSum#93L), DoubleType) / CAST(SUM(PartialCount#94L), DoubleType)) AS c0#90,b#3]
    Exchange (HashPartitioning [b#3], 200)
     Aggregate true, [b#3], [b#3,COUNT(CAST(a#2, LongType)) AS PartialCount#94L,SUM(CAST(a#2, LongType)) AS PartialSum#93L]
      ExistingRdd [a#2,b#3], MapPartitionsRDD[4] at mapPartitions at basicOperators.scala:215
   
   == Results ==
   !== Correct Answer - 2 ==   == Spark Answer - 2 ==
   !Vector(2.0, 2)             [2.147483645E9,1]
   !Vector(2.147483645E9, 1)   [2.147483645E9,1] (QueryTest.scala:53)
{noformat}

--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org