Posted to issues@spark.apache.org by "Kris Mok (JIRA)" <ji...@apache.org> on 2017/06/09 20:29:18 UTC

[jira] [Created] (SPARK-21041) With whole-stage codegen, SparkSession.range()'s behavior is inconsistent with SparkContext.range()

Kris Mok created SPARK-21041:
--------------------------------

             Summary: With whole-stage codegen, SparkSession.range()'s behavior is inconsistent with SparkContext.range()
                 Key: SPARK-21041
                 URL: https://issues.apache.org/jira/browse/SPARK-21041
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.2.0
            Reporter: Kris Mok


With whole-stage codegen enabled, SparkSession.range() handles integer overflow differently than it does with codegen turned off. The non-codegen behavior is consistent with SparkContext.range(); the codegen behavior is not.

The following Spark Shell session shows the inconsistency:
{code:scala}
scala> sc.range
def range(start: Long,end: Long,step: Long,numSlices: Int): org.apache.spark.rdd.RDD[Long]

scala> spark.range
def range(start: Long,end: Long,step: Long,numPartitions: Int): org.apache.spark.sql.Dataset[Long]
def range(start: Long,end: Long,step: Long): org.apache.spark.sql.Dataset[Long]
def range(start: Long,end: Long): org.apache.spark.sql.Dataset[Long]
def range(end: Long): org.apache.spark.sql.Dataset[Long]

scala> sc.range(java.lang.Long.MAX_VALUE - 3, java.lang.Long.MIN_VALUE + 2, 1).collect
res1: Array[Long] = Array()

scala> spark.range(java.lang.Long.MAX_VALUE - 3, java.lang.Long.MIN_VALUE + 2, 1).collect
res2: Array[Long] = Array(9223372036854775804, 9223372036854775805, 9223372036854775806)

scala> spark.conf.set("spark.sql.codegen.wholeStage", false)

scala> spark.range(java.lang.Long.MAX_VALUE - 3, java.lang.Long.MIN_VALUE + 2, 1).collect
res5: Array[Long] = Array()
{code}
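
For reference, the element count that a range with these arguments should have can be checked with overflow-free arithmetic. The sketch below is only an illustration: RangeCount and expectedCount are hypothetical names, not Spark APIs, and it makes no claim about where the codegen path goes wrong. It widens to BigInt so that end - start cannot wrap around.

```scala
// Hypothetical helper, not part of Spark: counts the elements that
// range(start, end, step) should contain, using BigInt so that the
// subtraction end - start cannot overflow.
object RangeCount {
  def expectedCount(start: Long, end: Long, step: Long): BigInt = {
    require(step != 0, "step must be non-zero")
    val span = BigInt(end) - BigInt(start)
    val stepSign = java.lang.Long.signum(step)
    // The range is empty when it has zero width or when step points away from end.
    if (span.signum != stepSign) BigInt(0)
    // Otherwise round the quotient up: ceil(span / step).
    else (span + BigInt(step) - BigInt(stepSign)) / BigInt(step)
  }
}

// RangeCount.expectedCount(Long.MaxValue - 3, Long.MinValue + 2, 1)  // 0
// (Long.MinValue + 2) - (Long.MaxValue - 3)                          // 6, wrapped Long subtraction
```

For the arguments used in the session above, the expected count is 0, which matches the empty arrays returned by sc.range and by spark.range with codegen turned off. The same subtraction done in plain Long arithmetic wraps around to 6, which shows how easily the bounds of this range overflow.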


