Posted to issues@spark.apache.org by "Herman van Hovell (JIRA)" <ji...@apache.org> on 2016/03/01 15:58:18 UTC

[jira] [Commented] (SPARK-13552) Incorrect data for Long.minValue in SQLQuerySuite on IBM Java

    [ https://issues.apache.org/jira/browse/SPARK-13552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173856#comment-15173856 ] 

Herman van Hovell commented on SPARK-13552:
-------------------------------------------

Yeah, -2^32 points for me. Should have thought longer about that before posting :S...

> Incorrect data for Long.minValue in SQLQuerySuite on IBM Java
> -------------------------------------------------------------
>
>                 Key: SPARK-13552
>                 URL: https://issues.apache.org/jira/browse/SPARK-13552
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: IBM Java only, all platforms
>            Reporter: Adam Roberts
>            Priority: Minor
>         Attachments: DefectBadMinValueLongResized.jpg
>
>
> The Long.minValue test fails on IBM Java 8; with the slightly simplified test case below, we get an incorrect answer:
> {code:scala}
> val tester = sql(s"SELECT ${Long.MinValue} FROM testData")
> {code}
> The result is _-9,223,372,041,149,743,104_ instead of _-9,223,372,036,854,775,808_; the two magnitudes differ by exactly 2^32, i.e. a single bit in the binary representation.
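> As a quick sanity check (an illustrative plain-Scala snippet, not part of the test suite), the single-bit difference can be verified directly:
> {code:scala}
> // The magnitudes are 2^63 and 2^63 + 2^32, so they differ in exactly one bit.
> val expected = BigInt("-9223372036854775808")   // -2^63, i.e. Long.MinValue
> val actual   = BigInt("-9223372041149743104")   // the value returned on IBM Java
> assert(expected - actual == BigInt(2).pow(32))
> assert((-expected).bitCount == 1)               // 2^63: a single bit set
> assert((-actual).bitCount == 2)                 // 2^63 + 2^32: one extra bit
> {code}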
> Here's the full test output:
> {code}
> Results do not match for query:
> == Parsed Logical Plan ==
> 'GlobalLimit 1
> +- 'LocalLimit 1
>    +- 'Sort ['key ASC], true
>       +- 'Project [unresolvedalias(-9223372036854775808, None)]
>          +- 'UnresolvedRelation `testData`, None
> == Analyzed Logical Plan ==
> (-9223372036854775808): decimal(19,0)
> GlobalLimit 1
> +- LocalLimit 1
>    +- Project [(-9223372036854775808)#4391]
>       +- Sort [key#101 ASC], true
>          +- Project [-9223372036854775808 AS (-9223372036854775808)#4391,key#101]
>             +- SubqueryAlias testData
>                +- LogicalRDD [key#101,value#102], MapPartitionsRDD[3] at beforeAll at BeforeAndAfterAll.scala:187
> == Optimized Logical Plan ==
> GlobalLimit 1
> +- LocalLimit 1
>    +- Project [(-9223372036854775808)#4391]
>       +- Sort [key#101 ASC], true
>          +- Project [-9223372036854775808 AS (-9223372036854775808)#4391,key#101]
>             +- LogicalRDD [key#101,value#102], MapPartitionsRDD[3] at beforeAll at BeforeAndAfterAll.scala:187
> == Physical Plan ==
> TakeOrderedAndProject(limit=1, orderBy=[key#101 ASC], output=[(-9223372036854775808)#4391])
> +- WholeStageCodegen
>    :  +- Project [-9223372036854775808 AS (-9223372036854775808)#4391,key#101]
>    :     +- INPUT
>    +- Scan ExistingRDD[key#101,value#102]
> == Results ==
> !== Correct Answer - 1 ==   == Spark Answer - 1 ==
> ![-9223372036854775808]     [-9223372041149743104]
> {code}
> Debugging in IntelliJ shows that the query is parsed correctly and we eventually have a schema with the correct data in the struct field, but the BigDecimal's underlying BigInteger is incorrect by the time we have a GenericRowWithSchema.
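> For reference, a sketch of that inspection (using the tester DataFrame from the snippet above; getDecimal and unscaledValue are standard Row / java.math.BigDecimal API):
> {code:scala}
> // Collect the single projected column and inspect the decimal's backing BigInteger.
> val row = tester.collect().head      // a GenericRowWithSchema
> val dec = row.getDecimal(0)          // java.math.BigDecimal, decimal(19,0) per the plan
> println(dec.unscaledValue())         // already holds the wrong value, -9223372041149743104
> {code}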
> I've identified that the problem started when SPARK-12575 was implemented, and I suspect the following paragraph is important:
> "Hive and the SQL Parser treat decimal literals differently. Hive will turn any decimal into a Double whereas the SQL Parser would convert a non-scientific decimal into a BigDecimal, and would turn a scientific decimal into a Double. We follow Hive's behavior here. The new parser supports a big decimal literal, for instance: 81923801.42BD, which can be used when a big decimal is needed."



