Posted to issues@spark.apache.org by "DB Tsai (JIRA)" <ji...@apache.org> on 2019/01/24 06:02:00 UTC
[jira] [Resolved] (SPARK-26706) Fix Cast$mayTruncate for bytes
[ https://issues.apache.org/jira/browse/SPARK-26706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
DB Tsai resolved SPARK-26706.
-----------------------------
Resolution: Resolved
> Fix Cast$mayTruncate for bytes
> ------------------------------
>
> Key: SPARK-26706
> URL: https://issues.apache.org/jira/browse/SPARK-26706
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.3, 2.0.2, 2.1.3, 2.2.3, 2.3.2, 2.4.0
> Reporter: Anton Okolnychyi
> Assignee: Anton Okolnychyi
> Priority: Blocker
> Labels: correctness
> Fix For: 2.3.3, 2.4.1
>
>
> The logic in {{Cast$mayTruncate}} is broken for bytes.
> Right now, {{mayTruncate(ByteType, LongType)}} returns {{false}} while {{mayTruncate(ShortType, LongType)}} returns {{true}}. Consequently, {{spark.range(1, 3).as[Byte]}} and {{spark.range(1, 3).as[Short]}} will behave differently.
> Potentially, this bug can silently corrupt a user's data, as the example below shows.
> {code}
> // executes silently even though Long is converted into Byte
> spark.range(Long.MaxValue - 10, Long.MaxValue).as[Byte]
> .map(b => b - 1)
> .show()
> +-----+
> |value|
> +-----+
> | -12|
> | -11|
> | -10|
> | -9|
> | -8|
> | -7|
> | -6|
> | -5|
> | -4|
> | -3|
> +-----+
> // throws an AnalysisException: Cannot up cast `id` from bigint to smallint as it may truncate
> spark.range(Long.MaxValue - 10, Long.MaxValue).as[Short]
> .map(s => s - 1)
> .show()
> {code}
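[Editor's note] The asymmetry above (Long-to-Byte passes the check while Long-to-Short is rejected) is consistent with a precedence-index guard that accidentally excludes the type at index 0. The following standalone Scala sketch is hypothetical, not Spark's actual Cast.scala source: the names `precedence`, `mayTruncateBuggy`, and `mayTruncateFixed` are illustrative only, and the precedence list is modeled on Spark's ordering of integral types from narrowest to widest.

```scala
// Hypothetical sketch of how a precedence-index guard can skip the
// narrowest type. Not Spark's actual implementation.
object TruncateSketch {
  // Integral types ordered narrowest to widest, as a precedence list.
  val precedence: Seq[String] =
    Seq("ByteType", "ShortType", "IntegerType", "LongType")

  // Buggy variant: `toIdx > 0` was presumably meant to filter out types
  // not in the list (indexOf returns -1), but it also excludes the type
  // at index 0 (ByteType), so casts down to it are never flagged.
  def mayTruncateBuggy(from: String, to: String): Boolean = {
    val fromIdx = precedence.indexOf(from)
    val toIdx   = precedence.indexOf(to)
    toIdx > 0 && fromIdx > toIdx
  }

  // Fixed variant: test for membership with `>= 0`, so ByteType is
  // treated like every other narrow target.
  def mayTruncateFixed(from: String, to: String): Boolean = {
    val fromIdx = precedence.indexOf(from)
    val toIdx   = precedence.indexOf(to)
    toIdx >= 0 && fromIdx > toIdx
  }

  def main(args: Array[String]): Unit = {
    println(mayTruncateBuggy("LongType", "ByteType"))  // false: cast slips through
    println(mayTruncateBuggy("LongType", "ShortType")) // true: cast is rejected
    println(mayTruncateFixed("LongType", "ByteType"))  // true: fixed
  }
}
```

Under this sketch, the Long-to-Byte downcast reproduces the reported behavior: the buggy guard returns false (the cast "executes silently"), while Long-to-Short returns true (the AnalysisException path).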
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org