Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2017/07/16 16:46:00 UTC
[jira] [Resolved] (SPARK-21344) BinaryType comparison does signed byte array comparison
[ https://issues.apache.org/jira/browse/SPARK-21344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li resolved SPARK-21344.
-----------------------------
Resolution: Fixed
Fix Version/s: 2.2.1
               2.1.2
               2.0.3
> BinaryType comparison does signed byte array comparison
> -------------------------------------------------------
>
> Key: SPARK-21344
> URL: https://issues.apache.org/jira/browse/SPARK-21344
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.0.0, 2.1.1
> Reporter: Shubham Chopra
> Assignee: Kazuaki Ishizaki
> Fix For: 2.0.3, 2.1.2, 2.2.1
>
>
> BinaryType as used by Spark SQL defines its ordering with signed byte comparisons, so a byte with the high bit set (e.g. 0x80) is treated as negative and sorts before 0x00, instead of after it as unsigned lexicographic order would require. This can lead to unexpected behavior. Consider the following code snippet, which demonstrates the error:
> {code}
> // Assumes a spark-shell session, where `spark` and `sc` are in scope.
> import org.apache.spark.sql.functions.col
>
> case class TestRecord(col0: Array[Byte])
>
> def convertToBytes(i: Long): Array[Byte] = {
>   // Big-endian encoding of the long; it preserves numeric order only if
>   // the resulting byte arrays are compared as unsigned values.
>   val bb = java.nio.ByteBuffer.allocate(8)
>   bb.putLong(i)
>   bb.array
> }
>
> def test = {
>   val sql = spark.sqlContext
>   import sql.implicits._
>   val timestamp = 1498772083037L
>   val data = (timestamp to timestamp + 1000L).map(i => TestRecord(convertToBytes(i)))
>   val testDF = sc.parallelize(data).toDF
>   val filter1 = testDF.filter(col("col0") >= convertToBytes(timestamp) && col("col0") < convertToBytes(timestamp + 50L))
>   val filter2 = testDF.filter(col("col0") >= convertToBytes(timestamp + 50L) && col("col0") < convertToBytes(timestamp + 100L))
>   val filter3 = testDF.filter(col("col0") >= convertToBytes(timestamp) && col("col0") < convertToBytes(timestamp + 100L))
>   // These assertions fail: the filters compare the arrays as signed bytes,
>   // so encoded values whose low byte exceeds 0x7F are misordered.
>   assert(filter1.count == 50)
>   assert(filter2.count == 50)
>   assert(filter3.count == 100)
> }
> {code}
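> To see the misordering in isolation, here is a minimal standalone sketch (the compareSigned/compareUnsigned helpers are illustrative, not Spark internals) contrasting signed and unsigned lexicographic comparison of byte arrays:
> {code}
> // Illustrative helpers (not Spark internals) comparing byte arrays.
> def compareSigned(a: Array[Byte], b: Array[Byte]): Int = {
>   val len = math.min(a.length, b.length)
>   var i = 0
>   while (i < len) {
>     val c = java.lang.Byte.compare(a(i), b(i)) // signed: 0x80.toByte is -128
>     if (c != 0) return c
>     i += 1
>   }
>   a.length - b.length
> }
>
> def compareUnsigned(a: Array[Byte], b: Array[Byte]): Int = {
>   val len = math.min(a.length, b.length)
>   var i = 0
>   while (i < len) {
>     val c = (a(i) & 0xff) - (b(i) & 0xff) // unsigned: 0x80 is 128
>     if (c != 0) return c
>     i += 1
>   }
>   a.length - b.length
> }
>
> val x = Array(0x00.toByte)
> val y = Array(0x80.toByte)
> assert(compareSigned(x, y) > 0)   // signed order puts 0x80 (-128) before 0x00
> assert(compareUnsigned(x, y) < 0) // unsigned order puts 0x00 before 0x80
> {code}
> The fix resolves the issue by making BinaryType comparisons behave like the unsigned variant above, so the big-endian encoding in convertToBytes preserves numeric order.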
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org