Posted to issues@spark.apache.org by "Don Smith (JIRA)" <ji...@apache.org> on 2017/02/22 02:24:44 UTC
[jira] [Created] (SPARK-19692) Comparison on BinaryType returns no results
Don Smith created SPARK-19692:
----------------------------------
Summary: Comparison on BinaryType returns no results
Key: SPARK-19692
URL: https://issues.apache.org/jira/browse/SPARK-19692
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.1.0
Reporter: Don Smith
I believe there is an issue with comparisons on binary fields:
{code}
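import java.net.InetAddress

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{BinaryType, StructField, StructType}

val sc = SparkSession.builder.appName("test").getOrCreate()
val schema = StructType(Seq(StructField("ip", BinaryType)))
val ips = Seq("1.1.1.1", "2.2.2.2", "200.10.6.7").map(s => InetAddress.getByName(s).getAddress)
val df = sc.createDataFrame(
  sc.sparkContext.parallelize(ips, 1).map { ip =>
    Row(ip)
  }, schema
)
// Keep only addresses in the range 200.10.0.0 to 200.10.255.255
val query = df
  .where(df("ip") >= InetAddress.getByName("200.10.0.0").getAddress)
  .where(df("ip") <= InetAddress.getByName("200.10.255.255").getAddress)
query.explain(true) // explain prints the plans itself and returns Unit
val results = query.collect()
results.length mustEqual 1 // expected: only 200.10.6.7 is in range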
{code}
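This returns no results, although 200.10.6.7 lies within the requested range.
I believe the problem is that the call to compareTo here in TypeUtils compares the bytes as signed values, so any byte of 0x80 or above (such as 200 or 255) is treated as negative: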
{code}
def compareBinary(x: Array[Byte], y: Array[Byte]): Int = {
  for (i <- 0 until x.length; if i < y.length) {
    val res = x(i).compareTo(y(i))
    if (res != 0) return res
  }
  x.length - y.length
}
{code}
With some hacky testing I was able to get the desired results with: {{ val res = (x(i).toByte & 0xff) - (y(i).toByte & 0xff) }}
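To illustrate the difference, here is a minimal standalone sketch that needs no Spark. The object name and both helper functions are hypothetical and written only for this example; they are not Spark's code. It compares 200.10.6.7 against the upper bound 200.10.255.255 once with signed bytes and once with each byte masked to 0..255:

{code}
object UnsignedBinaryCompare {
  // Hypothetical helper mirroring the signed behaviour described above:
  // bytes are compared as values in -128..127.
  def compareBinarySigned(x: Array[Byte], y: Array[Byte]): Int = {
    val limit = math.min(x.length, y.length)
    var i = 0
    while (i < limit) {
      val res = java.lang.Byte.compare(x(i), y(i))
      if (res != 0) return res
      i += 1
    }
    x.length - y.length
  }

  // Hypothetical helper applying the masking idea: `& 0xff` widens each
  // byte to an Int in 0..255 before the subtraction.
  def compareBinaryUnsigned(x: Array[Byte], y: Array[Byte]): Int = {
    val limit = math.min(x.length, y.length)
    var i = 0
    while (i < limit) {
      val res = (x(i) & 0xff) - (y(i) & 0xff)
      if (res != 0) return res
      i += 1
    }
    x.length - y.length
  }

  // Raw address bytes, built by hand so the example needs no network classes.
  val ip: Array[Byte] = Array(200, 10, 6, 7).map(_.toByte)
  val upper: Array[Byte] = Array(200, 10, 255, 255).map(_.toByte)

  def main(args: Array[String]): Unit = {
    // Signed: the third byte is 6 versus -1 (0xff), so ip sorts above upper.
    println(compareBinarySigned(ip, upper) > 0)   // prints true
    // Unsigned: the third byte is 6 versus 255, so ip sorts below upper.
    println(compareBinaryUnsigned(ip, upper) < 0) // prints true
  }
}
{code}

Under the signed comparison the {{<=}} filter rejects 200.10.6.7, which matches the empty result above; with masking the address falls inside the range.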
Thanks!
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)