Posted to issues@hbase.apache.org by "Balazs Meszaros (Jira)" <ji...@apache.org> on 2021/08/23 08:31:00 UTC

[jira] [Updated] (HBASE-26211) [hbase-connectors] Pushdown filters in Spark do not work correctly with long types

     [ https://issues.apache.org/jira/browse/HBASE-26211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Balazs Meszaros updated HBASE-26211:
------------------------------------
    Fix Version/s: hbase-connectors-1.1.0

> [hbase-connectors] Pushdown filters in Spark do not work correctly with long types
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-26211
>                 URL: https://issues.apache.org/jira/browse/HBASE-26211
>             Project: HBase
>          Issue Type: Bug
>          Components: hbase-connectors
>    Affects Versions: 1.0.0
>            Reporter: Hristo Iliev
>            Priority: Major
>             Fix For: hbase-connectors-1.1.0
>
>
> Reading from an HBase table and filtering on a LONG column does not seem to work correctly.
> {{Dataset<Row> df = spark.read()
>    .format("org.apache.hadoop.hbase.spark")
>    .option("hbase.columns.mapping", "id STRING :key, v LONG cf:v")
>    ...
>    .load();
>  df.filter("v > 100").show();}}
> Expected behaviour is to show rows where cf:v > 100, but instead an empty dataset is shown.
> Moreover, replacing {{"v > 100"}} with {{"v >= 100"}} results in a dataset where some rows have values of v less than 100. 
> The problem appears to be that long values are decoded incorrectly as integers in {{NaiveEncoder.filter}}:
> {{case LongEnc | TimestampEnc =>
>    val in = Bytes.toInt(input, offset1)
>    val value = Bytes.toInt(filterBytes, offset2 + 1)
>    compare(in.compareTo(value), ops)}}
> It looks like this error has not been caught because {{DynamicLogicExpressionSuite}} lacks test cases with long values.
> The erroneous code is also present in the master branch. We have extended the test suite, implemented a quick fix, and will open a PR on GitHub.
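
The symptoms described above can be reproduced outside HBase and Spark. The following is a minimal sketch, not the connector's code: it assumes cell values are stored as 8-byte big-endian longs (the layout produced by HBase's Bytes.toBytes(long)) and uses java.nio.ByteBuffer in place of Bytes.toInt and Bytes.toLong. The class and method names are hypothetical, and the one-byte offset applied to the filter bytes in the original snippet is left out.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-alone demo; none of these names exist in hbase-connectors.
final class LongFilterDemo {

    // Encode a long as 8 big-endian bytes (assumed storage layout of the LONG column).
    static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Mirrors the reported bug: only the first 4 of the 8 bytes are read.
    static int decodeAsInt(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, Integer.BYTES).getInt();
    }

    // Reads all 8 bytes, which is what a LONG comparison needs.
    static long decodeAsLong(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, Long.BYTES).getLong();
    }

    // Evaluates "cell > bound" (or "cell >= bound") using the truncated int decoding.
    static boolean buggyGreater(long cell, long bound, boolean orEqual) {
        int in = decodeAsInt(encodeLong(cell), 0);
        int value = decodeAsInt(encodeLong(bound), 0);
        int cmp = Integer.compare(in, value);
        return orEqual ? cmp >= 0 : cmp > 0;
    }

    // Same predicate using the full 8-byte decoding.
    static boolean fixedGreater(long cell, long bound, boolean orEqual) {
        long in = decodeAsLong(encodeLong(cell), 0);
        long value = decodeAsLong(encodeLong(bound), 0);
        int cmp = Long.compare(in, value);
        return orEqual ? cmp >= 0 : cmp > 0;
    }

    public static void main(String[] args) {
        long bound = 100L;
        for (long cell : new long[] {50L, 100L, 200L}) {
            System.out.printf(
                "v=%d  buggy(>)=%b  buggy(>=)=%b  fixed(>)=%b  fixed(>=)=%b%n",
                cell,
                buggyGreater(cell, bound, false),
                buggyGreater(cell, bound, true),
                fixedGreater(cell, bound, false),
                fixedGreater(cell, bound, true));
        }
    }
}
```

For any non-negative value below 2^32 the upper four bytes are all zero, so both sides of the buggy comparison decode to 0. Under that assumption "v > 100" never matches and "v >= 100" always matches, which is consistent with the empty dataset and the unexpected rows with v below 100 reported above.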



--
This message was sent by Atlassian Jira
(v8.3.4#803005)