You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "eugen yushin (JIRA)" <ji...@apache.org> on 2016/05/11 13:03:12 UTC

[jira] [Created] (DRILL-4666) Pushdown doesn't apply for HBase with substr(key) from UT

eugen yushin created DRILL-4666:
-----------------------------------

             Summary: Pushdown doesn't apply for HBase with substr(key) from UT
                 Key: DRILL-4666
                 URL: https://issues.apache.org/jira/browse/DRILL-4666
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.6.0
            Reporter: eugen yushin


Following [example|https://github.com/apache/drill/blob/95623912ebf348962fe8a8846c5f47c5fdcf2f78/contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java] section, running query from {{testFilterPushDownCompositeBigIntRowKey1()}} results in following execution plan:
{code}
EXPLAIN PLAN FOR
SELECT
     CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'bigint_be') d
    ,CONVERT_FROM(BYTE_SUBSTR(row_key, 9, 8), 'bigint_be') id
    ,CONVERT_FROM(tableName.f.c, 'UTF8')
FROM hbase.`TestTableCompositeDate` tableName
WHERE
    CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'bigint_be') = cast(1409040000000 as bigint)
;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(d=[CONVERT_FROMBIGINT_BE(BYTE_SUBSTR($0, 1, 8))], id=[CONVERT_FROMBIGINT_BE(BYTE_SUBSTR($0, 9, 8))], EXPR$2=[CONVERT_FROMUTF8(ITEM($1, 'c'))])
00-02        SelectionVectorRemover
00-03          Filter(condition=[=(CONVERT_FROM(BYTE_SUBSTR($0, 1, 8), 'bigint_be'), 1409040000000)])
00-04            Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=TestTableCompositeDate, startRow=null, stopRow=null, filter=null], columns=[`*`]]])
{code}

From the above, Drill uses full scan and then filters out rows by key substring started from 1st position.

This query executes pretty fast in test dataset provided in repo, but performance dramatically decreases with real use cases.

I've used _contrib\storage-hbase\src\test\java\org\apache\drill\hbase\TestTableGenerator.java_ to populate test table.

Moreover, [TestHBaseFilterPushDown|TestHBaseFilterPushDown.java] uses [runHBaseSQLVerifyCount|https://github.com/apache/drill/blob/95623912ebf348962fe8a8846c5f47c5fdcf2f78/contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java] to pass the tests. It checks result set count, and not execution plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)