Posted to issues@drill.apache.org by "Aditya Kishore (JIRA)" <ji...@apache.org> on 2015/01/23 06:37:35 UTC

[jira] [Comment Edited] (DRILL-1651) hbase pushdown not working for some query

    [ https://issues.apache.org/jira/browse/DRILL-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203022#comment-14203022 ] 

Aditya Kishore edited comment on DRILL-1651 at 1/23/15 5:37 AM:
----------------------------------------------------------------

The push-into-scan rule {{HBasePushFilterIntoScan}} is triggered only when the {{Filter}} operator directly follows a {{Scan}}.

However, whenever an {{ITEM}} operator appears in the projection, a {{Project}} is inserted between {{Scan}} and {{Filter}} (or, probably, an existing one is not removed because it is not treated as a trivial project); see line {{00-04}} in the plan. This prevents the optimizer rule from being triggered.

A workaround is to use the {{ITEM}} operator in the outer {{SELECT}} and keep the {{WHERE}} clause in the inner query. The original query can thus be rewritten as:
{code:sql}
select
    cast(row_key as integer) student_id,
    cast(twocf['age'] as integer)/cast(threecf['gpa'] as float)
from 
    (select row_key, twocf, threecf from student where row_key < '800' and row_key > '750');
{code}
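
To check whether the rewrite has the intended effect, the same {{explain plan for}} statement shown in the description could be run against the rewritten query. This is only a sketch: it assumes the {{student}} table and the {{hbase}} schema from the description, and the resulting plan has not been verified here.
{code:sql}
-- Assumes the same "student" HBase table used in the issue description.
explain plan for
select
    cast(row_key as integer) student_id,
    cast(twocf['age'] as integer)/cast(threecf['gpa'] as float)
from
    (select row_key, twocf, threecf from student where row_key < '800' and row_key > '750');
{code}
If the filter is pushed down, the {{Scan}} line of the plan should show non-null {{startRow}} and {{stopRow}} values and a {{RowFilter}} list in {{HBaseScanSpec}}, rather than {{filter=null}}.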


was (Author: adityakishore):
The push-into -scan rule {{HBasePushFilterIntoScan}} gets triggered only when the {{Filter}} operator follows a {{Scan}}.

However wherever you have an ITEM}} operator in projection, a {{Project}} is inserted (or probably not removed by treating it as trivial project) between {{Scan}} and {{Filter}}, see line {{00-04}} in the plan. This prevents the optimizer rule from getting triggered.

A work around would be to use the ITEM operator in the outer {{SELECT}} and {{WHERE}} clause in the inner query. The original query can be this rewritten as 
{code:sql}
select
    cast(row_key as integer) student_id,
    cast(twocf['age'] as integer)/cast(threecf['gpa'] as float)
from 
    (select row_key, twocf, threecf from student where row_key < '800' and row_key > '750');
{code}

> hbase pushdown not working for some query
> -----------------------------------------
>
>                 Key: DRILL-1651
>                 URL: https://issues.apache.org/jira/browse/DRILL-1651
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - HBase
>    Affects Versions: 0.7.0
>            Reporter: Chun Chang
>            Assignee: Aditya Kishore
>            Priority: Minor
>             Fix For: 0.9.0
>
>
> #Tue Nov 04 16:58:08 UTC 2014
> git.commit.id.abbrev=129cb9c
> I noticed that the following query did not cause a pushdown:
> select cast(row_key as integer) student_id, (cast(twocf['age'] as integer)/cast(threecf['gpa'] as float)) from student where row_key < '800' and row_key > '750';
> plan:
> 00-01      Project(student_id=[CAST($0):INTEGER NOT NULL], EXPR$1=[/(CAST($1):INTEGER, CAST($2):FLOAT)])
> 00-02        SelectionVectorRemover
> 00-03          Filter(condition=[AND(<($0, '800'), >($0, '750'))])
> 00-04            Project(row_key=[$1], ITEM=[ITEM($2, 'age')], ITEM2=[ITEM($0, 'gpa')])
> 00-05              Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=student, startRow=null, stopRow=null, filter=null], columns=[SchemaPath [`row_key`], SchemaPath [`twocf`.`age`], SchemaPath [`threecf`.`gpa`]]]])
> But the following query did:
> select cast(row_key as integer) student_id, cast(onecf['name'] as varchar(30)) name, cast(twocf['age'] as integer) age, cast(threecf['gpa'] as decimal(4,2)) gpa, cast(fourcf['studentnum'] as bigint) student_num, cast(fivecf['create_date'] as timestamp) create_date from student where row_key > '750' and row_key < '800';
> plan:
> 00-01      Project(student_id=[CAST($0):INTEGER NOT NULL], name=[CAST(ITEM($3, 'name')):VARCHAR(30) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], age=[CAST(ITEM($5, 'age')):INTEGER], gpa=[CAST(ITEM($4, 'gpa')):DECIMAL(4, 2)], student_num=[CAST(ITEM($2, 'studentnum')):BIGINT], create_date=[CAST(ITEM($1, 'create_date')):TIMESTAMP(0)])
> 00-02        Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=student, startRow=750\x00, stopRow=800, filter=FilterList AND (2/2): [RowFilter (GREATER, 750), RowFilter (LESS, 800)]], columns=[SchemaPath [`*`]]]])
> Select * also triggered pushdown:
> 0: jdbc:drill:schema=hbase> explain plan for select *  from student where row_key < '800' and row_key > '750';
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      Project(row_key=[$0], fivecf=[$1], fourcf=[$2], onecf=[$3], threecf=[$4], twocf=[$5])
> 00-02        Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=student, startRow=750\x00, stopRow=800, filter=FilterList AND (2/2): [RowFilter (LESS, 800), RowFilter (GREATER, 750)]], columns=[SchemaPath [`*`]]]])


