You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "LongShangRen (JIRA)" <ji...@apache.org> on 2016/11/03 14:54:58 UTC
[jira] [Created] (HIVE-15117) Partition filters are not pushed down with lateral view and undeterministic UDF

LongShangRen created HIVE-15117:
-----------------------------------

             Summary: Partition filters are not pushed down with lateral view and undeterministic UDF
                 Key: HIVE-15117
                 URL: https://issues.apache.org/jira/browse/HIVE-15117
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 1.2.1
            Reporter: LongShangRen
             Fix For: 1.2.1


sql with lateral view didn't push down partition column as expected!.

here is how it can be reproduced.

1. *create test table*

{quote}
   create table test_lateral_view (id bigint,json_cont string) partitioned by (vt string);
{quote}

2. *explain below sql*

{quote}

select   *
from test_lateral_view a
lateral view json_tuple(json_cont, 'iids', 'indexs') b as iids,indexs
where a.vt = '2016-10-27'
and rand()>0.5;
{quote}

here is my result:

{quote}

STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: a
          Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
          Lateral View Forward
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Select Operator
              expressions: id (type: bigint), json_cont (type: string), vt (type: string)
              outputColumnNames: id, json_cont, vt
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              Lateral View Join Operator
                outputColumnNames: _col0, _col1, _col2, _col6, _col7
                Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                Filter Operator
                  {color:red}
                  predicate: ((_col2 = '2016-10-27') and (rand() > 0.5)) (type: boolean)
                  {color:red}
                  Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                  Select Operator
                    expressions: _col0 (type: bigint), _col1 (type: string), '2016-10-27' (type: string), _col6 (type: string), _col7 (type: string)
                    outputColumnNames: _col0, _col1, _col2, _col3, _col4
                    Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                    ListSink
            Select Operator
              expressions: json_cont (type: string), 'iids' (type: string), 'indexs' (type: string)
              outputColumnNames: _col0, _col1, _col2
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              UDTF Operator
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                function name: json_tuple
                Lateral View Join Operator
                  outputColumnNames: _col0, _col1, _col2, _col6, _col7
                  Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                  Filter Operator
                    predicate: ((_col2 = '2016-10-27') and (rand() > 0.5)) (type: boolean)
                    Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                    Select Operator
                      expressions: _col0 (type: bigint), _col1 (type: string), '2016-10-27' (type: string), _col6 (type: string), _col7 (type: string)
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4
                      Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                      ListSink
{quote}

 As you can see,the partition column is in filter operator,which means this sql will scan the whole table.






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)