Posted to dev@hive.apache.org by "Ashutosh Chauhan (JIRA)" <ji...@apache.org> on 2013/10/09 19:12:42 UTC
[jira] [Commented] (HIVE-5501) Filter on partitioning column shouldn't be present at execution time

    [ https://issues.apache.org/jira/browse/HIVE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790624#comment-13790624 ] 

Ashutosh Chauhan commented on HIVE-5501:
----------------------------------------

{code}
hive> create table t3 (a string, b int) partitioned by (p1 string);
hive> explain  select count(*) from t3 where p1='3';               
OK
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME t3))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_FUNCTIONSTAR count))) (TOK_WHERE (= (TOK_TABLE_OR_COL p1) '3'))))

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        t3 
          TableScan
            alias: t3
>>            Filter Operator
>>              predicate:
>>                  expr: (p1 = '3')
>>                  type: boolean
              Select Operator
                Group By Operator
                  aggregations:
                        expr: count()
                  bucketGroup: false
                  mode: hash
                  outputColumnNames: _col0
                  Reduce Output Operator
                    sort order: 
                    tag: -1
                    value expressions:
                          expr: _col0
                          type: bigint
      Reduce Operator Tree:
        Group By Operator
          aggregations:
                expr: count(VALUE._col0)
          bucketGroup: false
          mode: mergepartial
          outputColumnNames: _col0
          Select Operator
            expressions:
                  expr: _col0
                  type: bigint
            outputColumnNames: _col0
            File Output Operator
              compressed: false
              GlobalTableId: 0
              table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1

{code}
The Filter Operator marked with >> is not useful at execution time and should have been eliminated by the optimizer, since partition pruning already takes care of the predicate on p1.
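
To illustrate why the marked filter is redundant, here is a minimal toy sketch in Python. It is not Hive code and does not reflect how Hive's optimizer is implemented; the dict-based table layout and the function names (prune_partitions, count_with_filter, count_without_filter) are hypothetical and exist only for this example. It models t3 as a mapping from the p1 partition value to the rows in that partition, and shows that once pruning has selected the partitions for p1 = '3', re-checking the same predicate on every row cannot change the result.

```python
# Toy model, NOT Hive code: a partitioned table is a dict that maps a
# value of the partition column p1 to the rows stored in that partition.
from typing import Dict, List, Tuple

Row = Tuple[str, int]            # columns (a, b); p1 lives in the dict key
Table = Dict[str, List[Row]]


def prune_partitions(table: Table, p1_value: str) -> Table:
    """Keep only the partitions whose key satisfies p1 = p1_value."""
    return {p1: rows for p1, rows in table.items() if p1 == p1_value}


def count_with_filter(partitions: Table, p1_value: str) -> int:
    """Count rows while re-checking p1 = p1_value per row (the redundant filter)."""
    return sum(
        1
        for p1, rows in partitions.items()
        for _ in rows
        if p1 == p1_value
    )


def count_without_filter(partitions: Table) -> int:
    """Count rows with no per-row predicate, relying on pruning alone."""
    return sum(len(rows) for rows in partitions.values())


# Hypothetical sample data for t3, partitioned by p1.
t3: Table = {
    "3": [("x", 1), ("y", 2)],
    "4": [("z", 3)],
}

pruned = prune_partitions(t3, "3")
filtered_count = count_with_filter(pruned, "3")
unfiltered_count = count_without_filter(pruned)
print(filtered_count == unfiltered_count)
```

In this model the per-row check in count_with_filter does extra work for every row yet always evaluates to true after pruning, which is the overhead this issue is about.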

> Filter on partitioning column shouldn't be present at execution time
> --------------------------------------------------------------------
>
>                 Key: HIVE-5501
>                 URL: https://issues.apache.org/jira/browse/HIVE-5501
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Ashutosh Chauhan
>
> Since such filters are already handled by partition pruning, keeping them in the operator pipeline is unnecessary overhead.


