Posted to issues@hive.apache.org by "Jesus Camacho Rodriguez (JIRA)" <ji...@apache.org> on 2015/11/03 17:23:27 UTC

[jira] [Updated] (HIVE-11726) Pushed IN predicates to the metastore

     [ https://issues.apache.org/jira/browse/HIVE-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-11726:
-------------------------------------------
    Description: 
The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses.

HIVE-11573 introduced the extraction of sub-clauses that could be pushed down to the TableScan operators, although they were not pushed down to the metastore.

In this issue, we tackle this problem by extending the filter parser of the metastore to support IN clauses, including IN clauses over multiple columns. This allows us to push those additional predicates down through directSQL to the metastore.
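
To make the semantics of a multi-column IN clause concrete, the following is a minimal sketch (not the code in the attached patch) that rewrites a tuple IN predicate into the logically equivalent OR of AND-ed equalities. The function name `expand_tuple_in` and the string layout of the output are assumptions made for illustration only; they do not reflect the metastore filter grammar.

```python
def expand_tuple_in(columns, rows):
    """Rewrite (c1, c2, ...) IN ((v1, v2, ...), ...) as an OR of ANDs.

    Hypothetical helper: literal formatting via repr() is illustrative
    and is not a safe way to escape values in a real filter string.
    """
    disjuncts = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("tuple arity does not match column list")
        # One conjunction per value tuple: every column must match.
        conj = " and ".join("%s = %r" % (c, v) for c, v in zip(columns, row))
        disjuncts.append("(" + conj + ")")
    # A row satisfies the IN clause if it matches any one tuple.
    return " or ".join(disjuncts)


print(expand_tuple_in(("a", "b"), [(1, 2), (2, 3)]))
```

The expanded form grows with the number of tuples, which is one reason a parser that accepts the IN clause directly may be preferable to expanding it beforehand.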

  was:
The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses.

HIVE-11573 introduced the extraction of sub-clauses that could be pushed down to the TableScan operators, although they were not pushed down to the metastore.

In this issue, we tackle this problem by:
1) Grouping the columns in the sub-clauses depending on their lineage. This way PPD will be able to push them down through the plan without any extension. For instance, if a, b, and c are partition columns, a and b belong to table1, and c belongs to table2:
{code}
(a,b,c) IN ((1,2,3),(2,3,4)) ->
           (a,b) IN ((1,2),(2,3)) and c in (3,4) and (a,b,c) IN ((1,2,3),(2,3,4))
{code}
2) Extending the filter parser of the metastore to support IN clauses, including IN clauses over multiple columns. This allows us to push those additional predicates down through directSQL to the metastore.
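
The grouping in step 1 can be sketched as follows. This is a minimal illustration, not Hive's implementation: the names `split_by_lineage` and `render`, and the `lineage` mapping from column to table, are hypothetical. Each per-table predicate is a projection of the original tuples, so it is weaker than the original predicate (for example, it would accept (1,2,4)), which is why the original (a,b,c) predicate is kept alongside the extracted ones.

```python
def split_by_lineage(columns, rows, lineage):
    """Project a tuple IN predicate onto the columns of each source table.

    columns: column names in the tuple, e.g. ("a", "b", "c")
    rows:    value tuples of the IN list
    lineage: hypothetical mapping from column name to its table
    Returns a list of (column_tuple, distinct_value_tuples), one per table,
    in order of first appearance.
    """
    groups = {}  # table -> positions of its columns (dicts keep insertion order)
    for idx, col in enumerate(columns):
        groups.setdefault(lineage[col], []).append(idx)

    result = []
    for idxs in groups.values():
        cols = tuple(columns[i] for i in idxs)
        values = []
        for row in rows:
            proj = tuple(row[i] for i in idxs)
            if proj not in values:  # drop duplicate projections
                values.append(proj)
        result.append((cols, values))
    return result


def render(cols, values):
    """Format one projected predicate as text (illustrative only)."""
    if len(cols) == 1:
        return "%s IN (%s)" % (cols[0], ",".join(str(v[0]) for v in values))
    tuples = ",".join("(" + ",".join(str(x) for x in v) + ")" for v in values)
    return "(%s) IN (%s)" % (",".join(cols), tuples)


lineage = {"a": "table1", "b": "table1", "c": "table2"}
parts = split_by_lineage(("a", "b", "c"), [(1, 2, 3), (2, 3, 4)], lineage)
print(" and ".join(render(c, v) for c, v in parts))
```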


> Pushed IN predicates to the metastore
> -------------------------------------
>
>                 Key: HIVE-11726
>                 URL: https://issues.apache.org/jira/browse/HIVE-11726
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-11726.patch
>
>
> The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses.
> HIVE-11573 introduced the extraction of sub-clauses that could be pushed down to the TableScan operators, although they were not pushed down to the metastore.
> In this issue, we tackle this problem by extending the filter parser of the metastore to support IN clauses, including IN clauses over multiple columns. This allows us to push those additional predicates down through directSQL to the metastore.


