Posted to issues@hive.apache.org by "liyunzhang_intel (JIRA)" <ji...@apache.org> on 2017/09/19 08:57:01 UTC

[jira] [Commented] (HIVE-16602) Implement shared scans with Tez

    [ https://issues.apache.org/jira/browse/HIVE-16602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171349#comment-16171349 ] 

liyunzhang_intel commented on HIVE-16602:
-----------------------------------------

[~jcamachorodriguez]: I am evaluating the performance improvement of HIVE-16602 on Tez.
I am using TPC-DS to compare execution time between a build without HIVE-16602
and a build with HIVE-16602 at the 10 GB data scale. I expect an improvement from this feature, since it loads a table only once even if the table appears multiple times in the query. Have you run any benchmarks for this feature?


> Implement shared scans with Tez
> -------------------------------
>
>                 Key: HIVE-16602
>                 URL: https://issues.apache.org/jira/browse/HIVE-16602
>             Project: Hive
>          Issue Type: New Feature
>          Components: Physical Optimizer
>    Affects Versions: 3.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>              Labels: TODOC3.0
>             Fix For: 3.0.0
>
>         Attachments: HIVE-16602.01.patch, HIVE-16602.02.patch, HIVE-16602.03.patch, HIVE-16602.04.patch, HIVE-16602.patch
>
>
> Given a query plan, the goal is to identify scans on input tables that can be merged so the data is read only once. The optimization will be carried out at the physical level.
> In the longer term, identification of equivalent expressions and reuse of intermediate results should be done at the logical layer via a Spool operator.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)