Posted to issues@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2015/09/10 03:38:45 UTC

[jira] [Updated] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

     [ https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11777:
------------------------------------
    Attachment: HIVE-11777.WIP.patch

I wrote a much more generic patch, then rewrote it to be simpler at the last moment. This is a WIP patch because the tests don't build yet and it still needs a null check. Unfortunately, I also accidentally implemented another feature on the same branch; I will untangle them tomorrow.
[~prasanth_j] fyi ;)

> implement an option to have single ETL strategy for multiple directories
> ------------------------------------------------------------------------
>
>                 Key: HIVE-11777
>                 URL: https://issues.apache.org/jira/browse/HIVE-11777
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-11777.WIP.patch
>
>
> In the case of metastore footer PPD, we don't want to make a separate PPD call, with all the attendant SARG, MS and HBase overhead, for each directory. If we wait for some time (10ms? some fraction of inputs?), we can do one call without losing overall perf.
> For now make it time based.
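
To illustrate the time-based idea in the description, here is a minimal sketch of collecting directories for a short window and then issuing one call for the whole batch. It is not taken from the attached patch: the class name, the method name, the queue-based hand-off and the example paths are all assumptions made for illustration, and the real change presumably lives in the ORC split generation code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names come from the Hive patch.
class TimedDirBatcher {

  /**
   * Collects every directory that arrives on the queue within waitMs
   * milliseconds, so the caller can make a single PPD call for the batch.
   */
  static List<String> collectBatch(BlockingQueue<String> dirs, long waitMs)
      throws InterruptedException {
    List<String> batch = new ArrayList<>();
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(waitMs);
    while (true) {
      long remaining = deadline - System.nanoTime();
      if (remaining <= 0) {
        break; // the wait window is over
      }
      // Block only for the time left in the window.
      String dir = dirs.poll(remaining, TimeUnit.NANOSECONDS);
      if (dir == null) {
        break; // nothing else arrived before the deadline
      }
      batch.add(dir);
    }
    return batch;
  }

  public static void main(String[] args) throws InterruptedException {
    BlockingQueue<String> dirs = new LinkedBlockingQueue<>();
    // Made-up partition directories standing in for the query inputs.
    dirs.add("/warehouse/t/p=1");
    dirs.add("/warehouse/t/p=2");
    dirs.add("/warehouse/t/p=3");

    // 10 ms is the example figure from the issue description.
    List<String> batch = collectBatch(dirs, 10);

    // A real implementation would make one metastore PPD call here.
    System.out.println("one PPD call for " + batch.size() + " directories");
  }
}
```

The trade-off in a sketch like this is that every batch pays up to the full wait window in latency, which is why the description asks whether a fixed time or a fraction of the inputs is the better trigger.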



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)