Posted to commits@hudi.apache.org by "Vinoth Chandar (Jira)" <ji...@apache.org> on 2019/11/12 01:56:00 UTC

[jira] [Updated] (HUDI-25) Faster Incremental queries on Hoodie #492

     [ https://issues.apache.org/jira/browse/HUDI-25?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-25:
-------------------------------
    Status: Patch Available  (was: In Progress)

> Faster Incremental queries on Hoodie #492
> -----------------------------------------
>
>                 Key: HUDI-25
>                 URL: https://issues.apache.org/jira/browse/HUDI-25
>             Project: Apache Hudi (incubating)
>          Issue Type: New Feature
>          Components: Hive Integration
>            Reporter: Vinoth Chandar
>            Assignee: Bhavani Sudha Saktheeswaran
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.5.1
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive incremental queries on Hoodie currently have a limitation: when a datestr is not present, they list all partitions (.hoodie and every partition) and then throw away many of the listed files, because filtering on the `__hoodie__commit_time` column values excludes them. This can be very expensive: it increases query planning time and sometimes causes timeouts when the table is large. The original issue is tracked here - [https://github.com/uber/hudi/issues/492]
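
The cost described above can be illustrated with a small, self-contained sketch. It is not Hudi's API and not the patch attached to this issue: `DataFile`, `files_by_full_listing`, `files_by_commit_metadata` and the sample data are hypothetical names made up for illustration. It only assumes that each commit records which files it wrote, so an incremental query could start from the commits after the requested time instead of listing every partition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    partition: str
    path: str
    commit_time: str  # commit that wrote the file, e.g. "20191110120000"


def files_by_full_listing(all_files, begin_time):
    """Behaviour described in the issue: list everything, then filter by commit time."""
    return [f for f in all_files if f.commit_time > begin_time]


def files_by_commit_metadata(commit_metadata, begin_time):
    """Assumed alternative: visit only commits after begin_time and the files they wrote."""
    selected = []
    for commit_time in sorted(commit_metadata):
        if commit_time > begin_time:
            selected.extend(commit_metadata[commit_time])
    return selected


# Hypothetical table: three date partitions, four files.
files = [
    DataFile("2019/11/09", "2019/11/09/a.parquet", "20191109120000"),
    DataFile("2019/11/10", "2019/11/10/b.parquet", "20191110120000"),
    DataFile("2019/11/11", "2019/11/11/c.parquet", "20191111120000"),
    DataFile("2019/11/11", "2019/11/11/d.parquet", "20191111180000"),
]

# Hypothetical per-commit metadata: commit time -> files written by that commit.
commit_metadata = {}
for f in files:
    commit_metadata.setdefault(f.commit_time, []).append(f)

begin = "20191110235959"
full = files_by_full_listing(files, begin)
pruned = files_by_commit_metadata(commit_metadata, begin)

# Partitions each approach has to touch to produce the same result.
partitions_scanned_full = {f.partition for f in files}
partitions_scanned_pruned = {f.partition for f in pruned}
```

Both functions select the same files, but the first has to look at every partition in the table, while the second only touches partitions written by the relevant commits. On a large table that difference is what drives the planning time mentioned in the description.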



--
This message was sent by Atlassian Jira
(v8.3.4#803005)