Posted to commits@hudi.apache.org by "Bhavani Sudha (Jira)" <ji...@apache.org> on 2020/01/08 23:03:00 UTC
[jira] [Updated] (HUDI-25) Faster Incremental queries on Hoodie #492
[ https://issues.apache.org/jira/browse/HUDI-25?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bhavani Sudha updated HUDI-25:
------------------------------
Status: Closed (was: Patch Available)
> Faster Incremental queries on Hoodie #492
> -----------------------------------------
>
> Key: HUDI-25
> URL: https://issues.apache.org/jira/browse/HUDI-25
> Project: Apache Hudi (incubating)
> Issue Type: New Feature
> Components: Hive Integration
> Reporter: Vinoth Chandar
> Assignee: Bhavani Sudha
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.5.1
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Hive incremental queries on Hoodie currently have to list all partitions when a datestr is not present (listing .hoodie and the partitions), and then end up throwing away a lot of the files, since filtering on the `__hoodie__commit_time` column values excludes them. This can be very expensive: it can increase query planning time and sometimes causes timeouts if the table is large. The original issue is tracked here - [https://github.com/uber/hudi/issues/492]
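
The cost described above can be illustrated with a minimal sketch. It is a hypothetical in-memory model, not Hudi's API and not necessarily how this issue was fixed: `TABLE`, `TIMELINE`, and both function names are assumptions made for illustration. The first function lists every partition and filters files by commit time afterwards; the second assumes that commit metadata records which partitions each commit wrote to, so only those partitions need to be listed.

```python
# Hypothetical model of a partitioned table (not Hudi's API).
# TABLE: partition path -> list of (file name, commit time that wrote the file)
TABLE = {
    "2020/01/06": [("a.parquet", "001")],
    "2020/01/07": [("b.parquet", "002")],
    "2020/01/08": [("c.parquet", "003"), ("d.parquet", "004")],
}

# TIMELINE: commit time -> partitions that commit wrote to (assumed to be
# available from commit metadata; this is an assumption of the sketch).
TIMELINE = {
    "001": ["2020/01/06"],
    "002": ["2020/01/07"],
    "003": ["2020/01/08"],
    "004": ["2020/01/08"],
}


def incremental_full_listing(table, begin_time):
    """List every partition, then drop files committed at or before begin_time."""
    partitions_listed = 0
    files = []
    for entries in table.values():
        partitions_listed += 1  # every partition is listed, relevant or not
        files.extend(name for name, commit_time in entries
                     if commit_time > begin_time)
    return sorted(files), partitions_listed


def incremental_from_timeline(table, timeline, begin_time):
    """List only the partitions touched by commits after begin_time."""
    touched = {partition
               for commit_time, partitions in timeline.items()
               if commit_time > begin_time
               for partition in partitions}
    files = []
    for partition in touched:
        files.extend(name for name, commit_time in table[partition]
                     if commit_time > begin_time)
    return sorted(files), len(touched)
```

For a query starting after commit "002", both functions return the same two files, but the first lists all three partitions while the second lists only one. On a table with many partitions, that difference is the planning cost the description refers to.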
--
This message was sent by Atlassian Jira
(v8.3.4#803005)