Posted to commits@hudi.apache.org by "Vinoth Chandar (Jira)" <ji...@apache.org> on 2019/11/12 01:56:00 UTC
[jira] [Updated] (HUDI-25) Faster Incremental queries on Hoodie #492
[ https://issues.apache.org/jira/browse/HUDI-25?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinoth Chandar updated HUDI-25:
-------------------------------
Status: Patch Available (was: In Progress)
> Faster Incremental queries on Hoodie #492
> -----------------------------------------
>
> Key: HUDI-25
> URL: https://issues.apache.org/jira/browse/HUDI-25
> Project: Apache Hudi (incubating)
> Issue Type: New Feature
> Components: Hive Integration
> Reporter: Vinoth Chandar
> Assignee: Bhavani Sudha Saktheeswaran
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.5.1
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Hive incremental queries on Hoodie currently have to list all partitions when a datestr is not present (this lists .hoodie and the partitions), and then end up throwing away many of the listed files, since filtering on `_hoodie_commit_time` column values excludes them. This can be very expensive: it increases query planning time and sometimes causes timeouts when the table is large. The original issue is tracked here - [https://github.com/uber/hudi/issues/492]
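
To make the cost described above concrete, here is a minimal sketch in Python. It is not Hudi or Hive code: `DataFile`, `plan_by_full_listing`, the partition paths and the commit timestamps are all hypothetical, chosen only to show how listing every partition and then filtering by commit time can touch many more files than the query actually needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for one data file in a table partition."""
    partition: str
    commit_time: str  # illustrative commit instant, formatted yyyyMMddHHmmss


def plan_by_full_listing(files_by_partition, begin_time):
    """List every partition, then keep only files committed after begin_time.

    Returns (number of files listed, files kept). Fixed-width timestamp
    strings compare correctly as plain strings.
    """
    listed = [f for files in files_by_partition.values() for f in files]
    kept = [f for f in listed if f.commit_time > begin_time]
    return len(listed), kept


# Illustrative table layout: three date partitions, four files in total.
table = {
    "2019/11/09": [
        DataFile("2019/11/09", "20191109080000"),
        DataFile("2019/11/09", "20191109120000"),
    ],
    "2019/11/10": [DataFile("2019/11/10", "20191110080000")],
    "2019/11/11": [DataFile("2019/11/11", "20191111080000")],
}

listed_count, kept = plan_by_full_listing(table, "20191110120000")
print(f"listed {listed_count} files, kept {len(kept)}")
```

In this toy layout every file in every partition is listed, but only the file in the newest partition survives the commit-time filter. On a large table the same pattern means the listing work grows with the total number of partitions and files rather than with the amount of changed data.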
--
This message was sent by Atlassian Jira
(v8.3.4#803005)