Posted to commits@hudi.apache.org by "Vinoth Chandar (Jira)" <ji...@apache.org> on 2022/01/06 16:46:00 UTC
[jira] [Commented] (HUDI-1628) [Umbrella] Improve data locality during ingestion
[ https://issues.apache.org/jira/browse/HUDI-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17470041#comment-17470041 ]
Vinoth Chandar commented on HUDI-1628:
--------------------------------------
[~guoyihua] Assigning this to you to drive it forward.
cc [~thirumalai.raj]: please let us know if you are still interested in pursuing this.
> [Umbrella] Improve data locality during ingestion
> -------------------------------------------------
>
> Key: HUDI-1628
> URL: https://issues.apache.org/jira/browse/HUDI-1628
> Project: Apache Hudi
> Issue Type: Epic
> Components: Writer Core
> Reporter: satish
> Assignee: Ethan Guo
> Priority: Major
> Labels: hudi-umbrellas
> Fix For: 0.11.0
>
>
> Today the upsert partitioner handles file sizing and bin-packing for
> inserts, then routes some inserts to existing file groups to maintain
> file size.
> Could we abstract all of this into strategies and some kind of pipeline
> abstraction, and have it also consider "affinity" to an existing file group
> based on, say, information stored in the metadata table?
> See http://mail-archives.apache.org/mod_mbox/hudi-dev/202102.mbox/browser
> for more details
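
To make the idea in the quoted description concrete, here is a minimal sketch of bin-packing inserts into existing file groups with a pluggable "affinity" strategy. It is not Hudi code: FileGroup, assign_inserts, no_affinity and prefix_affinity are hypothetical names, sizes are plain byte counts, and the "shared key prefix" rule only stands in for whatever locality information the metadata table might provide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# All names below are hypothetical and for illustration only; none of them
# are Hudi classes, methods, or configuration keys.


@dataclass
class FileGroup:
    file_id: str
    size_bytes: int
    keys: List[str] = field(default_factory=list)


# A strategy returns the preferred file group for a key among the candidates
# that still have room, or None when it has no preference.
AffinityStrategy = Callable[[str, List[FileGroup]], Optional[FileGroup]]


def no_affinity(key: str, candidates: List[FileGroup]) -> Optional[FileGroup]:
    """Baseline strategy: never express a preference."""
    return None


def prefix_affinity(key: str, candidates: List[FileGroup]) -> Optional[FileGroup]:
    """Prefer a file group that already holds a key with the same prefix."""
    prefix = key.split("-")[0]
    for fg in candidates:
        if any(k.split("-")[0] == prefix for k in fg.keys):
            return fg
    return None


def assign_inserts(
    keys: List[str],
    file_groups: List[FileGroup],
    max_file_bytes: int,
    record_bytes: int,
    affinity: AffinityStrategy = no_affinity,
) -> Dict[str, List[str]]:
    """Map each inserted key to a file id.

    Existing file groups with room are filled first (smallest first unless
    the affinity strategy picks one); new file groups are opened otherwise.
    """
    # Work on copies so the caller's file groups are not mutated.
    groups = [FileGroup(fg.file_id, fg.size_bytes, list(fg.keys)) for fg in file_groups]
    assignment: Dict[str, List[str]] = {}
    new_count = 0
    for key in keys:
        candidates = [fg for fg in groups if fg.size_bytes + record_bytes <= max_file_bytes]
        target = affinity(key, candidates)
        if target is None:
            if candidates:
                target = min(candidates, key=lambda fg: fg.size_bytes)
            else:
                new_count += 1
                target = FileGroup(f"new-{new_count}", 0)
                groups.append(target)
        target.size_bytes += record_bytes
        target.keys.append(key)
        assignment.setdefault(target.file_id, []).append(key)
    return assignment
```

Passing prefix_affinity instead of the default keeps keys with a common prefix in the same file group while room remains, whereas the default only balances file sizes. Keeping the strategy as a separate callable is one way to read the "strategies" abstraction proposed above.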
--
This message was sent by Atlassian Jira
(v8.20.1#820001)