Posted to commits@hudi.apache.org by "satish (Jira)" <ji...@apache.org> on 2020/07/06 23:04:00 UTC

[jira] [Created] (HUDI-1072) Reader changes to support clustering and insert overwrite

satish created HUDI-1072:
----------------------------

             Summary: Reader changes to support clustering and insert overwrite
                 Key: HUDI-1072
                 URL: https://issues.apache.org/jira/browse/HUDI-1072
             Project: Apache Hudi
          Issue Type: Sub-task
            Reporter: satish


* Add metadata to track ‘replaced’ files. Replaced files are essentially file groups that readers must ignore. For ‘insert overwrite’, these are all existing files in the overwritten partition. For ‘clustering’, these are all file groups that are merged into a new set of file groups.
* Change the views (AbstractTableFileSystemView and all subclasses) to ignore replaced files.
* Change the cleaner to delete data files that have been replaced (introduce a new policy?).
* Change archival so that it does not remove active commits that carry this special metadata while the corresponding data files have not yet been deleted.
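
The four changes above can be illustrated with a minimal, self-contained sketch. It is not Hudi code: the class ReplacedFileGroupSketch, the ReplaceMetadata type, and all method names are hypothetical and exist only to show the intended behaviour, assuming replaced file groups are recorded per partition by file group id.

```java
import java.util.*;
import java.util.stream.Collectors;

// Hypothetical sketch; none of these names are part of the Hudi API.
public class ReplacedFileGroupSketch {

    /** Stand-in for the 'replaced' metadata: partition -> ids of replaced file groups. */
    public static final class ReplaceMetadata {
        private final Map<String, Set<String>> replaced = new HashMap<>();

        public void markReplaced(String partition, String fileGroupId) {
            replaced.computeIfAbsent(partition, p -> new HashSet<>()).add(fileGroupId);
        }

        public boolean isReplaced(String partition, String fileGroupId) {
            return replaced.getOrDefault(partition, Collections.emptySet()).contains(fileGroupId);
        }
    }

    /** View behaviour: readers only see file groups that have not been replaced. */
    public static List<String> visibleFileGroups(String partition, List<String> all, ReplaceMetadata meta) {
        return all.stream()
                .filter(id -> !meta.isReplaced(partition, id))
                .collect(Collectors.toList());
    }

    /** Cleaner behaviour: replaced file groups are the candidates for data file deletion. */
    public static List<String> cleanableFileGroups(String partition, List<String> all, ReplaceMetadata meta) {
        return all.stream()
                .filter(id -> meta.isReplaced(partition, id))
                .collect(Collectors.toList());
    }

    /**
     * Archival behaviour: a commit carrying replace metadata may leave the active
     * timeline only once none of its replaced file groups remain on storage.
     */
    public static boolean canArchive(ReplaceMetadata meta, Map<String, Set<String>> onStorage) {
        for (Map.Entry<String, Set<String>> e : meta.replaced.entrySet()) {
            Set<String> present = onStorage.getOrDefault(e.getKey(), Collections.emptySet());
            for (String id : e.getValue()) {
                if (present.contains(id)) {
                    return false; // data files not cleaned yet, so the metadata is still needed
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ReplaceMetadata meta = new ReplaceMetadata();
        // Clustering merged fg-1 and fg-2 into the new file group fg-3.
        meta.markReplaced("2020/07/06", "fg-1");
        meta.markReplaced("2020/07/06", "fg-2");

        List<String> all = Arrays.asList("fg-1", "fg-2", "fg-3");
        System.out.println("visible:   " + visibleFileGroups("2020/07/06", all, meta));
        System.out.println("cleanable: " + cleanableFileGroups("2020/07/06", all, meta));
    }
}
```

In this sketch the views and the cleaner share the same metadata lookup, which is why archival has to keep the commit active until the cleaner has removed the replaced data files: otherwise readers would lose the information telling them to skip those files.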


