Posted to commits@hudi.apache.org by "Vinoth Chandar (Jira)" <ji...@apache.org> on 2020/03/31 14:32:00 UTC

[jira] [Commented] (HUDI-309) General Redesign of Archived Timeline for efficient scan and management

    [ https://issues.apache.org/jira/browse/HUDI-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17071842#comment-17071842 ] 

Vinoth Chandar commented on HUDI-309:
-------------------------------------

Folks, I think we would all agree this is a large effort, with potential overlap with RFC-15. I was thinking about a way to make progress here on one specific problem and unblock other projects along the way.

Specific problem: during write operations, we cache the input using Spark caching and compute a workload profile for purposes of file sizing etc. We also persist this information in the inflight commit/deltacommit file, for doing rollbacks. i.e., if the write fails midway leaving a .inflight commit/deltacommit, then upon the next write we read the workload profile written into that commit/deltacommit and attempt to delete the leftover files, or log rollback blocks into log files, to nullify the partial writes we might have made. Note that we will not read base or log files belonging to inflight instants, by checking the active timeline for whether the instant was inflight. But if we don't perform any rollback action and enough time passes, this instant will be archived, and that's where the trouble is. Once an instant goes into the archived timeline today, there is no way to check its individual state (inflight vs. completed), and this is what this JIRA was trying to handle in a generic way, so that this functionally critical path no longer depends on the in-memory caching.
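To make the problem concrete, here is a minimal sketch (class and method names are hypothetical, not Hudi's actual API) of the check described above: inflight state can be answered from the active timeline, but not once the instant is archived.

```java
import java.util.List;

// Illustrative sketch only: shows why rollback needs per-instant state.
public class InflightRollbackSketch {

    // While an instant is in the active timeline, its state is visible
    // directly from the meta file suffix (e.g. "20200331.commit.inflight").
    public static boolean isInflight(String instantFile) {
        return instantFile.endsWith(".inflight");
    }

    // Once archived, the instant's meta file is gone from the active
    // timeline, so this lookup fails -- the core problem in this JIRA:
    // the archived timeline cannot answer "inflight vs. completed".
    public static boolean canDetermineState(List<String> activeTimeline, String instantTime) {
        return activeTimeline.stream().anyMatch(f -> f.startsWith(instantTime));
    }
}
```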

Thinking back, I think we can shelve this JIRA as a longer-term effort and use an alternate approach to solve the specific problem above. During each write (from the Create and Merge handles; code is in HoodieTable.java), we already write out marker files under .hoodie that correspond 1-1 with each file being created or merged today. In case of a partial write, these marker files will be left behind (we need to ensure in code that we commit first and then delete markers), and we can use them directly to perform the rollback. (Note that we need to handle backwards compatibility with existing timelines, and also support downgrades to the old approach.)
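The marker-based rollback could look roughly like this sketch (the marker directory layout and names here are assumptions for illustration, not the final layout): each leftover marker maps back to exactly one partial data file to delete or roll back.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of marker-file based rollback. Assumes markers are
// written under .hoodie/.temp/<instantTime>/ mirroring the data file path,
// with a ".marker" suffix -- a hypothetical layout for this example.
public class MarkerRollbackSketch {

    // Translate a leftover marker path back to the data file it guards,
    // e.g. ".hoodie/.temp/20200331/2020/03/31/f1.parquet.marker"
    //   -> "2020/03/31/f1.parquet"
    public static String dataFileForMarker(String markerPath) {
        String stripped = markerPath.replaceFirst("^\\.hoodie/\\.temp/[^/]+/", "");
        return stripped.replaceFirst("\\.marker$", "");
    }

    // On the next write, any markers still present for an inflight instant
    // identify exactly the partial files to delete (or log rollback blocks for).
    public static List<String> filesToRollback(List<String> leftoverMarkers) {
        return leftoverMarkers.stream()
                .map(MarkerRollbackSketch::dataFileForMarker)
                .collect(Collectors.toList());
    }
}
```

Because the markers are created before the data files and deleted only after the commit succeeds, their mere presence is sufficient evidence of a partial write, with no dependency on the archived timeline.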

Let me know if this makes sense in a general way. We can then file a separate JIRA and get to work on it.

[~xleesf] [~vbalaji] [~vinoyang] [~nagarwal] 


> General Redesign of Archived Timeline for efficient scan and management
> -----------------------------------------------------------------------
>
>                 Key: HUDI-309
>                 URL: https://issues.apache.org/jira/browse/HUDI-309
>             Project: Apache Hudi (incubating)
>          Issue Type: New Feature
>          Components: Common Core
>            Reporter: Balaji Varadarajan
>            Assignee: Balaji Varadarajan
>            Priority: Major
>             Fix For: 0.6.0
>
>         Attachments: Archive TImeline Notes by Vinoth 1.jpg, Archived Timeline Notes by Vinoth 2.jpg
>
>
> As designed by Vinoth:
> Goals
>  # Archived Metadata should be scannable in the same way as data
>  # Provides more safety by always serving committed data independent of timeframe when the corresponding commit action was tried. Currently, we implicitly assume a data file to be valid if its commit time is older than the earliest time in the active timeline. While this works ok, any inherent bugs in rollback could inadvertently expose a possibly duplicate file when its commit timestamp becomes older than that of any commits in the timeline.
>  # We had to deal with a lot of corner cases because of the way we treat a "commit" as special after it gets archived. Examples include the Savepoint handling logic in the cleaner.
>  # Small Files: for cloud stores, archiving simply moves files from one directory to another, causing the archive folder to grow. We need a way to efficiently compact these files while remaining friendly to scans.
> Design:
>  The basic file-group abstraction for managing file versions of data files can be extended to managing archived commit metadata. The idea is to use an optimal format (like HFile) for storing a compacted version of <commitTime, Metadata> pairs. Every archiving run will read <commitTime, Metadata> pairs from the active timeline and append them to indexable log files. We will run periodic minor compactions to merge multiple log files into a compacted HFile storing metadata for a time range. It should also be noted that we will partition by action type (commit/clean). This design would allow the archived timeline to be queried to determine whether an instant is valid or not.
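The minor compaction step in the quoted design could be sketched as follows (assumed types; the real design would use HFile-backed file groups rather than in-memory maps): entries from several archive log files are merged into one sorted store, partitioned by action type, so time-range scans stay efficient.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch of minor compaction over archived timeline metadata.
// Each "log file" is modeled as a map of instant key ("<commitTime>.<action>")
// to serialized metadata; HFile is stood in for by a sorted TreeMap.
public class ArchiveCompactionSketch {

    // Merge entries for one action type (the partitioning in the design).
    // TreeMap keeps commit times sorted, mirroring HFile's sorted keys,
    // so the compacted output supports efficient time-range scans.
    public static Map<String, String> compact(List<Map<String, String>> logFiles, String actionType) {
        Map<String, String> compacted = new TreeMap<>();
        for (Map<String, String> log : logFiles) {
            log.forEach((key, meta) -> {
                if (key.endsWith("." + actionType)) {
                    compacted.put(key, meta); // later log files win on duplicates
                }
            });
        }
        return compacted;
    }
}
```

With per-instant metadata (including final state) preserved in this indexed form, the validity check that motivated this JIRA becomes a point lookup against the compacted archive instead of an implicit assumption about commit-time ordering.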



--
This message was sent by Atlassian Jira
(v8.3.4#803005)