Posted to commits@hudi.apache.org by "yihua (via GitHub)" <gi...@apache.org> on 2023/03/30 01:06:59 UTC

[GitHub] [hudi] yihua commented on pull request #8319: [HUDI-5934] Remove archival configs for metadata table

yihua commented on PR #8319:
URL: https://github.com/apache/hudi/pull/8319#issuecomment-1489540341

   > I guess the question is: do users ever want to retain more commits in the MDT than in the DT, e.g. for investigation purposes? @prashantwason: do you have any take here, or are we good to get rid of it?
   
   Retaining more commits in the MDT makes MDT reads slower, especially on cloud storage, because there are more instant files under `metadata/.hoodie` and loading the active timeline takes longer.  So I think it is reasonable to assume that the data table's and the metadata table's timelines go hand in hand.  For investigation purposes, if the MDT retains more commits than the DT, the corresponding DT commits are already in the archived timeline, so the archived timeline has to be loaded anyway.  With this PR, we can still investigate all the commits in the archived timeline.
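
   To illustrate the argument, here is a toy model in Python. It is not Hudi code: `split_timeline` and the `max_active` parameter are hypothetical names, and the sketch assumes archival simply keeps the newest N instants active. It shows that any commit still active in the MDT but not in the DT is necessarily in the DT's archived timeline.

```python
def split_timeline(instants, max_active):
    """Toy model (assumption): keep the newest `max_active` instants active
    and move everything older to the archived timeline."""
    ordered = sorted(instants)
    cut = max(len(ordered) - max_active, 0)
    return ordered[cut:], ordered[:cut]  # (active, archived)


# Ten made-up instant times shared by the data table (DT) and metadata table (MDT).
instants = [f"2023033000{i:02d}" for i in range(10)]

# Hypothetical retention: the MDT keeps more commits active than the DT.
dt_active, dt_archived = split_timeline(instants, max_active=4)
mdt_active, mdt_archived = split_timeline(instants, max_active=8)

# Commits that are still active in the MDT but no longer active in the DT.
only_in_mdt_active = [t for t in mdt_active if t not in dt_active]

# Investigating any of them on the DT side means reading the DT archived timeline.
needs_dt_archive = all(t in dt_archived for t in only_in_mdt_active)
```

   In this model the extra MDT retention only adds instants to the MDT active timeline; it does not avoid the archived-timeline read on the DT side.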


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org