Posted to commits@hudi.apache.org by "Manoj Govindassamy (Jira)" <ji...@apache.org> on 2021/10/22 18:34:00 UTC

[jira] [Created] (HUDI-2603) Metadata table bootstrapping is missed out when the feature is disabled intermittently

Manoj Govindassamy created HUDI-2603:
----------------------------------------

             Summary: Metadata table bootstrapping is missed out when the feature is disabled intermittently
                 Key: HUDI-2603
                 URL: https://issues.apache.org/jira/browse/HUDI-2603
             Project: Apache Hudi
          Issue Type: Bug
          Components: bootstrap
            Reporter: Manoj Govindassamy
            Assignee: Manoj Govindassamy


The metadata table is bootstrapped whenever its commits are found to be out of sync with the data table. Each instantiation of the metadata table performs this check. When the metadata table is enabled at the start, disabled after a few commits, and then re-enabled after more commits have landed, the current bootstrapping check does not seem to catch the resulting gap in commit sync, so the bootstrap is skipped.

 

```

protected void bootstrapIfNeeded(HoodieEngineContext engineContext, HoodieTableMetaClient dataMetaClient) throws IOException {
  HoodieTimer timer = new HoodieTimer().startTimer();
  boolean exists = dataMetaClient.getFs().exists(new Path(metadataWriteConfig.getBasePath(), HoodieTableMetaClient.METAFOLDER_NAME));
  boolean rebootstrap = false;
  if (exists) {
    // If the un-synched instants have been archived then the metadata table will need to be bootstrapped again
    HoodieTableMetaClient metadataMetaClient = HoodieTableMetaClient.builder().setConf(hadoopConf.get())
        .setBasePath(metadataWriteConfig.getBasePath()).build();
    Option<HoodieInstant> latestMetadataInstant = metadataMetaClient.getActiveTimeline().filterCompletedInstants().lastInstant();
    if (!latestMetadataInstant.isPresent()) {
      LOG.warn("Metadata Table will need to be re-bootstrapped as no instants were found");
      rebootstrap = true;
    } else if (!latestMetadataInstant.get().getTimestamp().equals(SOLO_COMMIT_TIMESTAMP)
        && dataMetaClient.getActiveTimeline().getAllCommitsTimeline().isBeforeTimelineStarts(latestMetadataInstant.get().getTimestamp())) {
      // TODO: Revisit this logic and validate that filtering for all commits timeline is the right thing to do
      LOG.warn("Metadata Table will need to be re-bootstrapped as un-synced instants have been archived."
          + " latestMetadataInstant=" + latestMetadataInstant.get().getTimestamp()
          + ", latestDataInstant=" + dataMetaClient.getActiveTimeline().firstInstant().get().getTimestamp());
      rebootstrap = true;
    }
  }

```
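
To make the missed case concrete, here is a minimal standalone sketch of the scenario. It does not use Hudi classes: timelines are modeled as plain lists of timestamp strings, and it assumes that metadata table instants carry the same timestamps as the data table commits they cover. The class BootstrapGapSketch and the method hasUnsyncedGap are hypothetical names used only for illustration. hasUnsyncedGap shows one possible stricter check and is not necessarily the fix for this issue.

```java
import java.util.List;
import java.util.Set;

final class BootstrapGapSketch {

  // Simplified model of the quoted check: re-bootstrap only when the latest
  // metadata instant is older than the first instant still on the data
  // table's active timeline (i.e. the un-synced instants were archived).
  static boolean currentCheck(List<String> dataActiveInstants, String latestMetadataInstant) {
    if (dataActiveInstants.isEmpty()) {
      return false;
    }
    return latestMetadataInstant.compareTo(dataActiveInstants.get(0)) < 0;
  }

  // Hypothetical stricter check (an assumption, not Hudi's API): re-bootstrap
  // when any data instant on the active timeline is absent from the metadata table.
  static boolean hasUnsyncedGap(List<String> dataActiveInstants, Set<String> metadataInstants) {
    return !metadataInstants.containsAll(dataActiveInstants);
  }

  public static void main(String[] args) {
    // Metadata enabled for commits 001 and 002, disabled for 003 and 004,
    // and about to be re-enabled. Nothing has been archived yet.
    List<String> dataActive = List.of("001", "002", "003", "004");
    Set<String> metadata = Set.of("001", "002");

    // "002" is not before the start of the data timeline, so no re-bootstrap is triggered.
    System.out.println("currentCheck   = " + currentCheck(dataActive, "002"));
    // Commits 003 and 004 are missing from the metadata table.
    System.out.println("hasUnsyncedGap = " + hasUnsyncedGap(dataActive, metadata));
  }
}
```

In this model the existing condition reports that no re-bootstrap is needed, even though commits 003 and 004 never reached the metadata table.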

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)