Posted to notifications@accumulo.apache.org by "keith-turner (via GitHub)" <gi...@apache.org> on 2023/12/21 22:50:26 UTC

[PR] Moves the compaction commit process into FATE [accumulo]

keith-turner opened a new pull request, #4109:
URL: https://github.com/apache/accumulo/pull/4109

   The custom refresh tracking code was removed, and compaction commit is now a FATE operation with the following 4 steps.
   
    1. Rename the file, done in the RenameCompactionFile class
    2. Update the metadata table via a conditional mutation, done in the CommitCompaction class
    3. Write the GC candidates, done in the PutGcCandidates class
    4. Optionally send an RPC refresh request if the tablet was hosted, done in the RefreshTablet class
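
The four steps above can be pictured as a chain in which each step returns the next one, and a step that fails is simply run again. The following is a minimal, self-contained sketch of that idea, not the code in this PR: the `Step` interface, the `run` loop, and the `failGcOnce` flag are hypothetical stand-ins for FATE, and the nested classes only borrow the step names and record which step completed.

```java
import java.util.ArrayList;
import java.util.List;

final class CommitStepsSketch {
  /** Hypothetical stand-in for a FATE step: do the work, return the next step, or null when done. */
  interface Step {
    Step call(List<String> done) throws Exception;
  }

  // Hypothetical switch: when true, the GC candidate step fails once to simulate a crash.
  static boolean failGcOnce = false;

  static final class RenameCompactionFile implements Step {
    private final boolean hosted;
    RenameCompactionFile(boolean hosted) { this.hosted = hosted; }
    public Step call(List<String> done) {
      done.add("rename");
      return new CommitCompaction(hosted);
    }
  }

  static final class CommitCompaction implements Step {
    private final boolean hosted;
    CommitCompaction(boolean hosted) { this.hosted = hosted; }
    public Step call(List<String> done) {
      done.add("commit");
      return new PutGcCandidates(hosted);
    }
  }

  static final class PutGcCandidates implements Step {
    private final boolean hosted;
    PutGcCandidates(boolean hosted) { this.hosted = hosted; }
    public Step call(List<String> done) throws Exception {
      if (failGcOnce) {
        failGcOnce = false;
        throw new IllegalStateException("simulated failure before writing GC candidates");
      }
      done.add("gc-candidates");
      // The refresh step is optional: only run it when the tablet was hosted.
      return hosted ? new RefreshTablet() : null;
    }
  }

  static final class RefreshTablet implements Step {
    public Step call(List<String> done) {
      done.add("refresh");
      return null;
    }
  }

  /** Runs steps in order; a step that throws is retried instead of being skipped. */
  static List<String> run(Step first) {
    List<String> done = new ArrayList<>();
    Step current = first;
    while (current != null) {
      try {
        current = current.call(done);
      } catch (Exception e) {
        // Leave 'current' unchanged so the same step runs again on the next pass.
      }
    }
    return done;
  }
}
```

With `failGcOnce` set, `run(new RenameCompactionFile(true))` still records every step exactly once, even though the GC candidate step was attempted twice. With `hosted` set to false the chain ends before the refresh.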
   
   There is some follow-on work that still needs to be done to improve how this change works with detecting dead compactions.  After that is done, these changes should address the problems outlined in #3811 and #3802 that were related to process death before adding GC candidates.  Now that GC candidates are written in FATE, if the process dies, the FATE operation will run again later.
   
   This currently stores the compaction commit FATE operations in ZooKeeper.  This would not be suitable for a cluster because per-tablet information should never be stored in ZooKeeper.  However, it's fine as a temporary situation in the elasticity branch until FATE storage is available in an Accumulo table; see #4049 and #3559.
   
   WIP
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] Moves the compaction commit process into FATE [accumulo]

Posted by "cshannon (via GitHub)" <gi...@apache.org>.
cshannon commented on code in PR #4109:
URL: https://github.com/apache/accumulo/pull/4109#discussion_r1451048703


##########
server/manager/src/main/java/org/apache/accumulo/manager/compaction/coordinator/commit/RenameCompactionFile.java:
##########
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.accumulo.manager.compaction.coordinator.commit;
+
+import java.io.IOException;
+
+import org.apache.accumulo.core.fate.Repo;
+import org.apache.accumulo.core.metadata.ReferencedTabletFile;
+import org.apache.accumulo.manager.Manager;
+import org.apache.accumulo.manager.tableOps.ManagerRepo;
+import org.apache.accumulo.server.tablets.TabletNameGenerator;
+import org.apache.hadoop.fs.Path;
+
+public class RenameCompactionFile extends ManagerRepo {
+  private static final long serialVersionUID = 1L;
+  private final CompactionCommitData commitData;
+
+  public RenameCompactionFile(CompactionCommitData commitData) {
+    this.commitData = commitData;
+  }
+
+  @Override
+  public Repo<Manager> call(long tid, Manager manager) throws Exception {
+    ReferencedTabletFile newDatafile = null;
+    var ctx = manager.getContext();
+
+    var tmpPath = commitData.outputTmpPath;
+
+    if (commitData.stats.getEntriesWritten() == 0) {
+      // the compaction produced no output so do not need to rename or add a file to the metadata
+      // table, only delete the input files.
+      if (!ctx.getVolumeManager().delete(new Path(tmpPath))) {
+        throw new IOException("delete returned false");
+      }
+    } else {
+      newDatafile =
+          TabletNameGenerator.computeCompactionFileDest(ReferencedTabletFile.of(new Path(tmpPath)));
+      if (!ctx.getVolumeManager().rename(new Path(tmpPath), newDatafile.getPath())) {

Review Comment:
   New changes LGTM





Re: [PR] Moves the compaction commit process into FATE [accumulo]

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner merged PR #4109:
URL: https://github.com/apache/accumulo/pull/4109




Re: [PR] Moves the compaction commit process into FATE [accumulo]

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner commented on code in PR #4109:
URL: https://github.com/apache/accumulo/pull/4109#discussion_r1451045688


##########
server/manager/src/main/java/org/apache/accumulo/manager/compaction/coordinator/commit/RenameCompactionFile.java:
##########

Review Comment:
   Pushed a fix in 8886367





Re: [PR] Moves the compaction commit process into FATE [accumulo]

Posted by "keith-turner (via GitHub)" <gi...@apache.org>.
keith-turner commented on code in PR #4109:
URL: https://github.com/apache/accumulo/pull/4109#discussion_r1451036721


##########
server/manager/src/main/java/org/apache/accumulo/manager/compaction/coordinator/commit/RenameCompactionFile.java:
##########

Review Comment:
   > Is this idempotent?
   
   No, I don't think it is. That is a good find.





Re: [PR] Moves the compaction commit process into FATE [accumulo]

Posted by "cshannon (via GitHub)" <gi...@apache.org>.
cshannon commented on code in PR #4109:
URL: https://github.com/apache/accumulo/pull/4109#discussion_r1451027408


##########
server/manager/src/main/java/org/apache/accumulo/manager/compaction/coordinator/commit/RenameCompactionFile.java:
##########

Review Comment:
   Is this idempotent? What happens if the file is renamed and then this crashes before completing so this fate step reruns? The previous tmpPath wouldn't exist anymore so should we add a check to see if the file was already renamed in case this runs a second time?
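
One way to make a rename step safe to rerun, along the lines of the check suggested above, is to treat "source gone, destination present" as already done. The sketch below illustrates that check with `java.nio.file` on a local filesystem. It is an assumption for illustration only: it does not use Accumulo's `VolumeManager`, it is not necessarily what the PR ended up doing, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class IdempotentRenameSketch {
  /**
   * Moves tmp to dest, tolerating a rerun after an earlier call already moved the file.
   * Assumes only one caller works on a given tmp/dest pair at a time, because the
   * existence checks and the move are not atomic together.
   */
  static void renameIfNeeded(Path tmp, Path dest) throws IOException {
    if (Files.exists(tmp)) {
      // First run, or a rerun after a crash that happened before the move.
      Files.move(tmp, dest);
    } else if (!Files.exists(dest)) {
      // Neither file exists, so something other than a completed rename happened.
      throw new IOException("neither " + tmp + " nor " + dest + " exists");
    }
    // Otherwise tmp is gone and dest exists: an earlier run completed the rename.
  }
}
```

Calling `renameIfNeeded` twice with the same arguments leaves the destination in place and does not fail on the second call, which is the property a rerun FATE step needs.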


