Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/09/15 18:18:38 UTC

[GitHub] [ozone] prashantpogde opened a new pull request, #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

prashantpogde opened a new pull request, #3755:
URL: https://github.com/apache/ozone/pull/3755

   ## What changes were proposed in this pull request?
   
   - Add a RocksDB diff utility. We want to identify the set of SST files that are different between two different RocksDB checkpoints.
   - Integrate this utility with Ozone
   
   The changes here include basic changes to track differences in SST files across RocksDB compaction. More changes will follow.
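
As a rough illustration of the first goal, the sketch below compares the SST file names of two checkpoints with plain set operations. It is not the code in this PR: the class and method names (`SstFileSetDiff`, `differingFiles`) and the sample file names are hypothetical, and the compaction tracking mentioned above is not modeled, so files that compaction merely rewrote would also be reported as different.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of the PR: a naive comparison of the
// SST file names found in two RocksDB checkpoints.
public class SstFileSetDiff {

  /** Returns the file names that appear in exactly one of the two sets. */
  static Set<String> differingFiles(Set<String> first, Set<String> second) {
    // Files present only in the first checkpoint.
    Set<String> result = new TreeSet<>(first);
    result.removeAll(second);

    // Files present only in the second checkpoint.
    Set<String> onlyInSecond = new TreeSet<>(second);
    onlyInSecond.removeAll(first);

    result.addAll(onlyInSecond);
    return result;
  }

  public static void main(String[] args) {
    // Sample file names are made up for the example.
    Set<String> older = Set.of("000010.sst", "000011.sst", "000012.sst");
    Set<String> newer = Set.of("000012.sst", "000015.sst");
    System.out.println(differingFiles(older, newer));
  }
}
```

With the sample sets in `main`, this prints `[000010.sst, 000011.sst, 000015.sst]`.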
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-7224
   
   ## How was this patch tested?
   
   Manual Testing.
   
   Automated testing will be added in subsequent patches.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r974914124


##########
pom.xml:
##########
@@ -1843,35 +1843,47 @@ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xs
               </goals>
               <configuration>
                 <rules>
-                  <RestrictImports>
-                    <includeTestCode>false</includeTestCode>
-                    <reason>Use managed RocksObjects under org.apache.hadoop.hdds.utils.db.managed instead.</reason>
-                    <!-- By default, ban all the classes in org.rocksdb -->
-                    <bannedImport>org.rocksdb.**</bannedImport>
-                    <allowedImports>
-                      <!-- Allow non-RocksObject classes. -->
-                      <allowedImport>org.rocksdb.BlockBasedTableConfig</allowedImport>
-                      <allowedImport>org.rocksdb.ColumnFamilyDescriptor</allowedImport>
-                      <allowedImport>org.rocksdb.CompactionStyle</allowedImport>
-                      <allowedImport>org.rocksdb.HistogramData</allowedImport>
-                      <allowedImport>org.rocksdb.HistogramType</allowedImport>
-                      <allowedImport>org.rocksdb.Holder</allowedImport>
-                      <allowedImport>org.rocksdb.InfoLogLevel</allowedImport>
-                      <allowedImport>org.rocksdb.OptionsUtil</allowedImport>
-                      <allowedImport>org.rocksdb.RocksDBException</allowedImport>
-                      <allowedImport>org.rocksdb.StatsLevel</allowedImport>
-                      <allowedImport>org.rocksdb.TransactionLogIterator.BatchResult</allowedImport>
-                      <allowedImport>org.rocksdb.TickerType</allowedImport>
-
-                      <!-- Allow RocksObjects whose native pointer is managed by RocksDB. -->
-                      <allowedImport>org.rocksdb.ColumnFamilyHandle</allowedImport>
-                      <allowedImport>org.rocksdb.Env</allowedImport>
-                      <allowedImport>org.rocksdb.Statistics</allowedImport>
-
-                      <!-- Allow RocksDB constants and static methods to be used. -->
-                      <allowedImport>org.rocksdb.RocksDB.*</allowedImport>
-                    </allowedImports>
-                    <exclusion>org.apache.hadoop.hdds.utils.db.managed.*</exclusion>
+                  <RestrictImports>	
+                    <includeTestCode>false</includeTestCode>	
+                    <reason>Use managed RocksObjects under org.apache.hadoop.hdds.utils.db.managed instead.</reason>	
+                    <!-- By default, ban all the classes in org.rocksdb -->	
+                    <bannedImport>org.rocksdb.**</bannedImport>	
+                    <allowedImports>	
+                      <!-- Allow non-RocksObject classes. -->	
+                      <allowedImport>org.rocksdb.BlockBasedTableConfig</allowedImport>	
+                      <allowedImport>org.rocksdb.ColumnFamilyDescriptor</allowedImport>	
+                      <allowedImport>org.rocksdb.CompactionStyle</allowedImport>	
+                      <allowedImport>org.rocksdb.HistogramData</allowedImport>	
+                      <allowedImport>org.rocksdb.HistogramType</allowedImport>	
+                      <allowedImport>org.rocksdb.Holder</allowedImport>	
+                      <allowedImport>org.rocksdb.InfoLogLevel</allowedImport>	
+                      <allowedImport>org.rocksdb.OptionsUtil</allowedImport>	
+                      <allowedImport>org.rocksdb.RocksDBException</allowedImport>	
+                      <allowedImport>org.rocksdb.StatsLevel</allowedImport>	
+                      <allowedImport>org.rocksdb.TransactionLogIterator.BatchResult</allowedImport>	
+                      <allowedImport>org.rocksdb.TickerType</allowedImport>	
+
+                      <allowedImport>org.rocksdb.AbstractEventListener</allowedImport>	
+                      <allowedImport>org.rocksdb.Checkpoint</allowedImport>	
+                      <allowedImport>org.rocksdb.CompactionJobInfo</allowedImport>	
+                      <allowedImport>org.rocksdb.CompressionType</allowedImport>	
+                      <allowedImport>org.rocksdb.DBOptions</allowedImport>	
+                      <allowedImport>org.rocksdb.FlushOptions</allowedImport>	
+                      <allowedImport>org.rocksdb.LiveFileMetaData</allowedImport>	
+                      <allowedImport>org.rocksdb.Options</allowedImport>	
+                      <allowedImport>org.rocksdb.RocksDB</allowedImport>	

Review Comment:
   `org.rocksdb.RocksDB` is only used in the UT.
   For `org.rocksdb.Options` I think we will migrate to a managed one later.





[GitHub] [ozone] smengcl commented on pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on PR #3755:
URL: https://github.com/apache/ozone/pull/3755#issuecomment-1258643366

   Thanks @prashantpogde for the original PR. Thanks @nandakumar131 for the review.




[GitHub] [ozone] nandakumar131 commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
nandakumar131 commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r977985602


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:
##########
@@ -0,0 +1,817 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.ozone.rocksdiff;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.stream.Collectors;
+
+
+import org.rocksdb.AbstractEventListener;
+import org.rocksdb.Checkpoint;
+import org.rocksdb.CompactionJobInfo;
+import org.rocksdb.CompressionType;
+import org.rocksdb.DBOptions;
+import org.rocksdb.FlushOptions;
+import org.rocksdb.LiveFileMetaData;
+import org.rocksdb.Options;
+import org.rocksdb.RocksDB;
+import org.rocksdb.RocksDBException;
+import org.rocksdb.SstFileReader;
+import org.rocksdb.TableProperties;
+
+import com.google.common.graph.GraphBuilder;
+import com.google.common.graph.MutableGraph;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
+
+// TODO
+//  1. Create a local instance of RocksDiff-local-RocksDB. This is the
+//  rocksDB that we can use for maintaining DAG and any other state. This is
+//  a per node state so it doesn't have to go through RATIS anyway.
+//  2. Store fwd DAG in Diff-Local-RocksDB in Compaction Listener
+//  3. Store fwd DAG in Diff-Local-RocksDB in Compaction Listener
+//  4. Store last-Snapshot-counter/Compaction-generation-counter in Diff-Local
+//  -RocksDB in Compaction Listener
+//  5. System Restart handling. Read the DAG from Disk and load it in memory.
+//  6. Take the base snapshot. All the SST file nodes in the base snapshot
+//  should be marked with that Snapshot generation. Subsequently, all SST file
+//  node should have a snapshot-generation count and Compaction-generation
+//  count.
+//  6. DAG based SST file pruning. Start from the oldest snapshot and we can
+//  unlink any SST
+//  file from the SaveCompactedFilePath directory that is reachable in the
+//  reverse DAG.
+//  7. DAG pruning : For each snapshotted bucket, We can recycle the part of
+//  the DAG that is older than the predefined policy for the efficient snapdiff.
+//  E.g. we may decide not to support efficient snapdiff from any snapshot that
+//  is older than 2 weeks.
+//  Note on 8. & 9 .
+//  ==================
+//  A simple handling is to just iterate over all keys in keyspace when the
+//  compaction DAG is lost, instead of optimizing every case. And start
+//  Compaction-DAG afresh from the latest snapshot.
+//  --
+//  8. Handle bootstrapping rocksDB for a new OM follower node
+//      - new node will receive Active object store as well as all existing
+//      rocksDB checkpoints.
+//      - This bootstrapping should also receive the compaction-DAG information
+//  9. Handle rebuilding the DAG for a lagging follower. There are two cases
+//      - receive RATIS transactions to replay. Nothing needs to be done in
+//      this case.
+//      - Getting the DB sync. This case needs to handle getting the
+//      compaction-DAG information as well.
+//
+//
+/**
+ *  RocksDBCheckpointDiffer class.
+ */
+//CHECKSTYLE:OFF
+@SuppressWarnings({"NM_METHOD_NAMING_CONVENTION"})
+public class RocksDBCheckpointDiffer {
+  private final String rocksDbPath;
+  private String cpPath;
+  private final String CFDBPATH;
+  private final String SAVE_COMPACTED_FILE_PATH;
+  private final int MAX_SNAPSHOTS;

Review Comment:
   These are not constants, we should have them in camel case.
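
For illustration, a minimal sketch of the naming this comment asks for might look like the following. The class name `NamingSketch`, the camelCase field names, the getters, and the default value are all hypothetical and are not taken from the PR.

```java
// Hypothetical sketch, not the PR's code: per-instance final fields use
// camelCase, while UPPER_SNAKE_CASE is kept for static final constants.
public class NamingSketch {

  // A true constant: static, final, and shared by every instance.
  // The value 10 is an arbitrary placeholder for this example.
  private static final int DEFAULT_MAX_SNAPSHOTS = 10;

  // Assigned once per instance in the constructor, so not constants.
  private final String cfDbPath;
  private final String saveCompactedFilePath;
  private final int maxSnapshots;

  public NamingSketch(String cfDbPath, String saveCompactedFilePath) {
    this(cfDbPath, saveCompactedFilePath, DEFAULT_MAX_SNAPSHOTS);
  }

  public NamingSketch(String cfDbPath, String saveCompactedFilePath,
      int maxSnapshots) {
    this.cfDbPath = cfDbPath;
    this.saveCompactedFilePath = saveCompactedFilePath;
    this.maxSnapshots = maxSnapshots;
  }

  public String getCfDbPath() {
    return cfDbPath;
  }

  public String getSaveCompactedFilePath() {
    return saveCompactedFilePath;
  }

  public int getMaxSnapshots() {
    return maxSnapshots;
  }
}
```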





[GitHub] [ozone] adoroszlai commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r974926266


##########
hadoop-hdds/pom.xml:
##########
@@ -131,6 +132,12 @@ https://maven.apache.org/xsd/maven-4.0.0.xsd">
         <version>${hdds.version}</version>
       </dependency>
 
+    <dependency>
+      <groupId>org.apache.ozone</groupId>
+      <artifactId>RocksDBCheckpointDiffer</artifactId>

Review Comment:
   Leftover old artifact ID:
   
   ```suggestion
         <artifactId>rocksdb-checkpoint-differ</artifactId>
   ```





[GitHub] [ozone] smengcl commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r978972942


##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RDBStore.java:
##########
@@ -80,6 +82,13 @@ public RDBStore(File dbFile, ManagedDBOptions dbOptions,
     dbLocation = dbFile;
 
     try {
+      rocksDBCheckpointDiffer =
+          new RocksDBCheckpointDiffer(dbLocation.getAbsolutePath(), 1000,

Review Comment:
   done



##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RDBStore.java:
##########
@@ -80,6 +82,13 @@ public RDBStore(File dbFile, ManagedDBOptions dbOptions,
     dbLocation = dbFile;
 
     try {
+      rocksDBCheckpointDiffer =
+          new RocksDBCheckpointDiffer(dbLocation.getAbsolutePath(), 1000,

Review Comment:
   done





[GitHub] [ozone] smengcl commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r978968277


##########
hadoop-hdds/rocksdb-checkpoint-differ/pom.xml:
##########
@@ -0,0 +1,193 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
+https://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.ozone</groupId>
+    <artifactId>hdds</artifactId>
+    <version>1.3.0-SNAPSHOT</version>
+  </parent>
+  <artifactId>rocksdb-checkpoint-differ</artifactId>
+  <version>1.3.0-SNAPSHOT</version>
+  <description>RocksDB Checkpoint Differ</description>
+  <name>RocksDB Checkpoint Differ</name>
+  <packaging>jar</packaging>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.rocksdb</groupId>
+      <artifactId>rocksdbjni</artifactId>
+      <version>7.4.5</version>

Review Comment:
   done





[GitHub] [ozone] smengcl commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r974912953


##########
pom.xml:
##########
@@ -1843,35 +1843,47 @@ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xs
               </goals>
               <configuration>
                 <rules>
-                  <RestrictImports>
-                    <includeTestCode>false</includeTestCode>
-                    <reason>Use managed RocksObjects under org.apache.hadoop.hdds.utils.db.managed instead.</reason>
-                    <!-- By default, ban all the classes in org.rocksdb -->
-                    <bannedImport>org.rocksdb.**</bannedImport>
-                    <allowedImports>
-                      <!-- Allow non-RocksObject classes. -->
-                      <allowedImport>org.rocksdb.BlockBasedTableConfig</allowedImport>
-                      <allowedImport>org.rocksdb.ColumnFamilyDescriptor</allowedImport>
-                      <allowedImport>org.rocksdb.CompactionStyle</allowedImport>
-                      <allowedImport>org.rocksdb.HistogramData</allowedImport>
-                      <allowedImport>org.rocksdb.HistogramType</allowedImport>
-                      <allowedImport>org.rocksdb.Holder</allowedImport>
-                      <allowedImport>org.rocksdb.InfoLogLevel</allowedImport>
-                      <allowedImport>org.rocksdb.OptionsUtil</allowedImport>
-                      <allowedImport>org.rocksdb.RocksDBException</allowedImport>
-                      <allowedImport>org.rocksdb.StatsLevel</allowedImport>
-                      <allowedImport>org.rocksdb.TransactionLogIterator.BatchResult</allowedImport>
-                      <allowedImport>org.rocksdb.TickerType</allowedImport>
-
-                      <!-- Allow RocksObjects whose native pointer is managed by RocksDB. -->
-                      <allowedImport>org.rocksdb.ColumnFamilyHandle</allowedImport>
-                      <allowedImport>org.rocksdb.Env</allowedImport>
-                      <allowedImport>org.rocksdb.Statistics</allowedImport>
-
-                      <!-- Allow RocksDB constants and static methods to be used. -->
-                      <allowedImport>org.rocksdb.RocksDB.*</allowedImport>
-                    </allowedImports>
-                    <exclusion>org.apache.hadoop.hdds.utils.db.managed.*</exclusion>
+                  <RestrictImports>	
+                    <includeTestCode>false</includeTestCode>	
+                    <reason>Use managed RocksObjects under org.apache.hadoop.hdds.utils.db.managed instead.</reason>	
+                    <!-- By default, ban all the classes in org.rocksdb -->	
+                    <bannedImport>org.rocksdb.**</bannedImport>	
+                    <allowedImports>	
+                      <!-- Allow non-RocksObject classes. -->	
+                      <allowedImport>org.rocksdb.BlockBasedTableConfig</allowedImport>	
+                      <allowedImport>org.rocksdb.ColumnFamilyDescriptor</allowedImport>	
+                      <allowedImport>org.rocksdb.CompactionStyle</allowedImport>	
+                      <allowedImport>org.rocksdb.HistogramData</allowedImport>	
+                      <allowedImport>org.rocksdb.HistogramType</allowedImport>	
+                      <allowedImport>org.rocksdb.Holder</allowedImport>	
+                      <allowedImport>org.rocksdb.InfoLogLevel</allowedImport>	
+                      <allowedImport>org.rocksdb.OptionsUtil</allowedImport>	
+                      <allowedImport>org.rocksdb.RocksDBException</allowedImport>	
+                      <allowedImport>org.rocksdb.StatsLevel</allowedImport>	
+                      <allowedImport>org.rocksdb.TransactionLogIterator.BatchResult</allowedImport>	
+                      <allowedImport>org.rocksdb.TickerType</allowedImport>	
+
+                      <allowedImport>org.rocksdb.AbstractEventListener</allowedImport>	
+                      <allowedImport>org.rocksdb.Checkpoint</allowedImport>	
+                      <allowedImport>org.rocksdb.CompactionJobInfo</allowedImport>	
+                      <allowedImport>org.rocksdb.CompressionType</allowedImport>	
+                      <allowedImport>org.rocksdb.DBOptions</allowedImport>	
+                      <allowedImport>org.rocksdb.FlushOptions</allowedImport>	
+                      <allowedImport>org.rocksdb.LiveFileMetaData</allowedImport>	
+                      <allowedImport>org.rocksdb.Options</allowedImport>	
+                      <allowedImport>org.rocksdb.RocksDB</allowedImport>	

Review Comment:
   Done. Overridden in the submodule's pom.xml instead.





[GitHub] [ozone] prashantpogde commented on pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
prashantpogde commented on PR #3755:
URL: https://github.com/apache/ozone/pull/3755#issuecomment-1249845339

   Find bugs warnings are in UnitTest Class and can be ignored for now. Integration test failures are unrelated.




[GitHub] [ozone] adoroszlai commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r973534185


##########
pom.xml:
##########
@@ -1843,35 +1843,47 @@ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xs
               </goals>
               <configuration>
                 <rules>
-                  <RestrictImports>
-                    <includeTestCode>false</includeTestCode>
-                    <reason>Use managed RocksObjects under org.apache.hadoop.hdds.utils.db.managed instead.</reason>
-                    <!-- By default, ban all the classes in org.rocksdb -->
-                    <bannedImport>org.rocksdb.**</bannedImport>
-                    <allowedImports>
-                      <!-- Allow non-RocksObject classes. -->
-                      <allowedImport>org.rocksdb.BlockBasedTableConfig</allowedImport>
-                      <allowedImport>org.rocksdb.ColumnFamilyDescriptor</allowedImport>
-                      <allowedImport>org.rocksdb.CompactionStyle</allowedImport>
-                      <allowedImport>org.rocksdb.HistogramData</allowedImport>
-                      <allowedImport>org.rocksdb.HistogramType</allowedImport>
-                      <allowedImport>org.rocksdb.Holder</allowedImport>
-                      <allowedImport>org.rocksdb.InfoLogLevel</allowedImport>
-                      <allowedImport>org.rocksdb.OptionsUtil</allowedImport>
-                      <allowedImport>org.rocksdb.RocksDBException</allowedImport>
-                      <allowedImport>org.rocksdb.StatsLevel</allowedImport>
-                      <allowedImport>org.rocksdb.TransactionLogIterator.BatchResult</allowedImport>
-                      <allowedImport>org.rocksdb.TickerType</allowedImport>
-
-                      <!-- Allow RocksObjects whose native pointer is managed by RocksDB. -->
-                      <allowedImport>org.rocksdb.ColumnFamilyHandle</allowedImport>
-                      <allowedImport>org.rocksdb.Env</allowedImport>
-                      <allowedImport>org.rocksdb.Statistics</allowedImport>
-
-                      <!-- Allow RocksDB constants and static methods to be used. -->
-                      <allowedImport>org.rocksdb.RocksDB.*</allowedImport>
-                    </allowedImports>
-                    <exclusion>org.apache.hadoop.hdds.utils.db.managed.*</exclusion>
+                  <RestrictImports>	
+                    <includeTestCode>false</includeTestCode>	
+                    <reason>Use managed RocksObjects under org.apache.hadoop.hdds.utils.db.managed instead.</reason>	
+                    <!-- By default, ban all the classes in org.rocksdb -->	
+                    <bannedImport>org.rocksdb.**</bannedImport>	
+                    <allowedImports>	
+                      <!-- Allow non-RocksObject classes. -->	
+                      <allowedImport>org.rocksdb.BlockBasedTableConfig</allowedImport>	
+                      <allowedImport>org.rocksdb.ColumnFamilyDescriptor</allowedImport>	
+                      <allowedImport>org.rocksdb.CompactionStyle</allowedImport>	
+                      <allowedImport>org.rocksdb.HistogramData</allowedImport>	
+                      <allowedImport>org.rocksdb.HistogramType</allowedImport>	
+                      <allowedImport>org.rocksdb.Holder</allowedImport>	
+                      <allowedImport>org.rocksdb.InfoLogLevel</allowedImport>	
+                      <allowedImport>org.rocksdb.OptionsUtil</allowedImport>	
+                      <allowedImport>org.rocksdb.RocksDBException</allowedImport>	
+                      <allowedImport>org.rocksdb.StatsLevel</allowedImport>	
+                      <allowedImport>org.rocksdb.TransactionLogIterator.BatchResult</allowedImport>	
+                      <allowedImport>org.rocksdb.TickerType</allowedImport>	
+
+                      <allowedImport>org.rocksdb.AbstractEventListener</allowedImport>	
+                      <allowedImport>org.rocksdb.Checkpoint</allowedImport>	
+                      <allowedImport>org.rocksdb.CompactionJobInfo</allowedImport>	
+                      <allowedImport>org.rocksdb.CompressionType</allowedImport>	
+                      <allowedImport>org.rocksdb.DBOptions</allowedImport>	
+                      <allowedImport>org.rocksdb.FlushOptions</allowedImport>	
+                      <allowedImport>org.rocksdb.LiveFileMetaData</allowedImport>	
+                      <allowedImport>org.rocksdb.Options</allowedImport>	
+                      <allowedImport>org.rocksdb.RocksDB</allowedImport>	

Review Comment:
   Allowing these (e.g. `RocksDB` and `Options`) seems to circumvent the whole point of banning these imports.
   
   If these imports are absolutely necessary for the diff tool, please override the restrictions only for that module.



##########
hadoop-hdds/rocksdb-checkpoint-differ/pom.xml:
##########
@@ -0,0 +1,152 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
+https://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.ozone</groupId>
+    <artifactId>hdds</artifactId>
+    <version>1.3.0-SNAPSHOT</version>
+  </parent>
+  <artifactId>RocksDBCheckpointDiffer</artifactId>

Review Comment:
   Nit: Shouldn't artifact ID be in `rocksdb-checkpoint-differ` style?



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/test/java/org/apache/ozone/rocksdiff/TestRocksDBCheckpointDiffer.java:
##########
@@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.ozone.rocksdiff;
+
+import static org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.DEBUG_DAG_LIVE_NODES;
+import static org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.DEBUG_READ_ALL_DB_KEYS;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Random;
+import java.util.stream.Collectors;
+
+import org.rocksdb.ColumnFamilyDescriptor;
+import org.rocksdb.ColumnFamilyHandle;
+import org.rocksdb.ColumnFamilyOptions;
+import org.rocksdb.CompressionType;
+import org.rocksdb.DBOptions;
+import org.rocksdb.LiveFileMetaData;
+import org.rocksdb.Options;
+import org.rocksdb.RocksDB;
+import org.rocksdb.RocksDBException;
+import org.rocksdb.RocksIterator;
+
+////CHECKSTYLE:OFF

Review Comment:
   > Find bugs warnings are in UnitTest Class and can be ignored for now.
   
   Please add a `@SuppressFBWarnings` annotation.



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RelationshipEdg.java:
##########
@@ -0,0 +1,30 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.rocksdiff;
+
+// import org.jgrapht.graph.DefaultEdge;
+// Enable this import and extend DefaultEdge if we need to
+// pictorially represent the DAG constructed.
+
+//class RelationshipEdge extends DefaultEdge {
+class RelationshipEdge {

Review Comment:
   Class name and file name mismatch.





[GitHub] [ozone] smengcl commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r974934218


##########
hadoop-hdds/pom.xml:
##########
@@ -131,6 +132,12 @@ https://maven.apache.org/xsd/maven-4.0.0.xsd">
         <version>${hdds.version}</version>
       </dependency>
 
+    <dependency>
+      <groupId>org.apache.ozone</groupId>
+      <artifactId>RocksDBCheckpointDiffer</artifactId>

Review Comment:
   Thanks. Committed.





[GitHub] [ozone] prashantpogde commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
prashantpogde commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r972431607


##########
hadoop-hdds/RocksDBCheckpointDiffer/README.md:
##########
@@ -0,0 +1,2 @@
+# RocksDiff

Review Comment:
   done



##########
pom.xml:
##########
@@ -1843,36 +1843,6 @@ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xs
               </goals>
               <configuration>
                 <rules>
-                  <RestrictImports>
-                    <includeTestCode>false</includeTestCode>
-                    <reason>Use managed RocksObjects under org.apache.hadoop.hdds.utils.db.managed instead.</reason>
-                    <!-- By default, ban all the classes in org.rocksdb -->
-                    <bannedImport>org.rocksdb.**</bannedImport>
-                    <allowedImports>
-                      <!-- Allow non-RocksObject classes. -->
-                      <allowedImport>org.rocksdb.BlockBasedTableConfig</allowedImport>
-                      <allowedImport>org.rocksdb.ColumnFamilyDescriptor</allowedImport>
-                      <allowedImport>org.rocksdb.CompactionStyle</allowedImport>
-                      <allowedImport>org.rocksdb.HistogramData</allowedImport>
-                      <allowedImport>org.rocksdb.HistogramType</allowedImport>
-                      <allowedImport>org.rocksdb.Holder</allowedImport>
-                      <allowedImport>org.rocksdb.InfoLogLevel</allowedImport>
-                      <allowedImport>org.rocksdb.OptionsUtil</allowedImport>
-                      <allowedImport>org.rocksdb.RocksDBException</allowedImport>
-                      <allowedImport>org.rocksdb.StatsLevel</allowedImport>
-                      <allowedImport>org.rocksdb.TransactionLogIterator.BatchResult</allowedImport>
-                      <allowedImport>org.rocksdb.TickerType</allowedImport>
-
-                      <!-- Allow RocksObjects whose native pointer is managed by RocksDB. -->
-                      <allowedImport>org.rocksdb.ColumnFamilyHandle</allowedImport>
-                      <allowedImport>org.rocksdb.Env</allowedImport>
-                      <allowedImport>org.rocksdb.Statistics</allowedImport>
-
-                      <!-- Allow RocksDB constants and static methods to be used. -->
-                      <allowedImport>org.rocksdb.RocksDB.*</allowedImport>
-                    </allowedImports>
-                    <exclusion>org.apache.hadoop.hdds.utils.db.managed.*</exclusion>
-                  </RestrictImports>

Review Comment:
   done





[GitHub] [ozone] prashantpogde commented on pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
prashantpogde commented on PR #3755:
URL: https://github.com/apache/ozone/pull/3755#issuecomment-1258371738

   Merging this PR as the base set of changes for RocksDiff. We will continue to make more changes on top of this.




[GitHub] [ozone] nandakumar131 commented on pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
nandakumar131 commented on PR #3755:
URL: https://github.com/apache/ozone/pull/3755#issuecomment-1255427567

   +1




[GitHub] [ozone] smengcl commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r972330561


##########
hadoop-hdds/RocksDBCheckpointDiffer/README.md:
##########
@@ -0,0 +1,2 @@
+# RocksDiff

Review Comment:
   TODO: Rename `RocksDBCheckpointDiffer` to `rocksdb-checkpoint-differ`





[GitHub] [ozone] smengcl commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r974484242


##########
hadoop-hdds/rocksdb-checkpoint-differ/pom.xml:
##########
@@ -0,0 +1,152 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
+https://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.ozone</groupId>
+    <artifactId>hdds</artifactId>
+    <version>1.3.0-SNAPSHOT</version>
+  </parent>
+  <artifactId>RocksDBCheckpointDiffer</artifactId>

Review Comment:
   fixed. thanks





[GitHub] [ozone] smengcl commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r972329299


##########
pom.xml:
##########
@@ -1843,36 +1843,6 @@ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xs
               </goals>
               <configuration>
                 <rules>
-                  <RestrictImports>
-                    <includeTestCode>false</includeTestCode>
-                    <reason>Use managed RocksObjects under org.apache.hadoop.hdds.utils.db.managed instead.</reason>
-                    <!-- By default, ban all the classes in org.rocksdb -->
-                    <bannedImport>org.rocksdb.**</bannedImport>
-                    <allowedImports>
-                      <!-- Allow non-RocksObject classes. -->
-                      <allowedImport>org.rocksdb.BlockBasedTableConfig</allowedImport>
-                      <allowedImport>org.rocksdb.ColumnFamilyDescriptor</allowedImport>
-                      <allowedImport>org.rocksdb.CompactionStyle</allowedImport>
-                      <allowedImport>org.rocksdb.HistogramData</allowedImport>
-                      <allowedImport>org.rocksdb.HistogramType</allowedImport>
-                      <allowedImport>org.rocksdb.Holder</allowedImport>
-                      <allowedImport>org.rocksdb.InfoLogLevel</allowedImport>
-                      <allowedImport>org.rocksdb.OptionsUtil</allowedImport>
-                      <allowedImport>org.rocksdb.RocksDBException</allowedImport>
-                      <allowedImport>org.rocksdb.StatsLevel</allowedImport>
-                      <allowedImport>org.rocksdb.TransactionLogIterator.BatchResult</allowedImport>
-                      <allowedImport>org.rocksdb.TickerType</allowedImport>
-
-                      <!-- Allow RocksObjects whose native pointer is managed by RocksDB. -->
-                      <allowedImport>org.rocksdb.ColumnFamilyHandle</allowedImport>
-                      <allowedImport>org.rocksdb.Env</allowedImport>
-                      <allowedImport>org.rocksdb.Statistics</allowedImport>
-
-                      <!-- Allow RocksDB constants and static methods to be used. -->
-                      <allowedImport>org.rocksdb.RocksDB.*</allowedImport>
-                    </allowedImports>
-                    <exclusion>org.apache.hadoop.hdds.utils.db.managed.*</exclusion>
-                  </RestrictImports>

Review Comment:
   We have to add the extra imports that are needed rather than removing this whole block. Simply copy the class names reported in the Maven compile error message into new `allowedImport` entries here.





[GitHub] [ozone] smengcl commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r974914987


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/test/java/org/apache/ozone/rocksdiff/TestRocksDBCheckpointDiffer.java:
##########
@@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.ozone.rocksdiff;
+
+import static org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.DEBUG_DAG_LIVE_NODES;
+import static org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.DEBUG_READ_ALL_DB_KEYS;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Random;
+import java.util.stream.Collectors;
+
+import org.rocksdb.ColumnFamilyDescriptor;
+import org.rocksdb.ColumnFamilyHandle;
+import org.rocksdb.ColumnFamilyOptions;
+import org.rocksdb.CompressionType;
+import org.rocksdb.DBOptions;
+import org.rocksdb.LiveFileMetaData;
+import org.rocksdb.Options;
+import org.rocksdb.RocksDB;
+import org.rocksdb.RocksDBException;
+import org.rocksdb.RocksIterator;
+
+////CHECKSTYLE:OFF

Review Comment:
   Hi Attila, I have addressed all of the findbugs without adding the suppression annotation.
   
   And the CI is green now. Please take another look. Thanks!
   
   We could merge this as soon as possible to unblock the diff logic development by @nandakumar131 and @sadanand48.





[GitHub] [ozone] nandakumar131 commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
nandakumar131 commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r977975801


##########
hadoop-hdds/rocksdb-checkpoint-differ/pom.xml:
##########
@@ -0,0 +1,193 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
+https://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.ozone</groupId>
+    <artifactId>hdds</artifactId>
+    <version>1.3.0-SNAPSHOT</version>
+  </parent>
+  <artifactId>rocksdb-checkpoint-differ</artifactId>
+  <version>1.3.0-SNAPSHOT</version>
+  <description>RocksDB Checkpoint Differ</description>
+  <name>RocksDB Checkpoint Differ</name>
+  <packaging>jar</packaging>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.rocksdb</groupId>
+      <artifactId>rocksdbjni</artifactId>
+      <version>7.4.5</version>

Review Comment:
   We don't have to define the version here. The version is already defined in the parent pom file.





[GitHub] [ozone] smengcl commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r978969879


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:
##########
@@ -0,0 +1,817 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.ozone.rocksdiff;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.stream.Collectors;
+
+
+import org.rocksdb.AbstractEventListener;
+import org.rocksdb.Checkpoint;
+import org.rocksdb.CompactionJobInfo;
+import org.rocksdb.CompressionType;
+import org.rocksdb.DBOptions;
+import org.rocksdb.FlushOptions;
+import org.rocksdb.LiveFileMetaData;
+import org.rocksdb.Options;
+import org.rocksdb.RocksDB;
+import org.rocksdb.RocksDBException;
+import org.rocksdb.SstFileReader;
+import org.rocksdb.TableProperties;
+
+import com.google.common.graph.GraphBuilder;
+import com.google.common.graph.MutableGraph;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
+
+// TODO
+//  1. Create a local instance of RocksDiff-local-RocksDB. This is the
+//  rocksDB that we can use for maintaining DAG and any other state. This is
+//  a per node state so it doesn't have to go through RATIS anyway.
+//  2. Store fwd DAG in Diff-Local-RocksDB in Compaction Listener
+//  3. Store fwd DAG in Diff-Local-RocksDB in Compaction Listener
+//  4. Store last-Snapshot-counter/Compaction-generation-counter in Diff-Local
+//  -RocksDB in Compaction Listener
+//  5. System Restart handling. Read the DAG from Disk and load it in memory.
+//  6. Take the base snapshot. All the SST file nodes in the base snapshot
+//  should be marked with that Snapshot generation. Subsequently, all SST file
+//  node should have a snapshot-generation count and Compaction-generation
+//  count.
+//  6. DAG based SST file pruning. Start from the oldest snapshot and we can
+//  unlink any SST
+//  file from the SaveCompactedFilePath directory that is reachable in the
+//  reverse DAG.
+//  7. DAG pruning : For each snapshotted bucket, We can recycle the part of
+//  the DAG that is older than the predefined policy for the efficient snapdiff.
+//  E.g. we may decide not to support efficient snapdiff from any snapshot that
+//  is older than 2 weeks.
+//  Note on 8. & 9 .
+//  ==================
+//  A simple handling is to just iterate over all keys in keyspace when the
+//  compaction DAG is lost, instead of optimizing every case. And start
+//  Compaction-DAG afresh from the latest snapshot.
+//  --
+//  8. Handle bootstrapping rocksDB for a new OM follower node
+//      - new node will receive Active object store as well as all existing
+//      rocksDB checkpoints.
+//      - This bootstrapping should also receive the compaction-DAG information
+//  9. Handle rebuilding the DAG for a lagging follower. There are two cases
+//      - receive RATIS transactions to replay. Nothing needs to be done in
+//      this case.
+//      - Getting the DB sync. This case needs to handle getting the
+//      compaction-DAG information as well.
+//
+//
+/**
+ *  RocksDBCheckpointDiffer class.
+ */
+//CHECKSTYLE:OFF
+@SuppressWarnings({"NM_METHOD_NAMING_CONVENTION"})
+public class RocksDBCheckpointDiffer {
+  private final String rocksDbPath;
+  private String cpPath;
+  private final String CFDBPATH;
+  private final String SAVE_COMPACTED_FILE_PATH;
+  private final int MAX_SNAPSHOTS;
+  private static final Logger LOG =
+      LoggerFactory.getLogger(RocksDBCheckpointDiffer.class);
+
+  // keeps track of all the snapshots created so far.
+  private int lastSnapshotCounter;
+  private String lastSnapshotPrefix;
+
+  // tracks number of compactions so far
+  private final long UNKNOWN_COMPACTION_GEN = 0;

Review Comment:
   done
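
For readers following the TODO list in the quoted header comment, the idea of tracking SST files across compactions in a DAG can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: `CompactionDagSketch`, `onCompaction` and `differingFiles` are hypothetical names, plain `java.util` collections stand in for the Guava `MutableGraph` the class imports, and the traversal is one plausible way to use such a DAG rather than the confirmed algorithm.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; not the RocksDBCheckpointDiffer from this PR. */
final class CompactionDagSketch {
  // Reverse edges: compaction output file -> the input files it replaced.
  private final Map<String, Set<String>> inputsOf = new HashMap<>();

  /** Records one compaction: each output file descends from all inputs. */
  void onCompaction(List<String> inputs, List<String> outputs) {
    for (String out : outputs) {
      inputsOf.computeIfAbsent(out, k -> new HashSet<>()).addAll(inputs);
    }
  }

  /**
   * Walks back from the files that only the newer checkpoint has. A file
   * that the older checkpoint also has is unchanged; a file with no recorded
   * compaction inputs is treated as new data and reported as different.
   */
  Set<String> differingFiles(Set<String> olderFiles, Set<String> newerFiles) {
    Set<String> different = new HashSet<>();
    Set<String> visited = new HashSet<>();
    Deque<String> pending = new ArrayDeque<>();
    for (String file : newerFiles) {
      if (!olderFiles.contains(file)) {
        pending.push(file);
      }
    }
    while (!pending.isEmpty()) {
      String file = pending.pop();
      if (!visited.add(file) || olderFiles.contains(file)) {
        continue; // already handled, or unchanged since the older checkpoint
      }
      Set<String> inputs = inputsOf.get(file);
      if (inputs == null) {
        different.add(file); // not a compaction product: new data
      } else {
        pending.addAll(inputs); // compaction product: inspect its inputs
      }
    }
    return different;
  }

  public static void main(String[] args) {
    CompactionDagSketch dag = new CompactionDagSketch();
    // Older checkpoint holds files 1 and 2; file 3 is flushed afterwards,
    // then all three are compacted into file 4.
    dag.onCompaction(
        Arrays.asList("000001.sst", "000002.sst", "000003.sst"),
        Arrays.asList("000004.sst"));
    Set<String> older =
        new HashSet<>(Arrays.asList("000001.sst", "000002.sst"));
    Set<String> newer = new HashSet<>(Arrays.asList("000004.sst"));
    System.out.println(dag.differingFiles(older, newer));
  }
}
```

In this sketch only the file flushed after the older checkpoint is reported, which is the reason for recording compaction lineage at all: the compacted file itself mostly contains data the older checkpoint already had.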





[GitHub] [ozone] prashantpogde merged pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
prashantpogde merged PR #3755:
URL: https://github.com/apache/ozone/pull/3755




[GitHub] [ozone] adoroszlai commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r973534185


##########
pom.xml:
##########
@@ -1843,35 +1843,47 @@ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xs
               </goals>
               <configuration>
                 <rules>
-                  <RestrictImports>
-                    <includeTestCode>false</includeTestCode>
-                    <reason>Use managed RocksObjects under org.apache.hadoop.hdds.utils.db.managed instead.</reason>
-                    <!-- By default, ban all the classes in org.rocksdb -->
-                    <bannedImport>org.rocksdb.**</bannedImport>
-                    <allowedImports>
-                      <!-- Allow non-RocksObject classes. -->
-                      <allowedImport>org.rocksdb.BlockBasedTableConfig</allowedImport>
-                      <allowedImport>org.rocksdb.ColumnFamilyDescriptor</allowedImport>
-                      <allowedImport>org.rocksdb.CompactionStyle</allowedImport>
-                      <allowedImport>org.rocksdb.HistogramData</allowedImport>
-                      <allowedImport>org.rocksdb.HistogramType</allowedImport>
-                      <allowedImport>org.rocksdb.Holder</allowedImport>
-                      <allowedImport>org.rocksdb.InfoLogLevel</allowedImport>
-                      <allowedImport>org.rocksdb.OptionsUtil</allowedImport>
-                      <allowedImport>org.rocksdb.RocksDBException</allowedImport>
-                      <allowedImport>org.rocksdb.StatsLevel</allowedImport>
-                      <allowedImport>org.rocksdb.TransactionLogIterator.BatchResult</allowedImport>
-                      <allowedImport>org.rocksdb.TickerType</allowedImport>
-
-                      <!-- Allow RocksObjects whose native pointer is managed by RocksDB. -->
-                      <allowedImport>org.rocksdb.ColumnFamilyHandle</allowedImport>
-                      <allowedImport>org.rocksdb.Env</allowedImport>
-                      <allowedImport>org.rocksdb.Statistics</allowedImport>
-
-                      <!-- Allow RocksDB constants and static methods to be used. -->
-                      <allowedImport>org.rocksdb.RocksDB.*</allowedImport>
-                    </allowedImports>
-                    <exclusion>org.apache.hadoop.hdds.utils.db.managed.*</exclusion>
+                  <RestrictImports>	
+                    <includeTestCode>false</includeTestCode>	
+                    <reason>Use managed RocksObjects under org.apache.hadoop.hdds.utils.db.managed instead.</reason>	
+                    <!-- By default, ban all the classes in org.rocksdb -->	
+                    <bannedImport>org.rocksdb.**</bannedImport>	
+                    <allowedImports>	
+                      <!-- Allow non-RocksObject classes. -->	
+                      <allowedImport>org.rocksdb.BlockBasedTableConfig</allowedImport>	
+                      <allowedImport>org.rocksdb.ColumnFamilyDescriptor</allowedImport>	
+                      <allowedImport>org.rocksdb.CompactionStyle</allowedImport>	
+                      <allowedImport>org.rocksdb.HistogramData</allowedImport>	
+                      <allowedImport>org.rocksdb.HistogramType</allowedImport>	
+                      <allowedImport>org.rocksdb.Holder</allowedImport>	
+                      <allowedImport>org.rocksdb.InfoLogLevel</allowedImport>	
+                      <allowedImport>org.rocksdb.OptionsUtil</allowedImport>	
+                      <allowedImport>org.rocksdb.RocksDBException</allowedImport>	
+                      <allowedImport>org.rocksdb.StatsLevel</allowedImport>	
+                      <allowedImport>org.rocksdb.TransactionLogIterator.BatchResult</allowedImport>	
+                      <allowedImport>org.rocksdb.TickerType</allowedImport>	
+
+                      <allowedImport>org.rocksdb.AbstractEventListener</allowedImport>	
+                      <allowedImport>org.rocksdb.Checkpoint</allowedImport>	
+                      <allowedImport>org.rocksdb.CompactionJobInfo</allowedImport>	
+                      <allowedImport>org.rocksdb.CompressionType</allowedImport>	
+                      <allowedImport>org.rocksdb.DBOptions</allowedImport>	
+                      <allowedImport>org.rocksdb.FlushOptions</allowedImport>	
+                      <allowedImport>org.rocksdb.LiveFileMetaData</allowedImport>	
+                      <allowedImport>org.rocksdb.Options</allowedImport>	
+                      <allowedImport>org.rocksdb.RocksDB</allowedImport>	

Review Comment:
   Allowing these (e.g. `RocksDB` and `Options`) seems to circumvent the whole point of banning these imports to force use of managed objects (HDDS-7087).  If you make these changes, someone might accidentally use non-managed objects in "production" code.  (BTW, isn't the new diff tool production code?)
   
   If these imports are absolutely necessary for the diff tool, please override the restrictions only for that module.
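
To illustrate the concern in the comment above: objects such as `RocksDB` and `Options` own native memory and must be closed explicitly, and a managed wrapper makes a missed `close()` detectable. The sketch below is a simplified assumption of how such a wrapper could work, not the actual classes under `org.apache.hadoop.hdds.utils.db.managed`; `RawHandle`, `ManagedHandle` and `openCount` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for an object owning native memory (hypothetical, not org.rocksdb). */
class RawHandle implements AutoCloseable {
  private boolean open = true;

  boolean isOpen() {
    return open;
  }

  @Override
  public void close() {
    open = false;
  }
}

/** Hypothetical managed wrapper: counts handles that are still open. */
final class ManagedHandle extends RawHandle {
  private static final AtomicInteger OPEN_COUNT = new AtomicInteger();

  ManagedHandle() {
    OPEN_COUNT.incrementAndGet();
  }

  /** Number of managed handles created but not yet closed. */
  static int openCount() {
    return OPEN_COUNT.get();
  }

  @Override
  public void close() {
    if (isOpen()) { // closing twice must not corrupt the counter
      super.close();
      OPEN_COUNT.decrementAndGet();
    }
  }
}

final class ManagedHandleDemo {
  public static void main(String[] args) {
    try (ManagedHandle handle = new ManagedHandle()) {
      System.out.println("open inside block: " + ManagedHandle.openCount());
    }
    System.out.println("open after block: " + ManagedHandle.openCount());
  }
}
```

Code that constructs a `RawHandle` directly bypasses the counter entirely, which is the kind of accidental use a project-wide import ban is meant to prevent.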





[GitHub] [ozone] nandakumar131 commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
nandakumar131 commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r977986187


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:
##########
@@ -0,0 +1,817 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.ozone.rocksdiff;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.stream.Collectors;
+
+
+import org.rocksdb.AbstractEventListener;
+import org.rocksdb.Checkpoint;
+import org.rocksdb.CompactionJobInfo;
+import org.rocksdb.CompressionType;
+import org.rocksdb.DBOptions;
+import org.rocksdb.FlushOptions;
+import org.rocksdb.LiveFileMetaData;
+import org.rocksdb.Options;
+import org.rocksdb.RocksDB;
+import org.rocksdb.RocksDBException;
+import org.rocksdb.SstFileReader;
+import org.rocksdb.TableProperties;
+
+import com.google.common.graph.GraphBuilder;
+import com.google.common.graph.MutableGraph;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
+
+// TODO
+//  1. Create a local instance of RocksDiff-local-RocksDB. This is the
+//  rocksDB that we can use for maintaining DAG and any other state. This is
+//  a per node state so it doesn't have to go through RATIS anyway.
+//  2. Store fwd DAG in Diff-Local-RocksDB in Compaction Listener
+//  3. Store fwd DAG in Diff-Local-RocksDB in Compaction Listener
+//  4. Store last-Snapshot-counter/Compaction-generation-counter in Diff-Local
+//  -RocksDB in Compaction Listener
+//  5. System Restart handling. Read the DAG from Disk and load it in memory.
+//  6. Take the base snapshot. All the SST file nodes in the base snapshot
+//  should be marked with that Snapshot generation. Subsequently, all SST file
+//  node should have a snapshot-generation count and Compaction-generation
+//  count.
+//  6. DAG based SST file pruning. Start from the oldest snapshot and we can
+//  unlink any SST
+//  file from the SaveCompactedFilePath directory that is reachable in the
+//  reverse DAG.
+//  7. DAG pruning : For each snapshotted bucket, We can recycle the part of
+//  the DAG that is older than the predefined policy for the efficient snapdiff.
+//  E.g. we may decide not to support efficient snapdiff from any snapshot that
+//  is older than 2 weeks.
+//  Note on 8. & 9 .
+//  ==================
+//  A simple handling is to just iterate over all keys in keyspace when the
+//  compaction DAG is lost, instead of optimizing every case. And start
+//  Compaction-DAG afresh from the latest snapshot.
+//  --
+//  8. Handle bootstrapping rocksDB for a new OM follower node
+//      - new node will receive Active object store as well as all existing
+//      rocksDB checkpoints.
+//      - This bootstrapping should also receive the compaction-DAG information
+//  9. Handle rebuilding the DAG for a lagging follower. There are two cases
+//      - receive RATIS transactions to replay. Nothing needs to be done in
+//      this case.
+//      - Getting the DB sync. This case needs to handle getting the
+//      compaction-DAG information as well.
+//
+//
+/**
+ *  RocksDBCheckpointDiffer class.
+ */
+//CHECKSTYLE:OFF
+@SuppressWarnings({"NM_METHOD_NAMING_CONVENTION"})
+public class RocksDBCheckpointDiffer {
+  private final String rocksDbPath;
+  private String cpPath;
+  private final String CFDBPATH;
+  private final String SAVE_COMPACTED_FILE_PATH;
+  private final int MAX_SNAPSHOTS;
+  private static final Logger LOG =
+      LoggerFactory.getLogger(RocksDBCheckpointDiffer.class);
+
+  // keeps track of all the snapshots created so far.
+  private int lastSnapshotCounter;
+  private String lastSnapshotPrefix;
+
+  // tracks number of compactions so far
+  private final long UNKNOWN_COMPACTION_GEN = 0;

Review Comment:
   This can be marked as _static_
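
   A minimal sketch of the suggestion, assuming the field stays a constant. `CompactionGenDemo` and its `isUnknown` helper are hypothetical names for illustration; only `UNKNOWN_COMPACTION_GEN` and its value come from the diff above.

```java
// Hypothetical stand-in for RocksDBCheckpointDiffer; only the constant's
// name and value are taken from the diff.
final class CompactionGenDemo {
  // One shared constant for the class rather than one field per instance.
  private static final long UNKNOWN_COMPACTION_GEN = 0;

  // Hypothetical helper showing the constant used without an instance.
  static boolean isUnknown(long compactionGen) {
    return compactionGen == UNKNOWN_COMPACTION_GEN;
  }

  public static void main(String[] args) {
    System.out.println(isUnknown(0L));
    System.out.println(isUnknown(7L));
  }
}
```

   With `static final`, the all-caps name also matches the usual Java convention for constants.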





[GitHub] [ozone] nandakumar131 commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
nandakumar131 commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r977977122


##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RDBStore.java:
##########
@@ -80,6 +82,13 @@ public RDBStore(File dbFile, ManagedDBOptions dbOptions,
     dbLocation = dbFile;
 
     try {
+      rocksDBCheckpointDiffer =
+          new RocksDBCheckpointDiffer(dbLocation.getAbsolutePath(), 1000,

Review Comment:
   Make the _number of snapshots_ a configuration value.
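
   One way this could look, sketched with `java.util.Properties` rather than Ozone's own configuration classes. The key name `example.rocksdiff.max.snapshots` and the class `SnapshotLimitConfig` are hypothetical; the default of 1000 mirrors the literal in the diff above.

```java
import java.util.Properties;

final class SnapshotLimitConfig {
  // Hypothetical key; not an actual Ozone configuration name.
  static final String MAX_SNAPSHOTS_KEY = "example.rocksdiff.max.snapshots";
  // Default mirrors the literal 1000 passed to the constructor in the diff.
  static final int MAX_SNAPSHOTS_DEFAULT = 1000;

  private SnapshotLimitConfig() {
  }

  // Returns the configured limit, or the default when the key is absent.
  static int maxSnapshots(Properties conf) {
    String raw = conf.getProperty(MAX_SNAPSHOTS_KEY);
    if (raw == null) {
      return MAX_SNAPSHOTS_DEFAULT;
    }
    // NumberFormatException (an IllegalArgumentException) on bad input.
    int value = Integer.parseInt(raw.trim());
    if (value <= 0) {
      throw new IllegalArgumentException(
          MAX_SNAPSHOTS_KEY + " must be positive, got " + value);
    }
    return value;
  }
}
```

   The call site would then pass `SnapshotLimitConfig.maxSnapshots(conf)` in place of the hard-coded `1000`.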





[GitHub] [ozone] smengcl commented on pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on PR #3755:
URL: https://github.com/apache/ozone/pull/3755#issuecomment-1256708452

   Thanks @nandakumar131 for the review. I have addressed all the comments and got a green CI.
   
   Pls give a +1 via the "Review changes" button if you feel like the latest changes are good.




[GitHub] [ozone] smengcl commented on a diff in pull request #3755: HDDS-7224. Create a new RocksDBCheckpoint Diff utility.

Posted by GitBox <gi...@apache.org>.
smengcl commented on code in PR #3755:
URL: https://github.com/apache/ozone/pull/3755#discussion_r978969691


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:
##########
@@ -0,0 +1,817 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.ozone.rocksdiff;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.stream.Collectors;
+
+
+import org.rocksdb.AbstractEventListener;
+import org.rocksdb.Checkpoint;
+import org.rocksdb.CompactionJobInfo;
+import org.rocksdb.CompressionType;
+import org.rocksdb.DBOptions;
+import org.rocksdb.FlushOptions;
+import org.rocksdb.LiveFileMetaData;
+import org.rocksdb.Options;
+import org.rocksdb.RocksDB;
+import org.rocksdb.RocksDBException;
+import org.rocksdb.SstFileReader;
+import org.rocksdb.TableProperties;
+
+import com.google.common.graph.GraphBuilder;
+import com.google.common.graph.MutableGraph;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
+
+// TODO
+//  1. Create a local instance of RocksDiff-local-RocksDB. This is the
+//  rocksDB that we can use for maintaining DAG and any other state. This is
+//  a per node state so it doesn't have to go through RATIS anyway.
+//  2. Store fwd DAG in Diff-Local-RocksDB in Compaction Listener
+//  3. Store fwd DAG in Diff-Local-RocksDB in Compaction Listener
+//  4. Store last-Snapshot-counter/Compaction-generation-counter in Diff-Local
+//  -RocksDB in Compaction Listener
+//  5. System Restart handling. Read the DAG from Disk and load it in memory.
+//  6. Take the base snapshot. All the SST file nodes in the base snapshot
+//  should be marked with that Snapshot generation. Subsequently, all SST file
+//  node should have a snapshot-generation count and Compaction-generation
+//  count.
+//  6. DAG based SST file pruning. Start from the oldest snapshot and we can
+//  unlink any SST
+//  file from the SaveCompactedFilePath directory that is reachable in the
+//  reverse DAG.
+//  7. DAG pruning : For each snapshotted bucket, We can recycle the part of
+//  the DAG that is older than the predefined policy for the efficient snapdiff.
+//  E.g. we may decide not to support efficient snapdiff from any snapshot that
+//  is older than 2 weeks.
+//  Note on 8. & 9 .
+//  ==================
+//  A simple handling is to just iterate over all keys in keyspace when the
+//  compaction DAG is lost, instead of optimizing every case. And start
+//  Compaction-DAG afresh from the latest snapshot.
+//  --
+//  8. Handle bootstrapping rocksDB for a new OM follower node
+//      - new node will receive Active object store as well as all existing
+//      rocksDB checkpoints.
+//      - This bootstrapping should also receive the compaction-DAG information
+//  9. Handle rebuilding the DAG for a lagging follower. There are two cases
+//      - receive RATIS transactions to replay. Nothing needs to be done in
+//      this case.
+//      - Getting the DB sync. This case needs to handle getting the
+//      compaction-DAG information as well.
+//
+//
+/**
+ *  RocksDBCheckpointDiffer class.
+ */
+//CHECKSTYLE:OFF
+@SuppressWarnings({"NM_METHOD_NAMING_CONVENTION"})
+public class RocksDBCheckpointDiffer {
+  private final String rocksDbPath;
+  private String cpPath;
+  private final String CFDBPATH;
+  private final String SAVE_COMPACTED_FILE_PATH;
+  private final int MAX_SNAPSHOTS;

Review Comment:
   done


