You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by "sadanand48 (via GitHub)" <gi...@apache.org> on 2023/02/14 15:25:42 UTC

[GitHub] [ozone] sadanand48 opened a new pull request, #4273: [Snapshot] HDDS-7905. Snapdiff should only return modifications done to the bucket.

sadanand48 opened a new pull request, #4273:
URL: https://github.com/apache/ozone/pull/4273

   ## What changes were proposed in this pull request?
   When snapshot DAG fails to provide output , full snapdiff is run i.e all the SST files in the source and target (excluding the same SST's) are checked to calculate snapdiff.  The filtering logic to filter out ssts not pertaining to the snapshot bucket is not present during full snapdiff. This change is to add that. Also added a configuration to use full snapdiff path if needed, default is false.
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-7905
   
   ## How was this patch tested?
   Unit tests
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122639082


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##########
@@ -292,7 +312,8 @@ private void addToObjectIdMap(Table<String, ? extends WithObjectID> fsTable,
         try {
           final WithObjectID oldKey = fsTable.get(key);
           final WithObjectID newKey = tsTable.get(key);
-          if (areKeysEqual(oldKey, newKey)) {
+          if (areKeysEqual(oldKey, newKey) || !isKeyInBucket(key, tablePrefixes,

Review Comment:
   @sadanand48 Got it. That is alright. let's include the extra changes in the PR description for record-keeping.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1115699402


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -50,5 +62,48 @@ public static String constructBucketKey(String keyName) {
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName);
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());
+        iterator.seekToFirst();
+        String firstKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));

Review Comment:
   Yes that's a good idea, updated with this change.



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##########
@@ -387,4 +412,27 @@ private boolean areKeysEqual(WithObjectID oldKey, WithObjectID newKey) {
     }
     return false;
   }
+
+  /**
+    * check if the given key is in the bucket specified by tablePrefix map.
+   */
+  private boolean isKeyInBucket(String key, Map<String, String> tablePrefixes,
+      String tableName) {
+    String volumeBucketDbPrefix;
+    // In case of FSO - either File/Directory table
+    // the key Prefix would be volumeId/bucketId and
+    // in case of non-fso - volumeName/bucketName
+    if (tableName.equals(
+        OmMetadataManagerImpl.DIRECTORY_TABLE) || tableName.equals(
+        OmMetadataManagerImpl.FILE_TABLE)) {
+      volumeBucketDbPrefix =
+          tablePrefixes.get(OmMetadataManagerImpl.DIRECTORY_TABLE);
+    } else {
+      volumeBucketDbPrefix = tablePrefixes.get(OmMetadataManagerImpl.KEY_TABLE);
+    }
+    if (key.startsWith(volumeBucketDbPrefix)) {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] swamirishi commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "swamirishi (via GitHub)" <gi...@apache.org>.
swamirishi commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1119395559


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +52,53 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());

Review Comment:
   @prashantpogde @smengcl Can you guys also take a look if there are some other unclosed native objects in the PR just to ensure I haven't missed it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122634261


##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOmSnapshot.java:
##########
@@ -107,30 +109,34 @@ public class TestOmSnapshot {
   @Parameterized.Parameters
   public static Collection<Object[]> data() {
     return Arrays.asList(
-                         new Object[]{OBJECT_STORE, false},
-                         new Object[]{FILE_SYSTEM_OPTIMIZED, false},
-                         new Object[]{BucketLayout.LEGACY, true});
+                         new Object[]{OBJECT_STORE, false, false},
+                         new Object[]{FILE_SYSTEM_OPTIMIZED, false, false},
+                         new Object[]{BucketLayout.LEGACY, true, true});
   }
 
   public TestOmSnapshot(BucketLayout newBucketLayout,
-      boolean newEnableFileSystemPaths) throws Exception {
+      boolean newEnableFileSystemPaths, boolean shouldUseFullSnapshotDiff)

Review Comment:
   Done.



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##########
@@ -72,10 +75,15 @@ public class SnapshotDiffManager {
   private final ManagedRocksDB db;
   private final CodecRegistry codecRegistry;
 
+  private OzoneConfiguration configuration;
+
+
   public SnapshotDiffManager(ManagedRocksDB db,
-                             RocksDBCheckpointDiffer differ) {
+                             RocksDBCheckpointDiffer differ,
+      OzoneConfiguration conf) {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122633722


##########
hadoop-hdds/common/src/main/resources/ozone-default.xml:
##########
@@ -3608,4 +3608,14 @@
       production environments.
     </description>
   </property>
+
+  <property>
+    <name>ozone.om.snapshot.use.full.diff</name>
+    <value>false</value>
+    <tag>OZONE, OM</tag>
+    <description>
+      If true, snapshot diff will not follow the optimised DAG based pruning approach
+      for computation.

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122639082


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##########
@@ -292,7 +312,8 @@ private void addToObjectIdMap(Table<String, ? extends WithObjectID> fsTable,
         try {
           final WithObjectID oldKey = fsTable.get(key);
           final WithObjectID newKey = tsTable.get(key);
-          if (areKeysEqual(oldKey, newKey)) {
+          if (areKeysEqual(oldKey, newKey) || !isKeyInBucket(key, tablePrefixes,

Review Comment:
   @sadanand48 Got it. That is alright. But let's include the extra changes in the PR description for record-keeping.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122391279


##########
hadoop-hdds/common/pom.xml:
##########
@@ -242,6 +242,10 @@ https://maven.apache.org/xsd/maven-4.0.0.xsd">
       <artifactId>spotbugs-annotations</artifactId>
       <scope>test</scope>
     </dependency>
+      <dependency>
+          <groupId>org.rocksdb</groupId>
+          <artifactId>rocksdbjni</artifactId>
+      </dependency>

Review Comment:
   Should remove the corresponding part in `framework/pom.xml` ?
   
   https://github.com/apache/ozone/blob/aea5edfe99ea92687c604378bec4bfbf1773ba80/hadoop-hdds/framework/pom.xml#L112-L115



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122421812


##########
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConfigKeys.java:
##########
@@ -603,6 +603,11 @@ public final class OzoneConfigKeys {
       OZONE_OM_SNAPSHOT_PRUNE_COMPACTION_DAG_DAEMON_RUN_INTERVAL_DEFAULT =
       TimeUnit.HOURS.toMillis(1);
 
+  public static final String OZONE_OM_SNAPSHOT_USE_FULL_DIFF =
+      "ozone.om.snapshot.use.full.diff";
+
+  public static final boolean OZONE_OM_SNAPSHOT_USE_FULL_DIFF_DEFAULT = false;

Review Comment:
   Even when this is set to `false`, we can have a client-side option to force a full diff from the CLI (e.g. for debugging and diagnostics purposes). We can do this in another jira.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122421812


##########
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConfigKeys.java:
##########
@@ -603,6 +603,11 @@ public final class OzoneConfigKeys {
       OZONE_OM_SNAPSHOT_PRUNE_COMPACTION_DAG_DAEMON_RUN_INTERVAL_DEFAULT =
       TimeUnit.HOURS.toMillis(1);
 
+  public static final String OZONE_OM_SNAPSHOT_USE_FULL_DIFF =
+      "ozone.om.snapshot.use.full.diff";
+
+  public static final boolean OZONE_OM_SNAPSHOT_USE_FULL_DIFF_DEFAULT = false;

Review Comment:
   Even when this is set to `false`, we can have a client-side option to force a full diff from the CLI. We can do this in another jira.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122638387


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##########
@@ -292,7 +312,8 @@ private void addToObjectIdMap(Table<String, ? extends WithObjectID> fsTable,
         try {
           final WithObjectID oldKey = fsTable.get(key);
           final WithObjectID newKey = tsTable.get(key);
-          if (areKeysEqual(oldKey, newKey)) {
+          if (areKeysEqual(oldKey, newKey) || !isKeyInBucket(key, tablePrefixes,

Review Comment:
   Thanks @smengcl for the comment, I did not originally intend to make the change of moving these classes but it arose as a requirement while addressing one of the review comments, although will take a note to update the PR description every time such a change is made to simplify the review process and notify the reviewer.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1119571308


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +52,54 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        try (SstFileReaderIterator iterator = sstFileReader.newIterator(
+            new ReadOptions())) {
+          iterator.seek(prefix.getBytes(UTF_8));
+          String seekResultKey = new String(iterator.key(), UTF_8);
+          return seekResultKey.startsWith(prefix);
+        }
+      }
+      return false;
+    } catch (RocksDBException e) {
+      LOG.error("Failed to read SST File ", e);
+      throw e;

Review Comment:
   HddsServerUtil would need dependency on hdds-server module which will cause circular dependency.
   The exception thrown here is [caught here and rethrown](https://github.com/apache/ozone/blob/e90e2dd8ea1770bafea759015c955fc6e2281b9f/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java#L270) as IOException.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122442298


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##########
@@ -72,10 +75,15 @@ public class SnapshotDiffManager {
   private final ManagedRocksDB db;
   private final CodecRegistry codecRegistry;
 
+  private OzoneConfiguration configuration;
+
+
   public SnapshotDiffManager(ManagedRocksDB db,
-                             RocksDBCheckpointDiffer differ) {
+                             RocksDBCheckpointDiffer differ,
+      OzoneConfiguration conf) {

Review Comment:
   indentation



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] swamirishi commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "swamirishi (via GitHub)" <gi...@apache.org>.
swamirishi commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1109027241


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -50,5 +62,48 @@ public static String constructBucketKey(String keyName) {
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName);
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());
+        iterator.seekToFirst();
+        String firstKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));

Review Comment:
   Wouldn't it be better if we try to seek to the particular key? Instead of just checking the range of the sst file.



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -50,5 +62,48 @@ public static String constructBucketKey(String keyName) {
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName);
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());
+        iterator.seekToFirst();
+        String firstKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));
+        iterator.seekToLast();
+        String lastKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));
+        boolean keyWithPrefixPresent =
+            RocksDiffUtils.isKeyWithPrefixPresent(prefix, firstKey, lastKey);
+        if (!keyWithPrefixPresent) {
+          return false;
+        }
+      } else {
+        // entry from other tables
+        return false;
+      }
+    } catch (RocksDBException e) {
+      e.printStackTrace();

Review Comment:
   Can we log this instead?



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -50,5 +62,48 @@ public static String constructBucketKey(String keyName) {
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {

Review Comment:
   I guess this check is not required since we are already getting only those sst files corresponding to the column families we are looking up. But yeah it is a good to have check.



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -50,5 +62,48 @@ public static String constructBucketKey(String keyName) {
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName);
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());
+        iterator.seekToFirst();
+        String firstKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));
+        iterator.seekToLast();
+        String lastKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));
+        boolean keyWithPrefixPresent =
+            RocksDiffUtils.isKeyWithPrefixPresent(prefix, firstKey, lastKey);
+        if (!keyWithPrefixPresent) {

Review Comment:
   why can't we just return RocksDiffUtils.isKeyWithPrefixPresent(prefix, firstKey, lastKey);



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##########
@@ -387,4 +412,27 @@ private boolean areKeysEqual(WithObjectID oldKey, WithObjectID newKey) {
     }
     return false;
   }
+
+  /**
+    * check if the given key is in the bucket specified by tablePrefix map.
+   */
+  private boolean isKeyInBucket(String key, Map<String, String> tablePrefixes,
+      String tableName) {
+    String volumeBucketDbPrefix;
+    // In case of FSO - either File/Directory table
+    // the key Prefix would be volumeId/bucketId and
+    // in case of non-fso - volumeName/bucketName
+    if (tableName.equals(
+        OmMetadataManagerImpl.DIRECTORY_TABLE) || tableName.equals(
+        OmMetadataManagerImpl.FILE_TABLE)) {
+      volumeBucketDbPrefix =
+          tablePrefixes.get(OmMetadataManagerImpl.DIRECTORY_TABLE);
+    } else {
+      volumeBucketDbPrefix = tablePrefixes.get(OmMetadataManagerImpl.KEY_TABLE);
+    }
+    if (key.startsWith(volumeBucketDbPrefix)) {

Review Comment:
   return key.startsWith(volumeBucketDbPrefix);



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -50,5 +62,48 @@ public static String constructBucketKey(String keyName) {
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName);
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());
+        iterator.seekToFirst();
+        String firstKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));

Review Comment:
   As in it could be the case that SST file doesn't have entry for the bucket itself. Correct me if I am wrong on this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] swamirishi commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "swamirishi (via GitHub)" <gi...@apache.org>.
swamirishi commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1119398408


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +52,54 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        try (SstFileReaderIterator iterator = sstFileReader.newIterator(
+            new ReadOptions())) {
+          iterator.seek(prefix.getBytes(UTF_8));
+          String seekResultKey = new String(iterator.key(), UTF_8);
+          return seekResultKey.startsWith(prefix);
+        }
+      }
+      return false;
+    } catch (RocksDBException e) {
+      LOG.error("Failed to read SST File ", e);
+      throw e;

Review Comment:
   nit: Do we need to Log if we are throwing an exception. BTW it would be better if we wrap this exception with one of our own exceptions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122634006


##########
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConfigKeys.java:
##########
@@ -603,6 +603,11 @@ public final class OzoneConfigKeys {
       OZONE_OM_SNAPSHOT_PRUNE_COMPACTION_DAG_DAEMON_RUN_INTERVAL_DEFAULT =
       TimeUnit.HOURS.toMillis(1);
 
+  public static final String OZONE_OM_SNAPSHOT_USE_FULL_DIFF =
+      "ozone.om.snapshot.use.full.diff";
+
+  public static final boolean OZONE_OM_SNAPSHOT_USE_FULL_DIFF_DEFAULT = false;

Review Comment:
   Yeah its a good idea. Thanks for filing this one.



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +53,56 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (ManagedSstFileReader sstFileReader = ManagedSstFileReader.managed(
+        new SstFileReader(new Options()))) {
+      sstFileReader.get().open(filepath);
+      TableProperties properties = sstFileReader.get().getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        try (ManagedSstFileReaderIterator iterator =
+            ManagedSstFileReaderIterator.managed(sstFileReader.get()
+                .newIterator(new ReadOptions()))) {
+          iterator.get().seek(prefix.getBytes(UTF_8));
+          String seekResultKey = new String(iterator.get().key(), UTF_8);
+          return seekResultKey.startsWith(prefix);
+        }
+      }
+      return false;
+    } catch (RocksDBException e) {
+      LOG.error("Failed to read SST File ", e);
+      throw e;

Review Comment:
   Done.



##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/freon/TestOMSnapshotDAG.java:
##########
@@ -42,6 +42,7 @@
 import org.junit.jupiter.api.Assertions;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
+import org.rocksdb.RocksDBException;

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122633337


##########
hadoop-hdds/common/pom.xml:
##########
@@ -242,6 +242,10 @@ https://maven.apache.org/xsd/maven-4.0.0.xsd">
       <artifactId>spotbugs-annotations</artifactId>
       <scope>test</scope>
     </dependency>
+      <dependency>
+          <groupId>org.rocksdb</groupId>
+          <artifactId>rocksdbjni</artifactId>
+      </dependency>

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122433546


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +53,56 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (ManagedSstFileReader sstFileReader = ManagedSstFileReader.managed(
+        new SstFileReader(new Options()))) {
+      sstFileReader.get().open(filepath);
+      TableProperties properties = sstFileReader.get().getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        try (ManagedSstFileReaderIterator iterator =
+            ManagedSstFileReaderIterator.managed(sstFileReader.get()
+                .newIterator(new ReadOptions()))) {
+          iterator.get().seek(prefix.getBytes(UTF_8));
+          String seekResultKey = new String(iterator.get().key(), UTF_8);
+          return seekResultKey.startsWith(prefix);
+        }
+      }
+      return false;
+    } catch (RocksDBException e) {
+      LOG.error("Failed to read SST File ", e);
+      throw e;

Review Comment:
   We might want to wrap this inside an `IOException` or some custom `RocksDifferException` before throwing it back to the client.
   
   Ideally we don't want `freon/TestOMSnapshotDAG` to import `RocksDBException` directly.
   
   ```suggestion
         throw new IOException(e);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122418384


##########
hadoop-hdds/common/src/main/resources/ozone-default.xml:
##########
@@ -3608,4 +3608,14 @@
       production environments.
     </description>
   </property>
+
+  <property>
+    <name>ozone.om.snapshot.use.full.diff</name>

Review Comment:
   Call it **force** full diff might be more appropriate
   
   ```suggestion
       <name>ozone.om.snapshot.force.full.diff</name>
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122452338


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##########
@@ -292,7 +312,8 @@ private void addToObjectIdMap(Table<String, ? extends WithObjectID> fsTable,
         try {
           final WithObjectID oldKey = fsTable.get(key);
           final WithObjectID newKey = tsTable.get(key);
-          if (areKeysEqual(oldKey, newKey)) {
+          if (areKeysEqual(oldKey, newKey) || !isKeyInBucket(key, tablePrefixes,

Review Comment:
   This is the core of the fix mentioned in the PR title, correct?
   
   Next time, when possible, for a **bug** fix we can try to make the patch itself simple enough for review. Major refactoring and config additions can be done in separate PRs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122434846


##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/freon/TestOMSnapshotDAG.java:
##########
@@ -42,6 +42,7 @@
 import org.junit.jupiter.api.Assertions;
 import org.junit.jupiter.api.BeforeAll;
 import org.junit.jupiter.api.Test;
+import org.rocksdb.RocksDBException;

Review Comment:
   Try to avoid this import



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] prashantpogde commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "prashantpogde (via GitHub)" <gi...@apache.org>.
prashantpogde commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1113498913


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -50,5 +62,48 @@ public static String constructBucketKey(String keyName) {
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName);
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());
+        iterator.seekToFirst();
+        String firstKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));
+        iterator.seekToLast();
+        String lastKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));
+        boolean keyWithPrefixPresent =
+            RocksDiffUtils.isKeyWithPrefixPresent(prefix, firstKey, lastKey);
+        if (!keyWithPrefixPresent) {
+          return false;
+        }
+      } else {
+        // entry from other tables
+        return false;
+      }
+    } catch (RocksDBException e) {
+      e.printStackTrace();

Review Comment:
   If we silently mask the exception, we can end up returning wrong snapdiff results. Shouldn't we throw it all the way back to the caller ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] swamirishi commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "swamirishi (via GitHub)" <gi...@apache.org>.
swamirishi commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1119162951


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +52,53 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());

Review Comment:
   I guess the iterator here needs to be closed. this might lead to memory leak on native side.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122436045


##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOmSnapshot.java:
##########
@@ -107,30 +109,34 @@ public class TestOmSnapshot {
   @Parameterized.Parameters
   public static Collection<Object[]> data() {
     return Arrays.asList(
-                         new Object[]{OBJECT_STORE, false},
-                         new Object[]{FILE_SYSTEM_OPTIMIZED, false},
-                         new Object[]{BucketLayout.LEGACY, true});
+                         new Object[]{OBJECT_STORE, false, false},
+                         new Object[]{FILE_SYSTEM_OPTIMIZED, false, false},
+                         new Object[]{BucketLayout.LEGACY, true, true});
   }
 
   public TestOmSnapshot(BucketLayout newBucketLayout,
-      boolean newEnableFileSystemPaths) throws Exception {
+      boolean newEnableFileSystemPaths, boolean shouldUseFullSnapshotDiff)

Review Comment:
   ```suggestion
         boolean newEnableFileSystemPaths, boolean forceFullSnapDiff)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122420206


##########
hadoop-hdds/common/src/main/resources/ozone-default.xml:
##########
@@ -3608,4 +3608,14 @@
       production environments.
     </description>
   </property>
+
+  <property>
+    <name>ozone.om.snapshot.use.full.diff</name>
+    <value>false</value>
+    <tag>OZONE, OM</tag>
+    <description>
+      If true, snapshot diff will not follow the optimised DAG based pruning approach
+      for computation.

Review Comment:
   ```suggestion
         If true, snapshot diff will always perform full diff (can be slow)
         without using the optimised DAG based pruning approach.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122422386


##########
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConfigKeys.java:
##########
@@ -603,6 +603,11 @@ public final class OzoneConfigKeys {
       OZONE_OM_SNAPSHOT_PRUNE_COMPACTION_DAG_DAEMON_RUN_INTERVAL_DEFAULT =
       TimeUnit.HOURS.toMillis(1);
 
+  public static final String OZONE_OM_SNAPSHOT_USE_FULL_DIFF =
+      "ozone.om.snapshot.use.full.diff";
+
+  public static final boolean OZONE_OM_SNAPSHOT_USE_FULL_DIFF_DEFAULT = false;

Review Comment:
   Filed HDDS-8061



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1115700017


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -50,5 +62,48 @@ public static String constructBucketKey(String keyName) {
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName);
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());
+        iterator.seekToFirst();
+        String firstKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));
+        iterator.seekToLast();
+        String lastKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));
+        boolean keyWithPrefixPresent =
+            RocksDiffUtils.isKeyWithPrefixPresent(prefix, firstKey, lastKey);
+        if (!keyWithPrefixPresent) {

Review Comment:
   Done.



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -50,5 +62,48 @@ public static String constructBucketKey(String keyName) {
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName);
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());
+        iterator.seekToFirst();
+        String firstKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));
+        iterator.seekToLast();
+        String lastKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));
+        boolean keyWithPrefixPresent =
+            RocksDiffUtils.isKeyWithPrefixPresent(prefix, firstKey, lastKey);
+        if (!keyWithPrefixPresent) {
+          return false;
+        }
+      } else {
+        // entry from other tables
+        return false;
+      }
+    } catch (RocksDBException e) {
+      e.printStackTrace();

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122419174


##########
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConfigKeys.java:
##########
@@ -603,6 +603,11 @@ public final class OzoneConfigKeys {
       OZONE_OM_SNAPSHOT_PRUNE_COMPACTION_DAG_DAEMON_RUN_INTERVAL_DEFAULT =
       TimeUnit.HOURS.toMillis(1);
 
+  public static final String OZONE_OM_SNAPSHOT_USE_FULL_DIFF =
+      "ozone.om.snapshot.use.full.diff";
+
+  public static final boolean OZONE_OM_SNAPSHOT_USE_FULL_DIFF_DEFAULT = false;

Review Comment:
   here as well



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122433546


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +53,56 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (ManagedSstFileReader sstFileReader = ManagedSstFileReader.managed(
+        new SstFileReader(new Options()))) {
+      sstFileReader.get().open(filepath);
+      TableProperties properties = sstFileReader.get().getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        try (ManagedSstFileReaderIterator iterator =
+            ManagedSstFileReaderIterator.managed(sstFileReader.get()
+                .newIterator(new ReadOptions()))) {
+          iterator.get().seek(prefix.getBytes(UTF_8));
+          String seekResultKey = new String(iterator.get().key(), UTF_8);
+          return seekResultKey.startsWith(prefix);
+        }
+      }
+      return false;
+    } catch (RocksDBException e) {
+      LOG.error("Failed to read SST File ", e);
+      throw e;

Review Comment:
   We might want to wrap this inside an `IOException` or some custom `RocksDifferException` before throwing it back to the caller.
   
   Ideally we don't want `freon/TestOMSnapshotDAG` to import `RocksDBException` directly.
   
   We don't need `HddsServerUtil` to achieve this.
   
   ```suggestion
         throw new IOException(e);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122433546


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +53,56 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (ManagedSstFileReader sstFileReader = ManagedSstFileReader.managed(
+        new SstFileReader(new Options()))) {
+      sstFileReader.get().open(filepath);
+      TableProperties properties = sstFileReader.get().getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        try (ManagedSstFileReaderIterator iterator =
+            ManagedSstFileReaderIterator.managed(sstFileReader.get()
+                .newIterator(new ReadOptions()))) {
+          iterator.get().seek(prefix.getBytes(UTF_8));
+          String seekResultKey = new String(iterator.get().key(), UTF_8);
+          return seekResultKey.startsWith(prefix);
+        }
+      }
+      return false;
+    } catch (RocksDBException e) {
+      LOG.error("Failed to read SST File ", e);
+      throw e;

Review Comment:
   We might want to wrap this inside an `IOException` or some custom `RocksDifferException` before throwing it back to the caller.
   
   Ideally we don't want `freon/TestOMSnapshotDAG` to import `RocksDBException` directly.
   
   ```suggestion
         throw new IOException(e);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122432701


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +53,56 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (ManagedSstFileReader sstFileReader = ManagedSstFileReader.managed(
+        new SstFileReader(new Options()))) {
+      sstFileReader.get().open(filepath);
+      TableProperties properties = sstFileReader.get().getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        try (ManagedSstFileReaderIterator iterator =
+            ManagedSstFileReaderIterator.managed(sstFileReader.get()
+                .newIterator(new ReadOptions()))) {
+          iterator.get().seek(prefix.getBytes(UTF_8));
+          String seekResultKey = new String(iterator.get().key(), UTF_8);
+          return seekResultKey.startsWith(prefix);
+        }
+      }
+      return false;
+    } catch (RocksDBException e) {
+      LOG.error("Failed to read SST File ", e);
+      throw e;

Review Comment:
   We might want to wrap this inside an IOException or some custom RocksDifferException before throwing it back to the client.
   
   Ideally we don't want `freon/TestOMSnapshotDAG` to import RocksDBException directly. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122456652


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##########
@@ -292,7 +312,8 @@ private void addToObjectIdMap(Table<String, ? extends WithObjectID> fsTable,
         try {
           final WithObjectID oldKey = fsTable.get(key);
           final WithObjectID newKey = tsTable.get(key);
-          if (areKeysEqual(oldKey, newKey)) {
+          if (areKeysEqual(oldKey, newKey) || !isKeyInBucket(key, tablePrefixes,

Review Comment:
   For this PR, you should include the list of other changes in the first PR comment as well. Such as moving those Managed classes from framework to common, and the new force full diff config addition.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] swamirishi commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "swamirishi (via GitHub)" <gi...@apache.org>.
swamirishi commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1119400224


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +52,54 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        try (SstFileReaderIterator iterator = sstFileReader.newIterator(
+            new ReadOptions())) {
+          iterator.seek(prefix.getBytes(UTF_8));
+          String seekResultKey = new String(iterator.key(), UTF_8);
+          return seekResultKey.startsWith(prefix);
+        }
+      }
+      return false;
+    } catch (RocksDBException e) {
+      LOG.error("Failed to read SST File ", e);
+      throw e;

Review Comment:
   throw HddsServerUtil.toIOException("Failed to read SST File ", e);



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl merged pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl merged PR #4273:
URL: https://github.com/apache/ozone/pull/4273


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1115700416


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -50,5 +62,48 @@ public static String constructBucketKey(String keyName) {
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName);
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());
+        iterator.seekToFirst();
+        String firstKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));
+        iterator.seekToLast();
+        String lastKey = RocksDiffUtils.constructBucketKey(
+            new String(iterator.key(), UTF_8));
+        boolean keyWithPrefixPresent =
+            RocksDiffUtils.isKeyWithPrefixPresent(prefix, firstKey, lastKey);
+        if (!keyWithPrefixPresent) {
+          return false;
+        }
+      } else {
+        // entry from other tables
+        return false;
+      }
+    } catch (RocksDBException e) {
+      e.printStackTrace();

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on PR #4273:
URL: https://github.com/apache/ozone/pull/4273#issuecomment-1439049724

   @sadanand48 Would you resolve the merge conflicts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on PR #4273:
URL: https://github.com/apache/ozone/pull/4273#issuecomment-1453889436

   > Another change done here is to move the Managed RocksDB Classes from hadoop-hdds/framework to hadoop-hdds/common as while trying to use the Managed Classes in the Differ module, it complained of not having a dependency on hadoop-hdds/framework but inturn hadoop-hdds/framework has its dependency causing circular dependency , so had to move it to the common module to solve this.
   
   It also adds RocksDB to clients.  `rocksdbjni-7.7.3.jar` is 56MB, which then gets bundled in the FS jars:
   
   Before:
   
   ```
   60M share/ozone/lib/ozone-filesystem-hadoop3-1.4.0-SNAPSHOT.jar
   ```
   
   After:
   
   ```
   115M share/ozone/lib/ozone-filesystem-hadoop3-1.4.0-SNAPSHOT.jar
   ```
   
   Please create a separate module for RocksDB-related stuff, and make `hdds-framework` and `rocksdb-checkpoint-differ` both depend on it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1119630412


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +52,53 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());

Review Comment:
   I have moved all the managed classes under hdds-common from hdds-server to avoid adding this dependency in rocksdiffer module, Please check if this looks fine.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1119575516


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +52,53 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (SstFileReader sstFileReader = new SstFileReader(new Options())) {
+      sstFileReader.open(filepath);
+      TableProperties properties = sstFileReader.getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        SstFileReaderIterator iterator =
+            sstFileReader.newIterator(new ReadOptions());

Review Comment:
   We cannot use ManagedSstFileReader here since most managed rocks objects are defined in hdds module for which we need dependency.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122433546


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDiffUtils.java:
##########
@@ -35,20 +53,56 @@ public static boolean isKeyWithPrefixPresent(String prefixForColumnFamily,
   }
 
   public static String constructBucketKey(String keyName) {
-    if (!keyName.startsWith(OzoneConsts.OM_KEY_PREFIX)) {
-      keyName = OzoneConsts.OM_KEY_PREFIX.concat(keyName);
+    if (!keyName.startsWith(OM_KEY_PREFIX)) {
+      keyName = OM_KEY_PREFIX.concat(keyName);
     }
-    String[] elements = keyName.split(OzoneConsts.OM_KEY_PREFIX);
+    String[] elements = keyName.split(OM_KEY_PREFIX);
     String volume = elements[1];
     String bucket = elements[2];
     StringBuilder builder =
-        new StringBuilder().append(OzoneConsts.OM_KEY_PREFIX).append(volume);
+        new StringBuilder().append(OM_KEY_PREFIX).append(volume);
 
     if (StringUtils.isNotBlank(bucket)) {
-      builder.append(OzoneConsts.OM_KEY_PREFIX).append(bucket);
+      builder.append(OM_KEY_PREFIX).append(bucket);
     }
     return builder.toString();
   }
 
+  public static void filterRelevantSstFiles(Set<String> inputFiles,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    for (Iterator<String> fileIterator =
+         inputFiles.iterator(); fileIterator.hasNext();) {
+      String filepath = fileIterator.next();
+      if (!RocksDiffUtils.doesSstFileContainKeyRange(filepath,
+          tableToPrefixMap)) {
+        fileIterator.remove();
+      }
+    }
+  }
+
+  public static boolean doesSstFileContainKeyRange(String filepath,
+      Map<String, String> tableToPrefixMap) throws RocksDBException {
+    try (ManagedSstFileReader sstFileReader = ManagedSstFileReader.managed(
+        new SstFileReader(new Options()))) {
+      sstFileReader.get().open(filepath);
+      TableProperties properties = sstFileReader.get().getTableProperties();
+      String tableName = new String(properties.getColumnFamilyName(), UTF_8);
+      if (tableToPrefixMap.containsKey(tableName)) {
+        String prefix = tableToPrefixMap.get(tableName) + OM_KEY_PREFIX;
+        try (ManagedSstFileReaderIterator iterator =
+            ManagedSstFileReaderIterator.managed(sstFileReader.get()
+                .newIterator(new ReadOptions()))) {
+          iterator.get().seek(prefix.getBytes(UTF_8));
+          String seekResultKey = new String(iterator.get().key(), UTF_8);
+          return seekResultKey.startsWith(prefix);
+        }
+      }
+      return false;
+    } catch (RocksDBException e) {
+      LOG.error("Failed to read SST File ", e);
+      throw e;

Review Comment:
   We might want to wrap this inside an IOException or some custom RocksDifferException before throwing it back to the client.
   
   Ideally we don't want `freon/TestOMSnapshotDAG` to import RocksDBException directly. 
   
   ```suggestion
         throw new IOException(e);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122456652


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java:
##########
@@ -292,7 +312,8 @@ private void addToObjectIdMap(Table<String, ? extends WithObjectID> fsTable,
         try {
           final WithObjectID oldKey = fsTable.get(key);
           final WithObjectID newKey = tsTable.get(key);
-          if (areKeysEqual(oldKey, newKey)) {
+          if (areKeysEqual(oldKey, newKey) || !isKeyInBucket(key, tablePrefixes,

Review Comment:
   For this PR, you should include the list of other changes in the first PR comment as well. Such as moving those Managed classes from framework to common, and the new force full diff config addition.
   
   Also, reflect the additional changes in the PR and JIRA title if possible.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4273: HDDS-7905. [Snapshot] Snapdiff should only return modifications done to the bucket.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4273:
URL: https://github.com/apache/ozone/pull/4273#discussion_r1122633534


##########
hadoop-hdds/common/src/main/resources/ozone-default.xml:
##########
@@ -3608,4 +3608,14 @@
       production environments.
     </description>
   </property>
+
+  <property>
+    <name>ozone.om.snapshot.use.full.diff</name>

Review Comment:
   Done.



##########
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConfigKeys.java:
##########
@@ -603,6 +603,11 @@ public final class OzoneConfigKeys {
       OZONE_OM_SNAPSHOT_PRUNE_COMPACTION_DAG_DAEMON_RUN_INTERVAL_DEFAULT =
       TimeUnit.HOURS.toMillis(1);
 
+  public static final String OZONE_OM_SNAPSHOT_USE_FULL_DIFF =
+      "ozone.om.snapshot.use.full.diff";
+
+  public static final boolean OZONE_OM_SNAPSHOT_USE_FULL_DIFF_DEFAULT = false;

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org