Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2019/10/08 01:29:58 UTC
[GitHub] [hadoop] arp7 commented on a change in pull request #1588: HDDS-1986. Fix listkeys API.

arp7 commented on a change in pull request #1588: HDDS-1986. Fix listkeys API.
URL: https://github.com/apache/hadoop/pull/1588#discussion_r332303770
 
 

 ##########
 File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java
 ##########
 @@ -680,26 +688,85 @@ public boolean isBucketEmpty(String volume, String bucket)
       seekPrefix = getBucketKey(volumeName, bucketName + OM_KEY_PREFIX);
     }
     int currentCount = 0;
-    try (TableIterator<String, ? extends KeyValue<String, OmKeyInfo>> keyIter =
-        getKeyTable()
-            .iterator()) {
-      KeyValue<String, OmKeyInfo> kv = keyIter.seek(seekKey);
-      while (currentCount < maxKeys && keyIter.hasNext()) {
-        kv = keyIter.next();
-        // Skip the Start key if needed.
-        if (kv != null && skipStartKey && kv.getKey().equals(seekKey)) {
-          continue;
+
+
+    TreeMap<String, OmKeyInfo> cacheKeyMap = new TreeMap<>();
+    Set<String> deletedKeySet = new TreeSet<>();
+    Iterator<Map.Entry<CacheKey<String>, CacheValue<OmKeyInfo>>> iterator =
+        keyTable.cacheIterator();
+
+    //TODO: We can avoid this iteration if the table cache stores entries in
+    // a TreeMap. Currently a HashMap is used in the cache. HashMap get is a
+    // constant-time operation, whereas TreeMap get is O(log n).
+    // So if we move to a TreeMap, get will be slower, and get is a
+    // frequent operation on the table. So, for now, in list we iterate the
+    // cache map and build a TreeMap of the keys that match keyPrefix and are
+    // greater than or equal to startKey. We can revisit this later if the
+    // list operation becomes slow.
+    while (iterator.hasNext()) {
+      Map.Entry< CacheKey<String>, CacheValue<OmKeyInfo>> entry =
+          iterator.next();
+
+      String key = entry.getKey().getCacheKey();
+      OmKeyInfo omKeyInfo = entry.getValue().getCacheValue();
+      // Making sure that entry in cache is not for delete key request.
+
+      if (omKeyInfo != null) {
+        if (key.startsWith(seekPrefix) && key.compareTo(seekKey) >= 0) {
+          cacheKeyMap.put(key, omKeyInfo);
         }
+      } else {
+        deletedKeySet.add(key);
+      }
+    }
+
+    // Get maxKeys from DB if it has.
+
+    try (TableIterator<String, ? extends KeyValue<String, OmKeyInfo>>
+             keyIter = getKeyTable().iterator()) {
+      KeyValue< String, OmKeyInfo > kv;
+      keyIter.seek(seekKey);
+      // we need to iterate maxKeys + 1 here because if skipStartKey is true,
+      // we should skip that entry and return the result.
+      while (currentCount < maxKeys + 1 && keyIter.hasNext()) {
+        kv = keyIter.next();
         if (kv != null && kv.getKey().startsWith(seekPrefix)) {
-          result.add(kv.getValue());
-          currentCount++;
+
+          // Entry should not be marked for delete, consider only those
+          // entries.
+          if(!deletedKeySet.contains(kv.getKey())) {
+            cacheKeyMap.put(kv.getKey(), kv.getValue());
+            currentCount++;
+          }
         } else {
           // The SeekPrefix does not match any more, we can break out of the
           // loop.
           break;
         }
       }
     }
+
+    // Finally DB entries and cache entries are merged, then return the count
+    // of maxKeys from the sorted map.
+    currentCount = 0;
+
+    for (Map.Entry<String, OmKeyInfo>  cacheKey : cacheKeyMap.entrySet()) {
 
 Review comment:
   The second iteration is unfortunate. We should see if there is a way to avoid it.
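
   One possible direction, as a rough sketch only and not what this PR does: if the table cache exposed its entries in sorted order (the TODO in the diff notes it is a HashMap today), the cache and the DB iterator could be merged in a single pass, so neither the intermediate TreeMap nor the final loop over it would be needed. Everything below is hypothetical: the class and method names are made up, plain String values stand in for OmKeyInfo, two NavigableMaps stand in for the cache and the RocksDB table, and it assumes seekKey starts with seekPrefix. A null cache value plays the role of the delete marker, as in the diff.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sketch: single-pass merge of a sorted cache and a sorted DB view. */
public class MergedListingSketch {

  /** Returns the next entry of the iterator, or null when it is exhausted. */
  private static Map.Entry<String, String> advance(
      Iterator<Map.Entry<String, String>> it) {
    return it.hasNext() ? it.next() : null;
  }

  /**
   * Lists up to maxKeys values whose keys start with seekPrefix, beginning at
   * seekKey. Cache entries override DB entries with the same key; a null
   * cache value marks the key as deleted.
   */
  static List<String> listKeys(NavigableMap<String, String> cache,
      NavigableMap<String, String> db, String seekPrefix, String seekKey,
      boolean skipStartKey, int maxKeys) {
    List<String> result = new ArrayList<>();
    Iterator<Map.Entry<String, String>> cacheIter =
        cache.tailMap(seekKey, true).entrySet().iterator();
    Iterator<Map.Entry<String, String>> dbIter =
        db.tailMap(seekKey, true).entrySet().iterator();
    Map.Entry<String, String> c = advance(cacheIter);
    Map.Entry<String, String> d = advance(dbIter);

    while (result.size() < maxKeys && (c != null || d != null)) {
      // Pick the smaller head; on a tie the cache entry wins.
      int cmp = (c == null) ? 1
          : (d == null) ? -1 : c.getKey().compareTo(d.getKey());
      boolean fromCache = cmp <= 0;
      Map.Entry<String, String> next = fromCache ? c : d;
      if (cmp <= 0) {
        c = advance(cacheIter);
      }
      if (cmp >= 0) {
        d = advance(dbIter);
      }

      if (!next.getKey().startsWith(seekPrefix)) {
        break; // Both views are sorted, so no later key can match.
      }
      if (fromCache && next.getValue() == null) {
        continue; // Delete marker: hide the key, including its DB copy.
      }
      if (skipStartKey && next.getKey().equals(seekKey)) {
        continue;
      }
      result.add(next.getValue());
    }
    return result;
  }

  public static void main(String[] args) {
    NavigableMap<String, String> cache = new TreeMap<>();
    cache.put("/v/b/k2", null); // pending delete
    cache.put("/v/b/k3", "k3-cache");
    cache.put("/v/b/k5", "k5-cache");

    NavigableMap<String, String> db = new TreeMap<>();
    db.put("/v/b/k1", "k1-db");
    db.put("/v/b/k2", "k2-db");
    db.put("/v/b/k3", "k3-db");
    db.put("/v/b/k4", "k4-db");
    db.put("/v/c/x", "x-db");

    // Prints [k1-db, k3-cache, k4-db, k5-cache]
    System.out.println(listKeys(cache, db, "/v/b/", "/v/b/k1", false, 10));
  }
}
```

   The trade-off is the one the TODO already names: a sorted cache makes point lookups O(log n) instead of constant time, so whether this is worth it depends on how hot the get path is compared with list.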
