You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/02/23 11:40:35 UTC

[GitHub] [ozone] guihecheng opened a new pull request #3127: HDDS-6364. EC: Discard pre-allocated blocks to eliminate worthless retries.

guihecheng opened a new pull request #3127:
URL: https://github.com/apache/ozone/pull/3127


   ## What changes were proposed in this pull request?
   
   Discard pre-allocated blocks to eliminate worthless retries.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6364
   
   ## How was this patch tested?
   
   A new ut.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao commented on a change in pull request #3127: HDDS-6364. EC: Discard pre-allocated blocks to eliminate worthless retries.

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on a change in pull request #3127:
URL: https://github.com/apache/ozone/pull/3127#discussion_r816094083



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/BlockOutputStreamEntry.java
##########
@@ -223,7 +223,7 @@ long getLength() {
    * Gets the block token that is used to authenticate during the write.
    * @return the block token for writing the data
    */
-  Token<OzoneBlockTokenIdentifier> getToken() {

Review comment:
       Not sure it's a good idea to make this public as we return token object? 
   I have an idea to avoid make them public. Create a small helper class in the same package in test and add getToken and getCurrentPosition APIs and return the actual. 
   
   class name at test: *.client.io.BlockStreamAccessor




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on a change in pull request #3127: HDDS-6364. EC: Discard pre-allocated blocks to eliminate worthless retries.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on a change in pull request #3127:
URL: https://github.com/apache/ozone/pull/3127#discussion_r814507203



##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/TestOzoneECClient.java
##########
@@ -908,6 +911,107 @@ public void testPartialStripeWithPartialChunkRetry()
     }
   }
 
+  @Test
+  public void testDiscardPreAllocatedBlocksPreventRetryExceeds()
+      throws IOException {
+    close();
+    OzoneConfiguration con = new OzoneConfiguration();
+    int maxRetries = 3;
+    con.setStorageSize(OzoneConfigKeys.OZONE_SCM_BLOCK_SIZE,
+        2, StorageUnit.KB);
+    con.setInt(OzoneConfigKeys.OZONE_CLIENT_MAX_EC_STRIPE_WRITE_RETRIES,
+        maxRetries);
+    MultiNodePipelineBlockAllocator blkAllocator =
+        new MultiNodePipelineBlockAllocator(con, dataBlocks + parityBlocks,
+            15);
+    createNewClient(con, blkAllocator);
+
+    store.createVolume(volumeName);
+    OzoneVolume volume = store.getVolume(volumeName);
+    volume.createBucket(bucketName);
+    OzoneBucket bucket = volume.getBucket(bucketName);
+
+    int numStripesBeforeFailure = 1;
+    int numStripesAfterFailure = 1;
+    int numSTripesTotal = numStripesBeforeFailure + numStripesAfterFailure;
+    int numChunksToWriteAfterFailure = dataBlocks;
+    int numExpectedBlockGrps = 2;
+    // fail the DNs for parity blocks

Review comment:
       Sure, we could fail any nodes here to trigger retry, the code comment is misleading here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao merged pull request #3127: HDDS-6364. EC: Discard pre-allocated blocks to eliminate worthless retries.

Posted by GitBox <gi...@apache.org>.
umamaheswararao merged pull request #3127:
URL: https://github.com/apache/ozone/pull/3127


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on a change in pull request #3127: HDDS-6364. EC: Discard pre-allocated blocks to eliminate worthless retries.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on a change in pull request #3127:
URL: https://github.com/apache/ozone/pull/3127#discussion_r814503711



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -87,6 +88,23 @@ public XceiverClientFactory getXceiverClientFactory() {
     return blockOutputStreamEntryPool.getLocationInfoList();
   }
 
+  @VisibleForTesting

Review comment:
       Oh, sure we could put this into the test class, thanks~




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on a change in pull request #3127: HDDS-6364. EC: Discard pre-allocated blocks to eliminate worthless retries.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on a change in pull request #3127:
URL: https://github.com/apache/ozone/pull/3127#discussion_r816388168



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/BlockOutputStreamEntry.java
##########
@@ -223,7 +223,7 @@ long getLength() {
    * Gets the block token that is used to authenticate during the write.
    * @return the block token for writing the data
    */
-  Token<OzoneBlockTokenIdentifier> getToken() {

Review comment:
       Oh, sounds a good idea, I'll try that, thanks~




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao commented on a change in pull request #3127: HDDS-6364. EC: Discard pre-allocated blocks to eliminate worthless retries.

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on a change in pull request #3127:
URL: https://github.com/apache/ozone/pull/3127#discussion_r814485024



##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/TestOzoneECClient.java
##########
@@ -908,6 +911,107 @@ public void testPartialStripeWithPartialChunkRetry()
     }
   }
 
+  @Test
+  public void testDiscardPreAllocatedBlocksPreventRetryExceeds()
+      throws IOException {
+    close();
+    OzoneConfiguration con = new OzoneConfiguration();
+    int maxRetries = 3;
+    con.setStorageSize(OzoneConfigKeys.OZONE_SCM_BLOCK_SIZE,
+        2, StorageUnit.KB);
+    con.setInt(OzoneConfigKeys.OZONE_CLIENT_MAX_EC_STRIPE_WRITE_RETRIES,
+        maxRetries);
+    MultiNodePipelineBlockAllocator blkAllocator =
+        new MultiNodePipelineBlockAllocator(con, dataBlocks + parityBlocks,
+            15);
+    createNewClient(con, blkAllocator);
+
+    store.createVolume(volumeName);
+    OzoneVolume volume = store.getVolume(volumeName);
+    volume.createBucket(bucketName);
+    OzoneBucket bucket = volume.getBucket(bucketName);
+
+    int numStripesBeforeFailure = 1;
+    int numStripesAfterFailure = 1;
+    int numSTripesTotal = numStripesBeforeFailure + numStripesAfterFailure;
+    int numChunksToWriteAfterFailure = dataBlocks;
+    int numExpectedBlockGrps = 2;
+    // fail the DNs for parity blocks
+    int[] nodesIndexesToMarkFailure = {3, 4};
+
+    try (OzoneOutputStream out = bucket.createKey(keyName,
+        1024 * dataBlocks * numStripesBeforeFailure
+            + numChunksToWriteAfterFailure,
+        new ECReplicationConfig(dataBlocks, parityBlocks,
+            ECReplicationConfig.EcCodec.RS,
+            chunkSize), new HashMap<>())) {
+      Assert.assertTrue(out.getOutputStream() instanceof ECKeyOutputStream);
+      ECKeyOutputStream kos = (ECKeyOutputStream) out.getOutputStream();
+      List<OmKeyLocationInfo> blockInfos = kos.getAllLocationInfoList();
+      Assert.assertEquals(1, blockInfos.size());
+
+      // Mock some pre-allocated blocks to the key
+      // should be > maxRetries
+      int numPreAllocatedBlocks = maxRetries + 1;
+      BlockID blockID = blockInfos.get(0).getBlockID();
+      Pipeline pipeline = blockInfos.get(0).getPipeline();
+      List<OmKeyLocationInfo> omKeyLocationInfos = new ArrayList<>();
+      for (int i = 0; i < numPreAllocatedBlocks; i++) {
+        BlockID nextBlockID = new BlockID(blockID.getContainerID(),
+            blockID.getLocalID() + i + 1);
+        omKeyLocationInfos.add(new OmKeyLocationInfo.Builder()
+            .setBlockID(nextBlockID)
+            .setPipeline(pipeline)
+            .build());
+      }
+      OmKeyLocationInfoGroup omKeyLocationInfoGroup =
+          new OmKeyLocationInfoGroup(0, omKeyLocationInfos);
+      kos.addPreallocateBlocks(omKeyLocationInfoGroup, 0);
+
+      for (int j = 0; j < numStripesBeforeFailure; j++) {
+        for (int i = 0; i < dataBlocks; i++) {
+          out.write(inputChunks[i]);
+        }
+      }
+
+      // Make the parity write fail to trigger retry
+      List<DatanodeDetails> failedDNs = new ArrayList<>();
+      List<HddsProtos.DatanodeDetailsProto> dns = allocator.getClusterDns();
+      for (int j = 0; j < nodesIndexesToMarkFailure.length; j++) {
+        failedDNs.add(DatanodeDetails
+            .getFromProtoBuf(dns.get(nodesIndexesToMarkFailure[j])));
+      }
+
+      // First let's set storage as bad
+      ((MockXceiverClientFactory) factoryStub).setFailedStorages(failedDNs);
+
+      for (int j = 0; j < numStripesAfterFailure; j++) {
+        for (int i = 0; i < dataBlocks; i++) {
+          out.write(inputChunks[i]);
+        }
+      }
+
+      // if we don't discard pre-allocated blocks,

Review comment:
       comment can be like "if we don't discard pre-allocated blocks, retries should exceed the maxRetries and write will fail." ?

##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/TestOzoneECClient.java
##########
@@ -908,6 +911,107 @@ public void testPartialStripeWithPartialChunkRetry()
     }
   }
 
+  @Test
+  public void testDiscardPreAllocatedBlocksPreventRetryExceeds()
+      throws IOException {
+    close();
+    OzoneConfiguration con = new OzoneConfiguration();
+    int maxRetries = 3;
+    con.setStorageSize(OzoneConfigKeys.OZONE_SCM_BLOCK_SIZE,
+        2, StorageUnit.KB);
+    con.setInt(OzoneConfigKeys.OZONE_CLIENT_MAX_EC_STRIPE_WRITE_RETRIES,
+        maxRetries);
+    MultiNodePipelineBlockAllocator blkAllocator =
+        new MultiNodePipelineBlockAllocator(con, dataBlocks + parityBlocks,
+            15);
+    createNewClient(con, blkAllocator);
+
+    store.createVolume(volumeName);
+    OzoneVolume volume = store.getVolume(volumeName);
+    volume.createBucket(bucketName);
+    OzoneBucket bucket = volume.getBucket(bucketName);
+
+    int numStripesBeforeFailure = 1;
+    int numStripesAfterFailure = 1;
+    int numSTripesTotal = numStripesBeforeFailure + numStripesAfterFailure;
+    int numChunksToWriteAfterFailure = dataBlocks;
+    int numExpectedBlockGrps = 2;
+    // fail the DNs for parity blocks
+    int[] nodesIndexesToMarkFailure = {3, 4};
+
+    try (OzoneOutputStream out = bucket.createKey(keyName,

Review comment:
       Below number seems key for getting the pre-allocated blocks. Can we extract to a variable and provide one line comment with details like how many pre-allocated blocks expected? But looks like we are adding preallocated blocks explicitly. Then this number really matters?

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -87,6 +88,23 @@ public XceiverClientFactory getXceiverClientFactory() {
     return blockOutputStreamEntryPool.getLocationInfoList();
   }
 
+  @VisibleForTesting

Review comment:
       Looks like this used only for tests?
   If so, why can't we use getStreamEntries API and build the required info in tests?

##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/TestOzoneECClient.java
##########
@@ -908,6 +911,107 @@ public void testPartialStripeWithPartialChunkRetry()
     }
   }
 
+  @Test
+  public void testDiscardPreAllocatedBlocksPreventRetryExceeds()
+      throws IOException {
+    close();
+    OzoneConfiguration con = new OzoneConfiguration();
+    int maxRetries = 3;
+    con.setStorageSize(OzoneConfigKeys.OZONE_SCM_BLOCK_SIZE,
+        2, StorageUnit.KB);
+    con.setInt(OzoneConfigKeys.OZONE_CLIENT_MAX_EC_STRIPE_WRITE_RETRIES,
+        maxRetries);
+    MultiNodePipelineBlockAllocator blkAllocator =
+        new MultiNodePipelineBlockAllocator(con, dataBlocks + parityBlocks,
+            15);
+    createNewClient(con, blkAllocator);
+
+    store.createVolume(volumeName);
+    OzoneVolume volume = store.getVolume(volumeName);
+    volume.createBucket(bucketName);
+    OzoneBucket bucket = volume.getBucket(bucketName);
+
+    int numStripesBeforeFailure = 1;
+    int numStripesAfterFailure = 1;
+    int numSTripesTotal = numStripesBeforeFailure + numStripesAfterFailure;
+    int numChunksToWriteAfterFailure = dataBlocks;
+    int numExpectedBlockGrps = 2;
+    // fail the DNs for parity blocks

Review comment:
       There is no reason to make only parity nodes fail right? we can fail any nodes to trigger retry right?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on a change in pull request #3127: HDDS-6364. EC: Discard pre-allocated blocks to eliminate worthless retries.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on a change in pull request #3127:
URL: https://github.com/apache/ozone/pull/3127#discussion_r814506511



##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/TestOzoneECClient.java
##########
@@ -908,6 +911,107 @@ public void testPartialStripeWithPartialChunkRetry()
     }
   }
 
+  @Test
+  public void testDiscardPreAllocatedBlocksPreventRetryExceeds()
+      throws IOException {
+    close();
+    OzoneConfiguration con = new OzoneConfiguration();
+    int maxRetries = 3;
+    con.setStorageSize(OzoneConfigKeys.OZONE_SCM_BLOCK_SIZE,
+        2, StorageUnit.KB);
+    con.setInt(OzoneConfigKeys.OZONE_CLIENT_MAX_EC_STRIPE_WRITE_RETRIES,
+        maxRetries);
+    MultiNodePipelineBlockAllocator blkAllocator =
+        new MultiNodePipelineBlockAllocator(con, dataBlocks + parityBlocks,
+            15);
+    createNewClient(con, blkAllocator);
+
+    store.createVolume(volumeName);
+    OzoneVolume volume = store.getVolume(volumeName);
+    volume.createBucket(bucketName);
+    OzoneBucket bucket = volume.getBucket(bucketName);
+
+    int numStripesBeforeFailure = 1;
+    int numStripesAfterFailure = 1;
+    int numSTripesTotal = numStripesBeforeFailure + numStripesAfterFailure;
+    int numChunksToWriteAfterFailure = dataBlocks;
+    int numExpectedBlockGrps = 2;
+    // fail the DNs for parity blocks
+    int[] nodesIndexesToMarkFailure = {3, 4};
+
+    try (OzoneOutputStream out = bucket.createKey(keyName,
+        1024 * dataBlocks * numStripesBeforeFailure
+            + numChunksToWriteAfterFailure,
+        new ECReplicationConfig(dataBlocks, parityBlocks,
+            ECReplicationConfig.EcCodec.RS,
+            chunkSize), new HashMap<>())) {
+      Assert.assertTrue(out.getOutputStream() instanceof ECKeyOutputStream);
+      ECKeyOutputStream kos = (ECKeyOutputStream) out.getOutputStream();
+      List<OmKeyLocationInfo> blockInfos = kos.getAllLocationInfoList();
+      Assert.assertEquals(1, blockInfos.size());
+
+      // Mock some pre-allocated blocks to the key
+      // should be > maxRetries
+      int numPreAllocatedBlocks = maxRetries + 1;
+      BlockID blockID = blockInfos.get(0).getBlockID();
+      Pipeline pipeline = blockInfos.get(0).getPipeline();
+      List<OmKeyLocationInfo> omKeyLocationInfos = new ArrayList<>();
+      for (int i = 0; i < numPreAllocatedBlocks; i++) {
+        BlockID nextBlockID = new BlockID(blockID.getContainerID(),
+            blockID.getLocalID() + i + 1);
+        omKeyLocationInfos.add(new OmKeyLocationInfo.Builder()
+            .setBlockID(nextBlockID)
+            .setPipeline(pipeline)
+            .build());
+      }
+      OmKeyLocationInfoGroup omKeyLocationInfoGroup =
+          new OmKeyLocationInfoGroup(0, omKeyLocationInfos);
+      kos.addPreallocateBlocks(omKeyLocationInfoGroup, 0);
+
+      for (int j = 0; j < numStripesBeforeFailure; j++) {
+        for (int i = 0; i < dataBlocks; i++) {
+          out.write(inputChunks[i]);
+        }
+      }
+
+      // Make the parity write fail to trigger retry
+      List<DatanodeDetails> failedDNs = new ArrayList<>();
+      List<HddsProtos.DatanodeDetailsProto> dns = allocator.getClusterDns();
+      for (int j = 0; j < nodesIndexesToMarkFailure.length; j++) {
+        failedDNs.add(DatanodeDetails
+            .getFromProtoBuf(dns.get(nodesIndexesToMarkFailure[j])));
+      }
+
+      // First let's set storage as bad
+      ((MockXceiverClientFactory) factoryStub).setFailedStorages(failedDNs);
+
+      for (int j = 0; j < numStripesAfterFailure; j++) {
+        for (int i = 0; i < dataBlocks; i++) {
+          out.write(inputChunks[i]);
+        }
+      }
+
+      // if we don't discard pre-allocated blocks,

Review comment:
       Yes, exactly.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on a change in pull request #3127: HDDS-6364. EC: Discard pre-allocated blocks to eliminate worthless retries.

Posted by GitBox <gi...@apache.org>.
guihecheng commented on a change in pull request #3127:
URL: https://github.com/apache/ozone/pull/3127#discussion_r814506286



##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/TestOzoneECClient.java
##########
@@ -908,6 +911,107 @@ public void testPartialStripeWithPartialChunkRetry()
     }
   }
 
+  @Test
+  public void testDiscardPreAllocatedBlocksPreventRetryExceeds()
+      throws IOException {
+    close();
+    OzoneConfiguration con = new OzoneConfiguration();
+    int maxRetries = 3;
+    con.setStorageSize(OzoneConfigKeys.OZONE_SCM_BLOCK_SIZE,
+        2, StorageUnit.KB);
+    con.setInt(OzoneConfigKeys.OZONE_CLIENT_MAX_EC_STRIPE_WRITE_RETRIES,
+        maxRetries);
+    MultiNodePipelineBlockAllocator blkAllocator =
+        new MultiNodePipelineBlockAllocator(con, dataBlocks + parityBlocks,
+            15);
+    createNewClient(con, blkAllocator);
+
+    store.createVolume(volumeName);
+    OzoneVolume volume = store.getVolume(volumeName);
+    volume.createBucket(bucketName);
+    OzoneBucket bucket = volume.getBucket(bucketName);
+
+    int numStripesBeforeFailure = 1;
+    int numStripesAfterFailure = 1;
+    int numSTripesTotal = numStripesBeforeFailure + numStripesAfterFailure;
+    int numChunksToWriteAfterFailure = dataBlocks;
+    int numExpectedBlockGrps = 2;
+    // fail the DNs for parity blocks
+    int[] nodesIndexesToMarkFailure = {3, 4};
+
+    try (OzoneOutputStream out = bucket.createKey(keyName,

Review comment:
       Oh, it seems a mistakenly write size for the key, there's another variable `numPreAllocatedBlocks` below with comments, the `numPreAllocatedBlocks` should be > EC max retries, so we intend to show that if pre-allocated block are not dropped, they will be used during retry which is worthless for the same pipeline.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org