You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/02/16 02:26:56 UTC

[GitHub] [ozone] guihecheng opened a new pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

guihecheng opened a new pull request #3094:
URL: https://github.com/apache/ozone/pull/3094


   ## What changes were proposed in this pull request?
   
   Fix read big file failure with EC policy 10+4.
   Using a file of 2GB in manual test.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6319
   
   ## How was this patch tested?
   
   manual test:
   ./bin/ozone sh volume create vol1
   ./bin/ozone sh bucket create vol1/bucket1 --layout=FILE_SYSTEM_OPTIMIZED --replication=rs-10-4-1024k --type EC
   ./bin/ozone sh key put /vol1/bucket1/dd.2G dd.2G
   ./bin/ozone sh key get /vol1/bucket1/dd.2G down.2G
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] umamaheswararao commented on pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

umamaheswararao commented on pull request #3094:
URL: https://github.com/apache/ozone/pull/3094#issuecomment-1043384284


   @guihecheng Do you mind creating PR with test?
   If test included I am +1, thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] guihecheng commented on a change in pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

guihecheng commented on a change in pull request #3094:
URL: https://github.com/apache/ozone/pull/3094#discussion_r812504649



##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/io/TestKeyInputStreamEC.java
##########
@@ -0,0 +1,124 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.client.io;
+
+import org.apache.hadoop.hdds.client.BlockID;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.client.ReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
+import org.apache.hadoop.hdds.scm.pipeline.PipelineID;
+import org.apache.hadoop.ozone.om.helpers.OmKeyLocationInfo;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.UUID;
+
+import static org.apache.hadoop.ozone.OzoneConsts.MB;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.Mockito.*;
+
+/**
+ * Test KeyInputStream with underlying ECBlockInputStream.
+ */
+public class TestKeyInputStreamEC {
+
+  @Test

Review comment:
       Oh, I agree it is better to create a KeyInputStream object from a normal interface instead of injecting it with a newly added test interface.
   So, in order to utilize ECStreamTestUtil.TestBlockInputStream, let's move this test file to the same directory of ECStreamTestUtil class.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] umamaheswararao merged pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

umamaheswararao merged pull request #3094:
URL: https://github.com/apache/ozone/pull/3094


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] sodonnel commented on a change in pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

sodonnel commented on a change in pull request #3094:
URL: https://github.com/apache/ozone/pull/3094#discussion_r811853242



##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/io/TestKeyInputStreamEC.java
##########
@@ -0,0 +1,124 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.client.io;
+
+import org.apache.hadoop.hdds.client.BlockID;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.client.ReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
+import org.apache.hadoop.hdds.scm.pipeline.PipelineID;
+import org.apache.hadoop.ozone.om.helpers.OmKeyLocationInfo;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.UUID;
+
+import static org.apache.hadoop.ozone.OzoneConsts.MB;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.Mockito.*;
+
+/**
+ * Test KeyInputStream with underlying ECBlockInputStream.
+ */
+public class TestKeyInputStreamEC {
+
+  @Test
+  public void testReadAgainstLargeBlockGroup() throws IOException {
+    int dataBlocks = 10;
+    int parityBlocks = 4;
+    ECReplicationConfig ec10And4RepConfig = new ECReplicationConfig(dataBlocks,
+        parityBlocks, ECReplicationConfig.EcCodec.RS, (int)(1 * MB));
+    // default blockSize of 256MB with EC 10+4 makes a large block group
+    long blockSize = 256 * MB;
+    OmKeyLocationInfo blockInfo = createBlockInfo(ec10And4RepConfig,
+        dataBlocks + parityBlocks, dataBlocks * blockSize);
+
+    ECBlockInputStream ecBlockInputStream = new ECBlockInputStream(
+        ec10And4RepConfig, blockInfo, false, null, null, null);
+
+    ECBlockInputStreamFactory mockStreamFactory =
+        mock(ECBlockInputStreamFactory.class);
+    when(mockStreamFactory.create(anyBoolean(), anyList(), any(), any(),
+        anyBoolean(), any(), any())).thenReturn(ecBlockInputStream);
+
+    try (KeyInputStream kis = new KeyInputStream()) {
+      ECBlockInputStreamProxy streamProxy = new ECBlockInputStreamProxy(
+          ec10And4RepConfig, blockInfo, true, null, null, mockStreamFactory);
+      ECBlockInputStreamProxy spyEcBlockInputStream = spy(streamProxy);
+      byte[] buf = new byte[100];
+      // spy the read(ByteBuffer) method is ok since issue HDDS-6319
+      // happens in read(byte[], off, len)
+      doReturn(buf.length).when(spyEcBlockInputStream)
+          .read(any(ByteBuffer.class));
+
+      kis.addStream(spyEcBlockInputStream);
+
+      int readBytes = kis.read(buf, 0, 100);
+      Assert.assertEquals(100, readBytes);
+    }
+  }
+
+  private OmKeyLocationInfo createBlockInfo(ReplicationConfig repConf,
+      int nodeCount, long blockLength) {
+    Map<DatanodeDetails, Integer> dnMap = new HashMap<>();
+    for (int i = 0; i < nodeCount; i++) {
+      dnMap.put(randomDatanodeDetails(), i + 1);
+    }
+
+    Pipeline pipeline = Pipeline.newBuilder()
+        .setState(Pipeline.PipelineState.CLOSED)
+        .setId(PipelineID.randomId())
+        .setNodes(new ArrayList<>(dnMap.keySet()))
+        .setReplicaIndexes(dnMap)
+        .setReplicationConfig(repConf)
+        .build();
+
+    OmKeyLocationInfo blockInfo = new OmKeyLocationInfo.Builder()
+        .setBlockID(new BlockID(1, 1))
+        .setLength(blockLength)
+        .setOffset(0)
+        .setPipeline(pipeline)
+        .setPartNumber(0)
+        .build();
+    return blockInfo;
+  }
+
+  private DatanodeDetails randomDatanodeDetails() {

Review comment:
       There is an existing class MockDatanodeDetails you can use to get random datanode details like this. Can you use it rather than the code here please?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] sodonnel commented on pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

sodonnel commented on pull request #3094:
URL: https://github.com/apache/ozone/pull/3094#issuecomment-1041244725


   Would there be any way to add a test to reproduce this via a unit test with mocks? I know we have some other mocked tests on the write path, but perhaps they don't work with the KeyInputStream Writing around 3GB of data for each run of the unit tests might be a bit of a resource burden for the CI checks and even running locally.
   
   I will have a look shortly if there is a way to reproduce this from the mocked input streams.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] guihecheng commented on pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

guihecheng commented on pull request #3094:
URL: https://github.com/apache/ozone/pull/3094#issuecomment-1041249336


   @umamaheswararao @sodonnel Oh thanks for your comments, originally I was wondering how to cover this with a test case and preventing creating a real large file on disk or in memory.
   The integration-test case offered by Uma looks good, but I understand that TestECBlockInputStream seems to be dedicated for testing ECBlockInputStream while test issues lies within KeyInputStream with underlying ECBlockInputStream.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] guihecheng commented on pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

guihecheng commented on pull request #3094:
URL: https://github.com/apache/ozone/pull/3094#issuecomment-1043825669


   @umamaheswararao a new ut added with mocks, I created a new source file here since I think it doesn't belong to any existing one.
   cc @sodonnel 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] guihecheng commented on a change in pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

guihecheng commented on a change in pull request #3094:
URL: https://github.com/apache/ozone/pull/3094#discussion_r807601538



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyInputStream.java
##########
@@ -264,7 +264,7 @@ protected synchronized int readWithStrategy(ByteReaderStrategy strategy)
 
       // Get the current blockStream and read data from it
       BlockExtendedInputStream current = blockStreams.get(blockIndex);
-      int numBytesToRead = Math.min(buffLen, (int)current.getRemaining());
+      int numBytesToRead = (int)Math.min(buffLen, current.getRemaining());

Review comment:
       Thanks for your reminding, rebased and waiting for CI.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] umamaheswararao commented on pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

umamaheswararao commented on pull request #3094:
URL: https://github.com/apache/ozone/pull/3094#issuecomment-1041183210


   Thanks @guihecheng for the patch. Is there a way we can add some test around this? 
   Below test can reproduce this case (Probably you may want to keep this test in appropriate  test class and If there is a better way, that would be awesome though, thanks CC: @sodonnel ):
   ```
   diff --git a/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyInputStream.java b/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyInputStream.java
   index 46a7b5099..f6e7a6dab 100644
   --- a/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyInputStream.java
   +++ b/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyInputStream.java
   @@ -210,6 +210,11 @@ public void addStream(BlockInputStream blockInputStream) {
        blockStreams.add(blockInputStream);
      }
   
   +  @VisibleForTesting
   +  public void addStream(BlockExtendedInputStream blockInputStream) {
   +    blockStreams.add(blockInputStream);
   +  }
   +
      /**
       * {@inheritDoc}
       */
   @@ -264,6 +269,7 @@ protected synchronized int readWithStrategy(ByteReaderStrategy strategy)
   
          // Get the current blockStream and read data from it
          BlockExtendedInputStream current = blockStreams.get(blockIndex);
   +
          int numBytesToRead = Math.min(buffLen, (int)current.getRemaining());
          int numBytesRead = strategy.readFromBlock(current, numBytesToRead);
          if (numBytesRead != numBytesToRead) {
   diff --git a/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/read/TestECBlockInputStream.java b/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/read/TestECBlockInputStream.java
   index b3b01ba4a..4289728a0 100644
   --- a/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/read/TestECBlockInputStream.java
   +++ b/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/read/TestECBlockInputStream.java
   @@ -30,6 +30,7 @@
    import org.apache.hadoop.ozone.client.io.BadDataLocationException;
    import org.apache.hadoop.ozone.client.io.BlockInputStreamFactory;
    import org.apache.hadoop.ozone.client.io.ECBlockInputStream;
   +import org.apache.hadoop.ozone.client.io.KeyInputStream;
    import org.apache.hadoop.ozone.om.helpers.OmKeyLocationInfo;
    import org.apache.hadoop.security.token.Token;
    import org.junit.Assert;
   @@ -230,6 +231,25 @@ public void testSimpleRead() throws IOException {
        }
      }
   
   +  @Test
   +  public void testLargeSizeReadWithKeyInputStream() throws IOException {
   +    ECReplicationConfig ec10And4RepConfig =
   +        new ECReplicationConfig(10, 4, ECReplicationConfig.EcCodec.RS, ONEMB);
   +    OmKeyLocationInfo keyInfo = ECStreamTestUtil
   +        .createKeyInfo(ec10And4RepConfig, 14, (10 * 256L * ONEMB));
   +    try (ECBlockInputStream ecb = new ECBlockInputStream(ec10And4RepConfig,
   +        keyInfo, true, null, null, streamFactory)) {
   +
   +      KeyInputStream kis = new KeyInputStream();
   +      kis.addStream(ecb);
   +      ByteBuffer buf = ByteBuffer.allocate(100);
   +      int read = kis.read(buf);
   +      Assert.assertEquals(100, read);
   +      validateBufferContents(buf, 0, 100, (byte) 0);
   +      Assert.assertEquals(100, ecb.getPos());
   +    }
   +  }
   +
      /**
       * This test is to ensure we can read a small key of 1 chunk or less when only
       * the first replica index is available.
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] guihecheng commented on a change in pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

guihecheng commented on a change in pull request #3094:
URL: https://github.com/apache/ozone/pull/3094#discussion_r811862732



##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/io/TestKeyInputStreamEC.java
##########
@@ -0,0 +1,124 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.client.io;
+
+import org.apache.hadoop.hdds.client.BlockID;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.client.ReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
+import org.apache.hadoop.hdds.scm.pipeline.PipelineID;
+import org.apache.hadoop.ozone.om.helpers.OmKeyLocationInfo;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.UUID;
+
+import static org.apache.hadoop.ozone.OzoneConsts.MB;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.Mockito.*;
+
+/**
+ * Test KeyInputStream with underlying ECBlockInputStream.
+ */
+public class TestKeyInputStreamEC {
+
+  @Test
+  public void testReadAgainstLargeBlockGroup() throws IOException {
+    int dataBlocks = 10;
+    int parityBlocks = 4;
+    ECReplicationConfig ec10And4RepConfig = new ECReplicationConfig(dataBlocks,
+        parityBlocks, ECReplicationConfig.EcCodec.RS, (int)(1 * MB));
+    // default blockSize of 256MB with EC 10+4 makes a large block group
+    long blockSize = 256 * MB;
+    OmKeyLocationInfo blockInfo = createBlockInfo(ec10And4RepConfig,
+        dataBlocks + parityBlocks, dataBlocks * blockSize);
+
+    ECBlockInputStream ecBlockInputStream = new ECBlockInputStream(
+        ec10And4RepConfig, blockInfo, false, null, null, null);
+
+    ECBlockInputStreamFactory mockStreamFactory =
+        mock(ECBlockInputStreamFactory.class);
+    when(mockStreamFactory.create(anyBoolean(), anyList(), any(), any(),
+        anyBoolean(), any(), any())).thenReturn(ecBlockInputStream);
+
+    try (KeyInputStream kis = new KeyInputStream()) {
+      ECBlockInputStreamProxy streamProxy = new ECBlockInputStreamProxy(
+          ec10And4RepConfig, blockInfo, true, null, null, mockStreamFactory);
+      ECBlockInputStreamProxy spyEcBlockInputStream = spy(streamProxy);
+      byte[] buf = new byte[100];
+      // spy the read(ByteBuffer) method is ok since issue HDDS-6319
+      // happens in read(byte[], off, len)
+      doReturn(buf.length).when(spyEcBlockInputStream)
+          .read(any(ByteBuffer.class));
+
+      kis.addStream(spyEcBlockInputStream);
+
+      int readBytes = kis.read(buf, 0, 100);
+      Assert.assertEquals(100, readBytes);
+    }
+  }
+
+  private OmKeyLocationInfo createBlockInfo(ReplicationConfig repConf,
+      int nodeCount, long blockLength) {
+    Map<DatanodeDetails, Integer> dnMap = new HashMap<>();
+    for (int i = 0; i < nodeCount; i++) {
+      dnMap.put(randomDatanodeDetails(), i + 1);
+    }
+
+    Pipeline pipeline = Pipeline.newBuilder()
+        .setState(Pipeline.PipelineState.CLOSED)
+        .setId(PipelineID.randomId())
+        .setNodes(new ArrayList<>(dnMap.keySet()))
+        .setReplicaIndexes(dnMap)
+        .setReplicationConfig(repConf)
+        .build();
+
+    OmKeyLocationInfo blockInfo = new OmKeyLocationInfo.Builder()
+        .setBlockID(new BlockID(1, 1))
+        .setLength(blockLength)
+        .setOffset(0)
+        .setPipeline(pipeline)
+        .setPartNumber(0)
+        .build();
+    return blockInfo;
+  }
+
+  private DatanodeDetails randomDatanodeDetails() {

Review comment:
       Oh, thanks for the suggestion, let's use the existing mock class.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] sodonnel commented on a change in pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

sodonnel commented on a change in pull request #3094:
URL: https://github.com/apache/ozone/pull/3094#discussion_r811940081



##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/io/TestKeyInputStreamEC.java
##########
@@ -0,0 +1,124 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.client.io;
+
+import org.apache.hadoop.hdds.client.BlockID;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.client.ReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
+import org.apache.hadoop.hdds.scm.pipeline.PipelineID;
+import org.apache.hadoop.ozone.om.helpers.OmKeyLocationInfo;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.UUID;
+
+import static org.apache.hadoop.ozone.OzoneConsts.MB;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.Mockito.*;
+
+/**
+ * Test KeyInputStream with underlying ECBlockInputStream.
+ */
+public class TestKeyInputStreamEC {
+
+  @Test

Review comment:
       For this test, I wonder if it would be better if we avoiding manually adding streams into the KeyInputStream, and just pass the correct objects into `KeyInputStream.getFromOmKeyInfo` to make the problem appear. For example, I was able to create the test like this:
   
   ```
   public class TestKeyInputStreamEC {
   
     @Test
     public void testReadAgainstLargeBlockGroup() throws IOException {
       int dataBlocks = 10;
       int parityBlocks = 4;
       ECReplicationConfig ec10And4RepConfig = new ECReplicationConfig(dataBlocks,
           parityBlocks, ECReplicationConfig.EcCodec.RS, (int)(1 * MB));
       // default blockSize of 256MB with EC 10+4 makes a large block group
       long blockSize = 256 * MB;
       long blockLength = blockSize * dataBlocks;
       OmKeyInfo keyInfo = createOmKeyInfo(ec10And4RepConfig,
           dataBlocks + parityBlocks, blockLength);
   
       BlockExtendedInputStream inputStream =
           new ECStreamTestUtil.TestBlockInputStream(new BlockID(1, 1),
               blockLength, ByteBuffer.allocate(100));
   
       BlockInputStreamFactory mockStreamFactory =
           mock(BlockInputStreamFactory.class);
       when(mockStreamFactory.create(any(), any(), any(), any(),
           anyBoolean(), any(), any())).thenReturn(inputStream);
   
       try (LengthInputStream is = KeyInputStream.getFromOmKeyInfo(keyInfo, null, true,  null, mockStreamFactory)) {
         byte[] buf = new byte[100];
         int readBytes = is.read(buf, 0, 100);
       }
     }
   
     private OmKeyInfo createOmKeyInfo(ReplicationConfig repConf,
         int nodeCount, long blockLength) {
       Map<DatanodeDetails, Integer> dnMap = new HashMap<>();
       for (int i = 0; i < nodeCount; i++) {
         dnMap.put(MockDatanodeDetails.randomDatanodeDetails(), i + 1);
       }
   
       Pipeline pipeline = Pipeline.newBuilder()
           .setState(Pipeline.PipelineState.CLOSED)
           .setId(PipelineID.randomId())
           .setNodes(new ArrayList<>(dnMap.keySet()))
           .setReplicaIndexes(dnMap)
           .setReplicationConfig(repConf)
           .build();
   
       OmKeyLocationInfo blockInfo = new OmKeyLocationInfo.Builder()
           .setBlockID(new BlockID(1, 1))
           .setLength(blockLength)
           .setOffset(0)
           .setPipeline(pipeline)
           .setPartNumber(0)
           .build();
   
       List<OmKeyLocationInfo> locations = new ArrayList<>();
       locations.add(blockInfo);
       return new OmKeyInfo.Builder()
           .setBucketName("bucket")
           .setVolumeName("volume")
           .setDataSize(blockLength)
           .setKeyName("someKey")
           .setReplicationConfig(repConf)
           .addOmKeyLocationInfoGroup(new OmKeyLocationInfoGroup(0, locations))
           .build();
     }
   }
   ```
   
   I did need to make a couple of small changes to the test util to make this work:
   
   ```
   --- a/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/read/ECStreamTestUtil.java
   +++ b/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/read/ECStreamTestUtil.java
   @@ -276,7 +276,7 @@ public BlockExtendedInputStream create(ReplicationConfig repConfig,
        private int ecReplicaIndex = 0;
        private static final byte EOF = -1;
    
   -    TestBlockInputStream(BlockID blockId, long blockLen, ByteBuffer data) {
   +    public TestBlockInputStream(BlockID blockId, long blockLen, ByteBuffer data) {
          this(blockId, blockLen, data, 0);
        }
    
   @@ -325,7 +325,7 @@ public long getLength() {
    
        @Override
        public long getRemaining() {
   -      return data.remaining();
   +      return getLength() - getPos();
        }
    
        @Override
   @@ -342,7 +342,7 @@ public int read(ByteBuffer buf) throws IOException {
          if (getRemaining() == 0) {
            return EOF;
          }
   -      int toRead = Math.min(buf.remaining(), (int)getRemaining());
   +      int toRead = (int)Math.min(buf.remaining(), getRemaining());
          for (int i = 0; i < toRead; i++) {
            if (shouldError && data.position() >= shouldErrorPosition) {
              throwError();
   ```
   
   I think this test is better as it uses the normal interface to create the KeyInputStream, and we avoid any `@VisibleForTesting` on KeyInputStream.
   
   What do you think?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] umamaheswararao commented on a change in pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

umamaheswararao commented on a change in pull request #3094:
URL: https://github.com/apache/ozone/pull/3094#discussion_r807566778



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyInputStream.java
##########
@@ -264,7 +264,7 @@ protected synchronized int readWithStrategy(ByteReaderStrategy strategy)
 
       // Get the current blockStream and read data from it
       BlockExtendedInputStream current = blockStreams.get(blockIndex);
-      int numBytesToRead = Math.min(buffLen, (int)current.getRemaining());
+      int numBytesToRead = (int)Math.min(buffLen, current.getRemaining());

Review comment:
       Above line needs format based on latest checkstyle rules. Please rebase based on latest EC branch, which got updated with latest rules and it might flag warning here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org

[GitHub] [ozone] guihecheng commented on a change in pull request #3094: HDDS-6319. EC: Fix read big file failure with EC policy 10+4.

Posted by GitBox <gi...@apache.org>.

guihecheng commented on a change in pull request #3094:
URL: https://github.com/apache/ozone/pull/3094#discussion_r812504649



##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/io/TestKeyInputStreamEC.java
##########
@@ -0,0 +1,124 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.client.io;
+
+import org.apache.hadoop.hdds.client.BlockID;
+import org.apache.hadoop.hdds.client.ECReplicationConfig;
+import org.apache.hadoop.hdds.client.ReplicationConfig;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
+import org.apache.hadoop.hdds.scm.pipeline.PipelineID;
+import org.apache.hadoop.ozone.om.helpers.OmKeyLocationInfo;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.UUID;
+
+import static org.apache.hadoop.ozone.OzoneConsts.MB;
+import static org.mockito.ArgumentMatchers.any;
+import static org.mockito.Mockito.*;
+
+/**
+ * Test KeyInputStream with underlying ECBlockInputStream.
+ */
+public class TestKeyInputStreamEC {
+
+  @Test

Review comment:
       Oh, I agree it is better to create a KeyInputStream object from a normal interface instead of injecting it with a newly added test interface.
   So, in order to utilize ECStreamTestUtil.TestBlockInputStream, let's move this test file to the same directory of ECStreamTestUtil class.
   Thanks Stephen.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org