Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2021/02/04 14:58:22 UTC

[GitHub] [hadoop] mukund-thakur commented on a change in pull request #2646: HADOOP-17038 Support disabling buffered reads in ABFS positional reads.

mukund-thakur commented on a change in pull request #2646:
URL: https://github.com/apache/hadoop/pull/2646#discussion_r570049628



##########
File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/ConfigurationKeys.java
##########
@@ -180,5 +180,6 @@ public static String accountProperty(String property, String account) {
   /** Key for Local Group to Service Group file location. */
   public static final String FS_AZURE_LOCAL_GROUP_SG_MAPPING_FILE_PATH = "fs.azure.identity.transformer.local.service.group.mapping.file.path";
 

Review comment:
       Javadoc, please.
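       For example, something along these lines (only a sketch; the key name is
       taken from the test code in this PR, and the value string is an assumption,
       so adjust the wording as needed):

       ```java
         /**
          * Key to disable buffered reads in AbfsInputStream positional reads.
          * When set to true, a pread fetches only the requested range from the
          * remote store instead of filling the stream's internal buffer.
          * Value: {@value}.
          */
         public static final String FS_AZURE_BUFFERED_PREAD_DISABLE =
             "fs.azure.buffered.pread.disable";
       ```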

##########
File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsPread.java
##########
@@ -0,0 +1,128 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.concurrent.ExecutionException;
+
+import org.junit.Test;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FutureDataInputStreamBuilder;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.azurebfs.AbstractAbfsIntegrationTest;
+import org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys;
+import org.apache.hadoop.fs.contract.ContractTestUtils;
+
+public class ITestAbfsPread extends AbstractAbfsIntegrationTest {
+
+  public ITestAbfsPread() throws Exception {
+  }
+
+  @Test
+  public void testPread() throws IOException {
+    describe("Testing preads in AbfsInputStream");
+    Path dest = path("ITestAbfsPread");

Review comment:
       Nit: use getMethodName() in the path.
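       For instance (a one-liner, assuming the getMethodName() helper from the
       test base class is available here):

       ```java
           Path dest = path(getMethodName());
       ```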

##########
File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsPread.java
##########
@@ -0,0 +1,128 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.concurrent.ExecutionException;
+
+import org.junit.Test;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FutureDataInputStreamBuilder;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.azurebfs.AbstractAbfsIntegrationTest;
+import org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys;
+import org.apache.hadoop.fs.contract.ContractTestUtils;
+
+public class ITestAbfsPread extends AbstractAbfsIntegrationTest {
+
+  public ITestAbfsPread() throws Exception {
+  }
+
+  @Test
+  public void testPread() throws IOException {
+    describe("Testing preads in AbfsInputStream");
+    Path dest = path("ITestAbfsPread");
+
+    int dataSize = 100;
+    byte[] data = ContractTestUtils.dataset(dataSize, 'a', 26);
+    ContractTestUtils.writeDataset(getFileSystem(), dest, data, data.length,
+        dataSize, true);
+    int bytesToRead = 10;
+    try (FSDataInputStream inputStream = getFileSystem().open(dest)) {
+      assertTrue(
+          "unexpected stream type "
+              + inputStream.getWrappedStream().getClass().getSimpleName(),
+          inputStream.getWrappedStream() instanceof AbfsInputStream);
+      byte[] readBuffer = new byte[bytesToRead];
+      int pos = 0;
+      assertEquals(
+          "AbfsInputStream#read did not read the correct number of bytes",
+          bytesToRead, inputStream.read(pos, readBuffer, 0, bytesToRead));
+      assertTrue("AbfsInputStream#read did not read the correct bytes",
+          Arrays.equals(Arrays.copyOfRange(data, pos, pos + bytesToRead),

Review comment:
       See if we can use existing methods such as compareByteArrays in ContractTestUtils for data verification.
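       For example, something like this sketch; compareByteArrays in
       ContractTestUtils fails the test itself on mismatch, so the surrounding
       assertTrue can go away:

       ```java
             ContractTestUtils.compareByteArrays(
                 Arrays.copyOfRange(data, pos, pos + bytesToRead), readBuffer,
                 bytesToRead);
       ```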

##########
File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsPread.java
##########
@@ -0,0 +1,128 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.concurrent.ExecutionException;
+
+import org.junit.Test;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FutureDataInputStreamBuilder;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.azurebfs.AbstractAbfsIntegrationTest;
+import org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys;
+import org.apache.hadoop.fs.contract.ContractTestUtils;
+
+public class ITestAbfsPread extends AbstractAbfsIntegrationTest {
+
+  public ITestAbfsPread() throws Exception {
+  }
+
+  @Test
+  public void testPread() throws IOException {
+    describe("Testing preads in AbfsInputStream");
+    Path dest = path("ITestAbfsPread");
+
+    int dataSize = 100;
+    byte[] data = ContractTestUtils.dataset(dataSize, 'a', 26);
+    ContractTestUtils.writeDataset(getFileSystem(), dest, data, data.length,
+        dataSize, true);
+    int bytesToRead = 10;
+    try (FSDataInputStream inputStream = getFileSystem().open(dest)) {
+      assertTrue(
+          "unexpected stream type "
+              + inputStream.getWrappedStream().getClass().getSimpleName(),
+          inputStream.getWrappedStream() instanceof AbfsInputStream);
+      byte[] readBuffer = new byte[bytesToRead];
+      int pos = 0;
+      assertEquals(
+          "AbfsInputStream#read did not read the correct number of bytes",
+          bytesToRead, inputStream.read(pos, readBuffer, 0, bytesToRead));
+      assertTrue("AbfsInputStream#read did not read the correct bytes",
+          Arrays.equals(Arrays.copyOfRange(data, pos, pos + bytesToRead),
+              readBuffer));
+      // Read only 10 bytes from offset 0. But by default it will do the
+      // seek-and-read, where the entire 100 bytes get read into the
+      // AbfsInputStream buffer.
+      assertArrayEquals(
+          "AbfsInputStream#read did not read more data into its buffer", data,
+          Arrays.copyOfRange(
+              ((AbfsInputStream) inputStream.getWrappedStream()).getBuffer(), 0,
+              dataSize));
+    }
+    FutureDataInputStreamBuilder builder = getFileSystem().openFile(dest);
+    builder.opt(ConfigurationKeys.FS_AZURE_BUFFERED_PREAD_DISABLE, true);
+    FSDataInputStream inputStream = null;
+    try {
+      inputStream = builder.build().get();
+    } catch (IllegalArgumentException | UnsupportedOperationException
+        | InterruptedException | ExecutionException e) {
+      throw new IOException(e);
+    }
+    assertNotNull(inputStream);
+    try {
+      AbfsInputStream abfsIs = (AbfsInputStream) inputStream.getWrappedStream();
+      byte[] readBuffer = new byte[bytesToRead];
+      int pos = 10;
+      assertEquals(
+          "AbfsInputStream#read did not read the correct number of bytes",
+          bytesToRead, inputStream.read(pos, readBuffer, 0, bytesToRead));
+      assertTrue("AbfsInputStream#read did not read the correct bytes",
+          Arrays.equals(Arrays.copyOfRange(data, pos, pos + bytesToRead),
+              readBuffer));
+      // Read only 10 bytes from offset 10. This time, as buffered pread is
+      // disabled, it will only read the exact bytes as requested and no data
+      // will get read into the AbfsInputStream#buffer. In fact, the buffer
+      // won't even get initialized.
+      assertNull("AbfsInputStream pread caused the internal buffer creation",
+          abfsIs.getBuffer());
+      // Now do a seek and a read so that the internal buffer gets created
+      inputStream.seek(0);
+      inputStream.read(readBuffer);
+      // This read would have fetched all 100 bytes into the internal buffer.
+      assertArrayEquals(
+          "AbfsInputStream#read did not read more data into its buffer", data,
+          Arrays.copyOfRange(
+              ((AbfsInputStream) inputStream.getWrappedStream()).getBuffer(), 0,
+              dataSize));
+      // Now do a positional read again and make sure no extra data is fetched.
+      resetBuffer(abfsIs.getBuffer());
+      pos = 0;
+      assertEquals(
+          "AbfsInputStream#read did not read the correct number of bytes",
+          bytesToRead, inputStream.read(pos, readBuffer, 0, bytesToRead));
+      assertTrue("AbfsInputStream#read did not read the correct bytes",
+          Arrays.equals(Arrays.copyOfRange(data, pos, pos + bytesToRead),
+              readBuffer));
+      assertFalse(
+          "AbfsInputStream#read read more data into its buffer than expected",
+          Arrays.equals(data,
+              Arrays.copyOfRange(abfsIs.getBuffer(), 0, dataSize)));
+    } finally {
+      inputStream.close();
+    }
+  }
+

Review comment:
       This test has grown big. I would suggest breaking it into two: one with the config FS_AZURE_BUFFERED_PREAD_DISABLE set to true and one without it set. What do you say?
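       A possible shape for the split (just a skeleton; writeTestData and
       verifyPread are hypothetical shared helpers the common setup would move
       into, and awaitFuture comes from FutureIOSupport):

       ```java
         @Test
         public void testBufferedPread() throws IOException {
           // Default behaviour: pread goes through seek-and-read and fills
           // the AbfsInputStream internal buffer.
           Path dest = path(getMethodName());
           byte[] data = writeTestData(dest);
           try (FSDataInputStream in = getFileSystem().open(dest)) {
             verifyPread(in, data, true /* expect buffered */);
           }
         }

         @Test
         public void testPreadWithBufferedReadDisabled() throws IOException {
           // FS_AZURE_BUFFERED_PREAD_DISABLE set: pread reads only the
           // requested range and leaves the internal buffer untouched.
           Path dest = path(getMethodName());
           byte[] data = writeTestData(dest);
           FutureDataInputStreamBuilder builder = getFileSystem().openFile(dest)
               .opt(ConfigurationKeys.FS_AZURE_BUFFERED_PREAD_DISABLE, true);
           try (FSDataInputStream in =
               FutureIOSupport.awaitFuture(builder.build())) {
             verifyPread(in, data, false /* expect buffered */);
           }
         }
       ```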

##########
File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsPread.java
##########
@@ -0,0 +1,128 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.concurrent.ExecutionException;
+
+import org.junit.Test;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FutureDataInputStreamBuilder;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.azurebfs.AbstractAbfsIntegrationTest;
+import org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys;
+import org.apache.hadoop.fs.contract.ContractTestUtils;
+
+public class ITestAbfsPread extends AbstractAbfsIntegrationTest {
+
+  public ITestAbfsPread() throws Exception {
+  }
+
+  @Test
+  public void testPread() throws IOException {
+    describe("Testing preads in AbfsInputStream");
+    Path dest = path("ITestAbfsPread");
+
+    int dataSize = 100;
+    byte[] data = ContractTestUtils.dataset(dataSize, 'a', 26);
+    ContractTestUtils.writeDataset(getFileSystem(), dest, data, data.length,
+        dataSize, true);
+    int bytesToRead = 10;
+    try (FSDataInputStream inputStream = getFileSystem().open(dest)) {
+      assertTrue(
+          "unexpected stream type "
+              + inputStream.getWrappedStream().getClass().getSimpleName(),
+          inputStream.getWrappedStream() instanceof AbfsInputStream);
+      byte[] readBuffer = new byte[bytesToRead];
+      int pos = 0;
+      assertEquals(
+          "AbfsInputStream#read did not read the correct number of bytes",
+          bytesToRead, inputStream.read(pos, readBuffer, 0, bytesToRead));
+      assertTrue("AbfsInputStream#read did not read the correct bytes",
+          Arrays.equals(Arrays.copyOfRange(data, pos, pos + bytesToRead),
+              readBuffer));
+      // Read only 10 bytes from offset 0. But by default it will do the
+      // seek-and-read, where the entire 100 bytes get read into the
+      // AbfsInputStream buffer.
+      assertArrayEquals(
+          "AbfsInputStream#read did not read more data into its buffer", data,
+          Arrays.copyOfRange(
+              ((AbfsInputStream) inputStream.getWrappedStream()).getBuffer(), 0,
+              dataSize));
+    }
+    FutureDataInputStreamBuilder builder = getFileSystem().openFile(dest);
+    builder.opt(ConfigurationKeys.FS_AZURE_BUFFERED_PREAD_DISABLE, true);
+    FSDataInputStream inputStream = null;
+    try {
+      inputStream = builder.build().get();
+    } catch (IllegalArgumentException | UnsupportedOperationException
+        | InterruptedException | ExecutionException e) {
+      throw new IOException(e);
+    }
+    assertNotNull(inputStream);

Review comment:
       Add an assert message, please.
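       For instance (one-line sketch; the message text is illustrative):

       ```java
           assertNotNull("Null input stream returned by openFile(" + dest + ")",
               inputStream);
       ```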

##########
File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsPread.java
##########
@@ -0,0 +1,128 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.concurrent.ExecutionException;
+
+import org.junit.Test;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FutureDataInputStreamBuilder;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.azurebfs.AbstractAbfsIntegrationTest;
+import org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys;
+import org.apache.hadoop.fs.contract.ContractTestUtils;
+
+public class ITestAbfsPread extends AbstractAbfsIntegrationTest {
+
+  public ITestAbfsPread() throws Exception {
+  }
+
+  @Test
+  public void testPread() throws IOException {
+    describe("Testing preads in AbfsInputStream");
+    Path dest = path("ITestAbfsPread");
+
+    int dataSize = 100;
+    byte[] data = ContractTestUtils.dataset(dataSize, 'a', 26);
+    ContractTestUtils.writeDataset(getFileSystem(), dest, data, data.length,
+        dataSize, true);
+    int bytesToRead = 10;
+    try (FSDataInputStream inputStream = getFileSystem().open(dest)) {
+      assertTrue(
+          "unexpected stream type "
+              + inputStream.getWrappedStream().getClass().getSimpleName(),
+          inputStream.getWrappedStream() instanceof AbfsInputStream);
+      byte[] readBuffer = new byte[bytesToRead];
+      int pos = 0;
+      assertEquals(
+          "AbfsInputStream#read did not read the correct number of bytes",
+          bytesToRead, inputStream.read(pos, readBuffer, 0, bytesToRead));
+      assertTrue("AbfsInputStream#read did not read the correct bytes",
+          Arrays.equals(Arrays.copyOfRange(data, pos, pos + bytesToRead),
+              readBuffer));
+      // Read only 10 bytes from offset 0. But by default it will do the
+      // seek-and-read, where the entire 100 bytes get read into the
+      // AbfsInputStream buffer.
+      assertArrayEquals(
+          "AbfsInputStream#read did not read more data into its buffer", data,
+          Arrays.copyOfRange(
+              ((AbfsInputStream) inputStream.getWrappedStream()).getBuffer(), 0,
+              dataSize));
+    }
+    FutureDataInputStreamBuilder builder = getFileSystem().openFile(dest);
+    builder.opt(ConfigurationKeys.FS_AZURE_BUFFERED_PREAD_DISABLE, true);
+    FSDataInputStream inputStream = null;
+    try {
+      inputStream = builder.build().get();
+    } catch (IllegalArgumentException | UnsupportedOperationException
+        | InterruptedException | ExecutionException e) {
+      throw new IOException(e);
+    }
+    assertNotNull(inputStream);
+    try {
+      AbfsInputStream abfsIs = (AbfsInputStream) inputStream.getWrappedStream();
+      byte[] readBuffer = new byte[bytesToRead];
+      int pos = 10;
+      assertEquals(
+          "AbfsInputStream#read did not read the correct number of bytes",
+          bytesToRead, inputStream.read(pos, readBuffer, 0, bytesToRead));
+      assertTrue("AbfsInputStream#read did not read the correct bytes",
+          Arrays.equals(Arrays.copyOfRange(data, pos, pos + bytesToRead),

Review comment:
       Same as above: try reusing the existing code for data matching.

##########
File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsPread.java
##########
@@ -0,0 +1,128 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.concurrent.ExecutionException;
+
+import org.junit.Test;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FutureDataInputStreamBuilder;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.azurebfs.AbstractAbfsIntegrationTest;
+import org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys;
+import org.apache.hadoop.fs.contract.ContractTestUtils;
+
+public class ITestAbfsPread extends AbstractAbfsIntegrationTest {
+
+  public ITestAbfsPread() throws Exception {
+  }
+
+  @Test
+  public void testPread() throws IOException {
+    describe("Testing preads in AbfsInputStream");
+    Path dest = path("ITestAbfsPread");
+
+    int dataSize = 100;
+    byte[] data = ContractTestUtils.dataset(dataSize, 'a', 26);
+    ContractTestUtils.writeDataset(getFileSystem(), dest, data, data.length,
+        dataSize, true);
+    int bytesToRead = 10;
+    try (FSDataInputStream inputStream = getFileSystem().open(dest)) {
+      assertTrue(
+          "unexpected stream type "
+              + inputStream.getWrappedStream().getClass().getSimpleName(),
+          inputStream.getWrappedStream() instanceof AbfsInputStream);
+      byte[] readBuffer = new byte[bytesToRead];
+      int pos = 0;
+      assertEquals(
+          "AbfsInputStream#read did not read the correct number of bytes",
+          bytesToRead, inputStream.read(pos, readBuffer, 0, bytesToRead));
+      assertTrue("AbfsInputStream#read did not read the correct bytes",
+          Arrays.equals(Arrays.copyOfRange(data, pos, pos + bytesToRead),
+              readBuffer));
+      // Read only 10 bytes from offset 0. But by default it will do the
+      // seek-and-read, where the entire 100 bytes get read into the
+      // AbfsInputStream buffer.
+      assertArrayEquals(
+          "AbfsInputStream#read did not read more data into its buffer", data,
+          Arrays.copyOfRange(
+              ((AbfsInputStream) inputStream.getWrappedStream()).getBuffer(), 0,
+              dataSize));
+    }
+    FutureDataInputStreamBuilder builder = getFileSystem().openFile(dest);
+    builder.opt(ConfigurationKeys.FS_AZURE_BUFFERED_PREAD_DISABLE, true);
+    FSDataInputStream inputStream = null;
+    try {
+      inputStream = builder.build().get();
+    } catch (IllegalArgumentException | UnsupportedOperationException
+        | InterruptedException | ExecutionException e) {
+      throw new IOException(e);
+    }
+    assertNotNull(inputStream);
+    try {
+      AbfsInputStream abfsIs = (AbfsInputStream) inputStream.getWrappedStream();
+      byte[] readBuffer = new byte[bytesToRead];
+      int pos = 10;
+      assertEquals(
+          "AbfsInputStream#read did not read the correct number of bytes",
+          bytesToRead, inputStream.read(pos, readBuffer, 0, bytesToRead));
+      assertTrue("AbfsInputStream#read did not read the correct bytes",
+          Arrays.equals(Arrays.copyOfRange(data, pos, pos + bytesToRead),
+              readBuffer));
+      // Read only 10 bytes from offset 10. This time, as buffered pread is
+      // disabled, it will only read the exact bytes as requested and no data
+      // will get read into the AbfsInputStream#buffer. In fact, the buffer
+      // won't even get initialized.
+      assertNull("AbfsInputStream pread caused the internal buffer creation",

Review comment:
       We can also assert on the return value of the read() call; it should be exactly 10.
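       One way to write that (a sketch; bytesToRead is 10 here):

       ```java
           int bytesRead = inputStream.read(pos, readBuffer, 0, bytesToRead);
           assertEquals("pread should return exactly the requested byte count",
               bytesToRead, bytesRead);
       ```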

##########
File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsPread.java
##########
@@ -0,0 +1,128 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.concurrent.ExecutionException;
+
+import org.junit.Test;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FutureDataInputStreamBuilder;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.azurebfs.AbstractAbfsIntegrationTest;
+import org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys;
+import org.apache.hadoop.fs.contract.ContractTestUtils;
+
+public class ITestAbfsPread extends AbstractAbfsIntegrationTest {

Review comment:
       ITestAbfsPositionedRead would be a clearer name. What do you think?

##########
File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsPread.java
##########
@@ -0,0 +1,128 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.concurrent.ExecutionException;
+
+import org.junit.Test;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FutureDataInputStreamBuilder;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.azurebfs.AbstractAbfsIntegrationTest;
+import org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys;
+import org.apache.hadoop.fs.contract.ContractTestUtils;
+
+public class ITestAbfsPread extends AbstractAbfsIntegrationTest {
+
+  public ITestAbfsPread() throws Exception {
+  }
+
+  @Test
+  public void testPread() throws IOException {
+    describe("Testing preads in AbfsInputStream");
+    Path dest = path("ITestAbfsPread");
+
+    int dataSize = 100;
+    byte[] data = ContractTestUtils.dataset(dataSize, 'a', 26);
+    ContractTestUtils.writeDataset(getFileSystem(), dest, data, data.length,
+        dataSize, true);
+    int bytesToRead = 10;
+    try (FSDataInputStream inputStream = getFileSystem().open(dest)) {
+      assertTrue(
+          "unexpected stream type "
+              + inputStream.getWrappedStream().getClass().getSimpleName(),
+          inputStream.getWrappedStream() instanceof AbfsInputStream);
+      byte[] readBuffer = new byte[bytesToRead];
+      int pos = 0;
+      assertEquals(
+          "AbfsInputStream#read did not read the correct number of bytes",
+          bytesToRead, inputStream.read(pos, readBuffer, 0, bytesToRead));
+      assertTrue("AbfsInputStream#read did not read the correct bytes",
+          Arrays.equals(Arrays.copyOfRange(data, pos, pos + bytesToRead),
+              readBuffer));
+      // Read only 10 bytes from offset 0. But by default it will do the
+      // seek-and-read, where the entire 100 bytes get read into the
+      // AbfsInputStream buffer.
+      assertArrayEquals(
+          "AbfsInputStream#read did not read more data into its buffer", data,
+          Arrays.copyOfRange(
+              ((AbfsInputStream) inputStream.getWrappedStream()).getBuffer(), 0,
+              dataSize));
+    }
+    FutureDataInputStreamBuilder builder = getFileSystem().openFile(dest);
+    builder.opt(ConfigurationKeys.FS_AZURE_BUFFERED_PREAD_DISABLE, true);
+    FSDataInputStream inputStream = null;
+    try {
+      inputStream = builder.build().get();
+    } catch (IllegalArgumentException | UnsupportedOperationException
+        | InterruptedException | ExecutionException e) {
+      throw new IOException(e);

Review comment:
       An error message would be helpful here.
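       For example (a sketch; the message text is illustrative):

       ```java
           } catch (IllegalArgumentException | UnsupportedOperationException
               | InterruptedException | ExecutionException e) {
             throw new IOException("Failed to open " + dest
                 + " with buffered pread disabled", e);
           }
       ```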

##########
File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
##########
@@ -605,7 +607,9 @@ public void createDirectory(final Path path, final FsPermission permission, fina
     }
   }
 
-  public AbfsInputStream openFileForRead(final Path path, final FileSystem.Statistics statistics)
+  public AbfsInputStream openFileForRead(final Path path,

Review comment:
       Can we use method overloading, like
       `public AbfsInputStream openFileForRead(final Path path,
           final FileSystem.Statistics statistics)
           throws AzureBlobFileSystemException {
         return openFileForRead(path, Optional.empty(), statistics);
       }`
       so as to avoid having to change call sites everywhere, especially in the tests?

##########
File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsInputStreamStatistics.java
##########
@@ -399,6 +406,89 @@ public void testActionHttpGetRequest() throws IOException {
     }
   }
 
+  @Test
+  public void testPread() throws IOException {

Review comment:
       I don't see much benefit in duplicating the test here. We can just add the stats assertions to the newly added tests.
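       For instance, in the non-buffered test (a sketch; it assumes the
       getStreamStatistics() accessor and the counter getters that the existing
       tests in this class use):

       ```java
           AbfsInputStreamStatisticsImpl stats =
               (AbfsInputStreamStatisticsImpl) abfsIs.getStreamStatistics();
           // With buffered pread disabled, only the requested range should
           // have been read from the remote store.
           assertEquals("unexpected bytes read", bytesToRead,
               stats.getBytesRead());
       ```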




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org