You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2022/06/21 01:34:16 UTC

[GitHub] [hadoop] mukund-thakur opened a new pull request, #4476: HADOOP-18103. High performance vectored read API in Hadoop

mukund-thakur opened a new pull request, #4476:
URL: https://github.com/apache/hadoop/pull/4476

   
   
   ### Description of PR
   
   Add support for multiple ranged vectored read api in PositionedReadable. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers especially object stores implementation.
   
   
   ### How was this patch tested?
   Added new tests. Ran older test suites in ap-south-1. All good.
   
   ### For code changes:
   
   - [ ] Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] mukund-thakur commented on pull request #4476: HADOOP-18103. High performance vectored read API in Hadoop

Posted by GitBox <gi...@apache.org>.
mukund-thakur commented on PR #4476:
URL: https://github.com/apache/hadoop/pull/4476#issuecomment-1162422291

   Other test failures seems unrelated to this 
   `OutOfMemoryError: unable to create new native thread`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] steveloughran closed pull request #4476: HADOOP-18103. High performance vectored read API in Hadoop

Posted by GitBox <gi...@apache.org>.
steveloughran closed pull request #4476: HADOOP-18103. High performance vectored read API in Hadoop
URL: https://github.com/apache/hadoop/pull/4476


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] mukund-thakur commented on pull request #4476: HADOOP-18103. High performance vectored read API in Hadoop

Posted by GitBox <gi...@apache.org>.
mukund-thakur commented on PR #4476:
URL: https://github.com/apache/hadoop/pull/4476#issuecomment-1162424160

   `./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:614:    final int maxReadSizeVectored = (int) longBytesOption(conf, AWS_S3_VECTOR_READS_MAX_MERGED_READ_SIZE,: Line is longer than 100 characters (found 105). [LineLength]
   `
   This is the only checkstyle which is new and can be fixed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] mukund-thakur commented on pull request #4476: HADOOP-18103. High performance vectored read API in Hadoop

Posted by GitBox <gi...@apache.org>.
mukund-thakur commented on PR #4476:
URL: https://github.com/apache/hadoop/pull/4476#issuecomment-1161059225

   @steveloughran , @mehakmeet  So will go ahead and merge this if Yetus is okay?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #4476: HADOOP-18103. High performance vectored read API in Hadoop

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4476:
URL: https://github.com/apache/hadoop/pull/4476#issuecomment-1162357662

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +0 :ok: |  shelldocs  |   0m  1s |  |  Shelldocs was not available.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 12 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  16m  2s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 10s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m  3s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 32s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 27s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |  18m 47s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   8m 13s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   7m 23s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +0 :ok: |  spotbugs  |   0m 35s |  |  branch/hadoop-project no spotbugs output file (spotbugsXml.xml)  |
   | +1 :green_heart: |  shadedclient  |  50m 33s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  51m  2s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 58s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  32m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 31s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | -1 :x: |  javac  |  22m 31s | [/results-compile-javac-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4476/1/artifact/out/results-compile-javac-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 3 new + 2875 unchanged - 0 fixed = 2878 total (was 2875)  |
   | +1 :green_heart: |  compile  |  20m 29s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  javac  |  20m 29s | [/results-compile-javac-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4476/1/artifact/out/results-compile-javac-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 3 new + 2671 unchanged - 0 fixed = 2674 total (was 2671)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   4m 11s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4476/1/artifact/out/results-checkstyle-root.txt) |  root: The patch generated 3 new + 90 unchanged - 3 fixed = 93 total (was 93)  |
   | +1 :green_heart: |  mvnsite  |  18m 25s |  |  the patch passed  |
   | +1 :green_heart: |  shellcheck  |   0m  0s |  |  No new issues.  |
   | +1 :green_heart: |  javadoc  |   8m  4s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   7m 29s |  |  root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 0 new + 2328 unchanged - 1 fixed = 2328 total (was 2329)  |
   | +0 :ok: |  spotbugs  |   0m 33s |  |  hadoop-project has no data from spotbugs  |
   | +1 :green_heart: |  shadedclient  |  53m 13s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 785m 22s | [/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4476/1/artifact/out/patch-unit-root.txt) |  root in the patch failed.  |
   | +0 :ok: |  asflicense  |   2m 13s |  |  ASF License check generated no output?  |
   |  |   | 1166m 24s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.cli.TestHDFSCLI |
   |   | hadoop.tools.dynamometer.workloadgenerator.TestWorkloadGenerator |
   |   | hadoop.tools.TestHadoopArchives |
   |   | hadoop.streaming.TestStreamingOutputOnlyKeys |
   |   | hadoop.yarn.server.router.webapp.TestRouterWebServicesREST |
   |   | hadoop.mapred.TestLocalDistributedCacheManager |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4476/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4476 |
   | Optional Tests | dupname asflicense codespell detsecrets shellcheck shelldocs compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle markdownlint xmllint |
   | uname | Linux 62db4b268209 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 645bcd648022b9a8ed51bb29898cbdf19a8deaf7 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4476/1/testReport/ |
   | Max. process+thread count | 2872 (vs. ulimit of 5500) |
   | modules | C: hadoop-project hadoop-common-project/hadoop-common hadoop-common-project hadoop-tools/hadoop-aws hadoop-tools/hadoop-benchmark hadoop-tools . U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4476/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 shellcheck=0.7.0 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] steveloughran commented on pull request #4476: HADOOP-18103. High performance vectored read API in Hadoop

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #4476:
URL: https://github.com/apache/hadoop/pull/4476#issuecomment-1162369940

   as discussed in a call, I think should be merged as a merge commit of the chain, rather than through the github IDE. I will help with this; it can be done through the cli or sourcetree.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] steveloughran commented on pull request #4476: HADOOP-18103. High performance vectored read API in Hadoop

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #4476:
URL: https://github.com/apache/hadoop/pull/4476#issuecomment-1163433801

   merged manually, closing work. great work mukund & owen, once we get this picked up it is going to deliver significant speedups. get those apachecon demos ready!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] steveloughran commented on pull request #4476: HADOOP-18103. High performance vectored read API in Hadoop

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #4476:
URL: https://github.com/apache/hadoop/pull/4476#issuecomment-1162371141

   test failure hadoop.mapred.TestLocalDistributedCacheManager is mine; TestHDFSCLI is known


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org