Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2020/03/28 05:06:24 UTC

[GitHub] [hadoop] dengzhhu653 opened a new pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint

dengzhhu653 opened a new pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint
URL: https://github.com/apache/hadoop/pull/1920
 
 
   The LocatedFileStatus returned by fs.listLocatedStatus contains LocatedBlock references, which are not needed when MapReduce (MR) submits jobs. This patch excludes the LocatedBlock from the LocatedFileStatus, allowing MR to scan more files with a smaller memory footprint.
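
A minimal sketch of the idea, using stand-in classes rather than the Hadoop API. `Location`, `HeavyLocation`, and `shrink` are hypothetical names invented for illustration; they only model a base class whose subclass pins extra data in memory. This is not the patch itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in model only; none of these classes are Hadoop classes. */
final class ShrinkSketch {

  /** Lightweight location: only the fields job submission would read (hypothetical). */
  static class Location {
    final String[] hosts;
    final long offset;
    final long length;

    Location(String[] hosts, long offset, long length) {
      this.hosts = hosts;
      this.offset = offset;
      this.length = length;
    }

    /** Copy constructor: copies the base fields only, dropping any subclass state. */
    Location(Location other) {
      this(other.hosts.clone(), other.offset, other.length);
    }
  }

  /** Heavy subclass that also keeps a large payload reachable (hypothetical). */
  static final class HeavyLocation extends Location {
    final byte[] blockDetails;

    HeavyLocation(String[] hosts, long offset, long length, byte[] blockDetails) {
      super(hosts, offset, length);
      this.blockDetails = blockDetails;
    }
  }

  /** Returns base-class copies so the heavy payloads are no longer referenced. */
  static List<Location> shrink(List<? extends Location> original) {
    List<Location> slim = new ArrayList<>(original.size());
    for (Location loc : original) {
      slim.add(new Location(loc));
    }
    return slim;
  }

  public static void main(String[] args) {
    List<Location> fat = new ArrayList<>();
    fat.add(new HeavyLocation(new String[] {"host1"}, 0L, 128L, new byte[1024]));
    List<Location> slim = shrink(fat);
    // The copy is a plain Location, so this prints false.
    System.out.println(slim.get(0) instanceof HeavyLocation);
  }
}
```

The copy looks like "just making a copy of everything", but because it is built through the base-class constructor, the subclass payload is left behind and can be garbage collected once the original list is dropped.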

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] dengzhhu653 commented on a change in pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint

Posted by GitBox <gi...@apache.org>.
dengzhhu653 commented on a change in pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint
URL: https://github.com/apache/hadoop/pull/1920#discussion_r399768595
 
 

 ##########
 File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java
 ##########
 @@ -364,13 +364,29 @@ protected void addInputPathRecursively(List<FileStatus> result,
         if (stat.isDirectory()) {
           addInputPathRecursively(result, fs, stat.getPath(), inputFilter);
         } else {
-          result.add(stat);
+          result.add(shrinkStatus(stat));
         }
       }
     }
   }
-  
-  
+
+  public static FileStatus shrinkStatus(FileStatus origStat) {
 
 Review comment:
   Thanks for reviewing this! I have added comments to clarify the changes.
   
   
   



[GitHub] [hadoop] dengzhhu653 commented on a change in pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint

Posted by GitBox <gi...@apache.org>.
dengzhhu653 commented on a change in pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint
URL: https://github.com/apache/hadoop/pull/1920#discussion_r400917354
 
 

 ##########
 File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java
 ##########
 @@ -238,6 +238,44 @@ public void testListStatusErrorOnNonExistantDir() throws IOException {
     }
   }
 
+  @Test
+  public void testShrinkStatus() throws IOException {
+    Configuration conf = getConfiguration();
+    MockFileSystem mockFs =
+            (MockFileSystem) new Path("test:///").getFileSystem(conf);
+    Path dir1  = new Path("test:/a1");
+    RemoteIterator<LocatedFileStatus> statuses = mockFs.listLocatedStatus(dir1);
 
 Review comment:
   Thank you for your reply! I have added asserts here that check the type of the block locations in the original file status, and that the result of shrinking holds plain ```BlockLocation``` instances, which contain no references to ```LocatedBlock```.



[GitHub] [hadoop] jlowe commented on a change in pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint

Posted by GitBox <gi...@apache.org>.
jlowe commented on a change in pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint
URL: https://github.com/apache/hadoop/pull/1920#discussion_r400574855
 
 

 ##########
 File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java
 ##########
 @@ -238,6 +238,44 @@ public void testListStatusErrorOnNonExistantDir() throws IOException {
     }
   }
 
+  @Test
+  public void testShrinkStatus() throws IOException {
+    Configuration conf = getConfiguration();
+    MockFileSystem mockFs =
+            (MockFileSystem) new Path("test:///").getFileSystem(conf);
+    Path dir1  = new Path("test:/a1");
+    RemoteIterator<LocatedFileStatus> statuses = mockFs.listLocatedStatus(dir1);
 
 Review comment:
   Thanks for the updates!   The unit test still has the problem I originally described -- it will pass even when the shrink implementation does not actually shrink anything.  Maybe the unit test should assert that the `LocatedFileStatus` received from the filesystem is using instances of `HdfsBlockLocation` (which it does after your recent changes) and the result of shrinking no longer has such instances?
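
A rough sketch of the kind of check suggested above, again with stand-in classes instead of the Hadoop types. `Base`, `Fat`, `verifiesShrink`, and `realShrink` are hypothetical names for illustration. The point is that the check first requires the fixture to contain the heavy subclass, so a shrink that returns its input unchanged cannot pass.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

/** Stand-in model only; none of these classes are Hadoop classes. */
final class ShrinkTestSketch {

  /** Lightweight location (hypothetical). */
  static class Base {
    final long offset;

    Base(long offset) {
      this.offset = offset;
    }
  }

  /** Heavy subclass carrying extra data (hypothetical). */
  static final class Fat extends Base {
    final byte[] extra;

    Fat(long offset, byte[] extra) {
      super(offset);
      this.extra = extra;
    }
  }

  static boolean anyFat(Base[] locs) {
    return Arrays.stream(locs).anyMatch(l -> l instanceof Fat);
  }

  /** True only if the input was fat and the output is not. */
  static boolean verifiesShrink(Base[] input, UnaryOperator<Base[]> shrink) {
    if (!anyFat(input)) {
      return false; // fixture has nothing to shrink, so the check would be vacuous
    }
    Base[] out = shrink.apply(input);
    return out.length == input.length && !anyFat(out);
  }

  /** A shrink that rebuilds each element as a plain Base. */
  static Base[] realShrink(Base[] in) {
    Base[] out = new Base[in.length];
    for (int i = 0; i < in.length; i++) {
      out[i] = new Base(in[i].offset);
    }
    return out;
  }

  public static void main(String[] args) {
    Base[] fixture = {new Fat(0L, new byte[64])};
    // Real shrink passes the check: prints true.
    System.out.println(verifiesShrink(fixture, ShrinkTestSketch::realShrink));
    // A no-op shrink fails the check: prints false.
    System.out.println(verifiesShrink(fixture, UnaryOperator.identity()));
  }
}
```

Asserting on the input type as well as the output type guards against a mock that stops returning the heavy subclass, which would otherwise let the test pass without exercising anything.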



[GitHub] [hadoop] hadoop-yetus commented on issue #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on issue #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint
URL: https://github.com/apache/hadoop/pull/1920#issuecomment-606687079
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |  24m 44s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 1 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  18m 57s |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 35s |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 38s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m 15s |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  trunk passed  |
   | +0 :ok: |  spotbugs  |   1m 10s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m  7s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 27s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 27s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 23s |  hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 2 new + 158 unchanged - 4 fixed = 160 total (was 162)  |
   | +1 :green_heart: |  mvnsite  |   0m 29s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  14m 27s |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 19s |  the patch passed  |
   | +1 :green_heart: |  findbugs  |   1m 10s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   5m 30s |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 33s |  The patch does not generate ASF License warnings.  |
   |  |   |  87m 21s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1920 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux d6f494132372 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / c734d24 |
   | Default Java | 1.8.0_242 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/3/artifact/out/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/3/testReport/ |
   | Max. process+thread count | 1570 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/3/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[GitHub] [hadoop] jlowe commented on a change in pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint

Posted by GitBox <gi...@apache.org>.
jlowe commented on a change in pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint
URL: https://github.com/apache/hadoop/pull/1920#discussion_r399693801
 
 

 ##########
 File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java
 ##########
 @@ -238,6 +238,44 @@ public void testListStatusErrorOnNonExistantDir() throws IOException {
     }
   }
 
+  @Test
+  public void testShrinkStatus() throws IOException {
+    Configuration conf = getConfiguration();
+    MockFileSystem mockFs =
+            (MockFileSystem) new Path("test:///").getFileSystem(conf);
+    Path dir1  = new Path("test:/a1");
+    RemoteIterator<LocatedFileStatus> statuses = mockFs.listLocatedStatus(dir1);
 
 Review comment:
   This test doesn't test what we care about in this change.  `MockFileSystem` doesn't return a derived class of `BlockLocation` so nothing shrinks.  This test passes if `shrinkStatus` simply returned the original status, showing it really isn't a useful test for verifying we don't regress and start caching large location statuses again.



[GitHub] [hadoop] hadoop-yetus commented on issue #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on issue #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint
URL: https://github.com/apache/hadoop/pull/1920#issuecomment-607011737
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 33s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 1 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  19m 45s |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 34s |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   0m 29s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 38s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 41s |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  trunk passed  |
   | +0 :ok: |  spotbugs  |   1m  8s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m  6s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 27s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 27s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 23s |  hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 2 new + 158 unchanged - 4 fixed = 160 total (was 162)  |
   | +1 :green_heart: |  mvnsite  |   0m 31s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  13m 44s |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 19s |  the patch passed  |
   | +1 :green_heart: |  findbugs  |   1m 12s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   5m 30s |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 32s |  The patch does not generate ASF License warnings.  |
   |  |   |  62m 43s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1920 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux a41ab6ee9fcd 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / c734d24 |
   | Default Java | 1.8.0_242 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/4/artifact/out/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/4/testReport/ |
   | Max. process+thread count | 1246 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/4/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] jlowe commented on a change in pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint

Posted by GitBox <gi...@apache.org>.
jlowe commented on a change in pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint
URL: https://github.com/apache/hadoop/pull/1920#discussion_r399693011
 
 

 ##########
 File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java
 ##########
 @@ -364,13 +364,29 @@ protected void addInputPathRecursively(List<FileStatus> result,
         if (stat.isDirectory()) {
           addInputPathRecursively(result, fs, stat.getPath(), inputFilter);
         } else {
-          result.add(stat);
+          result.add(shrinkStatus(stat));
         }
       }
     }
   }
-  
-  
+
+  public static FileStatus shrinkStatus(FileStatus origStat) {
 
 Review comment:
   This is very much a user-facing class, and all public methods in this file have Javadoc comments. I think it's appropriate here as well.  There should be a comment explaining what this is doing and why. When I first glanced at the code, it looked like it was just making a copy of everything, and it didn't look like it was shrinking anything at all.
   



[GitHub] [hadoop] hadoop-yetus commented on issue #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on issue #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint
URL: https://github.com/apache/hadoop/pull/1920#issuecomment-605613182
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 34s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 1 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  19m 38s |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 35s |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   0m 33s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 39s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m  9s |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  trunk passed  |
   | +0 :ok: |  spotbugs  |   1m 11s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m  9s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 27s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 27s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 23s |  hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 5 new + 158 unchanged - 4 fixed = 163 total (was 162)  |
   | +1 :green_heart: |  mvnsite  |   0m 31s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  13m 52s |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 19s |  the patch passed  |
   | +1 :green_heart: |  findbugs  |   1m 12s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   5m 29s |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 32s |  The patch does not generate ASF License warnings.  |
   |  |   |  63m  3s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1920 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux b59673f489fe 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 696a663 |
   | Default Java | 1.8.0_242 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/2/artifact/out/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/2/testReport/ |
   | Max. process+thread count | 1332 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/2/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] hadoop-yetus commented on issue #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on issue #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint
URL: https://github.com/apache/hadoop/pull/1920#issuecomment-605404498
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 36s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 1 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  19m  7s |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 36s |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 38s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 57s |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  trunk passed  |
   | +0 :ok: |  spotbugs  |   1m 11s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   1m  9s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 27s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 27s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 24s |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 32s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  13m 47s |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 19s |  the patch passed  |
   | +1 :green_heart: |  findbugs  |   1m 11s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   5m 32s |  hadoop-mapreduce-client-core in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 32s |  The patch does not generate ASF License warnings.  |
   |  |   |  62m 35s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1920 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 0df0740ddc33 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / f531a4a |
   | Default Java | 1.8.0_242 |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/1/testReport/ |
   | Max. process+thread count | 1576 (vs. ulimit of 5500) |
   | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1920/1/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] dengzhhu653 commented on a change in pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint

Posted by GitBox <gi...@apache.org>.
dengzhhu653 commented on a change in pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint
URL: https://github.com/apache/hadoop/pull/1920#discussion_r399770285
 
 

 ##########
 File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java
 ##########
 @@ -238,6 +238,44 @@ public void testListStatusErrorOnNonExistantDir() throws IOException {
     }
   }
 
+  @Test
+  public void testShrinkStatus() throws IOException {
+    Configuration conf = getConfiguration();
+    MockFileSystem mockFs =
+            (MockFileSystem) new Path("test:///").getFileSystem(conf);
+    Path dir1  = new Path("test:/a1");
+    RemoteIterator<LocatedFileStatus> statuses = mockFs.listLocatedStatus(dir1);
 
 Review comment:
   The updated code makes ```MockFileSystem.getFileBlockLocations``` return an array of ```HdfsBlockLocation``` for regression testing, and the unit test shows that the shrunk file status has the same block location information as the original.



[GitHub] [hadoop] dengzhhu653 closed pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint

Posted by GitBox <gi...@apache.org>.
dengzhhu653 closed pull request #1920: MAPREDUCE-7241. FileInputFormat listStatus with less memory footprint
URL: https://github.com/apache/hadoop/pull/1920
 
 
   
