You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-issues@hadoop.apache.org by "Todd Lipcon (Created) (JIRA)" <ji...@apache.org> on 2011/10/24 22:52:33 UTC

[jira] [Created] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

MR2: Map tasks rewrite data once even if output fits in sort buffer
-------------------------------------------------------------------

                 Key: MAPREDUCE-3252
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv2, task
    Affects Versions: 0.23.0
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon
            Priority: Critical


I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134607#comment-13134607 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #1162 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1162/])
    MAPREDUCE-3252. Fix map tasks to not rewrite data an extra time when map output fits in spill buffer. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188424
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135040#comment-13135040 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Hdfs-trunk #842 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/842/])
    MAPREDUCE-3252. Fix map tasks to not rewrite data an extra time when map output fits in spill buffer. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188424
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated MAPREDUCE-3252:
-----------------------------------

          Resolution: Fixed
       Fix Version/s: 0.23.0
    Target Version/s: 0.23.0
        Hadoop Flags: Reviewed
              Status: Resolved  (was: Patch Available)

Committed to 23 and trunk. Thanks for the quick review, Arun.
                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated MAPREDUCE-3252:
-----------------------------------

    Attachment: mr-3252.txt
    
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134990#comment-13134990 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Hdfs-0.23-Build #50 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/50/])
    Merge -c 1188388 from trunk to branch-0.23 to complete fix for MAPREDUCE-3252.
MAPREDUCE-3252. Fix map tasks to not rewrite data an extra time when map output fits in spill buffer. Contributed by Todd Lipcon.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188467
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/recover/RecoveryService.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188423
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134885#comment-13134885 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Hdfs-0.23-Commit #49 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/49/])
    Merge -c 1188388 from trunk to branch-0.23 to complete fix for MAPREDUCE-3252.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188467
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/recover/RecoveryService.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135016#comment-13135016 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk #871 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/871/])
    MAPREDUCE-3252. Fix map tasks to not rewrite data an extra time when map output fits in spill buffer. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188424
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134606#comment-13134606 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Mapreduce-0.23-Commit #47 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/47/])
    MAPREDUCE-3252. Fix map tasks to not rewrite data an extra time when map output fits in spill buffer. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188423
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Arun C Murthy (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134484#comment-13134484 ] 

Arun C Murthy commented on MAPREDUCE-3252:
------------------------------------------

+1
                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134871#comment-13134871 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Common-0.23-Commit #48 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/48/])
    Merge -c 1188388 from trunk to branch-0.23 to complete fix for MAPREDUCE-3252.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188467
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/recover/RecoveryService.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134593#comment-13134593 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Common-0.23-Commit #47 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/47/])
    MAPREDUCE-3252. Fix map tasks to not rewrite data an extra time when map output fits in spill buffer. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188423
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134591#comment-13134591 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Common-trunk-Commit #1148 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1148/])
    MAPREDUCE-3252. Fix map tasks to not rewrite data an extra time when map output fits in spill buffer. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188424
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134599#comment-13134599 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Hdfs-0.23-Commit #48 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/48/])
    MAPREDUCE-3252. Fix map tasks to not rewrite data an extra time when map output fits in spill buffer. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188423
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134598#comment-13134598 ] 

Hadoop QA commented on MAPREDUCE-3252:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12500564/mr-3252.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 162 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1126//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1126//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1126//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-app.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1126//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1126//console

This message is automatically generated.
                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134861#comment-13134861 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Mapreduce-0.23-Commit #48 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/48/])
    Merge -c 1188388 from trunk to branch-0.23 to complete fix for MAPREDUCE-3252.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188467
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/recover/RecoveryService.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134508#comment-13134508 ] 

Hadoop QA commented on MAPREDUCE-3252:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12500541/mr-3252.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 160 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1125//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1125//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-app.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1125//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1125//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-common.html
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1125//console

This message is automatically generated.
                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134589#comment-13134589 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #1226 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1226/])
    MAPREDUCE-3252. Fix map tasks to not rewrite data an extra time when map output fits in spill buffer. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188424
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated MAPREDUCE-3252:
-----------------------------------

    Attachment: mr-3252.txt

added javadoc, fixed a spurious whitespace change. I ran this on a terasort configured with a large io.sort.mb, and counters showed the correct number of spilled records. I'll commit this momentarily.
                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated MAPREDUCE-3252:
-----------------------------------

    Status: Patch Available  (was: Open)
    
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135028#comment-13135028 ] 

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Mapreduce-0.23-Build #62 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/62/])
    Merge -c 1188388 from trunk to branch-0.23 to complete fix for MAPREDUCE-3252.
MAPREDUCE-3252. Fix map tasks to not rewrite data an extra time when map output fits in spill buffer. Contributed by Todd Lipcon.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188467
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/recover/RecoveryService.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188423
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, it was rewriting the output entirely rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt was failing because no one has called mkdir for the output directory yet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira