Posted to common-dev@hadoop.apache.org by "Christian Kunz (JIRA)" <ji...@apache.org> on 2007/12/16 00:56:43 UTC

[jira] Created: (HADOOP-2437) final map output not evenly distributed across multiple disks

final map output not evenly distributed across multiple disks
-------------------------------------------------------------

                 Key: HADOOP-2437
                 URL: https://issues.apache.org/jira/browse/HADOOP-2437
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.16.0
            Reporter: Christian Kunz
            Priority: Blocker
             Fix For: 0.15.2


It seems that the final merge output of map tasks for a particular job does not select its output location randomly.

As a result, a job with many map tasks eventually runs out of taskTrackers asking for more tasks, because the disk holding most of the map outputs ends up with less free space than mapred.local.dir.minspacestart requires.

Perhaps the starting point of the round-robin selection among the locations should be randomized.

In our case:
110,000 maps, each with about 3 GB of final output, on a 1,300-node cluster.
With 4 locations, and after processing about 79,000 maps, the selection for the final map outputs 'file.out' looked like:
location1: 24,000
location2: 25
location3: 55,000
location4: 7
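
The suggestion above can be sketched in Java as follows. This is only an illustration, not the Hadoop code or the eventual patch: RoundRobinDirPicker and its methods are hypothetical names, and the simulation assumes (the report does not state this) that every task runs its own round-robin and makes the same number of selections before writing its final output.

```java
import java.util.Random;

// Hypothetical sketch; not the actual Hadoop allocator.
final class RoundRobinDirPicker {
    private final int numDirs;
    private int next; // index of the location handed out by the next pick()

    // Starts the round-robin at an explicit index.
    RoundRobinDirPicker(int numDirs, int start) {
        if (numDirs <= 0 || start < 0 || start >= numDirs) {
            throw new IllegalArgumentException("invalid numDirs or start");
        }
        this.numDirs = numDirs;
        this.next = start;
    }

    // Starts the round-robin at a uniformly random index.
    static RoundRobinDirPicker withRandomStart(int numDirs, Random random) {
        return new RoundRobinDirPicker(numDirs, random.nextInt(numDirs));
    }

    // Returns the current index and advances cyclically.
    int pick() {
        int chosen = next;
        next = (next + 1) % numDirs;
        return chosen;
    }

    // Simulates independent tasks, each with its own picker, and counts
    // where the last of picksPerTask selections lands for each task.
    static int[] countFinalPicks(int tasks, int numDirs, int picksPerTask,
                                 boolean randomStart, long seed) {
        if (picksPerTask < 1) {
            throw new IllegalArgumentException("picksPerTask must be >= 1");
        }
        Random random = new Random(seed);
        int[] counts = new int[numDirs];
        for (int t = 0; t < tasks; t++) {
            RoundRobinDirPicker picker = randomStart
                    ? withRandomStart(numDirs, random)
                    : new RoundRobinDirPicker(numDirs, 0);
            int last = -1;
            for (int i = 0; i < picksPerTask; i++) {
                last = picker.pick();
            }
            counts[last]++;
        }
        return counts;
    }
}
```

Under these assumptions, countFinalPicks(79000, 4, 3, false, 1L) puts every final output on index 2, whereas passing true for randomStart spreads the final outputs roughly evenly over the four locations.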



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Assigned: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das reassigned HADOOP-2437:
-----------------------------------

    Assignee: Arun C Murthy


[jira] Updated: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2437:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this.


[jira] Updated: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2437:
----------------------------------

    Attachment: HADOOP-2437_2_20071220.patch

I've had to jump through a few hoops to get the TestLocalDirAllocator to work, but the meat of the patch remains the same.


[jira] Updated: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2437:
----------------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552244 ] 

Runping Qi commented on HADOOP-2437:
------------------------------------


A similar problem, caused by the round-robin placement policy, occurs in DFS datanode block placement.


[jira] Commented: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552182 ] 

Devaraj Das commented on HADOOP-2437:
-------------------------------------

> Maybe the start of round-robin selection of multiple locations should be randomized.
+1


[jira] Updated: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2437:
----------------------------------

    Status: Patch Available  (was: Open)

Thanks for the review Christian, I got around to testing this too... submitting patch.


[jira] Commented: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12553286 ] 

Hadoop QA commented on HADOOP-2437:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12371934/HADOOP-2437_1_20071218.patch
against trunk revision r605473.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests -1.  The patch failed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1393/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1393/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1393/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1393/console

This message is automatically generated.


[jira] Updated: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2437:
----------------------------------

    Status: Open  (was: Patch Available)

Ok, org.apache.hadoop.fs.TestLocalDirAllocator.test3 failed with: 
{noformat}
junit.framework.AssertionFailedError
	at org.apache.hadoop.fs.TestLocalDirAllocator.validateTempDirCreation(TestLocalDirAllocator.java:71)
	at org.apache.hadoop.fs.TestLocalDirAllocator.test3(TestLocalDirAllocator.java:142)
{noformat}


The problem is that the test case assumes that the start of the round-robin is _zero_; I'll put up another patch fixing it shortly...
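
One way to check a round-robin without assuming where it starts is to take the first selection as the baseline and only require that the following selections advance cyclically from it. The sketch below illustrates that idea; it is not the change made to TestLocalDirAllocator, and RoundRobinChecks and isCyclicFromAnyStart are hypothetical names.

```java
import java.util.function.IntSupplier;

// Hypothetical helper; not part of the Hadoop test suite.
final class RoundRobinChecks {
    // Returns true if rounds * numDirs consecutive picks form a cyclic
    // sequence over 0..numDirs-1, whatever index the first pick returns.
    static boolean isCyclicFromAnyStart(IntSupplier picker, int numDirs, int rounds) {
        int expected = picker.getAsInt();
        if (expected < 0 || expected >= numDirs) {
            return false; // first pick is not a valid index
        }
        for (int i = 1; i < rounds * numDirs; i++) {
            expected = (expected + 1) % numDirs;
            if (picker.getAsInt() != expected) {
                return false; // sequence skipped or repeated an index
            }
        }
        return true;
    }
}
```

A check written this way passes for a round-robin that starts at index 0 and equally for one that starts at a random index, so it does not break when the start is randomized.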



[jira] Updated: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2437:
----------------------------------

    Status: Patch Available  (was: Open)


[jira] Updated: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2437:
----------------------------------

    Attachment: HADOOP-2437_1_20071218.patch

Simple patch to randomize the start of the round-robin, I'll keep testing.


[jira] Updated: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2437:
----------------------------------

    Status: Open  (was: Patch Available)


[jira] Commented: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Christian Kunz (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552884 ] 

Christian Kunz commented on HADOOP-2437:
----------------------------------------

+1 (similar patch is working fine)


[jira] Commented: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12553239 ] 

Hadoop QA commented on HADOOP-2437:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12371892/HADOOP-2437_1_20071218.patch
against trunk revision r605433.

    @author +1.  The patch does not contain any @author tags.

    patch -1.  The patch command could not apply the patch.

Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1390/console

This message is automatically generated.


[jira] Commented: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12553498 ] 

Hadoop QA commented on HADOOP-2437:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12371962/HADOOP-2437_2_20071220.patch
against trunk revision r605672.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1396/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1396/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1396/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1396/console

This message is automatically generated.


[jira] Updated: (HADOOP-2437) final map output not evenly distributed across multiple disks

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2437:
----------------------------------

    Attachment: HADOOP-2437_1_20071218.patch

Same patch with relative path...
