Posted to common-dev@hadoop.apache.org by "Amar Kamat (JIRA)" <ji...@apache.org> on 2008/11/10 09:00:44 UTC

[jira] Created: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Running tasks are not maintained by JobInProgress if speculation is off
-----------------------------------------------------------------------

                 Key: HADOOP-4623
                 URL: https://issues.apache.org/jira/browse/HADOOP-4623
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
            Reporter: Amar Kamat


{{JobInProgress}} doesn't maintain any structure for running tasks if speculation is turned _off_.  {{getRunningMapCache()}} in {{JobInProgress}} exposes the running map cache. This API returns an empty {{Map}} if speculation is turned off. 

_Usage_ :
{{CapacityScheduler}} requires a list of running tasks for both speculated and non-speculated jobs. See HADOOP-4558 to see how this issue affects {{CapacityScheduler}}.
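
To make the reported behaviour concrete, here is a minimal sketch in Java. It is not the Hadoop source: {{RunningMapCacheSketch}}, its {{alwaysTrack}} flag, and the plain string node and task identifiers are hypothetical stand-ins for {{JobInProgress}} and its running map cache. With {{alwaysTrack}} set to false the sketch mimics the problem described above (nothing is recorded when speculation is off); with it set to true the cache is maintained regardless of speculation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for JobInProgress; not the real Hadoop class. */
class RunningMapCacheSketch {
    private final boolean speculationEnabled;
    // Assumption for illustration: false mimics the reported behaviour,
    // true mimics the desired behaviour.
    private final boolean alwaysTrack;
    // Logically node -> running tasks on that node.
    private final Map<String, List<String>> runningMapCache = new LinkedHashMap<>();

    RunningMapCacheSketch(boolean speculationEnabled, boolean alwaysTrack) {
        this.speculationEnabled = speculationEnabled;
        this.alwaysTrack = alwaysTrack;
    }

    /** Records a newly scheduled task, unless tracking is skipped. */
    void taskScheduled(String node, String taskId) {
        if (!speculationEnabled && !alwaysTrack) {
            return; // reported problem: nothing is recorded when speculation is off
        }
        runningMapCache.computeIfAbsent(node, n -> new ArrayList<>()).add(taskId);
    }

    /** Forgets a completed task and drops the node entry once it is empty. */
    void taskCompleted(String node, String taskId) {
        List<String> tasks = runningMapCache.get(node);
        if (tasks != null) {
            tasks.remove(taskId);
            if (tasks.isEmpty()) {
                runningMapCache.remove(node);
            }
        }
    }

    /** Read-only view of the cache, in the spirit of getRunningMapCache(). */
    Map<String, List<String>> getRunningMapCache() {
        return Collections.unmodifiableMap(runningMapCache);
    }
}
```

A scheduler that reads the cache of a non-speculative job sees an empty map in the first mode and the scheduled task in the second.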

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4623:
-------------------------------

    Status: Patch Available  (was: Open)

Running through Hudson. _test-patch_ result on my box:
{code}
[exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 32 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
{code}



[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648615#action_12648615 ] 

Amar Kamat commented on HADOOP-4623:
------------------------------------

I ran gridmix on _trunk_ and _trunk+patch-v1.2_, and here are the results:
||type||time(secs)||
|trunk|2373|
|trunk+patch-v1.2|2556|
With this patch the run takes about 3 minutes (183 seconds) longer than on trunk. I will benchmark this again and see if the results are consistent. 



[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654094#action_12654094 ] 

Hudson commented on HADOOP-4623:
--------------------------------

Integrated in Hadoop-trunk #680 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/680/])
    

> Running tasks are not maintained by JobInProgress if speculation is off
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4623
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4623
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4623-v1.1.patch, HADOOP-4623-v1.2.patch, HADOOP-4623-v1.3.patch, HADOOP-4623-v1.4.patch
>
>
> {{JobInProgress}} doesn't maintain any structure for running tasks if speculation is turned _off_.  {{getRunningMapCache()}} in {{JobInProgress}} exposes the running map cache. This API returns an empty {{Map}} if speculation is turned off. 
> _Usage_ :
> {{CapacityScheduler}} requires a list of running tasks for both speculated and non-speculated jobs. See HADOOP-4558 to see how this issue affects {{CapacityScheduler}}.



[jira] Updated: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4623:
-------------------------------

    Status: Open  (was: Patch Available)



[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651914#action_12651914 ] 

Amareshwari Sriramadasu commented on HADOOP-4623:
-------------------------------------------------

+1 
patch looks good



[jira] Updated: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4623:
-------------------------------

    Status: Open  (was: Patch Available)



[jira] Updated: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4623:
-------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653193#action_12653193 ] 

Runping Qi commented on HADOOP-4623:
------------------------------------

sounds good.




[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653164#action_12653164 ] 

Amar Kamat commented on HADOOP-4623:
------------------------------------

@Runping : This issue addresses the bug in the framework where _running-tasks_ are not maintained if _speculation_ is OFF. Benchmarks have shown that this patch doesn't always degrade performance. I feel we should go ahead and get this committed, and open another _performance-related_ jira in case we see some performance degradation. This issue is blocking capacity-scheduler.



[jira] Updated: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4623:
-------------------------------

    Attachment: HADOOP-4623-v1.3.patch

Attaching a patch that incorporates Amareshwari's comment #1. I have added a testcase that requires {{WaitingMapper}} and the related _utils_, and hence I have refactored the code. Here are the benchmark results:
||trunk(secs)||trunk+4623-v1.2.patch(secs)||diff(secs)||
|2440|2407|-33|
| |2432|-8|
| |2434|-6|



[jira] Updated: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4623:
-------------------------------

    Attachment: HADOOP-4623-v1.4.patch

Incorporating Amareshwari's comments. Result of _test-patch_
{noformat}
 [exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 32 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

{noformat}



[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650866#action_12650866 ] 

Amareshwari Sriramadasu commented on HADOOP-4623:
-------------------------------------------------

Changes in JobInProgress look good.
In TestJobInProgress, instead of directly accessing the JIP.runningMapTasks and JIP.runningReduceTasks variables, you can use the public methods runningMaps() and runningReduces().
I don't know whether UtilsForTest refactoring is needed in this jira, but the code changes look fine.



[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653760#action_12653760 ] 

Devaraj Das commented on HADOOP-4623:
-------------------------------------

+1



[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653610#action_12653610 ] 

Amar Kamat commented on HADOOP-4623:
------------------------------------

_ant test_ passed on my local box.



[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653310#action_12653310 ] 

Amar Kamat commented on HADOOP-4623:
------------------------------------

Result of _test-patch_ on trunk :
{noformat}
[exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 32 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
{noformat}

Running _ant test_ now.



[jira] Updated: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4623:
-------------------------------

    Assignee: Amar Kamat
      Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4623:
-------------------------------

    Attachment: HADOOP-4623-v1.1.patch

Attaching a patch that maintains the running structure irrespective of speculation. Modified {{TestJobInProgress}} to test this patch. Testing is in progress. Benchmarking is pending.



[jira] Updated: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-4623:
-------------------------------

    Attachment: HADOOP-4623-v1.2.patch

Attaching a new patch with some bug fixes and cleanup. Test patch result 
{code}
[exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 32 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
{code}
Testing and benchmarking in progress.



[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649443#action_12649443 ] 

Runping Qi commented on HADOOP-4623:
------------------------------------


Currently, runningMapCache is logically a map from Node -> Collection<TaskInProgress>.
Whenever a task is scheduled, a TIP is added to this structure; whenever a task completes, the TIP is deleted from it.
The structure is currently implemented as a LinkedHashMap, which means each operation involves link manipulation and object creation.

I suspect that performance would improve if a more efficient data structure were used.
Here is an idea:
Use a HashMap mapping nodes to fixed-size arrays of TIPs. The fixed size should be the number of slots per node.
This simple data structure needs to be initialized only once. Any add/delete operation then simply sets a reference in a fixed-size array.
No object creation is involved, so the overhead will be lower and predictable.
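
To make the idea above concrete, here is a minimal sketch. It is only an illustration under stated assumptions, not code from any HADOOP-4623 patch: the class name {{FixedSlotRunningCache}} is hypothetical, plain {{String}} values stand in for {{Node}} and {{TaskInProgress}}, and the "first free slot" scan is just one possible way to deal with holes left by completed tasks.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch of a running-task cache backed by fixed-size arrays.
 * Strings stand in for Node and TaskInProgress (an assumption for brevity).
 */
final class FixedSlotRunningCache {
    private final int slotsPerNode;
    private final Map<String, String[]> slotsByNode = new HashMap<>();

    FixedSlotRunningCache(int slotsPerNode) {
        this.slotsPerNode = slotsPerNode;
    }

    /** Allocates the slot array for a node once; later add/remove calls reuse it. */
    void registerNode(String node) {
        slotsByNode.putIfAbsent(node, new String[slotsPerNode]);
    }

    /** Puts the TIP into the first free slot; false if the node is unknown or full. */
    boolean add(String node, String tip) {
        String[] slots = slotsByNode.get(node);
        if (slots == null) {
            return false;
        }
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] == null) {   // a hole left by a finished task is reused here
                slots[i] = tip;
                return true;
            }
        }
        return false;
    }

    /** Clears the slot holding the TIP; false if the TIP is not running on the node. */
    boolean remove(String node, String tip) {
        String[] slots = slotsByNode.get(node);
        if (slots == null) {
            return false;
        }
        for (int i = 0; i < slots.length; i++) {
            if (tip.equals(slots[i])) {
                slots[i] = null;
                return true;
            }
        }
        return false;
    }

    /** Returns the TIPs currently running on the node, in slot order. */
    List<String> running(String node) {
        List<String> result = new ArrayList<>();
        String[] slots = slotsByNode.get(node);
        if (slots != null) {
            for (String tip : slots) {
                if (tip != null) {
                    result.add(tip);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        FixedSlotRunningCache cache = new FixedSlotRunningCache(2);
        cache.registerNode("node1");
        cache.add("node1", "tip_a");
        cache.add("node1", "tip_b");
        cache.remove("node1", "tip_a");
        System.out.println(cache.running("node1"));
    }
}
```

In this sketch, add and remove only write references into a pre-allocated array, at the cost of a scan over at most slots-per-node entries. The read path ({{running}}) still allocates a list, and every registered node holds a full-size array whether or not the job has tasks there.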

 

> Running tasks are not maintained by JobInProgress if speculation is off
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4623
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4623
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>         Attachments: HADOOP-4623-v1.1.patch, HADOOP-4623-v1.2.patch
>
>


[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649610#action_12649610 ] 

Amar Kamat commented on HADOOP-4623:
------------------------------------

bq. That means each operation involves link manipulation and object creation.
There is no object creation involved. The Node is passed in by the jobtracker, which maintains one per tracker/node. The TIP is already created by the job as part of job initialization. Only data-structure updates happen, using object references.

bq. Use a HashMap mapping nodes to fixed-size arrays of TIPs. The fixed size should be the number of slots per node.
Wouldn't this waste memory, since many other jobs might be running in parallel and the whole (expanded) _running-cache_ might not be used? How would you take care of holes in the array?

----
I think I should benchmark the patch again and see if the results are consistent. If they are, we can go ahead with the patch and file an optimization JIRA for the running-cache. Thoughts?

> Running tasks are not maintained by JobInProgress if speculation is off
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4623
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4623
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>         Attachments: HADOOP-4623-v1.1.patch, HADOOP-4623-v1.2.patch
>
>


[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651699#action_12651699 ] 

Hadoop QA commented on HADOOP-4623:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12394738/HADOOP-4623-v1.3.patch
  against trunk revision 721415.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 32 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3668/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3668/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3668/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3668/console

This message is automatically generated.

> Running tasks are not maintained by JobInProgress if speculation is off
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4623
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4623
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>         Attachments: HADOOP-4623-v1.1.patch, HADOOP-4623-v1.2.patch, HADOOP-4623-v1.3.patch
>
>


[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648421#action_12648421 ] 

Hadoop QA commented on HADOOP-4623:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12393871/HADOOP-4623-v1.2.patch
  against trunk revision 718232.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 32 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3601/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3601/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3601/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3601/console

This message is automatically generated.

> Running tasks are not maintained by JobInProgress if speculation is off
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4623
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4623
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>         Attachments: HADOOP-4623-v1.1.patch, HADOOP-4623-v1.2.patch
>
>


[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651237#action_12651237 ] 

Amareshwari Sriramadasu commented on HADOOP-4623:
-------------------------------------------------

Some more comments:
1. In the method _private void waitTillReady(JobInProgress jip, JobConf job)_ in TestJobInProgress, you still use the variables runningMapTasks and runningReduceTasks.
2. The code _private static Path TEST_DIR = new Path("jip-testing")_ will create the jip-testing directory in the current working directory. It should be
_new Path(System.getProperty("test.build.data","/tmp"), "jip-testing")_
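
The effect of comment 2 can be sketched without Hadoop on the classpath. This is an illustration only: {{java.io.File}} stands in for {{org.apache.hadoop.fs.Path}}, and the class and method names ({{TestDirSketch}}, {{testDir}}) are hypothetical. It shows the test directory being resolved under the {{test.build.data}} system property, with {{/tmp}} as the fallback, rather than relative to the current working directory.

```java
import java.io.File;

/** Hypothetical sketch; java.io.File stands in for org.apache.hadoop.fs.Path. */
final class TestDirSketch {

    /** Resolves a scratch directory under test.build.data, falling back to /tmp. */
    static File testDir(String name) {
        String base = System.getProperty("test.build.data", "/tmp");
        return new File(base, name);
    }

    public static void main(String[] args) {
        // With the property unset this resolves under /tmp, not under the
        // directory the test happens to be launched from.
        System.out.println(testDir("jip-testing"));
    }
}
```

When the build sets {{test.build.data}}, the same call resolves under that directory instead, so test output stays inside the build tree.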

> Running tasks are not maintained by JobInProgress if speculation is off
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4623
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4623
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>         Attachments: HADOOP-4623-v1.1.patch, HADOOP-4623-v1.2.patch, HADOOP-4623-v1.3.patch
>
>


[jira] Updated: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4623:
-------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.20.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amar!

> Running tasks are not maintained by JobInProgress if speculation is off
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4623
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4623
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4623-v1.1.patch, HADOOP-4623-v1.2.patch, HADOOP-4623-v1.3.patch, HADOOP-4623-v1.4.patch
>
>


[jira] Commented: (HADOOP-4623) Running tasks are not maintained by JobInProgress if speculation is off

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648964#action_12648964 ] 

Amar Kamat commented on HADOOP-4623:
------------------------------------

Tried running gridmix again; here are the runtimes (compared with the trunk run from the above comment):
||run#||time(secs)||diff(secs)||
|1|2489|117|
|2|2634|262|
|3|2426|54|


> Running tasks are not maintained by JobInProgress if speculation is off
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-4623
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4623
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>         Attachments: HADOOP-4623-v1.1.patch, HADOOP-4623-v1.2.patch
>
>