Posted to common-dev@hadoop.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2007/04/20 12:10:15 UTC
[jira] Created: (HADOOP-1281) Speculative map tasks aren't getting
killed although the TIP completed
Speculative map tasks aren't getting killed although the TIP completed
----------------------------------------------------------------------
Key: HADOOP-1281
URL: https://issues.apache.org/jira/browse/HADOOP-1281
Project: Hadoop
Issue Type: Bug
Components: mapred
Reporter: Arun C Murthy
Assigned To: Arun C Murthy
The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-1281) Speculative map tasks aren't
getting killed although the TIP completed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543788 ]
Arun C Murthy commented on HADOOP-1281:
---------------------------------------
> Do you remember why the original code was explicitly not killing the speculative attempts of map tasks?
Actually I don't!
It was a long while ago (almost a year), and it probably had to do with the fact that we never ran into issues like this, since not many people used speculative execution back then. *smile*
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting
killed although the TIP completed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1281:
----------------------------------
Attachment: HADOOP-1281_2_20080109.patch
Exact same patch as before, but with added comments explaining the rationale for the fix.
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Priority: Critical
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch, HADOOP-1281_2_20071123.patch, HADOOP-1281_2_20080109.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Commented: (HADOOP-1281) Speculative map tasks aren't
getting killed although the TIP completed
Posted by "lohit vijayarenu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539748 ]
lohit vijayarenu commented on HADOOP-1281:
------------------------------------------
We hit this bug today.
Below is the log for two attempts of the same task:
<log>
Task Attempts                      Status     Progress  Start Time           Finish Time                          Errors  Task Logs  Counters
task_200711022153_0001_m_001548_0  SUCCEEDED  100.00%   2-Nov-2007 22:00:59  2-Nov-2007 22:05:50 (4mins, 51sec)
task_200711022153_0001_m_001548_1  KILLED     84.44%    2-Nov-2007 22:02:17  2-Nov-2007 22:26:02 (23mins, 45sec)
</log>
Look at the time each attempt took: once the first attempt finished in ~4 min, the second attempt should have been killed. Instead it kept running for ~23 min. When we looked at the logs, we saw that the attempt was sent a kill signal only after the whole job had completed.
The JobTracker did not send a kill signal to this task attempt (or maybe nothing was logged).
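To illustrate the behaviour this report expects, here is a minimal Java sketch. It is not the actual JobTracker code and not the attached patch: the class and method names (SpeculativeKillSketch, TaskInProgress, attemptSucceeded) are hypothetical, and only the attempt ids are taken from the log above. The idea is that as soon as one attempt of a TIP succeeds, every sibling attempt that is still running is marked for an immediate kill, instead of being left alone until the whole job finishes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; these are NOT Hadoop's real JobTracker classes. */
class SpeculativeKillSketch {
    enum State { RUNNING, SUCCEEDED, KILLED }

    /** One logical task (a "TIP") that may have several attempts. */
    static class TaskInProgress {
        // Attempt id -> current state, kept in launch order.
        private final Map<String, State> attempts = new LinkedHashMap<>();

        void launch(String attemptId) {
            attempts.put(attemptId, State.RUNNING);
        }

        /**
         * Records a successful attempt and returns the ids of sibling
         * attempts that were still running, so the caller can send each
         * of them a kill right away.
         */
        List<String> attemptSucceeded(String attemptId) {
            attempts.put(attemptId, State.SUCCEEDED);
            List<String> toKill = new ArrayList<>();
            for (Map.Entry<String, State> e : attempts.entrySet()) {
                if (e.getValue() == State.RUNNING) {
                    e.setValue(State.KILLED);
                    toKill.add(e.getKey());
                }
            }
            return toKill;
        }

        State stateOf(String attemptId) {
            return attempts.get(attemptId);
        }
    }

    public static void main(String[] args) {
        TaskInProgress tip = new TaskInProgress();
        tip.launch("task_200711022153_0001_m_001548_0");
        tip.launch("task_200711022153_0001_m_001548_1"); // speculative attempt

        List<String> toKill =
            tip.attemptSucceeded("task_200711022153_0001_m_001548_0");
        System.out.println("kill immediately: " + toKill);
    }
}
```

In this sketch the speculative attempt _1 would be marked KILLED at the moment _0 succeeds (22:05:50 in the log above), rather than at 22:26:02 when the job completed.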
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting
killed although the TIP completed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1281:
----------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
I just committed this.
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Priority: Critical
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch, HADOOP-1281_2_20071123.patch, HADOOP-1281_2_20080109.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Commented: (HADOOP-1281) Speculative map tasks aren't
getting killed although the TIP completed
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557241#action_12557241 ]
Devaraj Das commented on HADOOP-1281:
-------------------------------------
+1
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Priority: Critical
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch, HADOOP-1281_2_20071123.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Commented: (HADOOP-1281) Speculative map tasks aren't
getting killed although the TIP completed
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543745 ]
Hadoop QA commented on HADOOP-1281:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12369687/HADOOP-1281_1_20071117.patch
against trunk revision r596418.
@author +1. The patch does not contain any @author tags.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new compiler warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests -1. The patch failed core unit tests.
contrib tests -1. The patch failed contrib unit tests.
Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1119/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1119/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1119/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1119/console
This message is automatically generated.
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting
killed although the TIP completed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1281:
----------------------------------
Status: Patch Available (was: Reopened)
I finally got around to testing this patch thoroughly, hence marking it PA.
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Priority: Critical
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch, HADOOP-1281_2_20071123.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Commented: (HADOOP-1281) Speculative map tasks aren't
getting killed although the TIP completed
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557179#action_12557179 ]
Hadoop QA commented on HADOOP-1281:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12370124/HADOOP-1281_2_20071123.patch
against trunk revision .
@author +1. The patch does not contain any @author tags.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new compiler warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests -1. The patch failed contrib unit tests.
Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1520/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1520/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1520/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1520/console
This message is automatically generated.
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Priority: Critical
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch, HADOOP-1281_2_20071123.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting
killed although the TIP completed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1281:
----------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
I just committed this.
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Reopened: (HADOOP-1281) Speculative map tasks aren't getting
killed although the TIP completed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy reopened HADOOP-1281:
-----------------------------------
This seems to have introduced sporadic failures in some test cases, as noted by HADOOP-2252/HADOOP-2254; I'll investigate and fix those. In the meantime, I'm reverting this patch.
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting
killed although the TIP completed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1281:
----------------------------------
Fix Version/s: 0.16.0
Affects Version/s: 0.15.0
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting
killed although the TIP completed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1281:
----------------------------------
Attachment: HADOOP-1281_1_20071117.patch
Straightforward patch. I've done some preliminary testing; I need to do more.
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Attachments: HADOOP-1281_1_20071117.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Commented: (HADOOP-1281) Speculative map tasks aren't
getting killed although the TIP completed
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544770 ]
Hudson commented on HADOOP-1281:
--------------------------------
Integrated in Hadoop-Nightly #311 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/311/])
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting
killed although the TIP completed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1281:
----------------------------------
Status: Patch Available (was: Open)
Submitting patch for review; done with testing.
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting
killed although the TIP completed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1281:
----------------------------------
Attachment: HADOOP-1281_2_20071123.patch
Updated patch to fix the test-case flakiness... I'll continue testing this.
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch, HADOOP-1281_2_20071123.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Updated: (HADOOP-1281) Speculative map tasks aren't getting
killed although the TIP completed
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1281:
----------------------------------
Priority: Critical (was: Major)
I'm raising the priority to reflect that this is an important bug to fix for 0.16.0; we are losing lots of cycles due to this.
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Priority: Critical
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch, HADOOP-1281_2_20071123.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.
[jira] Commented: (HADOOP-1281) Speculative map tasks aren't
getting killed although the TIP completed
Posted by "Milind Bhandarkar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543663 ]
Milind Bhandarkar commented on HADOOP-1281:
-------------------------------------------
Arun,
Do you remember why the original code was explicitly not killing the speculative attempts of map tasks?
> Speculative map tasks aren't getting killed although the TIP completed
> ----------------------------------------------------------------------
>
> Key: HADOOP-1281
> URL: https://issues.apache.org/jira/browse/HADOOP-1281
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Fix For: 0.16.0
>
> Attachments: HADOOP-1281_1_20071117.patch
>
>
> The speculative map tasks run to completion although the TIP succeeded since the other task completed elsewhere.