Posted to common-dev@hadoop.apache.org by "Johan Oskarsson (JIRA)" <ji...@apache.org> on 2007/05/25 19:06:16 UTC

[jira] Created: (HADOOP-1433) Add job priority

Add job priority
----------------

                 Key: HADOOP-1433
                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
             Project: Hadoop
          Issue Type: New Feature
          Components: mapred
            Reporter: Johan Oskarsson
            Priority: Minor


As more and more developers start using our cluster, we run into problems where big low-priority jobs block smaller high-priority ones.

A simple way of specifying the job priority in the JobConf, and perhaps even changing it at runtime via the JobTracker web UI, would help a lot.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-1433) Add job priority

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512447 ] 

Hudson commented on HADOOP-1433:
--------------------------------

Integrated in Hadoop-Nightly #154 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/154/])

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv3.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Updated: (HADOOP-1433) Add job priority

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated HADOOP-1433:
------------------------------------

    Attachment:     (was: jobpriorityv2.patch)

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Commented: (HADOOP-1433) Add job priority

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512265 ] 

Hadoop QA commented on HADOOP-1433:
-----------------------------------

-1, could not apply patch.

The patch command could not apply the latest attachment http://issues.apache.org/jira/secure/attachment/12361661/jobpriorityv3.patch as a patch to trunk revision r555770.

Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/400/console

Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv3.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Updated: (HADOOP-1433) Add job priority

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated HADOOP-1433:
------------------------------------

    Attachment: jobpriorityv2.patch

Cleaned up the Findbugs issue. Also fixed most code style problems (in some cases I kept the style of the current source).

Comments are more than welcome.

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv1-trunk.patch, jobpriorityv2.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Updated: (HADOOP-1433) Add job priority

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated HADOOP-1433:
------------------------------------

        Fix Version/s: 0.14.0
    Affects Version/s: 0.14.0
               Status: Patch Available  (was: Open)

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv1-trunk.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Updated: (HADOOP-1433) Add job priority

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-1433:
------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Johan!

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv3.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Commented: (HADOOP-1433) Add job priority

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499171 ] 

Doug Cutting commented on HADOOP-1433:
--------------------------------------

The answer we've come to in the past is that an elaborate scheduler is beyond the immediate scope of Hadoop.  So I don't think anyone's currently working on this, although you are free to if you like.

Some workarounds that have been proposed are:

1. Run two sets of MapReduce daemons on the same cluster, one niced, the other not.  Since HADOOP-468, Hadoop has supported setting HADOOP_NICENESS in conf/hadoop-env.sh.  High-priority, short-running jobs can then be submitted to the un-niced cluster, while low-priority, long-running jobs are submitted to the niced cluster.  I've heard this works well: when a high-priority job is submitted, the low-priority job slows nearly to a halt, giving nearly all the resources to the high-priority job.

2. Use HOD (HADOOP-719) or Amazon EC2 to dynamically allocate MapReduce clusters per job or job-sequence.


> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Johan Oskarsson
>            Priority: Minor
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Updated: (HADOOP-1433) Add job priority

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-1433:
---------------------------------

    Status: Open  (was: Patch Available)

This looks reasonable to me too.  Unfortunately it no longer passes unit tests, e.g., TestMiniMRLocalFS failing with:

{code}
java.io.IOException: java.lang.NullPointerException
	at org.apache.hadoop.mapred.JobTracker$2.compare(JobTracker.java:1477)
	at org.apache.hadoop.mapred.JobTracker$2.compare(JobTracker.java:1476)
	at java.util.Arrays.mergeSort(Arrays.java:1284)
	at java.util.Arrays.sort(Arrays.java:1223)
	at java.util.Collections.sort(Collections.java:159)
	at org.apache.hadoop.mapred.JobTracker.resortPriority(JobTracker.java:1490)
	at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1462)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:340)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:566)
{code}
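
The trace above shows a NullPointerException thrown from inside a comparator while the job list was being sorted. The thread does not say which value was null, so the following is only a sketch of one common way such a failure arises, plus a defensive variant. Every name here (NullSafePrioritySketch, Job, Priority, effectivePriority) is hypothetical and not taken from the patch, and treating a missing priority as NORMAL is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class NullSafePrioritySketch {
    // Hypothetical priority levels; declaration order is highest first.
    enum Priority { HIGH, NORMAL, LOW }

    // Hypothetical stand-in for a job record whose priority may be unset.
    static final class Job {
        final String name;
        final Priority priority; // may be null in this sketch

        Job(String name, Priority priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    // Unsafe: dereferences a possibly-null priority, so sorting can throw.
    static final Comparator<Job> UNSAFE =
        (a, b) -> a.priority.compareTo(b.priority);

    // Assumption for this sketch: a missing priority counts as NORMAL.
    static Priority effectivePriority(Job job) {
        return job.priority == null ? Priority.NORMAL : job.priority;
    }

    // Defensive variant: compares the effective (never null) priority.
    static final Comparator<Job> SAFE =
        Comparator.comparing(NullSafePrioritySketch::effectivePriority);

    // Returns job names in the order produced by the defensive comparator.
    static List<String> safeOrder(List<Job> jobs) {
        List<Job> sorted = new ArrayList<>(jobs);
        sorted.sort(SAFE);
        List<String> names = new ArrayList<>();
        for (Job job : sorted) {
            names.add(job.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Job> jobs = Arrays.asList(
            new Job("a", Priority.LOW),
            new Job("b", null),
            new Job("c", Priority.HIGH));

        try {
            new ArrayList<>(jobs).sort(UNSAFE);
        } catch (NullPointerException e) {
            System.out.println("unsafe comparator threw NullPointerException");
        }
        System.out.println(safeOrder(jobs));
    }
}
```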


> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv2.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Commented: (HADOOP-1433) Add job priority

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511561 ] 

Tom White commented on HADOOP-1433:
-----------------------------------

+1

This looks like an unobtrusive change.

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv2.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Updated: (HADOOP-1433) Add job priority

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated HADOOP-1433:
------------------------------------

    Status: Patch Available  (was: Open)

new patch

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv2.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Commented: (HADOOP-1433) Add job priority

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510480 ] 

Hadoop QA commented on HADOOP-1433:
-----------------------------------

+1

http://issues.apache.org/jira/secure/attachment/12361215/jobpriorityv2.patch applied and successfully tested against trunk revision r553620.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/365/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/365/console

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv2.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Updated: (HADOOP-1433) Add job priority

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated HADOOP-1433:
------------------------------------

    Attachment: jobpriorityv1-trunk.patch

This patch adds a JobConf setting and web UI controls for changing the priority of a job.
It then sorts the collection containing the jobs by this priority and then by start date.
Hadoop picks the first job in this list and starts running tasks from that job.
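
The ordering described above (priority first, then start date) can be sketched as follows. This is a minimal illustration and not the code from the patch: the class JobPrioritySketch, the Job record, the Priority enum, and its levels are all hypothetical, and it uses a modern Java Comparator chain rather than whatever the patch itself does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class JobPrioritySketch {
    // Hypothetical priority levels; declaration order is highest first,
    // so the enum's natural ordering puts VERY_HIGH before VERY_LOW.
    enum Priority { VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW }

    // Hypothetical stand-in for the per-job record kept by the scheduler.
    static final class Job {
        final String name;
        final Priority priority;
        final long startTime; // e.g. submission time in milliseconds

        Job(String name, Priority priority, long startTime) {
            this.name = name;
            this.priority = priority;
            this.startTime = startTime;
        }
    }

    // Higher priority first; among equal priorities, earlier start first.
    static final Comparator<Job> BY_PRIORITY_THEN_START =
        Comparator.comparing((Job j) -> j.priority)
                  .thenComparingLong(j -> j.startTime);

    // Returns a sorted copy; the first element is the job to run next.
    static List<Job> schedulingOrder(List<Job> jobs) {
        List<Job> sorted = new ArrayList<>(jobs);
        sorted.sort(BY_PRIORITY_THEN_START);
        return sorted;
    }

    public static void main(String[] args) {
        List<Job> jobs = Arrays.asList(
            new Job("big-low", Priority.LOW, 100L),
            new Job("small-high", Priority.HIGH, 300L),
            new Job("older-high", Priority.HIGH, 200L));

        for (Job job : schedulingOrder(jobs)) {
            System.out.println(job.name);
        }
    }
}
```

In this sketch the two HIGH jobs come before the LOW job even though the LOW job started first, and the older HIGH job precedes the newer one.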

There might be situations where all reduce task slots have already been assigned. If the user then increases the priority of another job, that job cannot be assigned reducers until the previous jobs have released theirs.
I've left this as is for now, but there's room for improvement down the road.

Passes all unit tests on my machine.

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv1-trunk.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Updated: (HADOOP-1433) Add job priority

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated HADOOP-1433:
------------------------------------

    Attachment: jobpriorityv3.patch

I think the test failed due to the patch not applying cleanly. I tried to apply it to the current trunk and one line of the code that would cause the above error failed to apply.

I've updated the patch to work with trunk and added a few more safety checks. It passes the core tests, but I still hit the HADOOP-1596 problem, so I will not let Hudson run it just yet.

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv3.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Updated: (HADOOP-1433) Add job priority

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated HADOOP-1433:
------------------------------------

    Status: Open  (was: Patch Available)

canceling old patch

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv2.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Updated: (HADOOP-1433) Add job priority

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated HADOOP-1433:
------------------------------------

    Attachment:     (was: jobpriorityv1-trunk.patch)

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv2.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Commented: (HADOOP-1433) Add job priority

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510221 ] 

Hadoop QA commented on HADOOP-1433:
-----------------------------------

+0, new Findbugs warnings

http://issues.apache.org/jira/secure/attachment/12361146/jobpriorityv1-trunk.patch
applied and successfully tested against trunk revision r553080,
but there appear to be new Findbugs warnings introduced by this patch.

New Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/362/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/362/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/362/console

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv1-trunk.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.



[jira] Updated: (HADOOP-1433) Add job priority

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated HADOOP-1433:
------------------------------------

    Status: Patch Available  (was: Open)

> Add job priority
> ----------------
>
>                 Key: HADOOP-1433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1433
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: jobpriorityv3.patch
>
>
> As more and more developers start using our cluster we run into problems where big low priority jobs block smaller high priority ones.
> A simple way of specifying the job priority in the JobConf and perhaps even change it during runtime via the jobtracker web ui would help a lot.
