Posted to commits@oodt.apache.org by "Gabe Resneck (Created) (JIRA)" <ji...@apache.org> on 2011/10/14 03:28:13 UTC

[jira] [Created] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs
-------------------------------------------------------------------------------------------------------

                 Key: OODT-335
                 URL: https://issues.apache.org/jira/browse/OODT-335
             Project: OODT
          Issue Type: Improvement
          Components: resource manager
            Reporter: Gabe Resneck
            Assignee: Gabe Resneck


While performing operations (such as prioritizing a job or flagging it to not be scheduled) upon specific jobs is well and good, it is not a very powerful tool.  Many times, operators want to perform these functions on groups of jobs, specifically, groups that share a key-value pair of metadata.  For example, this would allow an operator to move all jobs with the value "10/18" for the metadata key "date".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Chris A. Mattmann (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated OODT-335:
-----------------------------------


- push out to 0.5
                

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Gabe Resneck (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127705#comment-13127705 ] 

Gabe Resneck commented on OODT-335:
-----------------------------------

Brian and Chris, why is it that you two feel that the RM should be a more simple and less powerful component?  If the WFM has a certain operation that it can perform on workflows why should the RM not allow the same (or similar) operation on jobs?  This would make the CAS more consistent across components and easier to use.  How do you judge what functionality belongs in which component?

From what I have seen from the CAS users with whom I have interacted (ACOS/OCO2), adding additional capabilities to our components could greatly help operators.  Before I started working on the RM, the configuration used by the project operator specified that the maxSize for the RM JobQueue be 5.  The reason for this was that at the time the RM offered no ways to manipulate jobs beyond killing them.  Once a job had been successfully submitted to the RM, the operator lost almost all control over the job, short of killing it.  I argue that if we want the RM to be used with a JobQueue of any reasonable maxSize, we must increase its power by adding additional capabilities such as those that I have suggested in this issue and previous issues.  Otherwise, users will either use such infinitesimal queue sizes (in which case they're not using the RM as you intended anyway) or move away from using the RM altogether.
                

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Chris A. Mattmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128038#comment-13128038 ] 

Chris A. Mattmann commented on OODT-335:
----------------------------------------

bq. specifically, groups that share a key-value pair of metadata. For example, this would allow an operator to move all jobs with the value "10/18" for the metadata key "date".

Note, the resource manager is *not* intended to operate on cas-metadata. That's why we have the notion of JobInput, JobSpec, and Job, where Job and JobSpec are the unique combination of information required to run a Job at the resource manager level -- and why we don't have metadata at that level. One could achieve the functionality that you are requesting by performing these sets of operations at the workflow manager level -- whose primary concern and focus is the manipulation of metadata.

To summarize some of [~bfoster]'s comments:

{code:java}
public boolean killJob(String jobId) throws MonitorException
{code}

This method in XmlRpcResourceManager.java simply requires a jobId. You claim that an operator wants to kill a set of jobs based not on jobId but on some metadata spec. I'd say that can be done in the workflow manager by querying it for all jobs with the provided metadata (yes, this doesn't exist in current trunk, but as I merge in OODT-215, it will), grabbing workflowContext.getMetadata("WorkflowInstId"), and then iterating over that information and calling workflowManagerClient.stopWorkflowInstance (yes, it has the *exact* same method; note the duplication):

http://oodt.apache.org/components/maven/apidocs/org/apache/oodt/cas/workflow/system/XmlRpcWorkflowManagerClient.html#stopWorkflowInstance%28java.lang.String%29
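
As a rough sketch of the loop described above (not code from OODT): the WorkflowClient interface below is a hypothetical stand-in, listInstanceMetadata is an assumed placeholder for the metadata query that OODT-215 is expected to provide, and only the name stopWorkflowInstance comes from the linked Javadoc. The in-memory FakeClient exists only so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BulkStopSketch {

    /** Hypothetical stand-in for the workflow manager client. */
    interface WorkflowClient {
        /** Assumed query: metadata of every known instance, keyed by WorkflowInstId. */
        Map<String, Map<String, String>> listInstanceMetadata();

        /** Same name as the method linked above; signature simplified for the sketch. */
        boolean stopWorkflowInstance(String workflowInstId);
    }

    /** Stops each instance whose metadata maps key to value; returns the ids that were stopped. */
    static List<String> stopByMetadata(WorkflowClient client, String key, String value) {
        List<String> stopped = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : client.listInstanceMetadata().entrySet()) {
            boolean matches = value.equals(e.getValue().get(key));
            if (matches && client.stopWorkflowInstance(e.getKey())) {
                stopped.add(e.getKey());
            }
        }
        return stopped;
    }

    /** In-memory fake, used only to exercise the loop. */
    static class FakeClient implements WorkflowClient {
        final Map<String, Map<String, String>> instances = new LinkedHashMap<>();

        FakeClient add(String id, String date) {
            Map<String, String> md = new LinkedHashMap<>();
            md.put("date", date);
            instances.put(id, md);
            return this;
        }

        public Map<String, Map<String, String>> listInstanceMetadata() {
            return instances;
        }

        public boolean stopWorkflowInstance(String workflowInstId) {
            return instances.containsKey(workflowInstId);
        }
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient()
                .add("wf-1", "10/18")
                .add("wf-2", "10/19")
                .add("wf-3", "10/18");
        // Stops the two instances tagged date=10/18 and prints their ids.
        System.out.println(stopByMetadata(client, "date", "10/18"));
    }
}
```

The point of the sketch is that the grouping logic sits entirely on the caller's side, so the resource manager interfaces stay untouched.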

bq.i suggest they let the scientist continue using torque and leave the resource manager for PCS jobs

+1, I'm in favor of this too.

bq. the reason your proposed changes shouldn't be added into the core resource manager is because these features are primarily ACOS-only features and thus they don't warrant adding extra features into the core resource manager

This is also a major concern of mine. Over the past years, we've seen this happen to some of the core infrastructure and kudos to Brian for bringing this up as a concern and something we should be thinking about now. ACOS is not the only project (at JPL) that's using resource manager, let alone the only project out there in the broader community.

From [~gabe.resneck]:

bq. I don't see why such small changes are so detrimental.

Because then if other projects wanted to upgrade and they had code that compiled against these interfaces, it would require a code update to upgrade.
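
To make the upgrade concern concrete, here is a small sketch with hypothetical names (JobInputLike and LegacyJobInput are illustrations, not the real RM types). Adding a new abstract method to an interface stops every existing implementation from compiling until it is edited. The default-method variant shown requires Java 8 or later, which postdates this thread, so it is only an illustration of the trade-off and was not an option at the time.

```java
import java.util.Collections;
import java.util.Map;

class InterfaceChangeSketch {

    /** Hypothetical stand-in for an RM interface that downstream projects implement. */
    interface JobInputLike {
        String getId();

        // If this were declared as a plain abstract method,
        //     Map<String, String> getMetadata();
        // LegacyJobInput below would no longer compile until it was updated.
        // A default method (Java 8+) adds the capability without forcing that edit.
        default Map<String, String> getMetadata() {
            return Collections.emptyMap();
        }
    }

    /** Downstream implementation written before the new method existed. */
    static class LegacyJobInput implements JobInputLike {
        public String getId() {
            return "legacy-input";
        }
    }

    public static void main(String[] args) {
        JobInputLike input = new LegacyJobInput();
        // The legacy class still works and simply reports no metadata.
        System.out.println(input.getId() + " metadata=" + input.getMetadata());
    }
}
```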

From [~bfoster]:

bq. Making a stand-alone scheduler allows you to add these specialized metadata-query-like methods without imposing them on others who really don't need them. 

Bingo. Insulation here (and separation of concerns) is the key. 




                

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Brian Foster (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127745#comment-13127745 ] 

Brian Foster commented on OODT-335:
-----------------------------------

I offered ACOS several GOOD alternatives to their resource manager "issues", none of which they wanted... if you really want to add above features (and features in other issues posted) write your own stand-alone XmlRpcScheduler which comes up as its own server... then write an XmlRpcSchedulerClient which extends Scheduler for the ResourceManager which is a client to this server... then you can control your XmlRpcScheduler any way your project demands... you can wrap any information you want into the JobInput and have your scheduler pull it out... just realize that certain changes made in the resource manager must be synced back to wengine otherwise wengine will get confused
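
One possible shape for that split, as an in-process sketch under stated assumptions: SchedulerLike, SchedulerServer, SchedulerClient, and holdWhere are hypothetical names, not OODT classes, and the XML-RPC transport between client and server is left out so the example stays self-contained. The resource manager would only see the thin client, while the project-specific operations live on the server.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SchedulerSplitSketch {

    /** Hypothetical, minimal stand-in for the scheduling interface the resource manager calls. */
    interface SchedulerLike {
        void schedule(String jobId, Map<String, String> jobInput);
    }

    /** Stand-in for the stand-alone scheduler server; project-specific operations live here. */
    static class SchedulerServer {
        private final Map<String, Map<String, String>> queue = new LinkedHashMap<>();
        private final Set<String> held = new HashSet<>();

        void enqueue(String jobId, Map<String, String> jobInput) {
            queue.put(jobId, jobInput);
        }

        /** Project-specific call: hold every queued job whose input maps key to value. */
        int holdWhere(String key, String value) {
            int count = 0;
            for (Map.Entry<String, Map<String, String>> e : queue.entrySet()) {
                if (value.equals(e.getValue().get(key)) && held.add(e.getKey())) {
                    count++;
                }
            }
            return count;
        }

        /** Jobs that are queued and not held, in submission order. */
        List<String> runnable() {
            List<String> ids = new ArrayList<>();
            for (String id : queue.keySet()) {
                if (!held.contains(id)) {
                    ids.add(id);
                }
            }
            return ids;
        }
    }

    /** Plugged into the resource manager; it only forwards. A real client would call the server over XML-RPC. */
    static class SchedulerClient implements SchedulerLike {
        private final SchedulerServer server;

        SchedulerClient(SchedulerServer server) {
            this.server = server;
        }

        public void schedule(String jobId, Map<String, String> jobInput) {
            server.enqueue(jobId, jobInput);
        }
    }

    public static void main(String[] args) {
        SchedulerServer server = new SchedulerServer();
        SchedulerLike rmSide = new SchedulerClient(server);
        rmSide.schedule("job-1", Collections.singletonMap("date", "10/18"));
        rmSide.schedule("job-2", Collections.singletonMap("date", "10/19"));
        rmSide.schedule("job-3", Collections.singletonMap("date", "10/18"));

        // An operator tool talks to the server directly, bypassing the resource manager.
        System.out.println("held: " + server.holdWhere("date", "10/18"));
        System.out.println("runnable: " + server.runnable());
    }
}
```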
                

        

[jira] [Updated] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Chris A. Mattmann (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated OODT-335:
-----------------------------------

    Fix Version/s:     (was: 0.4)
                   0.5

- push to 0.5
                

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Brian Foster (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127815#comment-13127815 ] 

Brian Foster commented on OODT-335:
-----------------------------------

The only reason ACOS wanted these features is because they wanted to use the resource manager for PCS jobs and scientist jobs... i suggest they let the scientist continue using torque and leave the resource manager for PCS jobs... you could then write a simple wrapper script that would allow you to change node allocations of both torque and resource manager at the same time to ensure node allocation never exceeds 100% between the 2 components... for example: say torque was using 60% and the resource manager was set to 40%, the wrapper script would allow you to change it up so that torque would only have 40% and the resource manager would have 60%... the script would just translate percentage to whatever allocation unit the component used -- i.e. if the full load for a resource manager node was, say, 100, then 40% would translate to 40 load.
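
The percentage translation can be sketched as below. This is only an illustration of the arithmetic: the class and method names are made up, the torque capacity of 8 units is an assumed figure, and actually pushing the values into torque or the resource manager is not shown.

```java
class AllocationSketch {

    /** Converts a percentage of a node into a component's own load units, rounding down. */
    static int toLoadUnits(int percent, int fullLoad) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 0 and 100: " + percent);
        }
        return fullLoad * percent / 100;
    }

    /**
     * Splits one node between torque and the resource manager so the shares always total 100%.
     * Returns {torqueUnits, resourceManagerUnits}.
     */
    static int[] split(int torquePercent, int torqueFullLoad, int rmFullLoad) {
        int rmPercent = 100 - torquePercent;
        return new int[] {
            toLoadUnits(torquePercent, torqueFullLoad),
            toLoadUnits(rmPercent, rmFullLoad)
        };
    }

    public static void main(String[] args) {
        // Resource manager full load of 100, as in the example above: 40% becomes 40 load.
        System.out.println(toLoadUnits(40, 100));

        // Move from a 60/40 split to 40/60; torque capacity of 8 units is an assumption.
        int[] units = split(40, 8, 100);
        System.out.println("torque=" + units[0] + " rm=" + units[1]);
    }
}
```

Deriving the second share as 100 minus the first is what keeps the combined allocation from exceeding 100%.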
                

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Brian Foster (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128003#comment-13128003 ] 

Brian Foster commented on OODT-335:
-----------------------------------

your patch is super specialized and it doesn't have the part of the code for the Scheduler change... i believe torque even keeps its complex scheduling code separate from its resource manager (i.e. maui)... your patch will take away the ability to have a simple stupid scheduler for cases where ppl don't want ACOS complexities... from your patch it looks like each scheduler will now be expected to poll its JobRepo for Jobs whose status has been changed by other components (i.e. Your XmlRpcResourceManager flag for schedule method)... the Scheduler is expected to persist to the JobRepo, not be triggered to do different things based on what's in it... Making a stand-alone scheduler allows you to add these specialized metadata-query-like methods without imposing them on others who really don't need them.  You could even just make your scheduler have a server in it to which you can make XML-RPC calls directly (doesn't even need to be stand-alone)... the stand-alone idea came from torque/maui.
                

       

[jira] [Updated] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Chris A. Mattmann (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated OODT-335:
-----------------------------------

    Fix Version/s: 0.4

- schedule
                

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Gabe Resneck (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131946#comment-13131946 ] 

Gabe Resneck commented on OODT-335:
-----------------------------------

Ok, I can tell when an argument is more trouble than it's worth.  I've laid out 2 options to the ACOS folks: either start developing the RM according to their designs on a branch (similar to the wEngine) or take the approach that you guys recommended of building an independent scheduler server.  They have not yet responded.  Obviously, taking the latter approach would be preferable from the perspective of the OODT project, but from their point of view, there are pros and cons to each option.

Before they decide to go one way or another, I would really like to learn more about the scheduler server.  What would the logical structure of that look like?  How would it communicate with the Scheduler and vice versa?

Also, if they do decide to take your recommended approach, we will need the feature that you described in OODT-215 more urgently.  As such, Chris, would you please prioritize that task?
                

        

[jira] [Updated] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Gabe Resneck (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabe Resneck updated OODT-335:
------------------------------

    Attachment: OODT-335_resneck_10-14-11_DO_NOT_COMMIT.txt
    

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Gabe Resneck (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127966#comment-13127966 ] 

Gabe Resneck commented on OODT-335:
-----------------------------------

You're right, Brian, in that this change would require the modification of a couple of interfaces in the RM.  I've attached a patch to show the extent of the "damage".  The good news is that there are only two interfaces that change, and both changes only require the addition of a single, simple method.  Yes, changing the JobInput interface does require some changes to the implementation of ResourceJobInput in wEngine, but the change is trivial.  I don't see why such small changes are so detrimental.
                

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Chris A. Mattmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128036#comment-13128036 ] 

Chris A. Mattmann commented on OODT-335:
----------------------------------------

{quote}
why is it that you two feel that the RM should be a more simple and less powerful component? 
{quote}

It's less about being less powerful, and more about separation of concerns. The concern list that I had always intended for the resource manager was fairly small; see:

http://oodt.apache.org/components/maven/resource/user/

Note the vocabulary -- it's comparatively less complex than the workflow manager. The rationale behind this is that the workflow manager was always intended to control metadata (static task based configuration, and dynamic run-time parameters), control-flow and workflow. See this paper for a description:

http://sunset.usc.edu/~mattmann/pubs/SMCIT09.pdf

{quote}
If the WFM has a certain operation that it can perform on workflows why should the RM not allow the same (or similar) operation on jobs? 
{quote}

True, some of the job management features and utilities would be great. However, we need to be _very_ careful each and every time there is the knee-jerk reaction to *add something to the resource manager* when instead we need to take a look at how we can support this functionality with the existing infrastructure. 

Take this from two people who recently went through this exact same interaction -- the important thing to do here is to ask yourself: why am I adding things to an infrastructure that's been used to process data on a number of research projects (water resource management, square kilometre array, snow hydrology, EDRN proteomics processing, climate data exchange, etc.)? All of those projects found out how to use the resource management service and workflow management service in their existing forms by leveraging the infrastructure rather than reinventing it. 

{quote}
This would make the CAS more consistent across components and easier to use.
{quote}

In what way? I see it as taking a component that was never intended to have a large footprint, and bloating it out and I'm not in favor of that.

{quote}
 How do you judge what functionality belongs in which component?
{quote}

A lot of it is driven by experience working on projects -- I've worked on a ton. The other part of it is by carefully examining:

* separation of concerns
* intended functionality
* footprint
* intended architectural interfaces
* evolution patterns
* operational patterns

...and many more.

                
> Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs
> -------------------------------------------------------------------------------------------------------
>
>                 Key: OODT-335
>                 URL: https://issues.apache.org/jira/browse/OODT-335
>             Project: OODT
>          Issue Type: Improvement
>          Components: resource manager
>            Reporter: Gabe Resneck
>            Assignee: Gabe Resneck
>         Attachments: OODT-335_resneck_10-14-11_DO_NOT_COMMIT.txt
>
>
> While performing operations (such as prioritizing a job or flagging it to not be scheduled) upon specific jobs is well and good, it is not a very powerful tool.  Many times, operators want to perform these functions on groups of jobs, specifically, groups that share a key-value pair of metadata.  For example, this would allow an operator to move all jobs with the value "10/18" for the metadata key "date".


        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Chris A. Mattmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127244#comment-13127244 ] 

Chris A. Mattmann commented on OODT-335:
----------------------------------------

Why can't this be solved with the Workflow Manager? I'm seeing us conflating these two components. The RM was *never* intended to have this capability.
                

        

[jira] [Issue Comment Edited] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Brian Foster (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127249#comment-13127249 ] 

Brian Foster edited comment on OODT-335 at 10/14/11 2:28 AM:
-------------------------------------------------------------

I think you are looking to add too many features into the resource manager... the resource manager is meant to be dumb... this feature could be added into the workflow manager... the workflow manager (both trunk and wengine versions) understands metadata already and keeps track of the resource manager jobId for every Task it submits... the resource manager already supports delete by JobId -- i.e. XmlRpcResourceManager's: public boolean killJob(String jobId) throws MonitorException... this issue should be attached to the workflow manager and then discussed in that context... the resource manager should only know how to find a node for a job, run the job, and stop the job... everything else belongs in the workflow manager or another wrapper component.
                

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Gabe Resneck (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127228#comment-13127228 ] 

Gabe Resneck commented on OODT-335:
-----------------------------------

I should add that one thing required for this task is to make metadata available to the RM.  I feel that the best place to put the metadata (or rather, the best place to provide access to the metadata) is in the JobInput implementation, as this allows the project developer to negotiate the manner in which Metadata is stored and accessed.
The upshot of all of this is that the JobInput interface must be extended to include a method that allows access to job metadata, and the implementation will define how this access takes place.
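
A minimal sketch of what such an extension could look like, for discussion only. It does not reproduce the real JobInput interface: MetadataAwareInput, getJobMetadata(), InMemoryMetadataInput and MetadataMatcher are hypothetical names, and metadata is simplified here to a flat String-to-String map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the extra method proposed for JobInput.
// The real interface is not reproduced here.
interface MetadataAwareInput {
    Map<String, String> getJobMetadata();
}

// One possible implementation: metadata is copied at construction time and
// held in memory. A project could instead look it up from its own store.
class InMemoryMetadataInput implements MetadataAwareInput {
    private final Map<String, String> metadata;

    InMemoryMetadataInput(Map<String, String> metadata) {
        this.metadata = Collections.unmodifiableMap(new HashMap<>(metadata));
    }

    @Override
    public Map<String, String> getJobMetadata() {
        return metadata;
    }
}

// Helper that tests whether a job input carries a given key-value pair.
class MetadataMatcher {
    static boolean matches(MetadataAwareInput input, String key, String value) {
        return value.equals(input.getJobMetadata().get(key));
    }
}
```

With this shape, the caller only depends on the accessor, so each project stays free to decide where the metadata actually lives.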
                

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Brian Foster (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127835#comment-13127835 ] 

Brian Foster commented on OODT-335:
-----------------------------------

the reason your proposed changes shouldn't be added into the core resource manager is because these features are primarily ACOS only features and thus they don't warrant adding extra features into the core resource manager... your changes would change several method signatures and even interface definitions (i.e. definitely the Scheduler, possibly JobSpec, and probably more)... all the other projects currently in place, if they wanted to update to the changes you propose, would have to make significant code changes just to get "niceties"... these changes would require changes to both wengine and trunk workflow to make them work with the fact that the resource manager was metadata aware... one change you might be able to get through though is adding a new method in XmlRpcResourceManager: killJobByName(String jobName)... then you could stuff all you want into that jobName (i.e. ACOS_Blah_Blah_Somedate_Somedude_SomeDudesDogsName get my drift?)... then you could modify any job by just doing a kill/readd... HTH
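
A rough sketch of the job-name convention suggested above, assuming fields are joined with an underscore as in the example name. JobNames and its methods are hypothetical helpers for a project repo, not part of the resource manager, and killJobByName itself is only the proposed method, so it is not called here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for packing fields into a job name and finding
// the names that carry a given field.
class JobNames {
    private static final String SEP = "_";

    // Joins the fields into one job name; rejects fields that contain the
    // separator, since they could not be split back out unambiguously.
    static String encode(String... fields) {
        for (String field : fields) {
            if (field.contains(SEP)) {
                throw new IllegalArgumentException("field contains separator: " + field);
            }
        }
        return String.join(SEP, fields);
    }

    // True if the job name contains the field as a whole token.
    static boolean hasField(String jobName, String field) {
        return Arrays.asList(jobName.split(SEP)).contains(field);
    }

    // Returns, in input order, the job names that carry the field.
    static List<String> select(List<String> jobNames, String field) {
        List<String> selected = new ArrayList<>();
        for (String jobName : jobNames) {
            if (hasField(jobName, field)) {
                selected.add(jobName);
            }
        }
        return selected;
    }
}
```

The selected names are the ones an operator would then pass, one at a time, to a kill-by-name call before re-adding the jobs with new settings.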
                

       

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Chris A. Mattmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131995#comment-13131995 ] 

Chris A. Mattmann commented on OODT-335:
----------------------------------------

I threw up a wiki page (based on our old internal JPL one) that hopefully helps to demystify the head scratching process of where-should-my-code-go?

https://cwiki.apache.org/confluence/display/OODT/Apache+OODT+CM+demystified

Updates/improvements/contributions welcome.
                

        

[jira] [Issue Comment Edited] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Chris A. Mattmann (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128036#comment-13128036 ] 

Chris A. Mattmann edited comment on OODT-335 at 10/15/11 2:24 AM:
------------------------------------------------------------------

{quote}
why is it that you two feel that the RM should be a more simple and less powerful component? 
{quote}

It's less about being less powerful, and more about separation of concerns. The list of concerns that I had always intended for the resource manager was fairly small; see:

http://oodt.apache.org/components/maven/resource/user/

Note the vocabulary -- it's comparatively less complex than the workflow manager. The rationale behind this is that the workflow manager was always intended to control metadata (static task based configuration, and dynamic run-time parameters), control-flow and workflow. See this paper for a description:

http://sunset.usc.edu/~mattmann/pubs/SMCIT09.pdf

{quote}
If the WFM has a certain operation that it can perform on workflows why should the RM not allow the same (or similar) operation on jobs? 
{quote}

True, some of the job management features and utilities would be great. However, we need to be _very_ careful each and every time there is the knee-jerk reaction to *add something to the resource manager* when instead we need to take a look at how we can support this functionality with the existing infrastructure. 

Take this from two people who recently went through this exact same interaction -- the important thing to do here is to ask yourself -- why am I adding things to an infrastructure that's been used to process data on a number of research projects (water resource management, Square Kilometre Array, snow hydrology, EDRN proteomics processing, climate data exchange, etc.)? All of those projects figured out how to use the resource management service and workflow management service in their existing forms by leveraging the infrastructure rather than reinventing it.

{quote}
This would make the CAS more consistent across components and easier to use.
{quote}

In what way? I see it as taking a component that was never intended to have a large footprint and bloating it, and I'm not in favor of that.

{quote}
 How do you judge what functionality belongs in which component?
{quote}

A lot of it is driven by experience working on projects -- I've worked on a ton. The other part of it is by carefully examining:

* separation of concerns
* intended functionality
* footprint
* intended architectural interfaces
* evolution patterns
* operational patterns

...and many more.

                

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Brian Foster (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127249#comment-13127249 ] 

Brian Foster commented on OODT-335:
-----------------------------------

I think you are looking to add too many features into the resource manager... the resource manager is meant to be dumb... this feature could be added into the workflow manager... the workflow manager (both trunk and wengine versions) understands metadata already and keeps track of the resource manager jobId for every Task it submits... the resource manager already supports delete by JobId -- i.e. XmlRpcResourceManager's: public boolean killJob(String jobId) throws MonitorException... this issue should be added to the workflow manager and then discussed in that context... the resource manager should only know how to find a node for a job, run the job, and stop the job... everything else belongs in the workflow manager or another wrapper component.
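
A minimal sketch of the "wrapper component" idea, under stated assumptions: the wrapper remembers the metadata and the resource manager jobId of each task it submits, then kills matching jobs one by one through the existing kill-by-id call. JobKiller and MetadataJobIndex are hypothetical names; JobKiller stands in for the killJob(String jobId) method quoted above and leaves out its MonitorException for brevity, and metadata is simplified to a String-to-String map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the resource manager's kill-by-id call.
interface JobKiller {
    boolean killJob(String jobId);
}

// Hypothetical wrapper-side index from jobId to the metadata of the task
// that was submitted under that id. The resource manager never sees it.
class MetadataJobIndex {
    // LinkedHashMap keeps submission order, so results are deterministic.
    private final Map<String, Map<String, String>> metadataByJobId = new LinkedHashMap<>();

    void recordSubmission(String jobId, Map<String, String> metadata) {
        metadataByJobId.put(jobId, new HashMap<>(metadata));
    }

    // Returns the ids of all recorded jobs whose metadata has key == value.
    List<String> findJobIds(String key, String value) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : metadataByJobId.entrySet()) {
            if (value.equals(entry.getValue().get(key))) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    // Kills each matching job by id and forgets the ones that were killed.
    // Returns the ids for which the kill call reported success.
    List<String> killMatching(String key, String value, JobKiller resourceManager) {
        List<String> killed = new ArrayList<>();
        for (String jobId : findJobIds(key, value)) {
            if (resourceManager.killJob(jobId)) {
                killed.add(jobId);
                metadataByJobId.remove(jobId);
            }
        }
        return killed;
    }
}
```

The point of the design is that the group operation is built entirely out of per-job calls, so no resource manager interface has to change.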
                

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Gabe Resneck (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127756#comment-13127756 ] 

Gabe Resneck commented on OODT-335:
-----------------------------------

What is the problem with adding the features that I have suggested to the RM server and client (as opposed to your proposed solution of adding a new Scheduler implementation and a Scheduler client)?
Also, what solutions did you offer to ACOS?  It would be very useful to know, as I may be able to use some of those suggestions in my implementation.
                

        

[jira] [Commented] (OODT-335) Allow for some Resource Manager operations to act on groups of jobs based upon Metadata key-value pairs

Posted by "Chris A. Mattmann (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131986#comment-13131986 ] 

Chris A. Mattmann commented on OODT-335:
----------------------------------------

bq. Ok, I can tell when an argument is more trouble than it's worth.

First off, let's get away from the word "argument". This isn't an argument, it's a discussion, and a mighty useful one at that.
It's helped to flush out design decisions and guidelines and get them out in the open for all to benefit from. So, good job on 
that.

bq. approach that you guys recommended of building an independent scheduler server

Hrm, I'm not sure a *scheduler server* is as important as *an implementation of the o.a.oodt.cas.resource.scheduler.Scheduler interface*. 

bq. Obviously, taking the latter approach would be preferable from the perspective of the OODT project, but from their point of view, there are pros and cons to each option.

Can you enumerate the pros and cons? 

Here's my general rule of thumb:

* if something is experimental, or requires changing a bunch of interfaces, or adding code, then first:
  - develop it locally in a project repo
  - field it, experiment with it
* decide if and how it can be made general based on the above experience

bq. What would the logical structure of that look like? How would it communicate with the Scheduler and vice versa?

I'd imagine it's just a concrete implementation of the Scheduler interface that doesn't impose any 
interface-level or generalized level changes to the interface. 

There's also another subtle approach being lost here -- and in fact a *3rd* option. The answer is, 
this can be done in the workflow manager -- why not do it there?

bq. Also, if they do decide to take your recommended approach, we will need the feature that you described in OODT-215 more urgently. As such, Chris, would you please prioritize that task?

That's not the way it works in open source. You don't ask people to prioritize tasks. We're all volunteers here 
in Apache. If you'd like to help it get done quicker, patches, and help, are welcomed :-)
                