Posted to commits@oozie.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2011/09/08 06:57:09 UTC

[jira] [Created] (OOZIE-258) GH-361: Throttle coordinator action/workflow creation per coordinator job

GH-361: Throttle coordinator action/workflow creation per coordinator job
-------------------------------------------------------------------------

                 Key: OOZIE-258
                 URL: https://issues.apache.org/jira/browse/OOZIE-258
             Project: Oozie
          Issue Type: Bug
            Reporter: Hadoop QA


Currently, if thousands of coordinator actions are eligible to be created for one coordinator job (e.g. in catch-up mode), Oozie creates all of them at once and overloads the system.

This needs to be controlled: Oozie should materialize only X active actions per coordinator job. Whether the
value of X should be configurable at the Oozie level or at the job level still needs to be decided.

In addition, there should be a way for the customer to define the order in which actions are created. For
example, if 1000 actions are ready to be created, which X actions should be created first? The order could be FIFO or LIFO.

This will reduce the load on the internal queue.
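
The request above can be illustrated with a minimal Python sketch. It is not Oozie code: the function name, the max_active parameter (standing in for X), and the list-of-ids model are all assumptions made for illustration. It shows one way to pick which pending actions to create when only X may be active at a time, in FIFO or LIFO order.

```python
def materialize_batch(pending, active_count, max_active, order="FIFO"):
    """Return the pending action ids that may be created right now.

    pending      -- action ids sorted by nominal time, oldest first
    active_count -- actions already created and not yet finished
    max_active   -- the throttle X (hypothetical name, not an Oozie setting)
    order        -- "FIFO" (oldest first) or "LIFO" (newest first)
    """
    if order not in ("FIFO", "LIFO"):
        raise ValueError("order must be 'FIFO' or 'LIFO'")
    # Free slots left under the throttle; never negative.
    slots = max(0, max_active - active_count)
    if slots == 0:
        return []
    if order == "FIFO":
        return pending[:slots]
    # LIFO: take the newest actions, newest first.
    return pending[-slots:][::-1]
```

With 1000 pending actions, a throttle of 5, and 2 actions already active, this sketch would create the 3 oldest actions under FIFO and the 3 newest under LIFO.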

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator action/workflow creation per coordinator job

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101939#comment-13101939 ] 

Hadoop QA commented on OOZIE-258:
---------------------------------

brookwc remarked:
Closed by b0446e9ae627cb5edc8a34a50d545f0789607ebc Throttle coordinator action/workflow creation per coordinator job


        

[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator action/workflow creation per coordinator job

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101937#comment-13101937 ] 

Hadoop QA commented on OOZIE-258:
---------------------------------

anew remarked:
I think that this should be user-defined, depending on the frequency of a job. And there should be a system-wide, admin-defined cap to protect Oozie's resources from "greedy" users.
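
A minimal Python sketch of this combination, assuming hypothetical names (effective_throttle, user_value, system_cap) that do not come from Oozie: the user picks a value per job, and the admin-defined cap bounds it.

```python
def effective_throttle(user_value, system_cap):
    """Combine a per-job, user-defined throttle with a system-wide cap.

    user_value -- throttle requested by the user, or None if not set
    system_cap -- admin-defined upper bound (assumed to be a positive int)
    """
    if system_cap < 1:
        raise ValueError("system_cap must be positive")
    if user_value is None:
        # Assumption: a job without its own value falls back to the cap.
        return system_cap
    if user_value < 1:
        raise ValueError("user_value must be positive")
    # A "greedy" user value is clamped to the admin-defined cap.
    return min(user_value, system_cap)
```

Under these assumptions, a job asking for 500 on a system capped at 50 would get 50, while a monthly job asking for 3 would keep 3.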


        

[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator action/workflow creation per coordinator job

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101936#comment-13101936 ] 

Hadoop QA commented on OOZIE-258:
---------------------------------

mislam77 remarked:
Also, finding a "good" value that applies to "all types" of jobs is a challenge.
For example, a value of 3 is sufficient for a monthly job but not for a job that runs every few minutes.


        

[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator action/workflow creation per coordinator job

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099922#comment-13099922 ] 

Hadoop QA commented on OOZIE-258:
---------------------------------

mislam77 remarked:
Please comment on these open questions:
* How should the throttling value/order be defined?
       - System-wide / system-defined.
       - Per job / user-defined.


        

[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator action/workflow creation per coordinator job

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101935#comment-13101935 ] 

Hadoop QA commented on OOZIE-258:
---------------------------------

mislam77 remarked:
No, it was not implemented.
There is a similar mechanism in WF submission: when many coordinator actions are ready to execute, Oozie throttles them using "concurrency" and an order such as FIFO/LIFO.
I think you confused this issue with that.

This issue is about throttling during action creation.

A throttling configuration at the system level might create issues in the future. For example, suppose a user has 100 actions to be materialized but the system throttling value is 7. All 7 materialized actions are waiting for data. However, the user already has the data for the 10th and 12th instances. In this case, if the user could set the throttling value to 12, those ready jobs would have been started.
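
The scenario above can be worked through with a small Python sketch. It is an illustration only, assuming that actions are created in FIFO order, that an action waiting for data still occupies a throttle slot, and that instances are numbered from 1; the function name is hypothetical.

```python
def runnable_actions(total, throttle, instances_with_data):
    """Return the materialized instances whose input data is available.

    total               -- number of actions waiting to be materialized
    throttle            -- maximum number of actions materialized at once
    instances_with_data -- set of instance numbers whose data has arrived
    """
    # FIFO creation: only the first `throttle` instances are materialized.
    materialized = range(1, min(throttle, total) + 1)
    return [i for i in materialized if i in instances_with_data]


# Data is available only for the 10th and 12th instances.
print(runnable_actions(100, 7, {10, 12}))   # throttle 7: nothing can start
print(runnable_actions(100, 12, {10, 12}))  # throttle 12: 10 and 12 can start
```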


        

[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator action/workflow creation per coordinator job

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101938#comment-13101938 ] 

Hadoop QA commented on OOZIE-258:
---------------------------------

mislam77 remarked:
How will a user define this value? Should we overload the meaning of "concurrency", which is used for "ready" jobs, or should there be a new parameter such as "max_active_action"?


        

[jira] [Closed] (OOZIE-258) GH-361: Throttle coordinator action/workflow creation per coordinator job

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik closed OOZIE-258.
----------------------------------

    Resolution: Fixed


        

[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator action/workflow creation per coordinator job

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101934#comment-13101934 ] 

Hadoop QA commented on OOZIE-258:
---------------------------------

tucu00 remarked:
IMO this should be system-wide. System admins should control this, not users.

Whether action materialization is FIFO/LIFO/lastOnly is already defined in the spec.

I presume it is implemented, correct?
