Posted to commits@oozie.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2011/09/08 06:57:09 UTC
[jira] [Created] (OOZIE-258) GH-361: Throttle coordinator
action/workflow creation per coordinator job
GH-361: Throttle coordinator action/workflow creation per coordinator job
-------------------------------------------------------------------------
Key: OOZIE-258
URL: https://issues.apache.org/jira/browse/OOZIE-258
Project: Oozie
Issue Type: Bug
Reporter: Hadoop QA
Currently, if thousands of coordinator actions are eligible to be created for one coordinator job (e.g. in catch-up mode), Oozie creates all of them at once, overloading the system.
This needs to be controlled: Oozie should materialize only X active actions per coordinator job. The
value of X should be configurable; whether at the Oozie (system) level or per job still needs to be decided.
In addition, there should be a way for the customer to define the order in which actions are created. For
example, if 1000 actions are ready to be created, which X actions should be created first, e.g. in FIFO or LIFO order?
This would reduce the load on the internal queue.
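
As a rough illustration of the proposal above (not the change that was eventually committed), the choice of which pending actions to create next could be sketched as follows. The function name `select_actions_to_materialize` and all of its parameters are hypothetical and do not correspond to Oozie code; Python is used only for brevity.

```python
def select_actions_to_materialize(pending, active_count, max_active, order="FIFO"):
    """Return the pending action instances that may be created now.

    pending      -- instance ids waiting to be created, oldest first
    active_count -- actions of this job already created and not yet finished
    max_active   -- the throttle "X" from the description (hypothetical setting)
    order        -- "FIFO" (oldest first) or "LIFO" (newest first)
    """
    if max_active < 0 or active_count < 0:
        raise ValueError("max_active and active_count must be non-negative")

    # Free slots left under the throttle; never negative.
    slots = max(0, max_active - active_count)

    if order == "FIFO":
        return pending[:slots]
    if order == "LIFO":
        # Reverse first so that slots == 0 correctly yields an empty list.
        return pending[::-1][:slots]
    raise ValueError("unknown order: " + order)
```

With 1000 pending instances and X = 3, a FIFO policy would create instances 1 to 3 first, while a LIFO policy would start from instance 1000. The remaining instances stay pending until active actions finish and free a slot.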
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator
action/workflow creation per coordinator job
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101939#comment-13101939 ]
Hadoop QA commented on OOZIE-258:
---------------------------------
brookwc remarked:
Closed by b0446e9ae627cb5edc8a34a50d545f0789607ebc Throttle coordinator action/workflow creation per coordinator job
[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator
action/workflow creation per coordinator job
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101937#comment-13101937 ]
Hadoop QA commented on OOZIE-258:
---------------------------------
anew remarked:
I think that this should be user-defined, depending on the frequency of a job. And there should be a system-wide, admin-defined cap to protect Oozie's resources from "greedy" users.
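
One way to read this suggestion is that the effective limit is the smaller of the user's requested value and the admin-defined cap. The sketch below assumes exactly that; `effective_throttle`, `user_value`, and `system_cap` are hypothetical names, not Oozie settings.

```python
def effective_throttle(user_value, system_cap):
    """Combine a per-job (user) throttle with a system-wide (admin) cap.

    user_value -- throttle requested in the job definition, or None if unset
    system_cap -- upper bound set by the administrator (must be >= 1)
    """
    if system_cap < 1:
        raise ValueError("system_cap must be at least 1")
    if user_value is None:
        # Assumption: jobs that set no value simply get the system cap.
        return system_cap
    if user_value < 1:
        raise ValueError("user_value must be at least 1")
    # A "greedy" request is clamped to the admin-defined cap.
    return min(user_value, system_cap)
```

Under this assumption a user asking for 500 on a system capped at 50 would get 50, while a user asking for 3 would get 3.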
[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator
action/workflow creation per coordinator job
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101936#comment-13101936 ]
Hadoop QA commented on OOZIE-258:
---------------------------------
mislam77 remarked:
Also, finding a "good" value that is applicable to "all types" of jobs is a challenge.
For example, a value of 3 is sufficient for a monthly job, but not for a job that runs every few minutes.
[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator
action/workflow creation per coordinator job
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099922#comment-13099922 ]
Hadoop QA commented on OOZIE-258:
---------------------------------
mislam77 remarked:
Please comment on these open questions:
* How should the throttling value/order be defined?
- System-wide / system-defined.
- Per job / user-defined.
[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator
action/workflow creation per coordinator job
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101935#comment-13101935 ]
Hadoop QA commented on OOZIE-258:
---------------------------------
mislam77 remarked:
No, it was not implemented.
There is a similar mechanism for WF submission: when a lot of coordinator actions are ready to execute, Oozie throttles them using "concurrency" and an order such as FIFO/LIFO.
I think you were confusing this issue with that.
This issue is about throttling during action creation.
A throttling configuration at the system level might create issues in the future. For example, suppose that at some point a user has 100 actions to be materialized but the system throttling value is 7. All 7 materialized actions are waiting for data. However, the user already has the data for the 10th and 12th instances. In this case, if he could set the throttling value to 12, his ready jobs would have been started.
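
The scenario above can be worked through with a small sketch. It assumes that actions are materialized in instance order (1, 2, 3, ...) up to the throttle value and that an action can start only if its input data is available. The helper `startable_instances` is hypothetical and only models the example, not Oozie's behavior.

```python
def startable_instances(total_actions, throttle, instances_with_data):
    """Return the materialized instances whose data is already available.

    total_actions       -- number of actions eligible for materialization
    throttle            -- maximum number of materialized actions
    instances_with_data -- set of instance numbers whose input data exists
    """
    # Assumption: the first `throttle` instances are the ones materialized.
    materialized = range(1, min(throttle, total_actions) + 1)
    return [i for i in materialized if i in instances_with_data]


# System-wide value of 7: instances 1..7 are materialized and all wait for data.
print(startable_instances(100, 7, {10, 12}))   # []
# A per-job value of 12 would let the 10th and 12th instances start.
print(startable_instances(100, 12, {10, 12}))  # [10, 12]
```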
[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator
action/workflow creation per coordinator job
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101938#comment-13101938 ]
Hadoop QA commented on OOZIE-258:
---------------------------------
mislam77 remarked:
How will a user define this value? Should we overload the meaning of "concurrency", which is used for "ready" jobs, or should there be a new parameter such as "max_active_action"?
[jira] [Closed] (OOZIE-258) GH-361: Throttle coordinator
action/workflow creation per coordinator job
Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Roman Shaposhnik closed OOZIE-258.
----------------------------------
Resolution: Fixed
[jira] [Commented] (OOZIE-258) GH-361: Throttle coordinator
action/workflow creation per coordinator job
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/OOZIE-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101934#comment-13101934 ]
Hadoop QA commented on OOZIE-258:
---------------------------------
tucu00 remarked:
IMO this should be system-wide. System admins should control this, not users.
Action materialization being FIFO/LIFO/lastOnly is already defined in the spec.
I presume it is implemented, correct?