Posted to commits@oozie.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2011/09/08 06:41:09 UTC

[jira] [Created] (OOZIE-144) GH-118: JT/NN backoff if response time over threshold

GH-118: JT/NN backoff if response time over threshold
-----------------------------------------------------

                 Key: OOZIE-144
                 URL: https://issues.apache.org/jira/browse/OOZIE-144
             Project: Oozie
          Issue Type: Bug
            Reporter: Hadoop QA


If the JT/NN are overloaded, Oozie should back off temporarily.

This can be done in the HadoopAccessorService.

Because the JT/NN do not provide an API to report their current health, it has to be inferred from API calls that do a known, fixed amount of work: for example, asking the JT for the queue names, or asking the NN for the contents of the root directory.

A tool that queries these values should be run against the cluster to find the normal values and the values under stress. This would help to determine the threshold value for Oozie.
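
As a rough illustration of such a measuring tool, here is a minimal Java sketch. It is not part of Oozie: the class name ProbeTimer and its methods are hypothetical, and the probe is passed in as a plain Runnable that is assumed to wrap one fixed-cost call (for example the queue-name or root-directory request mentioned above). It times the probe repeatedly and reports a nearest-rank percentile, which could be compared between an idle and a loaded cluster.

```java
import java.util.Arrays;

/** Hypothetical helper, not an Oozie or Hadoop class. */
final class ProbeTimer {

    /** Runs the probe the given number of times; returns elapsed times in ms, sorted ascending. */
    static long[] sample(Runnable probe, int samples) {
        long[] elapsedMillis = new long[samples];
        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();
            probe.run(); // assumed to be one fixed-cost JT or NN call
            elapsedMillis[i] = (System.nanoTime() - start) / 1_000_000L;
        }
        Arrays.sort(elapsedMillis);
        return elapsedMillis;
    }

    /** Nearest-rank percentile of a non-empty ascending array; p is in (0, 100]. */
    static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

Running sample(...) once against a quiet cluster and once under load, then comparing a high percentile of each run, would give two numbers between which a threshold could be chosen.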

Before using a JT/NN handle (JobClient/FileSystem), Oozie will test the response time. If the response time is above the threshold, Oozie will back off for # seconds and will not attempt any call to the cluster.
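
The check described above could look roughly like the following Java sketch. This is only an illustration under stated assumptions, not the code in HadoopAccessorService: the class BackoffGate, its method tryAcquire, and the injectable clock are all hypothetical names, and the probe is again a Runnable assumed to wrap one fixed-cost JT or NN call.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch of the proposed check; not Oozie's actual implementation. */
final class BackoffGate {
    private final Runnable probe;            // assumed fixed-cost call against the JT or NN
    private final long thresholdMillis;      // chosen from measurements on the cluster
    private final long backoffMillis;        // the "# seconds" of the description, in ms
    private final LongSupplier clockMillis;  // injectable so the logic can be tested
    private long blockedUntilMillis = Long.MIN_VALUE;

    BackoffGate(Runnable probe, long thresholdMillis, long backoffMillis,
                LongSupplier clockMillis) {
        this.probe = probe;
        this.thresholdMillis = thresholdMillis;
        this.backoffMillis = backoffMillis;
        this.clockMillis = clockMillis;
    }

    /** Returns true if the caller may use its JT/NN handle now. */
    synchronized boolean tryAcquire() {
        long start = clockMillis.getAsLong();
        if (start < blockedUntilMillis) {
            return false; // still backing off: make no cluster call, not even the probe
        }
        probe.run();
        long end = clockMillis.getAsLong();
        if (end - start > thresholdMillis) {
            blockedUntilMillis = end + backoffMillis; // too slow: start a backoff window
            return false;
        }
        return true;
    }
}
```

A caller would construct one gate per service, for example with System::currentTimeMillis as the clock, and skip or defer the cluster call whenever tryAcquire() returns false. An exception thrown by the probe propagates to the caller in this sketch; whether that should also trigger a backoff is left open.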

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (OOZIE-144) GH-118: JT/NN backoff if response time over threshold

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/OOZIE-144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hadoop QA resolved OOZIE-144.
-----------------------------

    Resolution: Fixed


[jira] [Reopened] (OOZIE-144) GH-118: JT/NN backoff if response time over threshold

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/OOZIE-144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik reopened OOZIE-144:
------------------------------------

