Posted to dev@bigtop.apache.org by "Johnny Zhang (JIRA)" <ji...@apache.org> on 2012/09/28 23:37:07 UTC

[jira] [Created] (BIGTOP-721) improve the package daemon status check, check twice by some delay if status doesn't match expected value

Johnny Zhang created BIGTOP-721:
-----------------------------------

             Summary: improve the package daemon status check, check twice by some delay if status doesn't match expected value
                 Key: BIGTOP-721
                 URL: https://issues.apache.org/jira/browse/BIGTOP-721
             Project: Bigtop
          Issue Type: Bug
          Components: Tests
    Affects Versions: 0.5.0
            Reporter: Johnny Zhang
            Assignee: Johnny Zhang
            Priority: Critical
             Fix For: 0.5.0


The current package test checks that each daemon can be correctly started, stopped, restarted, and so on. It first changes the daemon's state (starts or stops it), then sleeps for 3001 ms, and then uses the checkThat function to check whether the daemon status satisfies the matcher.

However, daemons do not all update their status at the same speed: some take slightly longer than others. In that case, we want the test to wait and check the daemon status again if it does not satisfy the matcher on the first check.

For example:
{noformat}
service flume-ng-agent start
__EOT__
12/09/28 14:10:19 TRACE shell.Shell: 
<stdout>
Flume NG agent is not running..failed
Starting Flume NG agent daemon (flume-ng-agent): ..done
</stdout>
12/09/28 14:10:22 TRACE shell.Shell: /bin/bash -s << __EOT__
service flume-ng-agent status
__EOT__
12/09/28 14:10:22 TRACE shell.Shell: return: 3
12/09/28 14:10:22 TRACE shell.Shell: 
<stdout>
Flume NG agent is not running..failed
</stdout>
12/09/28 14:10:25 TRACE shell.Shell: /bin/bash -s << __EOT__
service flume-ng-agent status
__EOT__
12/09/28 14:10:25 TRACE shell.Shell: 
<stdout>
Flume NG agent is running..done
{noformat}

The example above is a real case from SLES11, where the flume-ng-agent daemon status updates a little more slowly. After the first 3001 ms sleep, the daemon is not yet shown as "running", so we wait another 3001 ms and check again, at which point it is shown as "running".
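
The retry behaviour described above could be sketched roughly as follows. This is only an illustration in Python, not the actual Bigtop test code: check_with_retry, service_status, the default of 3 attempts, and the 3.001-second delay are hypothetical names and values chosen to mirror the description and the log timestamps.

```python
import subprocess
import time
from typing import Callable


def service_status(name: str) -> int:
    """Hypothetical helper: run `service <name> status` and return its exit code."""
    return subprocess.run(["service", name, "status"]).returncode


def check_with_retry(
    get_status: Callable[[], int],
    matches: Callable[[int], bool],
    attempts: int = 3,          # assumed limit, not taken from the real test
    delay_s: float = 3.001,     # assumed to correspond to "sleep 3001"
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as the status matches, False if no attempt matched."""
    for _ in range(attempts):
        # Wait before every check, including the first one,
        # so a slow daemon has time to update its status.
        sleep(delay_s)
        if matches(get_status()):
            return True
    # Every attempt failed: the caller records the failure as usual.
    return False
```

With a real daemon, a call might look like check_with_retry(lambda: service_status("flume-ng-agent"), lambda rc: rc == 0). A False result is what the test would then record as a failure.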

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (BIGTOP-721) improve the package daemon status check, check twice by some delay if status doesn't match expected value

Posted by "Johnny Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BIGTOP-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johnny Zhang updated BIGTOP-721:
--------------------------------

    Attachment: BIGTOP_721.txt
    


[jira] [Commented] (BIGTOP-721) improve the package daemon status check, check twice by some delay if status doesn't match expected value

Posted by "Johnny Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BIGTOP-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465931#comment-13465931 ] 

Johnny Zhang commented on BIGTOP-721:
-------------------------------------

Of course, if the daemon status still doesn't match the expected value by the 3rd try, the test records the failure as usual:
{noformat}
service hadoop-yarn-proxyserver status
__EOT__
12/09/28 14:17:33 TRACE shell.Shell: return: 1
12/09/28 14:17:33 TRACE shell.Shell: 
<stdout>
Hadoop proxyserver is dead and pid file exists ... failed!
</stdout>
12/09/28 14:17:36 TRACE shell.Shell: /bin/bash -s << __EOT__
service hadoop-yarn-proxyserver status
__EOT__
12/09/28 14:17:36 TRACE shell.Shell: return: 1
12/09/28 14:17:36 TRACE shell.Shell: 
<stdout>
Hadoop proxyserver is dead and pid file exists ... failed!
</stdout>
12/09/28 14:17:39 TRACE shell.Shell: /bin/bash -s << __EOT__
service hadoop-yarn-proxyserver status
__EOT__
12/09/28 14:17:39 TRACE shell.Shell: return: 1
12/09/28 14:17:39 TRACE shell.Shell: 
<stdout>
Hadoop proxyserver is dead and pid file exists ... failed!
{noformat}

I have already tested both the positive and negative cases.
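
For reading the logs above: the return values 3 and 1 are consistent with the exit codes that the LSB specification defines for an init script's status action. A minimal lookup sketch in Python follows; LSB_STATUS and describe_status are hypothetical names used only for illustration, and individual init scripts may not follow the convention exactly.

```python
# Exit codes defined by the LSB specification for `<init script> status`.
LSB_STATUS = {
    0: "running",
    1: "dead, but pid file exists",
    2: "dead, but lock file exists",
    3: "not running",
    4: "status unknown",
}


def describe_status(return_code: int) -> str:
    """Translate a status exit code into a short human-readable description."""
    return LSB_STATUS.get(return_code, f"non-standard code ({return_code})")
```

Under this convention, the flume-ng-agent check that returned 3 reads as "not running", and the hadoop-yarn-proxyserver checks that returned 1 read as "dead, but pid file exists", which agrees with the messages printed in stdout.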
                
