Posted to dev@brooklyn.apache.org by "Aled Sage (JIRA)" <ji...@apache.org> on 2016/03/24 23:07:25 UTC

[jira] [Commented] (BROOKLYN-243) MySql stop+restart: timed out waiting for serviceUp (due to enrichers/feeds?)

    [ https://issues.apache.org/jira/browse/BROOKLYN-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15211037#comment-15211037 ] 

Aled Sage commented on BROOKLYN-243:
------------------------------------

I saw the same thing with RabbitMQ. I deployed a RabbitMQ to GCE, and then I ran the script below to repeatedly stop and restart the process (exiting as soon as one of the effectors failed). It failed after 46 iterations, with the same symptoms as described above.

I'll repeat this with trace logging enabled.

{noformat}
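#!/bin/bash
# Exit on the first failing command (-e) and echo each command as it runs (-x).
# Set here rather than in the shebang, where Linux passes "-e -x" as one argument.
set -ex

for i in {1..1000}; do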
  echo "Run $i"
  br app oigXiDpI entity QOQJinps stop --param "stopMachineMode=NEVER"
  br app oigXiDpI entity QOQJinps restart --param "restartMachine=false"
done
{noformat}

> MySql stop+restart: timed out waiting for serviceUp (due to enrichers/feeds?)
> -----------------------------------------------------------------------------
>
>                 Key: BROOKLYN-243
>                 URL: https://issues.apache.org/jira/browse/BROOKLYN-243
>             Project: Brooklyn
>          Issue Type: Bug
>            Reporter: Aled Sage
>
> Using Brooklyn 0.9.0-SNAPSHOT, I deployed MySqlNode to a BYON VM in AWS (on CentOS 6.5).
> My automated test script invoked stop() on the MySqlNode to just stop the process, and then invoked restart().
> The restart() successfully restarted the process, but then the post-restart task timed out waiting for SERVICE_UP.
> Looking at the sensor values, I think (*) it showed:
> {noformat}
>     mysql.queries.perSec.fromMysql: 0.29
>     service.process.isRunning:      true
>     service.state:                  STARTING
>     service.isUp:                   false
>     service.notUp.indicators:       {}
> {noformat}
> (*) Unfortunately, the automated test script changed the state of the entity before I had copy-pasted all the values, but I'm pretty sure it was in this state.
> This suggests that the feed was doing its job, having populated isRunning and queries.perSec; the log confirmed that it was being executed periodically.
> It also suggests that the enricher had updated notUp.indicators correctly, but that {{ServiceNotUpLogic.newEnricherForServiceUpIfNotUpIndicatorsEmpty()}} had somehow not set serviceUp.
> This is very surprising! The entity was previously up; the enricher has been there for a while. I therefore don't think it's a race with the first value being missed or anything like that.
> A (probably unrelated) worry I have about this code concerns stop(): we stop the feeds (without waiting for them to terminate), and then set isRunning to false. That is a race: a feed poll still in flight could write isRunning=true afterwards, leaving the entity reporting isRunning=true even though the process is stopped.
> This is not reproducible; I've only ever seen it once.
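
The stop() race mentioned in the description can be illustrated with a small standalone sketch. This is not Brooklyn's feed or entity code: the class StopRaceSketch, the method finalIsRunning and the AtomicBoolean standing in for the isRunning sensor are all hypothetical, and the latches only exist to force the unlucky interleaving deterministically. It assumes that a poll already in flight when the feed is stopped will still publish its (stale) result.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the race; none of these names come from Brooklyn.
public class StopRaceSketch {

    /** Simulates one stop() and returns the final value of the stand-in isRunning sensor. */
    public static boolean finalIsRunning(boolean waitForFeed) {
        AtomicBoolean isRunning = new AtomicBoolean(true);
        CountDownLatch pollStarted = new CountDownLatch(1);
        CountDownLatch releasePoll = new CountDownLatch(1);
        ExecutorService feed = Executors.newSingleThreadExecutor();

        // A poll already in flight: it saw the process running before stop() began
        // and publishes that stale value once it is allowed to complete.
        feed.submit(() -> {
            pollStarted.countDown();
            try {
                releasePoll.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            isRunning.set(true);
        });

        try {
            pollStarted.await();
            // stop(): shutdown() rejects new polls but does not cancel the one in flight.
            feed.shutdown();
            if (waitForFeed) {
                // Let the in-flight poll finish, wait for the feed to terminate,
                // and only then record that the process is stopped.
                releasePoll.countDown();
                feed.awaitTermination(5, TimeUnit.SECONDS);
                isRunning.set(false);
            } else {
                // Record "stopped" first; the late poll then overwrites it.
                isRunning.set(false);
                releasePoll.countDown();
                // Waiting here is only so the final value can be read reliably.
                feed.awaitTermination(5, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running the sketch", e);
        }
        return isRunning.get();
    }

    public static void main(String[] args) {
        System.out.println("without waiting: isRunning=" + finalIsRunning(false));
        System.out.println("with waiting:    isRunning=" + finalIsRunning(true));
    }
}
```

In this model, stopping without waiting ends with isRunning=true, while waiting for the feed to terminate before writing the sensor ends with isRunning=false. Whether the real feeds behave like this executor is an assumption that would need checking against the Brooklyn code.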



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)