Posted to dev@brooklyn.apache.org by Bhairavi Sankar <bs...@us.ibm.com> on 2015/05/01 17:42:19 UTC

Retry on Connection refused(sshd daemon restart)


Hi All,

An exception such as "Connection refused" (for example, while the sshd
daemon is restarting) during provisioning is treated as fatal in Brooklyn,
so it aborts and permanently sets the entity on-fire. It would be good to
have retry logic in place to handle such exceptions.


2015-04-29 10:50:02,415 DEBUG b.u.internal.ssh.sshj.SshjTool
[brooklyn-execmanager-TGxn6eaq-202675]: << (amp@184.173.25.246:22) error
acquiring {hostAndPort=184.173.25.246:22, user=amp, ssh=698850007,
password=null, privateKeyFile=/home/amp/.ssh/id_rsa, privateKey=xxxxxx,
connectTimeout=10000, sessionTimeout=30000} (attempt 1/1, in time 10ms/2m)
(rethrowing, out of retries): Connection refused
2015-04-29 10:50:02,416 DEBUG b.u.internal.ssh.sshj.SshjTool
[brooklyn-execmanager-TGxn6eaq-202675]: amp@184.173.25.246:22 failed to
connect (rethrowing)
brooklyn.util.internal.ssh.SshException: (amp@184.173.25.246:22)
(amp@184.173.25.246:22) error acquiring {hostAndPort=184.173.25.246:22,
user=amp, ssh=698850007, password=null,
privateKeyFile=/home/amp/.ssh/id_rsa, privateKey=xxxxxx,
connectTimeout=10000, sessionTimeout=30000} (attempt 1/1, in time 10ms/2m);
out of retries: Connection refused
        at brooklyn.util.internal.ssh.SshAbstractTool.propagate
(SshAbstractTool.java:169) ~
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.util.internal.ssh.sshj.SshjTool.acquire
(SshjTool.java:664) ~[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.util.internal.ssh.sshj.SshjTool.acquire
(SshjTool.java:617) ~[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.util.internal.ssh.sshj.SshjTool.connect
(SshjTool.java:206) ~[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.location.basic.SshMachineLocation.connectSsh
(SshMachineLocation.java:570)
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.location.basic.SshMachineLocation$8.get
(SshMachineLocation.java:329)
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.location.basic.SshMachineLocation$8.get
(SshMachineLocation.java:327)
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.util.pool.BasicPool.leaseObject(BasicPool.java:135)
[brooklyn-utils-common-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.util.pool.BasicPool.exec(BasicPool.java:144)
[brooklyn-utils-common-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.location.basic.SshMachineLocation.execSsh
(SshMachineLocation.java:512)
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.location.basic.SshMachineLocation$11.execWithTool
(SshMachineLocation.java:664)
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at
brooklyn.util.task.system.internal.ExecWithLoggingHelpers.execWithLogging
(ExecWithLoggingHelpers.java:165)
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at
brooklyn.util.task.system.internal.ExecWithLoggingHelpers.execScript
(ExecWithLoggingHelpers.java:81)
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.location.basic.SshMachineLocation.execScript
(SshMachineLocation.java:648)
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.entity.basic.AbstractSoftwareProcessSshDriver.execute
(AbstractSoftwareProcessSshDriver.java:324)
[brooklyn-software-base-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.entity.basic.lifecycle.ScriptHelper.executeInternal
(ScriptHelper.java:368)
[brooklyn-software-base-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.entity.basic.lifecycle.ScriptHelper$8.call
(ScriptHelper.java:289)
[brooklyn-software-base-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.entity.basic.lifecycle.ScriptHelper$8.call
(ScriptHelper.java:287)
[brooklyn-software-base-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.util.task.DynamicSequentialTask$DstJob.call
(DynamicSequentialTask.java:337)
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.util.task.BasicExecutionManager$SubmissionCallable.call
(BasicExecutionManager.java:469)
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at java.util.concurrent.FutureTask.run(FutureTask.java:274)
[na:1.7.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker
(ThreadPoolExecutor.java:1177) [na:1.7.0]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run
(ThreadPoolExecutor.java:642) [na:1.7.0]
        at java.lang.Thread.run(Thread.java:857) [na:1.7.0]

Caused by: java.net.ConnectException: Connection refused
        at java.net.Socket.connect(Socket.java:643) ~[na:1.7.0]
        at net.schmizz.sshj.SocketClient.connect(SocketClient.java:70) ~
[sshj-0.8.1.jar:na]
        at net.schmizz.sshj.SocketClient.connect(SocketClient.java:77) ~
[sshj-0.8.1.jar:na]
        at brooklyn.util.internal.ssh.sshj.SshjClientConnection.create
(SshjClientConnection.java:189) ~
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.util.internal.ssh.sshj.SshjClientConnection.create
(SshjClientConnection.java:42) ~
[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        at brooklyn.util.internal.ssh.sshj.SshjTool.acquire
(SshjTool.java:631) ~[brooklyn-core-0.7.0-N20150401.jar:0.7.0-N20150401]
        ... 22 common frames omitted
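
To illustrate the kind of retry logic meant here, below is a minimal
sketch in plain Java. It is not Brooklyn code: the class ConnectRetry and
its parameters (maxAttempts, delayMillis) are hypothetical names chosen for
this example. It assumes a refused connection can be recognised by a
java.net.ConnectException somewhere in the cause chain, as in the trace
above, and it retries only in that case.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of Brooklyn): retries an action while the connection is refused. */
final class ConnectRetry {
    private ConnectRetry() {}

    /** Returns true if t or any of its causes is a ConnectException. */
    static boolean isConnectionRefused(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof ConnectException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the action up to maxAttempts times, sleeping delayMillis between
     * attempts. Only failures caused by a ConnectException are retried;
     * any other exception is rethrown immediately.
     */
    static <T> T call(Callable<T> action, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (!isConnectionRefused(e)) {
                    throw e; // not a refused connection: do not retry
                }
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // give sshd time to come back
                }
            }
        }
        throw last; // out of retries: surface the last failure
    }
}
```

A caller would wrap the SSH connect step, for example
ConnectRetry.call(connectAction, 5, 2000L), where connectAction stands for
whatever opens the connection. A fixed delay keeps the sketch short; a
growing delay may suit a slow sshd restart better.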


Thanks!

Regards,
Bhairavi Sankar
_____________________________________________
Software Engineer, Cloud Modular Managed Services
Global Technology Services,
IBM Austin, TX
E-mail: bsankar@us.ibm.com
Ph: +1-512-286-5164 | Tie-Line: 363 516
Cell : +1-512-221-8245

Re: Retry on Connection refused(sshd daemon restart)

Posted by Svetoslav Neykov <sv...@cloudsoftcorp.com>.
Hi Bhairavi,

The exception you included happened while waiting for confirmation that the application is running, just after launching the process. I agree that an exception at this stage shouldn't permanently mark the entity as failed. Instead, it should be handled just like "is running" reporting false.
I've implemented the change at https://github.com/apache/incubator-brooklyn/pull/624.
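
As a rough illustration of the idea (not the code in the pull request), the sketch below treats any exception thrown by the running check as "not running yet" and keeps polling until a timeout. The names RunningCheck, safeIsRunning and waitForRunning are hypothetical and do not come from Brooklyn.

```java
import java.util.concurrent.Callable;

/** Hypothetical sketch (not Brooklyn code): a failing running check counts as "not running". */
final class RunningCheck {
    private RunningCheck() {}

    /** Runs the check; any exception is reported as false instead of propagating. */
    static boolean safeIsRunning(Callable<Boolean> isRunning) {
        try {
            return Boolean.TRUE.equals(isRunning.call());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to the caller
            return false;
        } catch (Exception e) {
            return false; // e.g. connection refused while sshd restarts
        }
    }

    /** Polls until the check succeeds or the timeout elapses; returns whether it succeeded. */
    static boolean waitForRunning(Callable<Boolean> isRunning, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (safeIsRunning(isRunning)) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without ever seeing "running"
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

With this shape, a transient failure only delays the result, and the entity would be marked as failed only if the check never succeeds within the timeout.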

Svet.

