Posted to issues@cloudstack.apache.org by "Patrick (JIRA)" <ji...@apache.org> on 2017/03/25 02:44:42 UTC

[jira] [Commented] (CLOUDSTACK-9385) Password Server is not running on RvR

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-9385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941531#comment-15941531 ] 

Patrick commented on CLOUDSTACK-9385:
-------------------------------------

Just upgraded from 4.5.2 to 4.9.2 on Xen, and I am also affected by this issue.
A few additional clarifications:
It happened when the RvRs were recreated to apply the new SVM template. When the RvRs are rebooted to install the new SVM version, the pair always ends up with both routers in BACKUP state, whether I do a VR reboot, a network clean reboot, a stop/start, etc.
To fix it, I had to find which of the two VRs was logging the error "Password server failed with error code 1. Restarting it...", restart the password service on it, and then restart the VR, after which it gained MASTER state. From that point on, the role switch between the two VRs goes smoothly until either VR is recreated. This is pretty ugly, so I'm switching my RvRs to standalone to avoid the issue.
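
For anyone trying to spot the affected VR, here is a minimal Python sketch that counts the restart messages in a set of log lines. The message text is copied from the log excerpt in the report; the function names and the threshold of 3 are my own assumptions, not anything shipped with CloudStack.

```python
import re

# Message text taken from the /var/log/messages excerpt in the report.
FAILURE_RE = re.compile(
    r"cloud: Password server failed with error code (\d+)\. Restarting it\.\.\."
)


def count_password_server_failures(log_lines):
    """Return how many lines report a password server restart."""
    return sum(1 for line in log_lines if FAILURE_RE.search(line))


def looks_stuck(log_lines, threshold=3):
    """Guess whether the password server is in a restart loop.

    The threshold is an arbitrary assumption; tune it to taste.
    """
    return count_password_server_failures(log_lines) >= threshold
```

On a router this could be fed the contents of /var/log/messages, for example with `looks_stuck(open("/var/log/messages"))`.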

> Password Server is not running on RvR
> -------------------------------------
>
>                 Key: CLOUDSTACK-9385
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9385
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: ISO, SystemVM
>    Affects Versions: 4.6.0, 4.6.1, 4.6.2, 4.7.0, 4.7.1, 4.8.0
>            Reporter: dsclose
>
> NB: I have not tested this on VPC routers.
> The cloud-passwd-srvr service fails on redundant virtual routers. This appears to only concern redundant virtual routers. Standalone routers launch the password server successfully, as per this bash session:
> {code:title=Standalone Router}
> root@r-3775-VM:~# ps aux | grep passwd | grep -v grep
> root      2257  0.0  0.5   9244  1328 ?        S    14:27   0:00 /bin/bash /opt/cloud/bin/passwd_server_ip 10.1.1.1 dummy
> root      2259  0.0  3.2  37276  8128 ?        S    14:27   0:00 python /opt/cloud/bin/passwd_server_ip.py 10.1.1.1
> root@r-3775-VM:~# netstat -tnlp | grep 2259
> tcp        0      0 10.1.1.1:8080           0.0.0.0:*               LISTEN      2259/python
> {code}
> However, redundant virtual routers do not exhibit this behaviour. Instead, the password server wrapper is started with "None" in place of the IP argument, and no matching process is bound to any port:
> {code:title=Master Redundant Virtual Router}
> root@r-3776-VM:~# ps aux | grep passwd | grep -v grep
> root      5152  0.0  0.2  17684  1516 ?        S    14:38   0:00 /bin/bash /opt/cloud/bin/passwd_server_ip None dummy
> root@r-3776-VM:~# netstat -ntlp | grep 5152
> root@r-3776-VM:~#
> {code}
> Further, an error message is being repeated in /var/log/messages:
> {code:title=/var/log/messages}
> May 24 14:53:07 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:11 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:14 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:17 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:20 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:23 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:26 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> May 24 14:53:29 r-3776-VM cloud: Password server failed with error code 1. Restarting it...
> {code}
> No process is bound to the password server port. Consequently, attempts to request a password from the password server get rejected.
> Manually restarting the cloud-passwd-srvr resolves this issue immediately:
> {code:title=Master Redundant Virtual Router}
> root@r-3776-VM:~# service cloud-passwd-srvr restart
> Killed password server (pid=4874)
> iptables: Bad rule (does a matching rule exist in that chain?).
> Removed cloud-passwd-srvr iptables rules
> Stopped password server (pid=5152)
> iptables: Bad rule (does a matching rule exist in that chain?).
> Removed cloud-passwd-srvr iptables rules
> Added cloud-passwd-srvr iptables rules
> root@r-3776-VM:~# nohup: appending output to `nohup.out'
> root@r-3776-VM:~# ps aux | grep passwd | grep -v grep
> root     15776  0.0  0.3  19436  1576 pts/1    S    15:05   0:00 /bin/bash /opt/cloud/bin/passwd_server_ip 10.1.1.250
> root     15780  0.2  1.6  45484  8304 pts/1    S    15:05   0:00 python /opt/cloud/bin/passwd_server_ip.py 10.1.1.250
> root     15781  0.0  0.3  19436  1572 pts/1    S    15:05   0:00 /bin/bash /opt/cloud/bin/passwd_server_ip 10.1.1.1
> root     15782  0.2  1.6  49692  8396 pts/1    S    15:05   0:00 python /opt/cloud/bin/passwd_server_ip.py 10.1.1.1
> root@r-3776-VM:~# netstat -ntlp | grep 15780
> tcp        0      0 10.1.1.250:8080         0.0.0.0:*               LISTEN      15780/python
> {code}
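
The symptom in the quoted report (the wrapper script started with "None" instead of an IP) can be checked from `ps aux` output. Below is a rough Python sketch under that assumption; the helper names are hypothetical, and it only inspects the text of the process list as shown in the report.

```python
def passwd_server_args(ps_output):
    """Return the first argument of each passwd_server_ip wrapper process.

    Only the bash wrapper (/opt/cloud/bin/passwd_server_ip) is matched;
    the passwd_server_ip.py child processes are ignored.
    """
    args = []
    for line in ps_output.splitlines():
        fields = line.split()
        for i, field in enumerate(fields):
            if field.endswith("/passwd_server_ip") and i + 1 < len(fields):
                args.append(fields[i + 1])
    return args


def has_missing_ip(ps_output):
    """True if any wrapper was started with the literal argument 'None'."""
    return "None" in passwd_server_args(ps_output)
```

With the redundant router output from the report, `has_missing_ip` returns True; with the standalone router output it returns False.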



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)