Posted to dev@cloudstack.apache.org by Andrei Mikhailovsky <an...@arhont.com> on 2014/05/07 11:36:06 UTC

ACS 4.3 problem with starting virtual routers

Hello guys, 

Could someone help me solve a problem with virtual routers on ACS 4.3? I am using Ubuntu 12.04 for both the management and host servers. 

I've recently upgraded from ACS 4.2.1 following the release notes. As part of the upgrade I added the new system VM template, and after the upgrade I restarted all virtual routers. The process appeared to go well, as there were no errors. 

The next day I noticed that I am no longer able to start new virtual routers or restart networks. I can successfully start existing virtual routers that are in the Stopped state, but I can't start a new virtual router. For instance, the management server log shows the following when I try to restart an existing network: 



------------------- 

2014-05-07 00:11:32,069 DEBUG [c.c.a.ApiServlet] (catalina-exec-5:ctx-f0be1010) ===START=== 192.168.169.52 -- GET command=restartNetwork&id=131e86d0-8d0b-4e9a-964d-e102511b055a&cleanup=true&response=json&sessionkey=i9vBkmoEtC2L4tAjX%2BMQQ9NzZKw%3D&_=1399417892014 
2014-05-07 00:11:32,106 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (catalina-exec-5:ctx-f0be1010 ctx-d6f35608) submit async job-4913, details: AsyncJobVO {id:4913, userId: 3, accountId: 2, instanceType: None, instanceId: null, cmd: org.apache.cloudstack.api.command.user.network.RestartNetworkCmd, cmdInfo: {"id":"131e86d0-8d0b-4e9a-964d-e102511b055a","response":"json","cleanup":"true","sessionkey":"i9vBkmoEtC2L4tAjX+MQQ9NzZKw\u003d","cmdEventType":"NETWORK.RESTART","ctxUserId":"3","httpmethod":"GET","_":"1399417892014","ctxAccountId":"2","ctxStartEventId":"15168"}, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 238402986947280, completeMsid: null, lastUpdated: null, lastPolled: null, created: null} 
2014-05-07 00:11:32,107 INFO [o.a.c.f.j.i.AsyncJobMonitor] (Job-Executor-2:ctx-549fa81b) Add job-4913 into job monitoring 
2014-05-07 00:11:32,107 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Job-Executor-2:ctx-549fa81b) Executing AsyncJobVO {id:4913, userId: 3, accountId: 2, instanceType: None, instanceId: null, cmd: org.apache.cloudstack.api.command.user.network.RestartNetworkCmd, cmdInfo: {"id":"131e86d0-8d0b-4e9a-964d-e102511b055a","response":"json","cleanup":"true","sessionkey":"i9vBkmoEtC2L4tAjX+MQQ9NzZKw\u003d","cmdEventType":"NETWORK.RESTART","ctxUserId":"3","httpmethod":"GET","_":"1399417892014","ctxAccountId":"2","ctxStartEventId":"15168"}, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 238402986947280, completeMsid: null, lastUpdated: null, lastPolled: null, created: null} 
2014-05-07 00:11:32,108 DEBUG [c.c.a.ApiServlet] (catalina-exec-5:ctx-f0be1010 ctx-d6f35608) ===END=== 192.168.169.52 -- GET command=restartNetwork&id=131e86d0-8d0b-4e9a-964d-e102511b055a&cleanup=true&response=json&sessionkey=i9vBkmoEtC2L4tAjX%2BMQQ9NzZKw%3D&_=1399417892014 
2014-05-07 00:11:32,130 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Restarting network 264... 
2014-05-07 00:11:32,130 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Shutting down the network id=264 as a part of network restart 
2014-05-07 00:11:32,134 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Releasing 2 port forwarding rules for network id=264 as a part of shutdownNetworkRules 
2014-05-07 00:11:32,160 DEBUG [c.c.n.e.VirtualRouterElement] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Virtual router elemnt doesn't need to apply firewall rules on the backend; virtual router doesn't exist in the network 264 
2014-05-07 00:11:32,162 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Releasing 0 static nat rules for network id=264 as a part of shutdownNetworkRules 
2014-05-07 00:11:32,162 DEBUG [c.c.n.f.FirewallManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) There are no rules to forward to the network elements 
2014-05-07 00:11:32,164 DEBUG [c.c.n.l.LoadBalancingRulesManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Revoking 0 Public load balancing rules for network id=264 
2014-05-07 00:11:32,164 DEBUG [c.c.n.l.LoadBalancingRulesManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) There are no Load Balancing Rules to forward to the network elements 
2014-05-07 00:11:32,166 DEBUG [c.c.n.l.LoadBalancingRulesManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Revoking 0 Internal load balancing rules for network id=264 
2014-05-07 00:11:32,166 DEBUG [c.c.n.l.LoadBalancingRulesManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) There are no Load Balancing Rules to forward to the network elements 
2014-05-07 00:11:32,168 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Releasing 5 firewall ingress rules for network id=264 as a part of shutdownNetworkRules 
2014-05-07 00:11:32,186 DEBUG [c.c.n.e.VirtualRouterElement] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Virtual router elemnt doesn't need to apply firewall rules on the backend; virtual router doesn't exist in the network 264 
2014-05-07 00:11:32,188 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Releasing 1 firewall egress rules for network id=264 as a part of shutdownNetworkRules 
2014-05-07 00:11:32,192 DEBUG [c.c.n.f.FirewallManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) applying default firewall egress rules 
2014-05-07 00:11:32,208 DEBUG [c.c.n.e.VirtualRouterElement] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Virtual router elemnt doesn't need to apply firewall rules on the backend; virtual router doesn't exist in the network 264 
2014-05-07 00:11:32,222 DEBUG [c.c.n.e.VirtualRouterElement] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Virtual router elemnt doesn't need to apply firewall rules on the backend; virtual router doesn't exist in the network 264 
2014-05-07 00:11:32,224 DEBUG [c.c.n.r.RulesManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Found 0 static nat rules to apply for network id 264 
2014-05-07 00:11:32,251 DEBUG [c.c.n.e.VirtualRouterElement] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Virtual router elemnt doesn't need to associate ip addresses on the backend; virtual router doesn't exist in the network 264 
2014-05-07 00:11:32,253 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Sending network shutdown to VirtualRouter 
2014-05-07 00:11:32,253 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Implementing the network Ntwk[264|Guest|8] elements and resources as a part of network restart 
2014-05-07 00:11:32,257 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Asking VirtualRouter to implemenet Ntwk[264|Guest|8] 
2014-05-07 00:11:32,260 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Lock is acquired for network id 264 as a part of router startup in Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))] : Dest[Zone(1)-Pod(null)-Cluster(null)-Host(null)-Storage()] 
2014-05-07 00:11:32,277 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Adding nic for Virtual Router in Guest network Ntwk[264|Guest|8] 
2014-05-07 00:11:32,277 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Adding nic for Virtual Router in Control network 
2014-05-07 00:11:32,281 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Found existing network configuration for offering [Network Offering [3-Control-System-Control-Network]: Ntwk[202|Control|3] 
2014-05-07 00:11:32,281 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Releasing lock for Acct[06ee8d45-65f2-11e3-9bd1-d8d38559b2d0-system] 
2014-05-07 00:11:32,282 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Adding nic for Virtual Router in Public network 
2014-05-07 00:11:32,287 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Found existing network configuration for offering [Network Offering [1-Public-System-Public-Network]: Ntwk[200|Public|1] 
2014-05-07 00:11:32,287 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Releasing lock for Acct[06ee8d45-65f2-11e3-9bd1-d8d38559b2d0-system] 
2014-05-07 00:11:32,300 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Allocating the VR i=831 in datacenter com.cloud.dc.DataCenterVO$$EnhancerByCGLIB$$732fb519@1with the hypervisor type KVM 
2014-05-07 00:11:32,304 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) KVM won't support system vm, skip it 
2014-05-07 00:11:32,305 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Lock is released for network id 264 as a part of router startup in Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))] : Dest[Zone(1)-Pod(null)-Cluster(null)-Host(null)-Storage()] 
2014-05-07 00:11:32,305 WARN [o.a.c.e.o.NetworkOrchestrator] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Failed to implement network Ntwk[264|Guest|8] elements and resources as a part of network restart due to 
com.cloud.exception.ResourceUnavailableException: Resource [DataCenter:1] is unreachable: Can't find at least one running router! 
at com.cloud.network.element.VirtualRouterElement.implement(VirtualRouterElement.java:192) 
at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.implementNetworkElementsAndResources(NetworkOrchestrator.java:1070) 
at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.restartNetwork(NetworkOrchestrator.java:2387) 
at com.cloud.network.NetworkServiceImpl.restartNetwork(NetworkServiceImpl.java:1847) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:622) 
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317) 
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183) 
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150) 
at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:50) 
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161) 
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91) 
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172) 
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204) 
at com.sun.proxy.$Proxy199.restartNetwork(Unknown Source) 
at org.apache.cloudstack.api.command.user.network.RestartNetworkCmd.execute(RestartNetworkCmd.java:92) 
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161) 
at com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109) 
at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66) 
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) 
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) 
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) 
at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63) 
at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:509) 
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) 
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) 
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) 
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) 
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) 
at java.util.concurrent.FutureTask.run(FutureTask.java:166) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:701) 
2014-05-07 00:11:32,307 WARN [c.c.n.NetworkServiceImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) Network id=264 failed to restart. 
2014-05-07 00:11:32,311 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Job-Executor-2:ctx-549fa81b) Complete async job-4913, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed to restart network"} 
2014-05-07 00:11:32,317 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Job-Executor-2:ctx-549fa81b) Done executing org.apache.cloudstack.api.command.user.network.RestartNetworkCmd for job-4913 
2014-05-07 00:11:32,321 INFO [o.a.c.f.j.i.AsyncJobMonitor] (Job-Executor-2:ctx-549fa81b) Remove job-4913 from job monitoring 
2014-05-07 00:11:34,215 DEBUG [c.c.s.StatsCollector] (StatsCollector-1:ctx-d23e62b6) HostStatsCollector is running... 


---------- 

From the logs, the following line looks very odd to me: 

2014-05-07 00:11:32,304 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (Job-Executor-2:ctx-549fa81b ctx-d6f35608) KVM won't support system vm, skip it 

I'm not sure what this means or what to do with this information. I've downloaded the system VM template for the KVM hypervisor, so why is it not supported? 
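
In case it helps anyone compare against their own setup, the entries belonging to a single job can be pulled out of the management server log with something like the following. This is only a minimal sketch: the names `ENTRY`, `SAMPLE` and `entries_for_context` are illustrative rather than anything shipped with CloudStack, and it assumes the entry layout seen in the log above (timestamp, level, logger in square brackets, thread/context in parentheses) with one entry per line.

```python
import re

# Assumed entry layout, taken from the log excerpt in this post:
# "<date> <time>,<ms> <LEVEL> [<logger>] (<thread and ctx ids>) <message>"
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]*)\)\s*(?P<msg>.*)$"
)


def entries_for_context(lines, ctx):
    """Return (timestamp, level, logger, message) for entries whose thread field contains ctx.

    Lines that do not start a log entry (for example stack trace frames) are skipped.
    """
    found = []
    for line in lines:
        m = ENTRY.match(line.rstrip())
        if m and ctx in m.group("thread"):
            found.append((m.group("ts"), m.group("level"), m.group("logger"), m.group("msg")))
    return found


# Three lines copied from the log above; the last one belongs to a different context.
SAMPLE = [
    "2014-05-07 00:11:32,304 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] "
    "(Job-Executor-2:ctx-549fa81b ctx-d6f35608) KVM won't support system vm, skip it",
    "2014-05-07 00:11:32,307 WARN [c.c.n.NetworkServiceImpl] "
    "(Job-Executor-2:ctx-549fa81b ctx-d6f35608) Network id=264 failed to restart.",
    "2014-05-07 00:11:34,215 DEBUG [c.c.s.StatsCollector] "
    "(StatsCollector-1:ctx-d23e62b6) HostStatsCollector is running...",
]

if __name__ == "__main__":
    for ts, level, logger, msg in entries_for_context(SAMPLE, "ctx-549fa81b"):
        print(ts, level, logger, msg)
```

Filtering on the job's context id (here ctx-549fa81b) keeps unrelated entries such as the StatsCollector line out of the way, which makes the sequence leading up to the WARN easier to follow.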

Also, when I start the existing virtual routers, the firewall rules are not properly passed on to the virtual router. The router starts with what seems to be a basic set of iptables rules. Guest VMs can't connect outside despite the existing egress rule that allows everything out. The incoming rules are not assigned either. I've also noticed that if I remove an existing rule from the ACS GUI and add it back, this triggers the majority of the firewall rules to be populated on the virtual router, apart from the outgoing rules, which are never populated. 

Anyway, in trying to solve the problem I've also followed the guide (http://cloud.kelceydamage.com/cloudfire/blog/2013/10/08/conquering-the-cloudstack-4-2-dragon-kvm/) and completely recreated the system VM templates. Following the steps in Method 2, I managed to install the new system VM template using the latest 4.3 template, and I successfully recreated the console proxy and SSVM VMs. Both VMs show their VM and Agent states as Up. I've tried destroying both VMs, and they are recreated automatically without any issues. Also, the SSVM check script (/usr/local/cloud/systemvm/ssvm-check.sh) does not show any errors. All looks good, and the secondary storage is mountable and writable. 

However, I am still unable to create new virtual routers. I still get the same error and am not sure what to do. The existing virtual routers also exhibit the same problem of not getting the rules after a restart. 

This leaves me in a difficult position, as most of my infrastructure is not working properly. I was hoping someone could help me fix the issue. 

Many thanks 

Andrei