
Posted to users@cloudstack.apache.org by Maurice Lawler <ma...@me.com> on 2013/04/13 19:23:29 UTC

Emergency: Cloud NOT starting

Greetings,

I'm in a terrible spot: nothing I have done will start my cloud. None of my system VMs will start, which in turn prevents the regular OS VMs from starting. I first suffered a power outage, then manually rebooted my server. Now nothing is coming back online.

I was previously told that having cloud0 first is the cause of this. Even after running ifconfig cloud0 down, nothing comes back online.

I have gone as far as stopping iptables / ebtables, along with stopping and starting the network and the management console.


When I check the system VMs, they remain in a 'starting' status.

[root@lunder ~]# service iptables status
iptables: Firewall is not running.
[root@lunder ~]# service ebtables status
# Generated by ebtables-save v1.0 on Sat Apr 13 12:21:04 CDT 2013
*nat
:PREROUTING ACCEPT
:OUTPUT ACCEPT
:POSTROUTING ACCEPT

[root@lunder ~]#


[root@lunder daoenix]# ifconfig
cloud0    Link encap:Ethernet  HWaddr FE:00:A9:FE:00:67
          inet addr:169.254.0.1  Bcast:169.254.255.255  Mask:255.255.0.0
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:658 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:28068 (27.4 KiB)

cloudbr0  Link encap:Ethernet  HWaddr C8:0A:A9:9E:2D:7C
          inet addr:myipaddress  Bcast:9myipaddress Mask:255.255.255.224
          inet6 addr: fe80::fc2c:bcff:fe00:5/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:192832 errors:0 dropped:0 overruns:0 frame:0
          TX packets:11251 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:11481135 (10.9 MiB)  TX bytes:25153331 (23.9 MiB)

eth0      Link encap:Ethernet  HWaddr C8:0A:A9:9E:2D:7C
          inet6 addr: fe80::ca0a:a9ff:fe9e:2d7c/64 Scope:Link
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:199794 errors:0 dropped:0 overruns:0 frame:0
          TX packets:24157 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:14647159 (13.9 MiB)  TX bytes:25994485 (24.7 MiB)
          Memory:df6e0000-df700000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:7850808 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7850808 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1611132695 (1.5 GiB)  TX bytes:1611132695 (1.5 GiB)

virbr0    Link encap:Ethernet  HWaddr 52:54:00:D9:D9:9A
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

vnet0     Link encap:Ethernet  HWaddr FE:00:A9:FE:00:67
          inet6 addr: fe80::fc00:a9ff:fefe:67/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:116 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:0 (0.0 b)  TX bytes:5232 (5.1 KiB)

vnet1     Link encap:Ethernet  HWaddr FE:84:4C:00:00:01
          inet6 addr: fe80::fc84:4cff:fe00:1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:256 errors:0 dropped:0 overruns:1 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:0 (0.0 b)  TX bytes:17849 (17.4 KiB)

vnet2     Link encap:Ethernet  HWaddr FE:2C:BC:00:00:05
          inet6 addr: fe80::fc2c:bcff:fe00:5/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:256 errors:0 dropped:0 overruns:1 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:0 (0.0 b)  TX bytes:17849 (17.4 KiB)

[root@lunder daoenix]#

Re: Emergency: Cloud NOT starting

Posted by Maurice Lawler <ma...@me.com>.

Now a new error shows:

	at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:432)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:679)
2013-04-13 12:29:33,067 WARN  [cloud.api.ApiDispatcher] (Job-Executor-1:job-132) class com.cloud.api.ServerApiException : null
2013-04-13 12:29:33,068 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-1:job-132) Complete async job-132, jobStatus: 2, resultCode: 530, result: Error Code: 534 Error text: null
2013-04-13 12:29:34,382 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Detected management node left, id:1, nodeIP:myipaddress
2013-04-13 12:29:34,382 INFO  [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Trying to connect to myipaddress
2013-04-13 12:29:34,382 INFO  [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Management node 1 is detected inactive by timestamp but is pingable
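
A management log like the excerpt above is easier to triage once the stack frames and DEBUG chatter are filtered out. The following is only a sketch, assuming the log4j line layout seen in this thread (timestamp, level, bracketed logger, parenthesised thread, message); the function name `filter_by_level` is made up for illustration.

```python
import re

# Prefix of a log line as it appears in the excerpt above, e.g.
# "2013-04-13 12:29:33,067 WARN  [cloud.api.ApiDispatcher] (thread) message"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]*)\)\s*"
    r"(?P<message>.*)$"
)


def filter_by_level(lines, levels=("WARN", "ERROR")):
    """Return (timestamp, level, logger, message) for lines at the given levels.

    Lines that do not start with a timestamp (stack frames, wrapped text)
    are skipped.
    """
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("level") in levels:
            hits.append(
                (m.group("ts"), m.group("level"), m.group("logger"), m.group("message"))
            )
    return hits


# Three lines copied from the excerpt above.
sample = [
    "\tat java.lang.Thread.run(Thread.java:679)",
    "2013-04-13 12:29:33,067 WARN  [cloud.api.ApiDispatcher] (Job-Executor-1:job-132) class com.cloud.api.ServerApiException : null",
    "2013-04-13 12:29:33,068 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-1:job-132) Complete async job-132, jobStatus: 2, resultCode: 530, result: Error Code: 534 Error text: null",
]

for hit in filter_by_level(sample):
    print(hit)
```

Passing `levels=("ERROR",)` narrows the output further; the same function should work on the agent log if it uses the same layout.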



Re: Emergency: Cloud NOT starting

Posted by Jeronimo Garcia <ga...@gmail.com>.

Do you have secondary storage defined? Does it work? Are the system
VM templates in it?
Go to Infrastructure and check the status of the system VMs.

Thanks


On Sat, Apr 13, 2013 at 12:57 PM, Marcus Sorensen <sh...@gmail.com> wrote:

> A "brctl show" would also be good to have.
> On Apr 13, 2013 11:52 AM, "Marcus Sorensen" <sh...@gmail.com> wrote:
>
> > If you do a "virsh list" on the agent there's a good chance you would see
> > a VM running, however the system will only wait so long for it to boot up
> > before shutting it down, so it will come and go. You can do "virsh
> > vncdisplay (vmname)" and it will tell you what port to vnc to on the host
> > in order to connect to the VM and see what state it is in.
> >
> > I see in the agent log that at one point it failed to start due to no
> > private bridge. Is cloudbr0 your private as defined in agent.properties?
> >
> > You can also open /etc/cloud/agent/log4j-cloud.xml and change every INFO
> > to DEBUG, restart the agent, and get more info.
> > > On Apr 13, 2013 11:45 AM, "Maurice Lawler" <ma...@me.com> wrote:
> >
> >> Thank you.
> >>
> >> The FSCK was already completed during boot up, it was forced. However,
> >> how can I access the VM's when they are in starting state to see if they
> >> need a FSCK?
> >>
> >> Agent log is showing this presently.
> >>
> >>
> >> 2013-04-13 12:35:09,989 INFO  [cloud.agent.Agent]
> >> (AgentShutdownThread:null) Stopping the agent: Reason = sig.kill
> >> 2013-04-13 12:37:32,244 INFO  [utils.component.ComponentLocator]
> >> (main:null) Unable to find components.xml
> >> 2013-04-13 12:37:32,285 INFO  [utils.component.ComponentLocator]
> >> (main:null) Skipping configuration using components.xml
> >> 2013-04-13 12:37:32,285 INFO  [cloud.agent.AgentShell] (main:null)
> >> Implementation Version is 4.0.1.20130201075054
> >> 2013-04-13 12:37:32,286 INFO  [cloud.agent.AgentShell] (main:null)
> >> agent.properties found at /etc/cloud/agent/agent.properties
> >> 2013-04-13 12:37:32,287 INFO  [cloud.agent.AgentShell] (main:null)
> >> Defaulting to using properties file for storage
> >> 2013-04-13 12:37:32,289 INFO  [cloud.agent.AgentShell] (main:null)
> >> Defaulting to the constant time backoff algorithm
> >> 2013-04-13 12:37:32,413 INFO  [cloud.agent.Agent] (main:null) id is 1
> >> 2013-04-13 12:37:32,418 ERROR [cloud.resource.ServerResourceBase]
> >> (main:null) Nics are not configured!
> >> 2013-04-13 12:37:32,420 ERROR [cloud.agent.AgentShell] (main:null) Unable
> >> to start agent: Private NIC is not configured
> >> 2013-04-13 12:42:30,653 INFO  [utils.component.ComponentLocator]
> >> (main:null) Unable to find components.xml
> >> 2013-04-13 12:42:30,654 INFO  [utils.component.ComponentLocator]
> >> (main:null) Skipping configuration using components.xml
> >> 2013-04-13 12:42:30,654 INFO  [cloud.agent.AgentShell] (main:null)
> >> Implementation Version is 4.0.1.20130201075054
> >> 2013-04-13 12:42:30,655 INFO  [cloud.agent.AgentShell] (main:null)
> >> agent.properties found at /etc/cloud/agent/agent.properties
> >> 2013-04-13 12:42:30,656 INFO  [cloud.agent.AgentShell] (main:null)
> >> Defaulting to using properties file for storage
> >> 2013-04-13 12:42:30,658 INFO  [cloud.agent.AgentShell] (main:null)
> >> Defaulting to the constant time backoff algorithm
> >> 2013-04-13 12:42:30,721 INFO  [cloud.agent.Agent] (main:null) id is 1
> >> 2013-04-13 12:42:30,820 INFO
> >>  [resource.virtualnetwork.VirtualRoutingResource] (main:null)
> >> VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
> >> 2013-04-13 12:42:32,094 INFO  [kvm.resource.LibvirtComputingResource]
> >> (main:null) No libvirt.vif.driver specififed. Defaults to BridgeVifDriver.
> >> 2013-04-13 12:42:32,147 INFO  [cloud.agent.Agent] (main:null) Agent [id =
> >> 1 : type = LibvirtComputingResource : zone = 1 : pod = 1 : workers = 5 :
> >> host = 96.31.67.232 : port = 8250
> >> 2013-04-13 12:42:32,154 INFO  [utils.nio.NioClient] (Agent-Selector:null)
> >> Connecting to myipaddress:8250
> >> 2013-04-13 12:42:32,444 INFO  [utils.nio.NioClient] (Agent-Selector:null)
> >> SSL: Handshake done
> >> 2013-04-13 12:42:32,599 INFO  [cloud.serializer.GsonHelper]
> >> (Agent-Handler-1:null) Default Builder inited.
> >> 2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> >> Proccess agent startup answer, agent id = 1
> >> 2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> >> Set agent id 1
> >> 2013-04-13 12:42:32,808 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> >> Startup Response Received: agent id = 1
> >>
> >>
> >> The management log says this:
> >>
> >> 2013-04-13 12:43:28,952 DEBUG [cloud.network.NetworkManagerImpl]
> >> (secstorage-1:null) Lock is released for network id 201 as a part of
> >> network implement
> >> 2013-04-13 12:43:28,969 DEBUG [db.Transaction.Transaction]
> >> (secstorage-1:null) Rolling back the transaction: Time = 1 Name =
> >>  -SystemVmLoadScanner$1.run:71-Executors$RunnableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runAndReset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165-ScheduledThreadPoolExecutor$ScheduledFutureTask.run:267-ThreadPoolExecutor.runWorker:1146-ThreadPoolExecutor$Worker.run:615-Thread.run:679;
> >> called by
> >> -Transaction.rollback:887-DataCenterIpAddressDaoImpl.takeIpAddress:57-DatabaseCallback.intercept:34-DataCenterDaoImpl.allocatePrivateIpAddress:228-DatabaseCallback.intercept:34-PodBasedNetworkGuru.reserve:119-NetworkManagerImpl.prepareNic:2143-NetworkManagerImpl.prepare:2113-VirtualMachineManagerImpl.advanceStart:752-VirtualMachineManagerImpl.start:472-VirtualMachineManagerImpl.start:465-SecondaryStorageManagerImpl.startSecStorageVm:257
> >> 2013-04-13 12:43:28,970 INFO  [cloud.vm.VirtualMachineManagerImpl]
> >> (secstorage-1:null) Insufficient capacity
> >> com.cloud.exception.InsufficientAddressCapacityException: Unable to get a
> >> management ip addressScope=interface com.cloud.dc.Pod; id=1
> >>         at com.cloud.network.guru.PodBasedNetworkGuru.reserve(PodBasedNetworkGuru.java:121)
> >>         at com.cloud.network.NetworkManagerImpl.prepareNic(NetworkManagerImpl.java:2143)
> >>         at com.cloud.network.NetworkManagerImpl.prepare(NetworkManagerImpl.java:2113)
> >>         at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:752)
> >>         at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:472)
> >>         at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:465)
> >>         at com.cloud.storage.secondary.SecondaryStorageManagerImpl.startSecStorageVm(SecondaryStorageManagerImpl.java:257)
> >>         at com.cloud.storage.secondary.SecondaryStorageManagerImpl.allocCapacity(SecondaryStorageManagerImpl.java:684)
> >>         at com.cloud.storage.secondary.SecondaryStorageManagerImpl.expandPool(SecondaryStorageManagerImpl.java:1310)
> >>         at com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:119)
> >>         at com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:50)
> >>         at com.cloud.vm.SystemVmLoadScanner.loadScan(SystemVmLoadScanner.java:106)
> >>         at com.cloud.vm.SystemVmLoadScanner.access$100(SystemVmLoadScanner.java:34)
> >>         at com.cloud.vm.SystemVmLoadScanner$1.reallyRun(SystemVmLoadScanner.java:83)
> >>         at com.cloud.vm.SystemVmLoadScanner$1.run(SystemVmLoadScanner.java:73)
> >>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> >>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> >>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
> >>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
> >>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> >>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>         at java.lang.Thread.run(Thread.java:679)
> >> 2013-04-13 12:43:28,973 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> >> (secstorage-1:null) Cleaning up resources for the vm
> >> VM[SecondaryStorageVm|s-588-VM] in Starting state
> >> 2013-04-13 12:43:28,975 DEBUG [agent.transport.Request]
> >> (secstorage-1:null) Seq 1-751304715: Waiting for Seq 751304714 Scheduling:
> >>  { Cmd , MgmtId: 219948120943996, via: 1, Ver: v1, Flags: 100111,
> >> [{"StopCommand":{"isProxy":false,"vmName":"s-588-VM","wait":0}}] }
> >> 2013-04-13 12:43:29,186 DEBUG
> >> [network.router.VirtualNetworkApplianceManagerImpl]
> >> (RouterStatusMonitor-1:null) Found 0 routers.
> >> 2013-04-13 12:43:37,927 DEBUG [agent.manager.AgentManagerImpl]
> >> (AgentManager-Handler-14:null) Ping from 1
> >> 2013-04-13 12:43:43,240 DEBUG [cloud.server.StatsCollector]
> >> (StatsCollector-3:null) VmStatsCollector is running...
> >> 2013-04-13 12:43:43,323 DEBUG [cloud.server.StatsCollector]
> >> (StatsCollector-3:null) StorageCollector is running...
> >> 2013-04-13 12:43:43,327 DEBUG [cloud.server.StatsCollector]
> >> (StatsCollector-3:null) There is no secondary storage VM for secondary
> >> storage host nfs://96.31.67.232/secondary
> >> 2013-04-13 12:43:43,400 DEBUG
> >> [agent.transport.Request] (StatsCollector-3:null) Seq 1-751304716:
> >> Received:  { Ans: , MgmtId: 219948120943996, via: 1, Ver: v1, Flags: 10, {
> >> GetStorageStatsAnswer } }
> >> 2013-04-13 12:43:43,936 DEBUG [cloud.server.StatsCollector]
> >> (StatsCollector-3:null) HostStatsCollector is running...
> >> 2013-04-13 12:43:44,545 DEBUG [agent.transport.Request]
> >> (StatsCollector-3:null) Seq 1-751304717: Received:  { Ans: , MgmtId:
> >> 219948120943996, via: 1, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
> >> 2013-04-13 12:43:58,231 DEBUG [cloud.server.ManagementServerImpl]
> >> (EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT
> >> 2013
> >> 2013-04-13 12:43:58,233 DEBUG [cloud.server.ManagementServerImpl]
> >> (EventChecker-1:null) Found 0 events to be purged
> >> 2013-04-13 12:43:58,235 DEBUG [cloud.server.ManagementServerImpl]
> >> (EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT
> >> 2013
> >> 2013-04-13 12:43:58,238 DEBUG [cloud.server.ManagementServerImpl]
> >> (EventChecker-1:null) Found 0 events to be purged
> >> 2013-04-13 12:43:59,186 DEBUG
> >> [network.router.VirtualNetworkApplianceManagerImpl]
> >> (RouterStatusMonitor-1:null) Found 0 routers.
> >> [root@lunder agent]#
> >>
> >>
> >>
> >>
> >> On Apr 13, 2013, at 12:30 PM, Marcus Sorensen <sh...@gmail.com> wrote:
> >>
> >> > Well you've got something trying to start, because you have vnet
> >> > interfaces. You need to look at your agent logs to see why the system
> >> > VMs refuse to start. If the power went out it could be corruption; the
> >> > system VMs may be waiting for you to fsck. It sounds like maybe the
> >> > system was put into production without testing to make sure the host
> >> > settings were persistent and would survive a reboot?
> >> >
> >> > So 1) look at your agent logs. And 2) use VNC to look at whatever system
> >> > VMs are running and see what state they are in. They will probably
> >> > continually try to start and then shut down.

Re: Emergency: Cloud NOT starting

Posted by Marcus Sorensen <sh...@gmail.com>.

A "brctl show" would also be good to have.
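
When reading "brctl show" output, the useful part is which interfaces are attached to which bridge. The sketch below parses that table into a dictionary. It assumes the usual bridge-utils layout (a header row, then one row per bridge, with extra interfaces on indented continuation rows). The sample text is made up for illustration and is not output from the poster's host.

```python
def parse_brctl_show(output):
    """Map each bridge name to the list of interfaces attached to it."""
    bridges = {}
    current = None
    for line in output.splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        fields = line.split()
        if not line[0].isspace():
            # Bridge row: name, bridge id, STP flag, optional first interface
            current = fields[0]
            bridges[current] = fields[3:]
        elif current is not None:
            # Indented continuation row: only interface names
            bridges[current].extend(fields)
    return bridges


# Illustrative sample only; bridge ids are placeholders.
sample = (
    "bridge name\tbridge id\t\tSTP enabled\tinterfaces\n"
    "cloud0\t\t8000.000000000001\tno\t\tvnet0\n"
    "cloudbr0\t\t8000.000000000002\tno\t\teth0\n"
    "\t\t\t\t\t\t\tvnet1\n"
    "virbr0\t\t8000.000000000003\tyes\n"
)

for bridge, interfaces in parse_brctl_show(sample).items():
    print(bridge, interfaces)
```

A bridge that is expected to carry management traffic but lists no physical NIC would be worth a closer look.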
On Apr 13, 2013 11:52 AM, "Marcus Sorensen" <sh...@gmail.com> wrote:

> If you do a "virsh list" on the agent there's a good chance you would see
> a VM running, however the system will only wait so long for it to boot up
> before shutting it down, so it will come and go. You can do "virsh
> vncdisplay (vmname)" and it will tell you what port to vnc to on the host
> in order to connect to the VM and see what state it is in.
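
To make the "virsh vncdisplay" step concrete: the command prints a VNC display such as ":1" or "127.0.0.1:1", and by the usual VNC convention display N listens on TCP port 5900 + N. A small sketch follows; the helper names are invented for illustration, and `vnc_port_for_vm` assumes `virsh` is on the PATH of the KVM host.

```python
import subprocess

VNC_BASE_PORT = 5900  # VNC display :N conventionally maps to TCP port 5900 + N


def vnc_port_from_display(display):
    """Convert vncdisplay output such as ':1' or '127.0.0.1:1' to a TCP port."""
    _, _, number = display.strip().rpartition(":")
    return VNC_BASE_PORT + int(number)


def vnc_port_for_vm(vm_name):
    """Ask libvirt for a running domain's VNC display and return its TCP port."""
    result = subprocess.run(
        ["virsh", "vncdisplay", vm_name],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if the domain is not running
    )
    return vnc_port_from_display(result.stdout)


print(vnc_port_from_display(":1"))
```

On the host, `vnc_port_for_vm("s-588-VM")` would query the secondary storage VM named in the logs in this thread, provided it happens to be running at that moment.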
>
> I see in the agent log that at one point it failed to start due to no
> private bridge. Is cloudbr0 your private as defined in agent.properties?
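
One way to check the question above is to compare the bridge names in agent.properties with the interfaces that actually exist on the host. This is a sketch only: the key names `private.network.device` and `public.network.device` are assumed to be the ones the KVM agent reads, and the sample file content is hypothetical.

```python
def read_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_devices(props, existing_interfaces,
                    keys=("private.network.device", "public.network.device")):
    """Return {key: device} for configured devices that are absent on the host."""
    return {
        key: props[key]
        for key in keys
        if key in props and props[key] not in existing_interfaces
    }


# Hypothetical agent.properties content (key names are an assumption).
sample_properties = """
# network devices
private.network.device=cloudbr0
public.network.device=cloudbr1
"""

# Interface names taken from the ifconfig output earlier in the thread.
interfaces = {"cloud0", "cloudbr0", "eth0", "lo", "virbr0", "vnet0", "vnet1", "vnet2"}

print(missing_devices(read_properties(sample_properties), interfaces))
```

On a Linux host the interface set could be built from `os.listdir("/sys/class/net")` instead of being typed in by hand.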
>
> You can also open /etc/cloud/agent/log4j-cloud.xml and change every INFO
> to DEBUG, restart the agent, and get more info.
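
The INFO-to-DEBUG change can be scripted rather than done by hand. The sketch below assumes the levels appear as `value="INFO"` attributes, as is typical for log4j 1.x XML configuration; the sample XML is a made-up fragment, not the contents of the real log4j-cloud.xml.

```python
import re


def raise_to_debug(xml_text):
    """Replace every value="INFO" attribute with value="DEBUG".

    Returns a (new_text, replacement_count) tuple.
    """
    return re.subn(r'value="INFO"', 'value="DEBUG"', xml_text)


# Made-up fragment for illustration only.
sample_xml = """<category name="com.cloud">
  <priority value="INFO"/>
</category>
<root>
  <level value="INFO"/>
  <appender-ref ref="FILE"/>
</root>"""

new_xml, count = raise_to_debug(sample_xml)
print(count)
```

Keep a backup copy of the original file before writing the result back, and restart the agent afterwards so the new levels take effect.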
> On Apr 13, 2013 11:45 AM, "Maurice Lawler" <ma...@me.com> wrote:
>
>> Thank you.
>>
>> The FSCK was already completed during boot up, it was forced. However,
>> how can I access the VM's when they are in starting state to see if they
>> need a FSCK?
>>
>> Agent log is showing this presently.
>>
>>
>> 2013-04-13 12:35:09,989 INFO  [cloud.agent.Agent]
>> (AgentShutdownThread:null) Stopping the agent: Reason = sig.kill
>> 2013-04-13 12:37:32,244 INFO  [utils.component.ComponentLocator]
>> (main:null) Unable to find components.xml
>> 2013-04-13 12:37:32,285 INFO  [utils.component.ComponentLocator]
>> (main:null) Skipping configuration using components.xml
>> 2013-04-13 12:37:32,285 INFO  [cloud.agent.AgentShell] (main:null)
>> Implementation Version is 4.0.1.20130201075054
>> 2013-04-13 12:37:32,286 INFO  [cloud.agent.AgentShell] (main:null)
>> agent.properties found at /etc/cloud/agent/agent.properties
>> 2013-04-13 12:37:32,287 INFO  [cloud.agent.AgentShell] (main:null)
>> Defaulting to using properties file for storage
>> 2013-04-13 12:37:32,289 INFO  [cloud.agent.AgentShell] (main:null)
>> Defaulting to the constant time backoff algorithm
>> 2013-04-13 12:37:32,413 INFO  [cloud.agent.Agent] (main:null) id is 1
>> 2013-04-13 12:37:32,418 ERROR [cloud.resource.ServerResourceBase]
>> (main:null) Nics are not configured!
>> 2013-04-13 12:37:32,420 ERROR [cloud.agent.AgentShell] (main:null) Unable
>> to start agent: Private NIC is not configured
>> 2013-04-13 12:42:30,653 INFO  [utils.component.ComponentLocator]
>> (main:null) Unable to find components.xml
>> 2013-04-13 12:42:30,654 INFO  [utils.component.ComponentLocator]
>> (main:null) Skipping configuration using components.xml
>> 2013-04-13 12:42:30,654 INFO  [cloud.agent.AgentShell] (main:null)
>> Implementation Version is 4.0.1.20130201075054
>> 2013-04-13 12:42:30,655 INFO  [cloud.agent.AgentShell] (main:null)
>> agent.properties found at /etc/cloud/agent/agent.properties
>> 2013-04-13 12:42:30,656 INFO  [cloud.agent.AgentShell] (main:null)
>> Defaulting to using properties file for storage
>> 2013-04-13 12:42:30,658 INFO  [cloud.agent.AgentShell] (main:null)
>> Defaulting to the constant time backoff algorithm
>> 2013-04-13 12:42:30,721 INFO  [cloud.agent.Agent] (main:null) id is 1
>> 2013-04-13 12:42:30,820 INFO
>>  [resource.virtualnetwork.VirtualRoutingResource] (main:null)
>> VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
>> 2013-04-13 12:42:32,094 INFO  [kvm.resource.LibvirtComputingResource]
>> (main:null) No libvirt.vif.driver specififed. Defaults to BridgeVifDriver.
>> 2013-04-13 12:42:32,147 INFO  [cloud.agent.Agent] (main:null) Agent [id =
>> 1 : type = LibvirtComputingResource : zone = 1 : pod = 1 : workers = 5 :
>> host = 96.31.67.232 : port = 8250
>> 2013-04-13 12:42:32,154 INFO  [utils.nio.NioClient] (Agent-Selector:null)
>> Connecting to myipaddress:8250
>> 2013-04-13 12:42:32,444 INFO  [utils.nio.NioClient] (Agent-Selector:null)
>> SSL: Handshake done
>> 2013-04-13 12:42:32,599 INFO  [cloud.serializer.GsonHelper]
>> (Agent-Handler-1:null) Default Builder inited.
>> 2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
>> Proccess agent startup answer, agent id = 1
>> 2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
>> Set agent id 1
>> 2013-04-13 12:42:32,808 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
>> Startup Response Received: agent id = 1
>>
>>
>> The management log says this:
>>
>> 2013-04-13 12:43:28,952 DEBUG [cloud.network.NetworkManagerImpl]
>> (secstorage-1:null) Lock is released for network id 201 as a part of
>> network implement
>> 2013-04-13 12:43:28,969 DEBUG [db.Transaction.Transaction]
>> (secstorage-1:null) Rolling back the transaction: Time = 1 Name =
>>  -SystemVmLoadScanner$1.run:71-Executors$RunnableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runAndReset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165-ScheduledThreadPoolExecutor$ScheduledFutureTask.run:267-ThreadPoolExecutor.runWorker:1146-ThreadPoolExecutor$Worker.run:615-Thread.run:679;
>> called by
>> -Transaction.rollback:887-DataCenterIpAddressDaoImpl.takeIpAddress:57-DatabaseCallback.intercept:34-DataCenterDaoImpl.allocatePrivateIpAddress:228-DatabaseCallback.intercept:34-PodBasedNetworkGuru.reserve:119-NetworkManagerImpl.prepareNic:2143-NetworkManagerImpl.prepare:2113-VirtualMachineManagerImpl.advanceStart:752-VirtualMachineManagerImpl.start:472-VirtualMachineManagerImpl.start:465-SecondaryStorageManagerImpl.startSecStorageVm:257
>> 2013-04-13 12:43:28,970 INFO  [cloud.vm.VirtualMachineManagerImpl]
>> (secstorage-1:null) Insufficient capacity
>> com.cloud.exception.InsufficientAddressCapacityException: Unable to get a
>> management ip addressScope=interface com.cloud.dc.Pod; id=1
>>         at
>> com.cloud.network.guru.PodBasedNetworkGuru.reserve(PodBasedNetworkGuru.java:121)
>>         at
>> com.cloud.network.NetworkManagerImpl.prepareNic(NetworkManagerImpl.java:2143)
>>         at
>> com.cloud.network.NetworkManagerImpl.prepare(NetworkManagerImpl.java:2113)
>>         at
>> com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:752)
>>         at
>> com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:472)
>>         at
>> com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:465)
>>         at
>> com.cloud.storage.secondary.SecondaryStorageManagerImpl.startSecStorageVm(SecondaryStorageManagerImpl.java:257)
>>         at
>> com.cloud.storage.secondary.SecondaryStorageManagerImpl.allocCapacity(SecondaryStorageManagerImpl.java:684)
>>         at
>> com.cloud.storage.secondary.SecondaryStorageManagerImpl.expandPool(SecondaryStorageManagerImpl.java:1310)
>>         at
>> com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:119)
>>         at
>> com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:50)
>>         at
>> com.cloud.vm.SystemVmLoadScanner.loadScan(SystemVmLoadScanner.java:106)
>>         at
>> com.cloud.vm.SystemVmLoadScanner.access$100(SystemVmLoadScanner.java:34)
>>         at
>> com.cloud.vm.SystemVmLoadScanner$1.reallyRun(SystemVmLoadScanner.java:83)
>>         at
>> com.cloud.vm.SystemVmLoadScanner$1.run(SystemVmLoadScanner.java:73)
>>         at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>         at
>> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>>         at
>> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>>         at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
>>         at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
>>         at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:679)
>> 2013-04-13 12:43:28,973 DEBUG [cloud.vm.VirtualMachineManagerImpl]
>> (secstorage-1:null) Cleaning up resources for the vm
>> VM[SecondaryStorageVm|s-588-VM] in Starting state
>> 2013-04-13 12:43:28,975 DEBUG [agent.transport.Request]
>> (secstorage-1:null) Seq 1-751304715: Waiting for Seq 751304714 Scheduling:
>>  { Cmd , MgmtId: 219948120943996, via: 1, Ver: v1, Flags: 100111,
>> [{"StopCommand":{"isProxy":false,"vmName":"s-588-VM","wait":0}}] }
>> 2013-04-13 12:43:29,186 DEBUG
>> [network.router.VirtualNetworkApplianceManagerImpl]
>> (RouterStatusMonitor-1:null) Found 0 routers.
>> 2013-04-13 12:43:37,927 DEBUG [agent.manager.AgentManagerImpl]
>> (AgentManager-Handler-14:null) Ping from 1
>> 2013-04-13 12:43:43,240 DEBUG [cloud.server.StatsCollector]
>> (StatsCollector-3:null) VmStatsCollector is running...
>> 2013-04-13 12:43:43,323 DEBUG [cloud.server.StatsCollector]
>> (StatsCollector-3:null) StorageCollector is running...
>> 2013-04-13 12:43:43,327 DEBUG [cloud.server.StatsCollector]
>> (StatsCollector-3:null) There is no secondary storage VM for secondary
>> storage host nfs://96.31.67.232/secondary
>> 2013-04-13 12:43:43,400 DEBUG
>> [agent.transport.Request] (StatsCollector-3:null) Seq 1-751304716:
>> Received:  { Ans: , MgmtId: 219948120943996, via: 1, Ver: v1, Flags: 10, {
>> GetStorageStatsAnswer } }
>> 2013-04-13 12:43:43,936 DEBUG [cloud.server.StatsCollector]
>> (StatsCollector-3:null) HostStatsCollector is running...
>> 2013-04-13 12:43:44,545 DEBUG [agent.transport.Request]
>> (StatsCollector-3:null) Seq 1-751304717: Received:  { Ans: , MgmtId:
>> 219948120943996, via: 1, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
>> 2013-04-13 12:43:58,231 DEBUG [cloud.server.ManagementServerImpl]
>> (EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT
>> 2013
>> 2013-04-13 12:43:58,233 DEBUG [cloud.server.ManagementServerImpl]
>> (EventChecker-1:null) Found 0 events to be purged
>> 2013-04-13 12:43:58,235 DEBUG [cloud.server.ManagementServerImpl]
>> (EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT
>> 2013
>> 2013-04-13 12:43:58,238 DEBUG [cloud.server.ManagementServerImpl]
>> (EventChecker-1:null) Found 0 events to be purged
>> 2013-04-13 12:43:59,186 DEBUG
>> [network.router.VirtualNetworkApplianceManagerImpl]
>> (RouterStatusMonitor-1:null) Found 0 routers.
>> [root@lunder agent]#
>>
>>
>>
>>
>> On Apr 13, 2013, at 12:30 PM, Marcus Sorensen <sh...@gmail.com>
>> wrote:
>>
>> > Well you've got something trying to start, because you have vnet
>> > interfaces. You need to look at your agent logs to see why the system
>> > VMS refuse to start. If the power went out it could be corruption, the
>> > system VMS may be waiting for you to fsck. It sounds like maybe the
>> > system was put into production without testing to make sure the host
>> > settings were persistent and would survive a reboot?
>> >
>> > So 1) look at your agent logs. And 2) use vnc to look at whatever system
>> > VMS are running and see what state they are in. They will probably
>> > continually try to start and then shut down.
>> > On Apr 13, 2013 11:24 AM, "Maurice Lawler" <ma...@me.com> wrote:
>> >
>> >> Greetings,
>> >>
>> >> I'm have a terrible way to go, nothing I have done will start my cloud.
>> >> None of my system VM's will start, which in turn do not permit the
>> >> regular OS VM's to start. I suffered from first a power outage, then I
>> >> manually rebooted my server. Now, nothing is coming back online.
>> >>
>> >> I was previously told, having cloud0 first is the cause of this. Even
>> >> when doing ifconfig cloud0 down, nothing seems to come back online.
>> >>
>> >> I have gone as far as stopping iptables / eatables along with
>> >> stopping/starting the network and the management console.
>> >>
>> >>
>> >> Checking the system VM's the continue to remain in a 'starting' status.
>> >>
>> >> [root@lunder ~]# service iptables status
>> >> iptables: Firewall is not running.
>> >> [root@lunder ~]# service ebtables status
>> >> # Generated by ebtables-save v1.0 on Sat Apr 13 12:21:04 CDT 2013
>> >> *nat
>> >> :PREROUTING ACCEPT
>> >> :OUTPUT ACCEPT
>> >> :POSTROUTING ACCEPT
>> >>
>> >> [root@lunder ~]#
>> >>
>> >>
>> >> [root@lunder daoenix]# ifconfig
>> >> cloud0    Link encap:Ethernet  HWaddr FE:00:A9:FE:00:67
>> >>          inet addr:169.254.0.1  Bcast:169.254.255.255  Mask:255.255.0.0
>> >>          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
>> >>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>> >>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>> >>          TX packets:658 errors:0 dropped:0 overruns:0 carrier:0
>> >>          collisions:0 txqueuelen:0
>> >>          RX bytes:0 (0.0 b)  TX bytes:28068 (27.4 KiB)
>> >>
>> >> cloudbr0  Link encap:Ethernet  HWaddr C8:0A:A9:9E:2D:7C
>> >>          inet addr:myipaddress  Bcast:9myipaddress Mask:255.255.255.224
>> >>          inet6 addr: fe80::fc2c:bcff:fe00:5/64 Scope:Link
>> >>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>> >>          RX packets:192832 errors:0 dropped:0 overruns:0 frame:0
>> >>          TX packets:11251 errors:0 dropped:0 overruns:0 carrier:0
>> >>          collisions:0 txqueuelen:0
>> >>          RX bytes:11481135 (10.9 MiB)  TX bytes:25153331 (23.9 MiB)
>> >>
>> >> eth0      Link encap:Ethernet  HWaddr C8:0A:A9:9E:2D:7C
>> >>          inet6 addr: fe80::ca0a:a9ff:fe9e:2d7c/64 Scope:Link
>> >>          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
>> >>          RX packets:199794 errors:0 dropped:0 overruns:0 frame:0
>> >>          TX packets:24157 errors:0 dropped:0 overruns:0 carrier:0
>> >>          collisions:0 txqueuelen:1000
>> >>          RX bytes:14647159 (13.9 MiB)  TX bytes:25994485 (24.7 MiB)
>> >>          Memory:df6e0000-df700000
>> >>
>> >> lo        Link encap:Local Loopback
>> >>          inet addr:127.0.0.1  Mask:255.0.0.0
>> >>          inet6 addr: ::1/128 Scope:Host
>> >>          UP LOOPBACK RUNNING  MTU:16436  Metric:1
>> >>          RX packets:7850808 errors:0 dropped:0 overruns:0 frame:0
>> >>          TX packets:7850808 errors:0 dropped:0 overruns:0 carrier:0
>> >>          collisions:0 txqueuelen:0
>> >>          RX bytes:1611132695 (1.5 GiB)  TX bytes:1611132695 (1.5 GiB)
>> >>
>> >> virbr0    Link encap:Ethernet  HWaddr 52:54:00:D9:D9:9A
>> >>          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
>> >>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>> >>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>> >>          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>> >>          collisions:0 txqueuelen:0
>> >>          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>> >>
>> >> vnet0     Link encap:Ethernet  HWaddr FE:00:A9:FE:00:67
>> >>          inet6 addr: fe80::fc00:a9ff:fefe:67/64 Scope:Link
>> >>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>> >>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>> >>          TX packets:116 errors:0 dropped:0 overruns:0 carrier:0
>> >>          collisions:0 txqueuelen:500
>> >>          RX bytes:0 (0.0 b)  TX bytes:5232 (5.1 KiB)
>> >>
>> >> vnet1     Link encap:Ethernet  HWaddr FE:84:4C:00:00:01
>> >>          inet6 addr: fe80::fc84:4cff:fe00:1/64 Scope:Link
>> >>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>> >>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>> >>          TX packets:256 errors:0 dropped:0 overruns:1 carrier:0
>> >>          collisions:0 txqueuelen:500
>> >>          RX bytes:0 (0.0 b)  TX bytes:17849 (17.4 KiB)
>> >>
>> >> vnet2     Link encap:Ethernet  HWaddr FE:2C:BC:00:00:05
>> >>          inet6 addr: fe80::fc2c:bcff:fe00:5/64 Scope:Link
>> >>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>> >>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>> >>          TX packets:256 errors:0 dropped:0 overruns:1 carrier:0
>> >>          collisions:0 txqueuelen:500
>> >>          RX bytes:0 (0.0 b)  TX bytes:17849 (17.4 KiB)
>> >>
>> >> [root@lunder daoenix]#
>> >>
>> >>
>> >>
>>
>>

Re: Emergency: Cloud NOT starting

Posted by Marcus Sorensen <sh...@gmail.com>.

If you do a "virsh list" on the agent there's a good chance you would see a
VM running, however the system will only wait so long for it to boot up
before shutting it down, so it will come and go. You can do "virsh
vncdisplay (vmname)" and it will tell you what port to vnc to on the host
in order to connect to the VM and see what state it is in.
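
As a rough sketch of the virsh check described above: "virsh vncdisplay"
normally prints a VNC display such as ":0" rather than a raw port, and the
TCP port is 5900 plus that display number. The helper names vnc_port and
show_vnc_ports are made up for illustration, and "virsh list --name" is
assumed to be supported by the installed libvirt.

```shell
# Convert a `virsh vncdisplay` answer such as ":1" or "127.0.0.1:1"
# into the TCP port a VNC client should connect to (5900 + display).
vnc_port() {
    local display="${1##*:}"    # keep only the text after the last colon
    echo $((5900 + ${display}))
}

# Print the VNC port of every running domain.
# Assumption: run as root on the KVM host, with virsh in PATH.
show_vnc_ports() {
    local vm
    for vm in $(virsh list --name); do
        echo "$vm -> port $(vnc_port "$(virsh vncdisplay "$vm")")"
    done
}
```

Calling show_vnc_ports while a system VM is up prints one line per domain.
Because the VM may be shut down again shortly after it starts, the call may
need repeating until a domain shows up.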

I see in the agent log that at one point the agent failed to start because
it had no private bridge. Is cloudbr0 your private bridge, as defined in
agent.properties?
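
One way to check this, sketched under the assumption that the agent reads
the private bridge name from a "private.network.device" key in
agent.properties. The key name and the helper names prop_value and
check_private_bridge are assumptions, so verify them against your own file.
A Linux bridge has a "bridge" directory under /sys/class/net/<name>.

```shell
# Read one key from a Java-style properties file (last definition wins).
prop_value() {
    grep -E "^[[:space:]]*$2[[:space:]]*=" "$1" | tail -n 1 \
        | cut -d= -f2- | tr -d '[:space:]'
}

# Report whether the configured private device exists as a bridge.
# Assumption: the key is private.network.device (check your own file).
check_private_bridge() {
    local props="${1:-/etc/cloud/agent/agent.properties}"
    local dev
    dev="$(prop_value "$props" 'private.network.device')"
    if [ -z "$dev" ]; then
        echo "private.network.device is not set in $props"
        return 1
    fi
    if [ -d "/sys/class/net/$dev/bridge" ]; then
        echo "$dev exists and is a bridge"
    else
        echo "$dev is missing or is not a bridge"
        return 1
    fi
}
```

Running check_private_bridge with no argument uses the agent.properties path
shown in the agent log above.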

You can also open /etc/cloud/agent/log4j-cloud.xml and change every INFO to
DEBUG, restart the agent, and get more info.
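
A minimal sketch of that log-level change, keeping a backup of the original
file. The function name enable_debug_logging is made up for illustration.
It replaces every occurrence of the string INFO, exactly as suggested above,
so review the result before restarting.

```shell
# Switch every INFO in the agent's log4j config to DEBUG,
# keeping a copy of the original next to it as <file>.bak.
enable_debug_logging() {
    local conf="${1:-/etc/cloud/agent/log4j-cloud.xml}"
    cp -p "$conf" "$conf.bak" || return 1
    sed 's/INFO/DEBUG/g' "$conf.bak" > "$conf"
}
```

After running enable_debug_logging, restart the agent. The service name is
assumed to be cloud-agent on this 4.0.x install ("service cloud-agent
restart"), so adjust it if yours differs.
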
On Apr 13, 2013 11:45 AM, "Maurice Lawler" <ma...@me.com> wrote:

> Thank you.
>
> The FSCK was already completed during boot up, it was forced. However, how
> can I access the VM's when they are in starting state to see if they need a
> FSCK?
>
> Agent log is showing this presently.
>
>
> 2013-04-13 12:35:09,989 INFO  [cloud.agent.Agent]
> (AgentShutdownThread:null) Stopping the agent: Reason = sig.kill
> 2013-04-13 12:37:32,244 INFO  [utils.component.ComponentLocator]
> (main:null) Unable to find components.xml
> 2013-04-13 12:37:32,285 INFO  [utils.component.ComponentLocator]
> (main:null) Skipping configuration using components.xml
> 2013-04-13 12:37:32,285 INFO  [cloud.agent.AgentShell] (main:null)
> Implementation Version is 4.0.1.20130201075054
> 2013-04-13 12:37:32,286 INFO  [cloud.agent.AgentShell] (main:null)
> agent.properties found at /etc/cloud/agent/agent.properties
> 2013-04-13 12:37:32,287 INFO  [cloud.agent.AgentShell] (main:null)
> Defaulting to using properties file for storage
> 2013-04-13 12:37:32,289 INFO  [cloud.agent.AgentShell] (main:null)
> Defaulting to the constant time backoff algorithm
> 2013-04-13 12:37:32,413 INFO  [cloud.agent.Agent] (main:null) id is 1
> 2013-04-13 12:37:32,418 ERROR [cloud.resource.ServerResourceBase]
> (main:null) Nics are not configured!
> 2013-04-13 12:37:32,420 ERROR [cloud.agent.AgentShell] (main:null) Unable
> to start agent: Private NIC is not configured
> 2013-04-13 12:42:30,653 INFO  [utils.component.ComponentLocator]
> (main:null) Unable to find components.xml
> 2013-04-13 12:42:30,654 INFO  [utils.component.ComponentLocator]
> (main:null) Skipping configuration using components.xml
> 2013-04-13 12:42:30,654 INFO  [cloud.agent.AgentShell] (main:null)
> Implementation Version is 4.0.1.20130201075054
> 2013-04-13 12:42:30,655 INFO  [cloud.agent.AgentShell] (main:null)
> agent.properties found at /etc/cloud/agent/agent.properties
> 2013-04-13 12:42:30,656 INFO  [cloud.agent.AgentShell] (main:null)
> Defaulting to using properties file for storage
> 2013-04-13 12:42:30,658 INFO  [cloud.agent.AgentShell] (main:null)
> Defaulting to the constant time backoff algorithm
> 2013-04-13 12:42:30,721 INFO  [cloud.agent.Agent] (main:null) id is 1
> 2013-04-13 12:42:30,820 INFO
>  [resource.virtualnetwork.VirtualRoutingResource] (main:null)
> VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
> 2013-04-13 12:42:32,094 INFO  [kvm.resource.LibvirtComputingResource]
> (main:null) No libvirt.vif.driver specififed. Defaults to BridgeVifDriver.
> 2013-04-13 12:42:32,147 INFO  [cloud.agent.Agent] (main:null) Agent [id =
> 1 : type = LibvirtComputingResource : zone = 1 : pod = 1 : workers = 5 :
> host = 96.31.67.232 : port = 8250
> 2013-04-13 12:42:32,154 INFO  [utils.nio.NioClient] (Agent-Selector:null)
> Connecting to myipaddress:8250
> 2013-04-13 12:42:32,444 INFO  [utils.nio.NioClient] (Agent-Selector:null)
> SSL: Handshake done
> 2013-04-13 12:42:32,599 INFO  [cloud.serializer.GsonHelper]
> (Agent-Handler-1:null) Default Builder inited.
> 2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> Proccess agent startup answer, agent id = 1
> 2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> Set agent id 1
> 2013-04-13 12:42:32,808 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> Startup Response Received: agent id = 1
>
>
> The management log says this:
>
> 2013-04-13 12:43:28,952 DEBUG [cloud.network.NetworkManagerImpl]
> (secstorage-1:null) Lock is released for network id 201 as a part of
> network implement
> 2013-04-13 12:43:28,969 DEBUG [db.Transaction.Transaction]
> (secstorage-1:null) Rolling back the transaction: Time = 1 Name =
>  -SystemVmLoadScanner$1.run:71-Executors$RunnableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runAndReset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165-ScheduledThreadPoolExecutor$ScheduledFutureTask.run:267-ThreadPoolExecutor.runWorker:1146-ThreadPoolExecutor$Worker.run:615-Thread.run:679;
> called by
> -Transaction.rollback:887-DataCenterIpAddressDaoImpl.takeIpAddress:57-DatabaseCallback.intercept:34-DataCenterDaoImpl.allocatePrivateIpAddress:228-DatabaseCallback.intercept:34-PodBasedNetworkGuru.reserve:119-NetworkManagerImpl.prepareNic:2143-NetworkManagerImpl.prepare:2113-VirtualMachineManagerImpl.advanceStart:752-VirtualMachineManagerImpl.start:472-VirtualMachineManagerImpl.start:465-SecondaryStorageManagerImpl.startSecStorageVm:257
> 2013-04-13 12:43:28,970 INFO  [cloud.vm.VirtualMachineManagerImpl]
> (secstorage-1:null) Insufficient capacity
> com.cloud.exception.InsufficientAddressCapacityException: Unable to get a
> management ip addressScope=interface com.cloud.dc.Pod; id=1
>         at
> com.cloud.network.guru.PodBasedNetworkGuru.reserve(PodBasedNetworkGuru.java:121)
>         at
> com.cloud.network.NetworkManagerImpl.prepareNic(NetworkManagerImpl.java:2143)
>         at
> com.cloud.network.NetworkManagerImpl.prepare(NetworkManagerImpl.java:2113)
>         at
> com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:752)
>         at
> com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:472)
>         at
> com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:465)
>         at
> com.cloud.storage.secondary.SecondaryStorageManagerImpl.startSecStorageVm(SecondaryStorageManagerImpl.java:257)
>         at
> com.cloud.storage.secondary.SecondaryStorageManagerImpl.allocCapacity(SecondaryStorageManagerImpl.java:684)
>         at
> com.cloud.storage.secondary.SecondaryStorageManagerImpl.expandPool(SecondaryStorageManagerImpl.java:1310)
>         at
> com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:119)
>         at
> com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:50)
>         at
> com.cloud.vm.SystemVmLoadScanner.loadScan(SystemVmLoadScanner.java:106)
>         at
> com.cloud.vm.SystemVmLoadScanner.access$100(SystemVmLoadScanner.java:34)
>         at
> com.cloud.vm.SystemVmLoadScanner$1.reallyRun(SystemVmLoadScanner.java:83)
>         at
> com.cloud.vm.SystemVmLoadScanner$1.run(SystemVmLoadScanner.java:73)
>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at
> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:679)
> 2013-04-13 12:43:28,973 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (secstorage-1:null) Cleaning up resources for the vm
> VM[SecondaryStorageVm|s-588-VM] in Starting state
> 2013-04-13 12:43:28,975 DEBUG [agent.transport.Request]
> (secstorage-1:null) Seq 1-751304715: Waiting for Seq 751304714 Scheduling:
>  { Cmd , MgmtId: 219948120943996, via: 1, Ver: v1, Flags: 100111,
> [{"StopCommand":{"isProxy":false,"vmName":"s-588-VM","wait":0}}] }
> 2013-04-13 12:43:29,186 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RouterStatusMonitor-1:null) Found 0 routers.
> 2013-04-13 12:43:37,927 DEBUG [agent.manager.AgentManagerImpl]
> (AgentManager-Handler-14:null) Ping from 1
> 2013-04-13 12:43:43,240 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-3:null) VmStatsCollector is running...
> 2013-04-13 12:43:43,323 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-3:null) StorageCollector is running...
> 2013-04-13 12:43:43,327 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-3:null) There is no secondary storage VM for secondary
> storage host nfs://96.31.67.232/secondary
> 2013-04-13 12:43:43,400 DEBUG [agent.transport.Request]
> (StatsCollector-3:null) Seq 1-751304716: Received:  { Ans: , MgmtId:
> 219948120943996, via: 1, Ver: v1, Flags: 10, { GetStorageStatsAnswer } }
> 2013-04-13 12:43:43,936 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-3:null) HostStatsCollector is running...
> 2013-04-13 12:43:44,545 DEBUG [agent.transport.Request]
> (StatsCollector-3:null) Seq 1-751304717: Received:  { Ans: , MgmtId:
> 219948120943996, via: 1, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
> 2013-04-13 12:43:58,231 DEBUG [cloud.server.ManagementServerImpl]
> (EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT
> 2013
> 2013-04-13 12:43:58,233 DEBUG [cloud.server.ManagementServerImpl]
> (EventChecker-1:null) Found 0 events to be purged
> 2013-04-13 12:43:58,235 DEBUG [cloud.server.ManagementServerImpl]
> (EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT
> 2013
> 2013-04-13 12:43:58,238 DEBUG [cloud.server.ManagementServerImpl]
> (EventChecker-1:null) Found 0 events to be purged
> 2013-04-13 12:43:59,186 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RouterStatusMonitor-1:null) Found 0 routers.
> [root@lunder agent]#
>
>
>
>
> On Apr 13, 2013, at 12:30 PM, Marcus Sorensen <sh...@gmail.com> wrote:
>
> > Well you've got something trying to start, because you have vnet
> > interfaces. You need to look at your agent logs to see why the system VMS
> > refuse to start. If the power went out it could be corruption, the system
> > VMS may be waiting for you to fsck. It sounds like maybe the system was
> > put into production without testing to make sure the host settings were
> > persistent and would survive a reboot?
> >
> > So 1) look at your agent logs. And 2) use vnc to look at whatever system
> > VMS are running and see what state they are in. They will probably
> > continually try to start and then shut down.
> > On Apr 13, 2013 11:24 AM, "Maurice Lawler" <ma...@me.com> wrote:
> >
> >> Greetings,
> >>
> >> I'm have a terrible way to go, nothing I have done will start my cloud.
> >> None of my system VM's will start, which in turn do not permit the
> >> regular OS VM's to start. I suffered from first a power outage, then I
> >> manually rebooted my server. Now, nothing is coming back online.
> >>
> >> I was previously told, having cloud0 first is the cause of this. Even
> >> when doing ifconfig cloud0 down, nothing seems to come back online.
> >>
> >> I have gone as far as stopping iptables / eatables along with
> >> stopping/starting the network and the management console.
> >>
> >>
> >> Checking the system VM's the continue to remain in a 'starting' status.
> >>
> >> [root@lunder ~]# service iptables status
> >> iptables: Firewall is not running.
> >> [root@lunder ~]# service ebtables status
> >> # Generated by ebtables-save v1.0 on Sat Apr 13 12:21:04 CDT 2013
> >> *nat
> >> :PREROUTING ACCEPT
> >> :OUTPUT ACCEPT
> >> :POSTROUTING ACCEPT
> >>
> >> [root@lunder ~]#
> >>
> >>
> >> [root@lunder daoenix]# ifconfig
> >> cloud0    Link encap:Ethernet  HWaddr FE:00:A9:FE:00:67
> >>          inet addr:169.254.0.1  Bcast:169.254.255.255  Mask:255.255.0.0
> >>          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
> >>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >>          TX packets:658 errors:0 dropped:0 overruns:0 carrier:0
> >>          collisions:0 txqueuelen:0
> >>          RX bytes:0 (0.0 b)  TX bytes:28068 (27.4 KiB)
> >>
> >> cloudbr0  Link encap:Ethernet  HWaddr C8:0A:A9:9E:2D:7C
> >>          inet addr:myipaddress  Bcast:9myipaddress Mask:255.255.255.224
> >>          inet6 addr: fe80::fc2c:bcff:fe00:5/64 Scope:Link
> >>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >>          RX packets:192832 errors:0 dropped:0 overruns:0 frame:0
> >>          TX packets:11251 errors:0 dropped:0 overruns:0 carrier:0
> >>          collisions:0 txqueuelen:0
> >>          RX bytes:11481135 (10.9 MiB)  TX bytes:25153331 (23.9 MiB)
> >>
> >> eth0      Link encap:Ethernet  HWaddr C8:0A:A9:9E:2D:7C
> >>          inet6 addr: fe80::ca0a:a9ff:fe9e:2d7c/64 Scope:Link
> >>          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
> >>          RX packets:199794 errors:0 dropped:0 overruns:0 frame:0
> >>          TX packets:24157 errors:0 dropped:0 overruns:0 carrier:0
> >>          collisions:0 txqueuelen:1000
> >>          RX bytes:14647159 (13.9 MiB)  TX bytes:25994485 (24.7 MiB)
> >>          Memory:df6e0000-df700000
> >>
> >> lo        Link encap:Local Loopback
> >>          inet addr:127.0.0.1  Mask:255.0.0.0
> >>          inet6 addr: ::1/128 Scope:Host
> >>          UP LOOPBACK RUNNING  MTU:16436  Metric:1
> >>          RX packets:7850808 errors:0 dropped:0 overruns:0 frame:0
> >>          TX packets:7850808 errors:0 dropped:0 overruns:0 carrier:0
> >>          collisions:0 txqueuelen:0
> >>          RX bytes:1611132695 (1.5 GiB)  TX bytes:1611132695 (1.5 GiB)
> >>
> >> virbr0    Link encap:Ethernet  HWaddr 52:54:00:D9:D9:9A
> >>          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
> >>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >>          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >>          collisions:0 txqueuelen:0
> >>          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
> >>
> >> vnet0     Link encap:Ethernet  HWaddr FE:00:A9:FE:00:67
> >>          inet6 addr: fe80::fc00:a9ff:fefe:67/64 Scope:Link
> >>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >>          TX packets:116 errors:0 dropped:0 overruns:0 carrier:0
> >>          collisions:0 txqueuelen:500
> >>          RX bytes:0 (0.0 b)  TX bytes:5232 (5.1 KiB)
> >>
> >> vnet1     Link encap:Ethernet  HWaddr FE:84:4C:00:00:01
> >>          inet6 addr: fe80::fc84:4cff:fe00:1/64 Scope:Link
> >>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >>          TX packets:256 errors:0 dropped:0 overruns:1 carrier:0
> >>          collisions:0 txqueuelen:500
> >>          RX bytes:0 (0.0 b)  TX bytes:17849 (17.4 KiB)
> >>
> >> vnet2     Link encap:Ethernet  HWaddr FE:2C:BC:00:00:05
> >>          inet6 addr: fe80::fc2c:bcff:fe00:5/64 Scope:Link
> >>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >>          TX packets:256 errors:0 dropped:0 overruns:1 carrier:0
> >>          collisions:0 txqueuelen:500
> >>          RX bytes:0 (0.0 b)  TX bytes:17849 (17.4 KiB)
> >>
> >> [root@lunder daoenix]#
> >>
> >>
> >>
>
>

Re: Emergency: Cloud NOT starting

Posted by Maurice Lawler <ma...@me.com>.

Thank you.

The fsck was already forced and completed during boot-up. However, how can I access the VMs while they are in the Starting state to see whether they need an fsck?

Agent log is showing this presently.


2013-04-13 12:35:09,989 INFO  [cloud.agent.Agent] (AgentShutdownThread:null) Stopping the agent: Reason = sig.kill
2013-04-13 12:37:32,244 INFO  [utils.component.ComponentLocator] (main:null) Unable to find components.xml
2013-04-13 12:37:32,285 INFO  [utils.component.ComponentLocator] (main:null) Skipping configuration using components.xml
2013-04-13 12:37:32,285 INFO  [cloud.agent.AgentShell] (main:null) Implementation Version is 4.0.1.20130201075054
2013-04-13 12:37:32,286 INFO  [cloud.agent.AgentShell] (main:null) agent.properties found at /etc/cloud/agent/agent.properties
2013-04-13 12:37:32,287 INFO  [cloud.agent.AgentShell] (main:null) Defaulting to using properties file for storage
2013-04-13 12:37:32,289 INFO  [cloud.agent.AgentShell] (main:null) Defaulting to the constant time backoff algorithm
2013-04-13 12:37:32,413 INFO  [cloud.agent.Agent] (main:null) id is 1
2013-04-13 12:37:32,418 ERROR [cloud.resource.ServerResourceBase] (main:null) Nics are not configured!
2013-04-13 12:37:32,420 ERROR [cloud.agent.AgentShell] (main:null) Unable to start agent: Private NIC is not configured
2013-04-13 12:42:30,653 INFO  [utils.component.ComponentLocator] (main:null) Unable to find components.xml
2013-04-13 12:42:30,654 INFO  [utils.component.ComponentLocator] (main:null) Skipping configuration using components.xml
2013-04-13 12:42:30,654 INFO  [cloud.agent.AgentShell] (main:null) Implementation Version is 4.0.1.20130201075054
2013-04-13 12:42:30,655 INFO  [cloud.agent.AgentShell] (main:null) agent.properties found at /etc/cloud/agent/agent.properties
2013-04-13 12:42:30,656 INFO  [cloud.agent.AgentShell] (main:null) Defaulting to using properties file for storage
2013-04-13 12:42:30,658 INFO  [cloud.agent.AgentShell] (main:null) Defaulting to the constant time backoff algorithm
2013-04-13 12:42:30,721 INFO  [cloud.agent.Agent] (main:null) id is 1
2013-04-13 12:42:30,820 INFO  [resource.virtualnetwork.VirtualRoutingResource] (main:null) VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
2013-04-13 12:42:32,094 INFO  [kvm.resource.LibvirtComputingResource] (main:null) No libvirt.vif.driver specififed. Defaults to BridgeVifDriver.
2013-04-13 12:42:32,147 INFO  [cloud.agent.Agent] (main:null) Agent [id = 1 : type = LibvirtComputingResource : zone = 1 : pod = 1 : workers = 5 : host = 96.31.67.232 : port = 8250
2013-04-13 12:42:32,154 INFO  [utils.nio.NioClient] (Agent-Selector:null) Connecting to myipaddress:8250
2013-04-13 12:42:32,444 INFO  [utils.nio.NioClient] (Agent-Selector:null) SSL: Handshake done
2013-04-13 12:42:32,599 INFO  [cloud.serializer.GsonHelper] (Agent-Handler-1:null) Default Builder inited.
2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Proccess agent startup answer, agent id = 1
2013-04-13 12:42:32,803 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Set agent id 1
2013-04-13 12:42:32,808 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Startup Response Received: agent id = 1


The management log says this: 

2013-04-13 12:43:28,952 DEBUG [cloud.network.NetworkManagerImpl] (secstorage-1:null) Lock is released for network id 201 as a part of network implement
2013-04-13 12:43:28,969 DEBUG [db.Transaction.Transaction] (secstorage-1:null) Rolling back the transaction: Time = 1 Name =  -SystemVmLoadScanner$1.run:71-Executors$RunnableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runAndReset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165-ScheduledThreadPoolExecutor$ScheduledFutureTask.run:267-ThreadPoolExecutor.runWorker:1146-ThreadPoolExecutor$Worker.run:615-Thread.run:679; called by -Transaction.rollback:887-DataCenterIpAddressDaoImpl.takeIpAddress:57-DatabaseCallback.intercept:34-DataCenterDaoImpl.allocatePrivateIpAddress:228-DatabaseCallback.intercept:34-PodBasedNetworkGuru.reserve:119-NetworkManagerImpl.prepareNic:2143-NetworkManagerImpl.prepare:2113-VirtualMachineManagerImpl.advanceStart:752-VirtualMachineManagerImpl.start:472-VirtualMachineManagerImpl.start:465-SecondaryStorageManagerImpl.startSecStorageVm:257
2013-04-13 12:43:28,970 INFO  [cloud.vm.VirtualMachineManagerImpl] (secstorage-1:null) Insufficient capacity
com.cloud.exception.InsufficientAddressCapacityException: Unable to get a management ip addressScope=interface com.cloud.dc.Pod; id=1
	at com.cloud.network.guru.PodBasedNetworkGuru.reserve(PodBasedNetworkGuru.java:121)
	at com.cloud.network.NetworkManagerImpl.prepareNic(NetworkManagerImpl.java:2143)
	at com.cloud.network.NetworkManagerImpl.prepare(NetworkManagerImpl.java:2113)
	at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:752)
	at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:472)
	at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:465)
	at com.cloud.storage.secondary.SecondaryStorageManagerImpl.startSecStorageVm(SecondaryStorageManagerImpl.java:257)
	at com.cloud.storage.secondary.SecondaryStorageManagerImpl.allocCapacity(SecondaryStorageManagerImpl.java:684)
	at com.cloud.storage.secondary.SecondaryStorageManagerImpl.expandPool(SecondaryStorageManagerImpl.java:1310)
	at com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:119)
	at com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:50)
	at com.cloud.vm.SystemVmLoadScanner.loadScan(SystemVmLoadScanner.java:106)
	at com.cloud.vm.SystemVmLoadScanner.access$100(SystemVmLoadScanner.java:34)
	at com.cloud.vm.SystemVmLoadScanner$1.reallyRun(SystemVmLoadScanner.java:83)
	at com.cloud.vm.SystemVmLoadScanner$1.run(SystemVmLoadScanner.java:73)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:679)
2013-04-13 12:43:28,973 DEBUG [cloud.vm.VirtualMachineManagerImpl] (secstorage-1:null) Cleaning up resources for the vm VM[SecondaryStorageVm|s-588-VM] in Starting state
2013-04-13 12:43:28,975 DEBUG [agent.transport.Request] (secstorage-1:null) Seq 1-751304715: Waiting for Seq 751304714 Scheduling:  { Cmd , MgmtId: 219948120943996, via: 1, Ver: v1, Flags: 100111, [{"StopCommand":{"isProxy":false,"vmName":"s-588-VM","wait":0}}] }
2013-04-13 12:43:29,186 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (RouterStatusMonitor-1:null) Found 0 routers.
2013-04-13 12:43:37,927 DEBUG [agent.manager.AgentManagerImpl] (AgentManager-Handler-14:null) Ping from 1
2013-04-13 12:43:43,240 DEBUG [cloud.server.StatsCollector] (StatsCollector-3:null) VmStatsCollector is running...
2013-04-13 12:43:43,323 DEBUG [cloud.server.StatsCollector] (StatsCollector-3:null) StorageCollector is running...
2013-04-13 12:43:43,327 DEBUG [cloud.server.StatsCollector] (StatsCollector-3:null) There is no secondary storage VM for secondary storage host nfs://96.31.67.232/secondary
2013-04-13 12:43:43,400 DEBUG [agent.transport.Request] (StatsCollector-3:null) Seq 1-751304716: Received:  { Ans: , MgmtId: 219948120943996, via: 1, Ver: v1, Flags: 10, { GetStorageStatsAnswer } }
2013-04-13 12:43:43,936 DEBUG [cloud.server.StatsCollector] (StatsCollector-3:null) HostStatsCollector is running...
2013-04-13 12:43:44,545 DEBUG [agent.transport.Request] (StatsCollector-3:null) Seq 1-751304717: Received:  { Ans: , MgmtId: 219948120943996, via: 1, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
2013-04-13 12:43:58,231 DEBUG [cloud.server.ManagementServerImpl] (EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT 2013
2013-04-13 12:43:58,233 DEBUG [cloud.server.ManagementServerImpl] (EventChecker-1:null) Found 0 events to be purged
2013-04-13 12:43:58,235 DEBUG [cloud.server.ManagementServerImpl] (EventChecker-1:null) Deleting events older than: Fri Apr 12 12:43:58 CDT 2013
2013-04-13 12:43:58,238 DEBUG [cloud.server.ManagementServerImpl] (EventChecker-1:null) Found 0 events to be purged
2013-04-13 12:43:59,186 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (RouterStatusMonitor-1:null) Found 0 routers.
[root@lunder agent]#
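
For anyone triaging a similar paste: the line that matters in the management log above is the `InsufficientAddressCapacityException`, which is easy to miss between the DEBUG lines and the stack frames. Below is a minimal Python sketch for pulling such lines out of a log. The function name `find_exceptions` and the regular expression are illustrative assumptions, not part of CloudStack, and the pattern only catches lines that begin with a fully qualified exception class, as in the trace above.

```python
import re

# Matches a line that starts with a dotted Java class name ending in
# "Exception" or "Error", optionally prefixed by "Caused by:".
# Stack frames ("\tat ...") and timestamped log lines do not match.
EXCEPTION_RE = re.compile(
    r"^\s*(?:Caused by:\s*)?((?:[\w$]+\.)+[\w$]*(?:Exception|Error))\b:?\s*(.*)$"
)


def find_exceptions(lines):
    """Yield (line_number, exception_class, message) for each exception line."""
    for number, line in enumerate(lines, start=1):
        match = EXCEPTION_RE.match(line)
        if match:
            yield number, match.group(1), match.group(2).strip()


# Sample input copied from the management log in this thread.
SAMPLE = [
    "2013-04-13 12:43:28,970 INFO  [cloud.vm.VirtualMachineManagerImpl] "
    "(secstorage-1:null) Insufficient capacity",
    "com.cloud.exception.InsufficientAddressCapacityException: Unable to get "
    "a management ip addressScope=interface com.cloud.dc.Pod; id=1",
    "\tat com.cloud.network.guru.PodBasedNetworkGuru.reserve"
    "(PodBasedNetworkGuru.java:121)",
]

for number, name, message in find_exceptions(SAMPLE):
    print(f"line {number}: {name}: {message}")
```

To run it against a real file, pass an open file object instead of `SAMPLE`; the log path depends on the installation and is not shown in this thread.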




On Apr 13, 2013, at 12:30 PM, Marcus Sorensen <sh...@gmail.com> wrote:

> Well, you've got something trying to start, because you have vnet
> interfaces. You need to look at your agent logs to see why the system VMs
> refuse to start. If the power went out, it could be corruption; the system
> VMs may be waiting for you to fsck. It sounds like maybe the system was put
> into production without testing to make sure the host settings were
> persistent and would survive a reboot?
> 
> So 1) look at your agent logs, and 2) use VNC to look at whatever system
> VMs are running and see what state they are in. They will probably
> continually try to start and then shut down.

Re: Emergency: Cloud NOT starting

Posted by Marcus Sorensen <sh...@gmail.com>.

Well, you've got something trying to start, because you have vnet
interfaces. You need to look at your agent logs to see why the system VMs
refuse to start. If the power went out, it could be corruption; the system
VMs may be waiting for you to fsck. It sounds like maybe the system was put
into production without testing to make sure the host settings were
persistent and would survive a reboot?

So 1) look at your agent logs, and 2) use VNC to look at whatever system
VMs are running and see what state they are in. They will probably
continually try to start and then shut down.
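
As a rough illustration of step 1, here is a minimal Python sketch that keeps only WARN and ERROR entries from a log. It assumes the agent log uses the same "timestamp LEVEL [logger] message" layout as the excerpts pasted in this thread; the function name `filter_by_level` is hypothetical and not part of CloudStack.

```python
import re

# Layout assumed from the excerpts in this thread, e.g.
# "2013-04-13 12:29:33,067 WARN  [cloud.api.ApiDispatcher] (Job-...) text"
LOG_LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(\w+)\s+\[([^\]]+)\]\s+(.*)$"
)


def filter_by_level(lines, levels=("WARN", "ERROR")):
    """Return (timestamp, level, logger, message) for lines at the given levels."""
    wanted = set(levels)
    result = []
    for line in lines:
        match = LOG_LINE_RE.match(line)
        if match and match.group(2) in wanted:
            result.append(match.groups())
    return result


# Sample lines copied from the logs pasted in this thread.
SAMPLE_LOG = [
    "2013-04-13 12:42:32,444 INFO  [utils.nio.NioClient] "
    "(Agent-Selector:null) SSL: Handshake done",
    "2013-04-13 12:29:33,067 WARN  [cloud.api.ApiDispatcher] "
    "(Job-Executor-1:job-132) class com.cloud.api.ServerApiException : null",
    "2013-04-13 12:43:37,927 DEBUG [agent.manager.AgentManagerImpl] "
    "(AgentManager-Handler-14:null) Ping from 1",
]

for timestamp, level, logger, message in filter_by_level(SAMPLE_LOG):
    print(timestamp, level, logger, message)
```

Stack-trace lines carry no timestamp, so they are skipped by this filter; it is meant only to narrow down where to start reading.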

Re: Emergency: Cloud NOT starting

Posted by Maurice Lawler <ma...@me.com>.

Now a new error shows: 



	at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:432)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:679)
2013-04-13 12:29:33,067 WARN  [cloud.api.ApiDispatcher] (Job-Executor-1:job-132) class com.cloud.api.ServerApiException : null
2013-04-13 12:29:33,068 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-1:job-132) Complete async job-132, jobStatus: 2, resultCode: 530, result: Error Code: 534 Error text: null
2013-04-13 12:29:34,382 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Detected management node left, id:1, nodeIP:myipaddress
2013-04-13 12:29:34,382 INFO  [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Trying to connect to myipaddress
2013-04-13 12:29:34,382 INFO  [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Management node 1 is detected inactive by timestamp but is pingable


