Posted to users@cloudstack.apache.org by Francois Scheurer <fr...@everyware.ch> on 2017/01/25 14:49:28 UTC

cs 4.5.1, vm stopped when putting xen hosts in maintenance

Dear CS contributors


We use CS 4.5.1 with 3 clusters on XenServer 6.5 and shared primary 
storage.

When we put a host in maintenance, CS evacuates all VMs to other hosts.

But it regularly happens that some VMs are stopped instead of live 
migrated.

If we restart those stopped VMs, CS will later stop them again.

We need to start them several times until they stay up.

Is this a known issue? Is it fixed in CS 4.9.2?


Manually migrating all VMs works, but it is impractical because it is 
too time-consuming.


Many thanks in advance for your help.



Best Regards
Francois






-- 


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheurer@everyware.ch
web: http://www.everyware.ch

Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Rafael Weingärtner <ra...@gmail.com>.
Francois, you are welcome.
The reported problem should not appear during host migrations if you
have disabled HA on the VMs.

If you find any other problem or need any other help, please do not
hesitate to reach out to our users or dev mailing lists.

On Thu, Feb 9, 2017 at 8:26 AM, Francois Scheurer <
francois.scheurer@everyware.ch> wrote:

> Dear Rafael
>
>
> Many thanks for having checked the potential impacts and given us your
> "blessing" before proceeding.
> We did the maintenance this morning and no VMs were stopped.
> Neither op_it_work nor op_ha_work got any new rows after the MSs
> restarted (as they did before).
> So the problem seems to be fixed.
>
> It is still possible that it appears again during host maintenance, but
> now we have a workaround (DB cleanup).
> We also updated our test env. to CS 4.9.2 with all test cases succeeding
> and are planning the upgrade on Prod in the short term.
> If the bug still arises in 4.9.2, we will open a Jira issue.
>
> Thank you again for your help, Rafael.
>
>
> Best Regards
> Francois Scheurer
>
>
>
> On 08.02.2017 16:59, Rafael Weingärtner wrote:
>
> Your update seems reasonable. The methods that use the “step” data are the
> following:
>
>    - com.cloud.vm.VirtualMachineManagerImpl.checkWorkItems(VMInstanceVO,
>    State)
>    - com.cloud.vm.VirtualMachineManagerImpl.cleanup(VirtualMachineGuru,
>    VirtualMachineProfile, ItWorkVO, Event, boolean)
>    - com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(String,
>    Map<Param, Object>, DeploymentPlan, DeploymentPlanner)
>
>
> Basically, all of them would stop the flow of execution if the step is
> defined as “done”. What I would do before running your “update” query is
> checking the current steps of the VMs that were having problems, just to
> make sure you are not cutting some processes that may be important. After
> that, I would probably do the same as you and update them to done. I am
> assuming you already removed the HA from VMs, right?
>
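> Just to illustrate that check (a sketch only; the column names come from the
> listing you posted and the ids from your problematic VMs, so adjust them to
> your case):
>
>   select id, instance_id, step, type, created_at from op_it_work
>   where step!="Done" and instance_id in (1177, 1178, 1240, 1690, 1736);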
>
> On Wed, Feb 8, 2017 at 5:32 AM, Francois Scheurer <
> <fr...@everyware.ch> wrote:
>
>> Dear Rafael
>>
>>
>> You are right about cloudmonkey with CS 4.5.2 being able to update the
>> ha_enable flag on VMs. Thank you for that.
>>
>> Back to the central question: do you think it is safe to run this update
>> while all MSs are stopped?
>>   update op_it_work set step="Done" where step!="Done" and
>> instance_id=<X>;
>>
>> Where <X> points to an expunged VM or to one of the specific VMs that
>> are intermittently being stopped.
>>
>> If you are not sure about it, do you maybe know somebody else who could
>> help us with this?
>>
>>
>> Best Regards
>>
>> Francois Scheurer
>>
>>
>>
>>
>>
>>
>> On 07.02.2017 17:47, Rafael Weingärtner wrote:
>>
>> Got the idea of your workaround now.
>>
>> I had not paid much attention to the table data you posted (sorry for
>> that). Well, it seems you have it all taken care of.
>>
>> I have a suggestion to make: instead of changing data directly in the
>> database, what about using CloudMonkey to update the HA flag to false on
>> the VMs? Currently, it is the UI that is blocking such customization, but
>> according to the API, you should be able to do that.
>> https://cloudstack.apache.org/api/apidocs-4.5/root_admin/updateVirtualMachine.html
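>>
>> With cloudmonkey that would look roughly like this (a sketch only; "haenable"
>> is the parameter name from those API docs and the id is a placeholder):
>>
>>   update virtualmachine id=<vm-uuid> haenable=false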
>>
>> On Tue, Feb 7, 2017 at 11:25 AM, Francois Scheurer <
>> francois.scheurer@everyware.ch> wrote:
>>
>>> Dear Rafael
>>>
>>>
>>> We changed it on the VM to avoid the required stop and start. Changing
>>> it on the used Offering does not affect vm_instance.ha_enabled.
>>> We could create a new Offering without HA, but the VM needs to be stopped
>>> before we can apply it.
>>>
>>> We did not disable the HA on these 4 VMs to solve the problem, but
>>> just to minimize the impact of the problem (now only i-189-1177-VM should
>>> be stopped intermittently) in case the cleaning of op_it_work does not fix
>>> the issue.
>>> Of course we could filter the update query with “instance id”, but if
>>> you look at the table output, all instance_id values are related to the 5
>>> problematic VMs.
>>> The only exceptions are the first 3 rows, which are related to VMs that
>>> have not existed anymore since July 2015 ...
>>> That is why I thought it might be safe to do.
>>>
>>>
>>> Best Regards
>>>
>>> Francois Scheurer
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 07.02.2017 16:27, Rafael Weingärtner wrote:
>>>
>>> So, you changed the flags directly on "vm_instance". Why did you not
>>> change the service offering of these VMs using the ACS UI or API? For me, it
>>> feels that the best approach would be to change the service offering of the
>>> affected VMs (the HA flag is first configured at the service offering). You
>>> could assign a service offering without HA; I mean, it would be a matter of
>>> creating a similar service offering (same configurations to the one they
>>> are using right now), but without HA.
>>>
>>> I am assuming that the change directly on “vm_instance” did not solve
>>> your problem, right? That is why you want to change some data in
>>> “op_it_work”. I took a quick look at “op_it_work”, and it seems that this
>>> table may be used for more things than HA. Here is a tip for complex
>>> production systems: never run update SQLs in production environments without
>>> a very, very restrictive and well-thought-out “where” clause. In my opinion
>>> this is not restrictive enough (update op_it_work set step="Done" where
>>> step!="Done"). I would advise you not to go further and change
>>> configurations directly in the database (unless you have a dev team that is
>>> capable of reading, debugging, and working with the ACS source code). ACS has
>>> over 500k lines of Java code, and God knows how many lines of shell, Python
>>> and other languages, software and systems that may be used; it is pretty hard
>>> sometimes to get a clear picture of how a change in the database will
>>> reverberate through the whole system without knowing in detail the
>>> installation configuration and underlying structures (networking,
>>> hypervisors, storage systems and other pieces). However, if you want to
>>> proceed with that, I would suggest adding some extra filters to that “where”
>>> clause. What about adding the “instance id” there?
>>>
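>>> For example (again, just a sketch; double check the ids against your own
>>> listing first), restricting the statement to the problematic instances would
>>> look like this:
>>>
>>>   update op_it_work set step="Done"
>>>   where step!="Done" and instance_id in (1177, 1178, 1240, 1690, 1736);
>>>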
>>> PS: BTW, have you already heard about the CloudStack fork from Schuberg
>>> Philis named Cosmic? https://github.com/MissionCriticalCloud/cosmic
>>> Yes, I have. They were part of the active ACS community back then. I
>>> was already here when they created the fork. Still, I stuck with ACS.
>>>
>>>
>>> On Tue, Feb 7, 2017 at 3:54 AM, Francois Scheurer <
>>> francois.scheurer@everyware.ch> wrote:
>>>
>>>> Dear Rafael
>>>>
>>>>
>>>> We temporarily disabled the HA flag on these VMs by updating the DB
>>>> directly: update vm_instance set ha_enabled=0 where removed is null and
>>>> id in (1178, 1240, 1690, 1736);
>>>>
>>>> We kept the test VM i-189-1177-VM with the HA flag.
>>>>
>>>> Restarting the MS + cleaning the op_it_work table will hopefully fix
>>>> the issue. And if the issue persists, then only i-189-1177-VM
>>>> should be affected, hence mitigating the impact.
>>>>
>>>> On the other hand, if everything runs fine, we will afterwards re-enable
>>>> the HA flag on the problematic VMs.
>>>>
>>>>
>>>> Do you think it is safe to run this update?: update op_it_work set
>>>> step="Done" where step!="Done";
>>>>
>>>>
>>>> Best Regards
>>>>
>>>> Francois Scheurer
>>>>
>>>>
>>>>
>>>> PS: BTW, have you already heard about the CloudStack fork from Schuberg
>>>> Philis named Cosmic? https://github.com/MissionCriticalCloud/cosmic
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 06.02.2017 16:41, Rafael Weingärtner wrote:
>>>>
>>>> I totally understand your position of not being able to debug the ACS
>>>> in production ;)
>>>>
>>>>
>>>> There is one bit I did not quite understand. If you have disabled the
>>>> HA for the VMs, why do you still need to change data in the database? How did
>>>> you disable the HA of the VMs?
>>>>
>>>> On Mon, Feb 6, 2017 at 5:04 AM, Francois Scheurer <
>>>> <fr...@everyware.ch> wrote:
>>>>
>>>>> Dear Rafael / Dear CS contributors
>>>>>
>>>>>
>>>>> I tried the following:
>>>>>
>>>>> 1.     stop all MS
>>>>> 2.     backup DB
>>>>> 3.     select * from op_ha_work where type="Migration" and
>>>>> mgmt_server_id is null and taken is null;
>>>>> 4.     delete from op_ha_work where type="Migration" and
>>>>> mgmt_server_id is null and taken is null;
>>>>> 5.     start all MS
>>>>>
>>>>> But actually, after the MSs stopped, the op_ha_work table was
>>>>> automatically cleaned, making my steps useless.
>>>>> After the MSs started, the op_ha_work table was filled again and the related
>>>>> VMs were stopped several times (3-7 times), requiring close attention to
>>>>> manually restart them after each stop.
>>>>>
>>>>> I later found another table that could be the cause of the issue:
>>>>>
>>>>> mysql> select op_it_work.*,vm_instance.instance_name,vm_instance.removed
>>>>> from op_it_work join vm_instance where op_it_work.step!="Done" and
>>>>> op_it_work.instance_id=vm_instance.id order by instance_id;
>>>>> +--------------------------------------+----------------+---
>>>>> ---------+-----------------------+-----------+--------------
>>>>> ------+-----------+------------+-------------+--------------
>>>>> -+-------------+---------------+---------------------+
>>>>> | id                                   | mgmt_server_id | created_at
>>>>> | thread                | type      | vm_type            | step      |
>>>>> updated_at | instance_id | resource_type | resource_id | instance_name |
>>>>> removed             |
>>>>> +--------------------------------------+----------------+---
>>>>> ---------+-----------------------+-----------+--------------
>>>>> ------+-----------+------------+-------------+--------------
>>>>> -+-------------+---------------+---------------------+
>>>>> | be4d0dd2-7d4f-47d3-ba49-623d2f115bbf |   345049098498 | 1436187965
>>>>> | consoleproxy-1        | Starting  | ConsoleProxy       | Release   |
>>>>> 1436188265 |         402 | NULL          |           0 | v-402-VM      |
>>>>> 2015-07-06 13:23:22 |
>>>>> | fb4d1844-eb92-478c-9f50-d666790f43af |   345049098498 | 1436187965
>>>>> | secstorage-1          | Starting  | SecondaryStorageVm | Release   |
>>>>> 1436188265 |         687 | NULL          |           0 | s-687-VM      |
>>>>> 2015-07-06 13:11:34 |
>>>>> | 7dfeb25a-a705-42e8-907c-79d1e69697ed |   345049098498 | 1436964548
>>>>> | Work-Job-Executor-165 | Migrating | User               | Migrating |
>>>>> 1436964548 |         776 | Host          |          29 | i-51-776-VM   |
>>>>> 2015-07-15 12:55:38 |
>>>>> | 2018ac61-b11c-4101-8ef5-07a39509f2bd |   345049098498 | 1486020173
>>>>> | Work-Job-Executor-19  | Migrating | User               | Prepare   |
>>>>> 1486020173 |        1177 | Host          |          25 | i-189-1177-VM |
>>>>> NULL                |
>>>>> | 25430cd5-4346-4643-9c7e-7b39b8a876a2 |   345049098122 | 1486019693
>>>>> | Work-Job-Executor-6   | Migrating | User               | Prepare   |
>>>>> 1486019693 |        1177 | Host          |          25 | i-189-1177-VM |
>>>>> NULL                |
>>>>> | 3276d52e-cbd9-44c5-8ab6-c70784337ff9 |   345049098498 | 1486020953
>>>>> | Work-Job-Executor-33  | Migrating | User               | Prepare   |
>>>>> 1486020953 |        1177 | Host          |          33 | i-189-1177-VM |
>>>>> NULL                |
>>>>> | 6462da33-83f2-4575-af38-0e52d2a48488 |   345049098122 | 1486021433
>>>>> | Work-Job-Executor-12  | Migrating | User               | Prepare   |
>>>>> 1486021433 |        1177 | Host          |          25 | i-189-1177-VM |
>>>>> NULL                |
>>>>> | 688e16c8-c39f-4e88-9e8f-36c0282f3df1 |   345049098122 | 1486020773
>>>>> | Work-Job-Executor-11  | Migrating | User               | Prepare   |
>>>>> 1486020773 |        1177 | Host          |          25 | i-189-1177-VM |
>>>>> NULL                |
>>>>> | d8a18e9d-ac6f-4273-923e-e401533e2cf7 |   345049098498 | 1486020353
>>>>> | Work-Job-Executor-25  | Migrating | User               | Prepare   |
>>>>> 1486020353 |        1177 | Host          |          25 | i-189-1177-VM |
>>>>> NULL                |
>>>>> | 1222ece8-399d-43d8-8ff8-d24bdebca2ed |   345049103441 | 1486018924
>>>>> | Work-Job-Executor-2   | Migrating | User               | Prepare   |
>>>>> 1486018924 |        1178 | Host          |          29 | i-189-1178-VM |
>>>>> NULL                |
>>>>> | 2ac5ca4a-c20d-433a-8379-40e0de3b3ba8 |   345049098498 | 1486021613
>>>>> | Work-Job-Executor-41  | Migrating | User               | Prepare   |
>>>>> 1486021613 |        1178 | Host          |          29 | i-189-1178-VM |
>>>>> NULL                |
>>>>> | 3e3b95a0-2af4-4268-bfdd-f3399a85bcaa |   345049103441 | 1486019584
>>>>> | Work-Job-Executor-5   | Migrating | User               | Prepare   |
>>>>> 1486019584 |        1178 | Host          |          29 | i-189-1178-VM |
>>>>> NULL                |
>>>>> | 53745a2d-056c-4a1b-9267-eb97f077a6c8 |   345049103441 | 1486020244
>>>>> | Work-Job-Executor-6   | Migrating | User               | Prepare   |
>>>>> 1486020244 |        1178 | Host          |          29 | i-189-1178-VM |
>>>>> NULL                |
>>>>> | e3555b49-7c18-4272-b7f4-ba9351b3bbd1 |   345049098498 | 1486020893
>>>>> | Work-Job-Executor-32  | Migrating | User               | Prepare   |
>>>>> 1486020893 |        1178 | Host          |          29 | i-189-1178-VM |
>>>>> NULL                |
>>>>> | 1dcd5204-4f9a-4342-a23b-329a48596e13 |   345049098498 | 1486020953
>>>>> | Work-Job-Executor-34  | Migrating | User               | Prepare   |
>>>>> 1486020953 |        1240 | Host          |           3 | i-388-1240-VM |
>>>>> NULL                |
>>>>> | ec1668b3-fa3c-4673-b593-3d607b169a24 |   345049098498 | 1486019693
>>>>> | Work-Job-Executor-14  | Migrating | User               | Prepare   |
>>>>> 1486019693 |        1240 | Host          |           3 | i-388-1240-VM |
>>>>> NULL                |
>>>>> | fa7b2dec-ed7f-4a67-a055-e6e6bc1631ca |   345049098498 | 1486020353
>>>>> | Work-Job-Executor-26  | Migrating | User               | Prepare   |
>>>>> 1486020353 |        1240 | Host          |           3 | i-388-1240-VM |
>>>>> NULL                |
>>>>> | 8cfe6d3e-5b7b-4be8-ac10-c865a492aab9 |   345049098498 | 1465387271
>>>>> | Work-Job-Executor-18  | Running   | User               | Prepare   |
>>>>> 1465387271 |        1388 | Host          |           7 | i-361-1388-VM |
>>>>> NULL                |
>>>>> | 6e941b7e-6f19-4717-bf82-974c59737dbc |   345049103441 | 1486019524
>>>>> | Work-Job-Executor-4   | Migrating | User               | Prepare   |
>>>>> 1486019524 |        1690 | Host          |          15 | i-648-1690-VM |
>>>>> NULL                |
>>>>> | 718578a4-0dd6-42d9-be10-ed7680650108 |   345049098122 | 1486021433
>>>>> | Work-Job-Executor-13  | Migrating | User               | Prepare   |
>>>>> 1486021433 |        1690 | Host          |          15 | i-648-1690-VM |
>>>>> NULL                |
>>>>> | 9acf3908-ef13-4b03-8c2b-1a7e55e18491 |   345049098498 | 1486020173
>>>>> | Work-Job-Executor-20  | Migrating | User               | Prepare   |
>>>>> 1486020173 |        1690 | Host          |          15 | i-648-1690-VM |
>>>>> NULL                |
>>>>> | a50c420b-ae52-4caa-83db-704b52b299a4 |   345049098498 | 1486020773
>>>>> | Work-Job-Executor-28  | Migrating | User               | Prepare   |
>>>>> 1486020773 |        1690 | Host          |          15 | i-648-1690-VM |
>>>>> NULL                |
>>>>> | 2b5f4179-5975-4326-9db7-643c3ede63fe |   345049103441 | 1484224516
>>>>> | Work-Job-Executor-161 | Running   | User               | Prepare   |
>>>>> 1484224516 |        1703 | Host          |           3 | i-51-1703-VM  |
>>>>> NULL                |
>>>>> | 2f3c6c44-b2a0-4d10-88d2-ac143d2dd379 |   345049098498 | 1484223910
>>>>> | Work-Job-Executor-163 | Running   | User               | Prepare   |
>>>>> 1484223910 |        1703 | Host          |           3 | i-51-1703-VM  |
>>>>> NULL                |
>>>>> | 4126a23b-1cde-49bc-8636-c0d063b85cf6 |   345049103441 | 1484224636
>>>>> | Work-Job-Executor-86  | Running   | User               | Prepare   |
>>>>> 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  |
>>>>> NULL                |
>>>>> | 5dad5645-766b-48c9-9a85-b77b3c101390 |   345049103441 | 1484224636
>>>>> | Work-Job-Executor-92  | Running   | User               | Prepare   |
>>>>> 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  |
>>>>> NULL                |
>>>>> | 8c1dfff9-e55d-4f81-a22f-45f72accc416 |   345049103441 | 1484224576
>>>>> | Work-Job-Executor-95  | Running   | User               | Prepare   |
>>>>> 1484224576 |        1703 | Host          |           3 | i-51-1703-VM  |
>>>>> NULL                |
>>>>> | a703d2d7-b1e3-42ce-a551-831982990ffb |   345049098498 | 1484224630
>>>>> | Work-Job-Executor-13  | Running   | User               | Prepare   |
>>>>> 1484224630 |        1703 | Host          |           3 | i-51-1703-VM  |
>>>>> NULL                |
>>>>> | 0e194cb5-95b7-41cb-969a-ae37d2ad449a |   345049098498 | 1485763613
>>>>> | Work-Job-Executor-28  | Migrating | User               | Prepare   |
>>>>> 1485763613 |        1735 | Host          |          19 | i-638-1735-VM |
>>>>> 2017-01-30 15:24:10 |
>>>>> | 74f180b9-be0f-429b-9a45-597ca2ca997d |   345049098498 | 1485763013
>>>>> | Work-Job-Executor-21  | Migrating | User               | Prepare   |
>>>>> 1485763013 |        1735 | Host          |          19 | i-638-1735-VM |
>>>>> 2017-01-30 15:24:10 |
>>>>> | feaa421c-88df-4493-9a4f-bddb6ee1b75b |   345049098122 | 1485764285
>>>>> | Work-Job-Executor-19  | Migrating | User               | Prepare   |
>>>>> 1485764285 |        1735 | Host          |          19 | i-638-1735-VM |
>>>>> 2017-01-30 15:24:10 |
>>>>> | 32138a36-ad18-42b5-8478-76ad63caf4da |   345049098122 | 1486020293
>>>>> | Work-Job-Executor-9   | Migrating | User               | Prepare   |
>>>>> 1486020293 |        1736 | Host          |          19 | i-638-1736-VM |
>>>>> NULL                |
>>>>> | 5306ab1f-3c92-4f95-bbff-283b43e6862d |   345049098498 | 1486019573
>>>>> | Work-Job-Executor-11  | Migrating | User               | Prepare   |
>>>>> 1486019573 |        1736 | Host          |          19 | i-638-1736-VM |
>>>>> NULL                |
>>>>> | 843bdf9c-247d-432c-ac19-1a4f7858631b |   345049098498 | 1486021493
>>>>> | Work-Job-Executor-38  | Migrating | User               | Prepare   |
>>>>> 1486021493 |        1736 | Host          |          19 | i-638-1736-VM |
>>>>> NULL                |
>>>>> | 8d400afb-766f-4ac5-89ee-854f25a38eba |   345049098498 | 1486020173
>>>>> | Work-Job-Executor-21  | Migrating | User               | Prepare   |
>>>>> 1486020173 |        1736 | Host          |          19 | i-638-1736-VM |
>>>>> NULL                |
>>>>> | f0ae5612-b6ef-4882-94a2-40a1e6dce342 |   345049098498 | 1486020833
>>>>> | Work-Job-Executor-30  | Migrating | User               | Prepare   |
>>>>> 1486020833 |        1736 | Host          |          19 | i-638-1736-VM |
>>>>> NULL                |
>>>>> +--------------------------------------+----------------+---
>>>>> ---------+-----------------------+-----------+--------------
>>>>> ------+-----------+------------+-------------+--------------
>>>>> -+-------------+---------------+---------------------+
>>>>>
>>>>> As you can see, the problematic VMs are all present in that table:
>>>>>
>>>>> i-189-1177-VM
>>>>> i-189-1178-VM
>>>>> i-388-1240-VM
>>>>> i-648-1690-VM
>>>>> i-638-1736-VM
>>>>>
>>>>> As the MSs are running the production environments, we cannot debug a
>>>>> lot and just need the shortest path to resolution.
>>>>> For example, restarting the MSs has an impact on our customers, and
>>>>> we need to keep the number of restarts to a minimum.
>>>>>
>>>>> I have disabled the HA on all problematic VMs except on i-189-1177-VM,
>>>>> which is less critical.
>>>>>
>>>>> The next steps could be:
>>>>> 1.     stop all MS
>>>>> 2.     backup DB
>>>>> 3.     select op_it_work.*,vm_instance.instance_name,vm_instance.removed
>>>>> from op_it_work join vm_instance where op_it_work.step!="Done" and
>>>>> op_it_work.instance_id=vm_instance.id order by instance_id;
>>>>> 4.     update op_it_work set step="Done" where step!="Done";
>>>>> 5.     start all MS
>>>>>
>>>>> What do you think about step 4, safe or risky?
>>>>> Some of these entries are related to deleted VMs and seem safe to
>>>>> mark as "Done".
>>>>> All other entries are related to the problematic VMs that keep
>>>>> stopping... so marking them as "Done" may (hopefully) eliminate the cause
>>>>> of the Migrating/Stopping.
>>>>> But we are not sure whether marking them as "Done" also requires changes
>>>>> in other tables for consistency.
>>>>>
>>>>>
>>>>> Thank you for your help.
>>>>>
>>>>>
>>>>> Best Regards
>>>>> Francois Scheurer
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 01.02.2017 03:01, Rafael Weingärtner wrote:
>>>>>
>>>>> Ok, I dug a little further into the code.
>>>>>
>>>>> If we put a host in maintenance, it may unfortunately happen that a
>>>>>> timeout occurs and CS aborts the migration. As far as we can interpret the
>>>>>> logs, it seems (we speculate) that CS detects that the VM is then still
>>>>>> running on the original host, which is in maintenance, and decides somehow
>>>>>> to stop it because of the "host not up" status.
>>>>>>
>>>>>
>>>>> From what I have seen in the code, this may happen (as you speculated).
>>>>> If a timeout or some other event interrupts the migration, the VM will be
>>>>> stopped. This happens in the method
>>>>> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>>>> long, DeploymentPlanner)” at line 2396 and further. This was implemented
>>>>> on purpose; if the migration does not happen, the VM is stopped on the
>>>>> source host. Probably it was implemented this way following that gradient
>>>>> descent philosophy from artificial intelligence:
>>>>>
>>>>> “pick a random point in the field, update the values according to the
>>>>>> slope and hopefully, after successful iterations, you converge to a minimum
>>>>>> point.” (the problem is that sometimes the function diverges).
>>>>>>
>>>>>
>>>>> I mean, probably this was done this way because HA was supposed to
>>>>> see that a VM with HA is stopped, and should try to start it again
>>>>> somewhere else. The log you posted does not provide any other details to
>>>>> debug this further because the log is from after an MS restart, right?
>>>>>
>>>>>
>>>>> After restarting the MSs I guess that a similar issue may happen: the
>>>>>> hosts are disconnected during the early stage and then reconnect. Maybe CS
>>>>>> then stops the VMs again because of the "host not up" status?
>>>>>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>>> Disconnected, host resource state is: Enabled
>>>>>>
>>>>>
>>>>> Here is something interesting that you have found. I checked the code,
>>>>> but without a live debug it is pretty hard for me to get the right flow of
>>>>> execution. By looking at the code and at the logs you provided, your
>>>>> deduction is right. After ACS boots up, the check for hosts that are
>>>>> shut down/deactivated starts, and the method
>>>>> “com.cloud.agent.manager.AgentManagerImpl.executeUserRequest(long, Event)”
>>>>> is executed. The problem is that all of the hosts are considered
>>>>> disconnected (ACS may not have finished connecting to all of them yet), and
>>>>> then a whole HA process is kicked off. The problem is that there is no host
>>>>> with status “UP”. That is why this happens: “No suitable hosts found under
>>>>> this Cluster: 25”, and then the whole HA flow, which shuts down the VM, is
>>>>> kicked off.
>>>>>
>>>>> In theory, the HA worker should see that a VM with HA is down, and
>>>>> should try to start it. I do not understand why this is not working. If
>>>>> you decide to debug the running ACS, I may be able to help you, just keep
>>>>> me posted.
>>>>>
>>>>>
>>>>> On Tue, Jan 31, 2017 at 10:26 AM, Rafael Weingärtner <
>>>>> <ra...@gmail.com> wrote:
>>>>>
>>>>>> I will be able to re-check some code and answer you with new thoughts
>>>>>> on this problem tonight (the log files you provided today have some very
>>>>>> insightful information regarding this problem). I believe this is a bug in
>>>>>> HA, a very tricky one that has not been reported yet (at least I do not
>>>>>> remember seeing anything similar). And thanks to randomness/destiny now you
>>>>>> have found/uncovered a way to reproduce the problem (bug?!).
>>>>>>
>>>>>> I do not suggest changing any data at all directly in the
>>>>>> database (unless you know the ACS code well and are 100% sure of what you
>>>>>> are doing). This can cause even bigger and more complicated problems. If
>>>>>> you remove the HA feature from these VMs, this problem will stop. However,
>>>>>> if you can/are willing to maintain the HA until we fully uncover the
>>>>>> problem, map it, and report it, this would be great.
>>>>>> BTW: why are you restarting the MSs?
>>>>>> What are the configurations of your MSs?
>>>>>>
>>>>>> Thanks for your time and for sharing information with our community.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer <
>>>>>> <fr...@everyware.ch> wrote:
>>>>>>
>>>>>>> Dear Rafael
>>>>>>>
>>>>>>>
>>>>>>> Many thanks for your deep analysis.
>>>>>>> Nice humor with the planets alignment BTW ;)
>>>>>>>
>>>>>>> We also suspected the HA flag to be related to this issue, as we never
>>>>>>> had this issue in the past, until we enabled the HA on all VMs and
>>>>>>> Offerings.
>>>>>>>
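>>>>>>> As a side note, a quick way to list which VMs currently have the flag set
>>>>>>> (a sketch based only on the vm_instance columns used in this thread) would
>>>>>>> be:
>>>>>>>
>>>>>>>   select id, instance_name, ha_enabled from vm_instance
>>>>>>>   where removed is null and ha_enabled=1;
>>>>>>>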
>>>>>>> Here is also the workers list:
>>>>>>>
>>>>>>> mysql> select * from op_ha_work;
>>>>>>> +------+-------------+-----------+--------------------+-----
>>>>>>> -----+----------------+---------+---------------------+-----
>>>>>>> --+---------------------+-------+-------------+---------+
>>>>>>> | id   | instance_id | type      | vm_type            | state    |
>>>>>>> mgmt_server_id | host_id | created             | tried | taken
>>>>>>>  | step  | time_to_try | updated |
>>>>>>> +------+-------------+-----------+--------------------+-----
>>>>>>> -----+----------------+---------+---------------------+-----
>>>>>>> --+---------------------+-------+-------------+---------+
>>>>>>> |  105 |         343 | HA        | User               | Running  |
>>>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>>> Error |  1394056013 |       3 |
>>>>>>> |  109 |         407 | HA        | ConsoleProxy       | Running  |
>>>>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>>> Error |  1394056013 |       3 |
>>>>>>> |  113 |         407 | HA        | ConsoleProxy       | Running  |
>>>>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>>> Error |  1394056013 |       3 |
>>>>>>> |  117 |         407 | HA        | ConsoleProxy       | Running  |
>>>>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>>> Error |  1394056013 |       3 |
>>>>>>> |  121 |         343 | HA        | User               | Running  |
>>>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>>> Error |  1394056013 |       3 |
>>>>>>> |  125 |         405 | HA        | SecondaryStorageVm | Running  |
>>>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>>> Error |  1394056013 |       3 |
>>>>>>> |  129 |         393 | HA        | User               | Running  |
>>>>>>>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>>> Error |  1394056014 |      30 |
>>>>>>> |  133 |         405 | HA        | SecondaryStorageVm | Running  |
>>>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>>> Error |  1394056014 |       3 |
>>>>>>> |  137 |         402 | HA        | ConsoleProxy       | Running  |
>>>>>>>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>>> Error |  1394056014 |       3 |
>>>>>>> |  345 |         402 | HA        | ConsoleProxy       | Starting |
>>>>>>>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
>>>>>>> Error |  1402527506 |     770 |
>>>>>>> |  349 |         687 | HA        | SecondaryStorageVm | Starting |
>>>>>>>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
>>>>>>> Error |  1402527506 |       2 |
>>>>>>> | 2057 |        1178 | Migration | User               | Running  |
>>>>>>>  345049098498 |      33 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>>>>> Done  |  1451024774 |      95 |
>>>>>>> | 2058 |        1736 | Migration | User               | Running  |
>>>>>>>  345049098498 |     224 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>>>>> Done  |  1451024774 |      45 |
>>>>>>> | 2059 |        1240 | Migration | User               | Running  |
>>>>>>>  345049098498 |       7 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>>>>> Done  |  1451024774 |      61 |
>>>>>>> | 2060 |        1178 | Migration | User               | Running  |
>>>>>>>  345049098122 |      33 | 2017-01-31 07:56:08 | 1 | 2017-01-31 08:06:37 |
>>>>>>> Done  |  1451025374 |      95 |
>>>>>>> | 2061 |        1178 | Migration | User               | Stopped  |
>>>>>>>          NULL |      33 | 2017-01-31 07:56:09 | 5 | NULL                |
>>>>>>> Done  |  1451027900 |     109 |
>>>>>>> | 2062 |        1690 | Migration | User               | Running  |
>>>>>>>  345049098498 |     224 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
>>>>>>> Done  |  1451024774 |     109 |
>>>>>>> | 2063 |        1177 | Migration | User               | Running  |
>>>>>>>  345049098498 |      29 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
>>>>>>> Done  |  1451024774 |      82 |
>>>>>>> | 2064 |        1690 | Migration | User               | Running  |
>>>>>>>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>>>>> Done  |  1451024900 |     109 |
>>>>>>> | 2065 |        1240 | Migration | User               | Running  |
>>>>>>>  345049098122 |       7 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>>>>> Done  |  1451024900 |      61 |
>>>>>>> | 2066 |        1736 | Migration | User               | Running  |
>>>>>>>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>>>>> Done  |  1451024900 |      45 |
>>>>>>> | 2067 |        1240 | Migration | User               | Stopped  |
>>>>>>>          NULL |       7 | 2017-01-31 07:56:09 | 5 | NULL                |
>>>>>>> Done  |  1451028003 |      80 |
>>>>>>> | 2068 |        1178 | Migration | User               | Running  |
>>>>>>>  345049098122 |      33 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>>>> Done  |  1451024781 |      95 |
>>>>>>> | 2069 |        1736 | Migration | User               | Running  |
>>>>>>>  345049098122 |     224 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>>>> Done  |  1451024781 |      45 |
>>>>>>> | 2070 |        1240 | Migration | User               | Running  |
>>>>>>>  345049098122 |       7 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>>>> Done  |  1451024781 |      61 |
>>>>>>> | 2071 |        1177 | Migration | User               | Running  |
>>>>>>>  345049098122 |      29 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>>>> Done  |  1451024781 |      82 |
>>>>>>> | 2072 |        1177 | Migration | User               | Stopped  |
>>>>>>>          NULL |      29 | 2017-01-31 07:56:16 | 5 | NULL                |
>>>>>>> Done  |  1451027973 |     101 |
>>>>>>> | 2073 |        1177 | Migration | User               | Running  |
>>>>>>>  345049098122 |      29 | 2017-01-31 07:56:16 | 1 | 2017-01-31 07:58:22 |
>>>>>>> Done  |  1451024904 |      82 |
>>>>>>> | 2074 |        1690 | Migration | User               | Stopped  |
>>>>>>>          NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                |
>>>>>>> Done  |  1451027464 |     123 |
>>>>>>> | 2075 |        1736 | Migration | User               | Stopped  |
>>>>>>>          NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                |
>>>>>>> Done  |  1451027477 |      59 |
>>>>>>> +------+-------------+-----------+--------------------+-----
>>>>>>> -----+----------------+---------+---------------------+-----
>>>>>>> --+---------------------+-------+-------------+---------+
>>>>>>>
>>>>>>> If we put a host in maintenance, it may unfortunately happen that a
>>>>>>> timeout occurs and CS aborts the migration.
>>>>>>> As far as we can interpret the logs, it seems (we speculate) that CS
>>>>>>> detects that the VM is then still running on the original host, which is in
>>>>>>> maintenance, and decides somehow to stop it because of the "host not up"
>>>>>>> status.
>>>>>>> We then need to start the VM several times, until CS gives up
>>>>>>> stopping it.
>>>>>>>
>>>>>>> After restarting the MSs I guess that a similar issue may happen:
>>>>>>> the hosts are disconnected during the early stage and then reconnect.
>>>>>>> Maybe CS then stops the VMs again because of the "host not up" status?
>>>>>>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>>>>>> ctx-6e9c753f) The last host of this VM is not UP or is not enabled, host
>>>>>>> status is: Disconnected, host resource state is: Enabled
>>>>>>>
>>>>>>> But the problem after the MS restart definitely only impacts
>>>>>>> VMs listed in the table op_ha_work.
>>>>>>>
>>>>>>>
>>>>>>> To your questions:
>>>>>>>         We have 3 ACS MSs managing several Zones. Each zone has a
>>>>>>> single pod and cluster.
>>>>>>>         The Cluster of i-638-1735-VM has this summary:
>>>>>>>          Total number of VMs: 121
>>>>>>>          Total number hypervisors: 4
>>>>>>>          Total allocated RAM: 522.0 / 972 GB (53.7 %)
>>>>>>>          Total allocated cores: 211
>>>>>>>          Total allocated storage: 11285 GB
>>>>>>>         Used memory on the hosts currently varies from 28 to 77%, so
>>>>>>> there is enough RAM.
>>>>>>>         I did not check the XS logs. The same problem arose again
>>>>>>> this morning (shortly after a restart of all MSs), so I will look in
>>>>>>> the XS logs now.
>>>>>>>
>>>>>>>         I have some fair experience with Eclipse and Java; I will
>>>>>>> ask my boss if we can go this way.
>>>>>>>
>>>>>>> The main issue for now is that restarting the MSs reproduces
>>>>>>> the problem, which is quite annoying for the impacted VMs.
>>>>>>> I thought of stopping the MSs, deleting all HA/Migration entries in
>>>>>>> op_ha_work, and starting the MSs.
>>>>>>> Is that a viable option? (Or should I delete the Migration entries
>>>>>>> only? Or never touch this table?)
>>>>>>> We will also very probably remove the HA flag on all VMs if it does
>>>>>>> not work reliably.
>>>>>>>
>>>>>>> Best Regards
>>>>>>> Francois Scheurer
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Here are new logs regarding the same issue from today.
>>>>>>> I tried to extract the important parts related to the stop of
>>>>>>> i-189-1178-VM.
>>>>>>>
>>>>>>> regexp: (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRequestedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae733|work-2049)
>>>>>>>
>>>>>>> 2017-01-30 08:44:04,440 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Executing
>>>>>>> AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType: null,
>>>>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>>>>>>> F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm
>>>>>>> 1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTA
>>>>>>> ALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAA
>>>>>>> AAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resu
>>>>>>> ltCode: 0, result: null, initMsid: 345049103441, completeMsid: null,
>>>>>>> lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 2017}
>>>>>>> 2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Run VM work
>>>>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
>>>>>>>
>>>>>>> 2017-01-30 08:44:05,217 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Deploy avoids pods: [], clusters: [], hosts: [29]
>>>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
>>>>>>> (AgentConnectTaskPool-4:ctx-5b61f2e5) create ClusteredAgentAttache
>>>>>>> for 232
>>>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) DeploymentPlanner allocation algorithm:
>>>>>>> com.cloud.deploy.UserDispersingPlanner@1734b2e2
>>>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Trying to allocate a host and storage pools from dc:17,
>>>>>>> pod:25,cluster:25, requested cpu: 4200, requested ram: 4294967296
>>>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Is ROOT volume READY (pool already allocated)?: Yes
>>>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) This VM has last host_id specified, trying to choose the same
>>>>>>> host: 25
>>>>>>>
>>>>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) The last host of this VM is not UP or is not enabled, host
>>>>>>> status is: Disconnected, host resource state is: Enabled
>>>>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Cannot choose the last host to deploy this VM
>>>>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Searching resources only under specified Cluster: 25
>>>>>>>
>>>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>>>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>>>>>>> prioritization: []
>>>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>>>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>>>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>>>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>>>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) No suitable hosts found
>>>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) No suitable hosts found under this Cluster: 25
>>>>>>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Could not find suitable Deployment Destination for this VM
>>>>>>> under any clusters, returning.
>>>>>>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Searching resources only under specified Cluster: 25
>>>>>>> 2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) The specified cluster is in avoid set, returning.
>>>>>>> 2017-01-30 08:44:05,798 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Unable to find destination for migrating the vm
>>>>>>> VM[User|i-189-1178-VM]
>>>>>>> 2017-01-30 08:44:05,798 WARN [c.c.v.VirtualMachineManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Failed to deploy vm 1178 with original planner, sending
>>>>>>> HAPlanner
>>>>>>>
>>>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Deploy avoids pods: [], clusters: [], hosts: [29]
>>>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) DeploymentPlanner allocation algorithm:
>>>>>>> com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>>>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Trying to allocate a host and storage pools from dc:17,
>>>>>>> pod:25,cluster:25, requested cpu: 4200, requested ram: 4294967296
>>>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Is ROOT volume READY (pool already allocated)?: Yes
>>>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) This VM has last host_id specified, trying to choose the same
>>>>>>> host: 25
>>>>>>>
>>>>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) The last host of this VM is not UP or is not enabled, host
>>>>>>> status is: Disconnected, host resource state is: Enabled
>>>>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Cannot choose the last host to deploy this VM
>>>>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Searching resources only under specified Cluster: 25
>>>>>>> 2017-01-30 08:44:06,039 DEBUG [c.c.d.SkipHeuresticsPlanner]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Deploying vm during HA process, so skipping disable threshold
>>>>>>> check
>>>>>>>
>>>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>>>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>>>>>>> prioritization: []
>>>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>>>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>>>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>>>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>>>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) No suitable hosts found
>>>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) No suitable hosts found under this Cluster: 25
>>>>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Publish
>>>>>>> async job-192986 complete on message bus
>>>>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up
>>>>>>> jobs related to job-192986
>>>>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Update db
>>>>>>> status for job-192986
>>>>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>>>>>> ctx-6e9c753f) The last host of this VM is not UP or is not enabled, host
>>>>>>> status is: Disconnected, host resource state is: Enabled
>>>>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>>>>>> ctx-6e9c753f) Cannot choose the last host to deploy this VM
>>>>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner]
>>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>>>>>> ctx-6e9c753f) Searching resources only under specified Cluster: 35
>>>>>>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Could not find suitable Deployment Destination for this VM
>>>>>>> under any clusters, returning.
>>>>>>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Searching resources only under specified Cluster: 25
>>>>>>> 2017-01-30 08:44:06,163 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>>>>>>> SyncQueueItemVO {id:63629, queueId: 64423, contentType: AsyncJob,
>>>>>>> contentId: 192991, lastProcessMsid: 345049098498, lastprocessNumber: 27,
>>>>>>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
>>>>>>> CET 2017}
>>>>>>> 2017-01-30 08:44:06,169 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Add
>>>>>>> job-192992 into job monitoring
>>>>>>> 2017-01-30 08:44:06,184 DEBUG [c.c.d.SkipHeuresticsPlanner]
>>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>>>>>> ctx-6e9c753f) Deploying vm during HA process, so skipping disable threshold
>>>>>>> check
>>>>>>> 2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) The specified cluster is in avoid set, returning.
>>>>>>> 2017-01-30 08:44:06,186 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Unable to find destination for migrating the vm
>>>>>>> VM[User|i-189-1178-VM]
>>>>>>> 2017-01-30 08:44:06,186 ERROR [c.c.v.VmWorkJobHandlerProxy]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Invocation exception, caused by:
>>>>>>> com.cloud.exception.InsufficientServerCapacityException: Unable to
>>>>>>> find a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
>>>>>>> 2017-01-30 08:44:06,187 INFO  [c.c.v.VmWorkJobHandlerProxy]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>>>>>> ctx-ea83d6ea) Rethrow exception com.cloud.exception.InsufficientServerCapacityException:
>>>>>>> Unable to find a server to migrate to.Scope=interface
>>>>>>> com.cloud.org.Cluster; id=25
>>>>>>> 2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done with
>>>>>>> run of VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin:
>>>>>>> 15393
>>>>>>> 2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Unable to
>>>>>>> complete AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType:
>>>>>>> null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway,
>>>>>>> cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>>>>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>>>>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>>>>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>>>>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>>>>>>> result: null, initMsid: 345049103441, completeMsid: null, lastUpdated:
>>>>>>> null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 2017}, job
>>>>>>> origin:15393
>>>>>>> com.cloud.exception.InsufficientServerCapacityException: Unable to
>>>>>>> find a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
>>>>>>>         at com.cloud.vm.VirtualMachineMan
>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2390)
>>>>>>>         at com.cloud.vm.VirtualMachineMan
>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4520)
>>>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>>>>> Method)
>>>>>>>         at sun.reflect.NativeMethodAccess
>>>>>>> orImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>>         at sun.reflect.DelegatingMethodAc
>>>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>         at com.cloud.vm.VmWorkJobHandlerP
>>>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>>>         at com.cloud.vm.VirtualMachineMan
>>>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>>>         at com.cloud.vm.VmWorkJobDispatch
>>>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>>>>         at org.apache.cloudstack.framewor
>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>>>> erImpl.java:537)
>>>>>>>         at org.apache.cloudstack.managed.
>>>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>>>>         at org.apache.cloudstack.managed.
>>>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>>>> ext.java:56)
>>>>>>>         at org.apache.cloudstack.managed.
>>>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>>>> nagedContext.java:103)
>>>>>>>         at org.apache.cloudstack.managed.
>>>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>>>> agedContext.java:53)
>>>>>>>         at org.apache.cloudstack.managed.
>>>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>>>>         at org.apache.cloudstack.framewor
>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>>>>>         at java.util.concurrent.Executors
>>>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>>>         at java.util.concurrent.ThreadPoo
>>>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>         at java.util.concurrent.ThreadPoo
>>>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>         at java.lang.Thread.run(Thread.java:745)
>>>>>>> 2017-01-30 08:44:06,188 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued job-192991
>>>>>>> 2017-01-30 08:44:06,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up
>>>>>>> jobs joined with job-192986 and disjoin all subjobs created from job- 192986
>>>>>>> 2017-01-30 08:44:06,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Complete
>>>>>>> async job-192989, jobStatus: FAILED, resultCode: 0, result:
>>>>>>> rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZl
>>>>>>> ckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vw
>>>>>>> c0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRD
>>>>>>> YXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5n
>>>>>>> L0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xv
>>>>>>> dWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vy
>>>>>>> cm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2
>>>>>>> YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4a
>>>>>>> OxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNh
>>>>>>> dXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQA
>>>>>>> EkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFu
>>>>>>> Zy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QA
>>>>>>> EExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBz
>>>>>>> ZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFj
>>>>>>> ZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNr
>>>>>>> VHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFy
>>>>>>> aW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-
>>>>>>> AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2Vy
>>>>>>> SW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hl
>>>>>>> c3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EA
>>>>>>> fgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1w
>>>>>>> bHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNx
>>>>>>> AH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5y
>>>>>>> ZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0
>>>>>>> aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhq
>>>>>>> YXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EA
>>>>>>> fgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0
>>>>>>> ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pv
>>>>>>> YnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNs
>>>>>>> b3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRj
>>>>>>> aGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91
>>>>>>> ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1w
>>>>>>> bCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0
>>>>>>> c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNv
>>>>>>> bnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRl
>>>>>>> eHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUu
>>>>>>> Y2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdl
>>>>>>> ZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2Fs
>>>>>>> bHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5j
>>>>>>> b250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxs
>>>>>>> V2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250
>>>>>>> ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2Vk
>>>>>>> LmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-
>>>>>>> ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5j
>>>>>>> b25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRv
>>>>>>> cnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50
>>>>>>> LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0
>>>>>>> ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdU
>>>>>>> aHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0
>>>>>>> AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29y
>>>>>>> a2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAAL
>>>>>>> VGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5t
>>>>>>> b2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEu
>>>>>>> dXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte
>>>>>>> 9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEu
>>>>>>> dXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4
>>>>>>> cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1
>>>>>>> ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMA
>>>>>>> AAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAFdmFs
>>>>>>> dWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-o
>>>>>>> maW2FAIAAHhwAA
>>>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>>>>>> ctx-4c93bc31) Deploy avoids pods: [], clusters: [], hosts: [3]
>>>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>>>>>> ctx-4c93bc31) DeploymentPlanner allocation algorithm:
>>>>>>> com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>>>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>>>>>> ctx-4c93bc31) Trying to allocate a host and storage pools from dc:14,
>>>>>>> pod:17,cluster:17, requested cpu: 4200, requested ram: 4294967296
>>>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>>>>>> ctx-4c93bc31) Is ROOT volume READY (pool already allocated)?: Yes
>>>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>>>>>> ctx-4c93bc31) This VM has last host_id specified, trying to choose the same
>>>>>>> host: 213
>>>>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Publish
>>>>>>> async job-192989 complete on message bus
>>>>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up
>>>>>>> jobs related to job-192989
>>>>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Update db
>>>>>>> status for job-192989
>>>>>>> 2017-01-30 08:44:06,219 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>>>>>> ctx-6e9c753f) Checking resources in Cluster: 35 under Pod: 31
>>>>>>> 2017-01-30 08:44:06,227 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>>>>> FirstFitRoutingAllocator) Looking for hosts in dc: 23  pod:31  cluster:35
>>>>>>> 2017-01-30 08:44:06,229 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Executing
>>>>>>> AsyncJobVO {id:192992, userId: 1, accountId: 1, instanceType: null,
>>>>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>>>>> AAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw, cmdVersion:
>>>>>>> 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>>>>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>>>>>>> null, created: Mon Jan 30 08:44:05 CET 2017}
>>>>>>> 2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Run VM work
>>>>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1729, job origin: 14953
>>>>>>> 2017-01-30 08:44:06,230 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>>>>>>> SyncQueueItemVO {id:63630, queueId: 67180, contentType: AsyncJob,
>>>>>>> contentId: 192990, lastProcessMsid: 345049098498, lastprocessNumber: 61,
>>>>>>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
>>>>>>> CET 2017}
>>>>>>> 2017-01-30 08:44:06,244 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>>>>>> ctx-4c93bc31) The last host of this VM is not UP or is not enabled, host
>>>>>>> status is: Disconnected, host resource state is: Enabled
>>>>>>> 2017-01-30 08:44:06,262 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>>>>>> ctx-4c93bc31) Cannot choose the last host to deploy this VM
>>>>>>> 2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner]
>>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>>>>>> ctx-4c93bc31) Searching resources only under specified Cluster: 17
>>>>>>> 2017-01-30 08:44:06,262 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Add
>>>>>>> job-192991 into job monitoring
>>>>>>> 2017-01-30 08:44:06,246 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Done
>>>>>>> executing com.cloud.vm.VmWorkMigrateAway for job-192986
>>>>>>> 2017-01-30 08:44:06,244 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up
>>>>>>> jobs joined with job-192989 and disjoin all subjobs created from job- 192989
>>>>>>> 2017-01-30 08:44:06,269 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Remove
>>>>>>> job-192986 from job monitoring
>>>>>>> 2017-01-30 08:44:06,279 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done
>>>>>>> executing com.cloud.vm.VmWorkMigrateAway for job-192989
>>>>>>> 2017-01-30 08:44:06,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Executing
>>>>>>> AsyncJobVO {id:192991, userId: 1, accountId: 1, instanceType: null,
>>>>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>>>>> AAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw, cmdVersion:
>>>>>>> 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>>>>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>>>>>>> null, created: Mon Jan 30 08:44:05 CET 2017}
>>>>>>> 2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Run VM work
>>>>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1605, job origin: 14965
>>>>>>> 2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl]
>>>>>>> (AgentManager-Handler-13:null) Ping from 258
>>>>>>> 2017-01-30 08:44:06,319 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>>>>> FirstFitRoutingAllocator) List of hosts in ascending order of number of
>>>>>>> VMs: [224, 19, 21, 15]
>>>>>>> 2017-01-30 08:44:06,322 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>>>>> FirstFitRoutingAllocator) FirstFitAllocator has 0 hosts to check for
>>>>>>> allocation: []
>>>>>>> 2017-01-30 08:44:06,322 INFO  [c.c.a.m.AgentManagerImpl]
>>>>>>> (AgentManager-Handler-13:null) PingMap for agent: 258 will not be updated
>>>>>>> because agent is no longer in the PingMap
>>>>>>> 2017-01-30 08:44:06,326 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Remove
>>>>>>> job-192989 from job monitoring
>>>>>>>
>>>>>>> 2017-01-30 08:54:38,139 INFO [c.c.v.VirtualMachineManagerImpl]
>>>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>>>>>> ctx-052eb4f5) Migration cancelled because state has changed:
>>>>>>> VM[User|i-189-1178-VM]
>>>>>>> 2017-01-30 08:54:38,139 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>>>>>> ctx-052eb4f5) Unable to migrate VM due to: Migration cancelled because
>>>>>>> state has changed: VM[User|i-189-1178-VM]
>>>>>>> 2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl]
>>>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>>>>>> ctx-052eb4f5) VM state transitted from :Running to Stopping with event:
>>>>>>> StopRequestedvm's original host id: 29 new host id: 33 host id before state
>>>>>>> transition: 33
>>>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.DirectAgentAttache]
>>>>>>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Response
>>>>>>> Received:
>>>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>>>>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Processing:  {
>>>>>>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10,
>>>>>>> [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-
>>>>>>> 1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,
>>>>>>> "result":true,"details":"","wait":0}}] }
>>>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>>>>> (RouterMonitor-1:ctx-2e80f405) Seq 19-2355664080091480198: Received:  {
>>>>>>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10, {
>>>>>>> NetworkUsageAnswer } }
>>>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl]
>>>>>>> (RouterMonitor-1:ctx-2e80f405) Details from executing class
>>>>>>> com.cloud.agent.api.NetworkUsageCommand:
>>>>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>>>>>> ctx-052eb4f5) Seq 33-1917970491306410035: Sending  { Cmd , MgmtId:
>>>>>>> 345049098498, via: 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>>>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>>>>> }
>>>>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>>>>>> ctx-052eb4f5) Seq 33-1917970491306410035: Executing:  { Cmd , MgmtId:
>>>>>>> 345049098498, via: 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>>>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>>>>> }
>>>>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.m.DirectAgentAttache]
>>>>>>> (DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035: Executing request
>>>>>>> 2017-01-30 08:54:38,202 DEBUG [c.c.a.m.DirectAgentAttache]
>>>>>>> (DirectAgent-326:ctx-4fb88aa7) Seq 21-1035546439318503505: Executing request
>>>>>>>
>>>>>>> 2017-01-30 09:05:55,628 INFO [c.c.h.HighAvailabilityManagerImpl]
>>>>>>> (HA-Worker-4:ctx-36aae733 work-2049) Processing
>>>>>>> HAWork[2049-Migration-1178-Stopped-Migrating]
>>>>>>> 2017-01-30 09:05:55,633 WARN [o.a.c.f.j.AsyncJobExecutionContext]
>>>>>>> (HA-Worker-4:ctx-36aae733 work-2049) Job is executed without a context,
>>>>>>> setup psudo job for the executing thread
>>>>>>> 2017-01-30 09:05:55,645 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>> (HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045 execution on object
>>>>>>> VmWorkJobQueue.1178
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 31.01.2017 00:42, Rafael Weingärtner wrote:
>>>>>>>
>>>>>>>> This seems to be one of those nasty bugs that are very hard to
>>>>>>>> debug. It
>>>>>>>> seems to be caused by a combination of different factors, such as
>>>>>>>> planetary
>>>>>>>> alignments ;)
>>>>>>>>
>>>>>>>> At class “VirtualMachineManagerImpl” line 2395 (method
>>>>>>>> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAw
>>>>>>>> ay(String,
>>>>>>>> long, DeploymentPlanner)”) is the start of the code flow that
>>>>>>>> triggers the
>>>>>>>> migration. The point here is that, if an error occurs during the
>>>>>>>> migration,
>>>>>>>> the VM will be stopped; For some reason this is executed only for
>>>>>>>> HA VMs.
>>>>>>>>
>>>>>>>> Going a little bit further, we find the point that is triggering
>>>>>>>> the error
>>>>>>>> in the method “com.cloud.vm.VirtualMachineMa
>>>>>>>> nagerImpl.migrate(VMInstanceVO,
>>>>>>>> long, DeployDestination)” at line 1941. This line will call the
>>>>>>>> method
>>>>>>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>>>>>>>> Event,
>>>>>>>> Long, ItWorkVO, Step)” to change the VM state to migrating/prepare
>>>>>>>> migration/or something like that (I am guessing here), but for some
>>>>>>>> reason,
>>>>>>>> the state of the VM is not the one expected and this is causing a
>>>>>>>> “ConcurrentOperationException” to be thrown. The problem is that the
>>>>>>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>>>>>>>> Event,
>>>>>>>> Long, ItWorkVO, Step)” does not seem to provide much useful log
>>>>>>>> entries.
>>>>>>>>
>>>>>>>> Just to enable us to discuss the problem a little further, could
>>>>>>>> you tell
>>>>>>>> us a little bit about your environment setting? How many ACS MSs do
>>>>>>>> you
>>>>>>>> have? What is the VM workload being migrated (I mean, the number of
>>>>>>>> VMs and
>>>>>>>> computing resources)? Do the other servers have enough computing
>>>>>>>> resources
>>>>>>>> to support these VMs? Could you also check the XenServer logs, and
>>>>>>>> check
>>>>>>>> what happened on both, source and destination hosts when the
>>>>>>>> migration for
>>>>>>>> VM “i-638-1735-VM” was triggered?
>>>>>>>>
>>>>>>>> Do you have the knowledge/skills to use an IDE (e.g. Eclipse) to
>>>>>>>> connect to
>>>>>>>> a running JVM and debug a code? Because for me it will not be
>>>>>>>> possible to
>>>>>>>> go further without looking at the variables in a running
>>>>>>>> environment.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
>>>>>>>> <fr...@everyware.ch>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Dear Rafael
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sorry for the late answer (was sick a couple of days).
>>>>>>>>>
>>>>>>>>> I confirm we are migrating inside the same xenserver pool (within
>>>>>>>>> a CS
>>>>>>>>> Cluster).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best Regards
>>>>>>>>> Francois
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 26.01.2017 04:35, Rafael Weingärtner wrote:
>>>>>>>>>
>>>>>>>>> Found the code that is executed.
>>>>>>>>>> I need you to confirm if you are migrating within a cluster or
>>>>>>>>>> across
>>>>>>>>>> clusters?
>>>>>>>>>>
>>>>>>>>>> On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
>>>>>>>>>> <ra...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> I do not have access to ACS code right now, but I suggest starting
>>>>>>>>>>
>>>>>>>>>>> debugging here:
>>>>>>>>>>>
>>>>>>>>>>> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
>>>>>>>>>>>
>>>>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>>>>>>> ctx-175d37df)
>>>>>>>>>>>> VM state transitted from :Running to Stopping with event:
>>>>>>>>>>>> StopRequestedvm's
>>>>>>>>>>>> original host id: 224 new host id: 15 host id before state
>>>>>>>>>>>> transition:
>>>>>>>>>>>> 15
>>>>>>>>>>>>
>>>>>>>>>>>> I would try to understand first why ACS requested the
>>>>>>>>>>>> hypervisor to stop
>>>>>>>>>>>>
>>>>>>>>>>> the VM. I mean the following, check the source code for
>>>>>>>>>>> conditions that
>>>>>>>>>>> would make ACS request the shutdown of VMs.
>>>>>>>>>>> BTW: are you migrating within a cluster or across clusters?
>>>>>>>>>>>
>>>>>>>>>>> I will be able to analyze this further only after work hours, so
>>>>>>>>>>> if you
>>>>>>>>>>> find anything, keep me posted.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
>>>>>>>>>>> <fr...@everyware.ch>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello Rafael
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I think the important log lines are these:
>>>>>>>>>>>>
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because
>>>>>>>>>>>> state has
>>>>>>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>>>>>>> Migration
>>>>>>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from
>>>>>>>>>>>> :Running to
>>>>>>>>>>>> Stopping with event: StopRequestedvm's original host id: 224
>>>>>>>>>>>> new host
>>>>>>>>>>>> id:
>>>>>>>>>>>> 15 host id before state transition: 15
>>>>>>>>>>>>
>>>>>>>>>>>> The really disturbing thing is that CS stops the VM again if we
>>>>>>>>>>>> start it
>>>>>>>>>>>> manually, apparently because the migrating job is still running.
>>>>>>>>>>>>
>>>>>>>>>>>> Note that the VM HA Flag is enabled.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Best Regards.
>>>>>>>>>>>>
>>>>>>>>>>>> Francois
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> more details:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>>>>>>>>>>>> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>>>> 86f5e
>>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
>>>>>>>>>>>> VM[User|i-638-1736-VM]
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>>>>>>> job-14953/job-191351) Done with run of VM work job:
>>>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>>>>>>> job-14953/job-191351) Done executing
>>>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for
>>>>>>>>>>>> job-191351
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8b
>>>>>>>>>>>> acc64
>>>>>>>>>>>> job-14953/job-191351) Remove job-191351 from job monitoring
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>>>>>>>>>>>> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c
>>>>>>>>>>>> work-1989)
>>>>>>>>>>>> Completed HAWork[1989-Migration-1411-Running-Migrating]
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>>>>>>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>>>> 86f5e
>>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375:
>>>>>>>>>>>> Forwarding
>>>>>>>>>>>> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
>>>>>>>>>>>> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
>>>>>>>>>>>> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>>>>>>>>>> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>>>>>>>>>> ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>>>>>>>>>> 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>>>>>>>>>> (64-bit)","platformEmulator":"Other install
>>>>>>>>>>>> media","bootArgs":"","enable
>>>>>>>>>>>> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>>>>>>>>>> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>>>>>>>>>> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>>>>>>>>>> rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>>>>>>>>>> age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio"
>>>>>>>>>>>> :"4.0","
>>>>>>>>>>>> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>>>>>>>>>> -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>>>>>>>>>> stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>>>>>>>>>> 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>>>>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8
>>>>>>>>>>>> 751b89-00f
>>>>>>>>>>>> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>>>>>>>>>> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>>>>>>>>>> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>>>>>>>>>> 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>>>>>>>>>> ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"
>>>>>>>>>>>> path":"d3
>>>>>>>>>>>> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>>>>>>>>>> :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>>>>>>>>>> gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>>>>>>>>>> XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>>>>>>>>>> -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>>>>>>>>>> storagePort":"0",
>>>>>>>>>>>> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>>>>>>>>>> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>>>>>>>>>> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>>>>>>>>>> ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>>>>>>>>>> ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","
>>>>>>>>>>>> id":64,"
>>>>>>>>>>>> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>>>>>>>>>> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>>>>>>>>>> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>>>>>>>>>> UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>>>>>>>>>> DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a
>>>>>>>>>>>> 32b-ee3e2
>>>>>>>>>>>> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>>>>>>>>>> ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>>>>>>>>>> deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>>>>>>>>>> path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>>>>>>>>>> K","_details":{"managed":"false","storagePort":"0","stor
>>>>>>>>>>>> ageHost":"localhos
>>>>>>>>>>>> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>>>>>>>>>> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>>>>>>>>>> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>>>>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>>>>>>>>>> :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"
>>>>>>>>>>>> PreSetup","
>>>>>>>>>>>> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>>>>>>>>>> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>>>>>>>>>> -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>>>>>>>>>>
>>>>>>>>>>>> e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>>>>>>>>>> ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>>>>>>>>>> ,"volumeId":86
>>>>>>>>>>>> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>>>>>>>>>> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>>>>>>>>>> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>>>>>>>>>> 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>>>>>>>>>> false","storagePort":"0","storageHost":"localhost","volumeSi
>>>>>>>>>>>> ze":"42949
>>>>>>>>>>>> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>>>>>>>>>> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>>>>>>>>>> id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>>>>>>>>>> ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS
>>>>>>>>>>>> 4.5
>>>>>>>>>>>> (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>>>>>>>>>> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>>>>>>>>>> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>>>>>>>>>> d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>>>>>>>>>> a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>>>>>>>>>> gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>>>>>>>>>> .8.8","dns2":"8.8
>>>>>>>>>>>> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>>>>>>>>>> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>>>>>>>>>> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] }
>>>>>>>>>>>> to
>>>>>>>>>>>> 345049098498
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375:
>>>>>>>>>>>> Processing:
>>>>>>>>>>>> {
>>>>>>>>>>>> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
>>>>>>>>>>>> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>>>>>>>>>> true,"wait":0}}]
>>>>>>>>>>>> }
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>>>>> [c.c.a.m.AgentAttache]
>>>>>>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No
>>>>>>>>>>>> more
>>>>>>>>>>>> commands
>>>>>>>>>>>> found
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>>>>>>> ctx-175d37df)
>>>>>>>>>>>> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId:
>>>>>>>>>>>> 345049103441,
>>>>>>>>>>>> via:
>>>>>>>>>>>> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>>>>>>>>>>>>
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because
>>>>>>>>>>>> state has
>>>>>>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>>>>>>> Migration
>>>>>>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from
>>>>>>>>>>>> :Running to
>>>>>>>>>>>> Stopping with event: StopRequestedvm's original host id: 224
>>>>>>>>>>>> new host
>>>>>>>>>>>> id:
>>>>>>>>>>>> 15 host id before state transition: 15
>>>>>>>>>>>>
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>>>>>>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>>>> 86f5e
>>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643:
>>>>>>>>>>>> Forwarding
>>>>>>>>>>>> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
>>>>>>>>>>>> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>>>>>>>>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>>>>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>>>>>>>>>> 638-1736-VM","wait":0}}]
>>>>>>>>>>>> } to 345049098498
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>>>>>>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>>>>>>> 254-800498:
>>>>>>>>>>>> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver:
>>>>>>>>>>>> v1,
>>>>>>>>>>>> Flags:
>>>>>>>>>>>> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>>>>>>>>>> proxyVmId":1530,"_loadInfo":"{\n
>>>>>>>>>>>> \"connections\": []\n}","wait":0}}] }
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>>>>>>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>>>>>>> 254-800498:
>>>>>>>>>>>> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via:
>>>>>>>>>>>> 254, Ver:
>>>>>>>>>>>> v1,
>>>>>>>>>>>> Flags: 100010, [{"com.cloud.agent.api.AgentCo
>>>>>>>>>>>> ntrolAnswer":{"result":true,"wait":0}}] }
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG
>>>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>>>> (AgentManager-Handler-11:null) Seq 15-6889944479923437632:
>>>>>>>>>>>> Processing:
>>>>>>>>>>>> {
>>>>>>>>>>>> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
>>>>>>>>>>>> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>>>>>>>>>> e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>>>>>>>>>> ":true,"details":"Stop
>>>>>>>>>>>> VM i-638-1735-VM Succeed","wait":0}}] }
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG
>>>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379
>>>>>>>>>>>> ctx-bbabfeae)
>>>>>>>>>>>> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId:
>>>>>>>>>>>> 345049103441,
>>>>>>>>>>>> via:
>>>>>>>>>>>> 15, Ver: v1, Flags: 10, { StopAnswer } }
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is
>>>>>>>>>>>> stopped on
>>>>>>>>>>>> the
>>>>>>>>>>>> host. Proceeding to release resource held.
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>>>>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>>>>> supported
>>>>>>>>>>>> in the network id=519
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>>>>>>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>>> db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Changing active number of
>>>>>>>>>>>> nics for
>>>>>>>>>>>> network id=519 on -1
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>>> db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to
>>>>>>>>>>>> release
>>>>>>>>>>>> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>>>>>>>>>> 92.168.1.11-null
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released network
>>>>>>>>>>>> resources
>>>>>>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released storage
>>>>>>>>>>>> resources
>>>>>>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM state transitted from
>>>>>>>>>>>> :Stopping to
>>>>>>>>>>>> Stopped with event: OperationSucceededvm's original host id: 15
>>>>>>>>>>>> new host
>>>>>>>>>>>> id: null host id before state transition: 15
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU:
>>>>>>>>>>>> 82976 and
>>>>>>>>>>>> CPU
>>>>>>>>>>>> after applying overprovisioning: 331904
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM:
>>>>>>>>>>>> 261961927808
>>>>>>>>>>>> and RAM after applying overprovisioning: 261961924608
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15,
>>>>>>>>>>>> old used:
>>>>>>>>>>>> 71300,reserved: 2100, actual total: 82976, total with
>>>>>>>>>>>> overprovisioning:
>>>>>>>>>>>> 331904; new used: 67100,reserved:6300; movedfromreserved:
>>>>>>>>>>>> false,moveToReserveredtrue
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15,
>>>>>>>>>>>> old used:
>>>>>>>>>>>> 98784247808,reserved: 1073741824, total: 261961924608; new used:
>>>>>>>>>>>> 94489280512,reserved:5368709120;
>>>>>>>>>>>> movedfromreserved:
>>>>>>>>>>>>
>>>>>>>>>>>> false,moveToReserveredtrue
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>>>>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>>>>> supported
>>>>>>>>>>>> in the network id=519
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>>>>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>>>>> supported
>>>>>>>>>>>> in the network id=519
>>>>>>>>>>>>
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>>>>>>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>>> db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused
>>>>>>>>>>>> by:
>>>>>>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to
>>>>>>>>>>>> migrate
>>>>>>>>>>>> VM[User|i-638-1735-VM]
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>>>>>>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>>> db528
>>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
>>>>>>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to
>>>>>>>>>>>> migrate
>>>>>>>>>>>> VM[User|i-638-1735-VM]
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>>>>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379) Done with run of VM work job:
>>>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>>>>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379,
>>>>>>>>>>>> userId:
>>>>>>>>>>>> 1,
>>>>>>>>>>>> accountId: 1, instanceType: null, instanceId: null, cmd:
>>>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>>>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>>>>>>>>>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>>>>>>>>>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>>>>>>>>>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>>>>>>>>>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>>>>>>>>>> FuYWdlckltcGwAAAAAAAAA4A,
>>>>>>>>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0,
>>>>>>>>>>>> resultCode: 0,
>>>>>>>>>>>> result: null, initMsid: 345049098498, completeMsid: null,
>>>>>>>>>>>> lastUpdated:
>>>>>>>>>>>> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017},
>>>>>>>>>>>> job
>>>>>>>>>>>> origin:13733
>>>>>>>>>>>> ewcstack-man02-prod: com.cloud.utils.exception.Clou
>>>>>>>>>>>> dRuntimeException:
>>>>>>>>>>>> Unable to migrate VM[User|i-638-1735-VM]
>>>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.ja
>>>>>>>>>>>> va:2405)
>>>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.ja
>>>>>>>>>>>> va:4517)
>>>>>>>>>>>> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAcc
>>>>>>>>>>>> essor402.invoke(Unknown
>>>>>>>>>>>> Source)
>>>>>>>>>>>> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
>>>>>>>>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>>>>> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
>>>>>>>>>>>> e(Method.java:606)
>>>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
>>>>>>>>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
>>>>>>>>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>>>>>>>>> erImpl.java:537)
>>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.
>>>>>>>>>>>> java:49)
>>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>>>>>>>>> ext.java:56)
>>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>>>>>>>>> nagedContext.java:103)
>>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>>>>>>>>> agedContext.java:53)
>>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.ja
>>>>>>>>>>>> va:46)
>>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.ja
>>>>>>>>>>>> va:494)
>>>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.Executors
>>>>>>>>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
>>>>>>>>>>>> k.run(FutureTask.java:262)
>>>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>>>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>>>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>>>>>> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.ja
>>>>>>>>>>>> va:745)
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379) Complete async job-191379, jobStatus:
>>>>>>>>>>>> FAILED,
>>>>>>>>>>>> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>>>>>>>>>> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>>>>>>>>>> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>>>>>>>>>> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>>>>>>>>>> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>>>>>>>>>> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>>>>>>>>>> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>>>>>>>>>> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>>>>>>>>>> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>>>>>>>>>> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>>>>>>>>>> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>>>>>>>>>> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>>>>>>>>>> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>>>>>>>>>> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>>>>>>>>>> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>>>>>>>>>> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>>>>>>>>>> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>>>>>>>>>> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>>>>>>>>>> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>>>>>>>>>> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>>>>>>>>>> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>>>>>>>>>> ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>>>>>>>>>> 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>>>>>>>>>> AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>>>>>>>>>> cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>>>>>>>>>> AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>>>>>>>>>> YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>>>>>>>>>> ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>>>>>>>>>> YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>>>>>>>>>> dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>>>>>>>>>> dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>>>>>>>>>> Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>>>>>>>>>> dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>>>>>>>>>> cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>>>>>>>>>> TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>>>>>>>>>> ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>>>>>>>>>> LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>>>>>>>>>> dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>>>>>>>>>> ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>>>>>>>>>> JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>>>>>>>>>> AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>>>>>>>>>> ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>>>>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>>>>>>>>>> YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>>>>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>>>>>>>>>> AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>>>>>>>>>> JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>>>>>>>>>> jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>>>>>>>>>> bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>>>>>>>>>> dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>>>>>>>>>> x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379) Publish async job-191379 complete on
>>>>>>>>>>>> message bus
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379) Wake up jobs related to job-191379
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379) Update db status for job-191379
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379) Wake up jobs joined with job-191379 and
>>>>>>>>>>>> disjoin
>>>>>>>>>>>> all
>>>>>>>>>>>> subjobs created from job- 191379
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>>> job-13733/job-191379) Done executing
>>>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for
>>>>>>>>>>>> job-191379
>>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>>> db528
>>>>>>>>>>>> job-13733/job-191379) Remove job-191379 from job monitoring
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Did you check the ACS log files looking for some exceptions
>>>>>>>>>>>> when the
>>>>>>>>>>>>
>>>>>>>>>>>>> migrations for this specific VM started?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
>>>>>>>>>>>>> <fr...@everyware.ch>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Dear CS contributors
>>>>>>>>>>>>>
>>>>>>>>>>>>> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared
>>>>>>>>>>>>>> primary
>>>>>>>>>>>>>> storage.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When we put a host in maintenance, cs evacuates all VM's on
>>>>>>>>>>>>>> other
>>>>>>>>>>>>>> hosts.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But it happens regularly that some VM's are stopped instead
>>>>>>>>>>>>>> of live
>>>>>>>>>>>>>> migrated.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If we restart that stopped VM's, CS will then stop them again
>>>>>>>>>>>>>> later.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We need to start them several times until they stays up.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is it a known issue? Is it fixed on CS 4.9.2 ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Migrating manually all VM's is working but impracticable
>>>>>>>>>>>>>> because too
>>>>>>>>>>>>>> much
>>>>>>>>>>>>>> time consuming.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Many thanks in advance for your help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best Regards
>>>>>>>>>>>>>> Francois
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> EveryWare AG
>>>>>>>>>>>>>> François Scheurer
>>>>>>>>>>>>>> Senior Systems Engineer
>>>>>>>>>>>>>> Zurlindenstrasse 52a
>>>>>>>>>>>>>> CH-8003 Zürich
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> tel: +41 44 466 60 00
>>>>>>>>>>>>>> fax: +41 44 466 60 10
>>>>>>>>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>>>>>>>>> web: http://www.everyware.ch
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> EveryWare AG
>>>>>>>>>>>> François Scheurer
>>>>>>>>>>>> Senior Systems Engineer
>>>>>>>>>>>> Zurlindenstrasse 52a
>>>>>>>>>>>> CH-8003 Zürich
>>>>>>>>>>>>
>>>>>>>>>>>> tel: +41 44 466 60 00
>>>>>>>>>>>> fax: +41 44 466 60 10
>>>>>>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>>>>>>> web: http://www.everyware.ch
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>> Rafael Weingärtner
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> EveryWare AG
>>>>>>>>> François Scheurer
>>>>>>>>> Senior Systems Engineer
>>>>>>>>> Zurlindenstrasse 52a
>>>>>>>>> CH-8003 Zürich
>>>>>>>>>
>>>>>>>>> tel: +41 44 466 60 00
>>>>>>>>> fax: +41 44 466 60 10
>>>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>>>> web: http://www.everyware.ch
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>> EveryWare AG
>>>>>>> François Scheurer
>>>>>>> Senior Systems Engineer
>>>>>>> Zurlindenstrasse 52a
>>>>>>> CH-8003 Zürich
>>>>>>>
>>>>>>> tel: +41 44 466 60 00
>>>>>>> fax: +41 44 466 60 10
>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>> web: http://www.everyware.ch
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Rafael Weingärtner
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Rafael Weingärtner
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>> EveryWare AG
>>>>> François Scheurer
>>>>> Senior Systems Engineer
>>>>> Zurlindenstrasse 52a
>>>>> CH-8003 Zürich
>>>>>
>>>>> tel: +41 44 466 60 00
>>>>> fax: +41 44 466 60 10
>>>>> mail: francois.scheurer@everyware.ch
>>>>> web: http://www.everyware.ch
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Rafael Weingärtner
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> EveryWare AG
>>>> François Scheurer
>>>> Senior Systems Engineer
>>>> Zurlindenstrasse 52a
>>>> CH-8003 Zürich
>>>>
>>>> tel: +41 44 466 60 00
>>>> fax: +41 44 466 60 10
>>>> mail: francois.scheurer@everyware.ch
>>>> web: http://www.everyware.ch
>>>>
>>>>
>>>
>>>
>>> --
>>> Rafael Weingärtner
>>>
>>>
>>> --
>>>
>>>
>>> EveryWare AG
>>> François Scheurer
>>> Senior Systems Engineer
>>> Zurlindenstrasse 52a
>>> CH-8003 Zürich
>>>
>>> tel: +41 44 466 60 00
>>> fax: +41 44 466 60 10
>>> mail: francois.scheurer@everyware.ch
>>> web: http://www.everyware.ch
>>>
>>>
>>
>>
>> --
>> Rafael Weingärtner
>>
>>
>> --
>>
>>
>> EveryWare AG
>> François Scheurer
>> Senior Systems Engineer
>> Zurlindenstrasse 52a
>> CH-8003 Zürich
>>
>> tel: +41 44 466 60 00
>> fax: +41 44 466 60 10
>> mail: francois.scheurer@everyware.ch
>> web: http://www.everyware.ch
>>
>>
>
>
> --
> Rafael Weingärtner
>
>
> --
>
>
> EveryWare AG
> François Scheurer
> Senior Systems Engineer
> Zurlindenstrasse 52a
> CH-8003 Zürich
>
> tel: +41 44 466 60 00
> fax: +41 44 466 60 10
> mail: francois.scheurer@everyware.ch
> web: http://www.everyware.ch
>
>


-- 
Rafael Weingärtner

Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Francois Scheurer <fr...@everyware.ch>.
Dear Rafael


Many thanks for having checked the potential impacts and given us your 
"blessing" before proceeding.
We did the maintenance this morning and no VM's were stopped.
Both tables op_it_work and op_ha_work did not get any new rows after the 
MS restart (as they did before).
So the problem seems to be fixed.
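
As a quick sanity check (just a sketch, reusing the table and column names from 
the queries quoted below), something like this can confirm that nothing new piles 
up after a restart:

   select count(*) from op_it_work where step!="Done";
   select count(*) from op_ha_work where type="Migration" and taken is null;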

It is still possible that the issue will appear again during host maintenance, but 
now we have a workaround (DB cleanup).
We also updated our test env. to CS 4.9.2, with all test cases succeeding, 
and we are planning the upgrade on Prod in the short term.
If the bug still arises in 4.9.2, we will open a Jira issue.

Thank you again for your help, Rafael.


Best Regards
Francois Scheurer



On 08.02.2017 16:59, Rafael Weingärtner wrote:
> Your update seems reasonable. The methods that use the “step” data are 
> the following:
>
>   * com.cloud.vm.VirtualMachineManagerImpl.checkWorkItems(VMInstanceVO,
>     State)
>   * com.cloud.vm.VirtualMachineManagerImpl.cleanup(VirtualMachineGuru,
>     VirtualMachineProfileItWorkVO, Event, boolean)
>   * com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(String,
>     Map<Param, Object>, DeploymentPlan, DeploymentPlanner)
>
>
> Basically, all of them would stop the flow of execution if the step is 
> defined as “done”. What I would do before running your “update” query 
> is checking the current steps of the VMs that were having problems, 
> just to make sure you are not cutting some processes that may be 
> important. After that, I would probably do the same as you and update 
> them to done. I am assuming you already removed the HA from VMs, right?
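>
> Something along these lines could serve as that pre-check; it is only a sketch,
> and the instance ids are simply the problematic ones quoted further down in this
> thread, so adjust the list as needed:
>
>   select * from op_it_work
>   where step!="Done" and instance_id in (1177, 1178, 1240, 1690, 1736);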
>
>
> On Wed, Feb 8, 2017 at 5:32 AM, Francois Scheurer 
> <francois.scheurer@everyware.ch> wrote:
>
>     Dear Rafael
>
>
>     You are right about cloudmonkey with CS 4.5.2 being able to update
>     the ha_enable flag on VM's. Thank you for that.
>
>     Back to the central question, do you think it is safe to run this
>     update while all MS's are stopped?:
>       update op_it_work set step="Done" where step!="Done" and
>     instance_id=<X>;
>
>     Where <X> points to an expunged VM or to the specific VM's
>     that are intermittently being stopped.
>
>     If you are not sure about it, maybe you know somebody else who
>     can help us with this?
>
>
>     Best Regards
>
>     Francois Scheurer
>
>
>
>
>
>
>     On 07.02.2017 17:47, Rafael Weingärtner wrote:
>>     Got the idea of your workaround now.
>>
>>     I have not paid much attention to the table data you posted
>>     (sorry for that). Well, it seems you have it all taken care of.
>>
>>     I have a suggestion to make; instead of changing data directly on
>>     the database, what about using CloudMonkey to update the HA flag
>>     to false on VMs? Currently, it is the UI that is blocking such
>>     customization, but according to the API, you should be able to do
>>     that.
>>     https://cloudstack.apache.org/api/apidocs-4.5/root_admin/updateVirtualMachine.html
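>>
>>     As a rough sketch, that could look like this from cloudmonkey (the id is a
>>     placeholder for the VM uuid): update virtualmachine id=<vm-uuid> haenable=false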
>>
>>     On Tue, Feb 7, 2017 at 11:25 AM, Francois Scheurer
>>     <francois.scheurer@everyware.ch> wrote:
>>
>>         Dear Rafael
>>
>>
>>         We changed it on the VM to avoid the required stop and start.
>>         Changing it on the used Offering does not affect
>>         vm_instance.ha_enabled.
>>         We could create a new Offering without HA, but the VM needs to
>>         be stopped to let us apply it.
>>
>>         We did not disable the HA on these 4 VM's to solve the
>>         problem, but just to minimize the impact of the problem (now
>>         only i-189-1177-VM should be stopped intermittently) if the
>>         cleaning of op_it_work is not fixing the issue.
>>         Of course we could filter the update query with “instance
>>         id”, but if you look at the table output, all instance_id values are
>>         related to the 5 problematic VM's.
>>         The only exceptions are the first 3 rows that are related to
>>         VM's that have not existed since July 2015 ...
>>         That is why I thought it may be safe to do.
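>>
>>         For what it is worth, a sketch of a query that would list only those
>>         stale rows (the ones pointing at long-removed VM's), similar to the
>>         join quoted further down in this thread, could be:
>>
>>           select op_it_work.instance_id, op_it_work.step,
>>           vm_instance.instance_name, vm_instance.removed
>>           from op_it_work join vm_instance
>>           on op_it_work.instance_id=vm_instance.id
>>           where op_it_work.step!="Done" and vm_instance.removed is not null;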
>>
>>
>>         Best Regards
>>
>>         Francois Scheurer
>>
>>
>>
>>
>>
>>
>>
>>
>>         On 07.02.2017 16:27, Rafael Weingärtner wrote:
>>>         So, you changed the flags directly on "vm_instance". Why did
>>>         not you change the service offering of these VMs using ACS
>>>         UI or API? For me, it feels that the best approach would be
>>>         to change the service offering of the affected VMs (the HA
>>>         flag is first configured at the service offering). You could
>>>         assign a service offering without HA; I mean, it would be a
>>>         matter of creating a similar service offering (same
>>>         configurations to the one they are using right now), but
>>>         without HA.
>>>
>>>         I am assuming that the change directly on “vm_instance” did
>>>         not solve your problem, right? That is why you want to
>>>         change some data at “op_it_work”. I did a quick look at
>>>         “op_it_work”, and it seems that this table may be used for
>>>         more things than HA. Here is a tip for complex production
>>>         systems: never run update SQLs in production environments
>>>         without a very, very restrictive and well-thought-out “where”
>>>         clause. In my opinion this is not restrictive enough (update
>>>         op_it_work set step="Done" where step!="Done"). I would
>>>         advise you not to go further and change configurations
>>>         directly on database (unless you have a dev team that is
>>>         directly on the database (unless you have a dev team that is
>>>         capable of reading, debugging, and working with ACS source
>>>         how many lines of shell, python and other languages,
>>>         software and systems that may be used; it is pretty hard
>>>         sometimes to get a clear picture of how a change in the
>>>         database will reverberate in the whole system without
>>>         knowing in details the installation configurations and
>>>         underlying structures (networking, hypervisors, storage
>>>         systems and other pieces). However, if you want to proceed
>>>         with that, I would suggest you adding some extra filters to
>>>         that “where” clause. What about adding the “instance id” there?
>>>
>>>         PS: BTW, have you already heard about the Cloudstack Fork from
>>>         Schuberg Philis named Cosmic?
>>>         https://github.com/MissionCriticalCloud/cosmic
>>>         Yes, I have. They were part of the active ACS community back
>>>         then. I was already here when they created the fork. Still,
>>>         I kept with ACS.
>>>
>>>
>>>         On Tue, Feb 7, 2017 at 3:54 AM, Francois Scheurer
>>>         <francois.scheurer@everyware.ch> wrote:
>>>
>>>             Dear Rafael
>>>
>>>
>>>             We disabled temporarily the HA Flag on these VM's by
>>>             updating the DB directly: update vm_instance set
>>>             ha_enabled=0 where removed is null and id=1178 or
>>>             id=1240 or id=1690 or id=1736;
>>>
>>>             We kept the test VM i-189-1177-VM with HA Flag.
>>>
>>>             Restarting the MS + cleaning the op_it_work table will
>>>             hopefully fix the issue. And if the issue persists, only
>>>             i-189-1177-VM should be affected, which mitigates the
>>>             impact.
>>>
>>>             On the other hand, if everything runs fine, we would
>>>             thereafter re-enable the HA Flag on the problematic VM's.
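>>>
>>>             For example, re-enabling it could look like this (only a
>>>             sketch, mirroring the update above but with an explicit id
>>>             list to avoid any and/or precedence surprises):
>>>
>>>                 update vm_instance
>>>                 set ha_enabled = 1
>>>                 where removed is null
>>>                   and id in (1178, 1240, 1690, 1736);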
>>>
>>>
>>>             Do you think it is safe to run this update?: update
>>>             op_it_work set step="Done" where step!="Done";
>>>
>>>
>>>             Best Regards
>>>
>>>             Francois Scheurer
>>>
>>>
>>>
>>>             PS: BTW, have you already heard about the Cloudstack fork
>>>             from Schuberg Philis named Cosmic?
>>>             https://github.com/MissionCriticalCloud/cosmic
>>>
>>>
>>>
>>>
>>>
>>>             On 06.02.2017 16:41, Rafael Weingärtner wrote:
>>>>
>>>>             I totally understand your position of not being able to
>>>>             debug the ACS in production ;)
>>>>
>>>>
>>>>             There is one bit I did not quite understand. If you
>>>>             have disabled the HA for VMs, why do you still need to
>>>>             change data in the database? How did you disable the
>>>>             HA of the VMs?
>>>>
>>>>
>>>>             On Mon, Feb 6, 2017 at 5:04 AM, Francois Scheurer
>>>>             <francois.scheurer@everyware.ch> wrote:
>>>>
>>>>                 Dear Rafael / Dear CS contributors
>>>>
>>>>
>>>>                 I tried the following:
>>>>
>>>>                 1.     stop all MS
>>>>                 2.     backup DB
>>>>                 3.     select * from op_ha_work where
>>>>                 type="Migration" and mgmt_server_id is null and
>>>>                 taken is null;
>>>>                 4.     delete from op_ha_work where
>>>>                 type="Migration" and mgmt_server_id is null and
>>>>                 taken is null;
>>>>                 5.     start all MS
>>>>
>>>>                 But actually, after the MS stopped, the op_ha_work
>>>>                 table was automatically cleaned, making my steps
>>>>                 useless.
>>>>                 After the MS started, the op_ha_work table was filled
>>>>                 again and the related VM's were stopped several times
>>>>                 (3-7 times), requiring close attention to manually
>>>>                 restart them after each stop.
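>>>>
>>>>                 If I retry that cleanup later, I would probably keep a
>>>>                 copy of the rows before deleting them, along these
>>>>                 lines (an untested sketch; the backup table name is
>>>>                 just an example):
>>>>
>>>>                     create table op_ha_work_backup as
>>>>                     select * from op_ha_work
>>>>                     where type="Migration" and mgmt_server_id is null
>>>>                     and taken is null;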
>>>>
>>>>                 I later found another table that could be the cause
>>>>                 of the issue:
>>>>
>>>>                 mysql> select op_it_work.*, vm_instance.instance_name, vm_instance.removed
>>>>                        from op_it_work join vm_instance
>>>>                        where op_it_work.step != "Done"
>>>>                        and op_it_work.instance_id = vm_instance.id
>>>>                        order by instance_id;
>>>>                 | id | mgmt_server_id | created_at | thread | type | vm_type | step | updated_at | instance_id | resource_type | resource_id | instance_name | removed |
>>>>                 | be4d0dd2-7d4f-47d3-ba49-623d2f115bbf | 345049098498 | 1436187965 | consoleproxy-1 | Starting | ConsoleProxy | Release | 1436188265 | 402 | NULL | 0 | v-402-VM | 2015-07-06 13:23:22 |
>>>>                 | fb4d1844-eb92-478c-9f50-d666790f43af | 345049098498 | 1436187965 | secstorage-1 | Starting | SecondaryStorageVm | Release | 1436188265 | 687 | NULL | 0 | s-687-VM | 2015-07-06 13:11:34 |
>>>>                 | 7dfeb25a-a705-42e8-907c-79d1e69697ed | 345049098498 | 1436964548 | Work-Job-Executor-165 | Migrating | User | Migrating | 1436964548 | 776 | Host | 29 | i-51-776-VM | 2015-07-15 12:55:38 |
>>>>                 | 2018ac61-b11c-4101-8ef5-07a39509f2bd | 345049098498 | 1486020173 | Work-Job-Executor-19 | Migrating | User | Prepare | 1486020173 | 1177 | Host | 25 | i-189-1177-VM | NULL |
>>>>                 | 25430cd5-4346-4643-9c7e-7b39b8a876a2 | 345049098122 | 1486019693 | Work-Job-Executor-6 | Migrating | User | Prepare | 1486019693 | 1177 | Host | 25 | i-189-1177-VM | NULL |
>>>>                 | 3276d52e-cbd9-44c5-8ab6-c70784337ff9 | 345049098498 | 1486020953 | Work-Job-Executor-33 | Migrating | User | Prepare | 1486020953 | 1177 | Host | 33 | i-189-1177-VM | NULL |
>>>>                 | 6462da33-83f2-4575-af38-0e52d2a48488 | 345049098122 | 1486021433 | Work-Job-Executor-12 | Migrating | User | Prepare | 1486021433 | 1177 | Host | 25 | i-189-1177-VM | NULL |
>>>>                 | 688e16c8-c39f-4e88-9e8f-36c0282f3df1 | 345049098122 | 1486020773 | Work-Job-Executor-11 | Migrating | User | Prepare | 1486020773 | 1177 | Host | 25 | i-189-1177-VM | NULL |
>>>>                 | d8a18e9d-ac6f-4273-923e-e401533e2cf7 | 345049098498 | 1486020353 | Work-Job-Executor-25 | Migrating | User | Prepare | 1486020353 | 1177 | Host | 25 | i-189-1177-VM | NULL |
>>>>                 | 1222ece8-399d-43d8-8ff8-d24bdebca2ed | 345049103441 | 1486018924 | Work-Job-Executor-2 | Migrating | User | Prepare | 1486018924 | 1178 | Host | 29 | i-189-1178-VM | NULL |
>>>>                 | 2ac5ca4a-c20d-433a-8379-40e0de3b3ba8 | 345049098498 | 1486021613 | Work-Job-Executor-41 | Migrating | User | Prepare | 1486021613 | 1178 | Host | 29 | i-189-1178-VM | NULL |
>>>>                 | 3e3b95a0-2af4-4268-bfdd-f3399a85bcaa | 345049103441 | 1486019584 | Work-Job-Executor-5 | Migrating | User | Prepare | 1486019584 | 1178 | Host | 29 | i-189-1178-VM | NULL |
>>>>                 | 53745a2d-056c-4a1b-9267-eb97f077a6c8 | 345049103441 | 1486020244 | Work-Job-Executor-6 | Migrating | User | Prepare | 1486020244 | 1178 | Host | 29 | i-189-1178-VM | NULL |
>>>>                 | e3555b49-7c18-4272-b7f4-ba9351b3bbd1 | 345049098498 | 1486020893 | Work-Job-Executor-32 | Migrating | User | Prepare | 1486020893 | 1178 | Host | 29 | i-189-1178-VM | NULL |
>>>>                 | 1dcd5204-4f9a-4342-a23b-329a48596e13 | 345049098498 | 1486020953 | Work-Job-Executor-34 | Migrating | User | Prepare | 1486020953 | 1240 | Host | 3 | i-388-1240-VM | NULL |
>>>>                 | ec1668b3-fa3c-4673-b593-3d607b169a24 | 345049098498 | 1486019693 | Work-Job-Executor-14 | Migrating | User | Prepare | 1486019693 | 1240 | Host | 3 | i-388-1240-VM | NULL |
>>>>                 | fa7b2dec-ed7f-4a67-a055-e6e6bc1631ca | 345049098498 | 1486020353 | Work-Job-Executor-26 | Migrating | User | Prepare | 1486020353 | 1240 | Host | 3 | i-388-1240-VM | NULL |
>>>>                 | 8cfe6d3e-5b7b-4be8-ac10-c865a492aab9 | 345049098498 | 1465387271 | Work-Job-Executor-18 | Running | User | Prepare | 1465387271 | 1388 | Host | 7 | i-361-1388-VM | NULL |
>>>>                 | 6e941b7e-6f19-4717-bf82-974c59737dbc | 345049103441 | 1486019524 | Work-Job-Executor-4 | Migrating | User | Prepare | 1486019524 | 1690 | Host | 15 | i-648-1690-VM | NULL |
>>>>                 | 718578a4-0dd6-42d9-be10-ed7680650108 | 345049098122 | 1486021433 | Work-Job-Executor-13 | Migrating | User | Prepare | 1486021433 | 1690 | Host | 15 | i-648-1690-VM | NULL |
>>>>                 | 9acf3908-ef13-4b03-8c2b-1a7e55e18491 | 345049098498 | 1486020173 | Work-Job-Executor-20 | Migrating | User | Prepare | 1486020173 | 1690 | Host | 15 | i-648-1690-VM | NULL |
>>>>                 | a50c420b-ae52-4caa-83db-704b52b299a4 | 345049098498 | 1486020773 | Work-Job-Executor-28 | Migrating | User | Prepare | 1486020773 | 1690 | Host | 15 | i-648-1690-VM | NULL |
>>>>                 | 2b5f4179-5975-4326-9db7-643c3ede63fe | 345049103441 | 1484224516 | Work-Job-Executor-161 | Running | User | Prepare | 1484224516 | 1703 | Host | 3 | i-51-1703-VM | NULL |
>>>>                 | 2f3c6c44-b2a0-4d10-88d2-ac143d2dd379 | 345049098498 | 1484223910 | Work-Job-Executor-163 | Running | User | Prepare | 1484223910 | 1703 | Host | 3 | i-51-1703-VM | NULL |
>>>>                 | 4126a23b-1cde-49bc-8636-c0d063b85cf6 | 345049103441 | 1484224636 | Work-Job-Executor-86 | Running | User | Prepare | 1484224636 | 1703 | Host | 3 | i-51-1703-VM | NULL |
>>>>                 | 5dad5645-766b-48c9-9a85-b77b3c101390 | 345049103441 | 1484224636 | Work-Job-Executor-92 | Running | User | Prepare | 1484224636 | 1703 | Host | 3 | i-51-1703-VM | NULL |
>>>>                 | 8c1dfff9-e55d-4f81-a22f-45f72accc416 | 345049103441 | 1484224576 | Work-Job-Executor-95 | Running | User | Prepare | 1484224576 | 1703 | Host | 3 | i-51-1703-VM | NULL |
>>>>                 | a703d2d7-b1e3-42ce-a551-831982990ffb | 345049098498 | 1484224630 | Work-Job-Executor-13 | Running | User | Prepare | 1484224630 | 1703 | Host | 3 | i-51-1703-VM | NULL |
>>>>                 | 0e194cb5-95b7-41cb-969a-ae37d2ad449a | 345049098498 | 1485763613 | Work-Job-Executor-28 | Migrating | User | Prepare | 1485763613 | 1735 | Host | 19 | i-638-1735-VM | 2017-01-30 15:24:10 |
>>>>                 | 74f180b9-be0f-429b-9a45-597ca2ca997d | 345049098498 | 1485763013 | Work-Job-Executor-21 | Migrating | User | Prepare | 1485763013 | 1735 | Host | 19 | i-638-1735-VM | 2017-01-30 15:24:10 |
>>>>                 | feaa421c-88df-4493-9a4f-bddb6ee1b75b | 345049098122 | 1485764285 | Work-Job-Executor-19 | Migrating | User | Prepare | 1485764285 | 1735 | Host | 19 | i-638-1735-VM | 2017-01-30 15:24:10 |
>>>>                 | 32138a36-ad18-42b5-8478-76ad63caf4da | 345049098122 | 1486020293 | Work-Job-Executor-9 | Migrating | User | Prepare | 1486020293 | 1736 | Host | 19 | i-638-1736-VM | NULL |
>>>>                 | 5306ab1f-3c92-4f95-bbff-283b43e6862d | 345049098498 | 1486019573 | Work-Job-Executor-11 | Migrating | User | Prepare | 1486019573 | 1736 | Host | 19 | i-638-1736-VM | NULL |
>>>>                 | 843bdf9c-247d-432c-ac19-1a4f7858631b | 345049098498 | 1486021493 | Work-Job-Executor-38 | Migrating | User | Prepare | 1486021493 | 1736 | Host | 19 | i-638-1736-VM | NULL |
>>>>                 | 8d400afb-766f-4ac5-89ee-854f25a38eba | 345049098498 | 1486020173 | Work-Job-Executor-21 | Migrating | User | Prepare | 1486020173 | 1736 | Host | 19 | i-638-1736-VM | NULL |
>>>>                 | f0ae5612-b6ef-4882-94a2-40a1e6dce342 | 345049098498 | 1486020833 | Work-Job-Executor-30 | Migrating | User | Prepare | 1486020833 | 1736 | Host | 19 | i-638-1736-VM | NULL |
>>>>
>>>>                 As you can see, the problematic VM's are all present
>>>>                 in that table:
>>>>
>>>>                     i-189-1177-VM
>>>>                     i-189-1178-VM
>>>>                     i-388-1240-VM
>>>>                     i-648-1690-VM
>>>>                     i-638-1736-VM
>>>>
>>>>                 As the MS are running the production environments,
>>>>                 we cannot debug a lot and just need the shortest
>>>>                 path to resolution.
>>>>                 For example, restarting the MS has an impact on our
>>>>                 customers, and we need to keep the number of
>>>>                 restarts to a minimum.
>>>>
>>>>                 I have disabled the HA on all problematic VM's
>>>>                 except on i-189-1177-VM, which is less critical.
>>>>
>>>>                 The next steps could be:
>>>>
>>>>                 1.     stop all MS
>>>>                 2.     backup DB
>>>>                 3.     select op_it_work.*, vm_instance.instance_name, vm_instance.removed
>>>>                        from op_it_work join vm_instance
>>>>                        where op_it_work.step != "Done"
>>>>                        and op_it_work.instance_id = vm_instance.id
>>>>                        order by instance_id;
>>>>                 4.     update op_it_work set step="Done" where
>>>>                 step!="Done";
>>>>                 5.     start all MS
>>>>
>>>>                 What do you think about step 4, safe or risky?
>>>>                 Some of these entries are related to deleted VM's
>>>>                 and seem safe to mark as "Done".
>>>>                 All other entries are related to the problematic
>>>>                 VM's that keep stopping... so marking them as
>>>>                 "Done" may (hopefully) eliminate the cause of the
>>>>                 Migrating/Stopping.
>>>>                 But we are not sure whether marking them as "Done"
>>>>                 also requires changes on other tables for consistency.
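>>>>
>>>>                 As a simple sanity check after step 4 (again only a
>>>>                 sketch), we could verify that nothing unexpected is
>>>>                 still left open:
>>>>
>>>>                     select step, count(*) from op_it_work group by step;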
>>>>
>>>>
>>>>                 Thank you for your help.
>>>>
>>>>
>>>>                 Best Regards
>>>>                 Francois Scheurer
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>                 On 01.02.2017 03:01, Rafael Weingärtner wrote:
>>>>>                 Ok, I dug a little further into the code.
>>>>>
>>>>>                     If we put a host in maintenance, it may
>>>>>                     unfortunately happen that a timeout occurs and
>>>>>                     CS aborts the migration. As far as we can
>>>>>                     interpret the logs, it seems (we speculate)
>>>>>                     that CS detects that the VM is then still
>>>>>                     running on the original host, which is in
>>>>>                     maintenance, and decides somehow to stop it
>>>>>                     because of the "host not up" status.
>>>>>
>>>>>
>>>>>                 From what I have seen in the code, this may happen
>>>>>                 (as you speculated). If a timeout or some other
>>>>>                 event interrupts the migration, the VM will be
>>>>>                 stopped. This happens in the method
>>>>>                 “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>>>>                 long, DeploymentPlanner)” at line 2396 and
>>>>>                 further. This was implemented on purpose; if the
>>>>>                 migration does not happen, the VM is stopped on
>>>>>                 the source host. Probably it was implemented this
>>>>>                 way following the gradient descent philosophy from
>>>>>                 artificial intelligence:
>>>>>
>>>>>                     “pick a random point in the field, update the
>>>>>                     values according to the slope and hopefully
>>>>>                     after successful iterations you converge to
>>>>>                     a minimum point.” (the problem is that
>>>>>                     sometimes the function diverges).
>>>>>
>>>>>
>>>>>                 I mean, probably this was done this way because
>>>>>                 the HA was supposed to see that a VM with HA is
>>>>>                 stopped, and should try to start it again
>>>>>                 somewhere else. The log you posted does not provide
>>>>>                 any other details to debug this further because
>>>>>                 the log is after an MS restart, right?
>>>>>
>>>>>
>>>>>                     After restarting the MSs I guess that a
>>>>>                     similar issue may happen: the hosts are
>>>>>                     disconnected during the early stage and then
>>>>>                     reconnect. Maybe CS then again stops the VMs
>>>>>                     because of the "host not up" status?
>>>>>                     Cf. 2017-01-30 08:44:06,157 DEBUG
>>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                     (Work-Job-Executor-6:ctx-0d738a8e
>>>>>                     job-13737/job-192988 ctx-6e9c753f) The last
>>>>>                     host of this VM is not UP or is not enabled,
>>>>>                     host status is: Disconnected, host resource
>>>>>                     state is: Enabled
>>>>>
>>>>>
>>>>>                 Here is something interesting that you have found.
>>>>>                 I checked the code, but without a live debug it is
>>>>>                 pretty hard for me to get the right flow of
>>>>>                 execution. By looking at the code and at the logs
>>>>>                 you provided, your deduction is right. After ACS
>>>>>                 boots up, the check for hosts that are
>>>>>                 shutdown/deactivated starts, and the method
>>>>>                 “com.cloud.agent.manager.AgentManagerImpl.executeUserRequest(long,
>>>>>                 Event)” is executed. The problem is that all of
>>>>>                 the hosts are considered disconnected (ACS may
>>>>>                 not have finished connecting to all of them yet),
>>>>>                 and then a whole HA process is kicked off. Since
>>>>>                 there is no host with status “UP”, you get “No
>>>>>                 suitable hosts found under this Cluster: 25”, and
>>>>>                 then the whole HA flow, which shuts down the VM,
>>>>>                 is kicked off.
>>>>>
>>>>>                 In theory, the HA worker should see that a VM with
>>>>>                 HA is down, and should try to start it. I do not
>>>>>                 understand why this is not working. If you
>>>>>                 decide to debug the running ACS, I may be able to
>>>>>                 help you, just keep me posted.
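>>>>>
>>>>>                 If it helps while HA is still enabled, a quick way
>>>>>                 to watch what the HA worker is tracking for those
>>>>>                 VMs (only a sketch, adjust as needed for your
>>>>>                 version) would be something like:
>>>>>
>>>>>                     select hw.id, hw.instance_id, vi.instance_name,
>>>>>                            hw.type, hw.state, hw.step, hw.taken
>>>>>                     from op_ha_work hw
>>>>>                     join vm_instance vi on vi.id = hw.instance_id
>>>>>                     where hw.type = "Migration";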
>>>>>
>>>>>
>>>>>                 On Tue, Jan 31, 2017 at 10:26 AM, Rafael
>>>>>                 Weingärtner <rafaelweingartner@gmail.com> wrote:
>>>>>
>>>>>                     I will be able to re-check some code and
>>>>>                     answer you with new thoughts on this problem
>>>>>                     tonight (the log files you provided today have
>>>>>                     some very insightful information regarding
>>>>>                     this problem). I believe this is a bug in HA,
>>>>>                     a very tricky one that has not been reported
>>>>>                     yet (at least I do not remember seeing
>>>>>                     anything similar). And thanks to
>>>>>                     randomness/destiny now you have
>>>>>                     found/uncovered a way to reproduce the problem
>>>>>                     (bug?!).
>>>>>
>>>>>                     I do not suggest changing any data at all
>>>>>                     directly in the database (unless you know the ACS
>>>>>                     code well and are 100% sure of what you are doing).
>>>>>                     This can cause even bigger and more
>>>>>                     complicated problems. If you remove the HA
>>>>>                     feature from these VMs, this problem will stop.
>>>>>                     However, if you can/are willing to keep
>>>>>                     HA until we fully uncover the problem, map it
>>>>>                     and report it, that would be great.
>>>>>                     BTW: why are you restarting MSs?
>>>>>                     What are the configurations of your MSs?
>>>>>
>>>>>                     Thanks for your time and for sharing
>>>>>                     information with our community.
>>>>>
>>>>>
>>>>>
>>>>>                     On Tue, Jan 31, 2017 at 8:53 AM, Francois
>>>>>                     Scheurer <francois.scheurer@everyware.ch> wrote:
>>>>>
>>>>>                         Dear Rafael
>>>>>
>>>>>
>>>>>                         Many thanks for your deep analysis.
>>>>>                         Nice humor with the planet alignments BTW ;)
>>>>>
>>>>>                         We also suspected the HA Flag to be
>>>>>                         related to this issue, as we never had this
>>>>>                         issue in the past, until we enabled HA
>>>>>                         on all VMs and Offerings.
>>>>>
>>>>>                         Here is also the workers list:
>>>>>
>>>>>                         mysql> select * from op_ha_work;
>>>>>                         | id | instance_id | type | vm_type | state | mgmt_server_id | host_id | created | tried | taken | step | time_to_try | updated |
>>>>>                         | 105 | 343 | HA | User | Running | 345049098498 | 29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056013 | 3 |
>>>>>                         | 109 | 407 | HA | ConsoleProxy | Running | 345049098498 | 33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056013 | 3 |
>>>>>                         | 113 | 407 | HA | ConsoleProxy | Running | 345049098498 | 33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056013 | 3 |
>>>>>                         | 117 | 407 | HA | ConsoleProxy | Running | 345049098498 | 33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056013 | 3 |
>>>>>                         | 121 | 343 | HA | User | Running | 345049098498 | 29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056013 | 3 |
>>>>>                         | 125 | 405 | HA | SecondaryStorageVm | Running | 345049098498 | 29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056013 | 3 |
>>>>>                         | 129 | 393 | HA | User | Running | 345049098498 | 25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056014 | 30 |
>>>>>                         | 133 | 405 | HA | SecondaryStorageVm | Running | 345049098498 | 29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056014 | 3 |
>>>>>                         | 137 | 402 | HA | ConsoleProxy | Running | 345049098498 | 25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 1394056014 | 3 |
>>>>>                         | 345 | 402 | HA | ConsoleProxy | Starting | 345049098498 | 25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 | Error | 1402527506 | 770 |
>>>>>                         | 349 | 687 | HA | SecondaryStorageVm | Starting | 345049098498 | 25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 | Error | 1402527506 | 2 |
>>>>>                         | 2057 | 1178 | Migration | User | Running | 345049098498 | 33 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 | Done | 1451024774 | 95 |
>>>>>                         | 2058 | 1736 | Migration | User | Running | 345049098498 | 224 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 | Done | 1451024774 | 45 |
>>>>>                         | 2059 | 1240 | Migration | User | Running | 345049098498 | 7 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 | Done | 1451024774 | 61 |
>>>>>                         | 2060 | 1178 | Migration | User | Running | 345049098122 | 33 | 2017-01-31 07:56:08 | 1 | 2017-01-31 08:06:37 | Done | 1451025374 | 95 |
>>>>>                         | 2061 | 1178 | Migration | User | Stopped | NULL | 33 | 2017-01-31 07:56:09 | 5 | NULL | Done | 1451027900 | 109 |
>>>>>                         | 2062 | 1690 | Migration | User | Running | 345049098498 | 224 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 | Done | 1451024774 | 109 |
>>>>>                         | 2063 | 1177 | Migration | User | Running | 345049098498 | 29 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 | Done | 1451024774 | 82 |
>>>>>                         | 2064 | 1690 | Migration | User | Running | 345049098122 | 224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 | Done | 1451024900 | 109 |
>>>>>                         | 2065 | 1240 | Migration | User | Running | 345049098122 | 7 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 | Done | 1451024900 | 61 |
>>>>>                         | 2066 | 1736 | Migration | User | Running | 345049098122 | 224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 | Done | 1451024900 | 45 |
>>>>>                         | 2067 | 1240 | Migration | User | Stopped | NULL | 7 | 2017-01-31 07:56:09 | 5 | NULL | Done | 1451028003 | 80 |
>>>>>                         | 2068 | 1178 | Migration | User | Running | 345049098122 | 33 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 | Done | 1451024781 | 95 |
>>>>>                         | 2069 | 1736 | Migration | User | Running | 345049098122 | 224 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 | Done | 1451024781 | 45 |
>>>>>                         | 2070 | 1240 | Migration | User | Running | 345049098122 | 7 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 | Done | 1451024781 | 61 |
>>>>>                         | 2071 | 1177 | Migration | User | Running | 345049098122 | 29 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 | Done | 1451024781 | 82 |
>>>>>                         | 2072 | 1177 | Migration | User | Stopped | NULL | 29 | 2017-01-31 07:56:16 | 5 | NULL | Done | 1451027973 | 101 |
>>>>>                         | 2073 | 1177 | Migration | User | Running | 345049098122 | 29 | 2017-01-31 07:56:16 | 1 | 2017-01-31 07:58:22 | Done | 1451024904 | 82 |
>>>>>                         | 2074 | 1690 | Migration | User | Stopped | NULL | 224 | 2017-01-31 07:56:16 | 5 | NULL | Done | 1451027464 | 123 |
>>>>>                         | 2075 | 1736 | Migration | User | Stopped | NULL | 224 | 2017-01-31 07:56:16 | 5 | NULL | Done | 1451027477 | 59 |
>>>>>
>>>>>                         If we put a host in maintenance, it may
>>>>>                         unfortunately happen that a timeout occurs
>>>>>                         and CS aborts the migration.
>>>>>                         As far as we can interpret the logs, it
>>>>>                         seems (we speculate) that CS detects that
>>>>>                         the VM is then still running on the
>>>>>                         original host, which is in maintenance,
>>>>>                         and decides somehow to stop it because of
>>>>>                         the "host not up" status.
>>>>>                         We then need to start the VM several
>>>>>                         times, until CS gives up stopping it.
>>>>>
>>>>>                         After restarting the MSs I guess that a
>>>>>                         similar issue may happen: the hosts are
>>>>>                         disconnected during the early stage and
>>>>>                         then reconnect. Maybe CS then again stops
>>>>>                         the VMs because of the "host not up" status?
>>>>>                         Cf. 2017-01-30 08:44:06,157 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-6:ctx-0d738a8e
>>>>>                         job-13737/job-192988 ctx-6e9c753f) The
>>>>>                         last host of this VM is not UP or is not
>>>>>                         enabled, host status is: Disconnected,
>>>>>                         host resource state is: Enabled
>>>>>
>>>>>                         But the problem after the MS restart is
>>>>>                         definitely only impacting VMs listed in
>>>>>                         the op_ha_work table.
>>>>>
>>>>>
>>>>>                         To your questions:
>>>>>                                 We have 3 ACS MSs managing several
>>>>>                         Zones. Each zone has a single pod and cluster.
>>>>>                                 The Cluster of i-638-1735-VM has
>>>>>                         this summary:
>>>>>                                  Total number of VMs: 121
>>>>>                                  Total number of hypervisors: 4
>>>>>                                  Total allocated RAM: 522.0 / 972
>>>>>                         GB (53.7 %)
>>>>>                                  Total allocated cores: 211
>>>>>                                  Total allocated storage: 11285 GB
>>>>>                                 Used memory on the hosts currently
>>>>>                         varies from 28 to 77%, so there is enough
>>>>>                         RAM.
>>>>>                                 I did not check the XS logs. The
>>>>>                         same problem arose again this morning
>>>>>                         (shortly after a restart of all MSs), so I
>>>>>                         will look in the XS logs now.
>>>>>
>>>>>                                 I have some fair experience with
>>>>>                         Eclipse and Java; I will ask my boss if we
>>>>>                         can go this way.
>>>>>
>>>>>                         The main issue for now is that restarting
>>>>>                         the MSs reproduces the problem, which
>>>>>                         is quite annoying for the impacted VM's.
>>>>>                         I thought of stopping the MSs, deleting all
>>>>>                         HA/Migration entries in op_ha_work and
>>>>>                         starting the MSs.
>>>>>                         Is that a viable option? (or should I
>>>>>                         delete the Migration entries only? or
>>>>>                         never touch this table?)
>>>>>                         We will also very probably remove the HA
>>>>>                         Flag on all VMs if it does not work reliably.
>>>>>
>>>>>                         Best Regards
>>>>>                         Francois Scheurer
>>>>>
>>>>>
>>>>>
>>>>>                         Here are new logs regarding the same issue
>>>>>                         from today.
>>>>>                         I tried to extract the important parts related
>>>>>                         to the stop of i-189-1178-VM.
>>>>>
>>>>>                         regexp:
>>>>>                         (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRequestedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae733|work-2049)
>>>>>
>>>>>                         2017-01-30 08:44:04,440 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989) Executing AsyncJobVO
>>>>>                         {id:192989, userId: 1, accountId: 1,
>>>>>                         instanceType: null, instanceId: null, cmd:
>>>>>                         com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>                         rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>>>>>                         F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>>>>                         cmdVersion: 0, status: IN_PROGRESS,
>>>>>                         processStatus: 0, resu
>>>>>                         ltCode: 0, result: null, initMsid:
>>>>>                         345049103441, completeMsid: null,
>>>>>                         lastUpdated: null, lastPolled: null,
>>>>>                         created: Mon Jan 30 08:44:03 CET 2017}
>>>>>                         2017-01-30 08:44:04,440 DEBUG
>>>>>                         [c.c.v.VmWorkJobDispatcher]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989) Run VM work job:
>>>>>                         com.cloud.vm.VmWorkMigrateAway for VM
>>>>>                         1178, job origin: 15393
>>>>>
>>>>>                         2017-01-30 08:44:05,217 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Deploy
>>>>>                         avoids pods: [], clusters: [], hosts: [29]
>>>>>                         2017-01-30 08:44:05,220 DEBUG
>>>>>                         [c.c.a.m.ClusteredAgentManagerImpl]
>>>>>                         (AgentConnectTaskPool-4:ctx-5b61f2e5)
>>>>>                         create ClusteredAgentAttache for 232
>>>>>                         2017-01-30 08:44:05,220 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea)
>>>>>                         DeploymentPlanner allocation algorithm:
>>>>>                         com.cloud.deploy.UserDispersingPlanner@1734b2e2
>>>>>                         2017-01-30 08:44:05,220 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Trying
>>>>>                         to allocate a host and storage pools from
>>>>>                         dc:17, pod:25,cluster:25, requested cpu:
>>>>>                         4200, requested ram: 4294967296
>>>>>                         <tel:%2842%29%209496-7296>
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Is ROOT
>>>>>                         volume READY (pool already allocated)?: Yes
>>>>>                         2017-01-30 08:44:05,220 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) This VM
>>>>>                         has last host_id specified, trying to
>>>>>                         choose the same host: 25
>>>>>
>>>>>                         2017-01-30 08:44:05,614 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) The
>>>>>                         last host of this VM is not UP or is not
>>>>>                         enabled, host status is: Disconnected,
>>>>>                         host resource state is: Enabled
>>>>>                         2017-01-30 08:44:05,614 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Cannot
>>>>>                         choose the last host to deploy this VM
>>>>>                         2017-01-30 08:44:05,614 DEBUG
>>>>>                         [c.c.d.FirstFitPlanner]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea)
>>>>>                         Searching resources only under specified
>>>>>                         Cluster: 25
>>>>>
>>>>>                         2017-01-30 08:44:05,777 DEBUG
>>>>>                         [c.c.a.m.a.i.FirstFitAllocator]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea
>>>>>                         FirstFitRoutingAllocator) Found 0 hosts
>>>>>                         for allocation after prioritization: []
>>>>>                         2017-01-30 08:44:05,777 DEBUG
>>>>>                         [c.c.a.m.a.i.FirstFitAllocator]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea
>>>>>                         FirstFitRoutingAllocator) Looking for
>>>>>                         speed=4200Mhz, Ram=4096
>>>>>                         2017-01-30 08:44:05,777 DEBUG
>>>>>                         [c.c.a.m.a.i.FirstFitAllocator]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea
>>>>>                         FirstFitRoutingAllocator) Host Allocator
>>>>>                         returning 0 suitable hosts
>>>>>                         2017-01-30 08:44:05,777 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) No
>>>>>                         suitable hosts found
>>>>>                         2017-01-30 08:44:05,777 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) No
>>>>>                         suitable hosts found under this Cluster: 25
>>>>>                         2017-01-30 08:44:05,787 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Could
>>>>>                         not find suitable Deployment Destination
>>>>>                         for this VM under any clusters, returning.
>>>>>                         2017-01-30 08:44:05,787 DEBUG
>>>>>                         [c.c.d.FirstFitPlanner]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea)
>>>>>                         Searching resources only under specified
>>>>>                         Cluster: 25
>>>>>                         2017-01-30 08:44:05,798 DEBUG
>>>>>                         [c.c.d.FirstFitPlanner]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) The
>>>>>                         specified cluster is in avoid set, returning.
>>>>>                         2017-01-30 08:44:05,798 DEBUG
>>>>>                         [c.c.v.VirtualMachineManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Unable
>>>>>                         to find destination for migrating the vm
>>>>>                         VM[User|i-189-1178-VM]
>>>>>                         2017-01-30 08:44:05,798 WARN
>>>>>                         [c.c.v.VirtualMachineManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Failed
>>>>>                         to deploy vm 1178 with original planner,
>>>>>                         sending HAPlanner
>>>>>
>>>>>                         2017-01-30 08:44:05,985 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Deploy
>>>>>                         avoids pods: [], clusters: [], hosts: [29]
>>>>>                         2017-01-30 08:44:05,985 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea)
>>>>>                         DeploymentPlanner allocation algorithm:
>>>>>                         com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>>>>>                         2017-01-30 08:44:05,985 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Trying
>>>>>                         to allocate a host and storage pools from
>>>>>                         dc:17, pod:25,cluster:25, requested cpu:
>>>>>                         4200, requested ram: 4294967296
>>>>>                         2017-01-30 08:44:05,985 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Is ROOT
>>>>>                         volume READY (pool already allocated)?: Yes
>>>>>                         2017-01-30 08:44:05,985 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) This VM
>>>>>                         has last host_id specified, trying to
>>>>>                         choose the same host: 25
>>>>>
>>>>>                         2017-01-30 08:44:06,022 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) The
>>>>>                         last host of this VM is not UP or is not
>>>>>                         enabled, host status is: Disconnected,
>>>>>                         host resource state is: Enabled
>>>>>                         2017-01-30 08:44:06,022 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Cannot
>>>>>                         choose the last host to deploy this VM
>>>>>                         2017-01-30 08:44:06,022 DEBUG
>>>>>                         [c.c.d.FirstFitPlanner]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea)
>>>>>                         Searching resources only under specified
>>>>>                         Cluster: 25
>>>>>                         2017-01-30 08:44:06,039 DEBUG
>>>>>                         [c.c.d.SkipHeuresticsPlanner]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea)
>>>>>                         Deploying vm during HA process, so
>>>>>                         skipping disable threshold check
>>>>>
>>>>>                         2017-01-30 08:44:06,151 DEBUG
>>>>>                         [c.c.a.m.a.i.FirstFitAllocator]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea
>>>>>                         FirstFitRoutingAllocator) Found 0 hosts
>>>>>                         for allocation after prioritization: []
>>>>>                         2017-01-30 08:44:06,151 DEBUG
>>>>>                         [c.c.a.m.a.i.FirstFitAllocator]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea
>>>>>                         FirstFitRoutingAllocator) Looking for
>>>>>                         speed=4200Mhz, Ram=4096
>>>>>                         2017-01-30 08:44:06,151 DEBUG
>>>>>                         [c.c.a.m.a.i.FirstFitAllocator]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea
>>>>>                         FirstFitRoutingAllocator) Host Allocator
>>>>>                         returning 0 suitable hosts
>>>>>                         2017-01-30 08:44:06,151 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) No
>>>>>                         suitable hosts found
>>>>>                         2017-01-30 08:44:06,151 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) No
>>>>>                         suitable hosts found under this Cluster: 25
>>>>>                         2017-01-30 08:44:06,157 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-4:ctx-017064e7
>>>>>                         job-13733/job-192986) Publish async
>>>>>                         job-192986 complete on message bus
>>>>>                         2017-01-30 08:44:06,157 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-4:ctx-017064e7
>>>>>                         job-13733/job-192986) Wake up jobs related
>>>>>                         to job-192986
>>>>>                         2017-01-30 08:44:06,157 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-4:ctx-017064e7
>>>>>                         job-13733/job-192986) Update db status for
>>>>>                         job-192986
>>>>>                         2017-01-30 08:44:06,157 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-6:ctx-0d738a8e
>>>>>                         job-13737/job-192988 ctx-6e9c753f) The
>>>>>                         last host of this VM is not UP or is not
>>>>>                         enabled, host status is: Disconnected,
>>>>>                         host resource state is: Enabled
>>>>>                         2017-01-30 08:44:06,157 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-6:ctx-0d738a8e
>>>>>                         job-13737/job-192988 ctx-6e9c753f) Cannot
>>>>>                         choose the last host to deploy this VM
>>>>>                         2017-01-30 08:44:06,157 DEBUG
>>>>>                         [c.c.d.FirstFitPlanner]
>>>>>                         (Work-Job-Executor-6:ctx-0d738a8e
>>>>>                         job-13737/job-192988 ctx-6e9c753f)
>>>>>                         Searching resources only under specified
>>>>>                         Cluster: 35
>>>>>                         2017-01-30 08:44:06,160 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Could
>>>>>                         not find suitable Deployment Destination
>>>>>                         for this VM under any clusters, returning.
>>>>>                         2017-01-30 08:44:06,160 DEBUG
>>>>>                         [c.c.d.FirstFitPlanner]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea)
>>>>>                         Searching resources only under specified
>>>>>                         Cluster: 25
>>>>>                         2017-01-30 08:44:06,163 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b)
>>>>>                         Execute sync-queue item: SyncQueueItemVO
>>>>>                         {id:63629, queueId: 64423, contentType:
>>>>>                         AsyncJob, contentId: 192991,
>>>>>                         lastProcessMsid: 345049098498,
>>>>>                         lastprocessNumber: 27, lastProcessTime:
>>>>>                         Mon Jan 30 08:44:06 CET 2017, created: Mon
>>>>>                         Jan 30 08:44:05 CET 2017}
>>>>>                         2017-01-30 08:44:06,169 INFO
>>>>>                         [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>                         (Work-Job-Executor-8:ctx-381e8d1a
>>>>>                         job-14953/job-192992) Add job-192992 into
>>>>>                         job monitoring
>>>>>                         2017-01-30 08:44:06,184 DEBUG
>>>>>                         [c.c.d.SkipHeuresticsPlanner]
>>>>>                         (Work-Job-Executor-6:ctx-0d738a8e
>>>>>                         job-13737/job-192988 ctx-6e9c753f)
>>>>>                         Deploying vm during HA process, so
>>>>>                         skipping disable threshold check
>>>>>                         2017-01-30 08:44:06,186 DEBUG
>>>>>                         [c.c.d.FirstFitPlanner]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) The
>>>>>                         specified cluster is in avoid set, returning.
>>>>>                         2017-01-30 08:44:06,186 DEBUG
>>>>>                         [c.c.v.VirtualMachineManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Unable
>>>>>                         to find destination for migrating the vm
>>>>>                         VM[User|i-189-1178-VM]
>>>>>                         2017-01-30 08:44:06,186 ERROR
>>>>>                         [c.c.v.VmWorkJobHandlerProxy]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea)
>>>>>                         Invocation exception, caused by:
>>>>>                         com.cloud.exception.InsufficientServerCapacityException:
>>>>>                         Unable to find a server to migrate
>>>>>                         to.Scope=interface com.cloud.org.Cluster;
>>>>>                         id=25
>>>>>                         2017-01-30 08:44:06,187 INFO
>>>>>                         [c.c.v.VmWorkJobHandlerProxy]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989 ctx-ea83d6ea) Rethrow
>>>>>                         exception
>>>>>                         com.cloud.exception.InsufficientServerCapacityException:
>>>>>                         Unable to find a server to migrate
>>>>>                         to.Scope=interface com.cloud.org.Cluster;
>>>>>                         id=25
>>>>>                         2017-01-30 08:44:06,187 DEBUG
>>>>>                         [c.c.v.VmWorkJobDispatcher]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989) Done with run of VM
>>>>>                         work job: com.cloud.vm.VmWorkMigrateAway
>>>>>                         for VM 1178, job origin: 15393
>>>>>                         2017-01-30 08:44:06,187 ERROR
>>>>>                         [c.c.v.VmWorkJobDispatcher]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989) Unable to complete
>>>>>                         AsyncJobVO {id:192989, userId: 1,
>>>>>                         accountId: 1, instanceType: null,
>>>>>                         instanceId: null, cmd:
>>>>>                         com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>                         rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>>>>                         cmdVersion: 0, status: IN_PROGRESS,
>>>>>                         processStatus: 0, resultCode: 0, result:
>>>>>                         null, initMsid: 345049103441,
>>>>>                         completeMsid: null, lastUpdated: null,
>>>>>                         lastPolled: null, created: Mon Jan 30
>>>>>                         08:44:03 CET 2017}, job origin:15393
>>>>>                         com.cloud.exception.InsufficientServerCapacityException:
>>>>>                         Unable to find a server to migrate
>>>>>                         to.Scope=interface com.cloud.org.Cluster;
>>>>>                         id=25
>>>>>                                 at
>>>>>                         com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2390)
>>>>>                                 at
>>>>>                         com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4520)
>>>>>                                 at
>>>>>                         sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>>>                         Method)
>>>>>                                 at
>>>>>                         sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>                                 at
>>>>>                         sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>                                 at
>>>>>                         java.lang.reflect.Method.invoke(Method.java:606)
>>>>>                                 at
>>>>>                         com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>                                 at
>>>>>                         com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>                                 at
>>>>>                         com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:103)
>>>>>                                 at
>>>>>                         org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
>>>>>                                 at
>>>>>                         org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>>                                 at
>>>>>                         org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>>>>>                                 at
>>>>>                         org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>>>>>                                 at
>>>>>                         org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>>>>>                                 at
>>>>>                         org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>>                                 at
>>>>>                         org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>>>                                 at
>>>>>                         java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>                                 at
>>>>>                         java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>                                 at
>>>>>                         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>                                 at
>>>>>                         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>                                 at
>>>>>                         java.lang.Thread.run(Thread.java:745)
>>>>>                         2017-01-30 08:44:06,188 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b)
>>>>>                         Schedule queued job-192991
>>>>>                         2017-01-30 08:44:06,192 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-4:ctx-017064e7
>>>>>                         job-13733/job-192986) Wake up jobs joined
>>>>>                         with job-192986 and disjoin all subjobs
>>>>>                         created from job- 192986
>>>>>                         2017-01-30 08:44:06,196 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989) Complete async
>>>>>                         job-192989, jobStatus: FAILED, resultCode:
>>>>>                         0, result:
>>>>>                         rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZlckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vwc0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRDYXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5nL0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xvdWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vycm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4aOxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNhdXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQAEkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFuZy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QAEExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBzZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFjZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFyaW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2VySW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hlc3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EAfgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNxAH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5yZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhqYXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EAfgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pvYnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNsb3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRjaGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1wbCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRleHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdlZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2FsbHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5jb250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxsV2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5jb25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRvcnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdUaHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29ya2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAALVGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5tb2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEudXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMAAAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMj
yPfAgABSgAFdmFsdWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-omaW2FAIAAHhwAA
>>>>>                         2017-01-30 08:44:06,201 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-7:ctx-bcf531e9
>>>>>                         job-13729/job-192987 ctx-4c93bc31) Deploy
>>>>>                         avoids pods: [], clusters: [], hosts: [3]
>>>>>                         2017-01-30 08:44:06,201 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-7:ctx-bcf531e9
>>>>>                         job-13729/job-192987 ctx-4c93bc31)
>>>>>                         DeploymentPlanner allocation algorithm:
>>>>>                         com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>>>>>                         2017-01-30 08:44:06,201 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-7:ctx-bcf531e9
>>>>>                         job-13729/job-192987 ctx-4c93bc31) Trying
>>>>>                         to allocate a host and storage pools from
>>>>>                         dc:14, pod:17,cluster:17, requested cpu:
>>>>>                         4200, requested ram: 4294967296
>>>>>                         2017-01-30 08:44:06,201 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-7:ctx-bcf531e9
>>>>>                         job-13729/job-192987 ctx-4c93bc31) Is ROOT
>>>>>                         volume READY (pool already allocated)?: Yes
>>>>>                         2017-01-30 08:44:06,201 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-7:ctx-bcf531e9
>>>>>                         job-13729/job-192987 ctx-4c93bc31) This VM
>>>>>                         has last host_id specified, trying to
>>>>>                         choose the same host: 213
>>>>>                         2017-01-30 08:44:06,213 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989) Publish async
>>>>>                         job-192989 complete on message bus
>>>>>                         2017-01-30 08:44:06,213 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989) Wake up jobs related
>>>>>                         to job-192989
>>>>>                         2017-01-30 08:44:06,213 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989) Update db status for
>>>>>                         job-192989
>>>>>                         2017-01-30 08:44:06,219 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-6:ctx-0d738a8e
>>>>>                         job-13737/job-192988 ctx-6e9c753f)
>>>>>                         Checking resources in Cluster: 35 under
>>>>>                         Pod: 31
>>>>>                         2017-01-30 08:44:06,227 DEBUG
>>>>>                         [c.c.a.m.a.i.FirstFitAllocator]
>>>>>                         (Work-Job-Executor-6:ctx-0d738a8e
>>>>>                         job-13737/job-192988 ctx-6e9c753f
>>>>>                         FirstFitRoutingAllocator) Looking for
>>>>>                         hosts in dc: 23  pod:31 cluster:35
>>>>>                         2017-01-30 08:44:06,229 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-8:ctx-381e8d1a
>>>>>                         job-14953/job-192992) Executing AsyncJobVO
>>>>>                         {id:192992, userId: 1, accountId: 1,
>>>>>                         instanceType: null, instanceId: null, cmd:
>>>>>                         com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>                         rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw,
>>>>>                         cmdVersion: 0, status: IN_PROGRESS,
>>>>>                         processStatus: 0, resultCode: 0, result:
>>>>>                         null, initMsid: 345049098122,
>>>>>                         completeMsid: null, lastUpdated: null,
>>>>>                         lastPolled: null, created: Mon Jan 30
>>>>>                         08:44:05 CET 2017}
>>>>>                         2017-01-30 08:44:06,229 DEBUG
>>>>>                         [c.c.v.VmWorkJobDispatcher]
>>>>>                         (Work-Job-Executor-8:ctx-381e8d1a
>>>>>                         job-14953/job-192992) Run VM work job:
>>>>>                         com.cloud.vm.VmWorkMigrateAway for VM
>>>>>                         1729, job origin: 14953
>>>>>                         2017-01-30 08:44:06,230 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b)
>>>>>                         Execute sync-queue item: SyncQueueItemVO
>>>>>                         {id:63630, queueId: 67180, contentType:
>>>>>                         AsyncJob, contentId: 192990,
>>>>>                         lastProcessMsid: 345049098498,
>>>>>                         lastprocessNumber: 61, lastProcessTime:
>>>>>                         Mon Jan 30 08:44:06 CET 2017, created: Mon
>>>>>                         Jan 30 08:44:05 CET 2017}
>>>>>                         2017-01-30 08:44:06,244 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-7:ctx-bcf531e9
>>>>>                         job-13729/job-192987 ctx-4c93bc31) The
>>>>>                         last host of this VM is not UP or is not
>>>>>                         enabled, host status is: Disconnected,
>>>>>                         host resource state is: Enabled
>>>>>                         2017-01-30 08:44:06,262 DEBUG
>>>>>                         [c.c.d.DeploymentPlanningManagerImpl]
>>>>>                         (Work-Job-Executor-7:ctx-bcf531e9
>>>>>                         job-13729/job-192987 ctx-4c93bc31) Cannot
>>>>>                         choose the last host to deploy this VM
>>>>>                         2017-01-30 08:44:06,263 DEBUG
>>>>>                         [c.c.d.FirstFitPlanner]
>>>>>                         (Work-Job-Executor-7:ctx-bcf531e9
>>>>>                         job-13729/job-192987 ctx-4c93bc31)
>>>>>                         Searching resources only under specified
>>>>>                         Cluster: 17
>>>>>                         2017-01-30 08:44:06,262 INFO
>>>>>                         [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>                         (Work-Job-Executor-9:ctx-f4d09efb
>>>>>                         job-14965/job-192991) Add job-192991 into
>>>>>                         job monitoring
>>>>>                         2017-01-30 08:44:06,246 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-4:ctx-017064e7
>>>>>                         job-13733/job-192986) Done executing
>>>>>                         com.cloud.vm.VmWorkMigrateAway for job-192986
>>>>>                         2017-01-30 08:44:06,244 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989) Wake up jobs joined
>>>>>                         with job-192989 and disjoin all subjobs
>>>>>                         created from job- 192989
>>>>>                         2017-01-30 08:44:06,269 INFO
>>>>>                         [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>                         (Work-Job-Executor-4:ctx-017064e7
>>>>>                         job-13733/job-192986) Remove job-192986
>>>>>                         from job monitoring
>>>>>                         2017-01-30 08:44:06,279 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989) Done executing
>>>>>                         com.cloud.vm.VmWorkMigrateAway for job-192989
>>>>>                         2017-01-30 08:44:06,305 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (Work-Job-Executor-9:ctx-f4d09efb
>>>>>                         job-14965/job-192991) Executing AsyncJobVO
>>>>>                         {id:192991, userId: 1, accountId: 1,
>>>>>                         instanceType: null, instanceId: null, cmd:
>>>>>                         com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>                         rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw,
>>>>>                         cmdVersion: 0, status: IN_PROGRESS,
>>>>>                         processStatus: 0, resultCode: 0, result:
>>>>>                         null, initMsid: 345049098122,
>>>>>                         completeMsid: null, lastUpdated: null,
>>>>>                         lastPolled: null, created: Mon Jan 30
>>>>>                         08:44:05 CET 2017}
>>>>>                         2017-01-30 08:44:06,305 DEBUG
>>>>>                         [c.c.v.VmWorkJobDispatcher]
>>>>>                         (Work-Job-Executor-9:ctx-f4d09efb
>>>>>                         job-14965/job-192991) Run VM work job:
>>>>>                         com.cloud.vm.VmWorkMigrateAway for VM
>>>>>                         1605, job origin: 14965
>>>>>                         2017-01-30 08:44:06,306 DEBUG
>>>>>                         [c.c.a.m.AgentManagerImpl]
>>>>>                         (AgentManager-Handler-13:null) Ping from 258
>>>>>                         2017-01-30 08:44:06,319 DEBUG
>>>>>                         [c.c.a.m.a.i.FirstFitAllocator]
>>>>>                         (Work-Job-Executor-6:ctx-0d738a8e
>>>>>                         job-13737/job-192988 ctx-6e9c753f
>>>>>                         FirstFitRoutingAllocator) List of hosts in
>>>>>                         ascending order of number of VMs: [224,
>>>>>                         19, 21, 15]
>>>>>                         2017-01-30 08:44:06,322 DEBUG
>>>>>                         [c.c.a.m.a.i.FirstFitAllocator]
>>>>>                         (Work-Job-Executor-6:ctx-0d738a8e
>>>>>                         job-13737/job-192988 ctx-6e9c753f
>>>>>                         FirstFitRoutingAllocator)
>>>>>                         FirstFitAllocator has 0 hosts to check for
>>>>>                         allocation: []
>>>>>                         2017-01-30 08:44:06,322 INFO
>>>>>                         [c.c.a.m.AgentManagerImpl]
>>>>>                         (AgentManager-Handler-13:null) PingMap for
>>>>>                         agent: 258 will not be updated because
>>>>>                         agent is no longer in the PingMap
>>>>>                         2017-01-30 08:44:06,326 INFO
>>>>>                         [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>                         (Work-Job-Executor-5:ctx-32f1e087
>>>>>                         job-15393/job-192989) Remove job-192989
>>>>>                         from job monitoring
>>>>>
>>>>>                         2017-01-30 08:54:38,139 INFO
>>>>>                         [c.c.v.VirtualMachineManagerImpl]
>>>>>                         (Work-Job-Executor-20:ctx-a988090f
>>>>>                         job-14965/job-193019 ctx-052eb4f5)
>>>>>                         Migration cancelled because state has
>>>>>                         changed: VM[User|i-189-1178-VM]
>>>>>                         2017-01-30 08:54:38,139 DEBUG
>>>>>                         [c.c.v.VirtualMachineManagerImpl]
>>>>>                         (Work-Job-Executor-20:ctx-a988090f
>>>>>                         job-14965/job-193019 ctx-052eb4f5) Unable
>>>>>                         to migrate VM due to: Migration cancelled
>>>>>                         because state has changed:
>>>>>                         VM[User|i-189-1178-VM]
>>>>>                         2017-01-30 08:54:38,167 DEBUG
>>>>>                         [c.c.c.CapacityManagerImpl]
>>>>>                         (Work-Job-Executor-20:ctx-a988090f
>>>>>                         job-14965/job-193019 ctx-052eb4f5) VM
>>>>>                         state transitted from :Running to Stopping
>>>>>                         with event: StopRequestedvm's original
>>>>>                         host id: 29 new host id: 33 host id before
>>>>>                         state transition: 33
>>>>>                         2017-01-30 08:54:38,177 DEBUG
>>>>>                         [c.c.a.m.DirectAgentAttache]
>>>>>                         (DirectAgent-325:ctx-e540d7f5) Seq
>>>>>                         19-2355664080091480198: Response Received:
>>>>>                         2017-01-30 08:54:38,177 DEBUG
>>>>>                         [c.c.a.t.Request]
>>>>>                         (DirectAgent-325:ctx-e540d7f5) Seq
>>>>>                         19-2355664080091480198: Processing: { Ans:
>>>>>                         , MgmtId: 345049098498, via: 19, Ver: v1,
>>>>>                         Flags: 10,
>>>>>                         [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,"result":true,"details":"","wait":0}}]
>>>>>                         }
>>>>>                         2017-01-30 08:54:38,177 DEBUG
>>>>>                         [c.c.a.t.Request]
>>>>>                         (RouterMonitor-1:ctx-2e80f405) Seq
>>>>>                         19-2355664080091480198: Received:  { Ans:
>>>>>                         , MgmtId: 345049098498, via: 19, Ver: v1,
>>>>>                         Flags: 10, { NetworkUsageAnswer } }
>>>>>                         2017-01-30 08:54:38,177 DEBUG
>>>>>                         [c.c.a.m.AgentManagerImpl]
>>>>>                         (RouterMonitor-1:ctx-2e80f405) Details
>>>>>                         from executing class
>>>>>                         com.cloud.agent.api.NetworkUsageCommand:
>>>>>                         2017-01-30 08:54:38,178 DEBUG
>>>>>                         [c.c.a.t.Request]
>>>>>                         (Work-Job-Executor-20:ctx-a988090f
>>>>>                         job-14965/job-193019 ctx-052eb4f5) Seq
>>>>>                         33-1917970491306410035: Sending  { Cmd ,
>>>>>                         MgmtId: 345049098498, via:
>>>>>                         33(ewcstack-vh003-prod), Ver: v1, Flags:
>>>>>                         100011,
>>>>>                         [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>>>                         }
>>>>>                         2017-01-30 08:54:38,178 DEBUG
>>>>>                         [c.c.a.t.Request]
>>>>>                         (Work-Job-Executor-20:ctx-a988090f
>>>>>                         job-14965/job-193019 ctx-052eb4f5) Seq
>>>>>                         33-1917970491306410035: Executing:  { Cmd
>>>>>                         , MgmtId: 345049098498, via:
>>>>>                         33(ewcstack-vh003-prod), Ver: v1, Flags:
>>>>>                         100011,
>>>>>                         [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>>>                         }
>>>>>                         2017-01-30 08:54:38,178 DEBUG
>>>>>                         [c.c.a.m.DirectAgentAttache]
>>>>>                         (DirectAgent-43:ctx-da30f109) Seq
>>>>>                         33-1917970491306410035: Executing request
>>>>>                         2017-01-30 08:54:38,202 DEBUG
>>>>>                         [c.c.a.m.DirectAgentAttache]
>>>>>                         (DirectAgent-326:ctx-4fb88aa7) Seq
>>>>>                         21-1035546439318503505: Executing request
>>>>>
>>>>>                         2017-01-30 09:05:55,628 INFO
>>>>>                         [c.c.h.HighAvailabilityManagerImpl]
>>>>>                         (HA-Worker-4:ctx-36aae733 work-2049)
>>>>>                         Processing
>>>>>                         HAWork[2049-Migration-1178-Stopped-Migrating]
>>>>>                         2017-01-30 09:05:55,633 WARN
>>>>>                         [o.a.c.f.j.AsyncJobExecutionContext]
>>>>>                         (HA-Worker-4:ctx-36aae733 work-2049) Job
>>>>>                         is executed without a context, setup psudo
>>>>>                         job for the executing thread
>>>>>                         2017-01-30 09:05:55,645 DEBUG
>>>>>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                         (HA-Worker-4:ctx-36aae733 work-2049) Sync
>>>>>                         job-193045 execution on object
>>>>>                         VmWorkJobQueue.1178
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>                         On 31.01.2017 00:42, Rafael Weingärtner wrote:
>>>>>
>>>>>                             This seems to be one of those nasty
>>>>>                             bugs that are very hard to debug. It
>>>>>                             seems to be caused by a combination of
>>>>>                             different factors, such as planetary
>>>>>                             alignments ;)
>>>>>
>>>>>                             In class “VirtualMachineManagerImpl”,
>>>>>                             line 2395 (method
>>>>>                             “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>>>>                             long, DeploymentPlanner)”) is the
>>>>>                             start of the code flow that triggers the
>>>>>                             migration. The point here is that, if
>>>>>                             an error occurs during the migration,
>>>>>                             the VM will be stopped; for some
>>>>>                             reason, this is done only for HA VMs.
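>>>>>
>>>>>                             To make that branch concrete, here is a
>>>>>                             minimal, self-contained sketch of the
>>>>>                             flow I am describing. It is not the
>>>>>                             actual CloudStack code; every class and
>>>>>                             name below is invented for illustration,
>>>>>                             and it only models the effect visible in
>>>>>                             your logs (the migration fails, so the
>>>>>                             HA-enabled VM gets a StopRequested):
>>>>>
>>>>>                             // Illustrative stand-in for the orchestrateMigrateAway() branch.
>>>>>                             public class MigrateAwaySketch {
>>>>>
>>>>>                                 static class InsufficientServerCapacityException extends Exception {
>>>>>                                     InsufficientServerCapacityException(String m) { super(m); }
>>>>>                                 }
>>>>>
>>>>>                                 static class Vm {
>>>>>                                     final String name;
>>>>>                                     final boolean haEnabled;   // the VM "HA" flag
>>>>>                                     String state = "Running";
>>>>>                                     Vm(String n, boolean ha) { name = n; haEnabled = ha; }
>>>>>                                 }
>>>>>
>>>>>                                 static void migrateAway(Vm vm, boolean destinationFound) {
>>>>>                                     try {
>>>>>                                         if (!destinationFound) {
>>>>>                                             throw new InsufficientServerCapacityException(
>>>>>                                                     "Unable to find a server to migrate to.");
>>>>>                                         }
>>>>>                                         System.out.println(vm.name + " migrated");
>>>>>                                     } catch (InsufficientServerCapacityException e) {
>>>>>                                         System.out.println("Unable to migrate: " + e.getMessage());
>>>>>                                         if (vm.haEnabled) {
>>>>>                                             // fallback for HA VMs: stop the VM so HA can reschedule it;
>>>>>                                             // this is the "StopRequested" visible in the logs
>>>>>                                             vm.state = "Stopping";
>>>>>                                             System.out.println(vm.name + " state -> " + vm.state);
>>>>>                                         }
>>>>>                                     }
>>>>>                                 }
>>>>>
>>>>>                                 public static void main(String[] args) {
>>>>>                                     migrateAway(new Vm("i-189-1178-VM", true), false);
>>>>>                                 }
>>>>>                             }
>>>>>
>>>>>                             In your logs the real-world counterpart of
>>>>>                             that fallback is the "VM state transitted
>>>>>                             from :Running to Stopping with event:
>>>>>                             StopRequested" entry right after the failed
>>>>>                             migration.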
>>>>>
>>>>>                             Going a little bit further, we find
>>>>>                             the point that is triggering the error
>>>>>                             in the method
>>>>>                             “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO,
>>>>>                             long, DeployDestination)” at line
>>>>>                             1941. This line will call the method
>>>>>                             “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>>>>>                             Event,
>>>>>                             Long, ItWorkVO, Step)” to change the
>>>>>                             VM state to migrating/prepare
>>>>>                             migration/or something like that (I am
>>>>>                             guessing here), but for some reason,
>>>>>                             the state of the VM is not the one
>>>>>                             expected and this is causing a
>>>>>                             “ConcurrentOperationException” to be
>>>>>                             thrown. The problem is that the
>>>>>                             “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>>>>>                             Event,
>>>>>                             Long, ItWorkVO, Step)” does not seem
>>>>>                             to provide many useful log entries.
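>>>>>
>>>>>                             To illustrate why a concurrent state change
>>>>>                             makes the operation fail, here is a second
>>>>>                             toy sketch. It is not the real changeState()
>>>>>                             or state machine code; it only models an
>>>>>                             optimistic transition that is rejected when
>>>>>                             someone else changed the state first, which
>>>>>                             is the path that ends in the "Migration
>>>>>                             cancelled because state has changed" message:
>>>>>
>>>>>                             // Illustrative only: an optimistic state transition, comparable in
>>>>>                             // spirit to a conditional "UPDATE ... WHERE state = <expected>" that
>>>>>                             // matches zero rows when the state was changed concurrently.
>>>>>                             import java.util.concurrent.atomic.AtomicReference;
>>>>>
>>>>>                             public class StateTransitionSketch {
>>>>>
>>>>>                                 static class ConcurrentOperationException extends Exception {
>>>>>                                     ConcurrentOperationException(String m) { super(m); }
>>>>>                                 }
>>>>>
>>>>>                                 private final AtomicReference<String> state = new AtomicReference<>("Running");
>>>>>
>>>>>                                 void changeState(String expected, String next) throws ConcurrentOperationException {
>>>>>                                     // succeeds only if nobody changed the state in the meantime
>>>>>                                     if (!state.compareAndSet(expected, next)) {
>>>>>                                         throw new ConcurrentOperationException(
>>>>>                                                 "Migration cancelled because state has changed: expected "
>>>>>                                                 + expected + " but found " + state.get());
>>>>>                                     }
>>>>>                                 }
>>>>>
>>>>>                                 public static void main(String[] args) throws Exception {
>>>>>                                     StateTransitionSketch vm = new StateTransitionSketch();
>>>>>                                     vm.changeState("Running", "Migrating");     // first worker wins
>>>>>                                     try {
>>>>>                                         vm.changeState("Running", "Stopping");  // concurrent worker loses
>>>>>                                     } catch (ConcurrentOperationException e) {
>>>>>                                         System.out.println(e.getMessage());
>>>>>                                     }
>>>>>                                 }
>>>>>                             }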
>>>>>
>>>>>                             Just to enable us to discuss the
>>>>>                             problem a little further, could you tell
>>>>>                             us a little bit about your environment
>>>>>                             setting? How many ACS MSs do you
>>>>>                             have? What is the VM workload being
>>>>>                             migrated (I mean, the number of VMs and
>>>>>                             computing resources)? Do the other
>>>>>                             servers have enough computing resources
>>>>>                             to support these VMs? Could you also
>>>>>                             check the XenServer logs, and check
>>>>>                             what happened on both, source and
>>>>>                             destination hosts when the migration for
>>>>>                             VM “i-638-1735-VM” was triggered?
>>>>>
>>>>>                             Do you have the knowledge/skills to
>>>>>                             use an IDE (e.g. Eclipse) to connect to
>>>>>                             a running JVM and debug the code?
>>>>>                             For me it will not be possible to
>>>>>                             go further without looking at the
>>>>>                             variables in a running environment.
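>>>>>
>>>>>                             (For reference: remote debugging a running
>>>>>                             management server usually just means starting
>>>>>                             its JVM with a JDWP agent, something like
>>>>>                             -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8787
>>>>>                             added to the Java options, and then attaching
>>>>>                             Eclipse to that port with a "Remote Java
>>>>>                             Application" debug configuration. Where the
>>>>>                             Java options live depends on how the
>>>>>                             management server is packaged, so treat the
>>>>>                             line above as a hint rather than a recipe.)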
>>>>>
>>>>>
>>>>>                             On Mon, Jan 30, 2017 at 4:28 AM,
>>>>>                             Francois Scheurer
>>>>>                             <francois.scheurer@everyware.ch> wrote:
>>>>>
>>>>>                                 Dear Rafael
>>>>>
>>>>>
>>>>>                                 Sorry for the late answer (I was
>>>>>                                 sick for a couple of days).
>>>>>
>>>>>                                 I confirm we are migrating inside
>>>>>                                 the same XenServer pool (within a CS
>>>>>                                 Cluster).
>>>>>
>>>>>
>>>>>                                 Best Regards
>>>>>                                 Francois
>>>>>
>>>>>
>>>>>
>>>>>                                 On 26.01.2017 04:35, Rafael
>>>>>                                 Weingärtner wrote:
>>>>>
>>>>>                                     I found the code that is executed.
>>>>>                                     I need you to confirm whether you
>>>>>                                     are migrating within a cluster
>>>>>                                     or across
>>>>>                                     clusters.
>>>>>
>>>>>                                     On Wed, Jan 25, 2017 at 11:57
>>>>>                                     AM, Rafael Weingärtner
>>>>>                                     <rafaelweingartner@gmail.com> wrote:
>>>>>
>>>>>                                     I do not have access to the ACS
>>>>>                                     code right now, but I suggest
>>>>>                                     starting debugging here:
>>>>>
>>>>>                                         2017-01-24 18:05:28,427
>>>>>                                         DEBUG
>>>>>                                         [c.c.c.CapacityManagerImpl]
>>>>>
>>>>>                                             (Work-Job-Executor-156:ctx-14b86f5e
>>>>>                                             job-60216/job-191380
>>>>>                                             ctx-175d37df)
>>>>>                                             VM state transitted
>>>>>                                             from :Running to
>>>>>                                             Stopping with event:
>>>>>                                             StopRequestedvm's
>>>>>                                             original host id: 224
>>>>>                                             new host id: 15 host
>>>>>                                             id before state
>>>>>                                             transition:
>>>>>                                             15
>>>>>
>>>>>                                         I would first try to
>>>>>                                         understand why ACS requested
>>>>>                                         the hypervisor to stop the VM.
>>>>>                                         I mean the following: check the
>>>>>                                         source code for conditions
>>>>>                                         that would make ACS request
>>>>>                                         the shutdown of VMs.
>>>>>                                         BTW: are you migrating
>>>>>                                         within a cluster or across
>>>>>                                         clusters?
>>>>>
>>>>>                                         I will be able to analyze
>>>>>                                         this further only after
>>>>>                                         work hours, so if you
>>>>>                                         find anything, keep me posted.
>>>>>
>>>>>                                         On Wed, Jan 25, 2017 at
>>>>>                                         11:45 AM, Francois Scheurer
>>>>>                                         <francois.scheurer@everyware.ch>
>>>>>                                         wrote:
>>>>>
>>>>>                                         Hello Rafael
>>>>>
>>>>>
>>>>>                                             I think the important
>>>>>                                             log lines are these:
>>>>>
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,407 INFO
>>>>>                                             [c.c.v.VirtualMachineManagerImpl]
>>>>>                                             (Work-Job-Executor-156:ctx-14b86f5e
>>>>>                                             job-60216/job-191380
>>>>>                                             ctx-175d37df)
>>>>>                                             Migration cancelled
>>>>>                                             because state has
>>>>>                                             changed:
>>>>>                                             VM[User|i-638-1736-VM]
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,407 DEBUG
>>>>>                                             [c.c.v.VirtualMachineManagerImpl]
>>>>>                                             (Work-Job-Executor-156:ctx-14b86f5e
>>>>>                                             job-60216/job-191380
>>>>>                                             ctx-175d37df) Unable
>>>>>                                             to migrate VM due to:
>>>>>                                             Migration
>>>>>                                             cancelled because
>>>>>                                             state has changed:
>>>>>                                             VM[User|i-638-1736-VM]
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,427 DEBUG
>>>>>                                             [c.c.c.CapacityManagerImpl]
>>>>>                                             (Work-Job-Executor-156:ctx-14b86f5e
>>>>>                                             job-60216/job-191380
>>>>>                                             ctx-175d37df) VM state
>>>>>                                             transitted from
>>>>>                                             :Running to
>>>>>                                             Stopping with event:
>>>>>                                             StopRequestedvm's
>>>>>                                             original host id: 224
>>>>>                                             new host
>>>>>                                             id:
>>>>>                                             15 host id before
>>>>>                                             state transition: 15
>>>>>
>>>>>                                             The really disturbing
>>>>>                                             thing is that CS stops
>>>>>                                             the VM again if we
>>>>>                                             start it manually,
>>>>>                                             apparently because the
>>>>>                                             migration job is still
>>>>>                                             running.
>>>>>
>>>>>                                             Note that the VM HA
>>>>>                                             flag is enabled.
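>>>>>
>>>>>                                             If that is what happens,
>>>>>                                             the feedback loop would
>>>>>                                             look roughly like this toy
>>>>>                                             sketch (purely
>>>>>                                             illustrative, not actual
>>>>>                                             CloudStack code): as long
>>>>>                                             as the pending migration
>>>>>                                             work item is not cleared,
>>>>>                                             every retry ends with the
>>>>>                                             VM being stopped again, no
>>>>>                                             matter how often we start
>>>>>                                             it by hand.
>>>>>
>>>>>                                             // Purely illustrative: a pending "migrate away" work item that is
>>>>>                                             // retried until it succeeds, stopping the VM on every failed attempt.
>>>>>                                             import java.util.ArrayDeque;
>>>>>                                             import java.util.Queue;
>>>>>
>>>>>                                             public class PendingHaWorkSketch {
>>>>>
>>>>>                                                 static class HaWork {
>>>>>                                                     final String vmName;
>>>>>                                                     HaWork(String vmName) { this.vmName = vmName; }
>>>>>                                                 }
>>>>>
>>>>>                                                 public static void main(String[] args) {
>>>>>                                                     Queue<HaWork> pending = new ArrayDeque<>();
>>>>>                                                     pending.add(new HaWork("i-638-1736-VM"));
>>>>>
>>>>>                                                     for (int pass = 1; pass <= 3 && !pending.isEmpty(); pass++) {
>>>>>                                                         HaWork work = pending.peek();
>>>>>                                                         boolean migrated = false;   // keeps failing: no suitable host
>>>>>                                                         if (migrated) {
>>>>>                                                             pending.poll();         // only then is the item cleared
>>>>>                                                         } else {
>>>>>                                                             // the operator may have started the VM again, but the
>>>>>                                                             // retry stops it once more
>>>>>                                                             System.out.println("pass " + pass + ": StopRequested for " + work.vmName);
>>>>>                                                         }
>>>>>                                                     }
>>>>>                                                 }
>>>>>                                             }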
>>>>>
>>>>>
>>>>>
>>>>>                                             Thank you.
>>>>>
>>>>>
>>>>>                                             Best Regards.
>>>>>
>>>>>                                             Francois
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>                                             more details:
>>>>>
>>>>>
>>>>>
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,148 DEBUG
>>>>>                                             [o.a.c.e.o.VolumeOrchestrator]
>>>>>                                             (Work-Job-Executor-156:ctx-14b86f5e
>>>>>                                             job-60216/job-191380
>>>>>                                             ctx-175d37df)
>>>>>                                             Preparing 3 volumes for
>>>>>                                             VM[User|i-638-1736-VM]
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,155 DEBUG
>>>>>                                             [c.c.v.VmWorkJobDispatcher]
>>>>>                                             (Work-Job-Executor-122:ctx-a8bacc64
>>>>>                                             job-14953/job-191351)
>>>>>                                             Done with run of VM
>>>>>                                             work job:
>>>>>                                             com.cloud.vm.VmWorkMigrateAway
>>>>>                                             for VM 1411, job
>>>>>                                             origin: 14953
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,155 DEBUG
>>>>>                                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                                             (Work-Job-Executor-122:ctx-a8bacc64
>>>>>                                             job-14953/job-191351)
>>>>>                                             Done executing
>>>>>                                             com.cloud.vm.VmWorkMigrateAway
>>>>>                                             for
>>>>>                                             job-191351
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,160 INFO
>>>>>                                             [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>                                             (Work-Job-Executor-122:ctx-a8bacc64
>>>>>                                             job-14953/job-191351)
>>>>>                                             Remove job-191351 from
>>>>>                                             job monitoring
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,168 INFO
>>>>>                                             [c.c.h.HighAvailabilityManagerImpl]
>>>>>                                             (HA-Worker-2:ctx-f3bcf43c
>>>>>                                             work-1989)
>>>>>                                             Completed
>>>>>                                             HAWork[1989-Migration-1411-Running-Migrating]
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,246 DEBUG
>>>>>                                             [c.c.a.m.ClusteredAgentAttache]
>>>>>                                             (Work-Job-Executor-156:ctx-14b86f5e
>>>>>                                             job-60216/job-191380
>>>>>                                             ctx-175d37df) Seq
>>>>>                                             19-8526158519542516375:
>>>>>                                             Forwarding
>>>>>                                             Seq
>>>>>                                             19-8526158519542516375:
>>>>>                                             { Cmd , MgmtId:
>>>>>                                             345049103441, via:
>>>>>                                             19(ewcstack-vh010-prod),
>>>>>                                             Ver: v1, Flags: 100111, [{
>>>>>                                             "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>>>                                             :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>>>                                             ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>>>                                             869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>>>                                             (64-bit)","platformEmulator":"Other
>>>>>                                             install
>>>>>                                             media","bootArgs":"","enable
>>>>>                                             HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>>>                                             "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>>>                                             ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>>>                                             rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>>>                                             age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
>>>>>                                             hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>>>                                             -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>>>                                             stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>>>                                             4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>>>                                             apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f
>>>>>                                             6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>>>                                             t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>>>                                             ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>>>                                             7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>>>                                             ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3
>>>>>                                             f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>>>                                             :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>>>                                             gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>>>                                             XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>>>                                             -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>>>                                             storagePort":"0",
>>>>>                                             "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>>>                                             a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>>>                                             ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>>>                                             ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>>>                                             ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
>>>>>                                             poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>>>                                             f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>>>                                             ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>>>                                             UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>>>                                             DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2
>>>>>                                             2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>>>                                             ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>>>                                             deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>>>                                             path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>>>                                             K","_details":{"managed":"false","storagePort":"0","stor
>>>>>                                             ageHost":"localhos
>>>>>                                             t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>>>                                             tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>>>                                             8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>>>                                             apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>>>                                             :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup","
>>>>>                                             host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>>>                                             13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>>>                                             -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>>>
>>>>>                                             e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>>>                                             ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>>>                                             ,"volumeId":86
>>>>>                                             57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>>>                                             provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>>>                                             ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>>>                                             20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>>>                                             false","storagePort":"0","storageHost":"localhost","volumeSize":"42949
>>>>>                                             6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>>>                                             teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>>>                                             id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>>>                                             ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS
>>>>>                                             4.5
>>>>>                                             (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>>>                                             ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>>>                                             Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>>>                                             d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>>>                                             a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>>>                                             gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>>>                                             .8.8","dns2":"8.8
>>>>>                                             .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>>>                                             vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>>>                                             abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}]
>>>>>                                             } to
>>>>>                                             345049098498
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,399 DEBUG
>>>>>                                             [c.c.a.t.Request]
>>>>>                                             (AgentManager-Handler-10:null)
>>>>>                                             Seq
>>>>>                                             19-8526158519542516375:
>>>>>                                             Processing:
>>>>>                                             {
>>>>>                                             Ans: , MgmtId:
>>>>>                                             345049103441, via: 19,
>>>>>                                             Ver: v1, Flags: 110,
>>>>>                                             [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>>>                                             true,"wait":0}}]
>>>>>                                             }
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,399 DEBUG
>>>>>                                             [c.c.a.m.AgentAttache]
>>>>>                                             (AgentManager-Handler-10:null)
>>>>>                                             Seq
>>>>>                                             19-8526158519542516375:
>>>>>                                             No more
>>>>>                                             commands
>>>>>                                             found
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,399 DEBUG
>>>>>                                             [c.c.a.t.Request]
>>>>>                                             (Work-Job-Executor-156:ctx-14b86f5e
>>>>>                                             job-60216/job-191380
>>>>>                                             ctx-175d37df)
>>>>>                                             Seq
>>>>>                                             19-8526158519542516375:
>>>>>                                             Received:  { Ans: ,
>>>>>                                             MgmtId: 345049103441,
>>>>>                                             via:
>>>>>                                             19, Ver: v1, Flags:
>>>>>                                             110, {
>>>>>                                             PrepareForMigrationAnswer
>>>>>                                             } }
>>>>>
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,407 INFO
>>>>>                                             [c.c.v.VirtualMachineManagerImpl]
>>>>>                                             (Work-Job-Executor-156:ctx-14b86f5e
>>>>>                                             job-60216/job-191380
>>>>>                                             ctx-175d37df)
>>>>>                                             Migration cancelled
>>>>>                                             because state has
>>>>>                                             changed:
>>>>>                                             VM[User|i-638-1736-VM]
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,407 DEBUG
>>>>>                                             [c.c.v.VirtualMachineManagerImpl]
>>>>>                                             (Work-Job-Executor-156:ctx-14b86f5e
>>>>>                                             job-60216/job-191380
>>>>>                                             ctx-175d37df) Unable
>>>>>                                             to migrate VM due to:
>>>>>                                             Migration
>>>>>                                             cancelled because
>>>>>                                             state has changed:
>>>>>                                             VM[User|i-638-1736-VM]
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,427 DEBUG
>>>>>                                             [c.c.c.CapacityManagerImpl]
>>>>>                                             (Work-Job-Executor-156:ctx-14b86f5e
>>>>>                                             job-60216/job-191380
>>>>>                                             ctx-175d37df) VM state
>>>>>                                             transitted from
>>>>>                                             :Running to
>>>>>                                             Stopping with event:
>>>>>                                             StopRequestedvm's
>>>>>                                             original host id: 224
>>>>>                                             new host
>>>>>                                             id:
>>>>>                                             15 host id before
>>>>>                                             state transition: 15
>>>>>
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:28,438 DEBUG
>>>>>                                             [c.c.a.m.ClusteredAgentAttache]
>>>>>                                             (Work-Job-Executor-156:ctx-14b86f5e
>>>>>                                             job-60216/job-191380
>>>>>                                             ctx-175d37df) Seq
>>>>>                                             15-6889944479923437643:
>>>>>                                             Forwarding
>>>>>                                             Seq
>>>>>                                             15-6889944479923437643:
>>>>>                                             { Cmd , MgmtId:
>>>>>                                             345049103441, via:
>>>>>                                             15(ewcstack-vh011-prod),
>>>>>                                             Ver: v1, Flags: 100011,
>>>>>                                             [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>                                             eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>>>                                             638-1736-VM","wait":0}}]
>>>>>                                             } to 345049098498
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:30,335 DEBUG
>>>>>                                             [c.c.a.m.AgentManagerImpl]
>>>>>                                             (AgentManager-Handler-1:null)
>>>>>                                             SeqA
>>>>>                                             254-800498:
>>>>>                                             Processing Seq
>>>>>                                             254-800498:  { Cmd ,
>>>>>                                             MgmtId: -1, via: 254,
>>>>>                                             Ver: v1,
>>>>>                                             Flags:
>>>>>                                             11,
>>>>>                                             [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>>>                                             proxyVmId":1530,"_loadInfo":"{\n
>>>>>                                             \"connections\":
>>>>>                                             []\n}","wait":0}}] }
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:30,340 DEBUG
>>>>>                                             [c.c.a.m.AgentManagerImpl]
>>>>>                                             (AgentManager-Handler-1:null)
>>>>>                                             SeqA
>>>>>                                             254-800498:
>>>>>                                             Sending Seq
>>>>>                                             254-800498:  { Ans: ,
>>>>>                                             MgmtId: 345049103441,
>>>>>                                             via: 254, Ver:
>>>>>                                             v1,
>>>>>                                             Flags: 100010,
>>>>>                                             [{"com.cloud.agent.api.AgentCo
>>>>>                                             ntrolAnswer":{"result":true,"wait":0}}]
>>>>>                                             }
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,447 DEBUG
>>>>>                                             [c.c.a.t.Request]
>>>>>                                             (AgentManager-Handler-11:null)
>>>>>                                             Seq
>>>>>                                             15-6889944479923437632:
>>>>>                                             Processing:
>>>>>                                             {
>>>>>                                             Ans: , MgmtId:
>>>>>                                             345049103441, via: 15,
>>>>>                                             Ver: v1, Flags: 10,
>>>>>                                             [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>>>                                             e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>>>                                             ":true,"details":"Stop
>>>>>                                             VM i-638-1735-VM
>>>>>                                             Succeed","wait":0}}] }
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,447 DEBUG
>>>>>                                             [c.c.a.t.Request]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae)
>>>>>                                             Seq
>>>>>                                             15-6889944479923437632:
>>>>>                                             Received:  { Ans: ,
>>>>>                                             MgmtId: 345049103441,
>>>>>                                             via:
>>>>>                                             15, Ver: v1, Flags:
>>>>>                                             10, { StopAnswer } }
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,470 DEBUG
>>>>>                                             [c.c.v.VirtualMachineManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae)
>>>>>                                             VM[User|i-638-1735-VM]
>>>>>                                             is stopped on
>>>>>                                             the
>>>>>                                             host. Proceeding to
>>>>>                                             release resource held.
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,475 DEBUG
>>>>>                                             [c.c.n.NetworkModelImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae) Service
>>>>>                                             SecurityGroup is not
>>>>>                                             supported
>>>>>                                             in the network id=519
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,478 DEBUG
>>>>>                                             [o.a.c.e.o.NetworkOrchestrator]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae) Changing
>>>>>                                             active number of nics for
>>>>>                                             network id=519 on -1
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,488 DEBUG
>>>>>                                             [o.a.c.e.o.NetworkOrchestrator]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae) Asking
>>>>>                                             VpcVirtualRouter to
>>>>>                                             release
>>>>>                                             NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>>>                                             92.168.1.11-null
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,488 DEBUG
>>>>>                                             [c.c.v.VirtualMachineManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae)
>>>>>                                             Successfully released
>>>>>                                             network
>>>>>                                             resources
>>>>>                                             for the vm
>>>>>                                             VM[User|i-638-1735-VM]
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,488 DEBUG
>>>>>                                             [c.c.v.VirtualMachineManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae)
>>>>>                                             Successfully released
>>>>>                                             storage
>>>>>                                             resources
>>>>>                                             for the vm
>>>>>                                             VM[User|i-638-1735-VM]
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,496 DEBUG
>>>>>                                             [c.c.c.CapacityManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae) VM state
>>>>>                                             transitted from
>>>>>                                             :Stopping to
>>>>>                                             Stopped with event:
>>>>>                                             OperationSucceededvm's
>>>>>                                             original host id: 15
>>>>>                                             new host
>>>>>                                             id: null host id
>>>>>                                             before state
>>>>>                                             transition: 15
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,505 DEBUG
>>>>>                                             [c.c.c.CapacityManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae) Hosts's
>>>>>                                             actual total CPU:
>>>>>                                             82976 and
>>>>>                                             CPU
>>>>>                                             after applying
>>>>>                                             overprovisioning: 331904
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,505 DEBUG
>>>>>                                             [c.c.c.CapacityManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae) Hosts's
>>>>>                                             actual total RAM:
>>>>>                                             261961927808
>>>>>                                             and RAM after applying
>>>>>                                             overprovisioning:
>>>>>                                             261961924608
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,505 DEBUG
>>>>>                                             [c.c.c.CapacityManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae) release
>>>>>                                             cpu from host: 15, old
>>>>>                                             used:
>>>>>                                             71300,reserved: 2100,
>>>>>                                             actual total: 82976,
>>>>>                                             total with
>>>>>                                             overprovisioning:
>>>>>                                             331904; new used:
>>>>>                                             67100,reserved:6300;
>>>>>                                             movedfromreserved:
>>>>>                                             false,moveToReserveredtrue
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,505 DEBUG
>>>>>                                             [c.c.c.CapacityManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae) release
>>>>>                                             mem from host: 15, old
>>>>>                                             used:
>>>>>                                             98784247808,reserved:
>>>>>                                             1073741824, total:
>>>>>                                             261961924608; new used:
>>>>>                                             94489280512,reserved:5368709120
>>>>>                                             <tel:5368709120>
>>>>>                                             <(53)%206870-9120>;
>>>>>                                             movedfromreserved:
>>>>>
>>>>>                                             false,moveToReserveredtrue
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,538 DEBUG
>>>>>                                             [c.c.n.NetworkModelImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae) Service
>>>>>                                             SecurityGroup is not
>>>>>                                             supported
>>>>>                                             in the network id=519
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,557 DEBUG
>>>>>                                             [c.c.n.NetworkModelImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae) Service
>>>>>                                             SecurityGroup is not
>>>>>                                             supported
>>>>>                                             in the network id=519
>>>>>
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,588 ERROR
>>>>>                                             [c.c.v.VmWorkJobHandlerProxy]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae)
>>>>>                                             Invocation exception,
>>>>>                                             caused by:
>>>>>                                             com.cloud.utils.exception.CloudRuntimeException:
>>>>>                                             Unable to migrate
>>>>>                                             VM[User|i-638-1735-VM]
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,589 INFO
>>>>>                                             [c.c.v.VmWorkJobHandlerProxy]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379
>>>>>                                             ctx-bbabfeae) Rethrow
>>>>>                                             exception
>>>>>                                             com.cloud.utils.exception.CloudRuntimeException:
>>>>>                                             Unable to migrate
>>>>>                                             VM[User|i-638-1735-VM]
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,589 DEBUG
>>>>>                                             [c.c.v.VmWorkJobDispatcher]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379)
>>>>>                                             Done with run of VM
>>>>>                                             work job:
>>>>>                                             com.cloud.vm.VmWorkMigrateAway
>>>>>                                             for VM 1735, job
>>>>>                                             origin: 13733
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,589 ERROR
>>>>>                                             [c.c.v.VmWorkJobDispatcher]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379)
>>>>>                                             Unable to complete
>>>>>                                             AsyncJobVO {id:191379,
>>>>>                                             userId:
>>>>>                                             1,
>>>>>                                             accountId: 1,
>>>>>                                             instanceType: null,
>>>>>                                             instanceId: null, cmd:
>>>>>                                             com.cloud.vm.VmWorkMigrateAway,
>>>>>                                             cmdInfo:
>>>>>                                             rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>>>                                             1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>>>                                             0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>>>                                             VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>>>                                             t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>>>                                             FuYWdlckltcGwAAAAAAAAA4A,
>>>>>                                             cmdVersion: 0, status:
>>>>>                                             IN_PROGRESS,
>>>>>                                             processStatus: 0,
>>>>>                                             resultCode: 0,
>>>>>                                             result: null,
>>>>>                                             initMsid:
>>>>>                                             345049098498,
>>>>>                                             completeMsid: null,
>>>>>                                             lastUpdated:
>>>>>                                             null, lastPolled:
>>>>>                                             null, created: Tue Jan
>>>>>                                             24 18:05:00 CET 2017}, job
>>>>>                                             origin:13733
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             com.cloud.utils.exception.CloudRuntimeException:
>>>>>                                             Unable to migrate
>>>>>                                             VM[User|i-638-1735-VM]
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             com.cloud.vm.VirtualMachineMan
>>>>>                                             agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             com.cloud.vm.VirtualMachineMan
>>>>>                                             agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             sun.reflect.GeneratedMethodAcc
>>>>>                                             essor402.invoke(Unknown
>>>>>                                             Source)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             sun.reflect.DelegatingMethodAc
>>>>>                                             cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             java.lang.reflect.Method.invok
>>>>>                                             e(Method.java:606)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             com.cloud.vm.VmWorkJobHandlerP
>>>>>                                             roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             com.cloud.vm.VirtualMachineMan
>>>>>                                             agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             com.cloud.vm.VmWorkJobDispatch
>>>>>                                             er.runJob(VmWorkJobDispatcher.java:103)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             org.apache.cloudstack.framewor
>>>>>                                             k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>>                                             erImpl.java:537)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             org.apache.cloudstack.managed.
>>>>>                                             context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             org.apache.cloudstack.managed.
>>>>>                                             context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>>                                             ext.java:56)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             org.apache.cloudstack.managed.
>>>>>                                             context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>>                                             nagedContext.java:103)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             org.apache.cloudstack.managed.
>>>>>                                             context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>>                                             agedContext.java:53)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             org.apache.cloudstack.managed.
>>>>>                                             context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             org.apache.cloudstack.framewor
>>>>>                                             k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             java.util.concurrent.Executors
>>>>>                                             $RunnableAdapter.call(Executors.java:471)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             java.util.concurrent.FutureTas
>>>>>                                             k.run(FutureTask.java:262)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             java.util.concurrent.ThreadPoo
>>>>>                                             lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             java.util.concurrent.ThreadPoo
>>>>>                                             lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>                                             ewcstack-man02-prod: 
>>>>>                                               at
>>>>>                                             java.lang.Thread.run(Thread.java:745)
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,590 DEBUG
>>>>>                                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379)
>>>>>                                             Complete async
>>>>>                                             job-191379, jobStatus:
>>>>>                                             FAILED,
>>>>>                                             resultCode: 0, result:
>>>>>                                             rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>>>                                             MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>>>                                             ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>>>                                             ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>>>                                             ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>>>                                             F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>>>                                             FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>>>                                             JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>>>                                             RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>>>                                             ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>>>                                             50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>>>                                             xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>>>                                             NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>>>                                             lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>>>                                             5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>>>                                             1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>>>                                             90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>>>                                             ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>>>                                             hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>>>                                             wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>>>                                             hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>>>                                             ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>>>                                             94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>>>                                             AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>>>                                             cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>>>                                             AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>>>                                             YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>>>                                             ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>>>                                             YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>>>                                             dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>>>                                             dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>>>                                             Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>>>                                             dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>>>                                             cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>>>                                             TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>>>                                             ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>>>                                             LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>>>                                             dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>>>                                             ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>>>                                             JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>>>                                             AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>>>                                             ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>>>                                             ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>>>                                             YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>>>                                             ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>>>                                             AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>>>                                             JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>>>                                             jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>>>                                             bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>>>                                             dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>>>                                             x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,592 DEBUG
>>>>>                                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379)
>>>>>                                             Publish async
>>>>>                                             job-191379 complete on
>>>>>                                             message bus
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,592 DEBUG
>>>>>                                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379)
>>>>>                                             Wake up jobs related
>>>>>                                             to job-191379
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,592 DEBUG
>>>>>                                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379)
>>>>>                                             Update db status for
>>>>>                                             job-191379
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,594 DEBUG
>>>>>                                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379)
>>>>>                                             Wake up jobs joined
>>>>>                                             with job-191379 and
>>>>>                                             disjoin
>>>>>                                             all
>>>>>                                             subjobs created from
>>>>>                                             job- 191379
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,604 DEBUG
>>>>>                                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379)
>>>>>                                             Done executing
>>>>>                                             com.cloud.vm.VmWorkMigrateAway
>>>>>                                             for
>>>>>                                             job-191379
>>>>>                                             ewcstack-man02-prod:
>>>>>                                             2017-01-24
>>>>>                                             18:05:32,608 INFO
>>>>>                                             [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>                                             (Work-Job-Executor-115:ctx-0d5db528
>>>>>                                             job-13733/job-191379)
>>>>>                                             Remove job-191379 from
>>>>>                                             job monitoring
>>>>>
>>>>>
>>>>>
>>>>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>>>>
>>>>> Did you check the ACS log files looking for some exceptions when the migrations for this specific VM started?
>>>>>

-- 


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheurer@everyware.ch
web: http://www.everyware.ch


Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Rafael Weingärtner <ra...@gmail.com>.
Your update seems reasonable. The methods that use the “step” data are the
following:

   - com.cloud.vm.VirtualMachineManagerImpl.checkWorkItems(VMInstanceVO,
   State)
   - com.cloud.vm.VirtualMachineManagerImpl.cleanup(VirtualMachineGuru,
   VirtualMachineProfileItWorkVO, Event, boolean)
   - com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(String,
   Map<Param, Object>, DeploymentPlan, DeploymentPlanner)


Basically, all of them would stop the flow of execution if the step is
defined as “done”. What I would do before running your “update” query is
checking the current steps of the VMs that were having problems, just to
make sure you are not cutting some processes that may be important. After
that, I would probably do the same as you and update them to done. I am
assuming you already removed the HA from VMs, right?
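For reference, a read-only check along those lines could look like this (a sketch only, assuming the stock 4.5 "cloud" schema; the instance ids are just the ones mentioned earlier in this thread):

  -- Inspect the pending work items for the affected VMs before updating anything.
  SELECT w.id, w.instance_id, w.type, w.step, w.mgmt_server_id, w.created_at,
         v.instance_name, v.state
  FROM op_it_work w
  LEFT JOIN vm_instance v ON v.id = w.instance_id
  WHERE w.step != 'Done'
    AND w.instance_id IN (1177, 1178, 1240, 1690, 1736)
  ORDER BY w.instance_id, w.created_at;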


On Wed, Feb 8, 2017 at 5:32 AM, Francois Scheurer <
francois.scheurer@everyware.ch> wrote:

> Dear Rafael
>
>
> You are right about cloudmonkey with CS 4.5.2 being able to update the
> ha_enable flag on VM's. Thank you for that.
>
> Back to the central question, do you think it is safe to run this update
> while all MS's are stopped?:
>   update op_it_work set step="Done" where step!="Done" and instance_id=<X>;
>
> Where <X> points to an expunged VM or to the specific VM's that are
> intermittently being stopped.
>
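A minimal sketch of that scoped form (assuming the stock "cloud" schema; 1736 is just one of the instance ids named in this thread, substitute the id you actually mean):

  -- Keep a copy of the rows first, so the change can be reverted if needed.
  CREATE TABLE op_it_work_backup AS SELECT * FROM op_it_work;
  -- Scope the update to one known instance id instead of every pending row.
  UPDATE op_it_work SET step = 'Done' WHERE step != 'Done' AND instance_id = 1736;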
> If you are not sure about it, maybe you know somebody else who can help
> us with this?
>
>
> Best Regards
>
> Francois Scheurer
>
>
>
>
>
>
> On 07.02.2017 17:47, Rafael Weingärtner wrote:
>
> Got the idea of your workaround now.
>
> I have not paid much attention to the table data you posted (sorry for
> that). Well, it seems you have it all taken care of.
>
> I have a suggestion to make; instead of changing data directly on the
> database, what about using CloudMonkey to update the HA flag to false on
> VMs? Currently, it is the UI that is blocking such customization, but
> according to the API, you should be able to do that.
> https://cloudstack.apache.org/api/apidocs-4.5/root_admin/updateVirtualMachine.html
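If you go that route, a quick read-only check afterwards can confirm that the API call actually cleared the flag, without editing the table by hand (a sketch, assuming the stock "cloud" schema and the instance ids discussed in this thread):

  -- Verify the HA flag after calling updateVirtualMachine through the API.
  SELECT id, instance_name, ha_enabled
  FROM vm_instance
  WHERE removed IS NULL
    AND id IN (1177, 1178, 1240, 1690, 1736);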
>
> On Tue, Feb 7, 2017 at 11:25 AM, Francois Scheurer <
> <fr...@everyware.ch> wrote:
>
>> Dear Rafael
>>
>>
>> We changed it on the VM to avoid the required stop and start. Changing the
>> Offering in use does not affect vm_instance.ha_enabled.
>> We could create a new Offering without HA, but the VM needs to be stopped
>> before we can apply it.
>>
>> We did not disable HA on these 4 VM's to solve the problem, but just to
>> minimize its impact (now only i-189-1177-VM should be stopped
>> intermittently) in case the cleaning of op_it_work does not fix the issue.
>> Of course we could filter the update query with “instance id”, but if you
>> look at the table output, all instance_id values are related to the 5
>> problematic VM's.
>> The only exceptions are the first 3 rows, which are related to VM's that
>> have not existed anymore since July 2015 ...
>> That is why I thought it may be safe to do.
>>
>>
>> Best Regards
>>
>> Francois Scheurer
>>
>>
>>
>>
>>
>>
>>
>>
>> On 07.02.2017 16:27, Rafael Weingärtner wrote:
>>
>> So, you changed the flags directly on "vm_instance". Why did you not
>> change the service offering of these VMs using the ACS UI or API? For me, it
>> feels that the best approach would be to change the service offering of the
>> affected VMs (the HA flag is first configured at the service offering). You
>> could assign a service offering without HA; I mean, it would be a matter of
>> creating a similar service offering (same configurations as the one they
>> are using right now), but without HA.
>>
>> I am assuming that the change directly on “vm_instance” did not solve
>> your problem, right? That is why you want to change some data at
>> “op_it_work”. I did a quick look at “op_it_work”, and it seems that this
>> table may be used for more things than HA. Here is a tip for complex
>> production systems: never run update SQLs in production environments without
>> a very, very restrictive and well-thought-out “where” clause. In my opinion
>> this is not restrictive enough (update op_it_work set step="Done" where
>> step!="Done"). I would advise you not to go further and change
>> configurations directly on database (unless you have a dev team that is
>> capable of reading, debugging, and working with the ACS source code). ACS has over
>> 500k lines of java code, and God knows how many lines of shell, python and
>> other languages, software and systems that may be used; it is pretty hard
>> sometimes to get a clear picture of how a change in the database will
>> reverberate in the whole system without knowing in detail the installation
>> configurations and underlying structures (networking, hypervisors, storage
>> systems and other pieces). However, if you want to proceed with that, I
>> would suggest adding some extra filters to that “where” clause. What
>> about adding the “instance id” there?
>>
>> PS: BTW, have you already heard about the Cloudstack Fork from Schuberg
>> Philis named Cosmic? https://github.com/MissionCriticalCloud/cosmic
>> Yes, I have. They were part of the active ACS community back then. I was
>> already here when they created the fork. Still, I stayed with ACS.
>>
>>
>> On Tue, Feb 7, 2017 at 3:54 AM, Francois Scheurer <
>> francois.scheurer@everyware.ch> wrote:
>>
>>> Dear Rafael
>>>
>>>
>>> We disabled temporarily the HA Flag on these VM's by updating the DB
>>> directly: update vm_instance set ha_enabled=0 where removed is null and
>>> id=1178 or id=1240 or id=1690 or id=1736;
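>>>
>>> A side note for anyone replaying that statement: since AND binds tighter
>>> than OR in MySQL, the filter as written applies "removed is null" only to
>>> id=1178. An equivalent sketch using an IN list avoids that surprise:
>>>
>>>   update vm_instance set ha_enabled=0
>>>   where removed is null and id in (1178, 1240, 1690, 1736);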
>>>
>>> We kept the test VM i-189-1177-VM with HA Flag.
>>>
>>> Restarting the MS + cleaning the op_it_work table will hopefully fix the
>>> issue. And if the issue persists, then only the i-189-1177-VM should
>>> be affected, hence mitigating the impact.
>>>
>>> On the other hand, if everything runs fine, we would thereafter re-enable
>>> the HA Flag on the problematic VM's.
>>>
>>>
>>> Do you think it is safe to run this update?: update op_it_work set
>>> step="Done" where step!="Done";
>>>
>>>
>>> Best Regards
>>>
>>> Francois Scheurer
>>>
>>>
>>>
>>> PS: BTW, have you already heard about the Cloudstack Fork from Schuberg
>>> Philis named Cosmic? https://github.com/MissionCriticalCloud/cosmic
>>>
>>>
>>>
>>>
>>>
>>> On 06.02.2017 16:41, Rafael Weingärtner wrote:
>>>
>>> I totally understand your position of not being able to debug the ACS in
>>> production ;)
>>>
>>>
>>> There is one bit I did not quite understand. If you have disabled the HA
>>> for VMs, why do you still need to change data in the database? How did you
>>> disable the HA of the VMs?
>>>
>>> On Mon, Feb 6, 2017 at 5:04 AM, Francois Scheurer <
>>> francois.scheurer@everyware.ch> wrote:
>>>
>>>> Dear Rafael / Dear CS contributors
>>>>
>>>>
>>>> I tried the following:
>>>>
>>>> 1.     stop all MS
>>>> 2.     backup DB
>>>> 3.     select * from op_ha_work where type="Migration" and
>>>> mgmt_server_id is null and taken is null;
>>>> 4.     delete from op_ha_work where type="Migration" and mgmt_server_id
>>>> is null and taken is null;
>>>> 5.     start all MS
>>>>
>>>> But actually after the MS stopped the op_ha_work table was
>>>> automatically cleaned, making my steps useless.
>>>> After the MS started the op_ha_work table was filled again and related
>>>> VM's were stopped several times (3-7 times), requiring high attention to
>>>> manually restart them on each stop.
>>>>
>>>> I later found another table that could be the cause of the issue:
>>>>
>>>> mysql> select op_it_work.*,vm_instance.instance_name,vm_instance.removed
>>>> from op_it_work join vm_instance where op_it_work.step!="Done" and
>>>> op_it_work.instance_id=vm_instance.id order by instance_id;
>>>> +--------------------------------------+----------------+---
>>>> ---------+-----------------------+-----------+--------------
>>>> ------+-----------+------------+-------------+--------------
>>>> -+-------------+---------------+---------------------+
>>>> | id                                   | mgmt_server_id | created_at |
>>>> thread                | type      | vm_type            | step      |
>>>> updated_at | instance_id | resource_type | resource_id | instance_name |
>>>> removed             |
>>>> +--------------------------------------+----------------+---
>>>> ---------+-----------------------+-----------+--------------
>>>> ------+-----------+------------+-------------+--------------
>>>> -+-------------+---------------+---------------------+
>>>> | be4d0dd2-7d4f-47d3-ba49-623d2f115bbf |   345049098498 | 1436187965 | consoleproxy-1        | Starting  | ConsoleProxy       | Release   | 1436188265 |         402 | NULL          |           0 | v-402-VM      | 2015-07-06 13:23:22 |
>>>> | fb4d1844-eb92-478c-9f50-d666790f43af |   345049098498 | 1436187965 | secstorage-1          | Starting  | SecondaryStorageVm | Release   | 1436188265 |         687 | NULL          |           0 | s-687-VM      | 2015-07-06 13:11:34 |
>>>> | 7dfeb25a-a705-42e8-907c-79d1e69697ed |   345049098498 | 1436964548 | Work-Job-Executor-165 | Migrating | User               | Migrating | 1436964548 |         776 | Host          |          29 | i-51-776-VM   | 2015-07-15 12:55:38 |
>>>> | 2018ac61-b11c-4101-8ef5-07a39509f2bd |   345049098498 | 1486020173 |
>>>> Work-Job-Executor-19  | Migrating | User               | Prepare   |
>>>> 1486020173 |        1177 | Host          |          25 | i-189-1177-VM |
>>>> NULL                |
>>>> | 25430cd5-4346-4643-9c7e-7b39b8a876a2 |   345049098122 | 1486019693 |
>>>> Work-Job-Executor-6   | Migrating | User               | Prepare   |
>>>> 1486019693 |        1177 | Host          |          25 | i-189-1177-VM |
>>>> NULL                |
>>>> | 3276d52e-cbd9-44c5-8ab6-c70784337ff9 |   345049098498 | 1486020953 |
>>>> Work-Job-Executor-33  | Migrating | User               | Prepare   |
>>>> 1486020953 |        1177 | Host          |          33 | i-189-1177-VM |
>>>> NULL                |
>>>> | 6462da33-83f2-4575-af38-0e52d2a48488 |   345049098122 | 1486021433 |
>>>> Work-Job-Executor-12  | Migrating | User               | Prepare   |
>>>> 1486021433 |        1177 | Host          |          25 | i-189-1177-VM |
>>>> NULL                |
>>>> | 688e16c8-c39f-4e88-9e8f-36c0282f3df1 |   345049098122 | 1486020773 |
>>>> Work-Job-Executor-11  | Migrating | User               | Prepare   |
>>>> 1486020773 |        1177 | Host          |          25 | i-189-1177-VM |
>>>> NULL                |
>>>> | d8a18e9d-ac6f-4273-923e-e401533e2cf7 |   345049098498 | 1486020353 |
>>>> Work-Job-Executor-25  | Migrating | User               | Prepare   |
>>>> 1486020353 |        1177 | Host          |          25 | i-189-1177-VM |
>>>> NULL                |
>>>> | 1222ece8-399d-43d8-8ff8-d24bdebca2ed |   345049103441 | 1486018924 |
>>>> Work-Job-Executor-2   | Migrating | User               | Prepare   |
>>>> 1486018924 |        1178 | Host          |          29 | i-189-1178-VM |
>>>> NULL                |
>>>> | 2ac5ca4a-c20d-433a-8379-40e0de3b3ba8 |   345049098498 | 1486021613 |
>>>> Work-Job-Executor-41  | Migrating | User               | Prepare   |
>>>> 1486021613 |        1178 | Host          |          29 | i-189-1178-VM |
>>>> NULL                |
>>>> | 3e3b95a0-2af4-4268-bfdd-f3399a85bcaa |   345049103441 | 1486019584 |
>>>> Work-Job-Executor-5   | Migrating | User               | Prepare   |
>>>> 1486019584 |        1178 | Host          |          29 | i-189-1178-VM |
>>>> NULL                |
>>>> | 53745a2d-056c-4a1b-9267-eb97f077a6c8 |   345049103441 | 1486020244 |
>>>> Work-Job-Executor-6   | Migrating | User               | Prepare   |
>>>> 1486020244 |        1178 | Host          |          29 | i-189-1178-VM |
>>>> NULL                |
>>>> | e3555b49-7c18-4272-b7f4-ba9351b3bbd1 |   345049098498 | 1486020893 |
>>>> Work-Job-Executor-32  | Migrating | User               | Prepare   |
>>>> 1486020893 |        1178 | Host          |          29 | i-189-1178-VM |
>>>> NULL                |
>>>> | 1dcd5204-4f9a-4342-a23b-329a48596e13 |   345049098498 | 1486020953 |
>>>> Work-Job-Executor-34  | Migrating | User               | Prepare   |
>>>> 1486020953 |        1240 | Host          |           3 | i-388-1240-VM |
>>>> NULL                |
>>>> | ec1668b3-fa3c-4673-b593-3d607b169a24 |   345049098498 | 1486019693 |
>>>> Work-Job-Executor-14  | Migrating | User               | Prepare   |
>>>> 1486019693 |        1240 | Host          |           3 | i-388-1240-VM |
>>>> NULL                |
>>>> | fa7b2dec-ed7f-4a67-a055-e6e6bc1631ca |   345049098498 | 1486020353 |
>>>> Work-Job-Executor-26  | Migrating | User               | Prepare   |
>>>> 1486020353 |        1240 | Host          |           3 | i-388-1240-VM |
>>>> NULL                |
>>>> | 8cfe6d3e-5b7b-4be8-ac10-c865a492aab9 |   345049098498 | 1465387271 |
>>>> Work-Job-Executor-18  | Running   | User               | Prepare   |
>>>> 1465387271 |        1388 | Host          |           7 | i-361-1388-VM |
>>>> NULL                |
>>>> | 6e941b7e-6f19-4717-bf82-974c59737dbc |   345049103441 | 1486019524 |
>>>> Work-Job-Executor-4   | Migrating | User               | Prepare   |
>>>> 1486019524 |        1690 | Host          |          15 | i-648-1690-VM |
>>>> NULL                |
>>>> | 718578a4-0dd6-42d9-be10-ed7680650108 |   345049098122 | 1486021433 |
>>>> Work-Job-Executor-13  | Migrating | User               | Prepare   |
>>>> 1486021433 |        1690 | Host          |          15 | i-648-1690-VM |
>>>> NULL                |
>>>> | 9acf3908-ef13-4b03-8c2b-1a7e55e18491 |   345049098498 | 1486020173 |
>>>> Work-Job-Executor-20  | Migrating | User               | Prepare   |
>>>> 1486020173 |        1690 | Host          |          15 | i-648-1690-VM |
>>>> NULL                |
>>>> | a50c420b-ae52-4caa-83db-704b52b299a4 |   345049098498 | 1486020773 |
>>>> Work-Job-Executor-28  | Migrating | User               | Prepare   |
>>>> 1486020773 |        1690 | Host          |          15 | i-648-1690-VM |
>>>> NULL                |
>>>> | 2b5f4179-5975-4326-9db7-643c3ede63fe |   345049103441 | 1484224516 |
>>>> Work-Job-Executor-161 | Running   | User               | Prepare   |
>>>> 1484224516 |        1703 | Host          |           3 | i-51-1703-VM  |
>>>> NULL                |
>>>> | 2f3c6c44-b2a0-4d10-88d2-ac143d2dd379 |   345049098498 | 1484223910 |
>>>> Work-Job-Executor-163 | Running   | User               | Prepare   |
>>>> 1484223910 |        1703 | Host          |           3 | i-51-1703-VM  |
>>>> NULL                |
>>>> | 4126a23b-1cde-49bc-8636-c0d063b85cf6 |   345049103441 | 1484224636 |
>>>> Work-Job-Executor-86  | Running   | User               | Prepare   |
>>>> 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  |
>>>> NULL                |
>>>> | 5dad5645-766b-48c9-9a85-b77b3c101390 |   345049103441 | 1484224636 |
>>>> Work-Job-Executor-92  | Running   | User               | Prepare   |
>>>> 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  |
>>>> NULL                |
>>>> | 8c1dfff9-e55d-4f81-a22f-45f72accc416 |   345049103441 | 1484224576 |
>>>> Work-Job-Executor-95  | Running   | User               | Prepare   |
>>>> 1484224576 |        1703 | Host          |           3 | i-51-1703-VM  |
>>>> NULL                |
>>>> | a703d2d7-b1e3-42ce-a551-831982990ffb |   345049098498 | 1484224630 |
>>>> Work-Job-Executor-13  | Running   | User               | Prepare   |
>>>> 1484224630 |        1703 | Host          |           3 | i-51-1703-VM  |
>>>> NULL                |
>>>> | 0e194cb5-95b7-41cb-969a-ae37d2ad449a |   345049098498 | 1485763613 |
>>>> Work-Job-Executor-28  | Migrating | User               | Prepare   |
>>>> 1485763613 |        1735 | Host          |          19 | i-638-1735-VM |
>>>> 2017-01-30 15:24:10 |
>>>> | 74f180b9-be0f-429b-9a45-597ca2ca997d |   345049098498 | 1485763013 |
>>>> Work-Job-Executor-21  | Migrating | User               | Prepare   |
>>>> 1485763013 |        1735 | Host          |          19 | i-638-1735-VM |
>>>> 2017-01-30 15:24:10 |
>>>> | feaa421c-88df-4493-9a4f-bddb6ee1b75b |   345049098122 | 1485764285 |
>>>> Work-Job-Executor-19  | Migrating | User               | Prepare   |
>>>> 1485764285 |        1735 | Host          |          19 | i-638-1735-VM |
>>>> 2017-01-30 15:24:10 |
>>>> | 32138a36-ad18-42b5-8478-76ad63caf4da |   345049098122 | 1486020293 |
>>>> Work-Job-Executor-9   | Migrating | User               | Prepare   |
>>>> 1486020293 |        1736 | Host          |          19 | i-638-1736-VM |
>>>> NULL                |
>>>> | 5306ab1f-3c92-4f95-bbff-283b43e6862d |   345049098498 | 1486019573 |
>>>> Work-Job-Executor-11  | Migrating | User               | Prepare   |
>>>> 1486019573 |        1736 | Host          |          19 | i-638-1736-VM |
>>>> NULL                |
>>>> | 843bdf9c-247d-432c-ac19-1a4f7858631b |   345049098498 | 1486021493 |
>>>> Work-Job-Executor-38  | Migrating | User               | Prepare   |
>>>> 1486021493 |        1736 | Host          |          19 | i-638-1736-VM |
>>>> NULL                |
>>>> | 8d400afb-766f-4ac5-89ee-854f25a38eba |   345049098498 | 1486020173 |
>>>> Work-Job-Executor-21  | Migrating | User               | Prepare   |
>>>> 1486020173 |        1736 | Host          |          19 | i-638-1736-VM |
>>>> NULL                |
>>>> | f0ae5612-b6ef-4882-94a2-40a1e6dce342 |   345049098498 | 1486020833 |
>>>> Work-Job-Executor-30  | Migrating | User               | Prepare   |
>>>> 1486020833 |        1736 | Host          |          19 | i-638-1736-VM |
>>>> NULL                |
>>>> +--------------------------------------+----------------+---
>>>> ---------+-----------------------+-----------+--------------
>>>> ------+-----------+------------+-------------+--------------
>>>> -+-------------+---------------+---------------------+
>>>>
>>>> As you see, the problematic VM's are all present on that table:
>>>>
>>>> i-189-1177-VM
>>>> i-189-1178-VM
>>>> i-388-1240-VM
>>>> i-648-1690-VM
>>>> i-638-1736-VM
>>>>
>>>> As the MS are running the production environments, we cannot debug a
>>>> lot and just need the shortest path to resolution.
>>>> For example, restarting the MS has an impact on our customers and
>>>> we need to reduce the number of restarts to a minimum.
>>>>
>>>> I have disabled the HA on all problematic VM's except on i-189-1177-VM,
>>>> which is less critical.
>>>>
>>>> The next steps could be:
>>>> 1.     stop all MS
>>>> 2.     backup DB
>>>> 3.     select op_it_work.*,vm_instance.instance_name,vm_instance.removed
>>>> from op_it_work join vm_instance where op_it_work.step!="Done" and
>>>> op_it_work.instance_id=vm_instance.id order by instance_id;
>>>> 4.     update op_it_work set step="Done" where step!="Done";
>>>> 5.     start all MS
>>>>
>>>> What do you think about step 4., safe or risky?
>>>> Some of these entries are related to deleted VM's and seem safe to
>>>> mark as "Done".
>>>> All other entries are related to the problematic VM's that keep
>>>> stopping... so marking them as "Done" may (hopefully) eliminate the cause
>>>> of the Migrating/Stopping.
>>>> But we are not sure if marking them as "Done" also requires changes on
>>>> other tables for consistency.
>>>>
>>>>
>>>> Thank you for your help.
>>>>
>>>>
>>>> Best Regards
>>>> Francois Scheurer
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 01.02.2017 03:01, Rafael Weingärtner wrote:
>>>>
>>>> Ok, I dug a little further into the code.
>>>>
>>>> If we put a host in maintenance, it may unfortunately happen that a
>>>>> timeout occurs and CS aborts the migration. As far as we can interpret the
>>>>> logs, it seems (we speculate) that CS detects that the VM is then still
>>>>> running on the original host, which is in maintenance, and decides somehow
>>>>> to stop it because of the "host not up" status.
>>>>>
>>>>
>>>> From what I have seen in the code, this may happen (as you speculated).
>>>> If a timeout or some other event interrupts migrations, the VM will be
>>>> stopped. This happens in the method
>>>> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>>> long, DeploymentPlanner)” at line 2396 and further. This was implemented on
>>>> purpose; if the migration does not happen, the VM is stopped on the source
>>>> host. Probably it was implemented this way on that philosophy of gradient
>>>> descent from artificial intelligence:
>>>>
>>>> “pick a random point in the field, update the values according to the
>>>>> slope and hopefully after successful iterations you converge to a minimum
>>>>> point.” (the problem is that sometimes the function diverges).
>>>>>
>>>>
>>>> I mean, probably this was done this way because the HA was supposed to
>>>> see that a VM with HA is stopped, and should try to start it again
>>>> somewhere else. The log you posted does not provide any other details to
>>>> debug this further because the log is after an MS restart, right?
>>>>
>>>>
>>>> After restarting the MSs I guess that a similar issue may happen: the
>>>>> hosts are disconnected during the early stage and then reconnect. Maybe CS
>>>>> then again stops the VMs because of the "host not up" status?
>>>>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>> Disconnected, host resource state is: Enabled
>>>>>
>>>>
>>>> Here is something interesting that you have found. I checked the code,
>>>> but without a live debug it is pretty hard for me to get the right flow of
>>>> execution. By looking at the code and at the logs you provided, your
>>>> deduction is right. After ACS boots up, the check for hosts that are
>>>> shutdown/deactivated starts, and the method
>>>> “com.cloud.agent.manager.AgentManagerImpl.executeUserRequest(long, Event)”
>>>> is executed. The problem is that all of the hosts are considered
>>>> disconnected (ACS may not have finished connecting to all of them yet), and
>>>> then a whole HA process is kicked off. The problem is that there is no host
>>>> with status “UP”. That is why this happens (“No suitable hosts found under
>>>> this Cluster: 25”), and then the whole flow of HA, which shuts down the VM,
>>>> is kicked off.
>>>>
>>>> In theory, the HA worker should see that a VM with HA is down, and
>>>> should try to start it. I am not understanding why this is not working. If
>>>> you decide to debug the running ACS, I may be able to help you, just keep
>>>> me posted.
>>>>
>>>>
>>>> On Tue, Jan 31, 2017 at 10:26 AM, Rafael Weingärtner <ra...@gmail.com> wrote:
>>>>
>>>>> I will be able to re-check some code and answer you with new thoughts
>>>>> on this problem tonight (the log files you provided today have some very
>>>>> insightful information regarding this problem). I believe this is a bug in
>>>>> HA, a very tricky one that has not been reported yet (at least I do not
>>>>> remember seeing anything similar). And thanks to randomness/destiny now you
>>>>> have found/uncovered a way to reproduce the problem (bug?!).
>>>>>
>>>>> I do not suggest changing any data at all directly in the database
>>>>> (unless you know the ACS code well and are 100% sure of what you are doing).
>>>>> This can cause even bigger and more complicated problems. If you remove the
>>>>> HA feature from these VMs, this problem will stop. However, if you can/are
>>>>> willing to maintain the HA until we fully uncover the problem, map and
>>>>> report it, this would be great.
>>>>> BTW: why are you restarting MSs?
>>>>> What are the configurations of your MSs?
>>>>>
>>>>> Thanks for your time and for sharing information with our community.
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer <fr...@everyware.ch> wrote:
>>>>>
>>>>>> Dear Rafael
>>>>>>
>>>>>>
>>>>>> Many thanks for your deep analysis.
>>>>>> Nice humor with the planet alignments BTW ;)
>>>>>>
>>>>>> We also suspected the HA Flag to be related to this issue, as we never
>>>>>> got this issue in the past, until we enabled the HA on all VMs and
>>>>>> Offerings.
>>>>>>
>>>>>> Here is also the workers list:
>>>>>>
>>>>>> mysql> select * from op_ha_work;
>>>>>> +------+-------------+-----------+--------------------+-----
>>>>>> -----+----------------+---------+---------------------+-----
>>>>>> --+---------------------+-------+-------------+---------+
>>>>>> | id   | instance_id | type      | vm_type            | state    |
>>>>>> mgmt_server_id | host_id | created             | tried | taken
>>>>>>  | step  | time_to_try | updated |
>>>>>> +------+-------------+-----------+--------------------+-----
>>>>>> -----+----------------+---------+---------------------+-----
>>>>>> --+---------------------+-------+-------------+---------+
>>>>>> |  105 |         343 | HA        | User               | Running  |
>>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>> Error |  1394056013 |       3 |
>>>>>> |  109 |         407 | HA        | ConsoleProxy       | Running  |
>>>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>> Error |  1394056013 |       3 |
>>>>>> |  113 |         407 | HA        | ConsoleProxy       | Running  |
>>>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>> Error |  1394056013 |       3 |
>>>>>> |  117 |         407 | HA        | ConsoleProxy       | Running  |
>>>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>> Error |  1394056013 |       3 |
>>>>>> |  121 |         343 | HA        | User               | Running  |
>>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>> Error |  1394056013 |       3 |
>>>>>> |  125 |         405 | HA        | SecondaryStorageVm | Running  |
>>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>> Error |  1394056013 |       3 |
>>>>>> |  129 |         393 | HA        | User               | Running  |
>>>>>>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>> Error |  1394056014 |      30 |
>>>>>> |  133 |         405 | HA        | SecondaryStorageVm | Running  |
>>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>> Error |  1394056014 |       3 |
>>>>>> |  137 |         402 | HA        | ConsoleProxy       | Running  |
>>>>>>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>>> Error |  1394056014 |       3 |
>>>>>> |  345 |         402 | HA        | ConsoleProxy       | Starting |
>>>>>>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
>>>>>> Error |  1402527506 |     770 |
>>>>>> |  349 |         687 | HA        | SecondaryStorageVm | Starting |
>>>>>>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
>>>>>> Error |  1402527506 |       2 |
>>>>>> | 2057 |        1178 | Migration | User               | Running  |
>>>>>>  345049098498 |      33 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>>>> Done  |  1451024774 |      95 |
>>>>>> | 2058 |        1736 | Migration | User               | Running  |
>>>>>>  345049098498 |     224 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>>>> Done  |  1451024774 |      45 |
>>>>>> | 2059 |        1240 | Migration | User               | Running  |
>>>>>>  345049098498 |       7 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>>>> Done  |  1451024774 |      61 |
>>>>>> | 2060 |        1178 | Migration | User               | Running  |
>>>>>>  345049098122 |      33 | 2017-01-31 07:56:08 | 1 | 2017-01-31 08:06:37 |
>>>>>> Done  |  1451025374 |      95 |
>>>>>> | 2061 |        1178 | Migration | User               | Stopped  |
>>>>>>        NULL |      33 | 2017-01-31 07:56:09 | 5 | NULL                |
>>>>>> Done  |  1451027900 |     109 |
>>>>>> | 2062 |        1690 | Migration | User               | Running  |
>>>>>>  345049098498 |     224 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
>>>>>> Done  |  1451024774 |     109 |
>>>>>> | 2063 |        1177 | Migration | User               | Running  |
>>>>>>  345049098498 |      29 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
>>>>>> Done  |  1451024774 |      82 |
>>>>>> | 2064 |        1690 | Migration | User               | Running  |
>>>>>>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>>>> Done  |  1451024900 |     109 |
>>>>>> | 2065 |        1240 | Migration | User               | Running  |
>>>>>>  345049098122 |       7 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>>>> Done  |  1451024900 |      61 |
>>>>>> | 2066 |        1736 | Migration | User               | Running  |
>>>>>>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>>>> Done  |  1451024900 |      45 |
>>>>>> | 2067 |        1240 | Migration | User               | Stopped  |
>>>>>>        NULL |       7 | 2017-01-31 07:56:09 | 5 | NULL                |
>>>>>> Done  |  1451028003 |      80 |
>>>>>> | 2068 |        1178 | Migration | User               | Running  |
>>>>>>  345049098122 |      33 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>>> Done  |  1451024781 |      95 |
>>>>>> | 2069 |        1736 | Migration | User               | Running  |
>>>>>>  345049098122 |     224 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>>> Done  |  1451024781 |      45 |
>>>>>> | 2070 |        1240 | Migration | User               | Running  |
>>>>>>  345049098122 |       7 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>>> Done  |  1451024781 |      61 |
>>>>>> | 2071 |        1177 | Migration | User               | Running  |
>>>>>>  345049098122 |      29 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>>> Done  |  1451024781 |      82 |
>>>>>> | 2072 |        1177 | Migration | User               | Stopped  |
>>>>>>        NULL |      29 | 2017-01-31 07:56:16 | 5 | NULL                |
>>>>>> Done  |  1451027973 |     101 |
>>>>>> | 2073 |        1177 | Migration | User               | Running  |
>>>>>>  345049098122 |      29 | 2017-01-31 07:56:16 | 1 | 2017-01-31 07:58:22 |
>>>>>> Done  |  1451024904 |      82 |
>>>>>> | 2074 |        1690 | Migration | User               | Stopped  |
>>>>>>        NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                |
>>>>>> Done  |  1451027464 |     123 |
>>>>>> | 2075 |        1736 | Migration | User               | Stopped  |
>>>>>>        NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                |
>>>>>> Done  |  1451027477 |      59 |
>>>>>> +------+-------------+-----------+--------------------+-----
>>>>>> -----+----------------+---------+---------------------+-----
>>>>>> --+---------------------+-------+-------------+---------+
>>>>>>
>>>>>> If we put a host in maintenance, it may unfortunately happen that a
>>>>>> timeout occurs and CS aborts the migration.
>>>>>> As far as we can interpret the logs, it seems (we speculate) that CS
>>>>>> detects that the VM is then still running on the original host, which is in
>>>>>> maintenance, and decides somehow to stop it because of the "host not up"
>>>>>> status.
>>>>>> We then need to start the VM several times, until CS gives up
>>>>>> stopping it.
>>>>>>
>>>>>> After restarting the MSs I guess that a similar issue may happen: the
>>>>>> hosts are disconnected during the early stage and then reconnect. Maybe CS
>>>>>> then again stops the VMs because of the "host not up" status?
>>>>>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>>> Disconnected, host resource state is: Enabled
>>>>>>
>>>>>> But the problem after the MSs restart is definitely only impacting
>>>>>> VMs listed in the Table op_ha_work.
>>>>>>
>>>>>>
>>>>>> To your questions:
>>>>>>         We have 3 ACS MSs managing several Zones. Each zone has a
>>>>>> single pod and cluster.
>>>>>>         The Cluster of i-638-1735-VM has this summary:
>>>>>>          Total number of VMs: 121
>>>>>>          Total number hypervisors: 4
>>>>>>          Total allocated RAM: 522.0 / 972 GB (53.7 %)
>>>>>>          Total allocated cores: 211
>>>>>>          Total allocated storage: 11285 GB
>>>>>>         Used Memory on the hosts is now varying from 28 to 77%, so
>>>>>> there is enough RAM.
>>>>>>         I did not check the XS logs. The same Problem has arisen this
>>>>>> morning again (shortly after a restart of all MSs), so I will look in the
>>>>>> XS logs now.
>>>>>>
>>>>>>         I have some fair experience with Eclipse and Java, I will ask
>>>>>> my boss if we can go this way.
>>>>>>
>>>>>> The main issue for now is that restarting the MSs reproduces the
>>>>>> problem, which is quite annoying for the impacted VM's.
>>>>>> I thought of stopping the MSs, deleting all HA/Migration entries in
>>>>>> op_ha_work, and starting the MSs.
>>>>>> Is that a viable option? (or should I delete the Migration entries
>>>>>> only? or never touch this table?)
>>>>>> We will also very probably remove the HA Flag on all VMs if it does
>>>>>> not work reliably.
>>>>>>
>>>>>> Best Regards
>>>>>> Francois Scheurer
>>>>>>
>>>>>>
>>>>>>
>>>>>> Here are new logs from today regarding the same issue.
>>>>>> I tried to get the important parts related to the stop of
>>>>>> i-189-1178-VM.
>>>>>>
>>>>>> regexp: (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRe
>>>>>> questedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae73
>>>>>> 3|work-2049)
>>>>>>
>>>>>> 2017-01-30 08:44:04,440 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Executing
>>>>>> AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType: null,
>>>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>>>>>> F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm
>>>>>> 1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTA
>>>>>> ALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAA
>>>>>> AAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resu
>>>>>> ltCode: 0, result: null, initMsid: 345049103441, completeMsid: null,
>>>>>> lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 2017}
>>>>>> 2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Run VM work
>>>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
>>>>>>
>>>>>> 2017-01-30 08:44:05,217 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Deploy avoids pods: [], clusters: [], hosts: [29]
>>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
>>>>>> (AgentConnectTaskPool-4:ctx-5b61f2e5) create ClusteredAgentAttache
>>>>>> for 232
>>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.UserDispersin
>>>>>> gPlanner@1734b2e2
>>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
>>>>>> requested cpu: 4200, requested ram: 4294967296
>>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Is ROOT volume READY (pool already allocated)?: Yes
>>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> This VM has last host_id specified, trying to choose the same host: 25
>>>>>>
>>>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>>> Disconnected, host resource state is: Enabled
>>>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Cannot choose the last host to deploy this VM
>>>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Searching resources only under specified Cluster: 25
>>>>>>
>>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>>>>>> prioritization: []
>>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> No suitable hosts found
>>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> No suitable hosts found under this Cluster: 25
>>>>>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Could not find suitable Deployment Destination for this VM under any
>>>>>> clusters, returning.
>>>>>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Searching resources only under specified Cluster: 25
>>>>>> 2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> The specified cluster is in avoid set, returning.
>>>>>> 2017-01-30 08:44:05,798 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
>>>>>> 2017-01-30 08:44:05,798 WARN [c.c.v.VirtualMachineManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Failed to deploy vm 1178 with original planner, sending HAPlanner
>>>>>>
>>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Deploy avoids pods: [], clusters: [], hosts: [29]
>>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
>>>>>> sPlanner@4fdf86ce
>>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
>>>>>> requested cpu: 4200, requested ram: 4294967296
>>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Is ROOT volume READY (pool already allocated)?: Yes
>>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> This VM has last host_id specified, trying to choose the same host: 25
>>>>>>
>>>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>>> Disconnected, host resource state is: Enabled
>>>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Cannot choose the last host to deploy this VM
>>>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Searching resources only under specified Cluster: 25
>>>>>> 2017-01-30 08:44:06,039 DEBUG [c.c.d.SkipHeuresticsPlanner]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Deploying vm during HA process, so skipping disable threshold check
>>>>>>
>>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>>>>>> prioritization: []
>>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> No suitable hosts found
>>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> No suitable hosts found under this Cluster: 25
>>>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Publish
>>>>>> async job-192986 complete on message bus
>>>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
>>>>>> related to job-192986
>>>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Update db
>>>>>> status for job-192986
>>>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>>> Disconnected, host resource state is: Enabled
>>>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>>> Cannot choose the last host to deploy this VM
>>>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner]
>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>>> Searching resources only under specified Cluster: 35
>>>>>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Could not find suitable Deployment Destination for this VM under any
>>>>>> clusters, returning.
>>>>>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Searching resources only under specified Cluster: 25
>>>>>> 2017-01-30 08:44:06,163 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>>>>>> SyncQueueItemVO {id:63629, queueId: 64423, contentType: AsyncJob,
>>>>>> contentId: 192991, lastProcessMsid: 345049098498, lastprocessNumber: 27,
>>>>>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
>>>>>> CET 2017}
>>>>>> 2017-01-30 08:44:06,169 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Add
>>>>>> job-192992 into job monitoring
>>>>>> 2017-01-30 08:44:06,184 DEBUG [c.c.d.SkipHeuresticsPlanner]
>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>>> Deploying vm during HA process, so skipping disable threshold check
>>>>>> 2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> The specified cluster is in avoid set, returning.
>>>>>> 2017-01-30 08:44:06,186 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
>>>>>> 2017-01-30 08:44:06,186 ERROR [c.c.v.VmWorkJobHandlerProxy]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Invocation exception, caused by: com.cloud.exception.InsufficientServerCapacityException:
>>>>>> Unable to find a server to migrate to.Scope=interface
>>>>>> com.cloud.org.Cluster; id=25
>>>>>> 2017-01-30 08:44:06,187 INFO  [c.c.v.VmWorkJobHandlerProxy]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>>> Rethrow exception com.cloud.exception.InsufficientServerCapacityException:
>>>>>> Unable to find a server to migrate to.Scope=interface
>>>>>> com.cloud.org.Cluster; id=25
>>>>>> 2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done with
>>>>>> run of VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin:
>>>>>> 15393
>>>>>> 2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Unable to
>>>>>> complete AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType:
>>>>>> null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway,
>>>>>> cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>>>> AAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ, cmdVersion:
>>>>>> 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>>>> initMsid: 345049103441, completeMsid: null, lastUpdated: null, lastPolled:
>>>>>> null, created: Mon Jan 30 08:44:03 CET 2017}, job origin:15393
>>>>>> com.cloud.exception.InsufficientServerCapacityException: Unable to
>>>>>> find a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
>>>>>>         at com.cloud.vm.VirtualMachineMan
>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2390)
>>>>>>         at com.cloud.vm.VirtualMachineMan
>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4520)
>>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>>>> Method)
>>>>>>         at sun.reflect.NativeMethodAccess
>>>>>> orImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>         at sun.reflect.DelegatingMethodAc
>>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>         at com.cloud.vm.VmWorkJobHandlerP
>>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>>         at com.cloud.vm.VirtualMachineMan
>>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>>         at com.cloud.vm.VmWorkJobDispatch
>>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>>>         at org.apache.cloudstack.framewor
>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>>> erImpl.java:537)
>>>>>>         at org.apache.cloudstack.managed.
>>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>>>         at org.apache.cloudstack.managed.
>>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>>> ext.java:56)
>>>>>>         at org.apache.cloudstack.managed.
>>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>>> nagedContext.java:103)
>>>>>>         at org.apache.cloudstack.managed.
>>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>>> agedContext.java:53)
>>>>>>         at org.apache.cloudstack.managed.
>>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>>>         at org.apache.cloudstack.framewor
>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>>>>         at java.util.concurrent.Executors
>>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>>         at java.util.concurrent.ThreadPoo
>>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>         at java.util.concurrent.ThreadPoo
>>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>         at java.lang.Thread.run(Thread.java:745)
>>>>>> 2017-01-30 08:44:06,188 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued job-192991
>>>>>> 2017-01-30 08:44:06,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
>>>>>> joined with job-192986 and disjoin all subjobs created from job- 192986
>>>>>> 2017-01-30 08:44:06,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Complete
>>>>>> async job-192989, jobStatus: FAILED, resultCode: 0, result:
>>>>>> rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZl
>>>>>> ckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vw
>>>>>> c0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRD
>>>>>> YXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5n
>>>>>> L0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xv
>>>>>> dWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vy
>>>>>> cm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2
>>>>>> YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4a
>>>>>> OxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNh
>>>>>> dXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQA
>>>>>> EkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFu
>>>>>> Zy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QA
>>>>>> EExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBz
>>>>>> ZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFj
>>>>>> ZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNr
>>>>>> VHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFy
>>>>>> aW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-
>>>>>> AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2Vy
>>>>>> SW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hl
>>>>>> c3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EA
>>>>>> fgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1w
>>>>>> bHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNx
>>>>>> AH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5y
>>>>>> ZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0
>>>>>> aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhq
>>>>>> YXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EA
>>>>>> fgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0
>>>>>> ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pv
>>>>>> YnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNs
>>>>>> b3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRj
>>>>>> aGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91
>>>>>> ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1w
>>>>>> bCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0
>>>>>> c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNv
>>>>>> bnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRl
>>>>>> eHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUu
>>>>>> Y2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdl
>>>>>> ZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2Fs
>>>>>> bHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5j
>>>>>> b250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxs
>>>>>> V2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250
>>>>>> ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2Vk
>>>>>> LmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-
>>>>>> ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5j
>>>>>> b25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRv
>>>>>> cnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50
>>>>>> LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0
>>>>>> ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdU
>>>>>> aHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0
>>>>>> AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29y
>>>>>> a2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAAL
>>>>>> VGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5t
>>>>>> b2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEu
>>>>>> dXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte
>>>>>> 9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEu
>>>>>> dXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4
>>>>>> cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1
>>>>>> ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMA
>>>>>> AAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAFdmFs
>>>>>> dWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-o
>>>>>> maW2FAIAAHhwAA
>>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>>> Deploy avoids pods: [], clusters: [], hosts: [3]
>>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
>>>>>> sPlanner@4fdf86ce
>>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>>> Trying to allocate a host and storage pools from dc:14, pod:17,cluster:17,
>>>>>> requested cpu: 4200, requested ram: 4294967296
>>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>>> Is ROOT volume READY (pool already allocated)?: Yes
>>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>>> This VM has last host_id specified, trying to choose the same host: 213
>>>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Publish
>>>>>> async job-192989 complete on message bus
>>>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
>>>>>> related to job-192989
>>>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Update db
>>>>>> status for job-192989
>>>>>> 2017-01-30 08:44:06,219 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>>> Checking resources in Cluster: 35 under Pod: 31
>>>>>> 2017-01-30 08:44:06,227 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>>>> FirstFitRoutingAllocator) Looking for hosts in dc: 23  pod:31  cluster:35
>>>>>> 2017-01-30 08:44:06,229 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Executing
>>>>>> AsyncJobVO {id:192992, userId: 1, accountId: 1, instanceType: null,
>>>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>>>> AAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw, cmdVersion:
>>>>>> 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>>>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>>>>>> null, created: Mon Jan 30 08:44:05 CET 2017}
>>>>>> 2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Run VM work
>>>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1729, job origin: 14953
>>>>>> 2017-01-30 08:44:06,230 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>>>>>> SyncQueueItemVO {id:63630, queueId: 67180, contentType: AsyncJob,
>>>>>> contentId: 192990, lastProcessMsid: 345049098498, lastprocessNumber: 61,
>>>>>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
>>>>>> CET 2017}
>>>>>> 2017-01-30 08:44:06,244 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>>> Disconnected, host resource state is: Enabled
>>>>>> 2017-01-30 08:44:06,262 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>>> Cannot choose the last host to deploy this VM
>>>>>> 2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner]
>>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>>> Searching resources only under specified Cluster: 17
>>>>>> 2017-01-30 08:44:06,262 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Add
>>>>>> job-192991 into job monitoring
>>>>>> 2017-01-30 08:44:06,246 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Done
>>>>>> executing com.cloud.vm.VmWorkMigrateAway for job-192986
>>>>>> 2017-01-30 08:44:06,244 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
>>>>>> joined with job-192989 and disjoin all subjobs created from job- 192989
>>>>>> 2017-01-30 08:44:06,269 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Remove
>>>>>> job-192986 from job monitoring
>>>>>> 2017-01-30 08:44:06,279 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done
>>>>>> executing com.cloud.vm.VmWorkMigrateAway for job-192989
>>>>>> 2017-01-30 08:44:06,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Executing
>>>>>> AsyncJobVO {id:192991, userId: 1, accountId: 1, instanceType: null,
>>>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>>>> AAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw, cmdVersion:
>>>>>> 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>>>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>>>>>> null, created: Mon Jan 30 08:44:05 CET 2017}
>>>>>> 2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Run VM work
>>>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1605, job origin: 14965
>>>>>> 2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl]
>>>>>> (AgentManager-Handler-13:null) Ping from 258
>>>>>> 2017-01-30 08:44:06,319 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>>>> FirstFitRoutingAllocator) List of hosts in ascending order of number of
>>>>>> VMs: [224, 19, 21, 15]
>>>>>> 2017-01-30 08:44:06,322 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>>>> FirstFitRoutingAllocator) FirstFitAllocator has 0 hosts to check for
>>>>>> allocation: []
>>>>>> 2017-01-30 08:44:06,322 INFO  [c.c.a.m.AgentManagerImpl]
>>>>>> (AgentManager-Handler-13:null) PingMap for agent: 258 will not be updated
>>>>>> because agent is no longer in the PingMap
>>>>>> 2017-01-30 08:44:06,326 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Remove
>>>>>> job-192989 from job monitoring
>>>>>>
>>>>>> 2017-01-30 08:54:38,139 INFO [c.c.v.VirtualMachineManagerImpl]
>>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>>>>> ctx-052eb4f5) Migration cancelled because state has changed:
>>>>>> VM[User|i-189-1178-VM]
>>>>>> 2017-01-30 08:54:38,139 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>>>>> ctx-052eb4f5) Unable to migrate VM due to: Migration cancelled because
>>>>>> state has changed: VM[User|i-189-1178-VM]
>>>>>> 2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl]
>>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>>>>> ctx-052eb4f5) VM state transitted from :Running to Stopping with event:
>>>>>> StopRequestedvm's original host id: 29 new host id: 33 host id before state
>>>>>> transition: 33
>>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.DirectAgentAttache]
>>>>>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Response
>>>>>> Received:
>>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>>>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Processing:  {
>>>>>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10,
>>>>>> [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-
>>>>>> 1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,
>>>>>> "result":true,"details":"","wait":0}}] }
>>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>>>> (RouterMonitor-1:ctx-2e80f405) Seq 19-2355664080091480198: Received:  {
>>>>>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10, {
>>>>>> NetworkUsageAnswer } }
>>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl]
>>>>>> (RouterMonitor-1:ctx-2e80f405) Details from executing class
>>>>>> com.cloud.agent.api.NetworkUsageCommand:
>>>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>>>>> ctx-052eb4f5) Seq 33-1917970491306410035: Sending  { Cmd , MgmtId:
>>>>>> 345049098498, via: 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>>>> }
>>>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>>>>> ctx-052eb4f5) Seq 33-1917970491306410035: Executing:  { Cmd , MgmtId:
>>>>>> 345049098498, via: 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>>>> }
>>>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.m.DirectAgentAttache]
>>>>>> (DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035: Executing request
>>>>>> 2017-01-30 08:54:38,202 DEBUG [c.c.a.m.DirectAgentAttache]
>>>>>> (DirectAgent-326:ctx-4fb88aa7) Seq 21-1035546439318503505: Executing request
>>>>>>
>>>>>> 2017-01-30 09:05:55,628 INFO [c.c.h.HighAvailabilityManagerImpl]
>>>>>> (HA-Worker-4:ctx-36aae733 work-2049) Processing
>>>>>> HAWork[2049-Migration-1178-Stopped-Migrating]
>>>>>> 2017-01-30 09:05:55,633 WARN [o.a.c.f.j.AsyncJobExecutionContext]
>>>>>> (HA-Worker-4:ctx-36aae733 work-2049) Job is executed without a context,
>>>>>> setup psudo job for the executing thread
>>>>>> 2017-01-30 09:05:55,645 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>>> (HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045 execution on object
>>>>>> VmWorkJobQueue.1178
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 31.01.2017 00:42, Rafael Weingärtner wrote:
>>>>>>
>>>>>>> This seems to be one of those nasty bugs that are very hard to
>>>>>>> debug. It
>>>>>>> seems to be caused by a combination of different factors, such as
>>>>>>> planet alignments ;)
>>>>>>>
>>>>>>> At class “VirtualMachineManagerImpl” line 2395 (method
>>>>>>> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAw
>>>>>>> ay(String,
>>>>>>> long, DeploymentPlanner)”) is the start of the code flow that
>>>>>>> triggers the
>>>>>>> migration. The point here is that, if an error occurs during the
>>>>>>> migration,
>>>>>>> the VM will be stopped; For some reason this is executed only for HA
>>>>>>> VMs.
>>>>>>>
>>>>>>> Going a little bit further, we find the point that is triggering the
>>>>>>> error
>>>>>>> in the method “com.cloud.vm.VirtualMachineMa
>>>>>>> nagerImpl.migrate(VMInstanceVO,
>>>>>>> long, DeployDestination)” at line 1941. This line will call the
>>>>>>> method
>>>>>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>>>>>>> Event,
>>>>>>> Long, ItWorkVO, Step)” to change the VM state to migrating/prepare
>>>>>>> migration/or something like that (I am guessing here), but for some
>>>>>>> reason,
>>>>>>> the state of the VM is not the one expected and this is causing a
>>>>>>> “ConcurrentOperationException” to be thrown. The problem is that the
>>>>>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>>>>>>> Event,
>>>>>>> Long, ItWorkVO, Step)” does not seem to provide much useful log
>>>>>>> entries.
>>>>>>>
>>>>>>> Just to enable us to discuss the problem a little further, could you
>>>>>>> tell
>>>>>>> us a little bit about your environment setting? How many ACS MSs do
>>>>>>> you
>>>>>>> have? What is the VM workload being migrated (I mean, the number of
>>>>>>> VMs and
>>>>>>> computing resources)? Do the other servers have enough computing
>>>>>>> resources
>>>>>>> to support these VMs? Could you also check the XenServer logs, and
>>>>>>> check
>>>>>>> what happened on both, source and destination hosts when the
>>>>>>> migration for
>>>>>>> VM “i-638-1735-VM” was triggered?
>>>>>>>
>>>>>>> Do you have the knowledge/skills to use an IDE (e.g. Eclipse) to
>>>>>>> connect to
>>>>>>> a running JVM and debug a code? Because for me it will not be
>>>>>>> possible to
>>>>>>> go further without looking at the variables in a running environment.
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
>>>>>>> <fr...@everyware.ch>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Dear Rafael
>>>>>>>>
>>>>>>>>
>>>>>>>> Sorry for the late answer (was sick a couple of days).
>>>>>>>>
>>>>>>>> I confirm we are migrating inside the same xenserver pool (within a
>>>>>>>> CS
>>>>>>>> Cluster).
>>>>>>>>
>>>>>>>>
>>>>>>>> Best Regards
>>>>>>>> Francois
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 26.01.2017 04:35, Rafael Weingärtner wrote:
>>>>>>>>
>>>>>>>> Found the code that is executed.
>>>>>>>>> I need you to confirm if you are migrating within a cluster or
>>>>>>>>> across
>>>>>>>>> clusters?
>>>>>>>>>
>>>>>>>>> On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
>>>>>>>>> <ra...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> I do not have access to ACS code right now, but I suggest starting
>>>>>>>>>
>>>>>>>>>> debugging here:
>>>>>>>>>>
>>>>>>>>>> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
>>>>>>>>>>
>>>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>>>>>> ctx-175d37df)
>>>>>>>>>>> VM state transitted from :Running to Stopping with event:
>>>>>>>>>>> StopRequestedvm's
>>>>>>>>>>> original host id: 224 new host id: 15 host id before state
>>>>>>>>>>> transition:
>>>>>>>>>>> 15
>>>>>>>>>>>
>>>>>>>>>>> I would try to understand first why ACS requested the hypervisor
>>>>>>>>>>> to stop
>>>>>>>>>>>
>>>>>>>>>> the VM. I mean the following, check the source code for
>>>>>>>>>> conditions that
>>>>>>>>>> would make ACS request the shutdown of VMs.
>>>>>>>>>> BTW: are you migrating within a cluster or across clusters?
>>>>>>>>>>
>>>>>>>>>> I will be able to analyze this further only after work hours, so
>>>>>>>>>> if you
>>>>>>>>>> find anything, keep me posted.
>>>>>>>>>>
>>>>>>>>>> On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
>>>>>>>>>> <fr...@everyware.ch>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Hello Rafael
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I think the important log lines are these:
>>>>>>>>>>>
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>>> 86f5e
>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because
>>>>>>>>>>> state has
>>>>>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>>> 86f5e
>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>>>>>> Migration
>>>>>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from
>>>>>>>>>>> :Running to
>>>>>>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new
>>>>>>>>>>> host
>>>>>>>>>>> id:
>>>>>>>>>>> 15 host id before state transition: 15
>>>>>>>>>>>
>>>>>>>>>>> The really disturbing thing is that CS stops the VM again if we
>>>>>>>>>>> start it manually, apparently because the migrating job is still
>>>>>>>>>>> running.
>>>>>>>>>>>
>>>>>>>>>>> Note that the VM HA flag is enabled.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thank you.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Best Regards.
>>>>>>>>>>>
>>>>>>>>>>> Francois
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> more details:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>>>>>>>>>>> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>>> 86f5e
>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
>>>>>>>>>>> VM[User|i-638-1736-VM]
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>>>>>> job-14953/job-191351) Done with run of VM work job:
>>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8b
>>>>>>>>>>> acc64
>>>>>>>>>>> job-14953/job-191351) Done executing
>>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for
>>>>>>>>>>> job-191351
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8b
>>>>>>>>>>> acc64
>>>>>>>>>>> job-14953/job-191351) Remove job-191351 from job monitoring
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>>>>>>>>>>> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c
>>>>>>>>>>> work-1989)
>>>>>>>>>>> Completed HAWork[1989-Migration-1411-Running-Migrating]
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>>>>>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>>> 86f5e
>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375:
>>>>>>>>>>> Forwarding
>>>>>>>>>>> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
>>>>>>>>>>> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
>>>>>>>>>>> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>>>>>>>>> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>>>>>>>>> ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>>>>>>>>> 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>>>>>>>>> (64-bit)","platformEmulator":"Other install
>>>>>>>>>>> media","bootArgs":"","enable
>>>>>>>>>>> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>>>>>>>>> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>>>>>>>>> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>>>>>>>>> rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>>>>>>>>> age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio"
>>>>>>>>>>> :"4.0","
>>>>>>>>>>> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>>>>>>>>> -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>>>>>>>>> stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>>>>>>>>> 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>>>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8
>>>>>>>>>>> 751b89-00f
>>>>>>>>>>> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>>>>>>>>> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>>>>>>>>> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>>>>>>>>> 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>>>>>>>>> ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"
>>>>>>>>>>> path":"d3
>>>>>>>>>>> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>>>>>>>>> :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>>>>>>>>> gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>>>>>>>>> XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>>>>>>>>> -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>>>>>>>>> storagePort":"0",
>>>>>>>>>>> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>>>>>>>>> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>>>>>>>>> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>>>>>>>>> ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>>>>>>>>> ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","
>>>>>>>>>>> id":64,"
>>>>>>>>>>> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>>>>>>>>> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>>>>>>>>> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>>>>>>>>> UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>>>>>>>>> DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a
>>>>>>>>>>> 32b-ee3e2
>>>>>>>>>>> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>>>>>>>>> ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>>>>>>>>> deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>>>>>>>>> path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>>>>>>>>> K","_details":{"managed":"false","storagePort":"0","stor
>>>>>>>>>>> ageHost":"localhos
>>>>>>>>>>> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>>>>>>>>> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>>>>>>>>> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>>>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>>>>>>>>> :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"
>>>>>>>>>>> PreSetup","
>>>>>>>>>>> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>>>>>>>>> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>>>>>>>>> -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>>>>>>>>>
>>>>>>>>>>> e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>>>>>>>>> ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>>>>>>>>> ,"volumeId":86
>>>>>>>>>>> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>>>>>>>>> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>>>>>>>>> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>>>>>>>>> 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>>>>>>>>> false","storagePort":"0","storageHost":"localhost","volumeSi
>>>>>>>>>>> ze":"42949
>>>>>>>>>>> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>>>>>>>>> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>>>>>>>>> id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>>>>>>>>> ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS
>>>>>>>>>>> 4.5
>>>>>>>>>>> (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>>>>>>>>> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>>>>>>>>> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>>>>>>>>> d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>>>>>>>>> a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>>>>>>>>> gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>>>>>>>>> .8.8","dns2":"8.8
>>>>>>>>>>> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>>>>>>>>> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>>>>>>>>> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] }
>>>>>>>>>>> to
>>>>>>>>>>> 345049098498
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375:
>>>>>>>>>>> Processing:
>>>>>>>>>>> {
>>>>>>>>>>> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
>>>>>>>>>>> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>>>>>>>>> true,"wait":0}}]
>>>>>>>>>>> }
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>>>> [c.c.a.m.AgentAttache]
>>>>>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No
>>>>>>>>>>> more
>>>>>>>>>>> commands
>>>>>>>>>>> found
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>>>>>> ctx-175d37df)
>>>>>>>>>>> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId:
>>>>>>>>>>> 345049103441,
>>>>>>>>>>> via:
>>>>>>>>>>> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>>>>>>>>>>>
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>>> 86f5e
>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because
>>>>>>>>>>> state has
>>>>>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>>> 86f5e
>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>>>>>> Migration
>>>>>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from
>>>>>>>>>>> :Running to
>>>>>>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new
>>>>>>>>>>> host
>>>>>>>>>>> id:
>>>>>>>>>>> 15 host id before state transition: 15
>>>>>>>>>>>
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>>>>>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>>> 86f5e
>>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643:
>>>>>>>>>>> Forwarding
>>>>>>>>>>> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
>>>>>>>>>>> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>>>>>>>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>>>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>>>>>>>>> 638-1736-VM","wait":0}}]
>>>>>>>>>>> } to 345049098498
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>>>>>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>>>>>> 254-800498:
>>>>>>>>>>> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver:
>>>>>>>>>>> v1,
>>>>>>>>>>> Flags:
>>>>>>>>>>> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>>>>>>>>> proxyVmId":1530,"_loadInfo":"{\n
>>>>>>>>>>> \"connections\": []\n}","wait":0}}] }
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>>>>>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>>>>>> 254-800498:
>>>>>>>>>>> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via:
>>>>>>>>>>> 254, Ver:
>>>>>>>>>>> v1,
>>>>>>>>>>> Flags: 100010, [{"com.cloud.agent.api.AgentCo
>>>>>>>>>>> ntrolAnswer":{"result":true,"wait":0}}] }
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG
>>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>>> (AgentManager-Handler-11:null) Seq 15-6889944479923437632:
>>>>>>>>>>> Processing:
>>>>>>>>>>> {
>>>>>>>>>>> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
>>>>>>>>>>> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>>>>>>>>> e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>>>>>>>>> ":true,"details":"Stop
>>>>>>>>>>> VM i-638-1735-VM Succeed","wait":0}}] }
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG
>>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379
>>>>>>>>>>> ctx-bbabfeae)
>>>>>>>>>>> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId:
>>>>>>>>>>> 345049103441,
>>>>>>>>>>> via:
>>>>>>>>>>> 15, Ver: v1, Flags: 10, { StopAnswer } }
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is
>>>>>>>>>>> stopped on
>>>>>>>>>>> the
>>>>>>>>>>> host. Proceeding to release resource held.
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>>>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>>>> supported
>>>>>>>>>>> in the network id=519
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>>>>>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Changing active number of
>>>>>>>>>>> nics for
>>>>>>>>>>> network id=519 on -1
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to
>>>>>>>>>>> release
>>>>>>>>>>> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>>>>>>>>> 92.168.1.11-null
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released network
>>>>>>>>>>> resources
>>>>>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released storage
>>>>>>>>>>> resources
>>>>>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM state transitted from
>>>>>>>>>>> :Stopping to
>>>>>>>>>>> Stopped with event: OperationSucceededvm's original host id: 15
>>>>>>>>>>> new host
>>>>>>>>>>> id: null host id before state transition: 15
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU:
>>>>>>>>>>> 82976 and
>>>>>>>>>>> CPU
>>>>>>>>>>> after applying overprovisioning: 331904
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM:
>>>>>>>>>>> 261961927808
>>>>>>>>>>> and RAM after applying overprovisioning: 261961924608
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15,
>>>>>>>>>>> old used:
>>>>>>>>>>> 71300,reserved: 2100, actual total: 82976, total with
>>>>>>>>>>> overprovisioning:
>>>>>>>>>>> 331904; new used: 67100,reserved:6300; movedfromreserved:
>>>>>>>>>>> false,moveToReserveredtrue
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15,
>>>>>>>>>>> old used:
>>>>>>>>>>> 98784247808,reserved: 1073741824, total: 261961924608; new used:
>>>>>>>>>>> 94489280512,reserved:5368709120;
>>>>>>>>>>> movedfromreserved:
>>>>>>>>>>>
>>>>>>>>>>> false,moveToReserveredtrue
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>>>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>>>> supported
>>>>>>>>>>> in the network id=519
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>>>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>>>> supported
>>>>>>>>>>> in the network id=519
>>>>>>>>>>>
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>>>>>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused
>>>>>>>>>>> by:
>>>>>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to
>>>>>>>>>>> migrate
>>>>>>>>>>> VM[User|i-638-1735-VM]
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>>>>>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
>>>>>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to
>>>>>>>>>>> migrate
>>>>>>>>>>> VM[User|i-638-1735-VM]
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>>>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>> job-13733/job-191379) Done with run of VM work job:
>>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>>>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>>> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379,
>>>>>>>>>>> userId:
>>>>>>>>>>> 1,
>>>>>>>>>>> accountId: 1, instanceType: null, instanceId: null, cmd:
>>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>>>>>>>>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>>>>>>>>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>>>>>>>>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>>>>>>>>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>>>>>>>>> FuYWdlckltcGwAAAAAAAAA4A,
>>>>>>>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0,
>>>>>>>>>>> resultCode: 0,
>>>>>>>>>>> result: null, initMsid: 345049098498, completeMsid: null,
>>>>>>>>>>> lastUpdated:
>>>>>>>>>>> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017},
>>>>>>>>>>> job
>>>>>>>>>>> origin:13733
>>>>>>>>>>> ewcstack-man02-prod: com.cloud.utils.exception.Clou
>>>>>>>>>>> dRuntimeException:
>>>>>>>>>>> Unable to migrate VM[User|i-638-1735-VM]
>>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.ja
>>>>>>>>>>> va:2405)
>>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.ja
>>>>>>>>>>> va:4517)
>>>>>>>>>>> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAcc
>>>>>>>>>>> essor402.invoke(Unknown
>>>>>>>>>>> Source)
>>>>>>>>>>> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
>>>>>>>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>>>> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
>>>>>>>>>>> e(Method.java:606)
>>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
>>>>>>>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
>>>>>>>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>>>>>>>> erImpl.java:537)
>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.
>>>>>>>>>>> java:49)
>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>>>>>>>> ext.java:56)
>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>>>>>>>> nagedContext.java:103)
>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>>>>>>>> agedContext.java:53)
>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.ja
>>>>>>>>>>> va:46)
>>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.ja
>>>>>>>>>>> va:494)
>>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.Executors
>>>>>>>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
>>>>>>>>>>> k.run(FutureTask.java:262)
>>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>>>>> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379) Complete async job-191379, jobStatus:
>>>>>>>>>>> FAILED,
>>>>>>>>>>> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>>>>>>>>> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>>>>>>>>> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>>>>>>>>> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>>>>>>>>> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>>>>>>>>> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>>>>>>>>> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>>>>>>>>> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>>>>>>>>> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>>>>>>>>> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>>>>>>>>> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>>>>>>>>> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>>>>>>>>> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>>>>>>>>> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>>>>>>>>> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>>>>>>>>> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>>>>>>>>> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>>>>>>>>> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>>>>>>>>> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>>>>>>>>> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>>>>>>>>> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>>>>>>>>> ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>>>>>>>>> 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>>>>>>>>> AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>>>>>>>>> cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>>>>>>>>> AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>>>>>>>>> YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>>>>>>>>> ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>>>>>>>>> YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>>>>>>>>> dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>>>>>>>>> dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>>>>>>>>> Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>>>>>>>>> dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>>>>>>>>> cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>>>>>>>>> TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>>>>>>>>> ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>>>>>>>>> LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>>>>>>>>> dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>>>>>>>>> ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>>>>>>>>> JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>>>>>>>>> AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>>>>>>>>> ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>>>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>>>>>>>>> YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>>>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>>>>>>>>> AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>>>>>>>>> JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>>>>>>>>> jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>>>>>>>>> bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>>>>>>>>> dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>>>>>>>>> x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379) Publish async job-191379 complete on
>>>>>>>>>>> message bus
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379) Wake up jobs related to job-191379
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379) Update db status for job-191379
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379) Wake up jobs joined with job-191379 and
>>>>>>>>>>> disjoin
>>>>>>>>>>> all
>>>>>>>>>>> subjobs created from job- 191379
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379) Done executing
>>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for
>>>>>>>>>>> job-191379
>>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>>>>>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>>> db528
>>>>>>>>>>> job-13733/job-191379) Remove job-191379 from job monitoring
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>>>>>>>>>>
>>>>>>>>>>> Did you check the ACS log files looking for some exceptions when
>>>>>>>>>>> the
>>>>>>>>>>>
>>>>>>>>>>>> migrations for this specific VM started?
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
>>>>>>>>>>>> <fr...@everyware.ch>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Dear CS contributors
>>>>>>>>>>>>
>>>>>>>>>>>> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared
>>>>>>>>>>>>> primary
>>>>>>>>>>>>> storage.
>>>>>>>>>>>>>
>>>>>>>>>>>>> When we put a host in maintenance, cs evacuates all VM's on
>>>>>>>>>>>>> other
>>>>>>>>>>>>> hosts.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But it happens regularly that some VM's are stopped instead of
>>>>>>>>>>>>> live
>>>>>>>>>>>>> migrated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If we restart that stopped VM's, CS will then stop them again
>>>>>>>>>>>>> later.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We need to start them several times until they stays up.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is it a known issue? Is it fixed on CS 4.9.2 ?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Migrating manually all VM's is working but impracticable
>>>>>>>>>>>>> because too
>>>>>>>>>>>>> much
>>>>>>>>>>>>> time consuming.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Many thanks in advance for your help.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best Regards
>>>>>>>>>>>>> Francois
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> EveryWare AG
>>>>>>>>>>>>> François Scheurer
>>>>>>>>>>>>> Senior Systems Engineer
>>>>>>>>>>>>> Zurlindenstrasse 52a
>>>>>>>>>>>>> CH-8003 Zürich
>>>>>>>>>>>>>
>>>>>>>>>>>>> tel: +41 44 466 60 00
>>>>>>>>>>>>> fax: +41 44 466 60 10
>>>>>>>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>>>>>>>> web: http://www.everyware.ch
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> EveryWare AG
>>>>>>>>>>> François Scheurer
>>>>>>>>>>> Senior Systems Engineer
>>>>>>>>>>> Zurlindenstrasse 52a
>>>>>>>>>>> CH-8003 Zürich
>>>>>>>>>>>
>>>>>>>>>>> tel: +41 44 466 60 00
>>>>>>>>>>> fax: +41 44 466 60 10
>>>>>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>>>>>> web: http://www.everyware.ch
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>> Rafael Weingärtner
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>
>>>>>>>>
>>>>>>>> EveryWare AG
>>>>>>>> François Scheurer
>>>>>>>> Senior Systems Engineer
>>>>>>>> Zurlindenstrasse 52a
>>>>>>>> CH-8003 Zürich
>>>>>>>>
>>>>>>>> tel: +41 44 466 60 00
>>>>>>>> fax: +41 44 466 60 10
>>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>>> web: http://www.everyware.ch
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>>
>>>>>>
>>>>>> EveryWare AG
>>>>>> François Scheurer
>>>>>> Senior Systems Engineer
>>>>>> Zurlindenstrasse 52a
>>>>>> CH-8003 Zürich
>>>>>>
>>>>>> tel: +41 44 466 60 00
>>>>>> fax: +41 44 466 60 10
>>>>>> mail: francois.scheurer@everyware.ch
>>>>>> web: http://www.everyware.ch
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Rafael Weingärtner
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Rafael Weingärtner
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> EveryWare AG
>>>> François Scheurer
>>>> Senior Systems Engineer
>>>> Zurlindenstrasse 52a
>>>> CH-8003 Zürich
>>>>
>>>> tel: +41 44 466 60 00
>>>> fax: +41 44 466 60 10
>>>> mail: francois.scheurer@everyware.ch
>>>> web: http://www.everyware.ch
>>>>
>>>>
>>>
>>>
>>> --
>>> Rafael Weingärtner
>>>
>>>
>>> --
>>>
>>>
>>> EveryWare AG
>>> François Scheurer
>>> Senior Systems Engineer
>>> Zurlindenstrasse 52a
>>> CH-8003 Zürich
>>>
>>> tel: +41 44 466 60 00
>>> fax: +41 44 466 60 10
>>> mail: francois.scheurer@everyware.ch
>>> web: http://www.everyware.ch
>>>
>>>
>>
>>
>> --
>> Rafael Weingärtner
>>
>>
>> --
>>
>>
>> EveryWare AG
>> François Scheurer
>> Senior Systems Engineer
>> Zurlindenstrasse 52a
>> CH-8003 Zürich
>>
>> tel: +41 44 466 60 00
>> fax: +41 44 466 60 10
>> mail: francois.scheurer@everyware.ch
>> web: http://www.everyware.ch
>>
>>
>
>
> --
> Rafael Weingärtner
>
>
> --
>
>
> EveryWare AG
> François Scheurer
> Senior Systems Engineer
> Zurlindenstrasse 52a
> CH-8003 Zürich
>
> tel: +41 44 466 60 00
> fax: +41 44 466 60 10
> mail: francois.scheurer@everyware.ch
> web: http://www.everyware.ch
>
>


-- 
Rafael Weingärtner

Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Francois Scheurer <fr...@everyware.ch>.
Dear Rafael


You are right about cloudmonkey with CS 4.5.2 being able to update the 
ha_enable flag on VM's. Thank you for that.

Back to the central question, do you think it is safe to run this update 
while all MS's are stopped?:
   update op_it_work set step="Done" where step!="Done" and instance_id=<X>;

Where <X> points to an expunged VM or to one of the specific VM's that 
are intermittently being stopped.
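
To be concrete, this is the sequence I have in mind (only a sketch: the 
backup table name is just an example, and <X> stays a placeholder for the 
numeric id of one affected VM, e.g. 1177 for i-189-1177-VM):

   -- keep a copy of the rows we are about to touch
   create table op_it_work_backup as
     select * from op_it_work where step!="Done";

   -- double check which rows would be updated for this VM
   select id, instance_id, step, created_at
     from op_it_work where step!="Done" and instance_id=<X>;

   -- then mark only those rows as done
   update op_it_work set step="Done" where step!="Done" and instance_id=<X>;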

If you are not sure about it, do you maybe know somebody else who could 
help us with this?


Best Regards

Francois Scheurer






On 07.02.2017 17:47, Rafael Weingärtner wrote:
> Got the idea of your workaround now.
>
> I have not paid much attention to the table data you posted (sorry for 
> that). Well, it seems you have it all taken care of.
>
> I have a suggestion to make; instead of changing data directly on the 
> database, what about using CloudMonkey to update the HA flag to false 
> on VMs? Currently, it is the UI that is blocking such customization, 
> but according to the API, you should be able to do that. 
> https://cloudstack.apache.org/api/apidocs-4.5/root_admin/updateVirtualMachine.html
>
> On Tue, Feb 7, 2017 at 11:25 AM, Francois Scheurer 
> <francois.scheurer@everyware.ch> wrote:
>
>     Dear Rafael
>
>
>     We changed it on the VM to avoid the required stop and start.
>     Changing it on the Offering in use does not affect
>     vm_instance.ha_enabled.
>     We could create a new Offering without HA, but the VM needs to be
>     stopped before we can apply it.
>
>     We did not disable HA on these 4 VM's to solve the problem, but
>     just to minimize its impact (now only i-189-1177-VM should be
>     stopped intermittently) in case the cleaning of op_it_work does not
>     fix the issue.
>     Of course we could filter the update query with “instance id”, but
>     if you look at the table output, all instance_id values are related
>     to the 5 problematic VM's.
>     The only exceptions are the first 3 rows, which relate to VM's that
>     have not existed anymore since July 2015 ...
>     That is why I thought it may be safe to do.
>
>
>     Best Regards
>
>     Francois Scheurer
>
>
>
>
>
>
>
>
>     On 07.02.2017 16:27, Rafael Weingärtner wrote:
>>     So, you changed the flags directly on "vm_instance". Why didn't
>>     you change the service offering of these VMs using the ACS UI or API?
>>     For me, it feels that the best approach would be to change the
>>     service offering of the affected VMs (the HA flag is first
>>     configured at the service offering). You could assign a service
>>     offering without HA; I mean, it would be a matter of creating a
>>     similar service offering (same configuration as the one they are
>>     using right now), but without HA.
>>
>>     I am assuming that the change directly on “vm_instance” did not
>>     solve your problem, right? That is why you want to change some
>>     data at “op_it_work”. I did a quick look at “op_it_work”, and it
>>     seems that this table may be used for more things than HA. Here
>>     is a tip for complex production systems: never run update SQLs in
>>     production environments without a very, very restrictive and
>>     well-thought-out “where” clause. In my opinion this is not restrictive
>>     enough (update op_it_work set step="Done" where step!="Done"). I
>>     would advise you not to go further and change configurations
>>     directly on the database (unless you have a dev team that is capable
>>     of reading, debugging, and working with ACS source code). ACS has
>>     over 500k lines of java code, and God knows how many lines of
>>     shell, python and other languages, software and systems that may
>>     be used; it is pretty hard sometimes to get a clear picture of
>>     how a change in the database will reverberate in the whole system
>>     without knowing in details the installation configurations and
>>     without knowing in detail the installation configuration and
>>     and other pieces). However, if you want to proceed with that, I
>>     would suggest you adding some extra filters to that “where”
>>     clause. What about adding the “instance id” there?
>>
>>     PS: BTW, have you already heard about the CloudStack fork from
>>     Schuberg Philis named Cosmic?
>>     https://github.com/MissionCriticalCloud/cosmic
>>     <https://github.com/MissionCriticalCloud/cosmic>
>>     Yes, I have. They were part of the active ACS community back
>>     then. I was already here when they created the fork. Still, I
>>     kept with ACS.
>>
>>
>>     On Tue, Feb 7, 2017 at 3:54 AM, Francois Scheurer
>>     <francois.scheurer@everyware.ch> wrote:
>>
>>         Dear Rafael
>>
>>
>>         We temporarily disabled the HA flag on these VM's by updating
>>         the DB directly: update vm_instance set ha_enabled=0 where
>>         removed is null and id=1178 or id=1240 or id=1690 or id=1736;
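>>
>>         Side note on that statement: since MySQL evaluates AND before
>>         OR, the "removed is null" check there only applies to id=1178.
>>         A version with the grouping made explicit (restricting all four
>>         ids to non-removed rows) would be:
>>         update vm_instance set ha_enabled=0
>>             where removed is null and id in (1178, 1240, 1690, 1736);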
>>
>>         We kept the test VM i-189-1177-VM with HA Flag.
>>
>>         Restarting the MS + cleaning the op_it_work table will
>>         hopefully fix the issue. And if the issue persists, then only
>>         i-189-1177-VM should be affected, which mitigates the impact.
>>
>>         On the other hand, if everything runs fine, we would afterwards
>>         re-enable the HA flag on the problematic VM's.
>>
>>
>>         Do you think it is safe to run this update?: update
>>         op_it_work set step="Done" where step!="Done";
>>
>>
>>         Best Regards
>>
>>         Francois Scheurer
>>
>>
>>
>>         PS: BTW, have you already heard about the CloudStack fork from
>>         Schuberg Philis named Cosmic?
>>         https://github.com/MissionCriticalCloud/cosmic
>>
>>
>>
>>
>>
>>         On 06.02.2017 16:41, Rafael Weingärtner wrote:
>>>
>>>         I totally understand your position of not being able to
>>>         debug the ACS in production ;)
>>>
>>>
>>>         There is one bit I did not quite understand. If you have
>>>         disabled HA for the VMs, why do you still need to change
>>>         data on the database? How did you disable the HA of VMs?
>>>
>>>
>>>         On Mon, Feb 6, 2017 at 5:04 AM, Francois Scheurer
>>>         <francois.scheurer@everyware.ch> wrote:
>>>
>>>             Dear Rafael / Dear CS contributors
>>>
>>>
>>>             I tried the following:
>>>
>>>             1.     stop all MS
>>>             2.     backup DB
>>>             3.     select * from op_ha_work where type="Migration"
>>>             and mgmt_server_id is null and taken is null;
>>>             4.     delete from op_ha_work where type="Migration" and
>>>             mgmt_server_id is null and taken is null;
>>>             5.     start all MS
>>>
>>>             But actually after the MS stopped the op_ha_work table
>>>             was automatically cleaned, making my steps useless.
>>>             After the MS started, the op_ha_work table was filled
>>>             again and the related VM's were stopped several times (3-7
>>>             times), requiring constant attention to manually restart
>>>             them after each stop.
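>>>
>>>             For reference, a quick sketch to watch whether the table
>>>             fills up again after the MS restart, using only the columns
>>>             from the delete above:
>>>             select type, mgmt_server_id, taken, count(*)
>>>             from op_ha_work group by type, mgmt_server_id, taken;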
>>>
>>>             I later found another table that could be the cause of
>>>             the issue:
>>>
>>>             mysql> select
>>>             op_it_work.*,vm_instance.instance_name,vm_instance.removed
>>>             from op_it_work join vm_instance where
>>>             op_it_work.step!="Done" and
>>>             op_it_work.instance_id=vm_instance.id
>>>             order by instance_id;
>>>             +--------------------------------------+----------------+------------+-----------------------+-----------+--------------------+-----------+------------+-------------+---------------+-------------+---------------+---------------------+
>>>             | id | mgmt_server_id | created_at |
>>>             thread                | type | vm_type            |
>>>             step      | updated_at | instance_id | resource_type |
>>>             resource_id | instance_name | removed             |
>>>             +--------------------------------------+----------------+------------+-----------------------+-----------+--------------------+-----------+------------+-------------+---------------+-------------+---------------+---------------------+
>>>             | be4d0dd2-7d4f-47d3-ba49-623d2f115bbf |   345049098498
>>>             | 1436187965 |
>>>             consoleproxy-1        | Starting | ConsoleProxy       |
>>>             Release   | 1436188265
>>>             |         402 | NULL |           0 | v-402-VM      |
>>>             2015-07-06 13:23:22 |
>>>             | fb4d1844-eb92-478c-9f50-d666790f43af |   345049098498
>>>             | 1436187965 |
>>>             secstorage-1          | Starting | SecondaryStorageVm |
>>>             Release   | 1436188265
>>>             |         687 | NULL |           0 | s-687-VM      |
>>>             2015-07-06 13:11:34 |
>>>             | 7dfeb25a-a705-42e8-907c-79d1e69697ed |   345049098498
>>>             | 1436964548 |
>>>             Work-Job-Executor-165 | Migrating | User               |
>>>             Migrating | 1436964548
>>>             |         776 | Host |          29 | i-51-776-VM   |
>>>             2015-07-15 12:55:38 |
>>>             | 2018ac61-b11c-4101-8ef5-07a39509f2bd |   345049098498
>>>             | 1486020173 | Work-Job-Executor-19  | Migrating |
>>>             User               | Prepare   | 1486020173 |       
>>>             1177 | Host          |          25 | i-189-1177-VM |
>>>             NULL                |
>>>             | 25430cd5-4346-4643-9c7e-7b39b8a876a2 |   345049098122
>>>             | 1486019693 | Work-Job-Executor-6   | Migrating |
>>>             User               | Prepare   | 1486019693 |       
>>>             1177 | Host          |          25 | i-189-1177-VM |
>>>             NULL                |
>>>             | 3276d52e-cbd9-44c5-8ab6-c70784337ff9 |   345049098498
>>>             | 1486020953 | Work-Job-Executor-33  | Migrating |
>>>             User               | Prepare   | 1486020953 |       
>>>             1177 | Host          |          33 | i-189-1177-VM |
>>>             NULL                |
>>>             | 6462da33-83f2-4575-af38-0e52d2a48488 |   345049098122
>>>             | 1486021433 | Work-Job-Executor-12  | Migrating |
>>>             User               | Prepare   | 1486021433 |        1177 | Host          |          25 | i-189-1177-VM | NULL                |
>>>             | 688e16c8-c39f-4e88-9e8f-36c0282f3df1 |   345049098122 | 1486020773 | Work-Job-Executor-11  | Migrating | User               | Prepare   | 1486020773 |        1177 | Host          |          25 | i-189-1177-VM | NULL                |
>>>             | d8a18e9d-ac6f-4273-923e-e401533e2cf7 |   345049098498 | 1486020353 | Work-Job-Executor-25  | Migrating | User               | Prepare   | 1486020353 |        1177 | Host          |          25 | i-189-1177-VM | NULL                |
>>>             | 1222ece8-399d-43d8-8ff8-d24bdebca2ed |   345049103441 | 1486018924 | Work-Job-Executor-2   | Migrating | User               | Prepare   | 1486018924 |        1178 | Host          |          29 | i-189-1178-VM | NULL                |
>>>             | 2ac5ca4a-c20d-433a-8379-40e0de3b3ba8 |   345049098498 | 1486021613 | Work-Job-Executor-41  | Migrating | User               | Prepare   | 1486021613 |        1178 | Host          |          29 | i-189-1178-VM | NULL                |
>>>             | 3e3b95a0-2af4-4268-bfdd-f3399a85bcaa |   345049103441 | 1486019584 | Work-Job-Executor-5   | Migrating | User               | Prepare   | 1486019584 |        1178 | Host          |          29 | i-189-1178-VM | NULL                |
>>>             | 53745a2d-056c-4a1b-9267-eb97f077a6c8 |   345049103441 | 1486020244 | Work-Job-Executor-6   | Migrating | User               | Prepare   | 1486020244 |        1178 | Host          |          29 | i-189-1178-VM | NULL                |
>>>             | e3555b49-7c18-4272-b7f4-ba9351b3bbd1 |   345049098498 | 1486020893 | Work-Job-Executor-32  | Migrating | User               | Prepare   | 1486020893 |        1178 | Host          |          29 | i-189-1178-VM | NULL                |
>>>             | 1dcd5204-4f9a-4342-a23b-329a48596e13 |   345049098498 | 1486020953 | Work-Job-Executor-34  | Migrating | User               | Prepare   | 1486020953 |        1240 | Host          |           3 | i-388-1240-VM | NULL                |
>>>             | ec1668b3-fa3c-4673-b593-3d607b169a24 |   345049098498 | 1486019693 | Work-Job-Executor-14  | Migrating | User               | Prepare   | 1486019693 |        1240 | Host          |           3 | i-388-1240-VM | NULL                |
>>>             | fa7b2dec-ed7f-4a67-a055-e6e6bc1631ca |   345049098498 | 1486020353 | Work-Job-Executor-26  | Migrating | User               | Prepare   | 1486020353 |        1240 | Host          |           3 | i-388-1240-VM | NULL                |
>>>             | 8cfe6d3e-5b7b-4be8-ac10-c865a492aab9 |   345049098498 | 1465387271 | Work-Job-Executor-18  | Running   | User               | Prepare   | 1465387271 |        1388 | Host          |           7 | i-361-1388-VM | NULL                |
>>>             | 6e941b7e-6f19-4717-bf82-974c59737dbc |   345049103441 | 1486019524 | Work-Job-Executor-4   | Migrating | User               | Prepare   | 1486019524 |        1690 | Host          |          15 | i-648-1690-VM | NULL                |
>>>             | 718578a4-0dd6-42d9-be10-ed7680650108 |   345049098122 | 1486021433 | Work-Job-Executor-13  | Migrating | User               | Prepare   | 1486021433 |        1690 | Host          |          15 | i-648-1690-VM | NULL                |
>>>             | 9acf3908-ef13-4b03-8c2b-1a7e55e18491 |   345049098498 | 1486020173 | Work-Job-Executor-20  | Migrating | User               | Prepare   | 1486020173 |        1690 | Host          |          15 | i-648-1690-VM | NULL                |
>>>             | a50c420b-ae52-4caa-83db-704b52b299a4 |   345049098498 | 1486020773 | Work-Job-Executor-28  | Migrating | User               | Prepare   | 1486020773 |        1690 | Host          |          15 | i-648-1690-VM | NULL                |
>>>             | 2b5f4179-5975-4326-9db7-643c3ede63fe |   345049103441 | 1484224516 | Work-Job-Executor-161 | Running   | User               | Prepare   | 1484224516 |        1703 | Host          |           3 | i-51-1703-VM  | NULL                |
>>>             | 2f3c6c44-b2a0-4d10-88d2-ac143d2dd379 |   345049098498 | 1484223910 | Work-Job-Executor-163 | Running   | User               | Prepare   | 1484223910 |        1703 | Host          |           3 | i-51-1703-VM  | NULL                |
>>>             | 4126a23b-1cde-49bc-8636-c0d063b85cf6 |   345049103441 | 1484224636 | Work-Job-Executor-86  | Running   | User               | Prepare   | 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  | NULL                |
>>>             | 5dad5645-766b-48c9-9a85-b77b3c101390 |   345049103441 | 1484224636 | Work-Job-Executor-92  | Running   | User               | Prepare   | 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  | NULL                |
>>>             | 8c1dfff9-e55d-4f81-a22f-45f72accc416 |   345049103441 | 1484224576 | Work-Job-Executor-95  | Running   | User               | Prepare   | 1484224576 |        1703 | Host          |           3 | i-51-1703-VM  | NULL                |
>>>             | a703d2d7-b1e3-42ce-a551-831982990ffb |   345049098498 | 1484224630 | Work-Job-Executor-13  | Running   | User               | Prepare   | 1484224630 |        1703 | Host          |           3 | i-51-1703-VM  | NULL                |
>>>             | 0e194cb5-95b7-41cb-969a-ae37d2ad449a |   345049098498 | 1485763613 | Work-Job-Executor-28  | Migrating | User               | Prepare   | 1485763613 |        1735 | Host          |          19 | i-638-1735-VM | 2017-01-30 15:24:10 |
>>>             | 74f180b9-be0f-429b-9a45-597ca2ca997d |   345049098498 | 1485763013 | Work-Job-Executor-21  | Migrating | User               | Prepare   | 1485763013 |        1735 | Host          |          19 | i-638-1735-VM | 2017-01-30 15:24:10 |
>>>             | feaa421c-88df-4493-9a4f-bddb6ee1b75b |   345049098122 | 1485764285 | Work-Job-Executor-19  | Migrating | User               | Prepare   | 1485764285 |        1735 | Host          |          19 | i-638-1735-VM | 2017-01-30 15:24:10 |
>>>             | 32138a36-ad18-42b5-8478-76ad63caf4da |   345049098122 | 1486020293 | Work-Job-Executor-9   | Migrating | User               | Prepare   | 1486020293 |        1736 | Host          |          19 | i-638-1736-VM | NULL                |
>>>             | 5306ab1f-3c92-4f95-bbff-283b43e6862d |   345049098498 | 1486019573 | Work-Job-Executor-11  | Migrating | User               | Prepare   | 1486019573 |        1736 | Host          |          19 | i-638-1736-VM | NULL                |
>>>             | 843bdf9c-247d-432c-ac19-1a4f7858631b |   345049098498 | 1486021493 | Work-Job-Executor-38  | Migrating | User               | Prepare   | 1486021493 |        1736 | Host          |          19 | i-638-1736-VM | NULL                |
>>>             | 8d400afb-766f-4ac5-89ee-854f25a38eba |   345049098498 | 1486020173 | Work-Job-Executor-21  | Migrating | User               | Prepare   | 1486020173 |        1736 | Host          |          19 | i-638-1736-VM | NULL                |
>>>             | f0ae5612-b6ef-4882-94a2-40a1e6dce342 |   345049098498 | 1486020833 | Work-Job-Executor-30  | Migrating | User               | Prepare   | 1486020833 |        1736 | Host          |          19 | i-638-1736-VM | NULL                |
>>>             +--------------------------------------+----------------+------------+-----------------------+-----------+--------------------+-----------+------------+-------------+---------------+-------------+---------------+---------------------+
>>>
>>>             As you can see, the problematic VM's are all present in that
>>>             table:
>>>
>>>                 i-189-1177-VM
>>>                 i-189-1178-VM
>>>                 i-388-1240-VM
>>>                 i-648-1690-VM
>>>                 i-638-1736-VM
>>>
>>>             As the MSs are running the production environments, we
>>>             cannot debug a lot and just need the shortest path to
>>>             resolution.
>>>             For example, restarting the MSs has an impact on our
>>>             customers and we need to keep the number of restarts
>>>             to a minimum.
>>>
>>>             I have disabled the HA on all problematic VM's except on
>>>             i-189-1177-VM, which is less critical.
>>>
>>>             The next steps could be:
>>>
>>>             1.     stop all MS
>>>             2.     backup DB
>>>             3.     select
>>>             op_it_work.*,vm_instance.instance_name,vm_instance.removed
>>>             from op_it_work join vm_instance where
>>>             op_it_work.step!="Done" and
>>>             op_it_work.instance_id=vm_instance.id
>>>             order by instance_id;
>>>             4.     update op_it_work set step="Done" where step!="Done";
>>>             5.     start all MS
>>>
>>>             What do you think about step 4., safe or risky?
>>>             Some of these entries are related to deleted VM's and
>>>             seem safe to mark as "Done".
>>>             All other entries are related to the problematic VM's
>>>             that keep stopping... so marking them as "Done" may
>>>             (hopefully) eliminate the cause of the Migrating/Stopping.
>>>             But we are not sure if marking them as "Done" also requires
>>>             changes on other tables for consistency.
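>>>
>>>             To illustrate what we mean, a narrower variant of step 4.
>>>             would be to touch only the rows of the problematic VM's
>>>             (just a sketch, reusing only the columns already referenced
>>>             in the query of step 3.; the instance id list is taken from
>>>             the output above):
>>>
>>>                 -- inspect the candidate rows first
>>>                 select * from op_it_work
>>>                 where step!="Done"
>>>                   and instance_id in (1177, 1178, 1240, 1690, 1736);
>>>
>>>                 -- then mark only those rows as "Done"
>>>                 update op_it_work set step="Done"
>>>                 where step!="Done"
>>>                   and instance_id in (1177, 1178, 1240, 1690, 1736);
>>>
>>>             Whether this is enough, or whether other tables need to be
>>>             touched as well, is exactly our open question.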
>>>
>>>
>>>             Thank you for your help.
>>>
>>>
>>>             Best Regards
>>>             Francois Scheurer
>>>
>>>
>>>
>>>
>>>
>>>             On 01.02.2017 03:01, Rafael Weingärtner wrote:
>>>>             Ok, I dug a little further into the code.
>>>>
>>>>                 If we put a host in maintenance, it may
>>>>                 unfortunately happen that a timeout occurs and CS
>>>>                 aborts the migration. As far as we can interpret
>>>>                 the logs, it seems (we speculate) that CS detects
>>>>                 that the VM is then still running on the original
>>>>                 host, which is in maintenance, and decides somehow
>>>>                 to stop it because of the "host not up" status.
>>>>
>>>>
>>>>             From what I have seen in the code, this may happen (as
>>>>             you speculated). If a timeout or some other event
>>>>             interrupts migrations, the VM will be stopped. This
>>>>             happens in the method
>>>>             “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>>>             long, DeploymentPlanner)” at line 2396 and further.
>>>>             This was implemented on purpose; if the migration does
>>>>             not happen, the VM is stopped on the source host.
>>>>             Probably it was implemented this way following the
>>>>             gradient-descent philosophy from artificial intelligence:
>>>>
>>>>                 “pick a random point in the field, update the
>>>>                 values according to the slope and hopefully after
>>>>                 successful iterations you converge to a minimum
>>>>                 point.” (the problem is that sometimes the function
>>>>                 diverges).
>>>>
>>>>
>>>>             I mean, probably this was done this way because the HA
>>>>             was supposed to see that a VM with HA is stopped, and
>>>>             should try to start it again somewhere else. The logs
>>>>             you posted do not provide any other details to debug
>>>>             this further because they were taken after an MS restart, right?
>>>>
>>>>
>>>>                 After restarting the MSs I guess that a similar
>>>>                 issue may happen: the hosts are disconnected during
>>>>                 the early stage and then reconnect. Maybe CS then
>>>>                 again stops the VMs because of the "host not up" status?
>>>>                 Cf. 2017-01-30 08:44:06,157 DEBUG
>>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>>                 (Work-Job-Executor-6:ctx-0d738a8e
>>>>                 job-13737/job-192988 ctx-6e9c753f) The last host of
>>>>                 this VM is not UP or is not enabled, host status
>>>>                 is: Disconnected, host resource state is: Enabled
>>>>
>>>>
>>>>             Here is something interesting that you have found. I
>>>>             checked the code, but without a live debug it is pretty
>>>>             hard for me to get the right flow of execution. By
>>>>             looking at the code and at the logs you provided, your
>>>>             deduction is right. After ACS boots up, the check for
>>>>             hosts that are shutdown/deactivated starts, and the
>>>>             method
>>>>             “com.cloud.agent.manager.AgentManagerImpl.executeUserRequest(long,
>>>>             Event)” is executed. The problem is that all of the
>>>>             hosts are considered as disconnected (ACS may not have
>>>>             finished connecting to all of them yet), and then a
>>>>             whole process of HA is kicked off. The problem is that
>>>>             there is no host with status “UP”. That is why this
>>>>             happens: “No suitable hosts found under this Cluster:
>>>>             25”, and then the whole HA flow, which shuts down
>>>>             the VM, is kicked off.
>>>>
>>>>             In theory, the HA worker should see that a VM with HA
>>>>             is down, and should try to start it. I am not
>>>>             understanding why this is not working. If you decide to
>>>>             debug the running ACS, I may be able to help you, just
>>>>             keep me posted.
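>>>>
>>>>             In the meantime, if you decide to watch what the HA worker
>>>>             is doing with those VMs, a simple check against the
>>>>             op_ha_work table could look like this (just a sketch, using
>>>>             only the columns visible in the output you posted):
>>>>
>>>>                 select id, instance_id, type, state, step, tried, taken
>>>>                 from op_ha_work
>>>>                 where instance_id in (1177, 1178, 1240, 1690, 1736)
>>>>                 order by instance_id, id;
>>>>
>>>>             That would at least show whether a pending HA/Migration
>>>>             item still exists for a VM right after it gets stopped.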
>>>>
>>>>
>>>>             On Tue, Jan 31, 2017 at 10:26 AM, Rafael Weingärtner
>>>>             <rafaelweingartner@gmail.com> wrote:
>>>>
>>>>                 I will be able to re-check some code and answer you
>>>>                 with new thoughts on this problem tonight (the log
>>>>                 files you provided today have some very insightful
>>>>                 information regarding this problem). I believe this
>>>>                 is a bug in HA, a very tricky one that has not been
>>>>                 reported yet (at least I do not remember seeing
>>>>                 anything similar). And thanks to randomness/destiny
>>>>                 now you have found/uncovered a way to reproduce the
>>>>                 problem (bug?!).
>>>>
>>>>                 I do not suggest changing any data at all
>>>>                 directly in the database (unless you know the ACS code
>>>>                 well and are 100% sure of what you are doing). This can
>>>>                 cause even bigger and more complicated problems. If
>>>>                 you remove the HA feature from these VMs, this
>>>>                 problem will stop. However, if you can/are willing
>>>>                 to maintain the HA until we fully uncover the
>>>>                 problem, map and report it, this would be great.
>>>>                 BTW: why are you restarting MSs?
>>>>                 What are the configurations of your MSs?
>>>>
>>>>                 Thanks for your time and for sharing information
>>>>                 with our community.
>>>>
>>>>
>>>>
>>>>                 On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer
>>>>                 <francois.scheurer@everyware.ch> wrote:
>>>>
>>>>                     Dear Rafael
>>>>
>>>>
>>>>                     Many thanks for your deep analysis.
>>>>                     Nice humor with the planets alignments BTW ;)
>>>>
>>>>                     We also suspected the HA Flag to be related
>>>>                     to this issue, as we never got this issue in the past,
>>>>                     past, until we enabled the HA on all VMs and
>>>>                     Offerings.
>>>>
>>>>                     Here is also the workers list:
>>>>
>>>>                     mysql> select * from op_ha_work;
>>>>                     +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>>>>                     | id   | instance_id | type      | vm_type            | state    | mgmt_server_id | host_id | created             | tried | taken               | step  | time_to_try | updated |
>>>>                     +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>>>>                     |  105 |         343 | HA        | User               | Running  |   345049098498 |      29 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>>>>                     |  109 |         407 | HA        | ConsoleProxy       | Running  |   345049098498 |      33 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>>>>                     |  113 |         407 | HA        | ConsoleProxy       | Running  |   345049098498 |      33 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>>>>                     |  117 |         407 | HA        | ConsoleProxy       | Running  |   345049098498 |      33 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>>>>                     |  121 |         343 | HA        | User               | Running  |   345049098498 |      29 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>>>>                     |  125 |         405 | HA        | SecondaryStorageVm | Running  |   345049098498 |      29 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>>>>                     |  129 |         393 | HA        | User               | Running  |   345049098498 |      25 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056014 |      30 |
>>>>                     |  133 |         405 | HA        | SecondaryStorageVm | Running  |   345049098498 |      29 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056014 |       3 |
>>>>                     |  137 |         402 | HA        | ConsoleProxy       | Running  |   345049098498 |      25 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056014 |       3 |
>>>>                     |  345 |         402 | HA        | ConsoleProxy       | Starting |   345049098498 |      25 | 2015-07-06 13:09:26 |     1 | 2015-07-06 13:09:26 | Error |  1402527506 |     770 |
>>>>                     |  349 |         687 | HA        | SecondaryStorageVm | Starting |   345049098498 |      25 | 2015-07-06 13:09:26 |     1 | 2015-07-06 13:09:26 | Error |  1402527506 |       2 |
>>>>                     | 2057 |        1178 | Migration | User               | Running  |   345049098498 |      33 | 2017-01-31 07:56:08 |     0 | 2017-01-31 07:56:08 | Done  |  1451024774 |      95 |
>>>>                     | 2058 |        1736 | Migration | User               | Running  |   345049098498 |     224 | 2017-01-31 07:56:08 |     0 | 2017-01-31 07:56:08 | Done  |  1451024774 |      45 |
>>>>                     | 2059 |        1240 | Migration | User               | Running  |   345049098498 |       7 | 2017-01-31 07:56:08 |     0 | 2017-01-31 07:56:08 | Done  |  1451024774 |      61 |
>>>>                     | 2060 |        1178 | Migration | User               | Running  |   345049098122 |      33 | 2017-01-31 07:56:08 |     1 | 2017-01-31 08:06:37 | Done  |  1451025374 |      95 |
>>>>                     | 2061 |        1178 | Migration | User               | Stopped  |           NULL |      33 | 2017-01-31 07:56:09 |     5 | NULL                | Done  |  1451027900 |     109 |
>>>>                     | 2062 |        1690 | Migration | User               | Running  |   345049098498 |     224 | 2017-01-31 07:56:09 |     0 | 2017-01-31 07:56:09 | Done  |  1451024774 |     109 |
>>>>                     | 2063 |        1177 | Migration | User               | Running  |   345049098498 |      29 | 2017-01-31 07:56:09 |     0 | 2017-01-31 07:56:09 | Done  |  1451024774 |      82 |
>>>>                     | 2064 |        1690 | Migration | User               | Running  |   345049098122 |     224 | 2017-01-31 07:56:09 |     1 | 2017-01-31 07:58:18 | Done  |  1451024900 |     109 |
>>>>                     | 2065 |        1240 | Migration | User               | Running  |   345049098122 |       7 | 2017-01-31 07:56:09 |     1 | 2017-01-31 07:58:18 | Done  |  1451024900 |      61 |
>>>>                     | 2066 |        1736 | Migration | User               | Running  |   345049098122 |     224 | 2017-01-31 07:56:09 |     1 | 2017-01-31 07:58:18 | Done  |  1451024900 |      45 |
>>>>                     | 2067 |        1240 | Migration | User               | Stopped  |           NULL |       7 | 2017-01-31 07:56:09 |     5 | NULL                | Done  |  1451028003 |      80 |
>>>>                     | 2068 |        1178 | Migration | User               | Running  |   345049098122 |      33 | 2017-01-31 07:56:16 |     0 | 2017-01-31 07:56:16 | Done  |  1451024781 |      95 |
>>>>                     | 2069 |        1736 | Migration | User               | Running  |   345049098122 |     224 | 2017-01-31 07:56:16 |     0 | 2017-01-31 07:56:16 | Done  |  1451024781 |      45 |
>>>>                     | 2070 |        1240 | Migration | User               | Running  |   345049098122 |       7 | 2017-01-31 07:56:16 |     0 | 2017-01-31 07:56:16 | Done  |  1451024781 |      61 |
>>>>                     | 2071 |        1177 | Migration | User               | Running  |   345049098122 |      29 | 2017-01-31 07:56:16 |     0 | 2017-01-31 07:56:16 | Done  |  1451024781 |      82 |
>>>>                     | 2072 |        1177 | Migration | User               | Stopped  |           NULL |      29 | 2017-01-31 07:56:16 |     5 | NULL                | Done  |  1451027973 |     101 |
>>>>                     | 2073 |        1177 | Migration | User               | Running  |   345049098122 |      29 | 2017-01-31 07:56:16 |     1 | 2017-01-31 07:58:22 | Done  |  1451024904 |      82 |
>>>>                     | 2074 |        1690 | Migration | User               | Stopped  |           NULL |     224 | 2017-01-31 07:56:16 |     5 | NULL                | Done  |  1451027464 |     123 |
>>>>                     | 2075 |        1736 | Migration | User               | Stopped  |           NULL |     224 | 2017-01-31 07:56:16 |     5 | NULL                | Done  |  1451027477 |      59 |
>>>>                     +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>>>>
>>>>                     If we put a host in maintenance, it may
>>>>                     unfortunately happen that a timeout occurs and
>>>>                     CS aborts the migration.
>>>>                     As far as we can interpret the logs, it seems
>>>>                     (we speculate) that CS detects that the VM is
>>>>                     then still running on the original host, which
>>>>                     is in maintenance, and decides somehow to stop
>>>>                     it because of the "host not up" status.
>>>>                     We then need to start the VM several times,
>>>>                     until CS gives up stopping it.
>>>>
>>>>                     After restarting the MSs I guess that a similar
>>>>                     issue may happen: the hosts are disconnected
>>>>                     during the early stage and then reconnect.
>>>>                     Maybe CS then again stops the VMs because of the
>>>>                     "host not up" status?
>>>>                     Cf. 2017-01-30 08:44:06,157 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-6:ctx-0d738a8e
>>>>                     job-13737/job-192988 ctx-6e9c753f) The last
>>>>                     host of this VM is not UP or is not enabled,
>>>>                     host status is: Disconnected, host resource
>>>>                     state is: Enabled
>>>>
>>>>                     But the problem after the MSs restart is
>>>>                     definitely only impacting VMs listed in the
>>>>                     table op_ha_work.
>>>>
>>>>
>>>>                     To your questions:
>>>>                             We have 3 ACS MSs managing several
>>>>                     Zones. Each zone has a single pod and cluster.
>>>>                             The Cluster of i-638-1735-VM has this
>>>>                     summary:
>>>>                              Total number of VMs: 121
>>>>                              Total number hypervisors: 4
>>>>                              Total allocated RAM: 522.0 / 972 GB
>>>>                     (53.7 %)
>>>>                              Total allocated cores: 211
>>>>                              Total allocated storage: 11285 GB
>>>>                             Used Memory on the hosts is now varying
>>>>                     from 28 to 77%, so there is enough RAM.
>>>>                             I did not check the XS logs. The same
>>>>                     problem has arisen again this morning (shortly
>>>>                     after a restart of all MSs), so I will look in
>>>>                     the XS logs now.
>>>>
>>>>                             I have some fair experience with
>>>>                     Eclipse and Java; I will ask my boss if we can
>>>>                     go this way.
>>>>
>>>>                     The main issue for now is that restarting the
>>>>                     MSs is reproducing the problem, which is quite
>>>>                     annoying for the impacted VM's.
>>>>                     I thought of stopping the MSs, deleting all
>>>>                     HA/Migration entries in op_ha_work and starting
>>>>                     the MSs.
>>>>                     Is that a viable option? (or should I delete
>>>>                     the Migration entries only? or never touch this
>>>>                     table?)
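>>>>
>>>>                     For clarity, what we have in mind is something
>>>>                     along these lines (sketch only, based on the
>>>>                     columns of the op_ha_work output above; we have
>>>>                     not run the delete yet):
>>>>
>>>>                         -- first only look at what is still pending
>>>>                         select id, instance_id, type, state, step, taken
>>>>                         from op_ha_work
>>>>                         where type = "Migration"
>>>>                         order by instance_id, id;
>>>>
>>>>                         -- then, if it is safe at all, e.g. only the
>>>>                         -- Migration entries:
>>>>                         delete from op_ha_work where type = "Migration";
>>>>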
>>>>                     We will also very probably remove the HA Flag
>>>>                     on all VMs if it does not work reliably.
>>>>
>>>>                     Best Regards
>>>>                     Francois Scheurer
>>>>
>>>>
>>>>
>>>>                     Here are new logs from today regarding the same
>>>>                     issue.
>>>>                     I tried to get the important parts related to
>>>>                     the stop of i-189-1178-VM.
>>>>
>>>>                     regexp:
>>>>                     (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRequestedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae733|work-2049)
>>>>
>>>>                     2017-01-30 08:44:04,440 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989) Executing AsyncJobVO
>>>>                     {id:192989, userId: 1, accountId: 1,
>>>>                     instanceType: null, instanceId: null, cmd:
>>>>                     com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>                     rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>>>>                     F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>>>                     cmdVersion: 0, status: IN_PROGRESS,
>>>>                     processStatus: 0, resu
>>>>                     ltCode: 0, result: null, initMsid:
>>>>                     345049103441, completeMsid: null, lastUpdated:
>>>>                     null, lastPolled: null, created: Mon Jan 30
>>>>                     08:44:03 CET 2017}
>>>>                     2017-01-30 08:44:04,440 DEBUG
>>>>                     [c.c.v.VmWorkJobDispatcher]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989) Run VM work job:
>>>>                     com.cloud.vm.VmWorkMigrateAway for VM 1178, job
>>>>                     origin: 15393
>>>>
>>>>                     2017-01-30 08:44:05,217 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Deploy
>>>>                     avoids pods: [], clusters: [], hosts: [29]
>>>>                     2017-01-30 08:44:05,220 DEBUG
>>>>                     [c.c.a.m.ClusteredAgentManagerImpl]
>>>>                     (AgentConnectTaskPool-4:ctx-5b61f2e5) create
>>>>                     ClusteredAgentAttache for 232
>>>>                     2017-01-30 08:44:05,220 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea)
>>>>                     DeploymentPlanner allocation algorithm:
>>>>                     com.cloud.deploy.UserDispersingPlanner@1734b2e2
>>>>                     2017-01-30 08:44:05,220 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Trying to
>>>>                     allocate a host and storage pools from dc:17,
>>>>                     pod:25,cluster:25, requested cpu: 4200,
>>>>                     requested ram: 4294967296
>>>>                     2017-01-30 08:44:05,220 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Is ROOT
>>>>                     volume READY (pool already allocated)?: Yes
>>>>                     2017-01-30 08:44:05,220 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) This VM has
>>>>                     last host_id specified, trying to choose the
>>>>                     same host: 25
>>>>
>>>>                     2017-01-30 08:44:05,614 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) The last
>>>>                     host of this VM is not UP or is not enabled,
>>>>                     host status is: Disconnected, host resource
>>>>                     state is: Enabled
>>>>                     2017-01-30 08:44:05,614 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Cannot
>>>>                     choose the last host to deploy this VM
>>>>                     2017-01-30 08:44:05,614 DEBUG
>>>>                     [c.c.d.FirstFitPlanner]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Searching
>>>>                     resources only under specified Cluster: 25
>>>>
>>>>                     2017-01-30 08:44:05,777 DEBUG
>>>>                     [c.c.a.m.a.i.FirstFitAllocator]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea
>>>>                     FirstFitRoutingAllocator) Found 0 hosts for
>>>>                     allocation after prioritization: []
>>>>                     2017-01-30 08:44:05,777 DEBUG
>>>>                     [c.c.a.m.a.i.FirstFitAllocator]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea
>>>>                     FirstFitRoutingAllocator) Looking for
>>>>                     speed=4200Mhz, Ram=4096
>>>>                     2017-01-30 08:44:05,777 DEBUG
>>>>                     [c.c.a.m.a.i.FirstFitAllocator]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea
>>>>                     FirstFitRoutingAllocator) Host Allocator
>>>>                     returning 0 suitable hosts
>>>>                     2017-01-30 08:44:05,777 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) No suitable
>>>>                     hosts found
>>>>                     2017-01-30 08:44:05,777 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) No suitable
>>>>                     hosts found under this Cluster: 25
>>>>                     2017-01-30 08:44:05,787 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Could not
>>>>                     find suitable Deployment Destination for this
>>>>                     VM under any clusters, returning.
>>>>                     2017-01-30 08:44:05,787 DEBUG
>>>>                     [c.c.d.FirstFitPlanner]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Searching
>>>>                     resources only under specified Cluster: 25
>>>>                     2017-01-30 08:44:05,798 DEBUG
>>>>                     [c.c.d.FirstFitPlanner]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) The
>>>>                     specified cluster is in avoid set, returning.
>>>>                     2017-01-30 08:44:05,798 DEBUG
>>>>                     [c.c.v.VirtualMachineManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Unable to
>>>>                     find destination for migrating the vm
>>>>                     VM[User|i-189-1178-VM]
>>>>                     2017-01-30 08:44:05,798 WARN
>>>>                     [c.c.v.VirtualMachineManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Failed to
>>>>                     deploy vm 1178 with original planner, sending
>>>>                     HAPlanner
>>>>
>>>>                     2017-01-30 08:44:05,985 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Deploy
>>>>                     avoids pods: [], clusters: [], hosts: [29]
>>>>                     2017-01-30 08:44:05,985 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea)
>>>>                     DeploymentPlanner allocation algorithm:
>>>>                     com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>>>>                     2017-01-30 08:44:05,985 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Trying to
>>>>                     allocate a host and storage pools from dc:17,
>>>>                     pod:25,cluster:25, requested cpu: 4200,
>>>>                     requested ram: 4294967296
>>>>                     2017-01-30 08:44:05,985 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Is ROOT
>>>>                     volume READY (pool already allocated)?: Yes
>>>>                     2017-01-30 08:44:05,985 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) This VM has
>>>>                     last host_id specified, trying to choose the
>>>>                     same host: 25
>>>>
>>>>                     2017-01-30 08:44:06,022 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) The last
>>>>                     host of this VM is not UP or is not enabled,
>>>>                     host status is: Disconnected, host resource
>>>>                     state is: Enabled
>>>>                     2017-01-30 08:44:06,022 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Cannot
>>>>                     choose the last host to deploy this VM
>>>>                     2017-01-30 08:44:06,022 DEBUG
>>>>                     [c.c.d.FirstFitPlanner]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Searching
>>>>                     resources only under specified Cluster: 25
>>>>                     2017-01-30 08:44:06,039 DEBUG
>>>>                     [c.c.d.SkipHeuresticsPlanner]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Deploying vm
>>>>                     during HA process, so skipping disable
>>>>                     threshold check
>>>>
>>>>                     2017-01-30 08:44:06,151 DEBUG
>>>>                     [c.c.a.m.a.i.FirstFitAllocator]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea
>>>>                     FirstFitRoutingAllocator) Found 0 hosts for
>>>>                     allocation after prioritization: []
>>>>                     2017-01-30 08:44:06,151 DEBUG
>>>>                     [c.c.a.m.a.i.FirstFitAllocator]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea
>>>>                     FirstFitRoutingAllocator) Looking for
>>>>                     speed=4200Mhz, Ram=4096
>>>>                     2017-01-30 08:44:06,151 DEBUG
>>>>                     [c.c.a.m.a.i.FirstFitAllocator]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea
>>>>                     FirstFitRoutingAllocator) Host Allocator
>>>>                     returning 0 suitable hosts
>>>>                     2017-01-30 08:44:06,151 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) No suitable
>>>>                     hosts found
>>>>                     2017-01-30 08:44:06,151 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) No suitable
>>>>                     hosts found under this Cluster: 25
>>>>                     2017-01-30 08:44:06,157 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-4:ctx-017064e7
>>>>                     job-13733/job-192986) Publish async job-192986
>>>>                     complete on message bus
>>>>                     2017-01-30 08:44:06,157 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-4:ctx-017064e7
>>>>                     job-13733/job-192986) Wake up jobs related to
>>>>                     job-192986
>>>>                     2017-01-30 08:44:06,157 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-4:ctx-017064e7
>>>>                     job-13733/job-192986) Update db status for
>>>>                     job-192986
>>>>                     2017-01-30 08:44:06,157 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-6:ctx-0d738a8e
>>>>                     job-13737/job-192988 ctx-6e9c753f) The last
>>>>                     host of this VM is not UP or is not enabled,
>>>>                     host status is: Disconnected, host resource
>>>>                     state is: Enabled
>>>>                     2017-01-30 08:44:06,157 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-6:ctx-0d738a8e
>>>>                     job-13737/job-192988 ctx-6e9c753f) Cannot
>>>>                     choose the last host to deploy this VM
>>>>                     2017-01-30 08:44:06,157 DEBUG
>>>>                     [c.c.d.FirstFitPlanner]
>>>>                     (Work-Job-Executor-6:ctx-0d738a8e
>>>>                     job-13737/job-192988 ctx-6e9c753f) Searching
>>>>                     resources only under specified Cluster: 35
>>>>                     2017-01-30 08:44:06,160 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Could not
>>>>                     find suitable Deployment Destination for this
>>>>                     VM under any clusters, returning.
>>>>                     2017-01-30 08:44:06,160 DEBUG
>>>>                     [c.c.d.FirstFitPlanner]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Searching
>>>>                     resources only under specified Cluster: 25
>>>>                     2017-01-30 08:44:06,163 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute
>>>>                     sync-queue item: SyncQueueItemVO {id:63629,
>>>>                     queueId: 64423, contentType: AsyncJob,
>>>>                     contentId: 192991, lastProcessMsid:
>>>>                     345049098498, lastprocessNumber: 27,
>>>>                     lastProcessTime: Mon Jan 30 08:44:06 CET 2017,
>>>>                     created: Mon Jan 30 08:44:05 CET 2017}
>>>>                     2017-01-30 08:44:06,169 INFO
>>>>                     [o.a.c.f.j.i.AsyncJobMonitor]
>>>>                     (Work-Job-Executor-8:ctx-381e8d1a
>>>>                     job-14953/job-192992) Add job-192992 into job
>>>>                     monitoring
>>>>                     2017-01-30 08:44:06,184 DEBUG
>>>>                     [c.c.d.SkipHeuresticsPlanner]
>>>>                     (Work-Job-Executor-6:ctx-0d738a8e
>>>>                     job-13737/job-192988 ctx-6e9c753f) Deploying vm
>>>>                     during HA process, so skipping disable
>>>>                     threshold check
>>>>                     2017-01-30 08:44:06,186 DEBUG
>>>>                     [c.c.d.FirstFitPlanner]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) The
>>>>                     specified cluster is in avoid set, returning.
>>>>                     2017-01-30 08:44:06,186 DEBUG
>>>>                     [c.c.v.VirtualMachineManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Unable to
>>>>                     find destination for migrating the vm
>>>>                     VM[User|i-189-1178-VM]
>>>>                     2017-01-30 08:44:06,186 ERROR
>>>>                     [c.c.v.VmWorkJobHandlerProxy]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Invocation
>>>>                     exception, caused by:
>>>>                     com.cloud.exception.InsufficientServerCapacityException:
>>>>                     Unable to find a server to migrate
>>>>                     to.Scope=interface com.cloud.org.Cluster; id=25
>>>>                     2017-01-30 08:44:06,187 INFO
>>>>                     [c.c.v.VmWorkJobHandlerProxy]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989 ctx-ea83d6ea) Rethrow
>>>>                     exception
>>>>                     com.cloud.exception.InsufficientServerCapacityException:
>>>>                     Unable to find a server to migrate
>>>>                     to.Scope=interface com.cloud.org.Cluster; id=25
>>>>                     2017-01-30 08:44:06,187 DEBUG
>>>>                     [c.c.v.VmWorkJobDispatcher]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989) Done with run of VM work
>>>>                     job: com.cloud.vm.VmWorkMigrateAway for VM
>>>>                     1178, job origin: 15393
>>>>                     2017-01-30 08:44:06,187 ERROR
>>>>                     [c.c.v.VmWorkJobDispatcher]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989) Unable to complete
>>>>                     AsyncJobVO {id:192989, userId: 1, accountId: 1,
>>>>                     instanceType: null, instanceId: null, cmd:
>>>>                     com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>                     rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>>>                     cmdVersion: 0, status: IN_PROGRESS,
>>>>                     processStatus: 0, resultCode: 0, result: null,
>>>>                     initMsid: 345049103441, completeMsid: null,
>>>>                     lastUpdated: null, lastPolled: null, created:
>>>>                     Mon Jan 30 08:44:03 CET 2017}, job origin:15393
>>>>                     com.cloud.exception.InsufficientServerCapacityException:
>>>>                     Unable to find a server to migrate
>>>>                     to.Scope=interface com.cloud.org.Cluster; id=25
>>>>                             at
>>>>                     com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2390)
>>>>                             at
>>>>                     com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4520)
>>>>                             at
>>>>                     sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>>                     Method)
>>>>                             at
>>>>                     sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>                             at
>>>>                     sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>                             at
>>>>                     java.lang.reflect.Method.invoke(Method.java:606)
>>>>                             at
>>>>                     com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>                             at
>>>>                     com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>                             at
>>>>                     com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:103)
>>>>                             at
>>>>                     org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
>>>>                             at
>>>>                     org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>                             at
>>>>                     org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>>>>                             at
>>>>                     org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>>>>                             at
>>>>                     org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>>>>                             at
>>>>                     org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>                             at
>>>>                     org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>>                             at
>>>>                     java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>                             at
>>>>                     java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>                             at
>>>>                     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>                             at
>>>>                     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>                             at java.lang.Thread.run(Thread.java:745)
>>>>                     2017-01-30 08:44:06,188 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule
>>>>                     queued job-192991
>>>>                     2017-01-30 08:44:06,192 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-4:ctx-017064e7
>>>>                     job-13733/job-192986) Wake up jobs joined with
>>>>                     job-192986 and disjoin all subjobs created from
>>>>                     job- 192986
>>>>                     2017-01-30 08:44:06,196 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989) Complete async
>>>>                     job-192989, jobStatus: FAILED, resultCode: 0,
>>>>                     result:
>>>>                     rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZlckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vwc0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRDYXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5nL0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xvdWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vycm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4aOxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNhdXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQAEkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFuZy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QAEExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBzZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFjZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFyaW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2VySW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hlc3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EAfgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNxAH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5yZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhqYXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EAfgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pvYnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNsb3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRjaGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1wbCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRleHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdlZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2FsbHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5jb250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxsV2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5jb25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRvcnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdUaHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29ya2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAALVGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5tb2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEudXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMAAAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAg
ABSgAFdmFsdWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-omaW2FAIAAHhwAA
>>>>                     2017-01-30 08:44:06,201 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-7:ctx-bcf531e9
>>>>                     job-13729/job-192987 ctx-4c93bc31) Deploy
>>>>                     avoids pods: [], clusters: [], hosts: [3]
>>>>                     2017-01-30 08:44:06,201 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-7:ctx-bcf531e9
>>>>                     job-13729/job-192987 ctx-4c93bc31)
>>>>                     DeploymentPlanner allocation algorithm:
>>>>                     com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>>>>                     2017-01-30 08:44:06,201 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-7:ctx-bcf531e9
>>>>                     job-13729/job-192987 ctx-4c93bc31) Trying to
>>>>                     allocate a host and storage pools from dc:14,
>>>>                     pod:17,cluster:17, requested cpu: 4200,
>>>>                     requested ram: 4294967296
>>>>                     2017-01-30 08:44:06,201 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-7:ctx-bcf531e9
>>>>                     job-13729/job-192987 ctx-4c93bc31) Is ROOT
>>>>                     volume READY (pool already allocated)?: Yes
>>>>                     2017-01-30 08:44:06,201 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-7:ctx-bcf531e9
>>>>                     job-13729/job-192987 ctx-4c93bc31) This VM has
>>>>                     last host_id specified, trying to choose the
>>>>                     same host: 213
>>>>                     2017-01-30 08:44:06,213 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989) Publish async job-192989
>>>>                     complete on message bus
>>>>                     2017-01-30 08:44:06,213 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989) Wake up jobs related to
>>>>                     job-192989
>>>>                     2017-01-30 08:44:06,213 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989) Update db status for
>>>>                     job-192989
>>>>                     2017-01-30 08:44:06,219 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-6:ctx-0d738a8e
>>>>                     job-13737/job-192988 ctx-6e9c753f) Checking
>>>>                     resources in Cluster: 35 under Pod: 31
>>>>                     2017-01-30 08:44:06,227 DEBUG
>>>>                     [c.c.a.m.a.i.FirstFitAllocator]
>>>>                     (Work-Job-Executor-6:ctx-0d738a8e
>>>>                     job-13737/job-192988 ctx-6e9c753f
>>>>                     FirstFitRoutingAllocator) Looking for hosts in
>>>>                     dc: 23  pod:31 cluster:35
>>>>                     2017-01-30 08:44:06,229 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-8:ctx-381e8d1a
>>>>                     job-14953/job-192992) Executing AsyncJobVO
>>>>                     {id:192992, userId: 1, accountId: 1,
>>>>                     instanceType: null, instanceId: null, cmd:
>>>>                     com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>                     rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw,
>>>>                     cmdVersion: 0, status: IN_PROGRESS,
>>>>                     processStatus: 0, resultCode: 0, result: null,
>>>>                     initMsid: 345049098122, completeMsid: null,
>>>>                     lastUpdated: null, lastPolled: null, created:
>>>>                     Mon Jan 30 08:44:05 CET 2017}
>>>>                     2017-01-30 08:44:06,229 DEBUG
>>>>                     [c.c.v.VmWorkJobDispatcher]
>>>>                     (Work-Job-Executor-8:ctx-381e8d1a
>>>>                     job-14953/job-192992) Run VM work job:
>>>>                     com.cloud.vm.VmWorkMigrateAway for VM 1729, job
>>>>                     origin: 14953
>>>>                     2017-01-30 08:44:06,230 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute
>>>>                     sync-queue item: SyncQueueItemVO {id:63630,
>>>>                     queueId: 67180, contentType: AsyncJob,
>>>>                     contentId: 192990, lastProcessMsid:
>>>>                     345049098498, lastprocessNumber: 61,
>>>>                     lastProcessTime: Mon Jan 30 08:44:06 CET 2017,
>>>>                     created: Mon Jan 30 08:44:05 CET 2017}
>>>>                     2017-01-30 08:44:06,244 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-7:ctx-bcf531e9
>>>>                     job-13729/job-192987 ctx-4c93bc31) The last
>>>>                     host of this VM is not UP or is not enabled,
>>>>                     host status is: Disconnected, host resource
>>>>                     state is: Enabled
>>>>                     2017-01-30 08:44:06,262 DEBUG
>>>>                     [c.c.d.DeploymentPlanningManagerImpl]
>>>>                     (Work-Job-Executor-7:ctx-bcf531e9
>>>>                     job-13729/job-192987 ctx-4c93bc31) Cannot
>>>>                     choose the last host to deploy this VM
>>>>                     2017-01-30 08:44:06,263 DEBUG
>>>>                     [c.c.d.FirstFitPlanner]
>>>>                     (Work-Job-Executor-7:ctx-bcf531e9
>>>>                     job-13729/job-192987 ctx-4c93bc31) Searching
>>>>                     resources only under specified Cluster: 17
>>>>                     2017-01-30 08:44:06,262 INFO
>>>>                     [o.a.c.f.j.i.AsyncJobMonitor]
>>>>                     (Work-Job-Executor-9:ctx-f4d09efb
>>>>                     job-14965/job-192991) Add job-192991 into job
>>>>                     monitoring
>>>>                     2017-01-30 08:44:06,246 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-4:ctx-017064e7
>>>>                     job-13733/job-192986) Done executing
>>>>                     com.cloud.vm.VmWorkMigrateAway for job-192986
>>>>                     2017-01-30 08:44:06,244 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989) Wake up jobs joined with
>>>>                     job-192989 and disjoin all subjobs created from
>>>>                     job-192989
>>>>                     2017-01-30 08:44:06,269 INFO
>>>>                     [o.a.c.f.j.i.AsyncJobMonitor]
>>>>                     (Work-Job-Executor-4:ctx-017064e7
>>>>                     job-13733/job-192986) Remove job-192986 from
>>>>                     job monitoring
>>>>                     2017-01-30 08:44:06,279 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989) Done executing
>>>>                     com.cloud.vm.VmWorkMigrateAway for job-192989
>>>>                     2017-01-30 08:44:06,305 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (Work-Job-Executor-9:ctx-f4d09efb
>>>>                     job-14965/job-192991) Executing AsyncJobVO
>>>>                     {id:192991, userId: 1, accountId: 1,
>>>>                     instanceType: null, instanceId: null, cmd:
>>>>                     com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>                     rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw,
>>>>                     cmdVersion: 0, status: IN_PROGRESS,
>>>>                     processStatus: 0, resultCode: 0, result: null,
>>>>                     initMsid: 345049098122, completeMsid: null,
>>>>                     lastUpdated: null, lastPolled: null, created:
>>>>                     Mon Jan 30 08:44:05 CET 2017}
>>>>                     2017-01-30 08:44:06,305 DEBUG
>>>>                     [c.c.v.VmWorkJobDispatcher]
>>>>                     (Work-Job-Executor-9:ctx-f4d09efb
>>>>                     job-14965/job-192991) Run VM work job:
>>>>                     com.cloud.vm.VmWorkMigrateAway for VM 1605, job
>>>>                     origin: 14965
>>>>                     2017-01-30 08:44:06,306 DEBUG
>>>>                     [c.c.a.m.AgentManagerImpl]
>>>>                     (AgentManager-Handler-13:null) Ping from 258
>>>>                     2017-01-30 08:44:06,319 DEBUG
>>>>                     [c.c.a.m.a.i.FirstFitAllocator]
>>>>                     (Work-Job-Executor-6:ctx-0d738a8e
>>>>                     job-13737/job-192988 ctx-6e9c753f
>>>>                     FirstFitRoutingAllocator) List of hosts in
>>>>                     ascending order of number of VMs: [224, 19, 21, 15]
>>>>                     2017-01-30 08:44:06,322 DEBUG
>>>>                     [c.c.a.m.a.i.FirstFitAllocator]
>>>>                     (Work-Job-Executor-6:ctx-0d738a8e
>>>>                     job-13737/job-192988 ctx-6e9c753f
>>>>                     FirstFitRoutingAllocator) FirstFitAllocator has
>>>>                     0 hosts to check for allocation: []
>>>>                     2017-01-30 08:44:06,322 INFO
>>>>                     [c.c.a.m.AgentManagerImpl]
>>>>                     (AgentManager-Handler-13:null) PingMap for
>>>>                     agent: 258 will not be updated because agent is
>>>>                     no longer in the PingMap
>>>>                     2017-01-30 08:44:06,326 INFO
>>>>                     [o.a.c.f.j.i.AsyncJobMonitor]
>>>>                     (Work-Job-Executor-5:ctx-32f1e087
>>>>                     job-15393/job-192989) Remove job-192989 from
>>>>                     job monitoring
>>>>
>>>>                     2017-01-30 08:54:38,139 INFO
>>>>                     [c.c.v.VirtualMachineManagerImpl]
>>>>                     (Work-Job-Executor-20:ctx-a988090f
>>>>                     job-14965/job-193019 ctx-052eb4f5) Migration
>>>>                     cancelled because state has changed:
>>>>                     VM[User|i-189-1178-VM]
>>>>                     2017-01-30 08:54:38,139 DEBUG
>>>>                     [c.c.v.VirtualMachineManagerImpl]
>>>>                     (Work-Job-Executor-20:ctx-a988090f
>>>>                     job-14965/job-193019 ctx-052eb4f5) Unable to
>>>>                     migrate VM due to: Migration cancelled because
>>>>                     state has changed: VM[User|i-189-1178-VM]
>>>>                     2017-01-30 08:54:38,167 DEBUG
>>>>                     [c.c.c.CapacityManagerImpl]
>>>>                     (Work-Job-Executor-20:ctx-a988090f
>>>>                     job-14965/job-193019 ctx-052eb4f5) VM state
>>>>                     transitted from :Running to Stopping with
>>>>                     event: StopRequestedvm's original host id: 29
>>>>                     new host id: 33 host id before state transition: 33
>>>>                     2017-01-30 08:54:38,177 DEBUG
>>>>                     [c.c.a.m.DirectAgentAttache]
>>>>                     (DirectAgent-325:ctx-e540d7f5) Seq
>>>>                     19-2355664080091480198: Response Received:
>>>>                     2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>>                     (DirectAgent-325:ctx-e540d7f5) Seq
>>>>                     19-2355664080091480198: Processing:  { Ans: ,
>>>>                     MgmtId: 345049098498, via: 19, Ver: v1, Flags:
>>>>                     10,
>>>>                     [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,"result":true,"details":"","wait":0}}]
>>>>                     }
>>>>                     2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>>                     (RouterMonitor-1:ctx-2e80f405) Seq
>>>>                     19-2355664080091480198: Received:  { Ans: ,
>>>>                     MgmtId: 345049098498, via: 19, Ver: v1, Flags:
>>>>                     10, { NetworkUsageAnswer } }
>>>>                     2017-01-30 08:54:38,177 DEBUG
>>>>                     [c.c.a.m.AgentManagerImpl]
>>>>                     (RouterMonitor-1:ctx-2e80f405) Details from
>>>>                     executing class
>>>>                     com.cloud.agent.api.NetworkUsageCommand:
>>>>                     2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>>                     (Work-Job-Executor-20:ctx-a988090f
>>>>                     job-14965/job-193019 ctx-052eb4f5) Seq
>>>>                     33-1917970491306410035: Sending  { Cmd ,
>>>>                     MgmtId: 345049098498, via:
>>>>                     33(ewcstack-vh003-prod), Ver: v1, Flags:
>>>>                     100011,
>>>>                     [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>>                     }
>>>>                     2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>>                     (Work-Job-Executor-20:ctx-a988090f
>>>>                     job-14965/job-193019 ctx-052eb4f5) Seq
>>>>                     33-1917970491306410035: Executing:  { Cmd ,
>>>>                     MgmtId: 345049098498, via:
>>>>                     33(ewcstack-vh003-prod), Ver: v1, Flags:
>>>>                     100011,
>>>>                     [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>>                     }
>>>>                     2017-01-30 08:54:38,178 DEBUG
>>>>                     [c.c.a.m.DirectAgentAttache]
>>>>                     (DirectAgent-43:ctx-da30f109) Seq
>>>>                     33-1917970491306410035: Executing request
>>>>                     2017-01-30 08:54:38,202 DEBUG
>>>>                     [c.c.a.m.DirectAgentAttache]
>>>>                     (DirectAgent-326:ctx-4fb88aa7) Seq
>>>>                     21-1035546439318503505: Executing request
>>>>
>>>>                     2017-01-30 09:05:55,628 INFO
>>>>                     [c.c.h.HighAvailabilityManagerImpl]
>>>>                     (HA-Worker-4:ctx-36aae733 work-2049) Processing
>>>>                     HAWork[2049-Migration-1178-Stopped-Migrating]
>>>>                     2017-01-30 09:05:55,633 WARN
>>>>                     [o.a.c.f.j.AsyncJobExecutionContext]
>>>>                     (HA-Worker-4:ctx-36aae733 work-2049) Job is
>>>>                     executed without a context, setup psudo job for
>>>>                     the executing thread
>>>>                     2017-01-30 09:05:55,645 DEBUG
>>>>                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                     (HA-Worker-4:ctx-36aae733 work-2049) Sync
>>>>                     job-193045 execution on object VmWorkJobQueue.1178
>>>>
>>>>
>>>>
>>>>
>>>>                     On 31.01.2017 00:42, Rafael Weingärtner wrote:
>>>>
>>>>                         This seems to be one of those nasty bugs
>>>>                         that are very hard to debug. It seems to
>>>>                         be caused by a combination of different
>>>>                         factors, such as planetary alignments ;)
>>>>
>>>>                         At class “VirtualMachineManagerImpl” line
>>>>                         2395 (method
>>>>                         “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>>>                         long, DeploymentPlanner)”) is the start of
>>>>                         the code flow that triggers the
>>>>                         migration. The point here is that, if an
>>>>                         error occurs during the migration,
>>>>                         the VM will be stopped; for some reason
>>>>                         this is executed only for HA VMs.
>>>>
>>>>                         Going a little bit further, we find the
>>>>                         point that is triggering the error
>>>>                         in the method
>>>>                         “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO,
>>>>                         long, DeployDestination)” at line 1941.
>>>>                         This line will call the method
>>>>                         “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>>>>                         Event,
>>>>                         Long, ItWorkVO, Step)” to change the VM
>>>>                         state to migrating/prepare
>>>>                         migration/or something like that (I am
>>>>                         guessing here), but for some reason,
>>>>                         the state of the VM is not the one expected
>>>>                         and this is causing a
>>>>                         “ConcurrentOperationException” to be
>>>>                         thrown. The problem is that the
>>>>                         “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>>>>                         Event,
>>>>                         Long, ItWorkVO, Step)” does not seem to
>>>>                         provide many useful log entries.
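>>>>
>>>>                         To make that flow easier to follow, here
>>>>                         is a small, self-contained toy model of
>>>>                         it in plain Java (the class, field and
>>>>                         method names below are illustrative
>>>>                         stand-ins, not the actual ACS source):
>>>>                         when the state check fails, a non-HA VM
>>>>                         is left alone, while an HA-enabled VM is
>>>>                         stopped.
>>>>
>>>>                         public class MigrateAwaySketch {
>>>>
>>>>                             enum State { Running, Migrating, Stopping, Stopped }
>>>>
>>>>                             static class Vm {
>>>>                                 final String name;
>>>>                                 final boolean haEnabled;
>>>>                                 State state = State.Running;
>>>>                                 Vm(String name, boolean haEnabled) { this.name = name; this.haEnabled = haEnabled; }
>>>>                             }
>>>>
>>>>                             // Stands in for the changeState() step: refuse the transition
>>>>                             // when the VM is no longer in the state the caller expected.
>>>>                             static void changeState(Vm vm, State expected, State next) {
>>>>                                 if (vm.state != expected) {
>>>>                                     throw new IllegalStateException("Migration cancelled because state has changed: " + vm.name);
>>>>                                 }
>>>>                                 vm.state = next;
>>>>                             }
>>>>
>>>>                             // Stands in for the migrate-away flow: on failure, only HA VMs get stopped.
>>>>                             static void migrateAway(Vm vm) {
>>>>                                 try {
>>>>                                     changeState(vm, State.Running, State.Migrating);
>>>>                                     // ... the actual migration commands would follow here ...
>>>>                                 } catch (IllegalStateException e) {
>>>>                                     System.out.println("Unable to migrate " + vm.name + " due to: " + e.getMessage());
>>>>                                     if (vm.haEnabled) {
>>>>                                         vm.state = State.Stopping;  // a StopCommand goes to the host,
>>>>                                         vm.state = State.Stopped;   // so the guest ends up Stopped
>>>>                                     }
>>>>                                 }
>>>>                             }
>>>>
>>>>                             public static void main(String[] args) {
>>>>                                 Vm vm = new Vm("i-638-1735-VM", true);
>>>>                                 vm.state = State.Stopping;  // state changed behind this job's back
>>>>                                 migrateAway(vm);
>>>>                                 System.out.println(vm.name + " is now " + vm.state);
>>>>                             }
>>>>                         }
>>>>
>>>>                         Running it prints the same "Unable to
>>>>                         migrate ... Migration cancelled because
>>>>                         state has changed" sequence seen in the
>>>>                         quoted log, and the guest only ends up
>>>>                         Stopped because its HA flag is set.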
>>>>
>>>>                         Just to enable us to discuss the problem a
>>>>                         little further, could you tell
>>>>                         us a little bit about your environment
>>>>                         setting? How many ACS MSs do you
>>>>                         have? What is the VM workload being
>>>>                         migrated (I mean, the number of VMs and
>>>>                         computing resources)? Do the other servers
>>>>                         have enough computing resources
>>>>                         to support these VMs? Could you also check
>>>>                         the XenServer logs and see what
>>>>                         happened on both the source and
>>>>                         destination hosts when the migration for
>>>>                         VM “i-638-1735-VM” was triggered?
>>>>
>>>>                         Do you have the knowledge/skills to use an
>>>>                         IDE (e.g. Eclipse) to connect to
>>>>                         a running JVM and debug the code? For me
>>>>                         it will not be possible to
>>>>                         go further without looking at the variables
>>>>                         in a running environment.
>>>>
>>>>
>>>>                         On Mon, Jan 30, 2017 at 4:28 AM, Francois
>>>>                         Scheurer <
>>>>                         francois.scheurer@everyware.ch> wrote:
>>>>
>>>>                             Dear Rafael
>>>>
>>>>
>>>>                             Sorry for the late answer (was sick a
>>>>                             couple of days).
>>>>
>>>>                             I confirm we are migrating inside the
>>>>                             same XenServer pool (within a CS
>>>>                             Cluster).
>>>>
>>>>
>>>>                             Best Regards
>>>>                             Francois
>>>>
>>>>
>>>>
>>>>                             On 26.01.2017 04:35, Rafael Weingärtner
>>>>                             wrote:
>>>>
>>>>                                 Found the code that is executed.
>>>>                                 I need you to confirm whether you are
>>>>                                 migrating within a cluster or across
>>>>                                 clusters.
>>>>
>>>>                                 On Wed, Jan 25, 2017 at 11:57 AM,
>>>>                                 Rafael Weingärtner <
>>>>                                 rafaelweingartner@gmail.com>
>>>>                                 wrote:
>>>>
>>>>                                 I do not have access to ACS code
>>>>                                 right now, but I suggest starting
>>>>
>>>>                                     debugging here:
>>>>
>>>>                                     2017-01-24 18:05:28,427 DEBUG
>>>>                                     [c.c.c.CapacityManagerImpl]
>>>>
>>>>                                         (Work-Job-Executor-156:ctx-14b86f5e
>>>>                                         job-60216/job-191380
>>>>                                         ctx-175d37df)
>>>>                                         VM state transitted from
>>>>                                         :Running to Stopping with
>>>>                                         event:
>>>>                                         StopRequestedvm's
>>>>                                         original host id: 224 new
>>>>                                         host id: 15 host id before
>>>>                                         state transition:
>>>>                                         15
>>>>
>>>>                                         I would try to understand
>>>>                                         first why ACS requested the
>>>>                                         hypervisor to stop
>>>>
>>>>                                     the VM. I mean the following:
>>>>                                     check the source code for
>>>>                                     conditions that
>>>>                                     would make ACS request the
>>>>                                     shutdown of VMs.
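>>>>
>>>>                                     As a reading aid for the "VM state
>>>>                                     transitted from :Running to Stopping
>>>>                                     with event: StopRequested" line above,
>>>>                                     here is a tiny self-contained sketch
>>>>                                     (illustrative names only, not the real
>>>>                                     ACS state machine) of such an
>>>>                                     event-driven transition table; the
>>>>                                     question is then which code path fed
>>>>                                     the StopRequested event into it.
>>>>
>>>>                                     public class VmStateSketch {
>>>>
>>>>                                         enum State { Running, Migrating, Stopping, Stopped }
>>>>                                         enum Event { MigrationRequested, StopRequested, OperationSucceeded }
>>>>
>>>>                                         // Accept only whitelisted (state, event) pairs; anything else fails loudly.
>>>>                                         static State transit(State current, Event event) {
>>>>                                             State next = null;
>>>>                                             if (current == State.Running && event == Event.MigrationRequested) next = State.Migrating;
>>>>                                             if (current == State.Running && event == Event.StopRequested) next = State.Stopping;
>>>>                                             if (current == State.Stopping && event == Event.OperationSucceeded) next = State.Stopped;
>>>>                                             if (next == null) throw new IllegalStateException("No transition from " + current + " on " + event);
>>>>                                             System.out.println("VM state transitted from :" + current + " to " + next + " with event: " + event);
>>>>                                             return next;
>>>>                                         }
>>>>
>>>>                                         public static void main(String[] args) {
>>>>                                             State s = State.Running;
>>>>                                             s = transit(s, Event.StopRequested);       // what the quoted log shows
>>>>                                             s = transit(s, Event.OperationSucceeded);  // the VM ends up Stopped
>>>>                                         }
>>>>                                     }
>>>>
>>>>                                     Running it reproduces the Running ->
>>>>                                     Stopping -> Stopped sequence from the
>>>>                                     quoted log; the open question is which
>>>>                                     caller issued the StopRequested event.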
>>>>                                     BTW: are you migrating within a
>>>>                                     cluster or across clusters?
>>>>
>>>>                                     I will be able to analyze this
>>>>                                     further only after work hours,
>>>>                                     so if you
>>>>                                     find anything, keep me posted.
>>>>
>>>>                                     On Wed, Jan 25, 2017 at 11:45
>>>>                                     AM, Francois Scheurer <
>>>>                                     francois.scheurer@everyware.ch>
>>>>                                     wrote:
>>>>
>>>>                                     Hello Rafael
>>>>
>>>>
>>>>                                         I think the important log
>>>>                                         lines are these:
>>>>
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,407 INFO
>>>>                                         [c.c.v.VirtualMachineManagerImpl]
>>>>                                         (Work-Job-Executor-156:ctx-14b86f5e
>>>>                                         job-60216/job-191380
>>>>                                         ctx-175d37df) Migration
>>>>                                         cancelled because state has
>>>>                                         changed: VM[User|i-638-1736-VM]
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,407 DEBUG
>>>>                                         [c.c.v.VirtualMachineManagerImpl]
>>>>                                         (Work-Job-Executor-156:ctx-14b86f5e
>>>>                                         job-60216/job-191380
>>>>                                         ctx-175d37df) Unable to
>>>>                                         migrate VM due to:
>>>>                                         Migration
>>>>                                         cancelled because state has
>>>>                                         changed: VM[User|i-638-1736-VM]
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,427 DEBUG
>>>>                                         [c.c.c.CapacityManagerImpl]
>>>>                                         (Work-Job-Executor-156:ctx-14b86f5e
>>>>                                         job-60216/job-191380
>>>>                                         ctx-175d37df) VM state
>>>>                                         transitted from :Running to
>>>>                                         Stopping with event:
>>>>                                         StopRequestedvm's original
>>>>                                         host id: 224 new host
>>>>                                         id:
>>>>                                         15 host id before state
>>>>                                         transition: 15
>>>>
>>>>                                         The really disturbing thing
>>>>                                         is that CS stops the VM
>>>>                                         again if we start it
>>>>                                         manually, apparently
>>>>                                         because the migrating job is
>>>>                                         still running.
>>>>
>>>>                                         Note that the VM HA flag is
>>>>                                         enabled.
>>>>
>>>>
>>>>
>>>>                                         Thank you.
>>>>
>>>>
>>>>                                         Best Regards.
>>>>
>>>>                                         Francois
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>                                         more details:
>>>>
>>>>
>>>>
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,148 DEBUG
>>>>                                         [o.a.c.e.o.VolumeOrchestrator]
>>>>                                         (Work-Job-Executor-156:ctx-14b86f5e
>>>>                                         job-60216/job-191380
>>>>                                         ctx-175d37df) Preparing 3
>>>>                                         volumes for
>>>>                                         VM[User|i-638-1736-VM]
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,155 DEBUG
>>>>                                         [c.c.v.VmWorkJobDispatcher]
>>>>                                         (Work-Job-Executor-122:ctx-a8bacc64
>>>>                                         job-14953/job-191351) Done
>>>>                                         with run of VM work job:
>>>>                                         com.cloud.vm.VmWorkMigrateAway
>>>>                                         for VM 1411, job origin: 14953
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,155 DEBUG
>>>>                                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                                         (Work-Job-Executor-122:ctx-a8bacc64
>>>>                                         job-14953/job-191351) Done
>>>>                                         executing
>>>>                                         com.cloud.vm.VmWorkMigrateAway
>>>>                                         for
>>>>                                         job-191351
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,160 INFO
>>>>                                         [o.a.c.f.j.i.AsyncJobMonitor]
>>>>                                         (Work-Job-Executor-122:ctx-a8bacc64
>>>>                                         job-14953/job-191351)
>>>>                                         Remove job-191351 from job
>>>>                                         monitoring
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,168 INFO
>>>>                                         [c.c.h.HighAvailabilityManagerImpl]
>>>>                                         (HA-Worker-2:ctx-f3bcf43c
>>>>                                         work-1989)
>>>>                                         Completed
>>>>                                         HAWork[1989-Migration-1411-Running-Migrating]
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,246 DEBUG
>>>>                                         [c.c.a.m.ClusteredAgentAttache]
>>>>                                         (Work-Job-Executor-156:ctx-14b86f5e
>>>>                                         job-60216/job-191380
>>>>                                         ctx-175d37df) Seq
>>>>                                         19-8526158519542516375:
>>>>                                         Forwarding
>>>>                                         Seq 19-8526158519542516375:
>>>>                                         { Cmd , MgmtId:
>>>>                                         345049103441, via:
>>>>                                         19(ewcstack-vh010-prod),
>>>>                                         Ver: v1, Flags: 100111, [{
>>>>                                         "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>>                                         :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>>                                         ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>>                                         869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>>                                         (64-bit)","platformEmulator":"Other
>>>>                                         install
>>>>                                         media","bootArgs":"","enable
>>>>                                         HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>>                                         "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>>                                         ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>>                                         rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>>                                         age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
>>>>                                         hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>>                                         -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>>                                         stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>>                                         4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>>                                         apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f
>>>>                                         6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>>                                         t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>>                                         ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>>                                         7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>>                                         ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3
>>>>                                         f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>>                                         :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>>                                         gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>>                                         XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>>                                         -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>>                                         storagePort":"0",
>>>>                                         "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>>                                         a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>>                                         ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>>                                         ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>>                                         ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
>>>>                                         poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>>                                         f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>>                                         ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>>                                         UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>>                                         DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2
>>>>                                         2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>>                                         ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>>                                         deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>>                                         path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>>                                         K","_details":{"managed":"false","storagePort":"0","stor
>>>>                                         ageHost":"localhos
>>>>                                         t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>>                                         tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>>                                         8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>>                                         apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>>                                         :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup","
>>>>                                         host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>>                                         13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>>                                         -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>>
>>>>                                         e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>>                                         ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>>                                         ,"volumeId":86
>>>>                                         57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>>                                         provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>>                                         ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>>                                         20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>>                                         false","storagePort":"0","storageHost":"localhost","volumeSize":"42949
>>>>                                         6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>>                                         teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>>                                         id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>>                                         ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS
>>>>                                         4.5
>>>>                                         (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>>                                         ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>>                                         Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>>                                         d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>>                                         a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>>                                         gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>>                                         .8.8","dns2":"8.8
>>>>                                         .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>>                                         vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>>                                         abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}]
>>>>                                         } to
>>>>                                         345049098498
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,399
>>>>                                         DEBUG [c.c.a.t.Request]
>>>>                                         (AgentManager-Handler-10:null)
>>>>                                         Seq 19-8526158519542516375:
>>>>                                         Processing:
>>>>                                         {
>>>>                                         Ans: , MgmtId:
>>>>                                         345049103441, via: 19, Ver:
>>>>                                         v1, Flags: 110,
>>>>                                         [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>>                                         true,"wait":0}}]
>>>>                                         }
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,399 DEBUG
>>>>                                         [c.c.a.m.AgentAttache]
>>>>                                         (AgentManager-Handler-10:null)
>>>>                                         Seq 19-8526158519542516375:
>>>>                                         No more
>>>>                                         commands
>>>>                                         found
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,399
>>>>                                         DEBUG [c.c.a.t.Request]
>>>>                                         (Work-Job-Executor-156:ctx-14b86f5e
>>>>                                         job-60216/job-191380
>>>>                                         ctx-175d37df)
>>>>                                         Seq 19-8526158519542516375:
>>>>                                         Received:  { Ans: , MgmtId:
>>>>                                         345049103441,
>>>>                                         via:
>>>>                                         19, Ver: v1, Flags: 110, {
>>>>                                         PrepareForMigrationAnswer } }
>>>>
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,407 INFO
>>>>                                         [c.c.v.VirtualMachineManagerImpl]
>>>>                                         (Work-Job-Executor-156:ctx-14b86f5e
>>>>                                         job-60216/job-191380
>>>>                                         ctx-175d37df) Migration
>>>>                                         cancelled because state has
>>>>                                         changed: VM[User|i-638-1736-VM]
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,407 DEBUG
>>>>                                         [c.c.v.VirtualMachineManagerImpl]
>>>>                                         (Work-Job-Executor-156:ctx-14b86f5e
>>>>                                         job-60216/job-191380
>>>>                                         ctx-175d37df) Unable to
>>>>                                         migrate VM due to:
>>>>                                         Migration
>>>>                                         cancelled because state has
>>>>                                         changed: VM[User|i-638-1736-VM]
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,427 DEBUG
>>>>                                         [c.c.c.CapacityManagerImpl]
>>>>                                         (Work-Job-Executor-156:ctx-14b86f5e
>>>>                                         job-60216/job-191380
>>>>                                         ctx-175d37df) VM state
>>>>                                         transitted from :Running to
>>>>                                         Stopping with event:
>>>>                                         StopRequestedvm's original
>>>>                                         host id: 224 new host
>>>>                                         id:
>>>>                                         15 host id before state
>>>>                                         transition: 15
>>>>
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:28,438 DEBUG
>>>>                                         [c.c.a.m.ClusteredAgentAttache]
>>>>                                         (Work-Job-Executor-156:ctx-14b86f5e
>>>>                                         job-60216/job-191380
>>>>                                         ctx-175d37df) Seq
>>>>                                         15-6889944479923437643:
>>>>                                         Forwarding
>>>>                                         Seq 15-6889944479923437643:
>>>>                                         { Cmd , MgmtId:
>>>>                                         345049103441, via:
>>>>                                         15(ewcstack-vh011-prod),
>>>>                                         Ver: v1, Flags: 100011,
>>>>                                         [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>                                         eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>>                                         638-1736-VM","wait":0}}]
>>>>                                         } to 345049098498
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:30,335 DEBUG
>>>>                                         [c.c.a.m.AgentManagerImpl]
>>>>                                         (AgentManager-Handler-1:null)
>>>>                                         SeqA
>>>>                                         254-800498:
>>>>                                         Processing Seq 254-800498: 
>>>>                                         { Cmd , MgmtId: -1, via:
>>>>                                         254, Ver: v1,
>>>>                                         Flags:
>>>>                                         11,
>>>>                                         [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>>                                         proxyVmId":1530,"_loadInfo":"{\n
>>>>                                         \"connections\":
>>>>                                         []\n}","wait":0}}] }
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:30,340 DEBUG
>>>>                                         [c.c.a.m.AgentManagerImpl]
>>>>                                         (AgentManager-Handler-1:null)
>>>>                                         SeqA
>>>>                                         254-800498:
>>>>                                         Sending Seq 254-800498:  {
>>>>                                         Ans: , MgmtId:
>>>>                                         345049103441, via: 254, Ver:
>>>>                                         v1,
>>>>                                         Flags: 100010,
>>>>                                         [{"com.cloud.agent.api.AgentCo
>>>>                                         ntrolAnswer":{"result":true,"wait":0}}]
>>>>                                         }
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,447
>>>>                                         DEBUG [c.c.a.t.Request]
>>>>                                         (AgentManager-Handler-11:null)
>>>>                                         Seq 15-6889944479923437632:
>>>>                                         Processing:
>>>>                                         {
>>>>                                         Ans: , MgmtId:
>>>>                                         345049103441, via: 15, Ver:
>>>>                                         v1, Flags: 10,
>>>>                                         [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>>                                         e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>>                                         ":true,"details":"Stop
>>>>                                         VM i-638-1735-VM
>>>>                                         Succeed","wait":0}}] }
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,447
>>>>                                         DEBUG [c.c.a.t.Request]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae)
>>>>                                         Seq 15-6889944479923437632:
>>>>                                         Received:  { Ans: , MgmtId:
>>>>                                         345049103441,
>>>>                                         via:
>>>>                                         15, Ver: v1, Flags: 10, {
>>>>                                         StopAnswer } }
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,470 DEBUG
>>>>                                         [c.c.v.VirtualMachineManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae)
>>>>                                         VM[User|i-638-1735-VM] is
>>>>                                         stopped on
>>>>                                         the
>>>>                                         host. Proceeding to release
>>>>                                         resource held.
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,475 DEBUG
>>>>                                         [c.c.n.NetworkModelImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) Service
>>>>                                         SecurityGroup is not
>>>>                                         supported
>>>>                                         in the network id=519
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,478 DEBUG
>>>>                                         [o.a.c.e.o.NetworkOrchestrator]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) Changing
>>>>                                         active number of nics for
>>>>                                         network id=519 on -1
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,488 DEBUG
>>>>                                         [o.a.c.e.o.NetworkOrchestrator]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) Asking
>>>>                                         VpcVirtualRouter to release
>>>>                                         NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>>                                         92.168.1.11-null
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,488 DEBUG
>>>>                                         [c.c.v.VirtualMachineManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) Successfully
>>>>                                         released network
>>>>                                         resources
>>>>                                         for the vm
>>>>                                         VM[User|i-638-1735-VM]
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,488 DEBUG
>>>>                                         [c.c.v.VirtualMachineManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) Successfully
>>>>                                         released storage
>>>>                                         resources
>>>>                                         for the vm
>>>>                                         VM[User|i-638-1735-VM]
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,496 DEBUG
>>>>                                         [c.c.c.CapacityManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) VM state
>>>>                                         transitted from :Stopping to
>>>>                                         Stopped with event:
>>>>                                         OperationSucceededvm's
>>>>                                         original host id: 15 new host
>>>>                                         id: null host id before
>>>>                                         state transition: 15
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,505 DEBUG
>>>>                                         [c.c.c.CapacityManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) Hosts's
>>>>                                         actual total CPU: 82976 and
>>>>                                         CPU
>>>>                                         after applying
>>>>                                         overprovisioning: 331904
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,505 DEBUG
>>>>                                         [c.c.c.CapacityManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) Hosts's
>>>>                                         actual total RAM:
>>>>                                         261961927808
>>>>                                         and RAM after applying
>>>>                                         overprovisioning: 261961924608
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,505 DEBUG
>>>>                                         [c.c.c.CapacityManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) release cpu
>>>>                                         from host: 15, old used:
>>>>                                         71300,reserved: 2100,
>>>>                                         actual total: 82976, total
>>>>                                         with overprovisioning:
>>>>                                         331904; new used:
>>>>                                         67100,reserved:6300;
>>>>                                         movedfromreserved:
>>>>                                         false,moveToReserveredtrue
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,505 DEBUG
>>>>                                         [c.c.c.CapacityManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) release mem
>>>>                                         from host: 15, old used:
>>>>                                         98784247808,reserved:
>>>>                                         1073741824, total:
>>>>                                         261961924608; new used:
>>>>                                         94489280512,reserved:5368709120;
>>>>                                         movedfromreserved:
>>>>
>>>>                                         false,moveToReserveredtrue
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,538 DEBUG
>>>>                                         [c.c.n.NetworkModelImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) Service
>>>>                                         SecurityGroup is not
>>>>                                         supported
>>>>                                         in the network id=519
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,557 DEBUG
>>>>                                         [c.c.n.NetworkModelImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) Service
>>>>                                         SecurityGroup is not
>>>>                                         supported
>>>>                                         in the network id=519
>>>>
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,588 ERROR
>>>>                                         [c.c.v.VmWorkJobHandlerProxy]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) Invocation
>>>>                                         exception, caused by:
>>>>                                         com.cloud.utils.exception.CloudRuntimeException:
>>>>                                         Unable to migrate
>>>>                                         VM[User|i-638-1735-VM]
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,589 INFO
>>>>                                         [c.c.v.VmWorkJobHandlerProxy]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379
>>>>                                         ctx-bbabfeae) Rethrow exception
>>>>                                         com.cloud.utils.exception.CloudRuntimeException:
>>>>                                         Unable to migrate
>>>>                                         VM[User|i-638-1735-VM]
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,589 DEBUG
>>>>                                         [c.c.v.VmWorkJobDispatcher]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379) Done
>>>>                                         with run of VM work job:
>>>>                                         com.cloud.vm.VmWorkMigrateAway
>>>>                                         for VM 1735, job origin: 13733
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,589 ERROR
>>>>                                         [c.c.v.VmWorkJobDispatcher]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379)
>>>>                                         Unable to complete
>>>>                                         AsyncJobVO {id:191379, userId:
>>>>                                         1,
>>>>                                         accountId: 1, instanceType:
>>>>                                         null, instanceId: null, cmd:
>>>>                                         com.cloud.vm.VmWorkMigrateAway,
>>>>                                         cmdInfo:
>>>>                                         rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>>                                         1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>>                                         0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>>                                         VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>>                                         t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>>                                         FuYWdlckltcGwAAAAAAAAA4A,
>>>>                                         cmdVersion: 0, status:
>>>>                                         IN_PROGRESS, processStatus:
>>>>                                         0, resultCode: 0,
>>>>                                         result: null, initMsid:
>>>>                                         345049098498, completeMsid:
>>>>                                         null, lastUpdated:
>>>>                                         null, lastPolled: null,
>>>>                                         created: Tue Jan 24
>>>>                                         18:05:00 CET 2017}, job
>>>>                                         origin:13733
>>>>                                         ewcstack-man02-prod:
>>>>                                         com.cloud.utils.exception.CloudRuntimeException:
>>>>                                         Unable to migrate
>>>>                                         VM[User|i-638-1735-VM]
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         com.cloud.vm.VirtualMachineMan
>>>>                                         agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         com.cloud.vm.VirtualMachineMan
>>>>                                         agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         sun.reflect.GeneratedMethodAcc
>>>>                                         essor402.invoke(Unknown
>>>>                                         Source)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         sun.reflect.DelegatingMethodAc
>>>>                                         cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         java.lang.reflect.Method.invok
>>>>                                         e(Method.java:606)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         com.cloud.vm.VmWorkJobHandlerP
>>>>                                         roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         com.cloud.vm.VirtualMachineMan
>>>>                                         agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         com.cloud.vm.VmWorkJobDispatch
>>>>                                         er.runJob(VmWorkJobDispatcher.java:103)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         org.apache.cloudstack.framewor
>>>>                                         k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>                                         erImpl.java:537)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         org.apache.cloudstack.managed.
>>>>                                         context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         org.apache.cloudstack.managed.
>>>>                                         context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>                                         ext.java:56)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         org.apache.cloudstack.managed.
>>>>                                         context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>                                         nagedContext.java:103)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         org.apache.cloudstack.managed.
>>>>                                         context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>                                         agedContext.java:53)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         org.apache.cloudstack.managed.
>>>>                                         context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         org.apache.cloudstack.framewor
>>>>                                         k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         java.util.concurrent.Executors
>>>>                                         $RunnableAdapter.call(Executors.java:471)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         java.util.concurrent.FutureTas
>>>>                                         k.run(FutureTask.java:262)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         java.util.concurrent.ThreadPoo
>>>>                                         lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         java.util.concurrent.ThreadPoo
>>>>                                         lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>                                         ewcstack-man02-prod:    at
>>>>                                         java.lang.Thread.run(Thread.java:745)
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,590 DEBUG
>>>>                                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379)
>>>>                                         Complete async job-191379,
>>>>                                         jobStatus: FAILED,
>>>>                                         resultCode: 0, result:
>>>>                                         rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>>                                         MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>>                                         ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>>                                         ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>>                                         ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>>                                         F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>>                                         FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>>                                         JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>>                                         RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>>                                         ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>>                                         50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>>                                         xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>>                                         NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>>                                         lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>>                                         5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>>                                         1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>>                                         90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>>                                         ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>>                                         hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>>                                         wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>>                                         hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>>                                         ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>>                                         94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>>                                         AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>>                                         cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>>                                         AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>>                                         YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>>                                         ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>>                                         YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>>                                         dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>>                                         dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>>                                         Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>>                                         dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>>                                         cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>>                                         TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>>                                         ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>>                                         LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>>                                         dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>>                                         ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>>                                         JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>>                                         AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>>                                         ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>>                                         ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>>                                         YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>>                                         ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>>                                         AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>>                                         JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>>                                         jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>>                                         bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>>                                         dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>>                                         x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,592 DEBUG
>>>>                                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379)
>>>>                                         Publish async job-191379
>>>>                                         complete on message bus
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,592 DEBUG
>>>>                                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379) Wake
>>>>                                         up jobs related to job-191379
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,592 DEBUG
>>>>                                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379)
>>>>                                         Update db status for job-191379
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,594 DEBUG
>>>>                                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379) Wake
>>>>                                         up jobs joined with
>>>>                                         job-191379 and disjoin
>>>>                                         all
>>>>                                         subjobs created from job-
>>>>                                         191379
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,604 DEBUG
>>>>                                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379) Done
>>>>                                         executing
>>>>                                         com.cloud.vm.VmWorkMigrateAway
>>>>                                         for
>>>>                                         job-191379
>>>>                                         ewcstack-man02-prod:
>>>>                                         2017-01-24 18:05:32,608 INFO
>>>>                                         [o.a.c.f.j.i.AsyncJobMonitor]
>>>>                                         (Work-Job-Executor-115:ctx-0d5db528
>>>>                                         job-13733/job-191379)
>>>>                                         Remove job-191379 from job
>>>>                                         monitoring
>>>>
>>>>
>>>>
>>>>                                         On 25.01.2017 15:53, Rafael
>>>>                                         Weingärtner wrote:
>>>>
>>>>                                         Did you check the ACS log
>>>>                                         files looking for some
>>>>                                         exceptions when the
>>>>
>>>>                                             migrations for this
>>>>                                             specific VM started?
>>>>
>>>>                                             On Wed, Jan 25, 2017 at
>>>>                                             9:49 AM, Francois
>>>>                                             Scheurer <
>>>>                                             francois.scheurer@everyware.ch>
>>>>                                             wrote:
>>>>
>>>>                                             Dear CS contributors
>>>>
>>>>                                                 We use CS 4.5.1 on
>>>>                                                 a 3 Clusters with
>>>>                                                 XenServer 6.5. and
>>>>                                                 shared primary
>>>>                                                 storage.
>>>>
>>>>                                                 When we put a host
>>>>                                                 in maintenance, cs
>>>>                                                 evacuates all VM's
>>>>                                                 on other
>>>>                                                 hosts.
>>>>
>>>>                                                 But it happens
>>>>                                                 regularly that some
>>>>                                                 VM's are stopped
>>>>                                                 instead of live
>>>>                                                 migrated.
>>>>
>>>>                                                 If we restart that
>>>>                                                 stopped VM's, CS
>>>>                                                 will then stop them
>>>>                                                 again later.
>>>>
>>>>                                                 We need to start
>>>>                                                 them several times
>>>>                                                 until they stays up.
>>>>
>>>>                                                 Is it a known
>>>>                                                 issue? Is it fixed
>>>>                                                 on CS 4.9.2 ?
>>>>
>>>>
>>>>                                                 Migrating manually
>>>>                                                 all VM's is working
>>>>                                                 but impracticable
>>>>                                                 because too
>>>>                                                 much
>>>>                                                 time consuming.
>>>>
>>>>
>>>>                                                 Many thanks in
>>>>                                                 advance for your help.
>>>>
>>>>
>>>>
>>>>                                                 Best Regards
>>>>                                                 Francois
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>                                                 --
>>>>
>>>>
>>>>                                                 EveryWare AG
>>>>                                                 François Scheurer
>>>>                                                 Senior Systems Engineer
>>>>                                                 Zurlindenstrasse 52a
>>>>                                                 CH-8003 Zürich
>>>>
>>>>                                                 tel: +41 44 466 60 00
>>>>                                                 fax: +41 44 466 60 10
>>>>                                                 mail: francois.scheurer@everyware.ch
>>>>                                                 web:
>>>>                                                 http://www.everyware.ch
>>>>
>>>>
>>>>
>>>>                                             --
>>>>
>>>>
>>>>                                         EveryWare AG
>>>>                                         François Scheurer
>>>>                                         Senior Systems Engineer
>>>>                                         Zurlindenstrasse 52a
>>>>                                         CH-8003 Zürich
>>>>
>>>>                                         tel: +41 44 466 60 00
>>>>                                         fax: +41 44 466 60 10
>>>>                                         mail: francois.scheurer@everyware.ch
>>>>                                         web: http://www.everyware.ch
>>>>
>>>>
>>>>
>>>>                                     --
>>>>                                     Rafael Weingärtner
>>>>
>>>>
>>>>
>>>>                             --
>>>>
>>>>
>>>>                             EveryWare AG
>>>>                             François Scheurer
>>>>                             Senior Systems Engineer
>>>>                             Zurlindenstrasse 52a
>>>>                             CH-8003 Zürich
>>>>
>>>>                             tel: +41 44 466 60 00
>>>>                             fax: +41 44 466 60 10
>>>>                             mail: francois.scheurer@everyware.ch
>>>>                             web: http://www.everyware.ch
>>>>
>>>>
>>>>
>>>>
>>>>                     -- 
>>>>
>>>>
>>>>                     EveryWare AG
>>>>                     François Scheurer
>>>>                     Senior Systems Engineer
>>>>                     Zurlindenstrasse 52a
>>>>                     CH-8003 Zürich
>>>>
>>>>                     tel: +41 44 466 60 00
>>>>                     fax: +41 44 466 60 10
>>>>                     mail: francois.scheurer@everyware.ch
>>>>                     web: http://www.everyware.ch
>>>>
>>>>
>>>>
>>>>
>>>>                 -- 
>>>>                 Rafael Weingärtner
>>>>
>>>>
>>>>
>>>>
>>>>             -- 
>>>>             Rafael Weingärtner
>>>
>>>             -- 
>>>
>>>
>>>             EveryWare AG
>>>             François Scheurer
>>>             Senior Systems Engineer
>>>             Zurlindenstrasse 52a
>>>             CH-8003 Zürich
>>>
>>>             tel:+41 44 466 60 00
>>>             fax:+41 44 466 60 10
>>>             mail:francois.scheurer@everyware.ch
>>>             web:http://www.everyware.ch  
>>>
>>>
>>>
>>>
>>>         -- 
>>>         Rafael Weingärtner
>>
>>         -- 
>>
>>
>>         EveryWare AG
>>         François Scheurer
>>         Senior Systems Engineer
>>         Zurlindenstrasse 52a
>>         CH-8003 Zürich
>>
>>         tel:+41 44 466 60 00
>>         fax:+41 44 466 60 10
>>         mail:francois.scheurer@everyware.ch
>>         web:http://www.everyware.ch  
>>
>>
>>
>>
>>     -- 
>>     Rafael Weingärtner
>
>     -- 
>
>
>     EveryWare AG
>     François Scheurer
>     Senior Systems Engineer
>     Zurlindenstrasse 52a
>     CH-8003 Zürich
>
>     tel:+41 44 466 60 00
>     fax:+41 44 466 60 10
>     mail:francois.scheurer@everyware.ch
>     web:http://www.everyware.ch  
>
>
>
>
> -- 
> Rafael Weingärtner

-- 


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheurer@everyware.ch
web: http://www.everyware.ch


Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Rafael Weingärtner <ra...@gmail.com>.
Got the idea of your workaround now.

I have not paid much attention to the table data you posted (sorry for
that). Well, it seems you have taken care of everything.

I have a suggestion to make; instead of changing data directly in the
database, what about using CloudMonkey to update the HA flag to false on VMs?
Currently, it is the UI that is blocking such customization, but according
to the API, you should be able to do that.
https://cloudstack.apache.org/api/apidocs-4.5/root_admin/updateVirtualMachine.html
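
For illustration, a minimal sketch of that route (the uuid is a placeholder
you would look up first, e.g. with "list virtualmachines keyword=i-189-1177-VM"):

    update virtualmachine id=<vm-uuid> haenable=false

That should flip vm_instance.ha_enabled through the API instead of touching
the database by hand.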

On Tue, Feb 7, 2017 at 11:25 AM, Francois Scheurer <
francois.scheurer@everyware.ch> wrote:

> Dear Rafael
>
>
> We changed it on the VM to avoid the required stop and start. Changing the
> offering in use does not affect vm_instance.ha_enabled.
> We could create a new offering without HA, but the VMs would need to be
> stopped before we could apply it.
>
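> (As a sketch of that alternative, in case it becomes an option later: a
> clone of the current offering without HA could be created via the API, e.g.
> "create serviceoffering name=... displaytext=... cpunumber=4 cpuspeed=1000
> memory=4096 offerha=false" in CloudMonkey; the sizing values here are only
> placeholders, and switching a VM to it would still require the stop/start
> mentioned above.)
>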
> We did not disable HA on these 4 VMs to solve the problem, but just
> to minimize the impact of the problem (now only i-189-1177-VM should be
> stopped intermittently) if the cleaning of op_it_work does not fix the
> issue.
> Of course we could filter the update query on "instance id", but if you
> look at the table output, all instance_id values relate to the 5 problematic
> VMs.
> The only exceptions are the first 3 rows, which relate to VMs that have
> not existed anymore since July 2015 ...
> That is why I thought it may be safe to do.
>
>
> Best Regards
>
> Francois Scheurer
>
>
>
>
>
>
>
>
> On 07.02.2017 16:27, Rafael Weingärtner wrote:
>
> So, you changed the flags directly in "vm_instance". Why did you not
> change the service offering of these VMs using the ACS UI or API? For me, it
> feels that the best approach would be to change the service offering of the
> affected VMs (the HA flag is first configured at the service offering). You
> could assign a service offering without HA; I mean, it would be a matter of
> creating a similar service offering (same configuration as the one they
> are using right now), but without HA.
>
> I am assuming that the change directly on "vm_instance" did not solve your
> problem, right? That is why you want to change some data at "op_it_work". I
> did a quick look at "op_it_work", and it seems that this table may be used
> for more things than HA. Here is a tip for complex production systems: never
> run update SQLs in production environments without a very, very restrictive
> and well-thought-out "where" clause. In my opinion this is not restrictive
> enough (update op_it_work set step="Done" where step!="Done"). I would
> advise you not to go further and change configurations directly on the database
> (unless you have a dev team that is capable of reading, debugging, and working
> with ACS source code). ACS has over 500k lines of java code, and God knows
> how many lines of shell, python and other languages, software and systems
> that may be used; it is sometimes pretty hard to get a clear picture of how
> a change in the database will reverberate through the whole system without
> knowing in detail the installation configuration and underlying
> structures (networking, hypervisors, storage systems and other pieces).
> However, if you want to proceed with that, I would suggest adding some
> extra filters to that "where" clause. What about adding the "instance id"
> there, as in the sketch below?
>
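> For example (untested, and the ids below are simply the ones visible in
> your own listing, so double-check them first), something much narrower like
>
>     update op_it_work set step='Done'
>     where step != 'Done'
>     and instance_id in (1177, 1178, 1240, 1690, 1736);
>
> would at least keep the change scoped to the five problematic VMs.
>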
> PS: BTW, have you already heard about the CloudStack fork from Schuberg
> Philis named Cosmic? https://github.com/MissionCriticalCloud/cosmic
> Yes, I have. They were part of the active ACS community back then. I was
> already here when they created the fork. Still, I stayed with ACS.
>
>
> On Tue, Feb 7, 2017 at 3:54 AM, Francois Scheurer <
> <fr...@everyware.ch> wrote:
>
>> Dear Rafael
>>
>>
>> We temporarily disabled the HA flag on these VMs by updating the DB
>> directly: update vm_instance set ha_enabled=0 where removed is null and
>> id in (1178, 1240, 1690, 1736);
>>
>> We kept the test VM i-189-1177-VM with the HA flag.
>>
>> Restarting the MS + cleaning the op_it_work table will hopefully fix the
>> issue. And if the issue persists, then only i-189-1177-VM should
>> be affected, which mitigates the impact.
>>
>> On the other hand, if everything runs fine, we would thereafter re-enable
>> the HA flag on the problematic VMs.
>>
>>
>> Do you think it is safe to run this update?: update op_it_work set
>> step="Done" where step!="Done";
>>
>>
>> Best Regards
>>
>> Francois Scheurer
>>
>>
>>
>> PS: BTW, have you already heard about the CloudStack fork from Schuberg
>> Philis named Cosmic? https://github.com/MissionCriticalCloud/cosmic
>>
>>
>>
>>
>>
>> On 06.02.2017 16:41, Rafael Weingärtner wrote:
>>
>> I totally understand your position of not being able to debug the ACS in
>> production ;)
>>
>>
>> There is one bit I did not quite understand. If you have disabled HA
>> for the VMs, why do you still need to change data in the database? How did you
>> disable HA on the VMs?
>>
>> On Mon, Feb 6, 2017 at 5:04 AM, Francois Scheurer <
>> francois.scheurer@everyware.ch> wrote:
>>
>>> Dear Rafael / Dear CS contributors
>>>
>>>
>>> I tried the following:
>>>
>>> 1.     stop all MS
>>> 2.     backup DB
>>> 3.     select * from op_ha_work where type="Migration" and
>>> mgmt_server_id is null and taken is null;
>>> 4.     delete from op_ha_work where type="Migration" and mgmt_server_id
>>> is null and taken is null;
>>> 5.     start all MS
>>>
>>> But actually, after the MSs stopped, the op_ha_work table was
>>> automatically cleaned, making my steps useless.
>>> After the MSs started, the op_ha_work table was filled again and the related
>>> VMs were stopped several times (3-7 times), requiring close attention to
>>> manually restart them after each stop.
>>>
>>> I later found another table that could be the cause of the issue:
>>>
>>> mysql> select op_it_work.*,vm_instance.instance_name,vm_instance.removed
>>> from op_it_work join vm_instance where op_it_work.step!="Done" and
>>> op_it_work.instance_id=vm_instance.id order by instance_id;
>>> +--------------------------------------+----------------+---
>>> ---------+-----------------------+-----------+--------------
>>> ------+-----------+------------+-------------+--------------
>>> -+-------------+---------------+---------------------+
>>> | id                                   | mgmt_server_id | created_at |
>>> thread                | type      | vm_type            | step      |
>>> updated_at | instance_id | resource_type | resource_id | instance_name |
>>> removed             |
>>> +--------------------------------------+----------------+---
>>> ---------+-----------------------+-----------+--------------
>>> ------+-----------+------------+-------------+--------------
>>> -+-------------+---------------+---------------------+
>>> | be4d0dd2-7d4f-47d3-ba49-623d2f115bbf |   345049098498 | 1436187965
>>> | consoleproxy-1        | Starting  |
>>> ConsoleProxy       | Release   | 1436188265
>>> |         402 | NULL          |           0 | v-402-VM      | 2015-07-06
>>> 13:23:22 |
>>> | fb4d1844-eb92-478c-9f50-d666790f43af |   345049098498 | 1436187965
>>> | secstorage-1          | Starting  |
>>> SecondaryStorageVm | Release   | 1436188265
>>> |         687 | NULL          |           0 | s-687-VM      | 2015-07-06
>>> 13:11:34 |
>>> | 7dfeb25a-a705-42e8-907c-79d1e69697ed |   345049098498 | 1436964548
>>> | Work-Job-Executor-165 | Migrating |
>>> User               | Migrating | 1436964548
>>> |         776 | Host          |          29 | i-51-776-VM   | 2015-07-15
>>> 12:55:38 |
>>> | 2018ac61-b11c-4101-8ef5-07a39509f2bd |   345049098498 | 1486020173 |
>>> Work-Job-Executor-19  | Migrating | User               | Prepare   |
>>> 1486020173 |        1177 | Host          |          25 | i-189-1177-VM |
>>> NULL                |
>>> | 25430cd5-4346-4643-9c7e-7b39b8a876a2 |   345049098122 | 1486019693 |
>>> Work-Job-Executor-6   | Migrating | User               | Prepare   |
>>> 1486019693 |        1177 | Host          |          25 | i-189-1177-VM |
>>> NULL                |
>>> | 3276d52e-cbd9-44c5-8ab6-c70784337ff9 |   345049098498 | 1486020953 |
>>> Work-Job-Executor-33  | Migrating | User               | Prepare   |
>>> 1486020953 |        1177 | Host          |          33 | i-189-1177-VM |
>>> NULL                |
>>> | 6462da33-83f2-4575-af38-0e52d2a48488 |   345049098122 | 1486021433 |
>>> Work-Job-Executor-12  | Migrating | User               | Prepare   |
>>> 1486021433 |        1177 | Host          |          25 | i-189-1177-VM |
>>> NULL                |
>>> | 688e16c8-c39f-4e88-9e8f-36c0282f3df1 |   345049098122 | 1486020773 |
>>> Work-Job-Executor-11  | Migrating | User               | Prepare   |
>>> 1486020773 |        1177 | Host          |          25 | i-189-1177-VM |
>>> NULL                |
>>> | d8a18e9d-ac6f-4273-923e-e401533e2cf7 |   345049098498 | 1486020353 |
>>> Work-Job-Executor-25  | Migrating | User               | Prepare   |
>>> 1486020353 |        1177 | Host          |          25 | i-189-1177-VM |
>>> NULL                |
>>> | 1222ece8-399d-43d8-8ff8-d24bdebca2ed |   345049103441 | 1486018924 |
>>> Work-Job-Executor-2   | Migrating | User               | Prepare   |
>>> 1486018924 |        1178 | Host          |          29 | i-189-1178-VM |
>>> NULL                |
>>> | 2ac5ca4a-c20d-433a-8379-40e0de3b3ba8 |   345049098498 | 1486021613 |
>>> Work-Job-Executor-41  | Migrating | User               | Prepare   |
>>> 1486021613 |        1178 | Host          |          29 | i-189-1178-VM |
>>> NULL                |
>>> | 3e3b95a0-2af4-4268-bfdd-f3399a85bcaa |   345049103441 | 1486019584 |
>>> Work-Job-Executor-5   | Migrating | User               | Prepare   |
>>> 1486019584 |        1178 | Host          |          29 | i-189-1178-VM |
>>> NULL                |
>>> | 53745a2d-056c-4a1b-9267-eb97f077a6c8 |   345049103441 | 1486020244 |
>>> Work-Job-Executor-6   | Migrating | User               | Prepare   |
>>> 1486020244 |        1178 | Host          |          29 | i-189-1178-VM |
>>> NULL                |
>>> | e3555b49-7c18-4272-b7f4-ba9351b3bbd1 |   345049098498 | 1486020893 |
>>> Work-Job-Executor-32  | Migrating | User               | Prepare   |
>>> 1486020893 |        1178 | Host          |          29 | i-189-1178-VM |
>>> NULL                |
>>> | 1dcd5204-4f9a-4342-a23b-329a48596e13 |   345049098498 | 1486020953 |
>>> Work-Job-Executor-34  | Migrating | User               | Prepare   |
>>> 1486020953 |        1240 | Host          |           3 | i-388-1240-VM |
>>> NULL                |
>>> | ec1668b3-fa3c-4673-b593-3d607b169a24 |   345049098498 | 1486019693 |
>>> Work-Job-Executor-14  | Migrating | User               | Prepare   |
>>> 1486019693 |        1240 | Host          |           3 | i-388-1240-VM |
>>> NULL                |
>>> | fa7b2dec-ed7f-4a67-a055-e6e6bc1631ca |   345049098498 | 1486020353 |
>>> Work-Job-Executor-26  | Migrating | User               | Prepare   |
>>> 1486020353 |        1240 | Host          |           3 | i-388-1240-VM |
>>> NULL                |
>>> | 8cfe6d3e-5b7b-4be8-ac10-c865a492aab9 |   345049098498 | 1465387271 |
>>> Work-Job-Executor-18  | Running   | User               | Prepare   |
>>> 1465387271 |        1388 | Host          |           7 | i-361-1388-VM |
>>> NULL                |
>>> | 6e941b7e-6f19-4717-bf82-974c59737dbc |   345049103441 | 1486019524 |
>>> Work-Job-Executor-4   | Migrating | User               | Prepare   |
>>> 1486019524 |        1690 | Host          |          15 | i-648-1690-VM |
>>> NULL                |
>>> | 718578a4-0dd6-42d9-be10-ed7680650108 |   345049098122 | 1486021433 |
>>> Work-Job-Executor-13  | Migrating | User               | Prepare   |
>>> 1486021433 |        1690 | Host          |          15 | i-648-1690-VM |
>>> NULL                |
>>> | 9acf3908-ef13-4b03-8c2b-1a7e55e18491 |   345049098498 | 1486020173 |
>>> Work-Job-Executor-20  | Migrating | User               | Prepare   |
>>> 1486020173 |        1690 | Host          |          15 | i-648-1690-VM |
>>> NULL                |
>>> | a50c420b-ae52-4caa-83db-704b52b299a4 |   345049098498 | 1486020773 |
>>> Work-Job-Executor-28  | Migrating | User               | Prepare   |
>>> 1486020773 |        1690 | Host          |          15 | i-648-1690-VM |
>>> NULL                |
>>> | 2b5f4179-5975-4326-9db7-643c3ede63fe |   345049103441 | 1484224516 |
>>> Work-Job-Executor-161 | Running   | User               | Prepare   |
>>> 1484224516 |        1703 | Host          |           3 | i-51-1703-VM  |
>>> NULL                |
>>> | 2f3c6c44-b2a0-4d10-88d2-ac143d2dd379 |   345049098498 | 1484223910 |
>>> Work-Job-Executor-163 | Running   | User               | Prepare   |
>>> 1484223910 |        1703 | Host          |           3 | i-51-1703-VM  |
>>> NULL                |
>>> | 4126a23b-1cde-49bc-8636-c0d063b85cf6 |   345049103441 | 1484224636 |
>>> Work-Job-Executor-86  | Running   | User               | Prepare   |
>>> 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  |
>>> NULL                |
>>> | 5dad5645-766b-48c9-9a85-b77b3c101390 |   345049103441 | 1484224636 |
>>> Work-Job-Executor-92  | Running   | User               | Prepare   |
>>> 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  |
>>> NULL                |
>>> | 8c1dfff9-e55d-4f81-a22f-45f72accc416 |   345049103441 | 1484224576 |
>>> Work-Job-Executor-95  | Running   | User               | Prepare   |
>>> 1484224576 |        1703 | Host          |           3 | i-51-1703-VM  |
>>> NULL                |
>>> | a703d2d7-b1e3-42ce-a551-831982990ffb |   345049098498 | 1484224630 |
>>> Work-Job-Executor-13  | Running   | User               | Prepare   |
>>> 1484224630 |        1703 | Host          |           3 | i-51-1703-VM  |
>>> NULL                |
>>> | 0e194cb5-95b7-41cb-969a-ae37d2ad449a |   345049098498 | 1485763613 |
>>> Work-Job-Executor-28  | Migrating | User               | Prepare   |
>>> 1485763613 |        1735 | Host          |          19 | i-638-1735-VM |
>>> 2017-01-30 15:24:10 |
>>> | 74f180b9-be0f-429b-9a45-597ca2ca997d |   345049098498 | 1485763013 |
>>> Work-Job-Executor-21  | Migrating | User               | Prepare   |
>>> 1485763013 |        1735 | Host          |          19 | i-638-1735-VM |
>>> 2017-01-30 15:24:10 |
>>> | feaa421c-88df-4493-9a4f-bddb6ee1b75b |   345049098122 | 1485764285 |
>>> Work-Job-Executor-19  | Migrating | User               | Prepare   |
>>> 1485764285 |        1735 | Host          |          19 | i-638-1735-VM |
>>> 2017-01-30 15:24:10 |
>>> | 32138a36-ad18-42b5-8478-76ad63caf4da |   345049098122 | 1486020293 |
>>> Work-Job-Executor-9   | Migrating | User               | Prepare   |
>>> 1486020293 |        1736 | Host          |          19 | i-638-1736-VM |
>>> NULL                |
>>> | 5306ab1f-3c92-4f95-bbff-283b43e6862d |   345049098498 | 1486019573 |
>>> Work-Job-Executor-11  | Migrating | User               | Prepare   |
>>> 1486019573 |        1736 | Host          |          19 | i-638-1736-VM |
>>> NULL                |
>>> | 843bdf9c-247d-432c-ac19-1a4f7858631b |   345049098498 | 1486021493 |
>>> Work-Job-Executor-38  | Migrating | User               | Prepare   |
>>> 1486021493 |        1736 | Host          |          19 | i-638-1736-VM |
>>> NULL                |
>>> | 8d400afb-766f-4ac5-89ee-854f25a38eba |   345049098498 | 1486020173 |
>>> Work-Job-Executor-21  | Migrating | User               | Prepare   |
>>> 1486020173 |        1736 | Host          |          19 | i-638-1736-VM |
>>> NULL                |
>>> | f0ae5612-b6ef-4882-94a2-40a1e6dce342 |   345049098498 | 1486020833 |
>>> Work-Job-Executor-30  | Migrating | User               | Prepare   |
>>> 1486020833 |        1736 | Host          |          19 | i-638-1736-VM |
>>> NULL                |
>>> +--------------------------------------+----------------+---
>>> ---------+-----------------------+-----------+--------------
>>> ------+-----------+------------+-------------+--------------
>>> -+-------------+---------------+---------------------+
>>>
>>> As you can see, the problematic VMs are all present in that table:
>>>
>>> i-189-1177-VM
>>> i-189-1178-VM
>>> i-388-1240-VM
>>> i-648-1690-VM
>>> i-638-1736-VM
>>>
>>> As the MSs are running the production environments, we cannot debug
>>> much and just need the shortest path to a resolution.
>>> For example, restarting the MSs has an impact on our customers, and
>>> we need to keep the number of restarts to a minimum.
>>>
>>> I have disabled HA on all problematic VMs except i-189-1177-VM,
>>> which is less critical.
>>>
>>> The next steps could be:
>>> 1.     stop all MS
>>> 2.     backup DB
>>> 3.     select op_it_work.*,vm_instance.instance_name,vm_instance.removed
>>> from op_it_work join vm_instance where op_it_work.step!="Done" and
>>> op_it_work.instance_id=vm_instance.id order by instance_id;
>>> 4.     update op_it_work set step="Done" where step!="Done";
>>> 5.     start all MS
>>>
>>> What do you think about step 4, safe or risky? (A dry-run check is
>>> sketched below.)
>>> Some of these entries are related to deleted VMs and seem safe to
>>> mark as "Done".
>>> All other entries are related to the problematic VMs that keep
>>> stopping... so marking them as "Done" may (hopefully) eliminate the cause
>>> of the Migrating/Stopping.
>>> But we are not sure if marking them as "Done" also requires changes to other
>>> tables for consistency.
>>>
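>>> (The dry-run mentioned above would just count what the update would touch,
>>> e.g.
>>>
>>>     select step, count(*) from op_it_work
>>>     where step != 'Done' group by step;
>>>
>>> and compare the result with the listing above before running step 4.)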
>>>
>>> Thank you for your help.
>>>
>>>
>>> Best Regards
>>> Francois Scheurer
>>>
>>>
>>>
>>>
>>>
>>> On 01.02.2017 03:01, Rafael Weingärtner wrote:
>>>
>>> Ok, I dug a little further into the code.
>>>
>>> If we put a host in maintenance, it may unfortunately happen that a
>>>> timeout occurs and CS aborts the migration. As far as we can interpret the
>>>> logs, it seems (we speculate) that CS detects that the VM is then still
>>>> running on the original host, which is in maintenance, and decides somehow
>>>> to stop it because of the "host not up" status.
>>>>
>>>
>>> From what I have seen in the code, this may happen (as you speculated).
>>> If a timeout or some other event interrupts the migration, the VM will be
>>> stopped. This happens in the method
>>> "com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>> long, DeploymentPlanner)" at
>>> line 2396 and further. This was implemented on purpose; if the migration
>>> does not happen, the VM is stopped on the source host. Probably it was
>>> implemented this way following the gradient-descent philosophy from
>>> artificial intelligence:
>>>
>>> "pick a random point in the field, update the values according to the
>>>> slope and hopefully after successive iterations you converge to a minimum
>>>> point." (the problem is that sometimes the function diverges).
>>>>
>>>
>>> I mean, probably this was done this way because HA was supposed to
>>> see that a VM with HA is stopped, and should try to start it again
>>> somewhere else. The log you posted does not provide any other details to
>>> debug this further, because the log is after an MS restart, right?
>>>
>>>
>>> After restarting the MSs I guess that a similar issue may happen: the
>>>> hosts are disconnected during the early stage and then reconnects. Maybe CS
>>>> then again stops the VMs because the "host not up" status?
>>>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>> Disconnected, host resource state is: Enabled
>>>>
>>>
>>> Here is something interesting that you have found. I checked the code,
>>> but without a live debug it is pretty hard for me to get the right flow of
>>> execution. By looking at the code and at the logs you provided, your
>>> deduction is right. After ACS boots up, the check for hosts that are
>>> shutdown/deactivated starts, and the method
>>> "com.cloud.agent.manager.AgentManagerImpl.executeUserRequest(long, Event)"
>>> is executed. The problem
>>> is that all of the hosts are considered disconnected (ACS may not have
>>> finished connecting to all of them yet), and then a whole HA process is
>>> kicked off. The problem is that there is no host with status "UP". That is why
>>> this happens ("No suitable hosts found under this Cluster: 25"), and then the
>>> whole HA flow, which shuts down the VM, is kicked off.
>>>
>>> In theory, the HA worker should see that a VM with HA is down, and
>>> should try to start it. I do not understand why this is not working. If
>>> you decide to debug the running ACS, I may be able to help you, just keep
>>> me posted.
>>>
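>>> (A quick sanity check after an MS restart, before any HA work gets
>>> processed, could be something like "list hosts type=Routing" in CloudMonkey,
>>> verifying that every host is back in state Up; the exact filter parameters
>>> here are only a suggestion.)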
>>>
>>> On Tue, Jan 31, 2017 at 10:26 AM, Rafael Weingärtner <
>>> rafaelweingartner@gmail.com> wrote:
>>>
>>>> I will be able to re-check some code and answer you with new thoughts
>>>> on this problem tonight (the log files you provided today have some very
>>>> insightful information regarding this problem). I believe this is a bug in
>>>> HA, a very tricky one that has not been reported yet (at least I do not
>>>> remember seeing anything similar). And thanks to randomness/destiny now you
>>>> have found/uncovered a way to reproduce the problem (bug?!).
>>>>
>>>> I do not suggest changing any data at all directly in the database
>>>> (unless you know the ACS code well and are 100% sure of what you are doing). This
>>>> can cause even bigger and more complicated problems. If you remove the HA
>>>> feature from these VMs, this problem will stop. However, if you can/are
>>>> willing to keep HA enabled until we fully uncover the problem, map it and
>>>> report it, that would be great.
>>>> BTW: why are you restarting the MSs?
>>>> What are the configurations of your MSs?
>>>>
>>>> Thanks for your time and for sharing information with our community.
>>>>
>>>>
>>>>
>>>> On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer <
>>>> <fr...@everyware.ch> wrote:
>>>>
>>>>> Dear Rafael
>>>>>
>>>>>
>>>>> Many thanks for your deep analysis.
>>>>> Nice humor with the planets alignment BTW ;)
>>>>>
>>>>> We also suspected the HA flag to be related to this issue, as we never
>>>>> had this issue in the past, until we enabled HA on all VMs and
>>>>> offerings.
>>>>>
>>>>> Here is also the workers list:
>>>>>
>>>>> mysql> select * from op_ha_work;
>>>>> +------+-------------+-----------+--------------------+-----
>>>>> -----+----------------+---------+---------------------+-----
>>>>> --+---------------------+-------+-------------+---------+
>>>>> | id   | instance_id | type      | vm_type            | state    |
>>>>> mgmt_server_id | host_id | created             | tried | taken
>>>>>  | step  | time_to_try | updated |
>>>>> +------+-------------+-----------+--------------------+-----
>>>>> -----+----------------+---------+---------------------+-----
>>>>> --+---------------------+-------+-------------+---------+
>>>>> |  105 |         343 | HA        | User               | Running  |
>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056013 |       3 |
>>>>> |  109 |         407 | HA        | ConsoleProxy       | Running  |
>>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056013 |       3 |
>>>>> |  113 |         407 | HA        | ConsoleProxy       | Running  |
>>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056013 |       3 |
>>>>> |  117 |         407 | HA        | ConsoleProxy       | Running  |
>>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056013 |       3 |
>>>>> |  121 |         343 | HA        | User               | Running  |
>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056013 |       3 |
>>>>> |  125 |         405 | HA        | SecondaryStorageVm | Running  |
>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056013 |       3 |
>>>>> |  129 |         393 | HA        | User               | Running  |
>>>>>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056014 |      30 |
>>>>> |  133 |         405 | HA        | SecondaryStorageVm | Running  |
>>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056014 |       3 |
>>>>> |  137 |         402 | HA        | ConsoleProxy       | Running  |
>>>>>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>>> Error |  1394056014 |       3 |
>>>>> |  345 |         402 | HA        | ConsoleProxy       | Starting |
>>>>>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
>>>>> Error |  1402527506 |     770 |
>>>>> |  349 |         687 | HA        | SecondaryStorageVm | Starting |
>>>>>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
>>>>> Error |  1402527506 |       2 |
>>>>> | 2057 |        1178 | Migration | User               | Running  |
>>>>>  345049098498 |      33 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>>> Done  |  1451024774 |      95 |
>>>>> | 2058 |        1736 | Migration | User               | Running  |
>>>>>  345049098498 |     224 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>>> Done  |  1451024774 |      45 |
>>>>> | 2059 |        1240 | Migration | User               | Running  |
>>>>>  345049098498 |       7 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>>> Done  |  1451024774 |      61 |
>>>>> | 2060 |        1178 | Migration | User               | Running  |
>>>>>  345049098122 |      33 | 2017-01-31 07:56:08 | 1 | 2017-01-31 08:06:37 |
>>>>> Done  |  1451025374 |      95 |
>>>>> | 2061 |        1178 | Migration | User               | Stopped  |
>>>>>        NULL |      33 | 2017-01-31 07:56:09 | 5 | NULL                |
>>>>> Done  |  1451027900 |     109 |
>>>>> | 2062 |        1690 | Migration | User               | Running  |
>>>>>  345049098498 |     224 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
>>>>> Done  |  1451024774 |     109 |
>>>>> | 2063 |        1177 | Migration | User               | Running  |
>>>>>  345049098498 |      29 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
>>>>> Done  |  1451024774 |      82 |
>>>>> | 2064 |        1690 | Migration | User               | Running  |
>>>>>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>>> Done  |  1451024900 |     109 |
>>>>> | 2065 |        1240 | Migration | User               | Running  |
>>>>>  345049098122 |       7 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>>> Done  |  1451024900 |      61 |
>>>>> | 2066 |        1736 | Migration | User               | Running  |
>>>>>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>>> Done  |  1451024900 |      45 |
>>>>> | 2067 |        1240 | Migration | User               | Stopped  |
>>>>>        NULL |       7 | 2017-01-31 07:56:09 | 5 | NULL                |
>>>>> Done  |  1451028003 |      80 |
>>>>> | 2068 |        1178 | Migration | User               | Running  |
>>>>>  345049098122 |      33 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>> Done  |  1451024781 |      95 |
>>>>> | 2069 |        1736 | Migration | User               | Running  |
>>>>>  345049098122 |     224 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>> Done  |  1451024781 |      45 |
>>>>> | 2070 |        1240 | Migration | User               | Running  |
>>>>>  345049098122 |       7 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>> Done  |  1451024781 |      61 |
>>>>> | 2071 |        1177 | Migration | User               | Running  |
>>>>>  345049098122 |      29 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>>> Done  |  1451024781 |      82 |
>>>>> | 2072 |        1177 | Migration | User               | Stopped  |
>>>>>        NULL |      29 | 2017-01-31 07:56:16 | 5 | NULL                |
>>>>> Done  |  1451027973 |     101 |
>>>>> | 2073 |        1177 | Migration | User               | Running  |
>>>>>  345049098122 |      29 | 2017-01-31 07:56:16 | 1 | 2017-01-31 07:58:22 |
>>>>> Done  |  1451024904 |      82 |
>>>>> | 2074 |        1690 | Migration | User               | Stopped  |
>>>>>        NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                |
>>>>> Done  |  1451027464 |     123 |
>>>>> | 2075 |        1736 | Migration | User               | Stopped  |
>>>>>        NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                |
>>>>> Done  |  1451027477 |      59 |
>>>>> +------+-------------+-----------+--------------------+-----
>>>>> -----+----------------+---------+---------------------+-----
>>>>> --+---------------------+-------+-------------+---------+
>>>>>
>>>>> If we put a host in maintenance, it may unfortunately happen that a
>>>>> timeout occurs and CS aborts the migration.
>>>>> As far as we can interpret the logs, it seems (we speculate) that CS
>>>>> detects that the VM is then still running on the original host, which is in
>>>>> maintenance, and decides somehow to stop it because of the "host not up"
>>>>> status.
>>>>> We then need to start several times the VM, until CS gives up the
>>>>> stopping.
>>>>>
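>>>>> (One thing we still want to verify is the migration timeout in the global
>>>>> settings, e.g. "list configurations name=migratewait" in CloudMonkey; a too
>>>>> low value there could explain the aborted migrations. The setting name is
>>>>> from memory, so it needs double-checking.)
>>>>>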
>>>>> After restarting the MSs I guess that a similar issue may happen: the
>>>>> hosts are disconnected during the early stage and then reconnects. Maybe CS
>>>>> then again stops the VMs because the "host not up" status?
>>>>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>> Disconnected, host resource state is: Enabled
>>>>>
>>>>> But the problem after the MSs restart definitely only impacts VMs
>>>>> listed in the op_ha_work table.
>>>>>
>>>>>
>>>>> To your questions:
>>>>>         We have 3 ACS MSs managing several Zones. Each zone has a
>>>>> single pod and cluster.
>>>>>         The Cluster of i-638-1735-VM has this summary:
>>>>>          Total number of VMs: 121
>>>>>          Total number hypervisors: 4
>>>>>          Total allocated RAM: 522.0 / 972 GB (53.7 %)
>>>>>          Total allocated cores: 211
>>>>>          Total allocated storage: 11285 GB
>>>>>         Used memory on the hosts currently varies from 28 to 77%, so
>>>>> there is enough RAM.
>>>>>         I did not check the XS logs. The same problem arose again this
>>>>> morning (shortly after a restart of all MSs), so I will look at the
>>>>> XS logs now.
>>>>>
>>>>>         I have some fair experience with Eclipse and Java; I will ask
>>>>> my boss if we can go this way.
>>>>>
>>>>> The main issue for now is that restarting the MSs reproduces the
>>>>> problem, which is quite annoying for the impacted VMs.
>>>>> I thought of stopping the MSs, deleting all HA/Migration entries in
>>>>> op_ha_work and then starting the MSs again.
>>>>> Is that a viable option? (Or should I delete only the Migration
>>>>> entries? Or never touch this table at all?)
>>>>> We will also very probably remove the HA flag on all VMs if HA does
>>>>> not work reliably.
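>>>>>
>>>>> In case it helps, this is roughly the cleanup I have in mind (only a
>>>>> sketch, again assuming the usual op_ha_work columns; we would take a
>>>>> mysqldump of the cloud DB first and run it only while all MSs are
>>>>> stopped):
>>>>>
>>>>>     -- review the migration work items before touching anything
>>>>>     select id, instance_id, type, state, step, taken from op_ha_work
>>>>>     where type = 'Migration';
>>>>>
>>>>>     -- then remove only the Migration entries
>>>>>     delete from op_ha_work where type = 'Migration';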
>>>>>
>>>>> Best Regards
>>>>> Francois Scheurer
>>>>>
>>>>>
>>>>>
>>>>> Here are new logs regarding the same issue from today.
>>>>> I tried to extract the important parts related to the stop of
>>>>> i-189-1178-VM.
>>>>>
>>>>> regexp: (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRe
>>>>> questedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae73
>>>>> 3|work-2049)
>>>>>
>>>>> 2017-01-30 08:44:04,440 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Executing
>>>>> AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType: null,
>>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>>>>> F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm
>>>>> 1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTA
>>>>> ALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAA
>>>>> AAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resu
>>>>> ltCode: 0, result: null, initMsid: 345049103441, completeMsid: null,
>>>>> lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 2017}
>>>>> 2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Run VM work
>>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
>>>>>
>>>>> 2017-01-30 08:44:05,217 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Deploy avoids pods: [], clusters: [], hosts: [29]
>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
>>>>> (AgentConnectTaskPool-4:ctx-5b61f2e5) create ClusteredAgentAttache
>>>>> for 232
>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.UserDispersin
>>>>> gPlanner@1734b2e2
>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
>>>>> requested cpu: 4200, requested ram: 4294967296
>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Is ROOT volume READY (pool already allocated)?: Yes
>>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> This VM has last host_id specified, trying to choose the same host: 25
>>>>>
>>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>> Disconnected, host resource state is: Enabled
>>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Cannot choose the last host to deploy this VM
>>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Searching resources only under specified Cluster: 25
>>>>>
>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>>>>> prioritization: []
>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> No suitable hosts found
>>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> No suitable hosts found under this Cluster: 25
>>>>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Could not find suitable Deployment Destination for this VM under any
>>>>> clusters, returning.
>>>>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Searching resources only under specified Cluster: 25
>>>>> 2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> The specified cluster is in avoid set, returning.
>>>>> 2017-01-30 08:44:05,798 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
>>>>> 2017-01-30 08:44:05,798 WARN [c.c.v.VirtualMachineManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Failed to deploy vm 1178 with original planner, sending HAPlanner
>>>>>
>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Deploy avoids pods: [], clusters: [], hosts: [29]
>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
>>>>> sPlanner@4fdf86ce
>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
>>>>> requested cpu: 4200, requested ram: 4294967296
>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Is ROOT volume READY (pool already allocated)?: Yes
>>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> This VM has last host_id specified, trying to choose the same host: 25
>>>>>
>>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>> Disconnected, host resource state is: Enabled
>>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Cannot choose the last host to deploy this VM
>>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Searching resources only under specified Cluster: 25
>>>>> 2017-01-30 08:44:06,039 DEBUG [c.c.d.SkipHeuresticsPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Deploying vm during HA process, so skipping disable threshold check
>>>>>
>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>>>>> prioritization: []
>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> No suitable hosts found
>>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> No suitable hosts found under this Cluster: 25
>>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Publish async
>>>>> job-192986 complete on message bus
>>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
>>>>> related to job-192986
>>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Update db
>>>>> status for job-192986
>>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>> Disconnected, host resource state is: Enabled
>>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> Cannot choose the last host to deploy this VM
>>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> Searching resources only under specified Cluster: 35
>>>>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Could not find suitable Deployment Destination for this VM under any
>>>>> clusters, returning.
>>>>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Searching resources only under specified Cluster: 25
>>>>> 2017-01-30 08:44:06,163 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>>>>> SyncQueueItemVO {id:63629, queueId: 64423, contentType: AsyncJob,
>>>>> contentId: 192991, lastProcessMsid: 345049098498, lastprocessNumber: 27,
>>>>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
>>>>> CET 2017}
>>>>> 2017-01-30 08:44:06,169 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Add
>>>>> job-192992 into job monitoring
>>>>> 2017-01-30 08:44:06,184 DEBUG [c.c.d.SkipHeuresticsPlanner]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> Deploying vm during HA process, so skipping disable threshold check
>>>>> 2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> The specified cluster is in avoid set, returning.
>>>>> 2017-01-30 08:44:06,186 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
>>>>> 2017-01-30 08:44:06,186 ERROR [c.c.v.VmWorkJobHandlerProxy]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Invocation exception, caused by: com.cloud.exception.InsufficientServerCapacityException:
>>>>> Unable to find a server to migrate to.Scope=interface
>>>>> com.cloud.org.Cluster; id=25
>>>>> 2017-01-30 08:44:06,187 INFO  [c.c.v.VmWorkJobHandlerProxy]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>>> Rethrow exception com.cloud.exception.InsufficientServerCapacityException:
>>>>> Unable to find a server to migrate to.Scope=interface
>>>>> com.cloud.org.Cluster; id=25
>>>>> 2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done with run
>>>>> of VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin:
>>>>> 15393
>>>>> 2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Unable to
>>>>> complete AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType:
>>>>> null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>>> AAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ, cmdVersion:
>>>>> 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>>> initMsid: 345049103441, completeMsid: null, lastUpdated: null, lastPolled:
>>>>> null, created: Mon Jan 30 08:44:03 CET 2017}, job origin:15393
>>>>> com.cloud.exception.InsufficientServerCapacityException: Unable to
>>>>> find a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
>>>>>         at com.cloud.vm.VirtualMachineMan
>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2390)
>>>>>         at com.cloud.vm.VirtualMachineMan
>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4520)
>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>         at sun.reflect.NativeMethodAccess
>>>>> orImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>         at sun.reflect.DelegatingMethodAc
>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>         at com.cloud.vm.VmWorkJobHandlerP
>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>         at com.cloud.vm.VirtualMachineMan
>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>         at com.cloud.vm.VmWorkJobDispatch
>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>>         at org.apache.cloudstack.framewor
>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>> erImpl.java:537)
>>>>>         at org.apache.cloudstack.managed.
>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>>         at org.apache.cloudstack.managed.
>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>> ext.java:56)
>>>>>         at org.apache.cloudstack.managed.
>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>> nagedContext.java:103)
>>>>>         at org.apache.cloudstack.managed.
>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>> agedContext.java:53)
>>>>>         at org.apache.cloudstack.managed.
>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>>         at org.apache.cloudstack.framewor
>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>>>         at java.util.concurrent.Executors
>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>         at java.util.concurrent.ThreadPoo
>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>         at java.util.concurrent.ThreadPoo
>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>         at java.lang.Thread.run(Thread.java:745)
>>>>> 2017-01-30 08:44:06,188 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued job-192991
>>>>> 2017-01-30 08:44:06,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
>>>>> joined with job-192986 and disjoin all subjobs created from job- 192986
>>>>> 2017-01-30 08:44:06,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Complete
>>>>> async job-192989, jobStatus: FAILED, resultCode: 0, result:
>>>>> rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZl
>>>>> ckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vw
>>>>> c0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRD
>>>>> YXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5n
>>>>> L0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xv
>>>>> dWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vy
>>>>> cm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2
>>>>> YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4a
>>>>> OxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNh
>>>>> dXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQA
>>>>> EkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFu
>>>>> Zy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QA
>>>>> EExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBz
>>>>> ZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFj
>>>>> ZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNr
>>>>> VHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFy
>>>>> aW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-
>>>>> AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2Vy
>>>>> SW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hl
>>>>> c3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EA
>>>>> fgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1w
>>>>> bHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNx
>>>>> AH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5y
>>>>> ZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0
>>>>> aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhq
>>>>> YXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EA
>>>>> fgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0
>>>>> ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pv
>>>>> YnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNs
>>>>> b3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRj
>>>>> aGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91
>>>>> ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1w
>>>>> bCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0
>>>>> c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNv
>>>>> bnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRl
>>>>> eHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUu
>>>>> Y2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdl
>>>>> ZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2Fs
>>>>> bHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5j
>>>>> b250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxs
>>>>> V2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250
>>>>> ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2Vk
>>>>> LmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-
>>>>> ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5j
>>>>> b25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRv
>>>>> cnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50
>>>>> LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0
>>>>> ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdU
>>>>> aHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0
>>>>> AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29y
>>>>> a2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAAL
>>>>> VGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5t
>>>>> b2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEu
>>>>> dXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte
>>>>> 9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEu
>>>>> dXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4
>>>>> cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1
>>>>> ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMA
>>>>> AAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAFdmFs
>>>>> dWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-o
>>>>> maW2FAIAAHhwAA
>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> Deploy avoids pods: [], clusters: [], hosts: [3]
>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
>>>>> sPlanner@4fdf86ce
>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> Trying to allocate a host and storage pools from dc:14, pod:17,cluster:17,
>>>>> requested cpu: 4200, requested ram: 4294967296
>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> Is ROOT volume READY (pool already allocated)?: Yes
>>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> This VM has last host_id specified, trying to choose the same host: 213
>>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Publish async
>>>>> job-192989 complete on message bus
>>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
>>>>> related to job-192989
>>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Update db
>>>>> status for job-192989
>>>>> 2017-01-30 08:44:06,219 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>>> Checking resources in Cluster: 35 under Pod: 31
>>>>> 2017-01-30 08:44:06,227 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>>> FirstFitRoutingAllocator) Looking for hosts in dc: 23  pod:31  cluster:35
>>>>> 2017-01-30 08:44:06,229 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Executing
>>>>> AsyncJobVO {id:192992, userId: 1, accountId: 1, instanceType: null,
>>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>>> AAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw, cmdVersion:
>>>>> 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>>>>> null, created: Mon Jan 30 08:44:05 CET 2017}
>>>>> 2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Run VM work
>>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1729, job origin: 14953
>>>>> 2017-01-30 08:44:06,230 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>>>>> SyncQueueItemVO {id:63630, queueId: 67180, contentType: AsyncJob,
>>>>> contentId: 192990, lastProcessMsid: 345049098498, lastprocessNumber: 61,
>>>>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
>>>>> CET 2017}
>>>>> 2017-01-30 08:44:06,244 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>>> Disconnected, host resource state is: Enabled
>>>>> 2017-01-30 08:44:06,262 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> Cannot choose the last host to deploy this VM
>>>>> 2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner]
>>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>>> Searching resources only under specified Cluster: 17
>>>>> 2017-01-30 08:44:06,262 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Add
>>>>> job-192991 into job monitoring
>>>>> 2017-01-30 08:44:06,246 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Done
>>>>> executing com.cloud.vm.VmWorkMigrateAway for job-192986
>>>>> 2017-01-30 08:44:06,244 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
>>>>> joined with job-192989 and disjoin all subjobs created from job- 192989
>>>>> 2017-01-30 08:44:06,269 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Remove
>>>>> job-192986 from job monitoring
>>>>> 2017-01-30 08:44:06,279 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done
>>>>> executing com.cloud.vm.VmWorkMigrateAway for job-192989
>>>>> 2017-01-30 08:44:06,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Executing
>>>>> AsyncJobVO {id:192991, userId: 1, accountId: 1, instanceType: null,
>>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>>> AAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw, cmdVersion:
>>>>> 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>>>>> null, created: Mon Jan 30 08:44:05 CET 2017}
>>>>> 2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Run VM work
>>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1605, job origin: 14965
>>>>> 2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl]
>>>>> (AgentManager-Handler-13:null) Ping from 258
>>>>> 2017-01-30 08:44:06,319 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>>> FirstFitRoutingAllocator) List of hosts in ascending order of number of
>>>>> VMs: [224, 19, 21, 15]
>>>>> 2017-01-30 08:44:06,322 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>>> FirstFitRoutingAllocator) FirstFitAllocator has 0 hosts to check for
>>>>> allocation: []
>>>>> 2017-01-30 08:44:06,322 INFO  [c.c.a.m.AgentManagerImpl]
>>>>> (AgentManager-Handler-13:null) PingMap for agent: 258 will not be updated
>>>>> because agent is no longer in the PingMap
>>>>> 2017-01-30 08:44:06,326 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Remove
>>>>> job-192989 from job monitoring
>>>>>
>>>>> 2017-01-30 08:54:38,139 INFO [c.c.v.VirtualMachineManagerImpl]
>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>>> Migration cancelled because state has changed: VM[User|i-189-1178-VM]
>>>>> 2017-01-30 08:54:38,139 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>>> Unable to migrate VM due to: Migration cancelled because state has changed:
>>>>> VM[User|i-189-1178-VM]
>>>>> 2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl]
>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>>> VM state transitted from :Running to Stopping with event: StopRequestedvm's
>>>>> original host id: 29 new host id: 33 host id before state transition: 33
>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.DirectAgentAttache]
>>>>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Response
>>>>> Received:
>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Processing:  {
>>>>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10,
>>>>> [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-
>>>>> 1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,
>>>>> "result":true,"details":"","wait":0}}] }
>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>>> (RouterMonitor-1:ctx-2e80f405) Seq 19-2355664080091480198: Received:  {
>>>>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10, {
>>>>> NetworkUsageAnswer } }
>>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl]
>>>>> (RouterMonitor-1:ctx-2e80f405) Details from executing class
>>>>> com.cloud.agent.api.NetworkUsageCommand:
>>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>>> Seq 33-1917970491306410035: Sending  { Cmd , MgmtId: 345049098498, via:
>>>>> 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>>> }
>>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>>> Seq 33-1917970491306410035: Executing:  { Cmd , MgmtId: 345049098498, via:
>>>>> 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>>> }
>>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.m.DirectAgentAttache]
>>>>> (DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035: Executing request
>>>>> 2017-01-30 08:54:38,202 DEBUG [c.c.a.m.DirectAgentAttache]
>>>>> (DirectAgent-326:ctx-4fb88aa7) Seq 21-1035546439318503505: Executing request
>>>>>
>>>>> 2017-01-30 09:05:55,628 INFO [c.c.h.HighAvailabilityManagerImpl]
>>>>> (HA-Worker-4:ctx-36aae733 work-2049) Processing
>>>>> HAWork[2049-Migration-1178-Stopped-Migrating]
>>>>> 2017-01-30 09:05:55,633 WARN [o.a.c.f.j.AsyncJobExecutionContext]
>>>>> (HA-Worker-4:ctx-36aae733 work-2049) Job is executed without a context,
>>>>> setup psudo job for the executing thread
>>>>> 2017-01-30 09:05:55,645 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>>> (HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045 execution on object
>>>>> VmWorkJobQueue.1178
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 31.01.2017 00:42, Rafael Weingärtner wrote:
>>>>>
>>>>>> This seems to be one of those nasty bugs that are very hard to debug.
>>>>>> It seems to be caused by a combination of different factors, such as
>>>>>> planetary alignments ;)
>>>>>>
>>>>>> At class “VirtualMachineManagerImpl”, line 2395 (method
>>>>>> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>>>>> long, DeploymentPlanner)”), is the start of the code flow that triggers
>>>>>> the migration. The point here is that, if an error occurs during the
>>>>>> migration, the VM will be stopped; for some reason this is executed
>>>>>> only for HA VMs.
>>>>>>
>>>>>> Going a little bit further, we find the point that triggers the error
>>>>>> in the method “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO,
>>>>>> long, DeployDestination)” at line 1941. This line calls the method
>>>>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
>>>>>> Long, ItWorkVO, Step)” to change the VM state to migrating/prepare
>>>>>> migration/something like that (I am guessing here), but for some reason
>>>>>> the state of the VM is not the expected one, and this causes a
>>>>>> “ConcurrentOperationException” to be thrown. The problem is that
>>>>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
>>>>>> Long, ItWorkVO, Step)” does not seem to provide many useful log entries.
>>>>>>
>>>>>> Just to enable us to discuss the problem a little further, could you
>>>>>> tell us a little bit about your environment setup? How many ACS MSs
>>>>>> do you have? What is the VM workload being migrated (I mean, the
>>>>>> number of VMs and computing resources)? Do the other servers have
>>>>>> enough computing resources to support these VMs? Could you also check
>>>>>> the XenServer logs, and check what happened on both the source and
>>>>>> destination hosts when the migration for VM “i-638-1735-VM” was
>>>>>> triggered?
>>>>>>
>>>>>> Do you have the knowledge/skills to use an IDE (e.g. Eclipse) to
>>>>>> connect to a running JVM and debug the code? Because for me it will
>>>>>> not be possible to go further without looking at the variables in a
>>>>>> running environment.
>>>>>>
>>>>>>
>>>>>> On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
>>>>>> <fr...@everyware.ch>
>>>>>> wrote:
>>>>>>
>>>>>> Dear Rafael
>>>>>>>
>>>>>>>
>>>>>>> Sorry for the late answer (I was sick for a couple of days).
>>>>>>>
>>>>>>> I confirm we are migrating inside the same XenServer pool (within a
>>>>>>> CS cluster).
>>>>>>>
>>>>>>>
>>>>>>> Best Regards
>>>>>>> Francois
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 26.01.2017 04:35, Rafael Weingärtner wrote:
>>>>>>>
>>>>>>> Found the code that is executed.
>>>>>>>> I need you to confirm whether you are migrating within a cluster or
>>>>>>>> across clusters.
>>>>>>>>
>>>>>>>> On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
>>>>>>>> <ra...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> I do not have access to ACS code right now, but I suggest starting
>>>>>>>>
>>>>>>>>> debugging here:
>>>>>>>>>
>>>>>>>>> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
>>>>>>>>>
>>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>>>>> ctx-175d37df)
>>>>>>>>>> VM state transitted from :Running to Stopping with event:
>>>>>>>>>> StopRequestedvm's
>>>>>>>>>> original host id: 224 new host id: 15 host id before state
>>>>>>>>>> transition:
>>>>>>>>>> 15
>>>>>>>>>>
>>>>>>>>>> I would try to understand first why ACS requested the hypervisor
>>>>>>>>>> to stop
>>>>>>>>>>
>>>>>>>>> the VM. I mean the following: check the source code for conditions
>>>>>>>>> that
>>>>>>>>> would make ACS request the shutdown of VMs.
>>>>>>>>> BTW: are you migrating within a cluster or across clusters?
>>>>>>>>>
>>>>>>>>> I will be able to analyze this further only after work hours, so
>>>>>>>>> if you
>>>>>>>>> find anything, keep me posted.
>>>>>>>>>
>>>>>>>>> On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
>>>>>>>>> <fr...@everyware.ch>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hello Rafael
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I think the important log lines are these:
>>>>>>>>>>
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because
>>>>>>>>>> state has
>>>>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>>>>> Migration
>>>>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from
>>>>>>>>>> :Running to
>>>>>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new
>>>>>>>>>> host
>>>>>>>>>> id:
>>>>>>>>>> 15 host id before state transition: 15
>>>>>>>>>>
>>>>>>>>>> The really disturbing thing is that CS stops the VM again if we
>>>>>>>>>> start it manually, apparently because the migrating job is still
>>>>>>>>>> running.
>>>>>>>>>>
>>>>>>>>>> Note that the VM HA flag is enabled.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thank you.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Best Regards.
>>>>>>>>>>
>>>>>>>>>> Francois
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> more details:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>>>>>>>>>> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
>>>>>>>>>> VM[User|i-638-1736-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>>>>> job-14953/job-191351) Done with run of VM work job:
>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8b
>>>>>>>>>> acc64
>>>>>>>>>> job-14953/job-191351) Done executing
>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for
>>>>>>>>>> job-191351
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>>>>> job-14953/job-191351) Remove job-191351 from job monitoring
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>>>>>>>>>> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c
>>>>>>>>>> work-1989)
>>>>>>>>>> Completed HAWork[1989-Migration-1411-Running-Migrating]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>>>>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375:
>>>>>>>>>> Forwarding
>>>>>>>>>> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
>>>>>>>>>> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
>>>>>>>>>> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>>>>>>>> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>>>>>>>> ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>>>>>>>> 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>>>>>>>> (64-bit)","platformEmulator":"Other install
>>>>>>>>>> media","bootArgs":"","enable
>>>>>>>>>> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>>>>>>>> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>>>>>>>> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>>>>>>>> rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>>>>>>>> age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio"
>>>>>>>>>> :"4.0","
>>>>>>>>>> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>>>>>>>> -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>>>>>>>> stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>>>>>>>> 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8
>>>>>>>>>> 751b89-00f
>>>>>>>>>> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>>>>>>>> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>>>>>>>> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>>>>>>>> 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>>>>>>>> ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"
>>>>>>>>>> path":"d3
>>>>>>>>>> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>>>>>>>> :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>>>>>>>> gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>>>>>>>> XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>>>>>>>> -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>>>>>>>> storagePort":"0",
>>>>>>>>>> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>>>>>>>> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>>>>>>>> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>>>>>>>> ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>>>>>>>> ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","
>>>>>>>>>> id":64,"
>>>>>>>>>> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>>>>>>>> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>>>>>>>> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>>>>>>>> UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>>>>>>>> DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a
>>>>>>>>>> 32b-ee3e2
>>>>>>>>>> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>>>>>>>> ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>>>>>>>> deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>>>>>>>> path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>>>>>>>> K","_details":{"managed":"false","storagePort":"0","stor
>>>>>>>>>> ageHost":"localhos
>>>>>>>>>> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>>>>>>>> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>>>>>>>> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>>>>>>>> :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"
>>>>>>>>>> PreSetup","
>>>>>>>>>> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>>>>>>>> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>>>>>>>> -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>>>>>>>>
>>>>>>>>>> e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>>>>>>>> ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>>>>>>>> ,"volumeId":86
>>>>>>>>>> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>>>>>>>> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>>>>>>>> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>>>>>>>> 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>>>>>>>> false","storagePort":"0","storageHost":"localhost","volumeSi
>>>>>>>>>> ze":"42949
>>>>>>>>>> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>>>>>>>> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>>>>>>>> id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>>>>>>>> ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS
>>>>>>>>>> 4.5
>>>>>>>>>> (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>>>>>>>> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>>>>>>>> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>>>>>>>> d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>>>>>>>> a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>>>>>>>> gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>>>>>>>> .8.8","dns2":"8.8
>>>>>>>>>> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>>>>>>>> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>>>>>>>> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] } to
>>>>>>>>>> 345049098498
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375:
>>>>>>>>>> Processing:
>>>>>>>>>> {
>>>>>>>>>> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
>>>>>>>>>> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>>>>>>>> true,"wait":0}}]
>>>>>>>>>> }
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>>> [c.c.a.m.AgentAttache]
>>>>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No more
>>>>>>>>>> commands
>>>>>>>>>> found
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>>>>> ctx-175d37df)
>>>>>>>>>> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId:
>>>>>>>>>> 345049103441,
>>>>>>>>>> via:
>>>>>>>>>> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>>>>>>>>>>
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because
>>>>>>>>>> state has
>>>>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>>>>> Migration
>>>>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from
>>>>>>>>>> :Running to
>>>>>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new
>>>>>>>>>> host
>>>>>>>>>> id:
>>>>>>>>>> 15 host id before state transition: 15
>>>>>>>>>>
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>>>>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b
>>>>>>>>>> 86f5e
>>>>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643:
>>>>>>>>>> Forwarding
>>>>>>>>>> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
>>>>>>>>>> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>>>>>>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>>>>>>>> 638-1736-VM","wait":0}}]
>>>>>>>>>> } to 345049098498
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>>>>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>>>>> 254-800498:
>>>>>>>>>> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver: v1,
>>>>>>>>>> Flags:
>>>>>>>>>> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>>>>>>>> proxyVmId":1530,"_loadInfo":"{\n
>>>>>>>>>> \"connections\": []\n}","wait":0}}] }
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>>>>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>>>>> 254-800498:
>>>>>>>>>> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via: 254,
>>>>>>>>>> Ver:
>>>>>>>>>> v1,
>>>>>>>>>> Flags: 100010, [{"com.cloud.agent.api.AgentCo
>>>>>>>>>> ntrolAnswer":{"result":true,"wait":0}}] }
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG
>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>> (AgentManager-Handler-11:null) Seq 15-6889944479923437632:
>>>>>>>>>> Processing:
>>>>>>>>>> {
>>>>>>>>>> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
>>>>>>>>>> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>>>>>>>> e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>>>>>>>> ":true,"details":"Stop
>>>>>>>>>> VM i-638-1735-VM Succeed","wait":0}}] }
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG
>>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379
>>>>>>>>>> ctx-bbabfeae)
>>>>>>>>>> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId:
>>>>>>>>>> 345049103441,
>>>>>>>>>> via:
>>>>>>>>>> 15, Ver: v1, Flags: 10, { StopAnswer } }
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is
>>>>>>>>>> stopped on
>>>>>>>>>> the
>>>>>>>>>> host. Proceeding to release resource held.
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>>> supported
>>>>>>>>>> in the network id=519
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>>>>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Changing active number of nics
>>>>>>>>>> for
>>>>>>>>>> network id=519 on -1
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to
>>>>>>>>>> release
>>>>>>>>>> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>>>>>>>> 92.168.1.11-null
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released network
>>>>>>>>>> resources
>>>>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released storage
>>>>>>>>>> resources
>>>>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM state transitted from
>>>>>>>>>> :Stopping to
>>>>>>>>>> Stopped with event: OperationSucceededvm's original host id: 15
>>>>>>>>>> new host
>>>>>>>>>> id: null host id before state transition: 15
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU:
>>>>>>>>>> 82976 and
>>>>>>>>>> CPU
>>>>>>>>>> after applying overprovisioning: 331904
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM:
>>>>>>>>>> 261961927808
>>>>>>>>>> and RAM after applying overprovisioning: 261961924608
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old
>>>>>>>>>> used:
>>>>>>>>>> 71300,reserved: 2100, actual total: 82976, total with
>>>>>>>>>> overprovisioning:
>>>>>>>>>> 331904; new used: 67100,reserved:6300; movedfromreserved:
>>>>>>>>>> false,moveToReserveredtrue
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old
>>>>>>>>>> used:
>>>>>>>>>> 98784247808,reserved: 1073741824, total: 261961924608; new used:
>>>>>>>>>> 94489280512,reserved:5368709120;
>>>>>>>>>> movedfromreserved:
>>>>>>>>>>
>>>>>>>>>> false,moveToReserveredtrue
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>>> supported
>>>>>>>>>> in the network id=519
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>>> supported
>>>>>>>>>> in the network id=519
>>>>>>>>>>
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>>>>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused
>>>>>>>>>> by:
>>>>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to
>>>>>>>>>> migrate
>>>>>>>>>> VM[User|i-638-1735-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>>>>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
>>>>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to
>>>>>>>>>> migrate
>>>>>>>>>> VM[User|i-638-1735-VM]
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379) Done with run of VM work job:
>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379,
>>>>>>>>>> userId:
>>>>>>>>>> 1,
>>>>>>>>>> accountId: 1, instanceType: null, instanceId: null, cmd:
>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>>>>>>>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>>>>>>>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>>>>>>>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>>>>>>>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>>>>>>>> FuYWdlckltcGwAAAAAAAAA4A,
>>>>>>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode:
>>>>>>>>>> 0,
>>>>>>>>>> result: null, initMsid: 345049098498, completeMsid: null,
>>>>>>>>>> lastUpdated:
>>>>>>>>>> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017},
>>>>>>>>>> job
>>>>>>>>>> origin:13733
>>>>>>>>>> ewcstack-man02-prod: com.cloud.utils.exception.Clou
>>>>>>>>>> dRuntimeException:
>>>>>>>>>> Unable to migrate VM[User|i-638-1735-VM]
>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.ja
>>>>>>>>>> va:2405)
>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.ja
>>>>>>>>>> va:4517)
>>>>>>>>>> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAcc
>>>>>>>>>> essor402.invoke(Unknown
>>>>>>>>>> Source)
>>>>>>>>>> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
>>>>>>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>>> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
>>>>>>>>>> e(Method.java:606)
>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
>>>>>>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
>>>>>>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>>>>>>> erImpl.java:537)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.
>>>>>>>>>> java:49)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>>>>>>> ext.java:56)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>>>>>>> nagedContext.java:103)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>>>>>>> agedContext.java:53)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.ja
>>>>>>>>>> va:46)
>>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.ja
>>>>>>>>>> va:494)
>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.Executors
>>>>>>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
>>>>>>>>>> k.run(FutureTask.java:262)
>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>>>> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379) Complete async job-191379, jobStatus:
>>>>>>>>>> FAILED,
>>>>>>>>>> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>>>>>>>> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>>>>>>>> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>>>>>>>> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>>>>>>>> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>>>>>>>> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>>>>>>>> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>>>>>>>> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>>>>>>>> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>>>>>>>> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>>>>>>>> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>>>>>>>> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>>>>>>>> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>>>>>>>> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>>>>>>>> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>>>>>>>> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>>>>>>>> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>>>>>>>> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>>>>>>>> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>>>>>>>> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>>>>>>>> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>>>>>>>> ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>>>>>>>> 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>>>>>>>> AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>>>>>>>> cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>>>>>>>> AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>>>>>>>> YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>>>>>>>> ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>>>>>>>> YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>>>>>>>> dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>>>>>>>> dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>>>>>>>> Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>>>>>>>> dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>>>>>>>> cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>>>>>>>> TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>>>>>>>> ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>>>>>>>> LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>>>>>>>> dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>>>>>>>> ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>>>>>>>> JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>>>>>>>> AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>>>>>>>> ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>>>>>>>> YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>>>>>>>> AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>>>>>>>> JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>>>>>>>> jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>>>>>>>> bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>>>>>>>> dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>>>>>>>> x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379) Publish async job-191379 complete on
>>>>>>>>>> message bus
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379) Wake up jobs related to job-191379
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379) Update db status for job-191379
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379) Wake up jobs joined with job-191379 and
>>>>>>>>>> disjoin
>>>>>>>>>> all
>>>>>>>>>> subjobs created from job- 191379
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>>> db528
>>>>>>>>>> job-13733/job-191379) Done executing
>>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for
>>>>>>>>>> job-191379
>>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>>>>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>>> job-13733/job-191379) Remove job-191379 from job monitoring
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>>>>>>>>>
>>>>>>>>>> Did you check the ACS log files looking for some exceptions when
>>>>>>>>>> the
>>>>>>>>>>
>>>>>>>>>>> migrations for this specific VM started?
>>>>>>>>>>>
-- 
Rafael Weingärtner

Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Francois Scheurer <fr...@everyware.ch>.
Dear Rafael


We changed it directly on the VM to avoid the required stop and start.
Changing it on the service offering in use does not affect
vm_instance.ha_enabled.
We could create a new offering without HA, but the VMs would need to be
stopped before we could apply it.
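
For reference, something like the following should show both flags side by
side (just a sketch against the stock schema; I am assuming
vm_instance.service_offering_id and service_offering.ha_enabled are the
relevant columns, and the ids are simply our 5 problematic VM's):

  select vi.id, vi.instance_name, vi.ha_enabled as vm_ha,
         so.ha_enabled as offering_ha
  from vm_instance vi
  join service_offering so on so.id = vi.service_offering_id
  where vi.removed is null
    and vi.id in (1177, 1178, 1240, 1690, 1736);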

We did not disable HA on these 4 VMs to solve the problem, but just to
minimize its impact (now only i-189-1177-VM should be stopped
intermittently) in case the cleaning of op_it_work does not fix the issue.
Of course we could filter the update query by “instance id”, but if you
look at the table output, all instance_id values are related to the 5
problematic VM's.
The only exceptions are the first 3 rows, which are related to VM's that
have not existed anymore since July 2015 ...
That is why I thought it may be safe to do.
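
To double-check that before running anything, a query along these lines
should list the non-"Done" rows pointing to already removed VM's (again
only a sketch, reusing the same columns as in the earlier output):

  select o.id, o.instance_id, v.instance_name, v.removed
  from op_it_work o
  join vm_instance v on v.id = o.instance_id
  where o.step != "Done"
    and v.removed is not null;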


Best Regards

Francois Scheurer








On 07.02.2017 16:27, Rafael Weingärtner wrote:
> So, you changed the flags directly on "vm_instance". Why didn't you 
> change the service offering of these VMs using the ACS UI or API? For me, 
> it feels that the best approach would be to change the service 
> offering of the affected VMs (the HA flag is first configured at the 
> service offering). You could assign a service offering without HA; I 
> mean, it would be a matter of creating a similar service offering 
> (same configuration as the one they are using right now), but without HA.
>
> I am assuming that the change made directly on “vm_instance” did not solve 
> your problem, right? That is why you want to change some data in 
> “op_it_work”. I took a quick look at “op_it_work”, and it seems that 
> this table may be used for more things than HA. Here is a tip for 
> complex production systems: never run update SQLs in production 
> environments without a very, very restrictive and well-thought-out “where” 
> clause. In my opinion this is not restrictive enough (update 
> op_it_work set step="Done" where step!="Done"). I would advise you not 
> to go further and change configurations directly in the database (unless 
> you have a dev team that is capable of reading, debugging, and working 
> with the ACS source code). ACS has over 500k lines of Java code, and God 
> knows how many lines of shell, Python and other languages, software 
> and systems that may be used; it is sometimes pretty hard to get a 
> clear picture of how a change in the database will reverberate through the 
> whole system without knowing in detail the installation 
> configuration and underlying structures (networking, hypervisors, 
> storage systems and other pieces). However, if you want to proceed 
> with that, I would suggest adding some extra filters to that 
> “where” clause. What about adding the “instance id” there?
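>
> Just to illustrate what I mean, a more restrictive variant could look
> roughly like this (a sketch only; the ids below are simply the 5
> problematic VMs from your table output, so please double check them):
>
>   update op_it_work set step="Done"
>   where step != "Done"
>     and instance_id in (1177, 1178, 1240, 1690, 1736);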
>
> PS: BTW, have you already heard about the CloudStack fork from Schuberg 
> Philis named Cosmic? https://github.com/MissionCriticalCloud/cosmic
> Yes, I have. They were part of the active ACS community back then. I 
> was already here when they created the fork. Still, I stayed with ACS.
>
>
> On Tue, Feb 7, 2017 at 3:54 AM, Francois Scheurer 
> <francois.scheurer@everyware.ch> wrote:
>
>     Dear Rafael
>
>
>     We temporarily disabled the HA flag on these VM's by updating the
>     DB directly: update vm_instance set ha_enabled=0 where removed is
>     null and id in (1178, 1240, 1690, 1736);
>
>     We kept the test VM i-189-1177-VM with the HA flag.
>
>     Restarting the MS + cleaning the op_it_work table will hopefully
>     fix the issue. And if the issue persists, then only
>     i-189-1177-VM should be affected, hence mitigating the impact.
>
>     On the other hand, if everything runs fine, we would thereafter
>     re-enable the HA flag on the problematic VM's.
>
>
>     Do you think it is safe to run this update?: update op_it_work set
>     step="Done" where step!="Done";
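>
>     For illustration, one way to keep this easily reversible would be to
>     snapshot the table first, e.g. something like (the backup table name
>     is just an example):
>
>       create table op_it_work_backup_20170207 as select * from op_it_work;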
>
>
>     Best Regards
>
>     Francois Scheurer
>
>
>
>     PS: BTW, have you already heard about the CloudStack fork from
>     Schuberg Philis named Cosmic?
>     https://github.com/MissionCriticalCloud/cosmic
>
>
>
>
>
>     On 06.02.2017 16:41, Rafael Weingärtner wrote:
>>
>>     I totally understand your position of not being able to debug the
>>     ACS in production ;)
>>
>>
>>     There is one bit I did not quite understand. If you have disabled
>>     HA for the VMs, why do you still need to change data in the
>>     database? How did you disable the HA of the VMs?
>>
>>
>>     On Mon, Feb 6, 2017 at 5:04 AM, Francois Scheurer
>>     <francois.scheurer@everyware.ch> wrote:
>>
>>         Dear Rafael / Dear CS contributors
>>
>>
>>         I tried the following:
>>
>>         1.     stop all MS
>>         2.     backup DB
>>         3.     select * from op_ha_work where type="Migration" and
>>         mgmt_server_id is null and taken is null;
>>         4.     delete from op_ha_work where type="Migration" and
>>         mgmt_server_id is null and taken is null;
>>         5.     start all MS
>>
>>         But actually, after the MS stopped, the op_ha_work table was
>>         automatically cleaned, making my steps useless.
>>         After the MS started, the op_ha_work table was filled again and
>>         the related VM's were stopped several times (3-7 times),
>>         requiring close attention to manually restart them after each stop.
>>
>>         I later found another table that could be the cause of the issue:
>>
>>         mysql> select
>>         op_it_work.*,vm_instance.instance_name,vm_instance.removed
>>         from op_it_work join vm_instance where
>>         op_it_work.step!="Done" and
>>         op_it_work.instance_id=vm_instance.id
>>         order by instance_id;
>>         +--------------------------------------+----------------+------------+-----------------------+-----------+--------------------+-----------+------------+-------------+---------------+-------------+---------------+---------------------+
>>         | id | mgmt_server_id | created_at | thread                |
>>         type      | vm_type            | step      | updated_at |
>>         instance_id | resource_type | resource_id | instance_name |
>>         removed             |
>>         +--------------------------------------+----------------+------------+-----------------------+-----------+--------------------+-----------+------------+-------------+---------------+-------------+---------------+---------------------+
>>         | be4d0dd2-7d4f-47d3-ba49-623d2f115bbf |   345049098498 |
>>         1436187965 | consoleproxy-1
>>         | Starting  | ConsoleProxy       | Release   | 1436188265
>>         |         402 | NULL
>>         |           0 | v-402-VM      | 2015-07-06 13:23:22 |
>>         | fb4d1844-eb92-478c-9f50-d666790f43af |   345049098498 |
>>         1436187965 | secstorage-1
>>         | Starting  | SecondaryStorageVm | Release   | 1436188265
>>         |         687 | NULL
>>         |           0 | s-687-VM      | 2015-07-06 13:11:34 |
>>         | 7dfeb25a-a705-42e8-907c-79d1e69697ed |   345049098498 |
>>         1436964548 | Work-Job-Executor-165
>>         | Migrating | User               | Migrating | 1436964548
>>         |         776 | Host
>>         |          29 | i-51-776-VM   | 2015-07-15 12:55:38 |
>>         | 2018ac61-b11c-4101-8ef5-07a39509f2bd |   345049098498 |
>>         1486020173 | Work-Job-Executor-19  | Migrating |
>>         User               | Prepare   | 1486020173 |        1177 |
>>         Host          |          25 | i-189-1177-VM |
>>         NULL                |
>>         | 25430cd5-4346-4643-9c7e-7b39b8a876a2 |   345049098122 |
>>         1486019693 | Work-Job-Executor-6   | Migrating |
>>         User               | Prepare   | 1486019693 |        1177 |
>>         Host          |          25 | i-189-1177-VM |
>>         NULL                |
>>         | 3276d52e-cbd9-44c5-8ab6-c70784337ff9 |   345049098498 |
>>         1486020953 | Work-Job-Executor-33  | Migrating |
>>         User               | Prepare   | 1486020953 |        1177 |
>>         Host          |          33 | i-189-1177-VM |
>>         NULL                |
>>         | 6462da33-83f2-4575-af38-0e52d2a48488 |   345049098122 |
>>         1486021433 | Work-Job-Executor-12  | Migrating |
>>         User               | Prepare   | 1486021433 |        1177 |
>>         Host          |          25 | i-189-1177-VM |
>>         NULL                |
>>         | 688e16c8-c39f-4e88-9e8f-36c0282f3df1 |   345049098122 |
>>         1486020773 | Work-Job-Executor-11  | Migrating |
>>         User               | Prepare   | 1486020773 |        1177 |
>>         Host          |          25 | i-189-1177-VM |
>>         NULL                |
>>         | d8a18e9d-ac6f-4273-923e-e401533e2cf7 |   345049098498 |
>>         1486020353 | Work-Job-Executor-25  | Migrating |
>>         User               | Prepare   | 1486020353 |        1177 |
>>         Host          |          25 | i-189-1177-VM |
>>         NULL                |
>>         | 1222ece8-399d-43d8-8ff8-d24bdebca2ed |   345049103441 |
>>         1486018924 | Work-Job-Executor-2   | Migrating |
>>         User               | Prepare   | 1486018924 |        1178 |
>>         Host          |          29 | i-189-1178-VM |
>>         NULL                |
>>         | 2ac5ca4a-c20d-433a-8379-40e0de3b3ba8 |   345049098498 |
>>         1486021613 | Work-Job-Executor-41  | Migrating |
>>         User               | Prepare   | 1486021613 |        1178 |
>>         Host          |          29 | i-189-1178-VM |
>>         NULL                |
>>         | 3e3b95a0-2af4-4268-bfdd-f3399a85bcaa |   345049103441 |
>>         1486019584 | Work-Job-Executor-5   | Migrating |
>>         User               | Prepare   | 1486019584 |        1178 |
>>         Host          |          29 | i-189-1178-VM |
>>         NULL                |
>>         | 53745a2d-056c-4a1b-9267-eb97f077a6c8 |   345049103441 |
>>         1486020244 | Work-Job-Executor-6   | Migrating |
>>         User               | Prepare   | 1486020244 |        1178 |
>>         Host          |          29 | i-189-1178-VM |
>>         NULL                |
>>         | e3555b49-7c18-4272-b7f4-ba9351b3bbd1 |   345049098498 |
>>         1486020893 | Work-Job-Executor-32  | Migrating |
>>         User               | Prepare   | 1486020893 |        1178 |
>>         Host          |          29 | i-189-1178-VM |
>>         NULL                |
>>         | 1dcd5204-4f9a-4342-a23b-329a48596e13 |   345049098498 |
>>         1486020953 | Work-Job-Executor-34  | Migrating |
>>         User               | Prepare   | 1486020953 |        1240 |
>>         Host          |           3 | i-388-1240-VM |
>>         NULL                |
>>         | ec1668b3-fa3c-4673-b593-3d607b169a24 |   345049098498 |
>>         1486019693 | Work-Job-Executor-14  | Migrating |
>>         User               | Prepare   | 1486019693 |        1240 |
>>         Host          |           3 | i-388-1240-VM |
>>         NULL                |
>>         | fa7b2dec-ed7f-4a67-a055-e6e6bc1631ca |   345049098498 |
>>         1486020353 | Work-Job-Executor-26  | Migrating |
>>         User               | Prepare   | 1486020353 |        1240 |
>>         Host          |           3 | i-388-1240-VM |
>>         NULL                |
>>         | 8cfe6d3e-5b7b-4be8-ac10-c865a492aab9 |   345049098498 |
>>         1465387271 | Work-Job-Executor-18  | Running   |
>>         User               | Prepare   | 1465387271 |        1388 |
>>         Host          |           7 | i-361-1388-VM |
>>         NULL                |
>>         | 6e941b7e-6f19-4717-bf82-974c59737dbc |   345049103441 |
>>         1486019524 | Work-Job-Executor-4   | Migrating |
>>         User               | Prepare   | 1486019524 |        1690 |
>>         Host          |          15 | i-648-1690-VM |
>>         NULL                |
>>         | 718578a4-0dd6-42d9-be10-ed7680650108 |   345049098122 |
>>         1486021433 | Work-Job-Executor-13  | Migrating |
>>         User               | Prepare   | 1486021433 |        1690 |
>>         Host          |          15 | i-648-1690-VM |
>>         NULL                |
>>         | 9acf3908-ef13-4b03-8c2b-1a7e55e18491 |   345049098498 |
>>         1486020173 | Work-Job-Executor-20  | Migrating |
>>         User               | Prepare   | 1486020173 |        1690 |
>>         Host          |          15 | i-648-1690-VM |
>>         NULL                |
>>         | a50c420b-ae52-4caa-83db-704b52b299a4 |   345049098498 |
>>         1486020773 | Work-Job-Executor-28  | Migrating |
>>         User               | Prepare   | 1486020773 |        1690 |
>>         Host          |          15 | i-648-1690-VM |
>>         NULL                |
>>         | 2b5f4179-5975-4326-9db7-643c3ede63fe |   345049103441 |
>>         1484224516 | Work-Job-Executor-161 | Running   |
>>         User               | Prepare   | 1484224516 |        1703 |
>>         Host          |           3 | i-51-1703-VM  |
>>         NULL                |
>>         | 2f3c6c44-b2a0-4d10-88d2-ac143d2dd379 |   345049098498 |
>>         1484223910 | Work-Job-Executor-163 | Running   |
>>         User               | Prepare   | 1484223910 |        1703 |
>>         Host          |           3 | i-51-1703-VM  |
>>         NULL                |
>>         | 4126a23b-1cde-49bc-8636-c0d063b85cf6 |   345049103441 |
>>         1484224636 | Work-Job-Executor-86  | Running   |
>>         User               | Prepare   | 1484224636 |        1703 |
>>         Host          |           3 | i-51-1703-VM  |
>>         NULL                |
>>         | 5dad5645-766b-48c9-9a85-b77b3c101390 |   345049103441 |
>>         1484224636 | Work-Job-Executor-92  | Running   |
>>         User               | Prepare   | 1484224636 |        1703 |
>>         Host          |           3 | i-51-1703-VM  |
>>         NULL                |
>>         | 8c1dfff9-e55d-4f81-a22f-45f72accc416 |   345049103441 |
>>         1484224576 | Work-Job-Executor-95  | Running   |
>>         User               | Prepare   | 1484224576 |        1703 |
>>         Host          |           3 | i-51-1703-VM  |
>>         NULL                |
>>         | a703d2d7-b1e3-42ce-a551-831982990ffb |   345049098498 |
>>         1484224630 | Work-Job-Executor-13  | Running   |
>>         User               | Prepare   | 1484224630 |        1703 |
>>         Host          |           3 | i-51-1703-VM  |
>>         NULL                |
>>         | 0e194cb5-95b7-41cb-969a-ae37d2ad449a |   345049098498 |
>>         1485763613 | Work-Job-Executor-28  | Migrating |
>>         User               | Prepare   | 1485763613 |        1735 |
>>         Host          |          19 | i-638-1735-VM | 2017-01-30
>>         15:24:10 |
>>         | 74f180b9-be0f-429b-9a45-597ca2ca997d |   345049098498 |
>>         1485763013 | Work-Job-Executor-21  | Migrating |
>>         User               | Prepare   | 1485763013 |        1735 |
>>         Host          |          19 | i-638-1735-VM | 2017-01-30
>>         15:24:10 |
>>         | feaa421c-88df-4493-9a4f-bddb6ee1b75b |   345049098122 |
>>         1485764285 | Work-Job-Executor-19  | Migrating |
>>         User               | Prepare   | 1485764285 |        1735 |
>>         Host          |          19 | i-638-1735-VM | 2017-01-30
>>         15:24:10 |
>>         | 32138a36-ad18-42b5-8478-76ad63caf4da |   345049098122 |
>>         1486020293 | Work-Job-Executor-9   | Migrating |
>>         User               | Prepare   | 1486020293 |        1736 |
>>         Host          |          19 | i-638-1736-VM |
>>         NULL                |
>>         | 5306ab1f-3c92-4f95-bbff-283b43e6862d |   345049098498 |
>>         1486019573 | Work-Job-Executor-11  | Migrating |
>>         User               | Prepare   | 1486019573 |        1736 |
>>         Host          |          19 | i-638-1736-VM |
>>         NULL                |
>>         | 843bdf9c-247d-432c-ac19-1a4f7858631b |   345049098498 |
>>         1486021493 | Work-Job-Executor-38  | Migrating |
>>         User               | Prepare   | 1486021493 |        1736 |
>>         Host          |          19 | i-638-1736-VM |
>>         NULL                |
>>         | 8d400afb-766f-4ac5-89ee-854f25a38eba |   345049098498 |
>>         1486020173 | Work-Job-Executor-21  | Migrating |
>>         User               | Prepare   | 1486020173 |        1736 |
>>         Host          |          19 | i-638-1736-VM |
>>         NULL                |
>>         | f0ae5612-b6ef-4882-94a2-40a1e6dce342 |   345049098498 |
>>         1486020833 | Work-Job-Executor-30  | Migrating |
>>         User               | Prepare   | 1486020833 |        1736 |
>>         Host          |          19 | i-638-1736-VM |
>>         NULL                |
>>         +--------------------------------------+----------------+------------+-----------------------+-----------+--------------------+-----------+------------+-------------+---------------+-------------+---------------+---------------------+
>>
>>         As you can see, the problematic VM's are all present in that table:
>>
>>             i-189-1177-VM
>>             i-189-1178-VM
>>             i-388-1240-VM
>>             i-648-1690-VM
>>             i-638-1736-VM
>>
>>         As the MSs are running production environments, we
>>         cannot debug much and just need the shortest path to resolution.
>>         For example, restarting the MSs has an impact on our
>>         customers, so we need to keep the number of restarts to a
>>         minimum.
>>
>>         I have disabled the HA on all problematic VM's except on
>>         i-189-1177-VM, which is less critical.
>>
>>         The next steps could be:
>>
>>         1.     stop all MS
>>         2.     backup DB
>>         3.     select
>>         op_it_work.*,vm_instance.instance_name,vm_instance.removed
>>         from op_it_work join vm_instance where
>>         op_it_work.step!="Done" and
>>         op_it_work.instance_id=vm_instance.id
>>         order by instance_id;
>>         4.     update op_it_work set step="Done" where step!="Done";
>>         5.     start all MS
>>
>>         What do you think about step 4, safe or risky?
>>         Some of these entries are related to deleted VM's and seem
>>         safe to mark as "Done".
>>         All other entries are related to the problematic VM's that
>>         keep stopping... so marking them as "Done" may (hopefully)
>>         eliminate the cause of the Migrating/Stopping.
>>         But we are not sure whether marking them as "Done" also requires
>>         changes in other tables for consistency.
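>>
>>         One idea, as a sketch only and assuming the 4.5 schema names
>>         are right (we have not verified them), would be to first check
>>         the related job tables for these instances before touching
>>         op_it_work:
>>
>>           select * from vm_work_job
>>           where vm_instance_id in (1177, 1178, 1240, 1690, 1736);
>>
>>           select id, job_cmd, job_status from async_job
>>           where id in (select id from vm_work_job
>>                        where vm_instance_id in (1177, 1178, 1240, 1690, 1736));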
>>
>>
>>         Thank you for your help.
>>
>>
>>         Best Regards
>>         Francois Scheurer
>>
>>
>>
>>
>>
>>         On 01.02.2017 03:01, Rafael Weingärtner wrote:
>>>         Ok, I dug a little further into the code.
>>>
>>>             If we put a host in maintenance, it may unfortunately
>>>             happen that a timeout occurs and CS aborts the
>>>             migration. As far as we can interpret the logs, it seems
>>>             (we speculate) that CS detects that the VM is then still
>>>             running on the original host, which is in maintenance,
>>>             and decides somehow to stop it because of the "host not
>>>             up" status.
>>>
>>>
>>>         From what I have seen in the code, this may happen (as you
>>>         speculated). If a timeout or some other event interrupts
>>>         the migration, the VM will be stopped. This happens in the
>>>         method
>>>         “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>>         long, DeploymentPlanner)” at line 2396 and onwards. This was
>>>         implemented on purpose; if the migration does not happen,
>>>         the VM is stopped on the source host. Probably it was
>>>         implemented this way following the gradient descent
>>>         philosophy from artificial intelligence:
>>>
>>>             “pick a random point in the field, update the values
>>>             according to the slope, and hopefully after successive
>>>             iterations you converge to a minimum point.” (the
>>>             problem is that sometimes the function diverges).
>>>
>>>
>>>         I mean, this was probably done this way because HA was
>>>         supposed to see that a VM with HA is stopped, and should try
>>>         to start it again somewhere else. The log you posted does not
>>>         provide any other details to debug this further because the
>>>         log is from after an MS restart, right?
>>>
>>>
>>>             After restarting the MSs I guess that a similar issue
>>>             may happen: the hosts are disconnected during the early
>>>             stage and then reconnects. Maybe CS then again stops the
>>>             VMs because the "host not up" status?
>>>             Cf. 2017-01-30 08:44:06,157 DEBUG
>>>             [c.c.d.DeploymentPlanningManagerImpl]
>>>             (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>>             ctx-6e9c753f) The last host of this VM is not UP or is
>>>             not enabled, host status is: Disconnected, host resource
>>>             state is: Enabled
>>>
>>>
>>>         Here is something interesting that you have found. I checked
>>>         the code, but without a live debug it is pretty hard for me
>>>         to get the right flow of execution. By looking at the code
>>>         and at the logs you provided, your deduction is right. After
>>>         ACS boots up, the check for hosts that are
>>>         shutdown/deactivated starts, and the method
>>>         “com.cloud.agent.manager.AgentManagerImpl.executeUserRequest(long,
>>>         Event)” is executed. The problem is that all of the hosts
>>>         are considered disconnected (ACS may not have finished
>>>         connecting to all of them yet), and then a whole HA process
>>>         is kicked off. The problem is that there is no host with
>>>         status “UP”. That is why this happens: “No suitable hosts
>>>         found under this Cluster: 25”, and then the whole HA flow,
>>>         which shuts down the VM, is kicked off.
>>>
>>>         In theory, the HA worker should see that a VM with HA is
>>>         down, and should try to start it. I do not understand why
>>>         this is not working. If you decide to debug the running ACS,
>>>         I may be able to help you, just keep me posted.
>>>
>>>
>>>         On Tue, Jan 31, 2017 at 10:26 AM, Rafael Weingärtner
>>>         <rafaelweingartner@gmail.com> wrote:
>>>
>>>             I will be able to re-check some code and answer you with
>>>             new thoughts on this problem tonight (the log files you
>>>             provided today have some very insightful information
>>>             regarding this problem). I believe this is a bug in HA,
>>>             a very tricky one that has not been reported yet (at
>>>             least I do not remember seeing anything similar). And
>>>             thanks to randomness/destiny now you have
>>>             found/uncovered a way to reproduce the problem (bug?!).
>>>
>>>             I do not suggest changing any data at all directly
>>>             in the database (unless you know the ACS code well and are
>>>             100% sure of what you are doing). This can cause even bigger
>>>             and more complicated problems. If you remove the HA
>>>             feature from these VMs, this problem will stop. However,
>>>             if you can/are willing to keep HA enabled until we fully
>>>             uncover the problem, map it and report it, that would be great.
>>>             BTW: why are you restarting MSs?
>>>             What are the configurations of your MSs?
>>>
>>>             Thanks for your time and for sharing information with
>>>             our community.
>>>
>>>
>>>
>>>             On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer
>>>             <francois.scheurer@everyware.ch> wrote:
>>>
>>>                 Dear Rafael
>>>
>>>
>>>                 Many thanks for your deep analysis.
>>>                 Nice humor with the planets alignments BTW ;)
>>>
>>>                 We also suspected the HA flag to be related to this
>>>                 issue, as we never had this issue in the past, until
>>>                 we enabled HA on all VMs and offerings.
>>>
>>>                 Here is also the workers list:
>>>
>>>                 mysql> select * from op_ha_work;
>>>                 +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>>>                 | id   | instance_id | type   | vm_type            |
>>>                 state    | mgmt_server_id | host_id | created  |
>>>                 tried | taken  | step  | time_to_try | updated |
>>>                 +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>>>                 |  105 |         343 | HA   | User               |
>>>                 Running  |   345049098498 |   29 | 2015-03-28
>>>                 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 
>>>                 1394056013 |       3 |
>>>                 |  109 |         407 | HA   | ConsoleProxy       |
>>>                 Running  |   345049098498 |   33 | 2015-03-28
>>>                 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 
>>>                 1394056013 |       3 |
>>>                 |  113 |         407 | HA   | ConsoleProxy       |
>>>                 Running  |   345049098498 |   33 | 2015-03-28
>>>                 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 
>>>                 1394056013 |       3 |
>>>                 |  117 |         407 | HA   | ConsoleProxy       |
>>>                 Running  |   345049098498 |   33 | 2015-03-28
>>>                 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 
>>>                 1394056013 |       3 |
>>>                 |  121 |         343 | HA   | User               |
>>>                 Running  |   345049098498 |   29 | 2015-03-28
>>>                 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 
>>>                 1394056013 |       3 |
>>>                 |  125 |         405 | HA   | SecondaryStorageVm |
>>>                 Running  |   345049098498 |   29 | 2015-03-28
>>>                 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 
>>>                 1394056013 |       3 |
>>>                 |  129 |         393 | HA   | User               |
>>>                 Running  |   345049098498 |   25 | 2015-03-28
>>>                 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 
>>>                 1394056014 |      30 |
>>>                 |  133 |         405 | HA   | SecondaryStorageVm |
>>>                 Running  |   345049098498 |   29 | 2015-03-28
>>>                 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 
>>>                 1394056014 |       3 |
>>>                 |  137 |         402 | HA   | ConsoleProxy       |
>>>                 Running  |   345049098498 |   25 | 2015-03-28
>>>                 03:19:03 | 2 | 2015-03-28 03:30:04 | Error | 
>>>                 1394056014 |       3 |
>>>                 |  345 |         402 | HA   | ConsoleProxy       |
>>>                 Starting |   345049098498 |   25 | 2015-07-06
>>>                 13:09:26 | 1 | 2015-07-06 13:09:26 | Error | 
>>>                 1402527506 |     770 |
>>>                 |  349 |         687 | HA   | SecondaryStorageVm |
>>>                 Starting |   345049098498 |   25 | 2015-07-06
>>>                 13:09:26 | 1 | 2015-07-06 13:09:26 | Error | 
>>>                 1402527506 |       2 |
>>>                 | 2057 |        1178 | Migration | User  | Running 
>>>                 |   345049098498 |      33 | 2017-01-31 07:56:08 | 0
>>>                 | 2017-01-31 07:56:08 | Done  |  1451024774 |      95 |
>>>                 | 2058 |        1736 | Migration | User  | Running 
>>>                 |   345049098498 |     224 | 2017-01-31 07:56:08 | 0
>>>                 | 2017-01-31 07:56:08 | Done  |  1451024774 |      45 |
>>>                 | 2059 |        1240 | Migration | User  | Running 
>>>                 |   345049098498 |       7 | 2017-01-31 07:56:08 | 0
>>>                 | 2017-01-31 07:56:08 | Done  |  1451024774 |      61 |
>>>                 | 2060 |        1178 | Migration | User  | Running 
>>>                 |   345049098122 |      33 | 2017-01-31 07:56:08 | 1
>>>                 | 2017-01-31 08:06:37 | Done  |  1451025374 |      95 |
>>>                 | 2061 |        1178 | Migration | User  | Stopped 
>>>                 |           NULL |      33 | 2017-01-31 07:56:09 | 5
>>>                 | NULL     | Done  | 1451027900 | 
>>>                    109 |
>>>                 | 2062 |        1690 | Migration | User  | Running 
>>>                 |   345049098498 |     224 | 2017-01-31 07:56:09 | 0
>>>                 | 2017-01-31 07:56:09 | Done  |  1451024774 |     109 |
>>>                 | 2063 |        1177 | Migration | User  | Running 
>>>                 |   345049098498 |      29 | 2017-01-31 07:56:09 | 0
>>>                 | 2017-01-31 07:56:09 | Done  |  1451024774 |      82 |
>>>                 | 2064 |        1690 | Migration | User  | Running 
>>>                 |   345049098122 |     224 | 2017-01-31 07:56:09 | 1
>>>                 | 2017-01-31 07:58:18 | Done  |  1451024900 |     109 |
>>>                 | 2065 |        1240 | Migration | User  | Running 
>>>                 |   345049098122 |       7 | 2017-01-31 07:56:09 | 1
>>>                 | 2017-01-31 07:58:18 | Done  |  1451024900 |      61 |
>>>                 | 2066 |        1736 | Migration | User  | Running 
>>>                 |   345049098122 |     224 | 2017-01-31 07:56:09 | 1
>>>                 | 2017-01-31 07:58:18 | Done  |  1451024900 |      45 |
>>>                 | 2067 |        1240 | Migration | User  | Stopped 
>>>                 |           NULL |       7 | 2017-01-31 07:56:09 | 5
>>>                 | NULL     | Done  | 1451028003 | 
>>>                     80 |
>>>                 | 2068 |        1178 | Migration | User  | Running 
>>>                 |   345049098122 |      33 | 2017-01-31 07:56:16 | 0
>>>                 | 2017-01-31 07:56:16 | Done  |  1451024781 |      95 |
>>>                 | 2069 |        1736 | Migration | User  | Running 
>>>                 |   345049098122 |     224 | 2017-01-31 07:56:16 | 0
>>>                 | 2017-01-31 07:56:16 | Done  |  1451024781 |      45 |
>>>                 | 2070 |        1240 | Migration | User  | Running 
>>>                 |   345049098122 |       7 | 2017-01-31 07:56:16 | 0
>>>                 | 2017-01-31 07:56:16 | Done  |  1451024781 |      61 |
>>>                 | 2071 |        1177 | Migration | User  | Running 
>>>                 |   345049098122 |      29 | 2017-01-31 07:56:16 | 0
>>>                 | 2017-01-31 07:56:16 | Done  |  1451024781 |      82 |
>>>                 | 2072 |        1177 | Migration | User  | Stopped 
>>>                 |           NULL |      29 | 2017-01-31 07:56:16 | 5
>>>                 | NULL     | Done  | 1451027973 | 
>>>                    101 |
>>>                 | 2073 |        1177 | Migration | User  | Running 
>>>                 |   345049098122 |      29 | 2017-01-31 07:56:16 | 1
>>>                 | 2017-01-31 07:58:22 | Done  |  1451024904 |      82 |
>>>                 | 2074 |        1690 | Migration | User  | Stopped 
>>>                 |           NULL |     224 | 2017-01-31 07:56:16 | 5
>>>                 | NULL     | Done  | 1451027464 | 
>>>                    123 |
>>>                 | 2075 |        1736 | Migration | User  | Stopped 
>>>                 |           NULL |     224 | 2017-01-31 07:56:16 | 5
>>>                 | NULL     | Done  | 1451027477 | 
>>>                     59 |
>>>                 +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>>>
>>>                 If we put a host in maintenance, it may
>>>                 unfortunately happen that a timeout occurs and CS
>>>                 aborts the migration.
>>>                 As far as we can interpret the logs, it seems (we
>>>                 speculate) that CS detects that the VM is then still
>>>                 running on the original host, which is in
>>>                 maintenance, and decides somehow to stop it because
>>>                 of the "host not up" status.
>>>                 We then need to start the VM several times, until CS
>>>                 gives up stopping it.
>>>
>>>                 After restarting the MSs I guess that a similar
>>>                 issue may happen: the hosts are disconnected during
>>>                 the early stage and then reconnect. Maybe CS then
>>>                 stops the VMs again because of the "host not up" status?
>>>                 Cf. 2017-01-30 08:44:06,157 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-6:ctx-0d738a8e
>>>                 job-13737/job-192988 ctx-6e9c753f) The last host of
>>>                 this VM is not UP or is not enabled, host status is:
>>>                 Disconnected, host resource state is: Enabled
>>>
>>>                 But after the MS restart, the problem definitely
>>>                 only impacts VMs listed in the op_ha_work table.
>>>
>>>
>>>                 To your questions:
>>>                         We have 3 ACS MSs managing several Zones.
>>>                 Each zone has a single pod and cluster.
>>>                         The Cluster of i-638-1735-VM has this summary:
>>>                          Total number of VMs: 121
>>>                          Total number of hypervisors: 4
>>>                          Total allocated RAM: 522.0 / 972 GB (53.7 %)
>>>                          Total allocated cores: 211
>>>                          Total allocated storage: 11285 GB
>>>                         Used memory on the hosts currently varies from
>>>                 28 to 77%, so there is enough RAM.
>>>                         I did not check the XS logs. The same
>>>                 problem arose again this morning (shortly after
>>>                 a restart of all MSs), so I will look in the XS logs
>>>                 now.
>>>
>>>                         I have some fair experience with Eclipse and
>>>                 Java; I will ask my boss if we can go this way.
>>>
>>>                 The main issue for now is that restarting the MSs
>>>                 reproduces the problem, which is quite annoying
>>>                 for the impacted VM's.
>>>                 I am thinking of stopping the MSs, deleting all
>>>                 HA/Migration entries in op_ha_work, and starting
>>>                 the MSs again.
>>>                 Is that a viable option? (or should I delete the
>>>                 Migration entries only? or never touch this table?)
>>>                 We will also very probably remove the HA flag on all
>>>                 VMs if it does not work reliably.
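>>>
>>>                 If deleting only the Migration entries turns out to
>>>                 be the way to go, a sketch of what I have in mind,
>>>                 limited to the problematic instances, would be:
>>>
>>>                   delete from op_ha_work
>>>                   where type="Migration"
>>>                     and instance_id in (1177, 1178, 1240, 1690, 1736);
>>>
>>>                 with the MSs stopped and a DB backup taken beforehand.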
>>>
>>>                 Best Regards
>>>                 Francois Scheurer
>>>
>>>
>>>
>>>                 Here are new logs regarding the same issue of today.
>>>                 I tried to get the important parts related to the
>>>                 stop of i-189-1178-VM.
>>>
>>>                 regexp:
>>>                 (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRequestedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae733|work-2049)
>>>
>>>                 2017-01-30 08:44:04,440 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989) Executing AsyncJobVO
>>>                 {id:192989, userId: 1, accountId: 1, instanceType:
>>>                 null, instanceId: null, cmd:
>>>                 com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>                 rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>>>                 F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>>                 cmdVersion: 0, status: IN_PROGRESS, processStatus:
>>>                 0, resu
>>>                 ltCode: 0, result: null, initMsid: 345049103441,
>>>                 completeMsid: null, lastUpdated: null, lastPolled:
>>>                 null, created: Mon Jan 30 08:44:03 CET 2017}
>>>                 2017-01-30 08:44:04,440 DEBUG
>>>                 [c.c.v.VmWorkJobDispatcher]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989) Run VM work job:
>>>                 com.cloud.vm.VmWorkMigrateAway for VM 1178, job
>>>                 origin: 15393
>>>
>>>                 2017-01-30 08:44:05,217 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Deploy avoids
>>>                 pods: [], clusters: [], hosts: [29]
>>>                 2017-01-30 08:44:05,220 DEBUG
>>>                 [c.c.a.m.ClusteredAgentManagerImpl]
>>>                 (AgentConnectTaskPool-4:ctx-5b61f2e5) create
>>>                 ClusteredAgentAttache for 232
>>>                 2017-01-30 08:44:05,220 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) DeploymentPlanner
>>>                 allocation algorithm:
>>>                 com.cloud.deploy.UserDispersingPlanner@1734b2e2
>>>                 2017-01-30 08:44:05,220 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Trying to
>>>                 allocate a host and storage pools from dc:17,
>>>                 pod:25,cluster:25, requested cpu: 4200, requested
>>>                 ram: 4294967296
>>>                 2017-01-30 08:44:05,220 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Is ROOT volume
>>>                 READY (pool already allocated)?: Yes
>>>                 2017-01-30 08:44:05,220 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) This VM has last
>>>                 host_id specified, trying to choose the same host: 25
>>>
>>>                 2017-01-30 08:44:05,614 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) The last host of
>>>                 this VM is not UP or is not enabled, host status is:
>>>                 Disconnected, host resource state is: Enabled
>>>                 2017-01-30 08:44:05,614 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Cannot choose the
>>>                 last host to deploy this VM
>>>                 2017-01-30 08:44:05,614 DEBUG
>>>                 [c.c.d.FirstFitPlanner]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Searching
>>>                 resources only under specified Cluster: 25
>>>
>>>                 2017-01-30 08:44:05,777 DEBUG
>>>                 [c.c.a.m.a.i.FirstFitAllocator]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea
>>>                 FirstFitRoutingAllocator) Found 0 hosts for
>>>                 allocation after prioritization: []
>>>                 2017-01-30 08:44:05,777 DEBUG
>>>                 [c.c.a.m.a.i.FirstFitAllocator]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea
>>>                 FirstFitRoutingAllocator) Looking for speed=4200Mhz,
>>>                 Ram=4096
>>>                 2017-01-30 08:44:05,777 DEBUG
>>>                 [c.c.a.m.a.i.FirstFitAllocator]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea
>>>                 FirstFitRoutingAllocator) Host Allocator returning 0
>>>                 suitable hosts
>>>                 2017-01-30 08:44:05,777 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) No suitable hosts
>>>                 found
>>>                 2017-01-30 08:44:05,777 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) No suitable hosts
>>>                 found under this Cluster: 25
>>>                 2017-01-30 08:44:05,787 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Could not find
>>>                 suitable Deployment Destination for this VM under
>>>                 any clusters, returning.
>>>                 2017-01-30 08:44:05,787 DEBUG
>>>                 [c.c.d.FirstFitPlanner]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Searching
>>>                 resources only under specified Cluster: 25
>>>                 2017-01-30 08:44:05,798 DEBUG
>>>                 [c.c.d.FirstFitPlanner]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) The specified
>>>                 cluster is in avoid set, returning.
>>>                 2017-01-30 08:44:05,798 DEBUG
>>>                 [c.c.v.VirtualMachineManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Unable to find
>>>                 destination for migrating the vm VM[User|i-189-1178-VM]
>>>                 2017-01-30 08:44:05,798 WARN
>>>                 [c.c.v.VirtualMachineManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Failed to deploy
>>>                 vm 1178 with original planner, sending HAPlanner
>>>
>>>                 2017-01-30 08:44:05,985 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Deploy avoids
>>>                 pods: [], clusters: [], hosts: [29]
>>>                 2017-01-30 08:44:05,985 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) DeploymentPlanner
>>>                 allocation algorithm:
>>>                 com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>>>                 2017-01-30 08:44:05,985 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Trying to
>>>                 allocate a host and storage pools from dc:17,
>>>                 pod:25,cluster:25, requested cpu: 4200, requested
>>>                 ram: 4294967296
>>>                 2017-01-30 08:44:05,985 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Is ROOT volume
>>>                 READY (pool already allocated)?: Yes
>>>                 2017-01-30 08:44:05,985 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) This VM has last
>>>                 host_id specified, trying to choose the same host: 25
>>>
>>>                 2017-01-30 08:44:06,022 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) The last host of
>>>                 this VM is not UP or is not enabled, host status is:
>>>                 Disconnected, host resource state is: Enabled
>>>                 2017-01-30 08:44:06,022 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Cannot choose the
>>>                 last host to deploy this VM
>>>                 2017-01-30 08:44:06,022 DEBUG
>>>                 [c.c.d.FirstFitPlanner]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Searching
>>>                 resources only under specified Cluster: 25
>>>                 2017-01-30 08:44:06,039 DEBUG
>>>                 [c.c.d.SkipHeuresticsPlanner]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Deploying vm
>>>                 during HA process, so skipping disable threshold check
>>>
>>>                 2017-01-30 08:44:06,151 DEBUG
>>>                 [c.c.a.m.a.i.FirstFitAllocator]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea
>>>                 FirstFitRoutingAllocator) Found 0 hosts for
>>>                 allocation after prioritization: []
>>>                 2017-01-30 08:44:06,151 DEBUG
>>>                 [c.c.a.m.a.i.FirstFitAllocator]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea
>>>                 FirstFitRoutingAllocator) Looking for speed=4200Mhz,
>>>                 Ram=4096
>>>                 2017-01-30 08:44:06,151 DEBUG
>>>                 [c.c.a.m.a.i.FirstFitAllocator]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea
>>>                 FirstFitRoutingAllocator) Host Allocator returning 0
>>>                 suitable hosts
>>>                 2017-01-30 08:44:06,151 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) No suitable hosts
>>>                 found
>>>                 2017-01-30 08:44:06,151 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) No suitable hosts
>>>                 found under this Cluster: 25
>>>                 2017-01-30 08:44:06,157 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-4:ctx-017064e7
>>>                 job-13733/job-192986) Publish async job-192986
>>>                 complete on message bus
>>>                 2017-01-30 08:44:06,157 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-4:ctx-017064e7
>>>                 job-13733/job-192986) Wake up jobs related to job-192986
>>>                 2017-01-30 08:44:06,157 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-4:ctx-017064e7
>>>                 job-13733/job-192986) Update db status for job-192986
>>>                 2017-01-30 08:44:06,157 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-6:ctx-0d738a8e
>>>                 job-13737/job-192988 ctx-6e9c753f) The last host of
>>>                 this VM is not UP or is not enabled, host status is:
>>>                 Disconnected, host resource state is: Enabled
>>>                 2017-01-30 08:44:06,157 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-6:ctx-0d738a8e
>>>                 job-13737/job-192988 ctx-6e9c753f) Cannot choose the
>>>                 last host to deploy this VM
>>>                 2017-01-30 08:44:06,157 DEBUG
>>>                 [c.c.d.FirstFitPlanner]
>>>                 (Work-Job-Executor-6:ctx-0d738a8e
>>>                 job-13737/job-192988 ctx-6e9c753f) Searching
>>>                 resources only under specified Cluster: 35
>>>                 2017-01-30 08:44:06,160 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Could not find
>>>                 suitable Deployment Destination for this VM under
>>>                 any clusters, returning.
>>>                 2017-01-30 08:44:06,160 DEBUG
>>>                 [c.c.d.FirstFitPlanner]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Searching
>>>                 resources only under specified Cluster: 25
>>>                 2017-01-30 08:44:06,163 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute
>>>                 sync-queue item: SyncQueueItemVO {id:63629, queueId:
>>>                 64423, contentType: AsyncJob, contentId: 192991,
>>>                 lastProcessMsid: 345049098498, lastprocessNumber:
>>>                 27, lastProcessTime: Mon Jan 30 08:44:06 CET 2017,
>>>                 created: Mon Jan 30 08:44:05 CET 2017}
>>>                 2017-01-30 08:44:06,169 INFO
>>>                 [o.a.c.f.j.i.AsyncJobMonitor]
>>>                 (Work-Job-Executor-8:ctx-381e8d1a
>>>                 job-14953/job-192992) Add job-192992 into job monitoring
>>>                 2017-01-30 08:44:06,184 DEBUG
>>>                 [c.c.d.SkipHeuresticsPlanner]
>>>                 (Work-Job-Executor-6:ctx-0d738a8e
>>>                 job-13737/job-192988 ctx-6e9c753f) Deploying vm
>>>                 during HA process, so skipping disable threshold check
>>>                 2017-01-30 08:44:06,186 DEBUG
>>>                 [c.c.d.FirstFitPlanner]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) The specified
>>>                 cluster is in avoid set, returning.
>>>                 2017-01-30 08:44:06,186 DEBUG
>>>                 [c.c.v.VirtualMachineManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Unable to find
>>>                 destination for migrating the vm VM[User|i-189-1178-VM]
>>>                 2017-01-30 08:44:06,186 ERROR
>>>                 [c.c.v.VmWorkJobHandlerProxy]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Invocation
>>>                 exception, caused by:
>>>                 com.cloud.exception.InsufficientServerCapacityException:
>>>                 Unable to find a server to migrate
>>>                 to.Scope=interface com.cloud.org.Cluster; id=25
>>>                 2017-01-30 08:44:06,187 INFO
>>>                 [c.c.v.VmWorkJobHandlerProxy]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989 ctx-ea83d6ea) Rethrow exception
>>>                 com.cloud.exception.InsufficientServerCapacityException:
>>>                 Unable to find a server to migrate
>>>                 to.Scope=interface com.cloud.org.Cluster; id=25
>>>                 2017-01-30 08:44:06,187 DEBUG
>>>                 [c.c.v.VmWorkJobDispatcher]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989) Done with run of VM work job:
>>>                 com.cloud.vm.VmWorkMigrateAway for VM 1178, job
>>>                 origin: 15393
>>>                 2017-01-30 08:44:06,187 ERROR
>>>                 [c.c.v.VmWorkJobDispatcher]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989) Unable to complete AsyncJobVO
>>>                 {id:192989, userId: 1, accountId: 1, instanceType:
>>>                 null, instanceId: null, cmd:
>>>                 com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>                 rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>>                 cmdVersion: 0, status: IN_PROGRESS, processStatus:
>>>                 0, resultCode: 0, result: null, initMsid:
>>>                 345049103441, completeMsid: null, lastUpdated: null,
>>>                 lastPolled: null, created: Mon Jan 30 08:44:03 CET
>>>                 2017}, job origin:15393
>>>                 com.cloud.exception.InsufficientServerCapacityException:
>>>                 Unable to find a server to migrate
>>>                 to.Scope=interface com.cloud.org.Cluster; id=25
>>>                         at
>>>                 com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2390)
>>>                         at
>>>                 com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4520)
>>>                         at
>>>                 sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>                 Method)
>>>                         at
>>>                 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>                         at
>>>                 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>                         at
>>>                 java.lang.reflect.Method.invoke(Method.java:606)
>>>                         at
>>>                 com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>                         at
>>>                 com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>                         at
>>>                 com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:103)
>>>                         at
>>>                 org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
>>>                         at
>>>                 org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>                         at
>>>                 org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>>>                         at
>>>                 org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>>>                         at
>>>                 org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>>>                         at
>>>                 org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>                         at
>>>                 org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>                         at
>>>                 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>                         at
>>>                 java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>                         at
>>>                 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>                         at
>>>                 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>                         at java.lang.Thread.run(Thread.java:745)
>>>                 2017-01-30 08:44:06,188 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule
>>>                 queued job-192991
>>>                 2017-01-30 08:44:06,192 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-4:ctx-017064e7
>>>                 job-13733/job-192986) Wake up jobs joined with
>>>                 job-192986 and disjoin all subjobs created from job-
>>>                 192986
>>>                 2017-01-30 08:44:06,196 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989) Complete async job-192989,
>>>                 jobStatus: FAILED, resultCode: 0, result:
>>>                 rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZlckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vwc0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRDYXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5nL0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xvdWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vycm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4aOxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNhdXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQAEkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFuZy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QAEExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBzZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFjZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFyaW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2VySW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hlc3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EAfgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNxAH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5yZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhqYXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EAfgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pvYnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNsb3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRjaGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1wbCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRleHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdlZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2FsbHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5jb250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxsV2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5jb25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRvcnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdUaHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29ya2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAALVGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5tb2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEudXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMAAAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgA
FdmFsdWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-omaW2FAIAAHhwAA
>>>                 2017-01-30 08:44:06,201 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-7:ctx-bcf531e9
>>>                 job-13729/job-192987 ctx-4c93bc31) Deploy avoids
>>>                 pods: [], clusters: [], hosts: [3]
>>>                 2017-01-30 08:44:06,201 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-7:ctx-bcf531e9
>>>                 job-13729/job-192987 ctx-4c93bc31) DeploymentPlanner
>>>                 allocation algorithm:
>>>                 com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>>>                 2017-01-30 08:44:06,201 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-7:ctx-bcf531e9
>>>                 job-13729/job-192987 ctx-4c93bc31) Trying to
>>>                 allocate a host and storage pools from dc:14,
>>>                 pod:17,cluster:17, requested cpu: 4200, requested
>>>                 ram: 4294967296
>>>                 2017-01-30 08:44:06,201 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-7:ctx-bcf531e9
>>>                 job-13729/job-192987 ctx-4c93bc31) Is ROOT volume
>>>                 READY (pool already allocated)?: Yes
>>>                 2017-01-30 08:44:06,201 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-7:ctx-bcf531e9
>>>                 job-13729/job-192987 ctx-4c93bc31) This VM has last
>>>                 host_id specified, trying to choose the same host: 213
>>>                 2017-01-30 08:44:06,213 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989) Publish async job-192989
>>>                 complete on message bus
>>>                 2017-01-30 08:44:06,213 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989) Wake up jobs related to job-192989
>>>                 2017-01-30 08:44:06,213 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989) Update db status for job-192989
>>>                 2017-01-30 08:44:06,219 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-6:ctx-0d738a8e
>>>                 job-13737/job-192988 ctx-6e9c753f) Checking
>>>                 resources in Cluster: 35 under Pod: 31
>>>                 2017-01-30 08:44:06,227 DEBUG
>>>                 [c.c.a.m.a.i.FirstFitAllocator]
>>>                 (Work-Job-Executor-6:ctx-0d738a8e
>>>                 job-13737/job-192988 ctx-6e9c753f
>>>                 FirstFitRoutingAllocator) Looking for hosts in dc:
>>>                 23 pod:31  cluster:35
>>>                 2017-01-30 08:44:06,229 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-8:ctx-381e8d1a
>>>                 job-14953/job-192992) Executing AsyncJobVO
>>>                 {id:192992, userId: 1, accountId: 1, instanceType:
>>>                 null, instanceId: null, cmd:
>>>                 com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>                 rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw,
>>>                 cmdVersion: 0, status: IN_PROGRESS, processStatus:
>>>                 0, resultCode: 0, result: null, initMsid:
>>>                 345049098122, completeMsid: null, lastUpdated: null,
>>>                 lastPolled: null, created: Mon Jan 30 08:44:05 CET 2017}
>>>                 2017-01-30 08:44:06,229 DEBUG
>>>                 [c.c.v.VmWorkJobDispatcher]
>>>                 (Work-Job-Executor-8:ctx-381e8d1a
>>>                 job-14953/job-192992) Run VM work job:
>>>                 com.cloud.vm.VmWorkMigrateAway for VM 1729, job
>>>                 origin: 14953
>>>                 2017-01-30 08:44:06,230 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute
>>>                 sync-queue item: SyncQueueItemVO {id:63630, queueId:
>>>                 67180, contentType: AsyncJob, contentId: 192990,
>>>                 lastProcessMsid: 345049098498, lastprocessNumber:
>>>                 61, lastProcessTime: Mon Jan 30 08:44:06 CET 2017,
>>>                 created: Mon Jan 30 08:44:05 CET 2017}
>>>                 2017-01-30 08:44:06,244 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-7:ctx-bcf531e9
>>>                 job-13729/job-192987 ctx-4c93bc31) The last host of
>>>                 this VM is not UP or is not enabled, host status is:
>>>                 Disconnected, host resource state is: Enabled
>>>                 2017-01-30 08:44:06,262 DEBUG
>>>                 [c.c.d.DeploymentPlanningManagerImpl]
>>>                 (Work-Job-Executor-7:ctx-bcf531e9
>>>                 job-13729/job-192987 ctx-4c93bc31) Cannot choose the
>>>                 last host to deploy this VM
>>>                 2017-01-30 08:44:06,263 DEBUG
>>>                 [c.c.d.FirstFitPlanner]
>>>                 (Work-Job-Executor-7:ctx-bcf531e9
>>>                 job-13729/job-192987 ctx-4c93bc31) Searching
>>>                 resources only under specified Cluster: 17
>>>                 2017-01-30 08:44:06,262 INFO
>>>                 [o.a.c.f.j.i.AsyncJobMonitor]
>>>                 (Work-Job-Executor-9:ctx-f4d09efb
>>>                 job-14965/job-192991) Add job-192991 into job monitoring
>>>                 2017-01-30 08:44:06,246 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-4:ctx-017064e7
>>>                 job-13733/job-192986) Done executing
>>>                 com.cloud.vm.VmWorkMigrateAway for job-192986
>>>                 2017-01-30 08:44:06,244 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989) Wake up jobs joined with
>>>                 job-192989 and disjoin all subjobs created from job-
>>>                 192989
>>>                 2017-01-30 08:44:06,269 INFO
>>>                 [o.a.c.f.j.i.AsyncJobMonitor]
>>>                 (Work-Job-Executor-4:ctx-017064e7
>>>                 job-13733/job-192986) Remove job-192986 from job
>>>                 monitoring
>>>                 2017-01-30 08:44:06,279 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989) Done executing
>>>                 com.cloud.vm.VmWorkMigrateAway for job-192989
>>>                 2017-01-30 08:44:06,305 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (Work-Job-Executor-9:ctx-f4d09efb
>>>                 job-14965/job-192991) Executing AsyncJobVO
>>>                 {id:192991, userId: 1, accountId: 1, instanceType:
>>>                 null, instanceId: null, cmd:
>>>                 com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>                 rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw,
>>>                 cmdVersion: 0, status: IN_PROGRESS, processStatus:
>>>                 0, resultCode: 0, result: null, initMsid:
>>>                 345049098122, completeMsid: null, lastUpdated: null,
>>>                 lastPolled: null, created: Mon Jan 30 08:44:05 CET 2017}
>>>                 2017-01-30 08:44:06,305 DEBUG
>>>                 [c.c.v.VmWorkJobDispatcher]
>>>                 (Work-Job-Executor-9:ctx-f4d09efb
>>>                 job-14965/job-192991) Run VM work job:
>>>                 com.cloud.vm.VmWorkMigrateAway for VM 1605, job
>>>                 origin: 14965
>>>                 2017-01-30 08:44:06,306 DEBUG
>>>                 [c.c.a.m.AgentManagerImpl]
>>>                 (AgentManager-Handler-13:null) Ping from 258
>>>                 2017-01-30 08:44:06,319 DEBUG
>>>                 [c.c.a.m.a.i.FirstFitAllocator]
>>>                 (Work-Job-Executor-6:ctx-0d738a8e
>>>                 job-13737/job-192988 ctx-6e9c753f
>>>                 FirstFitRoutingAllocator) List of hosts in ascending
>>>                 order of number of VMs: [224, 19, 21, 15]
>>>                 2017-01-30 08:44:06,322 DEBUG
>>>                 [c.c.a.m.a.i.FirstFitAllocator]
>>>                 (Work-Job-Executor-6:ctx-0d738a8e
>>>                 job-13737/job-192988 ctx-6e9c753f
>>>                 FirstFitRoutingAllocator) FirstFitAllocator has 0
>>>                 hosts to check for allocation: []
>>>                 2017-01-30 08:44:06,322 INFO
>>>                 [c.c.a.m.AgentManagerImpl]
>>>                 (AgentManager-Handler-13:null) PingMap for agent:
>>>                 258 will not be updated because agent is no longer
>>>                 in the PingMap
>>>                 2017-01-30 08:44:06,326 INFO
>>>                 [o.a.c.f.j.i.AsyncJobMonitor]
>>>                 (Work-Job-Executor-5:ctx-32f1e087
>>>                 job-15393/job-192989) Remove job-192989 from job
>>>                 monitoring
>>>
>>>                 2017-01-30 08:54:38,139 INFO
>>>                 [c.c.v.VirtualMachineManagerImpl]
>>>                 (Work-Job-Executor-20:ctx-a988090f
>>>                 job-14965/job-193019 ctx-052eb4f5) Migration
>>>                 cancelled because state has changed:
>>>                 VM[User|i-189-1178-VM]
>>>                 2017-01-30 08:54:38,139 DEBUG
>>>                 [c.c.v.VirtualMachineManagerImpl]
>>>                 (Work-Job-Executor-20:ctx-a988090f
>>>                 job-14965/job-193019 ctx-052eb4f5) Unable to migrate
>>>                 VM due to: Migration cancelled because state has
>>>                 changed: VM[User|i-189-1178-VM]
>>>                 2017-01-30 08:54:38,167 DEBUG
>>>                 [c.c.c.CapacityManagerImpl]
>>>                 (Work-Job-Executor-20:ctx-a988090f
>>>                 job-14965/job-193019 ctx-052eb4f5) VM state
>>>                 transitted from :Running to Stopping with event:
>>>                 StopRequestedvm's original host id: 29 new host id:
>>>                 33 host id before state transition: 33
>>>                 2017-01-30 08:54:38,177 DEBUG
>>>                 [c.c.a.m.DirectAgentAttache]
>>>                 (DirectAgent-325:ctx-e540d7f5) Seq
>>>                 19-2355664080091480198: Response Received:
>>>                 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>                 (DirectAgent-325:ctx-e540d7f5) Seq
>>>                 19-2355664080091480198: Processing:  { Ans: ,
>>>                 MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10,
>>>                 [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,"result":true,"details":"","wait":0}}]
>>>                 }
>>>                 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>                 (RouterMonitor-1:ctx-2e80f405) Seq
>>>                 19-2355664080091480198: Received:  { Ans: , MgmtId:
>>>                 345049098498, via: 19, Ver: v1, Flags: 10, {
>>>                 NetworkUsageAnswer } }
>>>                 2017-01-30 08:54:38,177 DEBUG
>>>                 [c.c.a.m.AgentManagerImpl]
>>>                 (RouterMonitor-1:ctx-2e80f405) Details from
>>>                 executing class com.cloud.agent.api.NetworkUsageCommand:
>>>                 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>                 (Work-Job-Executor-20:ctx-a988090f
>>>                 job-14965/job-193019 ctx-052eb4f5) Seq
>>>                 33-1917970491306410035: Sending  { Cmd , MgmtId:
>>>                 345049098498, via: 33(ewcstack-vh003-prod), Ver: v1,
>>>                 Flags: 100011,
>>>                 [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>                 }
>>>                 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>                 (Work-Job-Executor-20:ctx-a988090f
>>>                 job-14965/job-193019 ctx-052eb4f5) Seq
>>>                 33-1917970491306410035: Executing:  { Cmd , MgmtId:
>>>                 345049098498, via: 33(ewcstack-vh003-prod), Ver: v1,
>>>                 Flags: 100011,
>>>                 [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>                 }
>>>                 2017-01-30 08:54:38,178 DEBUG
>>>                 [c.c.a.m.DirectAgentAttache]
>>>                 (DirectAgent-43:ctx-da30f109) Seq
>>>                 33-1917970491306410035: Executing request
>>>                 2017-01-30 08:54:38,202 DEBUG
>>>                 [c.c.a.m.DirectAgentAttache]
>>>                 (DirectAgent-326:ctx-4fb88aa7) Seq
>>>                 21-1035546439318503505: Executing request
>>>
>>>                 2017-01-30 09:05:55,628 INFO
>>>                 [c.c.h.HighAvailabilityManagerImpl]
>>>                 (HA-Worker-4:ctx-36aae733 work-2049) Processing
>>>                 HAWork[2049-Migration-1178-Stopped-Migrating]
>>>                 2017-01-30 09:05:55,633 WARN
>>>                 [o.a.c.f.j.AsyncJobExecutionContext]
>>>                 (HA-Worker-4:ctx-36aae733 work-2049) Job is executed
>>>                 without a context, setup psudo job for the executing
>>>                 thread
>>>                 2017-01-30 09:05:55,645 DEBUG
>>>                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                 (HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045
>>>                 execution on object VmWorkJobQueue.1178
>>>
>>>
>>>
>>>
>>>                 On 31.01.2017 00:42, Rafael Weingärtner wrote:
>>>
>>>                     This seems to be one of those nasty bugs that
>>>                     are very hard to debug. It seems to be caused by a
>>>                     combination of different factors, such as planet
>>>                     alignments ;)
>>>
>>>                     Class “VirtualMachineManagerImpl”, line 2395 (method
>>>                     “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>>                     long, DeploymentPlanner)”), is the start of the code
>>>                     flow that triggers the migration. The point here is
>>>                     that, if an error occurs during the migration, the VM
>>>                     will be stopped; for some reason this is done only for
>>>                     HA-enabled VMs.
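>>>
>>>                     To make that easier to follow, below is a minimal
>>>                     sketch of the behaviour I am describing (stand-in
>>>                     types and names of my own, not the actual CloudStack
>>>                     source): a failed migrate-away attempt on an
>>>                     HA-enabled VM ends with the VM being pushed into
>>>                     Stopping.
>>>
>>>                     // Illustrative stand-ins only -- not CloudStack classes.
>>>                     class MigrateAwaySketch {
>>>                         static class Vm { boolean haEnabled; String state = "Running"; }
>>>                         static class NoCapacityException extends RuntimeException {}
>>>
>>>                         // Mirrors the described behaviour: if planning or the
>>>                         // migration itself fails and the VM is HA-enabled, the
>>>                         // VM is stopped so the HA machinery can retry later.
>>>                         static void migrateAway(Vm vm) {
>>>                             try {
>>>                                 Object dest = planDestination(vm); // no suitable host -> throws
>>>                                 migrate(vm, dest);                 // state changed -> throws
>>>                             } catch (RuntimeException e) {
>>>                                 if (vm.haEnabled) {
>>>                                     vm.state = "Stopping"; // the StopRequested transition seen in the logs
>>>                                 }
>>>                                 throw e;
>>>                             }
>>>                         }
>>>
>>>                         static Object planDestination(Vm vm) { throw new NoCapacityException(); }
>>>                         static void migrate(Vm vm, Object dest) { /* no-op in this sketch */ }
>>>                     }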
>>>
>>>                     Going a little further, we find the point that
>>>                     triggers the error in the method
>>>                     “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO,
>>>                     long, DeployDestination)” at line 1941. This line calls
>>>                     “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>>>                     Event, Long, ItWorkVO, Step)” to change the VM state to
>>>                     migrating/prepare migration/or something like that (I
>>>                     am guessing here), but for some reason the state of the
>>>                     VM is not the expected one, and this causes a
>>>                     “ConcurrentOperationException” to be thrown. The
>>>                     problem is that
>>>                     “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>>>                     Event, Long, ItWorkVO, Step)” does not seem to provide
>>>                     many useful log entries.
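>>>
>>>                     Roughly, the check I have in mind looks like the
>>>                     sketch below (only an illustration with made-up helper
>>>                     names, not the real changeState() implementation): if
>>>                     another job already moved the VM out of the expected
>>>                     state, the transition fails and the caller ends up
>>>                     logging "Migration cancelled because state has changed".
>>>
>>>                     // Illustrative stand-ins only -- not the CloudStack implementation.
>>>                     class ChangeStateSketch {
>>>                         static class ConcurrentOperationException extends Exception {
>>>                             ConcurrentOperationException(String msg) { super(msg); }
>>>                         }
>>>
>>>                         // The migration path expects the VM to still be in the
>>>                         // "expected" state; if a concurrent job changed it in the
>>>                         // meantime, the transition is refused and the exception
>>>                         // bubbles up to the migrate-away flow above.
>>>                         static void changeState(String current, String expected, String next)
>>>                                 throws ConcurrentOperationException {
>>>                             if (!current.equals(expected)) {
>>>                                 throw new ConcurrentOperationException(
>>>                                     "Unable to transition to " + next + ": current state is " + current);
>>>                             }
>>>                             // ...persist the new state and the work item step here...
>>>                         }
>>>                     }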
>>>
>>>                     Just to enable us to discuss the problem a little
>>>                     further, could you tell us a bit about your environment
>>>                     setup? How many ACS MSs do you have? What is the VM
>>>                     workload being migrated (I mean, the number of VMs and
>>>                     their computing resources)? Do the other servers have
>>>                     enough computing resources to take these VMs? Could you
>>>                     also check the XenServer logs to see what happened on
>>>                     both the source and destination hosts when the
>>>                     migration of VM “i-638-1735-VM” was triggered?
>>>
>>>                     Do you have the knowledge/skills to use an IDE (e.g.
>>>                     Eclipse) to connect to a running JVM and debug the
>>>                     code? It will not be possible for me to go further
>>>                     without looking at the variables in a running
>>>                     environment.
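>>>
>>>                     If remote debugging is an option, one common approach
>>>                     (generic JVM tooling, nothing CloudStack-specific) is
>>>                     to add a JDWP agent to the management server's JVM
>>>                     options and attach the IDE to that port. Something
>>>                     along these lines; the file path below is an
>>>                     assumption for a CS 4.5 package install, so adapt it
>>>                     to your setup:
>>>
>>>                     # Assumed location (CS 4.5 RPM install): /etc/cloudstack/management/tomcat6.conf
>>>                     JAVA_OPTS="$JAVA_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8787"
>>>
>>>                     After restarting the cloudstack-management service,
>>>                     attach Eclipse's remote debugger to port 8787.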
>>>
>>>
>>>                     On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
>>>                     francois.scheurer@everyware.ch> wrote:
>>>
>>>                         Dear Rafael
>>>
>>>
>>>                         Sorry for the late answer (I was sick for a couple
>>>                         of days).
>>>
>>>                         I confirm we are migrating inside the same
>>>                         XenServer pool (within a CS cluster).
>>>
>>>
>>>                         Best Regards
>>>                         Francois
>>>
>>>
>>>
>>>                         On 26.01.2017 04:35, Rafael Weingärtner wrote:
>>>
>>>                             I found the code that is executed.
>>>                             I need you to confirm: are you migrating
>>>                             within a cluster or across clusters?
>>>
>>>                             On Wed, Jan 25, 2017 at 11:57 AM, Rafael
>>>                             Weingärtner <
>>>                             rafaelweingartner@gmail.com> wrote:
>>>
>>>                             I do not have access to the ACS code right
>>>                             now, but I suggest starting to debug here:
>>>
>>>                                 2017-01-24 18:05:28,427 DEBUG
>>>                                 [c.c.c.CapacityManagerImpl]
>>>
>>>                                     (Work-Job-Executor-156:ctx-14b86f5e
>>>                                     job-60216/job-191380 ctx-175d37df)
>>>                                     VM state transitted from
>>>                                     :Running to Stopping with event:
>>>                                     StopRequestedvm's
>>>                                     original host id: 224 new host
>>>                                     id: 15 host id before state
>>>                                     transition:
>>>                                     15
>>>
>>>                                 I would first try to understand why ACS
>>>                                 requested the hypervisor to stop the VM.
>>>                                 I mean the following: check the source
>>>                                 code for conditions that would make ACS
>>>                                 request the shutdown of VMs.
>>>                                 BTW: are you migrating within a
>>>                                 cluster or across clusters?
>>>
>>>                                 I will be able to analyze this
>>>                                 further only after work hours, so if you
>>>                                 find anything, keep me posted.
>>>
>>>                                 On Wed, Jan 25, 2017 at 11:45 AM,
>>>                                 Francois Scheurer <
>>>                                 francois.scheurer@everyware.ch>
>>>                                 wrote:
>>>
>>>                                 Hello Rafael
>>>
>>>
>>>                                     I think the important log lines
>>>                                     are these:
>>>
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,407 INFO
>>>                                     [c.c.v.VirtualMachineManagerImpl]
>>>                                     (Work-Job-Executor-156:ctx-14b86f5e
>>>                                     job-60216/job-191380
>>>                                     ctx-175d37df) Migration
>>>                                     cancelled because state has
>>>                                     changed: VM[User|i-638-1736-VM]
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,407 DEBUG
>>>                                     [c.c.v.VirtualMachineManagerImpl]
>>>                                     (Work-Job-Executor-156:ctx-14b86f5e
>>>                                     job-60216/job-191380
>>>                                     ctx-175d37df) Unable to migrate
>>>                                     VM due to:
>>>                                     Migration
>>>                                     cancelled because state has
>>>                                     changed: VM[User|i-638-1736-VM]
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,427 DEBUG
>>>                                     [c.c.c.CapacityManagerImpl]
>>>                                     (Work-Job-Executor-156:ctx-14b86f5e
>>>                                     job-60216/job-191380
>>>                                     ctx-175d37df) VM state
>>>                                     transitted from :Running to
>>>                                     Stopping with event:
>>>                                     StopRequestedvm's original host
>>>                                     id: 224 new host
>>>                                     id:
>>>                                     15 host id before state
>>>                                     transition: 15
>>>
>>>                                     The really disturbing thing is that
>>>                                     CS stops the VM again if we start it
>>>                                     manually, apparently because the
>>>                                     migration job is still running.
>>>
>>>                                     Note that the VM HA flag is enabled.
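>>>
>>>                                     In other words (a rough, illustrative
>>>                                     sketch with made-up names, not the
>>>                                     CloudStack HA code): while the
>>>                                     Migration work item is still pending,
>>>                                     each HA worker pass seems to retry the
>>>                                     migration and, when it fails again,
>>>                                     requests another stop, so a manual
>>>                                     start gets undone.
>>>
>>>                                     // Illustrative stand-ins only -- not the CloudStack HA code.
>>>                                     class HaRetrySketch {
>>>                                         static class Vm { String state = "Running"; }
>>>                                         static class HaWork {
>>>                                             Vm vm = new Vm();
>>>                                             String type = "Migration";
>>>                                             String step = "Migrating";
>>>                                         }
>>>
>>>                                         // Each pass picks up the still-pending Migration item; when
>>>                                         // the migration fails again (no suitable host), the VM is
>>>                                         // stopped once more and the item is rescheduled, not marked Done.
>>>                                         static void processWork(HaWork work) {
>>>                                             if ("Migration".equals(work.type) && !"Done".equals(work.step)) {
>>>                                                 boolean migrated = tryMigrateAway(work.vm);
>>>                                                 if (migrated) {
>>>                                                     work.step = "Done";
>>>                                                 } else {
>>>                                                     work.vm.state = "Stopping"; // undoes a manual start
>>>                                                     work.step = "Scheduled";    // retried on a later pass
>>>                                                 }
>>>                                             }
>>>                                         }
>>>
>>>                                         static boolean tryMigrateAway(Vm vm) { return false; } // no destination found
>>>                                     }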
>>>
>>>
>>>
>>>                                     Thank you.
>>>
>>>
>>>                                     Best Regards.
>>>
>>>                                     Francois
>>>
>>>
>>>
>>>
>>>
>>>
>>>                                     more details:
>>>
>>>
>>>
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,148 DEBUG
>>>                                     [o.a.c.e.o.VolumeOrchestrator]
>>>                                     (Work-Job-Executor-156:ctx-14b86f5e
>>>                                     job-60216/job-191380
>>>                                     ctx-175d37df) Preparing 3
>>>                                     volumes for
>>>                                     VM[User|i-638-1736-VM]
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,155 DEBUG
>>>                                     [c.c.v.VmWorkJobDispatcher]
>>>                                     (Work-Job-Executor-122:ctx-a8bacc64
>>>                                     job-14953/job-191351) Done with
>>>                                     run of VM work job:
>>>                                     com.cloud.vm.VmWorkMigrateAway
>>>                                     for VM 1411, job origin: 14953
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,155 DEBUG
>>>                                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                                     (Work-Job-Executor-122:ctx-a8bacc64
>>>                                     job-14953/job-191351) Done
>>>                                     executing
>>>                                     com.cloud.vm.VmWorkMigrateAway for
>>>                                     job-191351
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,160 INFO
>>>                                     [o.a.c.f.j.i.AsyncJobMonitor]
>>>                                     (Work-Job-Executor-122:ctx-a8bacc64
>>>                                     job-14953/job-191351) Remove
>>>                                     job-191351 from job monitoring
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,168 INFO
>>>                                     [c.c.h.HighAvailabilityManagerImpl]
>>>                                     (HA-Worker-2:ctx-f3bcf43c
>>>                                     work-1989)
>>>                                     Completed
>>>                                     HAWork[1989-Migration-1411-Running-Migrating]
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,246 DEBUG
>>>                                     [c.c.a.m.ClusteredAgentAttache]
>>>                                     (Work-Job-Executor-156:ctx-14b86f5e
>>>                                     job-60216/job-191380
>>>                                     ctx-175d37df) Seq
>>>                                     19-8526158519542516375:
>>>                                     Forwarding
>>>                                     Seq 19-8526158519542516375: {
>>>                                     Cmd , MgmtId: 345049103441, via:
>>>                                     19(ewcstack-vh010-prod), Ver:
>>>                                     v1, Flags: 100111, [{
>>>                                     "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>                                     :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>                                     ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>                                     869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>                                     (64-bit)","platformEmulator":"Other
>>>                                     install
>>>                                     media","bootArgs":"","enable
>>>                                     HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>                                     "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>                                     ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>                                     rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>                                     age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
>>>                                     hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>                                     -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>                                     stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>                                     4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>                                     apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f
>>>                                     6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>                                     t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>                                     ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>                                     7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>                                     ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3
>>>                                     f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>                                     :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>                                     gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>                                     XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>                                     -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>                                     storagePort":"0",
>>>                                     "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>                                     a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>                                     ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>                                     ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>                                     ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
>>>                                     poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>                                     f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>                                     ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>                                     UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>                                     DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2
>>>                                     2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>                                     ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>                                     deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>                                     path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>                                     K","_details":{"managed":"false","storagePort":"0","stor
>>>                                     ageHost":"localhos
>>>                                     t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>                                     tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>                                     8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>                                     apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>                                     :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup","
>>>                                     host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>                                     13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>                                     -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>
>>>                                     e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>                                     ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>                                     ,"volumeId":86
>>>                                     57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>                                     provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>                                     ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>                                     20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>                                     false","storagePort":"0","storageHost":"localhost","volumeSize":"42949
>>>                                     6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>                                     teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>                                     id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>                                     ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS
>>>                                     4.5
>>>                                     (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>                                     ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>                                     Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>                                     d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>                                     a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>                                     gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>                                     .8.8","dns2":"8.8
>>>                                     .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>                                     vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>                                     abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}]
>>>                                     } to
>>>                                     345049098498
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,399 DEBUG [c.c.a.t.Request]
>>>                                     (AgentManager-Handler-10:null)
>>>                                     Seq 19-8526158519542516375:
>>>                                     Processing:
>>>                                     {
>>>                                     Ans: , MgmtId: 345049103441,
>>>                                     via: 19, Ver: v1, Flags: 110,
>>>                                     [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>                                     true,"wait":0}}]
>>>                                     }
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,399 DEBUG
>>>                                     [c.c.a.m.AgentAttache]
>>>                                     (AgentManager-Handler-10:null)
>>>                                     Seq 19-8526158519542516375: No more
>>>                                     commands
>>>                                     found
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,399 DEBUG [c.c.a.t.Request]
>>>                                     (Work-Job-Executor-156:ctx-14b86f5e
>>>                                     job-60216/job-191380 ctx-175d37df)
>>>                                     Seq 19-8526158519542516375:
>>>                                     Received:  { Ans: , MgmtId:
>>>                                     345049103441,
>>>                                     via:
>>>                                     19, Ver: v1, Flags: 110, {
>>>                                     PrepareForMigrationAnswer } }
>>>
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,407 INFO
>>>                                     [c.c.v.VirtualMachineManagerImpl]
>>>                                     (Work-Job-Executor-156:ctx-14b86f5e
>>>                                     job-60216/job-191380
>>>                                     ctx-175d37df) Migration
>>>                                     cancelled because state has
>>>                                     changed: VM[User|i-638-1736-VM]
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,407 DEBUG
>>>                                     [c.c.v.VirtualMachineManagerImpl]
>>>                                     (Work-Job-Executor-156:ctx-14b86f5e
>>>                                     job-60216/job-191380
>>>                                     ctx-175d37df) Unable to migrate
>>>                                     VM due to:
>>>                                     Migration
>>>                                     cancelled because state has
>>>                                     changed: VM[User|i-638-1736-VM]
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,427 DEBUG
>>>                                     [c.c.c.CapacityManagerImpl]
>>>                                     (Work-Job-Executor-156:ctx-14b86f5e
>>>                                     job-60216/job-191380
>>>                                     ctx-175d37df) VM state
>>>                                     transitted from :Running to
>>>                                     Stopping with event:
>>>                                     StopRequestedvm's original host
>>>                                     id: 224 new host
>>>                                     id:
>>>                                     15 host id before state
>>>                                     transition: 15
>>>
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:28,438 DEBUG
>>>                                     [c.c.a.m.ClusteredAgentAttache]
>>>                                     (Work-Job-Executor-156:ctx-14b86f5e
>>>                                     job-60216/job-191380
>>>                                     ctx-175d37df) Seq
>>>                                     15-6889944479923437643:
>>>                                     Forwarding
>>>                                     Seq 15-6889944479923437643: {
>>>                                     Cmd , MgmtId: 345049103441, via:
>>>                                     15(ewcstack-vh011-prod), Ver:
>>>                                     v1, Flags: 100011,
>>>                                     [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>                                     eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>                                     638-1736-VM","wait":0}}]
>>>                                     } to 345049098498
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:30,335 DEBUG
>>>                                     [c.c.a.m.AgentManagerImpl]
>>>                                     (AgentManager-Handler-1:null) SeqA
>>>                                     254-800498:
>>>                                     Processing Seq 254-800498:  {
>>>                                     Cmd , MgmtId: -1, via: 254, Ver: v1,
>>>                                     Flags:
>>>                                     11,
>>>                                     [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>                                     proxyVmId":1530,"_loadInfo":"{\n
>>>                                     \"connections\":
>>>                                     []\n}","wait":0}}] }
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:30,340 DEBUG
>>>                                     [c.c.a.m.AgentManagerImpl]
>>>                                     (AgentManager-Handler-1:null) SeqA
>>>                                     254-800498:
>>>                                     Sending Seq 254-800498:  { Ans:
>>>                                     , MgmtId: 345049103441, via:
>>>                                     254, Ver:
>>>                                     v1,
>>>                                     Flags: 100010,
>>>                                     [{"com.cloud.agent.api.AgentCo
>>>                                     ntrolAnswer":{"result":true,"wait":0}}]
>>>                                     }
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,447 DEBUG [c.c.a.t.Request]
>>>                                     (AgentManager-Handler-11:null)
>>>                                     Seq 15-6889944479923437632:
>>>                                     Processing:
>>>                                     {
>>>                                     Ans: , MgmtId: 345049103441,
>>>                                     via: 15, Ver: v1, Flags: 10,
>>>                                     [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>                                     e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>                                     ":true,"details":"Stop
>>>                                     VM i-638-1735-VM
>>>                                     Succeed","wait":0}}] }
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,447 DEBUG [c.c.a.t.Request]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379 ctx-bbabfeae)
>>>                                     Seq 15-6889944479923437632:
>>>                                     Received:  { Ans: , MgmtId:
>>>                                     345049103441,
>>>                                     via:
>>>                                     15, Ver: v1, Flags: 10, {
>>>                                     StopAnswer } }
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,470 DEBUG
>>>                                     [c.c.v.VirtualMachineManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae)
>>>                                     VM[User|i-638-1735-VM] is stopped on
>>>                                     the
>>>                                     host. Proceeding to release
>>>                                     resource held.
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,475 DEBUG
>>>                                     [c.c.n.NetworkModelImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) Service
>>>                                     SecurityGroup is not
>>>                                     supported
>>>                                     in the network id=519
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,478 DEBUG
>>>                                     [o.a.c.e.o.NetworkOrchestrator]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) Changing active
>>>                                     number of nics for
>>>                                     network id=519 on -1
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,488 DEBUG
>>>                                     [o.a.c.e.o.NetworkOrchestrator]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) Asking
>>>                                     VpcVirtualRouter to release
>>>                                     NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>                                     92.168.1.11-null
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,488 DEBUG
>>>                                     [c.c.v.VirtualMachineManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) Successfully
>>>                                     released network
>>>                                     resources
>>>                                     for the vm VM[User|i-638-1735-VM]
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,488 DEBUG
>>>                                     [c.c.v.VirtualMachineManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) Successfully
>>>                                     released storage
>>>                                     resources
>>>                                     for the vm VM[User|i-638-1735-VM]
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,496 DEBUG
>>>                                     [c.c.c.CapacityManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) VM state
>>>                                     transitted from :Stopping to
>>>                                     Stopped with event:
>>>                                     OperationSucceededvm's original
>>>                                     host id: 15 new host
>>>                                     id: null host id before state
>>>                                     transition: 15
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,505 DEBUG
>>>                                     [c.c.c.CapacityManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) Hosts's actual
>>>                                     total CPU: 82976 and
>>>                                     CPU
>>>                                     after applying overprovisioning:
>>>                                     331904
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,505 DEBUG
>>>                                     [c.c.c.CapacityManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) Hosts's actual
>>>                                     total RAM:
>>>                                     261961927808
>>>                                     and RAM after applying
>>>                                     overprovisioning: 261961924608
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,505 DEBUG
>>>                                     [c.c.c.CapacityManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) release cpu from
>>>                                     host: 15, old used:
>>>                                     71300,reserved: 2100, actual
>>>                                     total: 82976, total with
>>>                                     overprovisioning:
>>>                                     331904; new used:
>>>                                     67100,reserved:6300;
>>>                                     movedfromreserved:
>>>                                     false,moveToReserveredtrue
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,505 DEBUG
>>>                                     [c.c.c.CapacityManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) release mem from
>>>                                     host: 15, old used:
>>>                                     98784247808,reserved:
>>>                                     1073741824, total: 261961924608;
>>>                                     new used:
>>>                                     94489280512,reserved:5368709120;
>>>                                     movedfromreserved:
>>>
>>>                                     false,moveToReserveredtrue
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,538 DEBUG
>>>                                     [c.c.n.NetworkModelImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) Service
>>>                                     SecurityGroup is not
>>>                                     supported
>>>                                     in the network id=519
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,557 DEBUG
>>>                                     [c.c.n.NetworkModelImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) Service
>>>                                     SecurityGroup is not
>>>                                     supported
>>>                                     in the network id=519
>>>
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,588 ERROR
>>>                                     [c.c.v.VmWorkJobHandlerProxy]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) Invocation
>>>                                     exception, caused by:
>>>                                     com.cloud.utils.exception.CloudRuntimeException:
>>>                                     Unable to migrate
>>>                                     VM[User|i-638-1735-VM]
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,589 INFO
>>>                                     [c.c.v.VmWorkJobHandlerProxy]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379
>>>                                     ctx-bbabfeae) Rethrow exception
>>>                                     com.cloud.utils.exception.CloudRuntimeException:
>>>                                     Unable to migrate
>>>                                     VM[User|i-638-1735-VM]
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,589 DEBUG
>>>                                     [c.c.v.VmWorkJobDispatcher]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379) Done with
>>>                                     run of VM work job:
>>>                                     com.cloud.vm.VmWorkMigrateAway
>>>                                     for VM 1735, job origin: 13733
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,589 ERROR
>>>                                     [c.c.v.VmWorkJobDispatcher]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379) Unable to
>>>                                     complete AsyncJobVO {id:191379,
>>>                                     userId:
>>>                                     1,
>>>                                     accountId: 1, instanceType:
>>>                                     null, instanceId: null, cmd:
>>>                                     com.cloud.vm.VmWorkMigrateAway,
>>>                                     cmdInfo:
>>>                                     rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>                                     1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>                                     0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>                                     VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>                                     t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>                                     FuYWdlckltcGwAAAAAAAAA4A,
>>>                                     cmdVersion: 0, status:
>>>                                     IN_PROGRESS, processStatus: 0,
>>>                                     resultCode: 0,
>>>                                     result: null, initMsid:
>>>                                     345049098498, completeMsid:
>>>                                     null, lastUpdated:
>>>                                     null, lastPolled: null, created:
>>>                                     Tue Jan 24 18:05:00 CET 2017}, job
>>>                                     origin:13733
>>>                                     ewcstack-man02-prod:
>>>                                     com.cloud.utils.exception.CloudRuntimeException:
>>>                                     Unable to migrate
>>>                                     VM[User|i-638-1735-VM]
>>>                                     ewcstack-man02-prod:    at
>>>                                     com.cloud.vm.VirtualMachineMan
>>>                                     agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
>>>                                     ewcstack-man02-prod:    at
>>>                                     com.cloud.vm.VirtualMachineMan
>>>                                     agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
>>>                                     ewcstack-man02-prod:    at
>>>                                     sun.reflect.GeneratedMethodAcc
>>>                                     essor402.invoke(Unknown
>>>                                     Source)
>>>                                     ewcstack-man02-prod:    at
>>>                                     sun.reflect.DelegatingMethodAc
>>>                                     cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>                                     ewcstack-man02-prod:    at
>>>                                     java.lang.reflect.Method.invok
>>>                                     e(Method.java:606)
>>>                                     ewcstack-man02-prod:    at
>>>                                     com.cloud.vm.VmWorkJobHandlerP
>>>                                     roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>                                     ewcstack-man02-prod:    at
>>>                                     com.cloud.vm.VirtualMachineMan
>>>                                     agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>                                     ewcstack-man02-prod:    at
>>>                                     com.cloud.vm.VmWorkJobDispatch
>>>                                     er.runJob(VmWorkJobDispatcher.java:103)
>>>                                     ewcstack-man02-prod:    at
>>>                                     org.apache.cloudstack.framewor
>>>                                     k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>                                     erImpl.java:537)
>>>                                     ewcstack-man02-prod:    at
>>>                                     org.apache.cloudstack.managed.
>>>                                     context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>                                     ewcstack-man02-prod:    at
>>>                                     org.apache.cloudstack.managed.
>>>                                     context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>                                     ext.java:56)
>>>                                     ewcstack-man02-prod:    at
>>>                                     org.apache.cloudstack.managed.
>>>                                     context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>                                     nagedContext.java:103)
>>>                                     ewcstack-man02-prod:    at
>>>                                     org.apache.cloudstack.managed.
>>>                                     context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>                                     agedContext.java:53)
>>>                                     ewcstack-man02-prod:    at
>>>                                     org.apache.cloudstack.managed.
>>>                                     context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>                                     ewcstack-man02-prod:    at
>>>                                     org.apache.cloudstack.framewor
>>>                                     k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>                                     ewcstack-man02-prod:    at
>>>                                     java.util.concurrent.Executors
>>>                                     $RunnableAdapter.call(Executors.java:471)
>>>                                     ewcstack-man02-prod:    at
>>>                                     java.util.concurrent.FutureTas
>>>                                     k.run(FutureTask.java:262)
>>>                                     ewcstack-man02-prod:    at
>>>                                     java.util.concurrent.ThreadPoo
>>>                                     lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>                                     ewcstack-man02-prod:    at
>>>                                     java.util.concurrent.ThreadPoo
>>>                                     lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>                                     ewcstack-man02-prod:    at
>>>                                     java.lang.Thread.run(Thread.java:745)
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,590 DEBUG
>>>                                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379) Complete
>>>                                     async job-191379, jobStatus: FAILED,
>>>                                     resultCode: 0, result:
>>>                                     rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>                                     MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>                                     ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>                                     ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>                                     ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>                                     F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>                                     FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>                                     JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>                                     RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>                                     ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>                                     50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>                                     xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>                                     NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>                                     lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>                                     5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>                                     1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>                                     90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>                                     ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>                                     hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>                                     wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>                                     hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>                                     ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>                                     94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>                                     AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>                                     cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>                                     AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>                                     YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>                                     ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>                                     YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>                                     dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>                                     dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>                                     Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>                                     dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>                                     cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>                                     TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>                                     ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>                                     LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>                                     dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>                                     ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>                                     JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>                                     AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>                                     ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>                                     ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>                                     YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>                                     ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>                                     AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>                                     JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>                                     jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>                                     bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>                                     dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>                                     x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,592 DEBUG
>>>                                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379) Publish
>>>                                     async job-191379 complete on
>>>                                     message bus
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,592 DEBUG
>>>                                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379) Wake up
>>>                                     jobs related to job-191379
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,592 DEBUG
>>>                                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379) Update db
>>>                                     status for job-191379
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,594 DEBUG
>>>                                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379) Wake up
>>>                                     jobs joined with job-191379 and
>>>                                     disjoin
>>>                                     all
>>>                                     subjobs created from job- 191379
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,604 DEBUG
>>>                                     [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379) Done
>>>                                     executing
>>>                                     com.cloud.vm.VmWorkMigrateAway for
>>>                                     job-191379
>>>                                     ewcstack-man02-prod: 2017-01-24
>>>                                     18:05:32,608 INFO
>>>                                     [o.a.c.f.j.i.AsyncJobMonitor]
>>>                                     (Work-Job-Executor-115:ctx-0d5db528
>>>                                     job-13733/job-191379) Remove
>>>                                     job-191379 from job monitoring
>>>
>>>
>>>
>>>                                     On 25.01.2017 15:53, Rafael
>>>                                     Weingärtner wrote:
>>>
>>>                                     Did you check the ACS log files
>>>                                     looking for some exceptions when the
>>>
>>>                                         migrations for this specific
>>>                                         VM started?
>>>
>>>                                         On Wed, Jan 25, 2017 at 9:49
>>>                                         AM, Francois Scheurer <
>>>                                         francois.scheurer@everyware.ch>
>>>                                         wrote:
>>>
>>>                                         Dear CS contributors
>>>
>>>                                             We use CS 4.5.1 on a 3
>>>                                             Clusters with XenServer
>>>                                             6.5. and shared primary
>>>                                             storage.
>>>
>>>                                             When we put a host in
>>>                                             maintenance, cs
>>>                                             evacuates all VM's on other
>>>                                             hosts.
>>>
>>>                                             But it happens regularly
>>>                                             that some VM's are
>>>                                             stopped instead of live
>>>                                             migrated.
>>>
>>>                                             If we restart that
>>>                                             stopped VM's, CS will
>>>                                             then stop them again later.
>>>
>>>                                             We need to start them
>>>                                             several times until they
>>>                                             stays up.
>>>
>>>                                             Is it a known issue? Is
>>>                                             it fixed on CS 4.9.2 ?
>>>
>>>
>>>                                             Migrating manually all
>>>                                             VM's is working but
>>>                                             impracticable because too
>>>                                             much
>>>                                             time consuming.
>>>
>>>
>>>                                             Many thanks in advance
>>>                                             for your help.
>>>
>>>
>>>
>>>                                             Best Regards
>>>                                             Francois
>>>
>>>
>>>
>>>
>>>
>>>

-- 


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheurer@everyware.ch
web: http://www.everyware.ch


Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Rafael Weingärtner <ra...@gmail.com>.
So, you changed the flags directly on "vm_instance". Why didn't you change
the service offering of these VMs using the ACS UI or API? For me, it feels
that the best approach would be to change the service offering of the
affected VMs (the HA flag is first configured at the service offering). You
could assign a service offering without HA; I mean, it would be a matter of
creating a similar service offering (the same configuration as the one they
are using right now), but without HA.
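
For example, before switching offerings you could first list which VMs still
carry the flag; something along these lines should do it (just a sketch: id,
instance_name, state, ha_enabled and removed are the columns you already
used, while service_offering_id is assumed from the standard schema, so
please double check it on your version first):

select id, instance_name, state, ha_enabled, service_offering_id from
vm_instance where removed is null and ha_enabled=1;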

I am assuming that the change directly on “vm_instance” did not solve your
problem, right? That is why you want to change some data in “op_it_work”. I
took a quick look at “op_it_work”, and it seems that this table may be used
for more things than HA. Here is a tip for complex production systems: never
run update SQLs in production environments without a very, very restrictive
and well-thought-out “where” clause. In my opinion this is not restrictive
enough (update op_it_work set step="Done" where step!="Done"). I would
advise you not to go further and change configurations directly in the
database (unless you have a dev team that is capable of reading, debugging,
and working with the ACS source code). ACS has over 500k lines of Java code,
and God knows how many lines of shell, Python and other languages, software
and systems that may be used; it is pretty hard sometimes to get a clear
picture of how a change in the database will reverberate through the whole
system without knowing in detail the installation configuration and
underlying structures (networking, hypervisors, storage systems and other
pieces). However, if you want to proceed with that, I would suggest adding
some extra filters to that “where” clause. What about adding the “instance
id” there?
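
For instance, something along these lines (only a sketch; the instance ids
below are the problematic VMs from your earlier select, so please adjust
them and run the select first to see exactly which rows the update would
touch):

select * from op_it_work where step!="Done" and instance_id in
(1177, 1178, 1240, 1690, 1736);
update op_it_work set step="Done" where step!="Done" and instance_id in
(1177, 1178, 1240, 1690, 1736);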

PS: BTW, have you already heard about the CloudStack fork from Schuberg
Philis named Cosmic? https://github.com/MissionCriticalCloud/cosmic
Yes, I have. They were part of the active ACS community back then. I was
already here when they created the fork. Still, I stuck with ACS.


On Tue, Feb 7, 2017 at 3:54 AM, Francois Scheurer <
francois.scheurer@everyware.ch> wrote:

> Dear Rafael
>
>
> We temporarily disabled the HA flag on these VMs by updating the DB
> directly: update vm_instance set ha_enabled=0 where removed is null and
> id in (1178, 1240, 1690, 1736);
>
> We kept the HA flag on the test VM i-189-1177-VM.
>
> Restarting the MS + cleaning the op_it_work table will hopefully fix the
> issue. And if the issue persists, then only i-189-1177-VM should
> be affected, hence mitigating the impact.
>
> On the other hand, if everything runs fine, we would afterwards re-enable
> the HA flag on the problematic VMs.
>
>
> Do you think it is safe to run this update?: update op_it_work set
> step="Done" where step!="Done";
>
>
> Best Regards
>
> Francois Scheurer
>
>
>
> PS: BTW, have you already heard about the CloudStack fork from Schuberg
> Philis named Cosmic? https://github.com/MissionCriticalCloud/cosmic
>
>
>
>
>
> On 06.02.2017 16:41, Rafael Weingärtner wrote:
>
> I totally understand your position of not being able to debug the ACS in
> production ;)
>
>
> There is one bit I did not quite understand. If you have disabled HA
> for the VMs, why do you still need to change data in the database? How did
> you disable the HA of VMs?
>
> On Mon, Feb 6, 2017 at 5:04 AM, Francois Scheurer <
> <fr...@everyware.ch> wrote:
>
>> Dear Rafael / Dear CS contributors
>>
>>
>> I tried the following:
>>
>> 1.     stop all MS
>> 2.     backup DB
>> 3.     select * from op_ha_work where type="Migration" and mgmt_server_id
>> is null and taken is null;
>> 4.     delete from op_ha_work where type="Migration" and mgmt_server_id
>> is null and taken is null;
>> 5.     start all MS
>>
>> But actually, after the MS stopped, the op_ha_work table was automatically
>> cleaned, making my steps useless.
>> After the MS started, the op_ha_work table was filled again and the related
>> VMs were stopped several times (3-7 times), requiring close attention to
>> manually restart them after each stop.
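>>
>> (An easy way to watch this after each MS start is a query along the lines
>> of: select id, instance_id, type, state, taken, step from op_ha_work where
>> type="Migration"; using the same columns as in the workers list I sent
>> earlier.)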
>>
>> I later found another table that could be the cause of the issue:
>>
>> mysql> select op_it_work.*,vm_instance.instance_name,vm_instance.removed
>> from op_it_work join vm_instance where op_it_work.step!="Done" and
>> op_it_work.instance_id=vm_instance.id order by instance_id;
>> +--------------------------------------+----------------+---
>> ---------+-----------------------+-----------+--------------
>> ------+-----------+------------+-------------+--------------
>> -+-------------+---------------+---------------------+
>> | id                                   | mgmt_server_id | created_at |
>> thread                | type      | vm_type            | step      |
>> updated_at | instance_id | resource_type | resource_id | instance_name |
>> removed             |
>> +--------------------------------------+----------------+---
>> ---------+-----------------------+-----------+--------------
>> ------+-----------+------------+-------------+--------------
>> -+-------------+---------------+---------------------+
>> | be4d0dd2-7d4f-47d3-ba49-623d2f115bbf |   345049098498 | 1436187965 |
>> consoleproxy-1        | Starting  | ConsoleProxy       | Release   |
>> 1436188265 |         402 | NULL          |           0 | v-402-VM      |
>> 2015-07-06 13:23:22 |
>> | fb4d1844-eb92-478c-9f50-d666790f43af |   345049098498 | 1436187965 |
>> secstorage-1          | Starting  | SecondaryStorageVm | Release   |
>> 1436188265 |         687 | NULL          |           0 | s-687-VM      |
>> 2015-07-06 13:11:34 |
>> | 7dfeb25a-a705-42e8-907c-79d1e69697ed |   345049098498 | 1436964548 |
>> Work-Job-Executor-165 | Migrating | User               | Migrating |
>> 1436964548 |         776 | Host          |          29 | i-51-776-VM   |
>> 2015-07-15 12:55:38 |
>> | 2018ac61-b11c-4101-8ef5-07a39509f2bd |   345049098498 | 1486020173 |
>> Work-Job-Executor-19  | Migrating | User               | Prepare   |
>> 1486020173 |        1177 | Host          |          25 | i-189-1177-VM |
>> NULL                |
>> | 25430cd5-4346-4643-9c7e-7b39b8a876a2 |   345049098122 | 1486019693 |
>> Work-Job-Executor-6   | Migrating | User               | Prepare   |
>> 1486019693 |        1177 | Host          |          25 | i-189-1177-VM |
>> NULL                |
>> | 3276d52e-cbd9-44c5-8ab6-c70784337ff9 |   345049098498 | 1486020953 |
>> Work-Job-Executor-33  | Migrating | User               | Prepare   |
>> 1486020953 |        1177 | Host          |          33 | i-189-1177-VM |
>> NULL                |
>> | 6462da33-83f2-4575-af38-0e52d2a48488 |   345049098122 | 1486021433 |
>> Work-Job-Executor-12  | Migrating | User               | Prepare   |
>> 1486021433 |        1177 | Host          |          25 | i-189-1177-VM |
>> NULL                |
>> | 688e16c8-c39f-4e88-9e8f-36c0282f3df1 |   345049098122 | 1486020773 |
>> Work-Job-Executor-11  | Migrating | User               | Prepare   |
>> 1486020773 |        1177 | Host          |          25 | i-189-1177-VM |
>> NULL                |
>> | d8a18e9d-ac6f-4273-923e-e401533e2cf7 |   345049098498 | 1486020353 |
>> Work-Job-Executor-25  | Migrating | User               | Prepare   |
>> 1486020353 |        1177 | Host          |          25 | i-189-1177-VM |
>> NULL                |
>> | 1222ece8-399d-43d8-8ff8-d24bdebca2ed |   345049103441 | 1486018924 |
>> Work-Job-Executor-2   | Migrating | User               | Prepare   |
>> 1486018924 |        1178 | Host          |          29 | i-189-1178-VM |
>> NULL                |
>> | 2ac5ca4a-c20d-433a-8379-40e0de3b3ba8 |   345049098498 | 1486021613 |
>> Work-Job-Executor-41  | Migrating | User               | Prepare   |
>> 1486021613 |        1178 | Host          |          29 | i-189-1178-VM |
>> NULL                |
>> | 3e3b95a0-2af4-4268-bfdd-f3399a85bcaa |   345049103441 | 1486019584 |
>> Work-Job-Executor-5   | Migrating | User               | Prepare   |
>> 1486019584 |        1178 | Host          |          29 | i-189-1178-VM |
>> NULL                |
>> | 53745a2d-056c-4a1b-9267-eb97f077a6c8 |   345049103441 | 1486020244 |
>> Work-Job-Executor-6   | Migrating | User               | Prepare   |
>> 1486020244 |        1178 | Host          |          29 | i-189-1178-VM |
>> NULL                |
>> | e3555b49-7c18-4272-b7f4-ba9351b3bbd1 |   345049098498 | 1486020893 |
>> Work-Job-Executor-32  | Migrating | User               | Prepare   |
>> 1486020893 |        1178 | Host          |          29 | i-189-1178-VM |
>> NULL                |
>> | 1dcd5204-4f9a-4342-a23b-329a48596e13 |   345049098498 | 1486020953 |
>> Work-Job-Executor-34  | Migrating | User               | Prepare   |
>> 1486020953 |        1240 | Host          |           3 | i-388-1240-VM |
>> NULL                |
>> | ec1668b3-fa3c-4673-b593-3d607b169a24 |   345049098498 | 1486019693 |
>> Work-Job-Executor-14  | Migrating | User               | Prepare   |
>> 1486019693 |        1240 | Host          |           3 | i-388-1240-VM |
>> NULL                |
>> | fa7b2dec-ed7f-4a67-a055-e6e6bc1631ca |   345049098498 | 1486020353 |
>> Work-Job-Executor-26  | Migrating | User               | Prepare   |
>> 1486020353 |        1240 | Host          |           3 | i-388-1240-VM |
>> NULL                |
>> | 8cfe6d3e-5b7b-4be8-ac10-c865a492aab9 |   345049098498 | 1465387271 |
>> Work-Job-Executor-18  | Running   | User               | Prepare   |
>> 1465387271 |        1388 | Host          |           7 | i-361-1388-VM |
>> NULL                |
>> | 6e941b7e-6f19-4717-bf82-974c59737dbc |   345049103441 | 1486019524 |
>> Work-Job-Executor-4   | Migrating | User               | Prepare   |
>> 1486019524 |        1690 | Host          |          15 | i-648-1690-VM |
>> NULL                |
>> | 718578a4-0dd6-42d9-be10-ed7680650108 |   345049098122 | 1486021433 |
>> Work-Job-Executor-13  | Migrating | User               | Prepare   |
>> 1486021433 |        1690 | Host          |          15 | i-648-1690-VM |
>> NULL                |
>> | 9acf3908-ef13-4b03-8c2b-1a7e55e18491 |   345049098498 | 1486020173 |
>> Work-Job-Executor-20  | Migrating | User               | Prepare   |
>> 1486020173 |        1690 | Host          |          15 | i-648-1690-VM |
>> NULL                |
>> | a50c420b-ae52-4caa-83db-704b52b299a4 |   345049098498 | 1486020773 |
>> Work-Job-Executor-28  | Migrating | User               | Prepare   |
>> 1486020773 |        1690 | Host          |          15 | i-648-1690-VM |
>> NULL                |
>> | 2b5f4179-5975-4326-9db7-643c3ede63fe |   345049103441 | 1484224516 |
>> Work-Job-Executor-161 | Running   | User               | Prepare   |
>> 1484224516 |        1703 | Host          |           3 | i-51-1703-VM  |
>> NULL                |
>> | 2f3c6c44-b2a0-4d10-88d2-ac143d2dd379 |   345049098498 | 1484223910 |
>> Work-Job-Executor-163 | Running   | User               | Prepare   |
>> 1484223910 |        1703 | Host          |           3 | i-51-1703-VM  |
>> NULL                |
>> | 4126a23b-1cde-49bc-8636-c0d063b85cf6 |   345049103441 | 1484224636 |
>> Work-Job-Executor-86  | Running   | User               | Prepare   |
>> 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  |
>> NULL                |
>> | 5dad5645-766b-48c9-9a85-b77b3c101390 |   345049103441 | 1484224636 |
>> Work-Job-Executor-92  | Running   | User               | Prepare   |
>> 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  |
>> NULL                |
>> | 8c1dfff9-e55d-4f81-a22f-45f72accc416 |   345049103441 | 1484224576 |
>> Work-Job-Executor-95  | Running   | User               | Prepare   |
>> 1484224576 |        1703 | Host          |           3 | i-51-1703-VM  |
>> NULL                |
>> | a703d2d7-b1e3-42ce-a551-831982990ffb |   345049098498 | 1484224630 |
>> Work-Job-Executor-13  | Running   | User               | Prepare   |
>> 1484224630 |        1703 | Host          |           3 | i-51-1703-VM  |
>> NULL                |
>> | 0e194cb5-95b7-41cb-969a-ae37d2ad449a |   345049098498 | 1485763613 |
>> Work-Job-Executor-28  | Migrating | User               | Prepare   |
>> 1485763613 |        1735 | Host          |          19 | i-638-1735-VM |
>> 2017-01-30 15:24:10 |
>> | 74f180b9-be0f-429b-9a45-597ca2ca997d |   345049098498 | 1485763013 |
>> Work-Job-Executor-21  | Migrating | User               | Prepare   |
>> 1485763013 |        1735 | Host          |          19 | i-638-1735-VM |
>> 2017-01-30 15:24:10 |
>> | feaa421c-88df-4493-9a4f-bddb6ee1b75b |   345049098122 | 1485764285 |
>> Work-Job-Executor-19  | Migrating | User               | Prepare   |
>> 1485764285 |        1735 | Host          |          19 | i-638-1735-VM |
>> 2017-01-30 15:24:10 |
>> | 32138a36-ad18-42b5-8478-76ad63caf4da |   345049098122 | 1486020293 |
>> Work-Job-Executor-9   | Migrating | User               | Prepare   |
>> 1486020293 |        1736 | Host          |          19 | i-638-1736-VM |
>> NULL                |
>> | 5306ab1f-3c92-4f95-bbff-283b43e6862d |   345049098498 | 1486019573 |
>> Work-Job-Executor-11  | Migrating | User               | Prepare   |
>> 1486019573 |        1736 | Host          |          19 | i-638-1736-VM |
>> NULL                |
>> | 843bdf9c-247d-432c-ac19-1a4f7858631b |   345049098498 | 1486021493 |
>> Work-Job-Executor-38  | Migrating | User               | Prepare   |
>> 1486021493 |        1736 | Host          |          19 | i-638-1736-VM |
>> NULL                |
>> | 8d400afb-766f-4ac5-89ee-854f25a38eba |   345049098498 | 1486020173 |
>> Work-Job-Executor-21  | Migrating | User               | Prepare   |
>> 1486020173 |        1736 | Host          |          19 | i-638-1736-VM |
>> NULL                |
>> | f0ae5612-b6ef-4882-94a2-40a1e6dce342 |   345049098498 | 1486020833 |
>> Work-Job-Executor-30  | Migrating | User               | Prepare   |
>> 1486020833 |        1736 | Host          |          19 | i-638-1736-VM |
>> NULL                |
>> +--------------------------------------+----------------+---
>> ---------+-----------------------+-----------+--------------
>> ------+-----------+------------+-------------+--------------
>> -+-------------+---------------+---------------------+
>>
>> As you can see, the problematic VMs are all present in that table:
>>
>> i-189-1177-VM
>> i-189-1178-VM
>> i-388-1240-VM
>> i-648-1690-VM
>> i-638-1736-VM
>>
>> As the MSs are running the production environments, we cannot debug a
>> lot and just need the shortest path to resolution.
>> For example, restarting the MSs has an impact on our customers and we
>> need to reduce the number of restarts to the minimum.
>>
>> I have disabled HA on all problematic VMs except i-189-1177-VM,
>> which is less critical.
>>
>> The next steps could be:
>> 1.     stop all MS
>> 2.     backup DB
>> 3.     select op_it_work.*,vm_instance.instance_name,vm_instance.removed
>> from op_it_work join vm_instance where op_it_work.step!="Done" and
>> op_it_work.instance_id=vm_instance.id order by instance_id;
>> 4.     update op_it_work set step="Done" where step!="Done";
>> 5.     start all MS
>>
>> What do you think about step 4, safe or risky?
>> Some of these entries are related to deleted VMs and seem safe to
>> mark as "Done".
>> All other entries are related to the problematic VMs that keep
>> stopping... so marking them as "Done" may (hopefully) eliminate the cause
>> of the Migrating/Stopping.
>> But we are not sure whether marking them as "Done" also requires changes
>> on other tables for consistency.
>>
>>
>> Thank you for your help.
>>
>>
>> Best Regards
>> Francois Scheurer
>>
>>
>>
>>
>>
>> On 01.02.2017 03:01, Rafael Weingärtner wrote:
>>
>> Ok, I dug a little further into the code.
>>
>> If we put a host in maintenance, it may unfortunately happen that a
>>> timeout occurs and CS aborts the migration. As far as we can interpret the
>>> logs, it seems (we speculate) that CS detects that the VM is then still
>>> running on the original host, which is in maintenance, and decides somehow
>>> to stop it because of the "host not up" status.
>>>
>>
>> From what I have seen in the code, this may happen (as you speculated). If
>> a timeout or some other event interrupts the migration, the VM will be
>> stopped. This happens in the method
>> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>> long, DeploymentPlanner)” at line 2396 and onwards. This was implemented
>> on purpose: if the migration does not happen, the VM is stopped on the
>> source host. Probably it was implemented this way following the gradient
>> descent philosophy from artificial intelligence:
>>
>> “pick a random point in the field, update the values according to the
>>> slope and hopefully after successful iterations you converge to a minimum
>>> point.” (the problem is that sometimes the function diverges).
>>>
>>
>> I mean, probably this was done this way because the HA was supposed to
>> see that a VM with HA is stopped, and should try to start it again
>> somewhere else. The log you posted does not provide any other details to
>> debug this further because the log is after an MS restart, right?
>>
>>
>> After restarting the MSs I guess that a similar issue may happen: the
>>> hosts are disconnected during the early stage and then reconnect. Maybe CS
>>> then again stops the VMs because of the "host not up" status?
>>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>> The last host of this VM is not UP or is not enabled, host status is:
>>> Disconnected, host resource state is: Enabled
>>>
>>
>> Here is something interesting that you have found. I checked the code,
>> but without a live debug it is pretty hard for me to get the right flow of
>> execution. By looking at the code and at the logs you provided, your
>> deduction is right. After ACS boots up, the check for hosts that are
>> shut down/deactivated starts, and the method
>> “com.cloud.agent.manager.AgentManagerImpl.executeUserRequest(long, Event)”
>> is executed. The problem is that all of the hosts are considered
>> disconnected (ACS may not have finished connecting to all of them yet),
>> and then a whole HA process is kicked off. The problem is that there is no
>> host with status “UP”. That is why “No suitable hosts found under this
>> Cluster: 25” happens, and then the whole HA flow, which shuts down the VM,
>> is kicked off.
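>>
>> If you want to double check that from the database right after an MS
>> restart, a quick select on the host table should show the transient
>> Disconnected status (just a sketch; the column names are assumed from the
>> standard schema, so please verify them on your version first):
>>
>> select id, name, status, resource_state from host where removed is null;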
>>
>> In theory, the HA worker should see that a VM with HA is down, and should
>> try to start it. I do not understand why this is not working. If you
>> decide to debug the running ACS, I may be able to help you, just keep me
>> posted.
>>
>>
>> On Tue, Jan 31, 2017 at 10:26 AM, Rafael Weingärtner <
>> <ra...@gmail.com> wrote:
>>
>>> I will be able to re-check some code and answer you with new thoughts on
>>> this problem tonight (the log files you provided today have some very
>>> insightful information regarding this problem). I believe this is a bug in
>>> HA, a very tricky one that has not been reported yet (at least I do not
>>> remember seeing anything similar). And thanks to randomness/destiny now you
>>> have found/uncovered a way to reproduce the problem (bug?!).
>>>
>>> I do not suggest changing any data at all directly in the database
>>> (unless you know the ACS code well and are 100% sure of what you are
>>> doing). This can cause even bigger and more complicated problems. If you
>>> remove the HA feature from these VMs, this problem will stop. However, if
>>> you can/are willing to keep HA enabled until we fully uncover the problem,
>>> map it and report it, this would be great.
>>> BTW: why are you restarting MSs?
>>> What are the configurations of your MSs?
>>>
>>> Thanks for your time and for sharing information with our community.
>>>
>>>
>>>
>>> On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer <
>>> francois.scheurer@everyware.ch> wrote:
>>>
>>>> Dear Rafael
>>>>
>>>>
>>>> Many thanks for your deep analysis.
>>>> Nice humor with the planets alignment BTW ;)
>>>>
>>>> We also suspected the HA flag to be related to this issue, as we never
>>>> got this issue in the past, until we enabled HA on all VMs and offerings.
>>>>
>>>> Here is also the workers list:
>>>>
>>>> mysql> select * from op_ha_work;
>>>> +------+-------------+-----------+--------------------+-----
>>>> -----+----------------+---------+---------------------+-----
>>>> --+---------------------+-------+-------------+---------+
>>>> | id   | instance_id | type      | vm_type            | state    |
>>>> mgmt_server_id | host_id | created             | tried | taken
>>>>  | step  | time_to_try | updated |
>>>> +------+-------------+-----------+--------------------+-----
>>>> -----+----------------+---------+---------------------+-----
>>>> --+---------------------+-------+-------------+---------+
>>>> |  105 |         343 | HA        | User               | Running  |
>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>> Error |  1394056013 |       3 |
>>>> |  109 |         407 | HA        | ConsoleProxy       | Running  |
>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>> Error |  1394056013 |       3 |
>>>> |  113 |         407 | HA        | ConsoleProxy       | Running  |
>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>> Error |  1394056013 |       3 |
>>>> |  117 |         407 | HA        | ConsoleProxy       | Running  |
>>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>> Error |  1394056013 |       3 |
>>>> |  121 |         343 | HA        | User               | Running  |
>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>> Error |  1394056013 |       3 |
>>>> |  125 |         405 | HA        | SecondaryStorageVm | Running  |
>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>> Error |  1394056013 |       3 |
>>>> |  129 |         393 | HA        | User               | Running  |
>>>>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>> Error |  1394056014 |      30 |
>>>> |  133 |         405 | HA        | SecondaryStorageVm | Running  |
>>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>> Error |  1394056014 |       3 |
>>>> |  137 |         402 | HA        | ConsoleProxy       | Running  |
>>>>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>>> Error |  1394056014 |       3 |
>>>> |  345 |         402 | HA        | ConsoleProxy       | Starting |
>>>>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
>>>> Error |  1402527506 |     770 |
>>>> |  349 |         687 | HA        | SecondaryStorageVm | Starting |
>>>>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
>>>> Error |  1402527506 |       2 |
>>>> | 2057 |        1178 | Migration | User               | Running  |
>>>>  345049098498 |      33 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>> Done  |  1451024774 |      95 |
>>>> | 2058 |        1736 | Migration | User               | Running  |
>>>>  345049098498 |     224 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>> Done  |  1451024774 |      45 |
>>>> | 2059 |        1240 | Migration | User               | Running  |
>>>>  345049098498 |       7 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>>> Done  |  1451024774 |      61 |
>>>> | 2060 |        1178 | Migration | User               | Running  |
>>>>  345049098122 |      33 | 2017-01-31 07:56:08 | 1 | 2017-01-31 08:06:37 |
>>>> Done  |  1451025374 |      95 |
>>>> | 2061 |        1178 | Migration | User               | Stopped  |
>>>>      NULL |      33 | 2017-01-31 07:56:09 | 5 | NULL                | Done
>>>> |  1451027900 |     109 |
>>>> | 2062 |        1690 | Migration | User               | Running  |
>>>>  345049098498 |     224 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
>>>> Done  |  1451024774 |     109 |
>>>> | 2063 |        1177 | Migration | User               | Running  |
>>>>  345049098498 |      29 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
>>>> Done  |  1451024774 |      82 |
>>>> | 2064 |        1690 | Migration | User               | Running  |
>>>>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>> Done  |  1451024900 |     109 |
>>>> | 2065 |        1240 | Migration | User               | Running  |
>>>>  345049098122 |       7 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>> Done  |  1451024900 |      61 |
>>>> | 2066 |        1736 | Migration | User               | Running  |
>>>>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>>> Done  |  1451024900 |      45 |
>>>> | 2067 |        1240 | Migration | User               | Stopped  |
>>>>      NULL |       7 | 2017-01-31 07:56:09 | 5 | NULL                | Done
>>>> |  1451028003 |      80 |
>>>> | 2068 |        1178 | Migration | User               | Running  |
>>>>  345049098122 |      33 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>> Done  |  1451024781 |      95 |
>>>> | 2069 |        1736 | Migration | User               | Running  |
>>>>  345049098122 |     224 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>> Done  |  1451024781 |      45 |
>>>> | 2070 |        1240 | Migration | User               | Running  |
>>>>  345049098122 |       7 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>> Done  |  1451024781 |      61 |
>>>> | 2071 |        1177 | Migration | User               | Running  |
>>>>  345049098122 |      29 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>>> Done  |  1451024781 |      82 |
>>>> | 2072 |        1177 | Migration | User               | Stopped  |
>>>>      NULL |      29 | 2017-01-31 07:56:16 | 5 | NULL                | Done
>>>> |  1451027973 |     101 |
>>>> | 2073 |        1177 | Migration | User               | Running  |
>>>>  345049098122 |      29 | 2017-01-31 07:56:16 | 1 | 2017-01-31 07:58:22 |
>>>> Done  |  1451024904 |      82 |
>>>> | 2074 |        1690 | Migration | User               | Stopped  |
>>>>      NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                | Done
>>>> |  1451027464 |     123 |
>>>> | 2075 |        1736 | Migration | User               | Stopped  |
>>>>      NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                | Done
>>>> |  1451027477 |      59 |
>>>> +------+-------------+-----------+--------------------+-----
>>>> -----+----------------+---------+---------------------+-----
>>>> --+---------------------+-------+-------------+---------+
>>>>
>>>> If we put a host in maintenance, it may unfortunately happen that a
>>>> timeout occurs and CS aborts the migration.
>>>> As far as we can interpret the logs, it seems (we speculate) that CS
>>>> detects that the VM is still running on the original host, which is in
>>>> maintenance, and then somehow decides to stop it because of the "host
>>>> not up" status.
>>>> We then need to start the VM several times, until CS gives up stopping
>>>> it.
>>>>
>>>> After restarting the MSs I guess that a similar issue may happen: the
>>>> hosts are disconnected during the early stage and then reconnect. Maybe CS
>>>> then again stops the VMs because of the "host not up" status?
>>>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>> Disconnected, host resource state is: Enabled
>>>>
>>>> But the problem after the MSs restart definitely only impacts VMs
>>>> listed in the op_ha_work table.
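>>>>
>>>> For example, to check whether a given VM still has work items queued
>>>> there (VM id 1178 below, purely as an illustration; this is a read-only
>>>> query):
>>>>
>>>> mysql> select id, instance_id, type, state, step, tried, taken
>>>>     ->   from op_ha_work where instance_id = 1178 order by created;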
>>>>
>>>>
>>>> To your questions:
>>>>         We have 3 ACS MSs managing several Zones. Each zone has a
>>>> single pod and cluster.
>>>>         The Cluster of i-638-1735-VM has this summary:
>>>>          Total number of VMs: 121
>>>>          Total number hypervisors: 4
>>>>          Total allocated RAM: 522.0 / 972 GB (53.7 %)
>>>>          Total allocated cores: 211
>>>>          Total allocated storage: 11285 GB
>>>>         Used memory on the hosts currently varies from 28 to 77%, so
>>>> there is enough RAM.
>>>>         I did not check the XS logs yet. The same problem arose again
>>>> this morning (shortly after a restart of all MSs), so I will look at the
>>>> XS logs now.
>>>>
>>>>         I have fair experience with Eclipse and Java; I will ask my
>>>> boss if we can go this way.
>>>>
>>>> The main issue for now is that restarting the MSs reproduces the
>>>> problem, which is quite annoying for the impacted VMs.
>>>> I thought of stopping the MSs, deleting all HA/Migration entries in
>>>> op_ha_work and then starting the MSs again.
>>>> Is that a viable option? (or should I delete only the Migration entries?
>>>> or never touch this table?)
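>>>>
>>>> For illustration only, deleting just the Migration entries would look
>>>> roughly like this (a sketch of the idea, not a recommendation; I would
>>>> only run it with all MSs stopped and after a DB backup, and only if you
>>>> confirm it is safe):
>>>>
>>>> mysql> delete from op_ha_work where type = 'Migration';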
>>>> We will also very probably remove the HA Flag on all VMs if it does not
>>>> work reliably.
>>>>
>>>> Best Regards
>>>> Francois Scheurer
>>>>
>>>>
>>>>
>>>> Here are new logs regarding the same issue from today.
>>>> I tried to extract the important parts related to the stop of i-189-1178-VM.
>>>>
>>>> regexp: (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|
>>>> StopRequestedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae733|
>>>> work-2049)
>>>>
>>>> 2017-01-30 08:44:04,440 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Executing
>>>> AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType: null,
>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>>>> F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm
>>>> 1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTA
>>>> ALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAA
>>>> AAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resu
>>>> ltCode: 0, result: null, initMsid: 345049103441, completeMsid: null,
>>>> lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 2017}
>>>> 2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Run VM work
>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
>>>>
>>>> 2017-01-30 08:44:05,217 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Deploy avoids pods: [], clusters: [], hosts: [29]
>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
>>>> (AgentConnectTaskPool-4:ctx-5b61f2e5) create ClusteredAgentAttache for
>>>> 232
>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.UserDispersin
>>>> gPlanner@1734b2e2
>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
>>>> requested cpu: 4200, requested ram: 4294967296
>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Is ROOT volume READY (pool already allocated)?: Yes
>>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> This VM has last host_id specified, trying to choose the same host: 25
>>>>
>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>> Disconnected, host resource state is: Enabled
>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Cannot choose the last host to deploy this VM
>>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Searching resources only under specified Cluster: 25
>>>>
>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>>>> prioritization: []
>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> No suitable hosts found
>>>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> No suitable hosts found under this Cluster: 25
>>>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Could not find suitable Deployment Destination for this VM under any
>>>> clusters, returning.
>>>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Searching resources only under specified Cluster: 25
>>>> 2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> The specified cluster is in avoid set, returning.
>>>> 2017-01-30 08:44:05,798 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
>>>> 2017-01-30 08:44:05,798 WARN [c.c.v.VirtualMachineManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Failed to deploy vm 1178 with original planner, sending HAPlanner
>>>>
>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Deploy avoids pods: [], clusters: [], hosts: [29]
>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
>>>> sPlanner@4fdf86ce
>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
>>>> requested cpu: 4200, requested ram: 4294967296
>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Is ROOT volume READY (pool already allocated)?: Yes
>>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> This VM has last host_id specified, trying to choose the same host: 25
>>>>
>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>> Disconnected, host resource state is: Enabled
>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Cannot choose the last host to deploy this VM
>>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Searching resources only under specified Cluster: 25
>>>> 2017-01-30 08:44:06,039 DEBUG [c.c.d.SkipHeuresticsPlanner]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Deploying vm during HA process, so skipping disable threshold check
>>>>
>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>>>> prioritization: []
>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> No suitable hosts found
>>>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> No suitable hosts found under this Cluster: 25
>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Publish async
>>>> job-192986 complete on message bus
>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
>>>> related to job-192986
>>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Update db
>>>> status for job-192986
>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>> Disconnected, host resource state is: Enabled
>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>> Cannot choose the last host to deploy this VM
>>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner]
>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>> Searching resources only under specified Cluster: 35
>>>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Could not find suitable Deployment Destination for this VM under any
>>>> clusters, returning.
>>>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Searching resources only under specified Cluster: 25
>>>> 2017-01-30 08:44:06,163 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>>>> SyncQueueItemVO {id:63629, queueId: 64423, contentType: AsyncJob,
>>>> contentId: 192991, lastProcessMsid: 345049098498, lastprocessNumber: 27,
>>>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
>>>> CET 2017}
>>>> 2017-01-30 08:44:06,169 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Add job-192992
>>>> into job monitoring
>>>> 2017-01-30 08:44:06,184 DEBUG [c.c.d.SkipHeuresticsPlanner]
>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>> Deploying vm during HA process, so skipping disable threshold check
>>>> 2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> The specified cluster is in avoid set, returning.
>>>> 2017-01-30 08:44:06,186 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
>>>> 2017-01-30 08:44:06,186 ERROR [c.c.v.VmWorkJobHandlerProxy]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Invocation exception, caused by: com.cloud.exception.InsufficientServerCapacityException:
>>>> Unable to find a server to migrate to.Scope=interface
>>>> com.cloud.org.Cluster; id=25
>>>> 2017-01-30 08:44:06,187 INFO  [c.c.v.VmWorkJobHandlerProxy]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>>> Rethrow exception com.cloud.exception.InsufficientServerCapacityException:
>>>> Unable to find a server to migrate to.Scope=interface
>>>> com.cloud.org.Cluster; id=25
>>>> 2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done with run
>>>> of VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin:
>>>> 15393
>>>> 2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Unable to
>>>> complete AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType:
>>>> null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>> AAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ, cmdVersion: 0,
>>>> status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>> initMsid: 345049103441, completeMsid: null, lastUpdated: null, lastPolled:
>>>> null, created: Mon Jan 30 08:44:03 CET 2017}, job origin:15393
>>>> com.cloud.exception.InsufficientServerCapacityException: Unable to
>>>> find a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
>>>>         at com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAwa
>>>> y(VirtualMachineManagerImpl.java:2390)
>>>>         at com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAwa
>>>> y(VirtualMachineManagerImpl.java:4520)
>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
>>>> ssorImpl.java:57)
>>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>>> thodAccessorImpl.java:43)
>>>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>>>         at com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJob
>>>> HandlerProxy.java:107)
>>>>         at com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(Virtu
>>>> alMachineManagerImpl.java:4636)
>>>>         at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.
>>>> java:103)
>>>>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImp
>>>> l$5.runInContext(AsyncJobManagerImpl.java:537)
>>>>         at org.apache.cloudstack.managed.context.ManagedContextRunnable
>>>> $1.run(ManagedContextRunnable.java:49)
>>>>         at org.apache.cloudstack.managed.context.impl.DefaultManagedCon
>>>> text$1.call(DefaultManagedContext.java:56)
>>>>         at org.apache.cloudstack.managed.context.impl.DefaultManagedCon
>>>> text.callWithContext(DefaultManagedContext.java:103)
>>>>         at org.apache.cloudstack.managed.context.impl.DefaultManagedCon
>>>> text.runWithContext(DefaultManagedContext.java:53)
>>>>         at org.apache.cloudstack.managed.context.ManagedContextRunnable
>>>> .run(ManagedContextRunnable.java:46)
>>>>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImp
>>>> l$5.run(AsyncJobManagerImpl.java:494)
>>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executor
>>>> s.java:471)
>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool
>>>> Executor.java:1145)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoo
>>>> lExecutor.java:615)
>>>>         at java.lang.Thread.run(Thread.java:745)
>>>> 2017-01-30 08:44:06,188 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued job-192991
>>>> 2017-01-30 08:44:06,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
>>>> joined with job-192986 and disjoin all subjobs created from job- 192986
>>>> 2017-01-30 08:44:06,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Complete async
>>>> job-192989, jobStatus: FAILED, resultCode: 0, result:
>>>> rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZl
>>>> ckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vw
>>>> c0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRD
>>>> YXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5n
>>>> L0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xv
>>>> dWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vy
>>>> cm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2
>>>> YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4a
>>>> OxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNh
>>>> dXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQA
>>>> EkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFu
>>>> Zy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QA
>>>> EExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBz
>>>> ZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFj
>>>> ZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNr
>>>> VHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFy
>>>> aW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-
>>>> AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2Vy
>>>> SW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hl
>>>> c3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EA
>>>> fgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1w
>>>> bHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNx
>>>> AH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5y
>>>> ZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0
>>>> aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhq
>>>> YXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EA
>>>> fgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0
>>>> ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pv
>>>> YnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNs
>>>> b3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRj
>>>> aGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91
>>>> ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1w
>>>> bCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0
>>>> c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNv
>>>> bnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRl
>>>> eHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUu
>>>> Y2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdl
>>>> ZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2Fs
>>>> bHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5j
>>>> b250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxs
>>>> V2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250
>>>> ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2Vk
>>>> LmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-
>>>> ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5j
>>>> b25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRv
>>>> cnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50
>>>> LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0
>>>> ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdU
>>>> aHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0
>>>> AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29y
>>>> a2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAAL
>>>> VGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5t
>>>> b2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEu
>>>> dXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte
>>>> 9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEu
>>>> dXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4
>>>> cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1
>>>> ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMA
>>>> AAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAFdmFs
>>>> dWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-o
>>>> maW2FAIAAHhwAA
>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>> Deploy avoids pods: [], clusters: [], hosts: [3]
>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
>>>> sPlanner@4fdf86ce
>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>> Trying to allocate a host and storage pools from dc:14, pod:17,cluster:17,
>>>> requested cpu: 4200, requested ram: 4294967296
>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>> Is ROOT volume READY (pool already allocated)?: Yes
>>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>> This VM has last host_id specified, trying to choose the same host: 213
>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Publish async
>>>> job-192989 complete on message bus
>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
>>>> related to job-192989
>>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Update db
>>>> status for job-192989
>>>> 2017-01-30 08:44:06,219 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>>> Checking resources in Cluster: 35 under Pod: 31
>>>> 2017-01-30 08:44:06,227 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>> FirstFitRoutingAllocator) Looking for hosts in dc: 23  pod:31  cluster:35
>>>> 2017-01-30 08:44:06,229 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Executing
>>>> AsyncJobVO {id:192992, userId: 1, accountId: 1, instanceType: null,
>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>> AAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw, cmdVersion: 0,
>>>> status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>>>> null, created: Mon Jan 30 08:44:05 CET 2017}
>>>> 2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Run VM work
>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1729, job origin: 14953
>>>> 2017-01-30 08:44:06,230 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>>>> SyncQueueItemVO {id:63630, queueId: 67180, contentType: AsyncJob,
>>>> contentId: 192990, lastProcessMsid: 345049098498, lastprocessNumber: 61,
>>>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
>>>> CET 2017}
>>>> 2017-01-30 08:44:06,244 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>> The last host of this VM is not UP or is not enabled, host status is:
>>>> Disconnected, host resource state is: Enabled
>>>> 2017-01-30 08:44:06,262 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>> Cannot choose the last host to deploy this VM
>>>> 2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner]
>>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>>> Searching resources only under specified Cluster: 17
>>>> 2017-01-30 08:44:06,262 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Add job-192991
>>>> into job monitoring
>>>> 2017-01-30 08:44:06,246 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Done executing
>>>> com.cloud.vm.VmWorkMigrateAway for job-192986
>>>> 2017-01-30 08:44:06,244 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
>>>> joined with job-192989 and disjoin all subjobs created from job- 192989
>>>> 2017-01-30 08:44:06,269 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Remove
>>>> job-192986 from job monitoring
>>>> 2017-01-30 08:44:06,279 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done executing
>>>> com.cloud.vm.VmWorkMigrateAway for job-192989
>>>> 2017-01-30 08:44:06,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Executing
>>>> AsyncJobVO {id:192991, userId: 1, accountId: 1, instanceType: null,
>>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>>> AAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw, cmdVersion: 0,
>>>> status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>>>> null, created: Mon Jan 30 08:44:05 CET 2017}
>>>> 2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher]
>>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Run VM work
>>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1605, job origin: 14965
>>>> 2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl]
>>>> (AgentManager-Handler-13:null) Ping from 258
>>>> 2017-01-30 08:44:06,319 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>> FirstFitRoutingAllocator) List of hosts in ascending order of number of
>>>> VMs: [224, 19, 21, 15]
>>>> 2017-01-30 08:44:06,322 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>>> FirstFitRoutingAllocator) FirstFitAllocator has 0 hosts to check for
>>>> allocation: []
>>>> 2017-01-30 08:44:06,322 INFO  [c.c.a.m.AgentManagerImpl]
>>>> (AgentManager-Handler-13:null) PingMap for agent: 258 will not be updated
>>>> because agent is no longer in the PingMap
>>>> 2017-01-30 08:44:06,326 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Remove
>>>> job-192989 from job monitoring
>>>>
>>>> 2017-01-30 08:54:38,139 INFO [c.c.v.VirtualMachineManagerImpl]
>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>> Migration cancelled because state has changed: VM[User|i-189-1178-VM]
>>>> 2017-01-30 08:54:38,139 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>> Unable to migrate VM due to: Migration cancelled because state has changed:
>>>> VM[User|i-189-1178-VM]
>>>> 2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl]
>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>> VM state transitted from :Running to Stopping with event: StopRequestedvm's
>>>> original host id: 29 new host id: 33 host id before state transition: 33
>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.DirectAgentAttache]
>>>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Response
>>>> Received:
>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Processing:  {
>>>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10,
>>>> [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-
>>>> 1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,
>>>> "result":true,"details":"","wait":0}}] }
>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>>> (RouterMonitor-1:ctx-2e80f405) Seq 19-2355664080091480198: Received:  {
>>>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10, {
>>>> NetworkUsageAnswer } }
>>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl]
>>>> (RouterMonitor-1:ctx-2e80f405) Details from executing class
>>>> com.cloud.agent.api.NetworkUsageCommand:
>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>> Seq 33-1917970491306410035: Sending  { Cmd , MgmtId: 345049098498, via:
>>>> 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>> }
>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>>> Seq 33-1917970491306410035: Executing:  { Cmd , MgmtId: 345049098498, via:
>>>> 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>>> }
>>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.m.DirectAgentAttache]
>>>> (DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035: Executing request
>>>> 2017-01-30 08:54:38,202 DEBUG [c.c.a.m.DirectAgentAttache]
>>>> (DirectAgent-326:ctx-4fb88aa7) Seq 21-1035546439318503505: Executing request
>>>>
>>>> 2017-01-30 09:05:55,628 INFO [c.c.h.HighAvailabilityManagerImpl]
>>>> (HA-Worker-4:ctx-36aae733 work-2049) Processing
>>>> HAWork[2049-Migration-1178-Stopped-Migrating]
>>>> 2017-01-30 09:05:55,633 WARN [o.a.c.f.j.AsyncJobExecutionContext]
>>>> (HA-Worker-4:ctx-36aae733 work-2049) Job is executed without a context,
>>>> setup psudo job for the executing thread
>>>> 2017-01-30 09:05:55,645 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>>> (HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045 execution on object
>>>> VmWorkJobQueue.1178
>>>>
>>>>
>>>>
>>>>
>>>> On 31.01.2017 00:42, Rafael Weingärtner wrote:
>>>>
>>>>> This seems to be one of those nasty bugs that are very hard to debug.
>>>>> It seems to be caused by a combination of different factors, such as
>>>>> planets alignments ;)
>>>>>
>>>>> At class “VirtualMachineManagerImpl” line 2395 (method
>>>>> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>>>> long, DeploymentPlanner)”) is the start of the code flow that triggers
>>>>> the migration. The point here is that, if an error occurs during the
>>>>> migration, the VM will be stopped; for some reason this is executed only
>>>>> for HA VMs.
>>>>>
>>>>> Going a little bit further, we find the point that is triggering the
>>>>> error in the method
>>>>> “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO, long,
>>>>> DeployDestination)” at line 1941. This line will call the method
>>>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
>>>>> Long, ItWorkVO, Step)” to change the VM state to migrating/prepare
>>>>> migration/or something like that (I am guessing here), but for some
>>>>> reason the state of the VM is not the one expected and this causes a
>>>>> “ConcurrentOperationException” to be thrown. The problem is that
>>>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
>>>>> Long, ItWorkVO, Step)” does not seem to produce many useful log entries.
>>>>>
>>>>> Just to enable us to discuss the problem a little further, could you
>>>>> tell us a little bit about your environment settings? How many ACS MSs
>>>>> do you have? What is the VM workload being migrated (I mean, the number
>>>>> of VMs and computing resources)? Do the other servers have enough
>>>>> computing resources to support these VMs? Could you also check the
>>>>> XenServer logs, and check what happened on both the source and
>>>>> destination hosts when the migration for VM “i-638-1735-VM” was
>>>>> triggered?
>>>>>
>>>>> Do you have the knowledge/skills to use an IDE (e.g. Eclipse) to
>>>>> connect to a running JVM and debug the code? For me it will not be
>>>>> possible to go further without looking at the variables in a running
>>>>> environment.
>>>>>
>>>>>
>>>>> On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
>>>>> francois.scheurer@everyware.ch> wrote:
>>>>>
>>>>> Dear Rafael
>>>>>>
>>>>>>
>>>>>> Sorry for the late answer (I was sick for a couple of days).
>>>>>>
>>>>>> I confirm we are migrating inside the same xenserver pool (within a CS
>>>>>> Cluster).
>>>>>>
>>>>>>
>>>>>> Best Regards
>>>>>> Francois
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 26.01.2017 04:35, Rafael Weingärtner wrote:
>>>>>>
>>>>>> Found the code that is executed.
>>>>>>> I need you to confirm if you are migrating within a cluster or across
>>>>>>> clusters?
>>>>>>>
>>>>>>> On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
>>>>>>> rafaelweingartner@gmail.com> wrote:
>>>>>>>
>>>>>>> I do not have access to ACS code right now, but I suggest starting
>>>>>>>
>>>>>>>> debugging here:
>>>>>>>>
>>>>>>>> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
>>>>>>>>
>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>>>> ctx-175d37df)
>>>>>>>>> VM state transitted from :Running to Stopping with event:
>>>>>>>>> StopRequestedvm's
>>>>>>>>> original host id: 224 new host id: 15 host id before state
>>>>>>>>> transition:
>>>>>>>>> 15
>>>>>>>>>
>>>>>>>>> I would try to understand first why ACS requested the hypervisor
>>>>>>>>> to stop
>>>>>>>>>
>>>>>>>> the VM. I mean the following, check the source code for conditions
>>>>>>>> that
>>>>>>>> would make ACS request the shutdown of VMs.
>>>>>>>> BTW: are you migrating within a cluster or across clusters?
>>>>>>>>
>>>>>>>> I will be able to analyze this further only after work hours, so if
>>>>>>>> you
>>>>>>>> find anything, keep me posted.
>>>>>>>>
>>>>>>>> On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
>>>>>>>> francois.scheurer@everyware.ch> wrote:
>>>>>>>>
>>>>>>>> Hello Rafael
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I think the important log lines are these:
>>>>>>>>>
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>> 86f5e
>>>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because
>>>>>>>>> state has
>>>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>> 86f5e
>>>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>>>> Migration
>>>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from
>>>>>>>>> :Running to
>>>>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new
>>>>>>>>> host
>>>>>>>>> id:
>>>>>>>>> 15 host id before state transition: 15
>>>>>>>>>
>>>>>>>>> The really disturbing thing is that CS stops the VM again if we
>>>>>>>>> start it manually, apparently because the migrating job is still
>>>>>>>>> running.
>>>>>>>>>
>>>>>>>>> Note that the VM HA Flag is enabled.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thank you.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best Regards.
>>>>>>>>>
>>>>>>>>> Francois
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> more details:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>>>>>>>>> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
>>>>>>>>> VM[User|i-638-1736-VM]
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>>>> job-14953/job-191351) Done with run of VM work job:
>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8b
>>>>>>>>> acc64
>>>>>>>>> job-14953/job-191351) Done executing
>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for
>>>>>>>>> job-191351
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>>>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>>>> job-14953/job-191351) Remove job-191351 from job monitoring
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>>>>>>>>> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c
>>>>>>>>> work-1989)
>>>>>>>>> Completed HAWork[1989-Migration-1411-Running-Migrating]
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>>>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b
>>>>>>>>> 86f5e
>>>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375:
>>>>>>>>> Forwarding
>>>>>>>>> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
>>>>>>>>> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
>>>>>>>>> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>>>>>>> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>>>>>>> ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>>>>>>> 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>>>>>>> (64-bit)","platformEmulator":"Other install
>>>>>>>>> media","bootArgs":"","enable
>>>>>>>>> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>>>>>>> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>>>>>>> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>>>>>>> rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>>>>>>> age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio"
>>>>>>>>> :"4.0","
>>>>>>>>> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>>>>>>> -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>>>>>>> stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>>>>>>> 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8
>>>>>>>>> 751b89-00f
>>>>>>>>> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>>>>>>> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>>>>>>> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>>>>>>> 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>>>>>>> ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"
>>>>>>>>> path":"d3
>>>>>>>>> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>>>>>>> :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>>>>>>> gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>>>>>>> XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>>>>>>> -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>>>>>>> storagePort":"0",
>>>>>>>>> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>>>>>>> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>>>>>>> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>>>>>>> ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>>>>>>> ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","
>>>>>>>>> id":64,"
>>>>>>>>> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>>>>>>> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>>>>>>> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>>>>>>> UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>>>>>>> DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a
>>>>>>>>> 32b-ee3e2
>>>>>>>>> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>>>>>>> ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>>>>>>> deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>>>>>>> path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>>>>>>> K","_details":{"managed":"false","storagePort":"0","stor
>>>>>>>>> ageHost":"localhos
>>>>>>>>> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>>>>>>> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>>>>>>> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>>>>>>> :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"
>>>>>>>>> PreSetup","
>>>>>>>>> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>>>>>>> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>>>>>>> -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>>>>>>>
>>>>>>>>> e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>>>>>>> ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>>>>>>> ,"volumeId":86
>>>>>>>>> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>>>>>>> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>>>>>>> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>>>>>>> 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>>>>>>> false","storagePort":"0","storageHost":"localhost","volumeSi
>>>>>>>>> ze":"42949
>>>>>>>>> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>>>>>>> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>>>>>>> id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>>>>>>> ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS 4.5
>>>>>>>>> (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>>>>>>> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>>>>>>> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>>>>>>> d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>>>>>>> a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>>>>>>> gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>>>>>>> .8.8","dns2":"8.8
>>>>>>>>> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>>>>>>> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>>>>>>> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] } to
>>>>>>>>> 345049098498
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375:
>>>>>>>>> Processing:
>>>>>>>>> {
>>>>>>>>> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
>>>>>>>>> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>>>>>>> true,"wait":0}}]
>>>>>>>>> }
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>> [c.c.a.m.AgentAttache]
>>>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No more
>>>>>>>>> commands
>>>>>>>>> found
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>>>> ctx-175d37df)
>>>>>>>>> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId:
>>>>>>>>> 345049103441,
>>>>>>>>> via:
>>>>>>>>> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>>>>>>>>>
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>> 86f5e
>>>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because
>>>>>>>>> state has
>>>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>>> 86f5e
>>>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>>>> Migration
>>>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from
>>>>>>>>> :Running to
>>>>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new
>>>>>>>>> host
>>>>>>>>> id:
>>>>>>>>> 15 host id before state transition: 15
>>>>>>>>>
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>>>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b
>>>>>>>>> 86f5e
>>>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643:
>>>>>>>>> Forwarding
>>>>>>>>> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
>>>>>>>>> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>>>>>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>>>>>>> 638-1736-VM","wait":0}}]
>>>>>>>>> } to 345049098498
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>>>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>>>> 254-800498:
>>>>>>>>> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver: v1,
>>>>>>>>> Flags:
>>>>>>>>> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>>>>>>> proxyVmId":1530,"_loadInfo":"{\n
>>>>>>>>> \"connections\": []\n}","wait":0}}] }
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>>>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>>>> 254-800498:
>>>>>>>>> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via: 254,
>>>>>>>>> Ver:
>>>>>>>>> v1,
>>>>>>>>> Flags: 100010, [{"com.cloud.agent.api.AgentCo
>>>>>>>>> ntrolAnswer":{"result":true,"wait":0}}] }
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG
>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>> (AgentManager-Handler-11:null) Seq 15-6889944479923437632:
>>>>>>>>> Processing:
>>>>>>>>> {
>>>>>>>>> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
>>>>>>>>> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>>>>>>> e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>>>>>>> ":true,"details":"Stop
>>>>>>>>> VM i-638-1735-VM Succeed","wait":0}}] }
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG
>>>>>>>>> [c.c.a.t.Request]
>>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379
>>>>>>>>> ctx-bbabfeae)
>>>>>>>>> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId:
>>>>>>>>> 345049103441,
>>>>>>>>> via:
>>>>>>>>> 15, Ver: v1, Flags: 10, { StopAnswer } }
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>> db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is
>>>>>>>>> stopped on
>>>>>>>>> the
>>>>>>>>> host. Proceeding to release resource held.
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>> supported
>>>>>>>>> in the network id=519
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>>>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>> db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Changing active number of nics
>>>>>>>>> for
>>>>>>>>> network id=519 on -1
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>> db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to
>>>>>>>>> release
>>>>>>>>> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>>>>>>> 92.168.1.11-null
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>> db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released network
>>>>>>>>> resources
>>>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>> db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released storage
>>>>>>>>> resources
>>>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM state transitted from
>>>>>>>>> :Stopping to
>>>>>>>>> Stopped with event: OperationSucceededvm's original host id: 15
>>>>>>>>> new host
>>>>>>>>> id: null host id before state transition: 15
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU: 82976
>>>>>>>>> and
>>>>>>>>> CPU
>>>>>>>>> after applying overprovisioning: 331904
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM:
>>>>>>>>> 261961927808
>>>>>>>>> and RAM after applying overprovisioning: 261961924608
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old
>>>>>>>>> used:
>>>>>>>>> 71300,reserved: 2100, actual total: 82976, total with
>>>>>>>>> overprovisioning:
>>>>>>>>> 331904; new used: 67100,reserved:6300; movedfromreserved:
>>>>>>>>> false,moveToReserveredtrue
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old
>>>>>>>>> used:
>>>>>>>>> 98784247808,reserved: 1073741824, total: 261961924608; new used:
>>>>>>>>> 94489280512,reserved:5368709120; movedfromreserved:
>>>>>>>>> false,moveToReserveredtrue
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>> supported
>>>>>>>>> in the network id=519
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>>> supported
>>>>>>>>> in the network id=519
>>>>>>>>>
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>>>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused by:
>>>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>>>>>>>> VM[User|i-638-1735-VM]
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>>>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
>>>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>>>>>>>> VM[User|i-638-1735-VM]
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379) Done with run of VM work job:
>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379,
>>>>>>>>> userId:
>>>>>>>>> 1,
>>>>>>>>> accountId: 1, instanceType: null, instanceId: null, cmd:
>>>>>>>>> com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>>>>>>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>>>>>>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>>>>>>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>>>>>>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>>>>>>> FuYWdlckltcGwAAAAAAAAA4A,
>>>>>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode:
>>>>>>>>> 0,
>>>>>>>>> result: null, initMsid: 345049098498, completeMsid: null,
>>>>>>>>> lastUpdated:
>>>>>>>>> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017}, job
>>>>>>>>> origin:13733
>>>>>>>>> ewcstack-man02-prod: com.cloud.utils.exception.Clou
>>>>>>>>> dRuntimeException:
>>>>>>>>> Unable to migrate VM[User|i-638-1735-VM]
>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.ja
>>>>>>>>> va:2405)
>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.ja
>>>>>>>>> va:4517)
>>>>>>>>> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAcc
>>>>>>>>> essor402.invoke(Unknown
>>>>>>>>> Source)
>>>>>>>>> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
>>>>>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
>>>>>>>>> e(Method.java:606)
>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
>>>>>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
>>>>>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>>>>>> erImpl.java:537)
>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.
>>>>>>>>> java:49)
>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>>>>>> ext.java:56)
>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>>>>>> nagedContext.java:103)
>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>>>>>> agedContext.java:53)
>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.ja
>>>>>>>>> va:494)
>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.Executors
>>>>>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
>>>>>>>>> k.run(FutureTask.java:262)
>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>>> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>> db528
>>>>>>>>> job-13733/job-191379) Complete async job-191379, jobStatus: FAILED,
>>>>>>>>> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>>>>>>> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>>>>>>> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>>>>>>> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>>>>>>> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>>>>>>> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>>>>>>> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>>>>>>> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>>>>>>> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>>>>>>> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>>>>>>> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>>>>>>> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>>>>>>> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>>>>>>> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>>>>>>> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>>>>>>> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>>>>>>> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>>>>>>> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>>>>>>> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>>>>>>> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>>>>>>> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>>>>>>> ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>>>>>>> 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>>>>>>> AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>>>>>>> cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>>>>>>> AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>>>>>>> YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>>>>>>> ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>>>>>>> YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>>>>>>> dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>>>>>>> dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>>>>>>> Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>>>>>>> dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>>>>>>> cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>>>>>>> TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>>>>>>> ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>>>>>>> LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>>>>>>> dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>>>>>>> ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>>>>>>> JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>>>>>>> AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>>>>>>> ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>>>>>>> YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>>>>>>> AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>>>>>>> JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>>>>>>> jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>>>>>>> bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>>>>>>> dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>>>>>>> x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>> db528
>>>>>>>>> job-13733/job-191379) Publish async job-191379 complete on message
>>>>>>>>> bus
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>> db528
>>>>>>>>> job-13733/job-191379) Wake up jobs related to job-191379
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>> db528
>>>>>>>>> job-13733/job-191379) Update db status for job-191379
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>> db528
>>>>>>>>> job-13733/job-191379) Wake up jobs joined with job-191379 and
>>>>>>>>> disjoin
>>>>>>>>> all
>>>>>>>>> subjobs created from job- 191379
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>>> db528
>>>>>>>>> job-13733/job-191379) Done executing
>>>>>>>>> com.cloud.vm.VmWorkMigrateAway for
>>>>>>>>> job-191379
>>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>>>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>>> job-13733/job-191379) Remove job-191379 from job monitoring
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>>>>>>>>
>>>>>>>>> Did you check the ACS log files looking for some exceptions when
>>>>>>>>> the
>>>>>>>>>
>>>>>>>>>> migrations for this specific VM started?
>>>>>>>>>>
>>>>>>>>>> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
>>>>>>>>>> <fr...@everyware.ch>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Dear CS contributors
>>>>>>>>>>
>>>>>>>>>> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared
>>>>>>>>>>> primary
>>>>>>>>>>> storage.
>>>>>>>>>>>
>>>>>>>>>>> When we put a host in maintenance, cs evacuates all VM's on other
>>>>>>>>>>> hosts.
>>>>>>>>>>>
>>>>>>>>>>> But it happens regularly that some VM's are stopped instead of
>>>>>>>>>>> live
>>>>>>>>>>> migrated.
>>>>>>>>>>>
>>>>>>>>>>> If we restart that stopped VM's, CS will then stop them again
>>>>>>>>>>> later.
>>>>>>>>>>>
>>>>>>>>>>> We need to start them several times until they stays up.
>>>>>>>>>>>
>>>>>>>>>>> Is it a known issue? Is it fixed on CS 4.9.2 ?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Migrating manually all VM's is working but impracticable because
>>>>>>>>>>> too
>>>>>>>>>>> much
>>>>>>>>>>> time consuming.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Many thanks in advance for your help.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Best Regards
>>>>>>>>>>> Francois
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> EveryWare AG
>>>>>>>>>>> François Scheurer
>>>>>>>>>>> Senior Systems Engineer
>>>>>>>>>>> Zurlindenstrasse 52a
>>>>>>>>>>> CH-8003 Zürich
>>>>>>>>>>>
>>>>>>>>>>> tel: +41 44 466 60 00
>>>>>>>>>>> fax: +41 44 466 60 10
>>>>>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>>>>>> web: http://www.everyware.ch
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> EveryWare AG
>>>>>>>>> François Scheurer
>>>>>>>>> Senior Systems Engineer
>>>>>>>>> Zurlindenstrasse 52a
>>>>>>>>> CH-8003 Zürich
>>>>>>>>>
>>>>>>>>> tel: +41 44 466 60 00
>>>>>>>>> fax: +41 44 466 60 10
>>>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>>>> web: http://www.everyware.ch
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>> Rafael Weingärtner
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>>
>>>>>>
>>>>>> EveryWare AG
>>>>>> François Scheurer
>>>>>> Senior Systems Engineer
>>>>>> Zurlindenstrasse 52a
>>>>>> CH-8003 Zürich
>>>>>>
>>>>>> tel: +41 44 466 60 00
>>>>>> fax: +41 44 466 60 10
>>>>>> mail: francois.scheurer@everyware.ch
>>>>>> web: http://www.everyware.ch
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>> --
>>>>
>>>>
>>>> EveryWare AG
>>>> François Scheurer
>>>> Senior Systems Engineer
>>>> Zurlindenstrasse 52a
>>>> CH-8003 Zürich
>>>>
>>>> tel: +41 44 466 60 00
>>>> fax: +41 44 466 60 10
>>>> mail: francois.scheurer@everyware.ch
>>>> web: http://www.everyware.ch
>>>>
>>>>
>>>
>>>
>>> --
>>> Rafael Weingärtner
>>>
>>
>>
>>
>> --
>> Rafael Weingärtner
>>
>>
>> --
>>
>>
>> EveryWare AG
>> François Scheurer
>> Senior Systems Engineer
>> Zurlindenstrasse 52a
>> CH-8003 Zürich
>>
>> tel: +41 44 466 60 00
>> fax: +41 44 466 60 10
>> mail: francois.scheurer@everyware.ch
>> web: http://www.everyware.ch
>>
>>
>
>
> --
> Rafael Weingärtner
>
>
> --
>
>
> EveryWare AG
> François Scheurer
> Senior Systems Engineer
> Zurlindenstrasse 52a
> CH-8003 Zürich
>
> tel: +41 44 466 60 00
> fax: +41 44 466 60 10
> mail: francois.scheurer@everyware.ch
> web: http://www.everyware.ch
>
>


-- 
Rafael Weingärtner

Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Francois Scheurer <fr...@everyware.ch>.
Dear Rafael


We temporarily disabled the HA flag on these VM's by updating the DB 
directly: update vm_instance set ha_enabled=0 where removed is null and 
id=1178 or id=1240 or id=1690 or id=1736;
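
For reference, an arguably clearer way to write the same intent would be
something like this (just a sketch, using IN to avoid any and/or precedence
ambiguity; same ids as above):

update vm_instance set ha_enabled=0
  where removed is null and id in (1178, 1240, 1690, 1736);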

We kept the test VM i-189-1177-VM with the HA flag enabled.

Restarting the MS + cleaning the op_it_work table will hopefully fix the 
issue. And if the issue persists, then only i-189-1177-VM 
should be affected, hence mitigating the impact.
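
To double check after the restart, we plan to re-run queries like these
(column names as in the tables quoted below; just a quick sketch):

select id, instance_id, step, created_at from op_it_work where step != "Done";
select id, instance_id, type, step, taken from op_ha_work where type = "Migration";

If neither table gets new rows for the problematic VM's, we will consider
the workaround successful.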

On the other hand, if everything runs fine, we would thereafter re-enable 
the HA flag on the problematic VM's.


Do you think it is safe to run this update?: update op_it_work set 
step="Done" where step!="Done";


Best Regards

Francois Scheurer



PS: BTW, have you already heard about the CloudStack fork from Schuberg 
Philis named Cosmic? https://github.com/MissionCriticalCloud/cosmic





On 06.02.2017 16:41, Rafael Weingärtner wrote:
>
> I totally understand your position of not being able to debug the ACS 
> in production ;)
>
>
> There is one bit I did not quite understand. If you have disabled the 
> HA for VMs, why do you still need to change data in the database? How 
> did you disable the HA of VMs?
>
>
> On Mon, Feb 6, 2017 at 5:04 AM, Francois Scheurer 
> <francois.scheurer@everyware.ch> wrote:
>
>     Dear Rafael / Dear CS contributors
>
>
>     I tried the following:
>
>     1.     stop all MS
>     2.     backup DB
>     3.     select * from op_ha_work where type="Migration" and
>     mgmt_server_id is null and taken is null;
>     4.     delete from op_ha_work where type="Migration" and
>     mgmt_server_id is null and taken is null;
>     5.     start all MS
>
>     But actually, after the MS stopped, the op_ha_work table was
>     automatically cleaned, making my steps useless.
>     After the MS started, the op_ha_work table was filled again and the
>     related VM's were stopped several times (3-7 times), requiring
>     close attention to manually restart them after each stop.
>
>     I later found another table that could be the cause of the issue:
>
>     mysql> select
>     op_it_work.*,vm_instance.instance_name,vm_instance.removed from
>     op_it_work join vm_instance where op_it_work.step!="Done" and
>     op_it_work.instance_id=vm_instance.id
>     order by instance_id;
>     +--------------------------------------+----------------+------------+-----------------------+-----------+--------------------+-----------+------------+-------------+---------------+-------------+---------------+---------------------+
>     | id        | mgmt_server_id | created_at | thread               
>     | type      | vm_type            | step      | updated_at |
>     instance_id | resource_type | resource_id | instance_name |
>     removed             |
>     +--------------------------------------+----------------+------------+-----------------------+-----------+--------------------+-----------+------------+-------------+---------------+-------------+---------------+---------------------+
>     | be4d0dd2-7d4f-47d3-ba49-623d2f115bbf | 345049098498 | 1436187965
>     | consoleproxy-1        | Starting  | ConsoleProxy       |
>     Release   | 1436188265 |         402 | NULL          |           0
>     | v-402-VM      | 2015-07-06 13:23:22 |
>     | fb4d1844-eb92-478c-9f50-d666790f43af | 345049098498 | 1436187965
>     | secstorage-1          | Starting  |
>     SecondaryStorageVm | Release   | 1436188265
>     |         687 | NULL
>     |           0 | s-687-VM      | 2015-07-06 13:11:34 |
>     | 7dfeb25a-a705-42e8-907c-79d1e69697ed | 345049098498 | 1436964548
>     | Work-Job-Executor-165 | Migrating |
>     User               | Migrating | 1436964548
>     |         776 | Host
>     |          29 | i-51-776-VM   | 2015-07-15 12:55:38 |
>     | 2018ac61-b11c-4101-8ef5-07a39509f2bd | 345049098498 | 1486020173
>     | Work-Job-Executor-19  | Migrating | User               |
>     Prepare   | 1486020173 |        1177 | Host          |          25
>     | i-189-1177-VM | NULL                |
>     | 25430cd5-4346-4643-9c7e-7b39b8a876a2 | 345049098122 | 1486019693
>     | Work-Job-Executor-6   | Migrating | User               |
>     Prepare   | 1486019693 |        1177 | Host          |          25
>     | i-189-1177-VM | NULL                |
>     | 3276d52e-cbd9-44c5-8ab6-c70784337ff9 | 345049098498 | 1486020953
>     | Work-Job-Executor-33  | Migrating | User               |
>     Prepare   | 1486020953 |        1177 | Host          |          33
>     | i-189-1177-VM | NULL                |
>     | 6462da33-83f2-4575-af38-0e52d2a48488 | 345049098122 | 1486021433
>     | Work-Job-Executor-12  | Migrating | User               |
>     Prepare   | 1486021433 |        1177 | Host          |          25
>     | i-189-1177-VM | NULL                |
>     | 688e16c8-c39f-4e88-9e8f-36c0282f3df1 | 345049098122 | 1486020773
>     | Work-Job-Executor-11  | Migrating | User               |
>     Prepare   | 1486020773 |        1177 | Host          |          25
>     | i-189-1177-VM | NULL                |
>     | d8a18e9d-ac6f-4273-923e-e401533e2cf7 | 345049098498 | 1486020353
>     | Work-Job-Executor-25  | Migrating | User               |
>     Prepare   | 1486020353 |        1177 | Host          |          25
>     | i-189-1177-VM | NULL                |
>     | 1222ece8-399d-43d8-8ff8-d24bdebca2ed | 345049103441 | 1486018924
>     | Work-Job-Executor-2   | Migrating | User               |
>     Prepare   | 1486018924 |        1178 | Host          |          29
>     | i-189-1178-VM | NULL                |
>     | 2ac5ca4a-c20d-433a-8379-40e0de3b3ba8 | 345049098498 | 1486021613
>     | Work-Job-Executor-41  | Migrating | User               |
>     Prepare   | 1486021613 |        1178 | Host          |          29
>     | i-189-1178-VM | NULL                |
>     | 3e3b95a0-2af4-4268-bfdd-f3399a85bcaa | 345049103441 | 1486019584
>     | Work-Job-Executor-5   | Migrating | User               |
>     Prepare   | 1486019584 |        1178 | Host          |          29
>     | i-189-1178-VM | NULL                |
>     | 53745a2d-056c-4a1b-9267-eb97f077a6c8 | 345049103441 | 1486020244
>     | Work-Job-Executor-6   | Migrating | User               |
>     Prepare   | 1486020244 |        1178 | Host          |          29
>     | i-189-1178-VM | NULL                |
>     | e3555b49-7c18-4272-b7f4-ba9351b3bbd1 | 345049098498 | 1486020893
>     | Work-Job-Executor-32  | Migrating | User               |
>     Prepare   | 1486020893 |        1178 | Host          |          29
>     | i-189-1178-VM | NULL                |
>     | 1dcd5204-4f9a-4342-a23b-329a48596e13 | 345049098498 | 1486020953
>     | Work-Job-Executor-34  | Migrating | User               |
>     Prepare   | 1486020953 |        1240 | Host          |           3
>     | i-388-1240-VM | NULL                |
>     | ec1668b3-fa3c-4673-b593-3d607b169a24 | 345049098498 | 1486019693
>     | Work-Job-Executor-14  | Migrating | User               |
>     Prepare   | 1486019693 |        1240 | Host          |           3
>     | i-388-1240-VM | NULL                |
>     | fa7b2dec-ed7f-4a67-a055-e6e6bc1631ca | 345049098498 | 1486020353
>     | Work-Job-Executor-26  | Migrating | User               |
>     Prepare   | 1486020353 |        1240 | Host          |           3
>     | i-388-1240-VM | NULL                |
>     | 8cfe6d3e-5b7b-4be8-ac10-c865a492aab9 | 345049098498 | 1465387271
>     | Work-Job-Executor-18  | Running   | User               |
>     Prepare   | 1465387271 |        1388 | Host          |           7
>     | i-361-1388-VM | NULL                |
>     | 6e941b7e-6f19-4717-bf82-974c59737dbc | 345049103441 | 1486019524
>     | Work-Job-Executor-4   | Migrating | User               |
>     Prepare   | 1486019524 |        1690 | Host          |          15
>     | i-648-1690-VM | NULL                |
>     | 718578a4-0dd6-42d9-be10-ed7680650108 | 345049098122 | 1486021433
>     | Work-Job-Executor-13  | Migrating | User               |
>     Prepare   | 1486021433 |        1690 | Host          |          15
>     | i-648-1690-VM | NULL                |
>     | 9acf3908-ef13-4b03-8c2b-1a7e55e18491 | 345049098498 | 1486020173
>     | Work-Job-Executor-20  | Migrating | User               |
>     Prepare   | 1486020173 |        1690 | Host          |          15
>     | i-648-1690-VM | NULL                |
>     | a50c420b-ae52-4caa-83db-704b52b299a4 | 345049098498 | 1486020773
>     | Work-Job-Executor-28  | Migrating | User               |
>     Prepare   | 1486020773 |        1690 | Host          |          15
>     | i-648-1690-VM | NULL                |
>     | 2b5f4179-5975-4326-9db7-643c3ede63fe | 345049103441 | 1484224516
>     | Work-Job-Executor-161 | Running   | User               |
>     Prepare   | 1484224516 |        1703 | Host          |           3
>     | i-51-1703-VM  | NULL                |
>     | 2f3c6c44-b2a0-4d10-88d2-ac143d2dd379 | 345049098498 | 1484223910
>     | Work-Job-Executor-163 | Running   | User               |
>     Prepare   | 1484223910 |        1703 | Host          |           3
>     | i-51-1703-VM  | NULL                |
>     | 4126a23b-1cde-49bc-8636-c0d063b85cf6 | 345049103441 | 1484224636
>     | Work-Job-Executor-86  | Running   | User               |
>     Prepare   | 1484224636 |        1703 | Host          |           3
>     | i-51-1703-VM  | NULL                |
>     | 5dad5645-766b-48c9-9a85-b77b3c101390 | 345049103441 | 1484224636
>     | Work-Job-Executor-92  | Running   | User               |
>     Prepare   | 1484224636 |        1703 | Host          |           3
>     | i-51-1703-VM  | NULL                |
>     | 8c1dfff9-e55d-4f81-a22f-45f72accc416 | 345049103441 | 1484224576
>     | Work-Job-Executor-95  | Running   | User               |
>     Prepare   | 1484224576 |        1703 | Host          |           3
>     | i-51-1703-VM  | NULL                |
>     | a703d2d7-b1e3-42ce-a551-831982990ffb | 345049098498 | 1484224630
>     | Work-Job-Executor-13  | Running   | User               |
>     Prepare   | 1484224630 |        1703 | Host          |           3
>     | i-51-1703-VM  | NULL                |
>     | 0e194cb5-95b7-41cb-969a-ae37d2ad449a | 345049098498 | 1485763613
>     | Work-Job-Executor-28  | Migrating | User               |
>     Prepare   | 1485763613 |        1735 | Host          |          19
>     | i-638-1735-VM | 2017-01-30 15:24:10 |
>     | 74f180b9-be0f-429b-9a45-597ca2ca997d | 345049098498 | 1485763013
>     | Work-Job-Executor-21  | Migrating | User               |
>     Prepare   | 1485763013 |        1735 | Host          |          19
>     | i-638-1735-VM | 2017-01-30 15:24:10 |
>     | feaa421c-88df-4493-9a4f-bddb6ee1b75b | 345049098122 | 1485764285
>     | Work-Job-Executor-19  | Migrating | User               |
>     Prepare   | 1485764285 |        1735 | Host          |          19
>     | i-638-1735-VM | 2017-01-30 15:24:10 |
>     | 32138a36-ad18-42b5-8478-76ad63caf4da | 345049098122 | 1486020293
>     | Work-Job-Executor-9   | Migrating | User               |
>     Prepare   | 1486020293 |        1736 | Host          |          19
>     | i-638-1736-VM | NULL                |
>     | 5306ab1f-3c92-4f95-bbff-283b43e6862d | 345049098498 | 1486019573
>     | Work-Job-Executor-11  | Migrating | User               |
>     Prepare   | 1486019573 |        1736 | Host          |          19
>     | i-638-1736-VM | NULL                |
>     | 843bdf9c-247d-432c-ac19-1a4f7858631b | 345049098498 | 1486021493
>     | Work-Job-Executor-38  | Migrating | User               |
>     Prepare   | 1486021493 |        1736 | Host          |          19
>     | i-638-1736-VM | NULL                |
>     | 8d400afb-766f-4ac5-89ee-854f25a38eba | 345049098498 | 1486020173
>     | Work-Job-Executor-21  | Migrating | User               |
>     Prepare   | 1486020173 |        1736 | Host          |          19
>     | i-638-1736-VM | NULL                |
>     | f0ae5612-b6ef-4882-94a2-40a1e6dce342 | 345049098498 | 1486020833
>     | Work-Job-Executor-30  | Migrating | User               |
>     Prepare   | 1486020833 |        1736 | Host          |          19
>     | i-638-1736-VM | NULL                |
>     +--------------------------------------+----------------+------------+-----------------------+-----------+--------------------+-----------+------------+-------------+---------------+-------------+---------------+---------------------+
>
>     As you can see, the problematic VM's are all present in that table:
>
>         i-189-1177-VM
>         i-189-1178-VM
>         i-388-1240-VM
>         i-648-1690-VM
>         i-638-1736-VM
>
>     As the MS's are running the production environments, we cannot
>     debug a lot and just need the shortest path to resolution.
>     For example, restarting the MS's has an impact on our customers,
>     so we need to keep the number of restarts to a minimum.
>
>     I have disabled the HA on all problematic VM's except on
>     i-189-1177-VM, which is less critical.
>
>     The next steps could be:
>
>     1.     stop all MS
>     2.     backup DB
>     3.     select
>     op_it_work.*,vm_instance.instance_name,vm_instance.removed from
>     op_it_work join vm_instance where op_it_work.step!="Done" and
>     op_it_work.instance_id=vm_instance.id
>     order by instance_id;
>     4.     update op_it_work set step="Done" where step!="Done";
>     5.     start all MS
>
>     What do you think about step 4., safe or risky?
>     Some of these entries are related to deleted VM's and seem
>     safe to mark as "Done".
>     All other entries are related to the problematic VM's that keep
>     stopping... so marking them as "Done" may (hopefully) eliminate
>     the cause of the Migrating/Stopping.
>     But we are not sure whether marking them as "Done" also requires
>     changes on other tables for consistency.
>
>
>     Thank you for your help.
>
>
>     Best Regards
>     Francois Scheurer
>
>
>
>
>
>     On 01.02.2017 03:01, Rafael Weingärtner wrote:
>>     Ok, I dug a little further into the code.
>>
>>         If we put a host in maintenance, it may unfortunately happen
>>         that a timeout occurs and CS aborts the migration. As far as
>>         we can interpret the logs, it seems (we speculate) that CS
>>         detects that the VM is then still running on the original
>>         host, which is in maintenance, and decides somehow to stop it
>>         because of the "host not up" status.
>>
>>
>>     From what I have seen in the code, this may happen (as you
>>     speculated). If a timeout or some other event interrupts
>>     migrations, the VM will be stopped. This happens in the method
>>     “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>     long, DeploymentPlanner)” at line 2396 and onwards. This was
>>     implemented on purpose; if the migration does not happen, the VM
>>     is stopped on the source host. Probably it was implemented this
>>     way on the gradient-descent philosophy of artificial
>>     intelligence:
>>
>>         “pick a random point in the field, update the values
>>         according to the slope and hopefully after successful
>>         iterations you converge to a minimum point.” (the problem is
>>         that sometimes the function diverges).
>>
>>
>>     I mean, probably this was done this way because the HA was
>>     supposed to see that a VM with HA is stopped, and should try to
>>     start it again somewhere else. The logs you posted do not provide
>>     any other details to debug this further because the log is after
>>     an MS restart, right?
>>
>>
>>         After restarting the MSs I guess that a similar issue may
>>         happen: the hosts are disconnected during the early stage and
>>         then reconnects. Maybe CS then again stops the VMs because
>>         the "host not up" status?
>>         Cf. 2017-01-30 08:44:06,157 DEBUG
>>         [c.c.d.DeploymentPlanningManagerImpl]
>>         (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>         ctx-6e9c753f) The last host of this VM is not UP or is not
>>         enabled, host status is: Disconnected, host resource state
>>         is: Enabled
>>
>>
>>     Here is something interesting that you have found. I checked the
>>     code, but without a live debug it is pretty hard for me to get
>>     the right flow of execution. By looking at the code and at the
>>     logs you provided, your deduction is right. After ACS boots up,
>>     the check for hosts that are shutdown/deactivated starts, and the
>>     method
>>     “com.cloud.agent.manager.AgentManagerImpl.executeUserRequest(long,
>>     Event)” is executed. The problem is that all of the hosts are
>>     considered as disconnected (ACS may not have finished connecting
>>     to all of them yet), and then a whole process of HA is kicked.
>>     The problem is that there is no host with status “UP”. That is
>>     why this “No suitable hosts found under this Cluster: 25” happens,
>>     and then the whole flow of HA, which shuts down the VM, is kicked off.
>>
>>     In theory, the HA worker should see that a VM with HA is down,
>>     and should try to start it. I am not understanding why this is
>>     not working. If you decide to debug the running ACS, I may be
>>     able to help you, just keep me posted.
>>
>>
>>     On Tue, Jan 31, 2017 at 10:26 AM, Rafael Weingärtner
>>     <rafaelweingartner@gmail.com> wrote:
>>
>>         I will be able to re-check some code and answer you with new
>>         thoughts on this problem tonight (the log files you provided
>>         today have some very insightful information regarding this
>>         problem). I believe this is a bug in HA, a very tricky one
>>         that has not been reported yet (at least I do not remember
>>         seeing anything similar). And thanks to randomness/destiny
>>         now you have found/uncovered a way to reproduce the problem
>>         (bug?!).
>>
>>         I do not suggest you changing any data at all directly in the
>>         data base (unless you know ACS code well and are 100% of what
>>         you are doing). This can cause even bigger and more
>>         complicated problems. If you remove the HA feature from this
>>         VMs, this problem will stop. However, if you can/are willing
>>         to maintain the HA until we fully uncover the problem, map
>>         and report it, this would be great.
>>         BTW: why are you restarting MSs?
>>         What are the configurations of your MSs?
>>
>>         Thanks for your time and for sharing information with our
>>         community.
>>
>>
>>
>>         On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer
>>         <francois.scheurer@everyware.ch> wrote:
>>
>>             Dear Rafael
>>
>>
>>             Many thanks for your deep analysis.
>>             Nice humor with the planets alignments BTW ;)
>>
>>             We also suspected the HA flag to be related to this issue,
>>             as we never had this issue in the past, until we enabled
>>             the HA on all VMs and Offerings.
>>
>>             Here is also the workers list:
>>
>>             mysql> select * from op_ha_work;
>>             +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>>             | id   | instance_id | type      | vm_type            |
>>             state    | mgmt_server_id | host_id | created        |
>>             tried | taken               | step  | time_to_try | updated |
>>             +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>>             |  105 |         343 | HA        | User              |
>>             Running  |   345049098498 |      29 | 2015-03-28 03:19:03
>>             | 2 | 2015-03-28 03:30:04 | Error | 1394056013 |       3 |
>>             |  109 |         407 | HA        | ConsoleProxy       |
>>             Running  |  345049098498 |      33 | 2015-03-28 03:19:03
>>             | 2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>>             |  113 |         407 | HA        | ConsoleProxy       |
>>             Running  |  345049098498 |      33 | 2015-03-28 03:19:03
>>             | 2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>>             |  117 |         407 | HA        | ConsoleProxy       |
>>             Running  |  345049098498 |      33 | 2015-03-28 03:19:03
>>             | 2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>>             |  121 |         343 | HA        | User              |
>>             Running  |   345049098498 |      29 | 2015-03-28 03:19:03
>>             | 2 | 2015-03-28 03:30:04 | Error | 1394056013 |       3 |
>>             |  125 |         405 | HA        | SecondaryStorageVm |
>>             Running  |  345049098498 |      29 | 2015-03-28 03:19:03
>>             | 2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>>             |  129 |         393 | HA        | User              |
>>             Running  |   345049098498 |      25 | 2015-03-28 03:19:03
>>             | 2 | 2015-03-28 03:30:04 | Error | 1394056014 |      30 |
>>             |  133 |         405 | HA        | SecondaryStorageVm |
>>             Running  |  345049098498 |      29 | 2015-03-28 03:19:03
>>             | 2 | 2015-03-28 03:30:04 | Error |  1394056014 |       3 |
>>             |  137 |         402 | HA        | ConsoleProxy       |
>>             Running  |  345049098498 |      25 | 2015-03-28 03:19:03
>>             | 2 | 2015-03-28 03:30:04 | Error |  1394056014 |       3 |
>>             |  345 |         402 | HA        | ConsoleProxy       |
>>             Starting |  345049098498 |      25 | 2015-07-06 13:09:26
>>             | 1 | 2015-07-06 13:09:26 | Error |  1402527506 |     770 |
>>             |  349 |         687 | HA        | SecondaryStorageVm |
>>             Starting |  345049098498 |      25 | 2015-07-06 13:09:26
>>             | 1 | 2015-07-06 13:09:26 | Error |  1402527506 |       2 |
>>             | 2057 |        1178 | Migration | User              |
>>             Running  |   345049098498 |      33 | 2017-01-31 07:56:08
>>             | 0 | 2017-01-31 07:56:08 | Done  | 1451024774 |      95 |
>>             | 2058 |        1736 | Migration | User              |
>>             Running  |   345049098498 |     224 | 2017-01-31 07:56:08
>>             | 0 | 2017-01-31 07:56:08 | Done  | 1451024774 |      45 |
>>             | 2059 |        1240 | Migration | User              |
>>             Running  |   345049098498 |       7 | 2017-01-31 07:56:08
>>             | 0 | 2017-01-31 07:56:08 | Done  | 1451024774 |      61 |
>>             | 2060 |        1178 | Migration | User              |
>>             Running  |   345049098122 |      33 | 2017-01-31 07:56:08
>>             | 1 | 2017-01-31 08:06:37 | Done  | 1451025374 |      95 |
>>             | 2061 |        1178 | Migration | User              |
>>             Stopped  |           NULL |      33 | 2017-01-31 07:56:09
>>             | 5 | NULL                | Done  | 1451027900
>>             |     109 |
>>             | 2062 |        1690 | Migration | User              |
>>             Running  |   345049098498 |     224 | 2017-01-31 07:56:09
>>             | 0 | 2017-01-31 07:56:09 | Done  | 1451024774 |     109 |
>>             | 2063 |        1177 | Migration | User              |
>>             Running  |   345049098498 |      29 | 2017-01-31 07:56:09
>>             | 0 | 2017-01-31 07:56:09 | Done  | 1451024774 |      82 |
>>             | 2064 |        1690 | Migration | User              |
>>             Running  |   345049098122 |     224 | 2017-01-31 07:56:09
>>             | 1 | 2017-01-31 07:58:18 | Done  | 1451024900 |     109 |
>>             | 2065 |        1240 | Migration | User              |
>>             Running  |   345049098122 |       7 | 2017-01-31 07:56:09
>>             | 1 | 2017-01-31 07:58:18 | Done  | 1451024900 |      61 |
>>             | 2066 |        1736 | Migration | User              |
>>             Running  |   345049098122 |     224 | 2017-01-31 07:56:09
>>             | 1 | 2017-01-31 07:58:18 | Done  | 1451024900 |      45 |
>>             | 2067 |        1240 | Migration | User              |
>>             Stopped  |           NULL |       7 | 2017-01-31 07:56:09
>>             | 5 | NULL                | Done  | 1451028003
>>             |      80 |
>>             | 2068 |        1178 | Migration | User              |
>>             Running  |   345049098122 |      33 | 2017-01-31 07:56:16
>>             | 0 | 2017-01-31 07:56:16 | Done  | 1451024781 |      95 |
>>             | 2069 |        1736 | Migration | User              |
>>             Running  |   345049098122 |     224 | 2017-01-31 07:56:16
>>             | 0 | 2017-01-31 07:56:16 | Done  | 1451024781 |      45 |
>>             | 2070 |        1240 | Migration | User              |
>>             Running  |   345049098122 |       7 | 2017-01-31 07:56:16
>>             | 0 | 2017-01-31 07:56:16 | Done  | 1451024781 |      61 |
>>             | 2071 |        1177 | Migration | User              |
>>             Running  |   345049098122 |      29 | 2017-01-31 07:56:16
>>             | 0 | 2017-01-31 07:56:16 | Done  | 1451024781 |      82 |
>>             | 2072 |        1177 | Migration | User              |
>>             Stopped  |           NULL |      29 | 2017-01-31 07:56:16
>>             | 5 | NULL                | Done  | 1451027973
>>             |     101 |
>>             | 2073 |        1177 | Migration | User              |
>>             Running  |   345049098122 |      29 | 2017-01-31 07:56:16
>>             | 1 | 2017-01-31 07:58:22 | Done  | 1451024904 |      82 |
>>             | 2074 |        1690 | Migration | User              |
>>             Stopped  |           NULL |     224 | 2017-01-31 07:56:16
>>             | 5 | NULL                | Done  | 1451027464
>>             |     123 |
>>             | 2075 |        1736 | Migration | User              |
>>             Stopped  |           NULL |     224 | 2017-01-31 07:56:16
>>             | 5 | NULL                | Done  | 1451027477
>>             |      59 |
>>             +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>>
>>             If we put a host in maintenance, it may unfortunately
>>             happen that a timeout occurs and CS aborts the migration.
>>             As far as we can interpret the logs, it seems (we
>>             speculate) that CS detects that the VM is then still
>>             running on the original host, which is in maintenance,
>>             and decides somehow to stop it because of the "host not
>>             up" status.
>>             We then need to start the VM several times, until CS
>>             gives up stopping it.
>>
>>             After restarting the MSs I guess that a similar issue may
>>             happen: the hosts are disconnected during the early stage
>>             and then reconnect. Maybe CS then again stops the VMs
>>             because of the "host not up" status?
>>             Cf. 2017-01-30 08:44:06,157 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>             ctx-6e9c753f) The last host of this VM is not UP or is
>>             not enabled, host status is: Disconnected, host resource
>>             state is: Enabled
>>
>>             But the problem after the MSs restart is definitely only
>>             impacting VMs listed in the op_ha_work table.
>>
>>
>>             To your questions:
>>                     We have 3 ACS MSs managing several Zones. Each
>>             zone has a single pod and cluster.
>>                     The Cluster of i-638-1735-VM has this summary:
>>                      Total number of VMs: 121
>>                      Total number of hypervisors: 4
>>                      Total allocated RAM: 522.0 / 972 GB (53.7 %)
>>                      Total allocated cores: 211
>>                      Total allocated storage: 11285 GB
>>                     Used Memory on the hosts is now varying from 28
>>             to 77%, so there is enough RAM.
>>                     I did not check the XS logs. The same problem has
>>             arisen again this morning (shortly after a restart of all
>>             MSs), so I will look in the XS logs now.
>>
>>                     I have some fair experience with Eclipse and
>>             Java, I will ask my boss if we can go this way.
>>
>>             The main issue for now is that restarting the MSs is
>>             reproducing the problem, which is quite annoying for the
>>             impacted VM's.
>>             I thought of stopping the MSs, deleting all HA/Migration
>>             entries in op_ha_work and starting the MSs.
>>             Is that a viable option? (or should I delete the
>>             Migration entries only? or never touch this table?)
>>             We will also very probably remove the HA Flag on all VMs
>>             if it does not work reliably.
>>
>>             Best Regards
>>             Francois Scheurer
>>
>>
>>
>>             Here are new logs regarding the same issue of today.
>>             I tried to get the important parts related to the stop of
>>             i-189-1178-VM.
>>
>>             regexp:
>>             (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRequestedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae733|work-2049)
>>
>>             2017-01-30 08:44:04,440 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>>             Executing AsyncJobVO {id:192989, userId: 1, accountId: 1,
>>             instanceType: null, instanceId: null, cmd:
>>             com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>             rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>>             F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>             cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resu
>>             ltCode: 0, result: null, initMsid: 345049103441,
>>             completeMsid: null, lastUpdated: null, lastPolled: null,
>>             created: Mon Jan 30 08:44:03 CET 2017}
>>             2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>>             Run VM work job: com.cloud.vm.VmWorkMigrateAway for VM
>>             1178, job origin: 15393
>>
>>             2017-01-30 08:44:05,217 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Deploy avoids pods: [], clusters: [],
>>             hosts: [29]
>>             2017-01-30 08:44:05,220 DEBUG
>>             [c.c.a.m.ClusteredAgentManagerImpl]
>>             (AgentConnectTaskPool-4:ctx-5b61f2e5) create
>>             ClusteredAgentAttache for 232
>>             2017-01-30 08:44:05,220 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) DeploymentPlanner allocation algorithm:
>>             com.cloud.deploy.UserDispersingPlanner@1734b2e2
>>             2017-01-30 08:44:05,220 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Trying to allocate a host and storage pools
>>             from dc:17, pod:25,cluster:25, requested cpu: 4200,
>>             requested ram: 4294967296
>>             2017-01-30 08:44:05,220 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Is ROOT volume READY (pool already
>>             allocated)?: Yes
>>             2017-01-30 08:44:05,220 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) This VM has last host_id specified, trying
>>             to choose the same host: 25
>>
>>             2017-01-30 08:44:05,614 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) The last host of this VM is not UP or is
>>             not enabled, host status is: Disconnected, host resource
>>             state is: Enabled
>>             2017-01-30 08:44:05,614 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Cannot choose the last host to deploy this VM
>>             2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Searching resources only under specified
>>             Cluster: 25
>>
>>             2017-01-30 08:44:05,777 DEBUG
>>             [c.c.a.m.a.i.FirstFitAllocator]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea FirstFitRoutingAllocator) Found 0 hosts for
>>             allocation after prioritization: []
>>             2017-01-30 08:44:05,777 DEBUG
>>             [c.c.a.m.a.i.FirstFitAllocator]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea FirstFitRoutingAllocator) Looking for
>>             speed=4200Mhz, Ram=4096
>>             2017-01-30 08:44:05,777 DEBUG
>>             [c.c.a.m.a.i.FirstFitAllocator]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea FirstFitRoutingAllocator) Host Allocator
>>             returning 0 suitable hosts
>>             2017-01-30 08:44:05,777 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) No suitable hosts found
>>             2017-01-30 08:44:05,777 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) No suitable hosts found under this Cluster: 25
>>             2017-01-30 08:44:05,787 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Could not find suitable Deployment
>>             Destination for this VM under any clusters, returning.
>>             2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Searching resources only under specified
>>             Cluster: 25
>>             2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) The specified cluster is in avoid set,
>>             returning.
>>             2017-01-30 08:44:05,798 DEBUG
>>             [c.c.v.VirtualMachineManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Unable to find destination for migrating
>>             the vm VM[User|i-189-1178-VM]
>>             2017-01-30 08:44:05,798 WARN
>>             [c.c.v.VirtualMachineManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Failed to deploy vm 1178 with original
>>             planner, sending HAPlanner
>>
>>             2017-01-30 08:44:05,985 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Deploy avoids pods: [], clusters: [],
>>             hosts: [29]
>>             2017-01-30 08:44:05,985 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) DeploymentPlanner allocation algorithm:
>>             com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>>             2017-01-30 08:44:05,985 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Trying to allocate a host and storage pools
>>             from dc:17, pod:25,cluster:25, requested cpu: 4200,
>>             requested ram: 4294967296
>>             2017-01-30 08:44:05,985 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Is ROOT volume READY (pool already
>>             allocated)?: Yes
>>             2017-01-30 08:44:05,985 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) This VM has last host_id specified, trying
>>             to choose the same host: 25
>>
>>             2017-01-30 08:44:06,022 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) The last host of this VM is not UP or is
>>             not enabled, host status is: Disconnected, host resource
>>             state is: Enabled
>>             2017-01-30 08:44:06,022 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Cannot choose the last host to deploy this VM
>>             2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Searching resources only under specified
>>             Cluster: 25
>>             2017-01-30 08:44:06,039 DEBUG
>>             [c.c.d.SkipHeuresticsPlanner]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Deploying vm during HA process, so skipping
>>             disable threshold check
>>
>>             2017-01-30 08:44:06,151 DEBUG
>>             [c.c.a.m.a.i.FirstFitAllocator]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea FirstFitRoutingAllocator) Found 0 hosts for
>>             allocation after prioritization: []
>>             2017-01-30 08:44:06,151 DEBUG
>>             [c.c.a.m.a.i.FirstFitAllocator]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea FirstFitRoutingAllocator) Looking for
>>             speed=4200Mhz, Ram=4096
>>             2017-01-30 08:44:06,151 DEBUG
>>             [c.c.a.m.a.i.FirstFitAllocator]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea FirstFitRoutingAllocator) Host Allocator
>>             returning 0 suitable hosts
>>             2017-01-30 08:44:06,151 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) No suitable hosts found
>>             2017-01-30 08:44:06,151 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) No suitable hosts found under this Cluster: 25
>>             2017-01-30 08:44:06,157 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986)
>>             Publish async job-192986 complete on message bus
>>             2017-01-30 08:44:06,157 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986)
>>             Wake up jobs related to job-192986
>>             2017-01-30 08:44:06,157 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986)
>>             Update db status for job-192986
>>             2017-01-30 08:44:06,157 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>             ctx-6e9c753f) The last host of this VM is not UP or is
>>             not enabled, host status is: Disconnected, host resource
>>             state is: Enabled
>>             2017-01-30 08:44:06,157 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>             ctx-6e9c753f) Cannot choose the last host to deploy this VM
>>             2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner]
>>             (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>             ctx-6e9c753f) Searching resources only under specified
>>             Cluster: 35
>>             2017-01-30 08:44:06,160 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Could not find suitable Deployment
>>             Destination for this VM under any clusters, returning.
>>             2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Searching resources only under specified
>>             Cluster: 25
>>             2017-01-30 08:44:06,163 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue
>>             item: SyncQueueItemVO {id:63629, queueId: 64423,
>>             contentType: AsyncJob, contentId: 192991,
>>             lastProcessMsid: 345049098498, lastprocessNumber: 27,
>>             lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created:
>>             Mon Jan 30 08:44:05 CET 2017}
>>             2017-01-30 08:44:06,169 INFO
>>             [o.a.c.f.j.i.AsyncJobMonitor]
>>             (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992)
>>             Add job-192992 into job monitoring
>>             2017-01-30 08:44:06,184 DEBUG
>>             [c.c.d.SkipHeuresticsPlanner]
>>             (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>             ctx-6e9c753f) Deploying vm during HA process, so skipping
>>             disable threshold check
>>             2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) The specified cluster is in avoid set,
>>             returning.
>>             2017-01-30 08:44:06,186 DEBUG
>>             [c.c.v.VirtualMachineManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Unable to find destination for migrating
>>             the vm VM[User|i-189-1178-VM]
>>             2017-01-30 08:44:06,186 ERROR
>>             [c.c.v.VmWorkJobHandlerProxy]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Invocation exception, caused by:
>>             com.cloud.exception.InsufficientServerCapacityException:
>>             Unable to find a server to migrate to.Scope=interface
>>             com.cloud.org.Cluster; id=25
>>             2017-01-30 08:44:06,187 INFO
>>             [c.c.v.VmWorkJobHandlerProxy]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>>             ctx-ea83d6ea) Rethrow exception
>>             com.cloud.exception.InsufficientServerCapacityException:
>>             Unable to find a server to migrate to.Scope=interface
>>             com.cloud.org.Cluster; id=25
>>             2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>>             Done with run of VM work job:
>>             com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
>>             2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>>             Unable to complete AsyncJobVO {id:192989, userId: 1,
>>             accountId: 1, instanceType: null, instanceId: null, cmd:
>>             com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>             rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>             cmdVersion: 0, status: IN_PROGRESS, processStatus: 0,
>>             resultCode: 0, result: null, initMsid: 345049103441,
>>             completeMsid: null, lastUpdated: null, lastPolled: null,
>>             created: Mon Jan 30 08:44:03 CET 2017}, job origin:15393
>>             com.cloud.exception.InsufficientServerCapacityException:
>>             Unable to find a server to migrate to.Scope=interface
>>             com.cloud.org.Cluster; id=25
>>                     at
>>             com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2390)
>>                     at
>>             com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4520)
>>                     at
>>             sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>                     at
>>             sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>                     at
>>             sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>                     at java.lang.reflect.Method.invoke(Method.java:606)
>>                     at
>>             com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>                     at
>>             com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>                     at
>>             com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:103)
>>                     at
>>             org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
>>                     at
>>             org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>                     at
>>             org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>>                     at
>>             org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>>                     at
>>             org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>>                     at
>>             org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>                     at
>>             org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>                     at
>>             java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>                     at
>>             java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>                     at
>>             java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>                     at
>>             java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>                     at java.lang.Thread.run(Thread.java:745)
>>             2017-01-30 08:44:06,188 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued
>>             job-192991
>>             2017-01-30 08:44:06,192 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986)
>>             Wake up jobs joined with job-192986 and disjoin all
>>             subjobs created from job- 192986
>>             2017-01-30 08:44:06,196 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>>             Complete async job-192989, jobStatus: FAILED, resultCode:
>>             0, result:
>>             rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZlckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vwc0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRDYXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5nL0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xvdWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vycm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4aOxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNhdXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQAEkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFuZy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QAEExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBzZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFjZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFyaW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2VySW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hlc3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EAfgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNxAH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5yZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhqYXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EAfgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pvYnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNsb3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRjaGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1wbCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRleHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdlZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2FsbHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5jb250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxsV2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5jb25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRvcnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdUaHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29ya2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAALVGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5tb2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEudXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMAAAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAFdmFs
dWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-omaW2FAIAAHhwAA
>>             2017-01-30 08:44:06,201 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>             ctx-4c93bc31) Deploy avoids pods: [], clusters: [],
>>             hosts: [3]
>>             2017-01-30 08:44:06,201 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>             ctx-4c93bc31) DeploymentPlanner allocation algorithm:
>>             com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>>             2017-01-30 08:44:06,201 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>             ctx-4c93bc31) Trying to allocate a host and storage pools
>>             from dc:14, pod:17,cluster:17, requested cpu: 4200,
>>             requested ram: 4294967296
>>             2017-01-30 08:44:06,201 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>             ctx-4c93bc31) Is ROOT volume READY (pool already
>>             allocated)?: Yes
>>             2017-01-30 08:44:06,201 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>             ctx-4c93bc31) This VM has last host_id specified, trying
>>             to choose the same host: 213
>>             2017-01-30 08:44:06,213 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>>             Publish async job-192989 complete on message bus
>>             2017-01-30 08:44:06,213 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>>             Wake up jobs related to job-192989
>>             2017-01-30 08:44:06,213 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>>             Update db status for job-192989
>>             2017-01-30 08:44:06,219 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>             ctx-6e9c753f) Checking resources in Cluster: 35 under Pod: 31
>>             2017-01-30 08:44:06,227 DEBUG
>>             [c.c.a.m.a.i.FirstFitAllocator]
>>             (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>             ctx-6e9c753f FirstFitRoutingAllocator) Looking for hosts
>>             in dc: 23  pod:31  cluster:35
>>             2017-01-30 08:44:06,229 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992)
>>             Executing AsyncJobVO {id:192992, userId: 1, accountId: 1,
>>             instanceType: null, instanceId: null, cmd:
>>             com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>             rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw,
>>             cmdVersion: 0, status: IN_PROGRESS, processStatus: 0,
>>             resultCode: 0, result: null, initMsid: 345049098122,
>>             completeMsid: null, lastUpdated: null, lastPolled: null,
>>             created: Mon Jan 30 08:44:05 CET 2017}
>>             2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher]
>>             (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992)
>>             Run VM work job: com.cloud.vm.VmWorkMigrateAway for VM
>>             1729, job origin: 14953
>>             2017-01-30 08:44:06,230 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue
>>             item: SyncQueueItemVO {id:63630, queueId: 67180,
>>             contentType: AsyncJob, contentId: 192990,
>>             lastProcessMsid: 345049098498, lastprocessNumber: 61,
>>             lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created:
>>             Mon Jan 30 08:44:05 CET 2017}
>>             2017-01-30 08:44:06,244 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>             ctx-4c93bc31) The last host of this VM is not UP or is
>>             not enabled, host status is: Disconnected, host resource
>>             state is: Enabled
>>             2017-01-30 08:44:06,262 DEBUG
>>             [c.c.d.DeploymentPlanningManagerImpl]
>>             (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>             ctx-4c93bc31) Cannot choose the last host to deploy this VM
>>             2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner]
>>             (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>>             ctx-4c93bc31) Searching resources only under specified
>>             Cluster: 17
>>             2017-01-30 08:44:06,262 INFO
>>             [o.a.c.f.j.i.AsyncJobMonitor]
>>             (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991)
>>             Add job-192991 into job monitoring
>>             2017-01-30 08:44:06,246 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986)
>>             Done executing com.cloud.vm.VmWorkMigrateAway for job-192986
>>             2017-01-30 08:44:06,244 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>>             Wake up jobs joined with job-192989 and disjoin all
>>             subjobs created from job- 192989
>>             2017-01-30 08:44:06,269 INFO
>>             [o.a.c.f.j.i.AsyncJobMonitor]
>>             (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986)
>>             Remove job-192986 from job monitoring
>>             2017-01-30 08:44:06,279 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>>             Done executing com.cloud.vm.VmWorkMigrateAway for job-192989
>>             2017-01-30 08:44:06,305 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991)
>>             Executing AsyncJobVO {id:192991, userId: 1, accountId: 1,
>>             instanceType: null, instanceId: null, cmd:
>>             com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>             rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw,
>>             cmdVersion: 0, status: IN_PROGRESS, processStatus: 0,
>>             resultCode: 0, result: null, initMsid: 345049098122,
>>             completeMsid: null, lastUpdated: null, lastPolled: null,
>>             created: Mon Jan 30 08:44:05 CET 2017}
>>             2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher]
>>             (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991)
>>             Run VM work job: com.cloud.vm.VmWorkMigrateAway for VM
>>             1605, job origin: 14965
>>             2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl]
>>             (AgentManager-Handler-13:null) Ping from 258
>>             2017-01-30 08:44:06,319 DEBUG
>>             [c.c.a.m.a.i.FirstFitAllocator]
>>             (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>             ctx-6e9c753f FirstFitRoutingAllocator) List of hosts in
>>             ascending order of number of VMs: [224, 19, 21, 15]
>>             2017-01-30 08:44:06,322 DEBUG
>>             [c.c.a.m.a.i.FirstFitAllocator]
>>             (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>>             ctx-6e9c753f FirstFitRoutingAllocator) FirstFitAllocator
>>             has 0 hosts to check for allocation: []
>>             2017-01-30 08:44:06,322 INFO [c.c.a.m.AgentManagerImpl]
>>             (AgentManager-Handler-13:null) PingMap for agent: 258
>>             will not be updated because agent is no longer in the PingMap
>>             2017-01-30 08:44:06,326 INFO
>>             [o.a.c.f.j.i.AsyncJobMonitor]
>>             (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>>             Remove job-192989 from job monitoring
>>
>>             2017-01-30 08:54:38,139 INFO
>>             [c.c.v.VirtualMachineManagerImpl]
>>             (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>             ctx-052eb4f5) Migration cancelled because state has
>>             changed: VM[User|i-189-1178-VM]
>>             2017-01-30 08:54:38,139 DEBUG
>>             [c.c.v.VirtualMachineManagerImpl]
>>             (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>             ctx-052eb4f5) Unable to migrate VM due to: Migration
>>             cancelled because state has changed: VM[User|i-189-1178-VM]
>>             2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl]
>>             (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>             ctx-052eb4f5) VM state transitted from :Running to
>>             Stopping with event: StopRequestedvm's original host id:
>>             29 new host id: 33 host id before state transition: 33
>>             2017-01-30 08:54:38,177 DEBUG
>>             [c.c.a.m.DirectAgentAttache]
>>             (DirectAgent-325:ctx-e540d7f5) Seq
>>             19-2355664080091480198: Response Received:
>>             2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>             (DirectAgent-325:ctx-e540d7f5) Seq
>>             19-2355664080091480198: Processing:  { Ans: , MgmtId:
>>             345049098498, via: 19, Ver: v1, Flags: 10,
>>             [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,"result":true,"details":"","wait":0}}]
>>             }
>>             2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>             (RouterMonitor-1:ctx-2e80f405) Seq
>>             19-2355664080091480198: Received:  { Ans: , MgmtId:
>>             345049098498, via: 19, Ver: v1, Flags: 10, {
>>             NetworkUsageAnswer } }
>>             2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl]
>>             (RouterMonitor-1:ctx-2e80f405) Details from executing
>>             class com.cloud.agent.api.NetworkUsageCommand:
>>             2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>             (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>             ctx-052eb4f5) Seq 33-1917970491306410035: Sending  { Cmd
>>             , MgmtId: 345049098498, via: 33(ewcstack-vh003-prod),
>>             Ver: v1, Flags: 100011,
>>             [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>             }
>>             2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>             (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>>             ctx-052eb4f5) Seq 33-1917970491306410035: Executing:  {
>>             Cmd , MgmtId: 345049098498, via: 33(ewcstack-vh003-prod),
>>             Ver: v1, Flags: 100011,
>>             [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>             }
>>             2017-01-30 08:54:38,178 DEBUG
>>             [c.c.a.m.DirectAgentAttache]
>>             (DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035:
>>             Executing request
>>             2017-01-30 08:54:38,202 DEBUG
>>             [c.c.a.m.DirectAgentAttache]
>>             (DirectAgent-326:ctx-4fb88aa7) Seq
>>             21-1035546439318503505: Executing request
>>
>>             2017-01-30 09:05:55,628 INFO
>>             [c.c.h.HighAvailabilityManagerImpl]
>>             (HA-Worker-4:ctx-36aae733 work-2049) Processing
>>             HAWork[2049-Migration-1178-Stopped-Migrating]
>>             2017-01-30 09:05:55,633 WARN
>>             [o.a.c.f.j.AsyncJobExecutionContext]
>>             (HA-Worker-4:ctx-36aae733 work-2049) Job is executed
>>             without a context, setup psudo job for the executing thread
>>             2017-01-30 09:05:55,645 DEBUG
>>             [o.a.c.f.j.i.AsyncJobManagerImpl]
>>             (HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045
>>             execution on object VmWorkJobQueue.1178
>>
>>
>>
>>
>>             On 31.01.2017 00:42, Rafael Weingärtner wrote:
>>
>>                 This seems to be one of those nasty bugs that are
>>                 very hard to debug. It seems to be caused by a
>>                 combination of different factors, such as planetary
>>                 alignments ;)
>>
>>                 At class “VirtualMachineManagerImpl” line 2395 (method
>>                 “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>                 long, DeploymentPlanner)”) is the start of the code flow
>>                 that triggers the migration. The point here is that, if
>>                 an error occurs during the migration, the VM will be
>>                 stopped; for some reason this is done only for HA VMs.
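>>
>>                 Just to illustrate the idea (this is a rough, hypothetical
>>                 sketch with placeholder names, not the actual CloudStack
>>                 code), the flow I am describing boils down to something
>>                 like:
>>
>>                 // Hypothetical sketch of the migrate-away behaviour described
>>                 // above; class and method names are placeholders.
>>                 public class MigrateAwaySketch {
>>
>>                     interface Vm {
>>                         boolean isHaEnabled();
>>                         String name();
>>                     }
>>
>>                     static class MigrationFailedException extends Exception {
>>                         MigrationFailedException(String msg) { super(msg); }
>>                     }
>>
>>                     // Try to move the VM off the host; if that fails and the VM
>>                     // has the HA flag set, fall back to stopping it (which matches
>>                     // the "VM state transitted from :Running to Stopping" entries).
>>                     void migrateAway(Vm vm) {
>>                         try {
>>                             findDestinationAndMigrate(vm);
>>                         } catch (MigrationFailedException e) {
>>                             if (vm.isHaEnabled()) {
>>                                 stop(vm); // the HA worker is expected to start it again later
>>                             } else {
>>                                 System.out.println("Migration of " + vm.name()
>>                                         + " failed, leaving it running");
>>                             }
>>                         }
>>                     }
>>
>>                     void findDestinationAndMigrate(Vm vm) throws MigrationFailedException {
>>                         // placeholder: deployment planning and live migration go here
>>                         throw new MigrationFailedException("Unable to find a server to migrate to");
>>                     }
>>
>>                     void stop(Vm vm) {
>>                         System.out.println("Stopping " + vm.name());
>>                     }
>>                 }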
>>
>>                 Going a little bit further, we find the point that is
>>                 triggering the error in the method
>>                 “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO,
>>                 long, DeployDestination)” at line 1941. This line calls
>>                 the method
>>                 “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>>                 Event, Long, ItWorkVO, Step)” to change the VM state to
>>                 migrating/preparing migration/or something like that (I am
>>                 guessing here), but for some reason the state of the VM is
>>                 not the one expected, and this causes a
>>                 “ConcurrentOperationException” to be thrown. The problem
>>                 is that
>>                 “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>>                 Event, Long, ItWorkVO, Step)” does not seem to produce
>>                 many useful log entries.
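>>
>>                 To make that concrete, the kind of guarded transition I
>>                 have in mind looks roughly like this (again a hypothetical
>>                 sketch, not the actual changeState() implementation):
>>
>>                 // Hypothetical sketch of a guarded state transition; the real
>>                 // VirtualMachineManagerImpl.changeState() is considerably more involved.
>>                 public class StateTransitionSketch {
>>
>>                     enum State { RUNNING, MIGRATING, STOPPING, STOPPED }
>>
>>                     static class ConcurrentOperationException extends Exception {
>>                         ConcurrentOperationException(String msg) { super(msg); }
>>                     }
>>
>>                     static class VmRecord {
>>                         State state = State.RUNNING;
>>                     }
>>
>>                     // Move to MIGRATING only if the VM is still RUNNING; if another
>>                     // job (for example an HA worker) changed the state in the
>>                     // meantime, give up instead of racing with it; that would
>>                     // explain the "Migration cancelled because state has changed"
>>                     // messages in your logs.
>>                     static void changeToMigrating(VmRecord vm) throws ConcurrentOperationException {
>>                         synchronized (vm) {
>>                             if (vm.state != State.RUNNING) {
>>                                 throw new ConcurrentOperationException(
>>                                         "Migration cancelled because state has changed: " + vm.state);
>>                             }
>>                             vm.state = State.MIGRATING;
>>                         }
>>                     }
>>                 }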
>>
>>                 Just to enable us to discuss the problem a little
>>                 further, could you tell us a little bit about your
>>                 environment setup? How many ACS MSs do you have? What is
>>                 the VM workload being migrated (I mean, the number of VMs
>>                 and their computing resources)? Do the other servers have
>>                 enough computing resources to support these VMs? Could
>>                 you also check the XenServer logs to see what happened on
>>                 both the source and destination hosts when the migration
>>                 for VM “i-638-1735-VM” was triggered?
>>
>>                 Do you have the knowledge/skills to use an IDE (e.g.
>>                 Eclipse) to connect to a running JVM and debug the code?
>>                 For me it will not be possible to go further without
>>                 looking at the variables in a running environment.
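>>
>>                 If it helps, attaching Eclipse remotely is usually just a
>>                 matter of adding the standard JDWP agent to the management
>>                 server's Java options and restarting it. Something along
>>                 these lines (the file location is an assumption for a
>>                 packaged 4.5 install, adjust it to your setup):
>>
>>                 # e.g. in /etc/cloudstack/management/tomcat6.conf (path may differ)
>>                 JAVA_OPTS="$JAVA_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8787"
>>
>>                 Then create a "Remote Java Application" debug configuration
>>                 in Eclipse pointing at that host and port, with the
>>                 CloudStack sources on the source path.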
>>
>>
>>                 On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
>>                 francois.scheurer@everyware.ch
>>                 <ma...@everyware.ch>> wrote:
>>
>>                     Dear Rafael
>>
>>
>>                     Sorry for the late answer (was sick a couple of
>>                     days).
>>
>>                     I confirm we are migrating inside the same
>>                     xenserver pool (within a CS
>>                     Cluster).
>>
>>
>>                     Best Regards
>>                     Francois
>>
>>
>>
>>                     On 26.01.2017 04:35, Rafael Weingärtner wrote:
>>
>>                         Found the code that is executed.
>>                         I need you to confirm if you are migrating
>>                         within a cluster or across
>>                         clusters?
>>
>>                         On Wed, Jan 25, 2017 at 11:57 AM, Rafael
>>                         Weingärtner <
>>                         rafaelweingartner@gmail.com
>>                         <ma...@gmail.com>> wrote:
>>
>>                         I do not have access to ACS code right now,
>>                         but I suggest starting
>>
>>                             debugging here:
>>
>>                             2017-01-24 18:05:28,427 DEBUG
>>                             [c.c.c.CapacityManagerImpl]
>>
>>                                 (Work-Job-Executor-156:ctx-14b86f5e
>>                                 job-60216/job-191380 ctx-175d37df)
>>                                 VM state transitted from :Running to
>>                                 Stopping with event:
>>                                 StopRequestedvm's
>>                                 original host id: 224 new host id: 15
>>                                 host id before state transition:
>>                                 15
>>
>>                                 I would try to understand first why
>>                                 ACS requested the hypervisor to stop
>>
>>                             the VM. I mean the following, check the
>>                             source code for conditions that
>>                             would make ACS request the shutdown of VMs.
>>                             BTW: are you migrating within a cluster
>>                             or across clusters?
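>>
>>                             For example, grepping the source tree for the
>>                             messages that show up right before the stop
>>                             should point you at the relevant code paths
>>                             (run from the root of the CloudStack sources):
>>
>>                             grep -rn "Migration cancelled because state has changed" .
>>                             grep -rn "Unable to migrate VM due to" .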
>>
>>                             I will be able to analyze this further
>>                             only after work hours, so if you
>>                             find anything, keep me posted.
>>
>>                             On Wed, Jan 25, 2017 at 11:45 AM,
>>                             Francois Scheurer <
>>                             francois.scheurer@everyware.ch
>>                             <ma...@everyware.ch>>
>>                             wrote:
>>
>>                             Hello Rafael
>>
>>
>>                                 I think the important log lines are
>>                                 these:
>>
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,407 INFO
>>                                 [c.c.v.VirtualMachineManagerImpl]
>>                                 (Work-Job-Executor-156:ctx-14b86f5e
>>                                 job-60216/job-191380 ctx-175d37df)
>>                                 Migration cancelled because state has
>>                                 changed: VM[User|i-638-1736-VM]
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,407 DEBUG
>>                                 [c.c.v.VirtualMachineManagerImpl]
>>                                 (Work-Job-Executor-156:ctx-14b86f5e
>>                                 job-60216/job-191380 ctx-175d37df)
>>                                 Unable to migrate VM due to:
>>                                 Migration
>>                                 cancelled because state has changed:
>>                                 VM[User|i-638-1736-VM]
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,427 DEBUG
>>                                 [c.c.c.CapacityManagerImpl]
>>                                 (Work-Job-Executor-156:ctx-14b86f5e
>>                                 job-60216/job-191380 ctx-175d37df) VM
>>                                 state transitted from :Running to
>>                                 Stopping with event:
>>                                 StopRequestedvm's original host id:
>>                                 224 new host
>>                                 id:
>>                                 15 host id before state transition: 15
>>
>>                                 The really disturbing thing is that
>>                                 CS stops the VM again if we start it
>>                                 manually, apparently because the
>>                                 migrating job is still running.
>>
>>                                 Note that the VM HA flag is enabled.
>>
>>
>>
>>                                 Thank you.
>>
>>
>>                                 Best Regards.
>>
>>                                 Francois
>>
>>
>>
>>
>>
>>
>>                                 more details:
>>
>>
>>
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,148 DEBUG
>>                                 [o.a.c.e.o.VolumeOrchestrator]
>>                                 (Work-Job-Executor-156:ctx-14b86f5e
>>                                 job-60216/job-191380 ctx-175d37df)
>>                                 Preparing 3 volumes for
>>                                 VM[User|i-638-1736-VM]
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,155 DEBUG
>>                                 [c.c.v.VmWorkJobDispatcher]
>>                                 (Work-Job-Executor-122:ctx-a8bacc64
>>                                 job-14953/job-191351) Done with run
>>                                 of VM work job:
>>                                 com.cloud.vm.VmWorkMigrateAway for VM
>>                                 1411, job origin: 14953
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,155 DEBUG
>>                                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>                                 (Work-Job-Executor-122:ctx-a8bacc64
>>                                 job-14953/job-191351) Done executing
>>                                 com.cloud.vm.VmWorkMigrateAway for
>>                                 job-191351
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,160 INFO
>>                                 [o.a.c.f.j.i.AsyncJobMonitor]
>>                                 (Work-Job-Executor-122:ctx-a8bacc64
>>                                 job-14953/job-191351) Remove
>>                                 job-191351 from job monitoring
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,168 INFO
>>                                 [c.c.h.HighAvailabilityManagerImpl]
>>                                 (HA-Worker-2:ctx-f3bcf43c
>>                                 work-1989)
>>                                 Completed
>>                                 HAWork[1989-Migration-1411-Running-Migrating]
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,246 DEBUG
>>                                 [c.c.a.m.ClusteredAgentAttache]
>>                                 (Work-Job-Executor-156:ctx-14b86f5e
>>                                 job-60216/job-191380 ctx-175d37df)
>>                                 Seq 19-8526158519542516375:
>>                                 Forwarding
>>                                 Seq 19-8526158519542516375:  { Cmd ,
>>                                 MgmtId: 345049103441, via:
>>                                 19(ewcstack-vh010-prod), Ver: v1,
>>                                 Flags: 100111, [{
>>                                 "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>                                 :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>                                 ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>                                 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>                                 (64-bit)","platformEmulator":"Other
>>                                 install
>>                                 media","bootArgs":"","enable
>>                                 HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>                                 "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>                                 ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>                                 rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>                                 age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
>>                                 hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>                                 -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>                                 stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>                                 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>                                 apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f
>>                                 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>                                 t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>                                 ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>                                 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>                                 ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3
>>                                 f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>                                 :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>                                 gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>                                 XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>                                 -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>                                 storagePort":"0",
>>                                 "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>                                 a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>                                 ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>                                 ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>                                 ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
>>                                 poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>                                 f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>                                 ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>                                 UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>                                 DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2
>>                                 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>                                 ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>                                 deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>                                 path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>                                 K","_details":{"managed":"false","storagePort":"0","stor
>>                                 ageHost":"localhos
>>                                 t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>                                 tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>                                 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>                                 apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>                                 :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup","
>>                                 host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>                                 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>                                 -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>
>>                                 e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>                                 ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>                                 ,"volumeId":86
>>                                 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>                                 provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>                                 ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>                                 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>                                 false","storagePort":"0","storageHost":"localhost","volumeSize":"42949
>>                                 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>                                 teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>                                 id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>                                 ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS
>>                                 4.5
>>                                 (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>                                 ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>                                 Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>                                 d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>                                 a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>                                 gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>                                 .8.8","dns2":"8.8
>>                                 .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>                                 vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>                                 abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}]
>>                                 } to
>>                                 345049098498
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,399 DEBUG [c.c.a.t.Request]
>>                                 (AgentManager-Handler-10:null) Seq
>>                                 19-8526158519542516375: Processing:
>>                                 {
>>                                 Ans: , MgmtId: 345049103441, via: 19,
>>                                 Ver: v1, Flags: 110,
>>                                 [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>                                 true,"wait":0}}]
>>                                 }
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,399 DEBUG
>>                                 [c.c.a.m.AgentAttache]
>>                                 (AgentManager-Handler-10:null) Seq
>>                                 19-8526158519542516375: No more
>>                                 commands
>>                                 found
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,399 DEBUG [c.c.a.t.Request]
>>                                 (Work-Job-Executor-156:ctx-14b86f5e
>>                                 job-60216/job-191380 ctx-175d37df)
>>                                 Seq 19-8526158519542516375:
>>                                 Received:  { Ans: , MgmtId: 345049103441,
>>                                 via:
>>                                 19, Ver: v1, Flags: 110, {
>>                                 PrepareForMigrationAnswer } }
>>
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,407 INFO
>>                                 [c.c.v.VirtualMachineManagerImpl]
>>                                 (Work-Job-Executor-156:ctx-14b86f5e
>>                                 job-60216/job-191380 ctx-175d37df)
>>                                 Migration cancelled because state has
>>                                 changed: VM[User|i-638-1736-VM]
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,407 DEBUG
>>                                 [c.c.v.VirtualMachineManagerImpl]
>>                                 (Work-Job-Executor-156:ctx-14b86f5e
>>                                 job-60216/job-191380 ctx-175d37df)
>>                                 Unable to migrate VM due to:
>>                                 Migration
>>                                 cancelled because state has changed:
>>                                 VM[User|i-638-1736-VM]
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,427 DEBUG
>>                                 [c.c.c.CapacityManagerImpl]
>>                                 (Work-Job-Executor-156:ctx-14b86f5e
>>                                 job-60216/job-191380 ctx-175d37df) VM
>>                                 state transitted from :Running to
>>                                 Stopping with event:
>>                                 StopRequestedvm's original host id:
>>                                 224 new host
>>                                 id:
>>                                 15 host id before state transition: 15
>>
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:28,438 DEBUG
>>                                 [c.c.a.m.ClusteredAgentAttache]
>>                                 (Work-Job-Executor-156:ctx-14b86f5e
>>                                 job-60216/job-191380 ctx-175d37df)
>>                                 Seq 15-6889944479923437643:
>>                                 Forwarding
>>                                 Seq 15-6889944479923437643:  { Cmd ,
>>                                 MgmtId: 345049103441, via:
>>                                 15(ewcstack-vh011-prod), Ver: v1,
>>                                 Flags: 100011,
>>                                 [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>                                 eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>                                 638-1736-VM","wait":0}}]
>>                                 } to 345049098498
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:30,335 DEBUG
>>                                 [c.c.a.m.AgentManagerImpl]
>>                                 (AgentManager-Handler-1:null) SeqA
>>                                 254-800498:
>>                                 Processing Seq 254-800498:  { Cmd ,
>>                                 MgmtId: -1, via: 254, Ver: v1,
>>                                 Flags:
>>                                 11,
>>                                 [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>                                 proxyVmId":1530,"_loadInfo":"{\n
>>                                 \"connections\": []\n}","wait":0}}] }
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:30,340 DEBUG
>>                                 [c.c.a.m.AgentManagerImpl]
>>                                 (AgentManager-Handler-1:null) SeqA
>>                                 254-800498:
>>                                 Sending Seq 254-800498:  { Ans: ,
>>                                 MgmtId: 345049103441, via: 254, Ver:
>>                                 v1,
>>                                 Flags: 100010,
>>                                 [{"com.cloud.agent.api.AgentCo
>>                                 ntrolAnswer":{"result":true,"wait":0}}] }
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,447 DEBUG [c.c.a.t.Request]
>>                                 (AgentManager-Handler-11:null) Seq
>>                                 15-6889944479923437632: Processing:
>>                                 {
>>                                 Ans: , MgmtId: 345049103441, via: 15,
>>                                 Ver: v1, Flags: 10,
>>                                 [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>                                 e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>                                 ":true,"details":"Stop
>>                                 VM i-638-1735-VM Succeed","wait":0}}] }
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,447 DEBUG [c.c.a.t.Request]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 Seq 15-6889944479923437632:
>>                                 Received:  { Ans: , MgmtId: 345049103441,
>>                                 via:
>>                                 15, Ver: v1, Flags: 10, { StopAnswer } }
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,470 DEBUG
>>                                 [c.c.v.VirtualMachineManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 VM[User|i-638-1735-VM] is stopped on
>>                                 the
>>                                 host. Proceeding to release resource
>>                                 held.
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,475 DEBUG
>>                                 [c.c.n.NetworkModelImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 Service SecurityGroup is not
>>                                 supported
>>                                 in the network id=519
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,478 DEBUG
>>                                 [o.a.c.e.o.NetworkOrchestrator]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 Changing active number of nics for
>>                                 network id=519 on -1
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,488 DEBUG
>>                                 [o.a.c.e.o.NetworkOrchestrator]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 Asking VpcVirtualRouter to release
>>                                 NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>                                 92.168.1.11-null
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,488 DEBUG
>>                                 [c.c.v.VirtualMachineManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 Successfully released network
>>                                 resources
>>                                 for the vm VM[User|i-638-1735-VM]
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,488 DEBUG
>>                                 [c.c.v.VirtualMachineManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 Successfully released storage
>>                                 resources
>>                                 for the vm VM[User|i-638-1735-VM]
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,496 DEBUG
>>                                 [c.c.c.CapacityManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae) VM
>>                                 state transitted from :Stopping to
>>                                 Stopped with event:
>>                                 OperationSucceededvm's original host
>>                                 id: 15 new host
>>                                 id: null host id before state
>>                                 transition: 15
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,505 DEBUG
>>                                 [c.c.c.CapacityManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 Hosts's actual total CPU: 82976 and
>>                                 CPU
>>                                 after applying overprovisioning: 331904
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,505 DEBUG
>>                                 [c.c.c.CapacityManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 Hosts's actual total RAM:
>>                                 261961927808
>>                                 and RAM after applying
>>                                 overprovisioning: 261961924608
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,505 DEBUG
>>                                 [c.c.c.CapacityManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 release cpu from host: 15, old used:
>>                                 71300,reserved: 2100, actual total:
>>                                 82976, total with overprovisioning:
>>                                 331904; new used:
>>                                 67100,reserved:6300; movedfromreserved:
>>                                 false,moveToReserveredtrue
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,505 DEBUG
>>                                 [c.c.c.CapacityManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 release mem from host: 15, old used:
>>                                 98784247808,reserved: 1073741824,
>>                                 total: 261961924608; new used:
>>                                 94489280512,reserved:5368709120;
>>                                 movedfromreserved:
>>
>>                                 false,moveToReserveredtrue
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,538 DEBUG
>>                                 [c.c.n.NetworkModelImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 Service SecurityGroup is not
>>                                 supported
>>                                 in the network id=519
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,557 DEBUG
>>                                 [c.c.n.NetworkModelImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 Service SecurityGroup is not
>>                                 supported
>>                                 in the network id=519
>>
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,588 ERROR
>>                                 [c.c.v.VmWorkJobHandlerProxy]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 Invocation exception, caused by:
>>                                 com.cloud.utils.exception.CloudRuntimeException:
>>                                 Unable to migrate
>>                                 VM[User|i-638-1735-VM]
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,589 INFO
>>                                 [c.c.v.VmWorkJobHandlerProxy]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379 ctx-bbabfeae)
>>                                 Rethrow exception
>>                                 com.cloud.utils.exception.CloudRuntimeException:
>>                                 Unable to migrate
>>                                 VM[User|i-638-1735-VM]
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,589 DEBUG
>>                                 [c.c.v.VmWorkJobDispatcher]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379) Done with run
>>                                 of VM work job:
>>                                 com.cloud.vm.VmWorkMigrateAway for VM
>>                                 1735, job origin: 13733
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,589 ERROR
>>                                 [c.c.v.VmWorkJobDispatcher]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379) Unable to
>>                                 complete AsyncJobVO {id:191379, userId:
>>                                 1,
>>                                 accountId: 1, instanceType: null,
>>                                 instanceId: null, cmd:
>>                                 com.cloud.vm.VmWorkMigrateAway,
>>                                 cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>                                 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>                                 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>                                 VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>                                 t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>                                 FuYWdlckltcGwAAAAAAAAA4A,
>>                                 cmdVersion: 0, status: IN_PROGRESS,
>>                                 processStatus: 0, resultCode: 0,
>>                                 result: null, initMsid: 345049098498,
>>                                 completeMsid: null, lastUpdated:
>>                                 null, lastPolled: null, created: Tue
>>                                 Jan 24 18:05:00 CET 2017}, job
>>                                 origin:13733
>>                                 ewcstack-man02-prod:
>>                                 com.cloud.utils.exception.CloudRuntimeException:
>>                                 Unable to migrate VM[User|i-638-1735-VM]
>>                                 ewcstack-man02-prod:    at
>>                                 com.cloud.vm.VirtualMachineMan
>>                                 agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
>>                                 ewcstack-man02-prod:    at
>>                                 com.cloud.vm.VirtualMachineMan
>>                                 agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
>>                                 ewcstack-man02-prod:    at
>>                                 sun.reflect.GeneratedMethodAcc
>>                                 essor402.invoke(Unknown
>>                                 Source)
>>                                 ewcstack-man02-prod:    at
>>                                 sun.reflect.DelegatingMethodAc
>>                                 cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>                                 ewcstack-man02-prod:    at
>>                                 java.lang.reflect.Method.invok
>>                                 e(Method.java:606)
>>                                 ewcstack-man02-prod:    at
>>                                 com.cloud.vm.VmWorkJobHandlerP
>>                                 roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>                                 ewcstack-man02-prod:    at
>>                                 com.cloud.vm.VirtualMachineMan
>>                                 agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>                                 ewcstack-man02-prod:    at
>>                                 com.cloud.vm.VmWorkJobDispatch
>>                                 er.runJob(VmWorkJobDispatcher.java:103)
>>                                 ewcstack-man02-prod:    at
>>                                 org.apache.cloudstack.framewor
>>                                 k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>                                 erImpl.java:537)
>>                                 ewcstack-man02-prod:    at
>>                                 org.apache.cloudstack.managed.
>>                                 context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>                                 ewcstack-man02-prod:    at
>>                                 org.apache.cloudstack.managed.
>>                                 context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>                                 ext.java:56)
>>                                 ewcstack-man02-prod:    at
>>                                 org.apache.cloudstack.managed.
>>                                 context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>                                 nagedContext.java:103)
>>                                 ewcstack-man02-prod:    at
>>                                 org.apache.cloudstack.managed.
>>                                 context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>                                 agedContext.java:53)
>>                                 ewcstack-man02-prod:    at
>>                                 org.apache.cloudstack.managed.
>>                                 context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>                                 ewcstack-man02-prod:    at
>>                                 org.apache.cloudstack.framewor
>>                                 k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>                                 ewcstack-man02-prod:    at
>>                                 java.util.concurrent.Executors
>>                                 $RunnableAdapter.call(Executors.java:471)
>>                                 ewcstack-man02-prod:    at
>>                                 java.util.concurrent.FutureTas
>>                                 k.run(FutureTask.java:262)
>>                                 ewcstack-man02-prod:    at
>>                                 java.util.concurrent.ThreadPoo
>>                                 lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>                                 ewcstack-man02-prod:    at
>>                                 java.util.concurrent.ThreadPoo
>>                                 lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>                                 ewcstack-man02-prod:    at
>>                                 java.lang.Thread.run(Thread.java:745)
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,590 DEBUG
>>                                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379) Complete async
>>                                 job-191379, jobStatus: FAILED,
>>                                 resultCode: 0, result:
>>                                 rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>                                 MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>                                 ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>                                 ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>                                 ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>                                 F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>                                 FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>                                 JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>                                 RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>                                 ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>                                 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>                                 xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>                                 NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>                                 lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>                                 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>                                 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>                                 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>                                 ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>                                 hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>                                 wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>                                 hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>                                 ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>                                 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>                                 AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>                                 cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>                                 AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>                                 YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>                                 ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>                                 YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>                                 dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>                                 dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>                                 Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>                                 dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>                                 cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>                                 TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>                                 ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>                                 LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>                                 dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>                                 ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>                                 JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>                                 AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>                                 ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>                                 ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>                                 YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>                                 ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>                                 AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>                                 JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>                                 jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>                                 bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>                                 dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>                                 x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,592 DEBUG
>>                                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379) Publish async
>>                                 job-191379 complete on message bus
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,592 DEBUG
>>                                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379) Wake up jobs
>>                                 related to job-191379
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,592 DEBUG
>>                                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379) Update db
>>                                 status for job-191379
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,594 DEBUG
>>                                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379) Wake up jobs
>>                                 joined with job-191379 and disjoin
>>                                 all
>>                                 subjobs created from job- 191379
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,604 DEBUG
>>                                 [o.a.c.f.j.i.AsyncJobManagerImpl]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379) Done executing
>>                                 com.cloud.vm.VmWorkMigrateAway for
>>                                 job-191379
>>                                 ewcstack-man02-prod: 2017-01-24
>>                                 18:05:32,608 INFO
>>                                 [o.a.c.f.j.i.AsyncJobMonitor]
>>                                 (Work-Job-Executor-115:ctx-0d5db528
>>                                 job-13733/job-191379) Remove
>>                                 job-191379 from job monitoring
>>
>>
>>
>>                                 On 25.01.2017 15:53, Rafael
>>                                 Weingärtner wrote:
>>
>>                                 Did you check the ACS log files
>>                                 looking for some exceptions when the
>>
>>                                     migrations for this specific VM
>>                                     started?
>>
>>                                     On Wed, Jan 25, 2017 at 9:49 AM,
>>                                     Francois Scheurer <
>>                                     francois.scheurer@everyware.ch>
>>                                     wrote:
>>
>>                                     Dear CS contributors
>>
>>                                         We use CS 4.5.1 on a 3
>>                                         Clusters with XenServer 6.5.
>>                                         and shared primary
>>                                         storage.
>>
>>                                         When we put a host in
>>                                         maintenance, cs evacuates all
>>                                         VM's on other
>>                                         hosts.
>>
>>                                         But it happens regularly that
>>                                         some VM's are stopped instead
>>                                         of live
>>                                         migrated.
>>
>>                                         If we restart that stopped
>>                                         VM's, CS will then stop them
>>                                         again later.
>>
>>                                         We need to start them several
>>                                         times until they stays up.
>>
>>                                         Is it a known issue? Is it
>>                                         fixed on CS 4.9.2 ?
>>
>>
>>                                         Migrating manually all VM's
>>                                         is working but impracticable
>>                                         because too
>>                                         much
>>                                         time consuming.
>>
>>
>>                                         Many thanks in advance for
>>                                         your help.
>>
>>
>>
>>                                         Best Regards
>>                                         Francois
>>
>>
>>
>>
>>
>>
>>                                         --
>>
>>
>>                                         EveryWare AG
>>                                         François Scheurer
>>                                         Senior Systems Engineer
>>                                         Zurlindenstrasse 52a
>>                                         CH-8003 Zürich
>>
>>                                         tel: +41 44 466 60 00
>>                                         fax: +41 44 466 60 10
>>                                         mail:
>>                                         francois.scheurer@everyware.ch
>>                                         web: http://www.everyware.ch
>>
>>
>>
>>                                     --
>>
>>
>>                                 EveryWare AG
>>                                 François Scheurer
>>                                 Senior Systems Engineer
>>                                 Zurlindenstrasse 52a
>>                                 CH-8003 Zürich
>>
>>                                 tel: +41 44 466 60 00
>>                                 fax: +41 44 466 60 10
>>                                 mail: francois.scheurer@everyware.ch
>>                                 web: http://www.everyware.ch
>>
>>
>>
>>                             --
>>                             Rafael Weingärtner
>>
>>
>>
>>                     --
>>
>>
>>                     EveryWare AG
>>                     François Scheurer
>>                     Senior Systems Engineer
>>                     Zurlindenstrasse 52a
>>                     CH-8003 Zürich
>>
>>                     tel: +41 44 466 60 00
>>                     fax: +41 44 466 60 10
>>                     mail: francois.scheurer@everyware.ch
>>                     web: http://www.everyware.ch
>>
>>
>>
>>
>>             -- 
>>
>>
>>             EveryWare AG
>>             François Scheurer
>>             Senior Systems Engineer
>>             Zurlindenstrasse 52a
>>             CH-8003 Zürich
>>
>>             tel: +41 44 466 60 00
>>             fax: +41 44 466 60 10
>>             mail: francois.scheurer@everyware.ch
>>             web: http://www.everyware.ch
>>
>>
>>
>>
>>         -- 
>>         Rafael Weingärtner
>>
>>
>>
>>
>>     -- 
>>     Rafael Weingärtner
>
>     -- 
>
>
>     EveryWare AG
>     François Scheurer
>     Senior Systems Engineer
>     Zurlindenstrasse 52a
>     CH-8003 Zürich
>
>     tel: +41 44 466 60 00
>     fax: +41 44 466 60 10
>     mail: francois.scheurer@everyware.ch
>     web: http://www.everyware.ch
>
>
>
>
> -- 
> Rafael Weingärtner

-- 


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheurer@everyware.ch
web: http://www.everyware.ch


Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Rafael Weingärtner <ra...@gmail.com>.
I totally understand your position of not being able to debug the ACS in
production ;)


There is one bit I did not quite understand. If you have disabled HA for
the VMs, why do you still need to change data in the database? How did you
disable HA for the VMs?

On Mon, Feb 6, 2017 at 5:04 AM, Francois Scheurer <
francois.scheurer@everyware.ch> wrote:

> Dear Rafael / Dear CS contributors
>
>
> I tried the following:
>
> 1.     stop all MS
> 2.     backup DB
> 3.     select * from op_ha_work where type="Migration" and mgmt_server_id
> is null and taken is null;
> 4.     delete from op_ha_work where type="Migration" and mgmt_server_id is
> null and taken is null;
> 5.     start all MS
>
> But it turned out that after the MSs stopped, the op_ha_work table was
> automatically cleaned, making my steps useless.
> After the MSs started, the op_ha_work table was filled again and the related
> VMs were stopped several times (3-7 times), requiring close attention to
> manually restart them after each stop.
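>
> (As a side note: a quick way to watch whether such work items reappear after
> the MSs start could be a count per type and step. This is only a sketch
> against the op_ha_work columns shown in the older mail further below,
> untested on our DB:)
>
> select type, step, count(*) as entries, max(created) as latest
> from op_ha_work
> where type="Migration"
> group by type, step;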
>
> I later found another table that could be the cause of the issue:
>
> mysql> select op_it_work.*,vm_instance.instance_name,vm_instance.removed
> from op_it_work join vm_instance where op_it_work.step!="Done" and
> op_it_work.instance_id=vm_instance.id order by instance_id;
> +--------------------------------------+----------------+---
> ---------+-----------------------+-----------+--------------
> ------+-----------+------------+-------------+--------------
> -+-------------+---------------+---------------------+
> | id                                   | mgmt_server_id | created_at |
> thread                | type      | vm_type            | step      |
> updated_at | instance_id | resource_type | resource_id | instance_name |
> removed             |
> +--------------------------------------+----------------+---
> ---------+-----------------------+-----------+--------------
> ------+-----------+------------+-------------+--------------
> -+-------------+---------------+---------------------+
> | be4d0dd2-7d4f-47d3-ba49-623d2f115bbf |   345049098498 | 1436187965 |
> consoleproxy-1        | Starting  | ConsoleProxy       | Release   |
> 1436188265 |         402 | NULL          |           0 | v-402-VM      |
> 2015-07-06 13:23:22 |
> | fb4d1844-eb92-478c-9f50-d666790f43af |   345049098498 | 1436187965 |
> secstorage-1          | Starting  | SecondaryStorageVm | Release   |
> 1436188265 |         687 | NULL          |           0 | s-687-VM      |
> 2015-07-06 13:11:34 |
> | 7dfeb25a-a705-42e8-907c-79d1e69697ed |   345049098498 | 1436964548 |
> Work-Job-Executor-165 | Migrating | User               | Migrating |
> 1436964548 |         776 | Host          |          29 | i-51-776-VM   |
> 2015-07-15 12:55:38 |
> | 2018ac61-b11c-4101-8ef5-07a39509f2bd |   345049098498 | 1486020173 |
> Work-Job-Executor-19  | Migrating | User               | Prepare   |
> 1486020173 |        1177 | Host          |          25 | i-189-1177-VM |
> NULL                |
> | 25430cd5-4346-4643-9c7e-7b39b8a876a2 |   345049098122 | 1486019693 |
> Work-Job-Executor-6   | Migrating | User               | Prepare   |
> 1486019693 |        1177 | Host          |          25 | i-189-1177-VM |
> NULL                |
> | 3276d52e-cbd9-44c5-8ab6-c70784337ff9 |   345049098498 | 1486020953 |
> Work-Job-Executor-33  | Migrating | User               | Prepare   |
> 1486020953 |        1177 | Host          |          33 | i-189-1177-VM |
> NULL                |
> | 6462da33-83f2-4575-af38-0e52d2a48488 |   345049098122 | 1486021433 |
> Work-Job-Executor-12  | Migrating | User               | Prepare   |
> 1486021433 |        1177 | Host          |          25 | i-189-1177-VM |
> NULL                |
> | 688e16c8-c39f-4e88-9e8f-36c0282f3df1 |   345049098122 | 1486020773 |
> Work-Job-Executor-11  | Migrating | User               | Prepare   |
> 1486020773 |        1177 | Host          |          25 | i-189-1177-VM |
> NULL                |
> | d8a18e9d-ac6f-4273-923e-e401533e2cf7 |   345049098498 | 1486020353 |
> Work-Job-Executor-25  | Migrating | User               | Prepare   |
> 1486020353 |        1177 | Host          |          25 | i-189-1177-VM |
> NULL                |
> | 1222ece8-399d-43d8-8ff8-d24bdebca2ed |   345049103441 | 1486018924 |
> Work-Job-Executor-2   | Migrating | User               | Prepare   |
> 1486018924 |        1178 | Host          |          29 | i-189-1178-VM |
> NULL                |
> | 2ac5ca4a-c20d-433a-8379-40e0de3b3ba8 |   345049098498 | 1486021613 |
> Work-Job-Executor-41  | Migrating | User               | Prepare   |
> 1486021613 |        1178 | Host          |          29 | i-189-1178-VM |
> NULL                |
> | 3e3b95a0-2af4-4268-bfdd-f3399a85bcaa |   345049103441 | 1486019584 |
> Work-Job-Executor-5   | Migrating | User               | Prepare   |
> 1486019584 |        1178 | Host          |          29 | i-189-1178-VM |
> NULL                |
> | 53745a2d-056c-4a1b-9267-eb97f077a6c8 |   345049103441 | 1486020244 |
> Work-Job-Executor-6   | Migrating | User               | Prepare   |
> 1486020244 |        1178 | Host          |          29 | i-189-1178-VM |
> NULL                |
> | e3555b49-7c18-4272-b7f4-ba9351b3bbd1 |   345049098498 | 1486020893 |
> Work-Job-Executor-32  | Migrating | User               | Prepare   |
> 1486020893 |        1178 | Host          |          29 | i-189-1178-VM |
> NULL                |
> | 1dcd5204-4f9a-4342-a23b-329a48596e13 |   345049098498 | 1486020953 |
> Work-Job-Executor-34  | Migrating | User               | Prepare   |
> 1486020953 |        1240 | Host          |           3 | i-388-1240-VM |
> NULL                |
> | ec1668b3-fa3c-4673-b593-3d607b169a24 |   345049098498 | 1486019693 |
> Work-Job-Executor-14  | Migrating | User               | Prepare   |
> 1486019693 |        1240 | Host          |           3 | i-388-1240-VM |
> NULL                |
> | fa7b2dec-ed7f-4a67-a055-e6e6bc1631ca |   345049098498 | 1486020353 |
> Work-Job-Executor-26  | Migrating | User               | Prepare   |
> 1486020353 |        1240 | Host          |           3 | i-388-1240-VM |
> NULL                |
> | 8cfe6d3e-5b7b-4be8-ac10-c865a492aab9 |   345049098498 | 1465387271 |
> Work-Job-Executor-18  | Running   | User               | Prepare   |
> 1465387271 |        1388 | Host          |           7 | i-361-1388-VM |
> NULL                |
> | 6e941b7e-6f19-4717-bf82-974c59737dbc |   345049103441 | 1486019524 |
> Work-Job-Executor-4   | Migrating | User               | Prepare   |
> 1486019524 |        1690 | Host          |          15 | i-648-1690-VM |
> NULL                |
> | 718578a4-0dd6-42d9-be10-ed7680650108 |   345049098122 | 1486021433 |
> Work-Job-Executor-13  | Migrating | User               | Prepare   |
> 1486021433 |        1690 | Host          |          15 | i-648-1690-VM |
> NULL                |
> | 9acf3908-ef13-4b03-8c2b-1a7e55e18491 |   345049098498 | 1486020173 |
> Work-Job-Executor-20  | Migrating | User               | Prepare   |
> 1486020173 |        1690 | Host          |          15 | i-648-1690-VM |
> NULL                |
> | a50c420b-ae52-4caa-83db-704b52b299a4 |   345049098498 | 1486020773 |
> Work-Job-Executor-28  | Migrating | User               | Prepare   |
> 1486020773 |        1690 | Host          |          15 | i-648-1690-VM |
> NULL                |
> | 2b5f4179-5975-4326-9db7-643c3ede63fe |   345049103441 | 1484224516 |
> Work-Job-Executor-161 | Running   | User               | Prepare   |
> 1484224516 |        1703 | Host          |           3 | i-51-1703-VM  |
> NULL                |
> | 2f3c6c44-b2a0-4d10-88d2-ac143d2dd379 |   345049098498 | 1484223910 |
> Work-Job-Executor-163 | Running   | User               | Prepare   |
> 1484223910 |        1703 | Host          |           3 | i-51-1703-VM  |
> NULL                |
> | 4126a23b-1cde-49bc-8636-c0d063b85cf6 |   345049103441 | 1484224636 |
> Work-Job-Executor-86  | Running   | User               | Prepare   |
> 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  |
> NULL                |
> | 5dad5645-766b-48c9-9a85-b77b3c101390 |   345049103441 | 1484224636 |
> Work-Job-Executor-92  | Running   | User               | Prepare   |
> 1484224636 |        1703 | Host          |           3 | i-51-1703-VM  |
> NULL                |
> | 8c1dfff9-e55d-4f81-a22f-45f72accc416 |   345049103441 | 1484224576 |
> Work-Job-Executor-95  | Running   | User               | Prepare   |
> 1484224576 |        1703 | Host          |           3 | i-51-1703-VM  |
> NULL                |
> | a703d2d7-b1e3-42ce-a551-831982990ffb |   345049098498 | 1484224630 |
> Work-Job-Executor-13  | Running   | User               | Prepare   |
> 1484224630 |        1703 | Host          |           3 | i-51-1703-VM  |
> NULL                |
> | 0e194cb5-95b7-41cb-969a-ae37d2ad449a |   345049098498 | 1485763613 |
> Work-Job-Executor-28  | Migrating | User               | Prepare   |
> 1485763613 |        1735 | Host          |          19 | i-638-1735-VM |
> 2017-01-30 15:24:10 |
> | 74f180b9-be0f-429b-9a45-597ca2ca997d |   345049098498 | 1485763013 |
> Work-Job-Executor-21  | Migrating | User               | Prepare   |
> 1485763013 |        1735 | Host          |          19 | i-638-1735-VM |
> 2017-01-30 15:24:10 |
> | feaa421c-88df-4493-9a4f-bddb6ee1b75b |   345049098122 | 1485764285 |
> Work-Job-Executor-19  | Migrating | User               | Prepare   |
> 1485764285 |        1735 | Host          |          19 | i-638-1735-VM |
> 2017-01-30 15:24:10 |
> | 32138a36-ad18-42b5-8478-76ad63caf4da |   345049098122 | 1486020293 |
> Work-Job-Executor-9   | Migrating | User               | Prepare   |
> 1486020293 |        1736 | Host          |          19 | i-638-1736-VM |
> NULL                |
> | 5306ab1f-3c92-4f95-bbff-283b43e6862d |   345049098498 | 1486019573 |
> Work-Job-Executor-11  | Migrating | User               | Prepare   |
> 1486019573 |        1736 | Host          |          19 | i-638-1736-VM |
> NULL                |
> | 843bdf9c-247d-432c-ac19-1a4f7858631b |   345049098498 | 1486021493 |
> Work-Job-Executor-38  | Migrating | User               | Prepare   |
> 1486021493 |        1736 | Host          |          19 | i-638-1736-VM |
> NULL                |
> | 8d400afb-766f-4ac5-89ee-854f25a38eba |   345049098498 | 1486020173 |
> Work-Job-Executor-21  | Migrating | User               | Prepare   |
> 1486020173 |        1736 | Host          |          19 | i-638-1736-VM |
> NULL                |
> | f0ae5612-b6ef-4882-94a2-40a1e6dce342 |   345049098498 | 1486020833 |
> Work-Job-Executor-30  | Migrating | User               | Prepare   |
> 1486020833 |        1736 | Host          |          19 | i-638-1736-VM |
> NULL                |
> +--------------------------------------+----------------+---
> ---------+-----------------------+-----------+--------------
> ------+-----------+------------+-------------+--------------
> -+-------------+---------------+---------------------+
>
> As you can see, the problematic VMs are all present in that table:
>
> i-189-1177-VM
> i-189-1178-VM
> i-388-1240-VM
> i-648-1690-VM
> i-638-1736-VM
>
> As the MSs are running production environments, we cannot debug much and
> just need the shortest path to a resolution.
> For example, restarting the MSs has an impact on our customers, so we need
> to keep the number of restarts to a minimum.
>
> I have disabled HA on all problematic VMs except i-189-1177-VM,
> which is less critical.
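>
> (To double-check which VMs still have the flag set, something like the
> following could work. This is only a sketch and assumes the ha_enabled
> column of vm_instance; the instance names are the ones listed above:)
>
> select id, instance_name, ha_enabled
> from vm_instance
> where instance_name in ("i-189-1177-VM", "i-189-1178-VM", "i-388-1240-VM",
> "i-648-1690-VM", "i-638-1736-VM");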
>
> The next steps could be:
> 1.     stop all MS
> 2.     backup DB
> 3.     select op_it_work.*,vm_instance.instance_name,vm_instance.removed
> from op_it_work join vm_instance where op_it_work.step!="Done" and
> op_it_work.instance_id=vm_instance.id order by instance_id;
> 4.     update op_it_work set step="Done" where step!="Done";
> 5.     start all MS
>
> What do you think about step 4, safe or risky?
> Some of these entries are related to deleted VMs and seem safe to
> mark as "Done".
> All other entries are related to the problematic VMs that keep
> stopping... so marking them as "Done" may (hopefully) eliminate the cause
> of the Migrating/Stopping.
> But we are not sure whether marking them as "Done" also requires changes in
> other tables for consistency.
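>
> A narrower variant of step 4 could restrict the update to the instances that
> keep being stopped, for example (a sketch only; the ids are taken from the
> listing above and the rows of the removed VMs could be handled the same way):
>
> update op_it_work
>   set step="Done"
>   where step!="Done"
>   and instance_id in (1177, 1178, 1240, 1690, 1736);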
>
>
> Thank you for your help.
>
>
> Best Regards
> Francois Scheurer
>
>
>
>
>
> On 01.02.2017 03:01, Rafael Weingärtner wrote:
>
> Ok, I dug a little further into the code.
>
> If we put a host in maintenance, it may unfortunately happen that a
>> timeout occurs and CS aborts the migration. As far as we can interpret the
>> logs, it seems (we speculate) that CS detects that the VM is then still
>> running on the original host, which is in maintenance, and decides somehow
>> to stop it because of the "host not up" status.
>>
>
> From what I have seen in the code, this may happen (as you speculated). If
> a timeout or some other event interrupts the migration, the VM will be
> stopped. This happens in the method “com.cloud.vm.
> VirtualMachineManagerImpl.orchestrateMigrateAway(String, long,
> DeploymentPlanner)” at line 2396 and further. This was implemented on
> purpose; if the migration does not happen, the VM is stopped on the source
> host. Probably it was implemented this way on the philosophy of gradient
> descent from artificial intelligence:
>
> “pick a random point in the field, update the values according to the
>> slope and hopefully after successful iterations you converge to a minimum
>> point.” (the problem is that sometimes the function diverges).
>>
>
> I mean, this was probably done this way because HA was supposed to see
> that a VM with HA is stopped, and should try to start it again somewhere
> else. The log you posted does not provide any other details to debug this
> further because the log is from after an MS restart, right?
>
>
> After restarting the MSs I guess that a similar issue may happen: the
>> hosts are disconnected during the early stage and then reconnect. Maybe CS
>> then again stops the VMs because of the "host not up" status?
>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) The
>> last host of this VM is not UP or is not enabled, host status is:
>> Disconnected, host resource state is: Enabled
>>
>
> Here is something interesting that you have found. I checked the code, but
> without a live debug session it is pretty hard for me to get the right flow
> of execution. By looking at the code and at the logs you provided, your
> deduction is right. After ACS boots up, the check for hosts that are
> shut down/deactivated starts, and the method “com.cloud.agent.manager.
> AgentManagerImpl.executeUserRequest(long, Event)” is executed. The
> problem is that all of the hosts are considered disconnected (ACS may
> not have finished connecting to all of them yet), and then a whole HA
> process is kicked off. The problem is that there is no host with status
> “UP”. That is why this happens: “No suitable hosts found under this
> Cluster: 25”, and then the whole HA flow, which shuts down the VM, is
> kicked off.
>
> In theory, the HA worker should see that a VM with HA is down, and should
> try to start it. I do not understand why this is not working. If you
> decide to debug the running ACS, I may be able to help you; just keep me
> posted.
>
>
> On Tue, Jan 31, 2017 at 10:26 AM, Rafael Weingärtner <
> <ra...@gmail.com> wrote:
>
>> I will be able to re-check some code and answer you with new thoughts on
>> this problem tonight (the log files you provided today have some very
>> insightful information regarding this problem). I believe this is a bug in
>> HA, a very tricky one that has not been reported yet (at least I do not
>> remember seeing anything similar). And thanks to randomness/destiny now you
>> have found/uncovered a way to reproduce the problem (bug?!).
>>
>> I do not suggest changing any data at all directly in the database
>> (unless you know the ACS code well and are 100% sure of what you are
>> doing). This can cause even bigger and more complicated problems. If you
>> remove the HA feature from these VMs, this problem will stop. However, if
>> you can/are willing to keep HA enabled until we fully uncover the problem,
>> map it and report it, this would be great.
>> BTW: why are you restarting MSs?
>> What are the configurations of your MSs?
>>
>> Thanks for your time and for sharing information with our community.
>>
>>
>>
>> On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer <
>> francois.scheurer@everyware.ch> wrote:
>>
>>> Dear Rafael
>>>
>>>
>>> Many thanks for your deep analysis.
>>> Nice humor with the planets alignments BTW ;)
>>>
>>> We also suspected the HA flag to be related to this issue, as we never saw
>>> this issue in the past, until we enabled HA on all VMs and offerings.
>>>
>>> Here is also the workers list:
>>>
>>> mysql> select * from op_ha_work;
>>> +------+-------------+-----------+--------------------+-----
>>> -----+----------------+---------+---------------------+-----
>>> --+---------------------+-------+-------------+---------+
>>> | id   | instance_id | type      | vm_type            | state    |
>>> mgmt_server_id | host_id | created             | tried | taken
>>>  | step  | time_to_try | updated |
>>> +------+-------------+-----------+--------------------+-----
>>> -----+----------------+---------+---------------------+-----
>>> --+---------------------+-------+-------------+---------+
>>> |  105 |         343 | HA        | User               | Running  |
>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>> Error |  1394056013 |       3 |
>>> |  109 |         407 | HA        | ConsoleProxy       | Running  |
>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>> Error |  1394056013 |       3 |
>>> |  113 |         407 | HA        | ConsoleProxy       | Running  |
>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>> Error |  1394056013 |       3 |
>>> |  117 |         407 | HA        | ConsoleProxy       | Running  |
>>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>> Error |  1394056013 |       3 |
>>> |  121 |         343 | HA        | User               | Running  |
>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>> Error |  1394056013 |       3 |
>>> |  125 |         405 | HA        | SecondaryStorageVm | Running  |
>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>> Error |  1394056013 |       3 |
>>> |  129 |         393 | HA        | User               | Running  |
>>>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>> Error |  1394056014 |      30 |
>>> |  133 |         405 | HA        | SecondaryStorageVm | Running  |
>>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>> Error |  1394056014 |       3 |
>>> |  137 |         402 | HA        | ConsoleProxy       | Running  |
>>>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>>> Error |  1394056014 |       3 |
>>> |  345 |         402 | HA        | ConsoleProxy       | Starting |
>>>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
>>> Error |  1402527506 |     770 |
>>> |  349 |         687 | HA        | SecondaryStorageVm | Starting |
>>>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
>>> Error |  1402527506 |       2 |
>>> | 2057 |        1178 | Migration | User               | Running  |
>>>  345049098498 |      33 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>> Done  |  1451024774 |      95 |
>>> | 2058 |        1736 | Migration | User               | Running  |
>>>  345049098498 |     224 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>> Done  |  1451024774 |      45 |
>>> | 2059 |        1240 | Migration | User               | Running  |
>>>  345049098498 |       7 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
>>> Done  |  1451024774 |      61 |
>>> | 2060 |        1178 | Migration | User               | Running  |
>>>  345049098122 |      33 | 2017-01-31 07:56:08 | 1 | 2017-01-31 08:06:37 |
>>> Done  |  1451025374 |      95 |
>>> | 2061 |        1178 | Migration | User               | Stopped  |
>>>      NULL |      33 | 2017-01-31 07:56:09 | 5 | NULL                | Done
>>> |  1451027900 |     109 |
>>> | 2062 |        1690 | Migration | User               | Running  |
>>>  345049098498 |     224 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
>>> Done  |  1451024774 |     109 |
>>> | 2063 |        1177 | Migration | User               | Running  |
>>>  345049098498 |      29 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
>>> Done  |  1451024774 |      82 |
>>> | 2064 |        1690 | Migration | User               | Running  |
>>>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>> Done  |  1451024900 |     109 |
>>> | 2065 |        1240 | Migration | User               | Running  |
>>>  345049098122 |       7 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>> Done  |  1451024900 |      61 |
>>> | 2066 |        1736 | Migration | User               | Running  |
>>>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
>>> Done  |  1451024900 |      45 |
>>> | 2067 |        1240 | Migration | User               | Stopped  |
>>>      NULL |       7 | 2017-01-31 07:56:09 | 5 | NULL                | Done
>>> |  1451028003 |      80 |
>>> | 2068 |        1178 | Migration | User               | Running  |
>>>  345049098122 |      33 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>> Done  |  1451024781 |      95 |
>>> | 2069 |        1736 | Migration | User               | Running  |
>>>  345049098122 |     224 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>> Done  |  1451024781 |      45 |
>>> | 2070 |        1240 | Migration | User               | Running  |
>>>  345049098122 |       7 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>> Done  |  1451024781 |      61 |
>>> | 2071 |        1177 | Migration | User               | Running  |
>>>  345049098122 |      29 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
>>> Done  |  1451024781 |      82 |
>>> | 2072 |        1177 | Migration | User               | Stopped  |
>>>      NULL |      29 | 2017-01-31 07:56:16 | 5 | NULL                | Done
>>> |  1451027973 |     101 |
>>> | 2073 |        1177 | Migration | User               | Running  |
>>>  345049098122 |      29 | 2017-01-31 07:56:16 | 1 | 2017-01-31 07:58:22 |
>>> Done  |  1451024904 |      82 |
>>> | 2074 |        1690 | Migration | User               | Stopped  |
>>>      NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                | Done
>>> |  1451027464 |     123 |
>>> | 2075 |        1736 | Migration | User               | Stopped  |
>>>      NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                | Done
>>> |  1451027477 |      59 |
>>> +------+-------------+-----------+--------------------+-----
>>> -----+----------------+---------+---------------------+-----
>>> --+---------------------+-------+-------------+---------+
>>>
>>> If we put a host in maintenance, it may unfortunately happen that a
>>> timeout occurs and CS aborts the migration.
>>> As far as we can interpret the logs, it seems (we speculate) that CS
>>> detects that the VM is then still running on the original host, which is in
>>> maintenance, and decides somehow to stop it because of the "host not up"
>>> status.
>>> We then need to start the VM several times, until CS gives up
>>> stopping it.
>>>
>>> After restarting the MSs I guess that a similar issue may happen: the
>>> hosts are disconnected during the early stage and then reconnect. Maybe CS
>>> then again stops the VMs because of the "host not up" status?
>>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>> The last host of this VM is not UP or is not enabled, host status is:
>>> Disconnected, host resource state is: Enabled
>>>
>>> But the problem after the MS restart definitely only impacts VMs
>>> listed in the op_ha_work table.
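>>>
>>> (For reference, the affected VMs can be listed by joining op_ha_work with
>>> vm_instance; just a sketch, untested:)
>>>
>>> select w.id, w.instance_id, vm.instance_name, w.type, w.state, w.step
>>> from op_ha_work w join vm_instance vm on vm.id = w.instance_id
>>> where w.type="Migration";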
>>>
>>>
>>> To your questions:
>>>         We have 3 ACS MSs managing several Zones. Each zone has a single
>>> pod and cluster.
>>>         The Cluster of i-638-1735-VM has this summary:
>>>          Total number of VMs: 121
>>>          Total number hypervisors: 4
>>>          Total allocated RAM: 522.0 / 972 GB (53.7 %)
>>>          Total allocated cores: 211
>>>          Total allocated storage: 11285 GB
>>>         Used memory on the hosts currently varies from 28 to 77%, so there
>>> is enough RAM.
>>>         I did not check the XS logs. The same problem arose again this
>>> morning (shortly after a restart of all MSs), so I will look into the
>>> XS logs now.
>>>
>>>         I have fair experience with Eclipse and Java; I will ask my
>>> boss if we can go this way.
>>>
>>> The main issue for now is that restarting the MSs reproduces the
>>> problem, which is quite annoying for the impacted VMs.
>>> I thought of stopping the MSs, deleting all HA/Migration entries in
>>> op_ha_work and starting the MSs.
>>> Is that a viable option? (Or should I delete the Migration entries only?
>>> Or never touch this table?)
>>> We will also very probably remove the HA flag on all VMs if it does not
>>> work reliably.
>>>
>>> Best Regards
>>> Francois Scheurer
>>>
>>>
>>>
>>> Here are new logs regarding the same issue from today.
>>> I extracted the important parts related to the stop of i-189-1178-VM.
>>>
>>> regexp: (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRe
>>> questedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae733|work-2049)
>>>
>>> 2017-01-30 08:44:04,440 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Executing
>>> AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType: null,
>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>>> F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm
>>> 1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTA
>>> ALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAA
>>> AAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resu
>>> ltCode: 0, result: null, initMsid: 345049103441, completeMsid: null,
>>> lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 2017}
>>> 2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Run VM work
>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
>>>
>>> 2017-01-30 08:44:05,217 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Deploy avoids pods: [], clusters: [], hosts: [29]
>>> 2017-01-30 08:44:05,220 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
>>> (AgentConnectTaskPool-4:ctx-5b61f2e5) create ClusteredAgentAttache for
>>> 232
>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.UserDispersin
>>> gPlanner@1734b2e2
>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
>>> requested cpu: 4200, requested ram: 4294967296
>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Is
>>> ROOT volume READY (pool already allocated)?: Yes
>>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> This VM has last host_id specified, trying to choose the same host: 25
>>>
>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> The last host of this VM is not UP or is not enabled, host status is:
>>> Disconnected, host resource state is: Enabled
>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Cannot choose the last host to deploy this VM
>>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Searching resources only under specified Cluster: 25
>>>
>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>>> prioritization: []
>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No
>>> suitable hosts found
>>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No
>>> suitable hosts found under this Cluster: 25
>>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Could not find suitable Deployment Destination for this VM under any
>>> clusters, returning.
>>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Searching resources only under specified Cluster: 25
>>> 2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> The specified cluster is in avoid set, returning.
>>> 2017-01-30 08:44:05,798 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
>>> 2017-01-30 08:44:05,798 WARN [c.c.v.VirtualMachineManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Failed to deploy vm 1178 with original planner, sending HAPlanner
>>>
>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Deploy avoids pods: [], clusters: [], hosts: [29]
>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
>>> sPlanner@4fdf86ce
>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
>>> requested cpu: 4200, requested ram: 4294967296
>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Is
>>> ROOT volume READY (pool already allocated)?: Yes
>>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> This VM has last host_id specified, trying to choose the same host: 25
>>>
>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> The last host of this VM is not UP or is not enabled, host status is:
>>> Disconnected, host resource state is: Enabled
>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Cannot choose the last host to deploy this VM
>>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Searching resources only under specified Cluster: 25
>>> 2017-01-30 08:44:06,039 DEBUG [c.c.d.SkipHeuresticsPlanner]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Deploying vm during HA process, so skipping disable threshold check
>>>
>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>>> prioritization: []
>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No
>>> suitable hosts found
>>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No
>>> suitable hosts found under this Cluster: 25
>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Publish async
>>> job-192986 complete on message bus
>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
>>> related to job-192986
>>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Update db
>>> status for job-192986
>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>> The last host of this VM is not UP or is not enabled, host status is:
>>> Disconnected, host resource state is: Enabled
>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>> Cannot choose the last host to deploy this VM
>>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner]
>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>> Searching resources only under specified Cluster: 35
>>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Could not find suitable Deployment Destination for this VM under any
>>> clusters, returning.
>>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Searching resources only under specified Cluster: 25
>>> 2017-01-30 08:44:06,163 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>>> SyncQueueItemVO {id:63629, queueId: 64423, contentType: AsyncJob,
>>> contentId: 192991, lastProcessMsid: 345049098498, lastprocessNumber: 27,
>>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
>>> CET 2017}
>>> 2017-01-30 08:44:06,169 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Add job-192992
>>> into job monitoring
>>> 2017-01-30 08:44:06,184 DEBUG [c.c.d.SkipHeuresticsPlanner]
>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>> Deploying vm during HA process, so skipping disable threshold check
>>> 2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> The specified cluster is in avoid set, returning.
>>> 2017-01-30 08:44:06,186 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
>>> 2017-01-30 08:44:06,186 ERROR [c.c.v.VmWorkJobHandlerProxy]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Invocation exception, caused by: com.cloud.exception.InsufficientServerCapacityException:
>>> Unable to find a server to migrate to.Scope=interface
>>> com.cloud.org.Cluster; id=25
>>> 2017-01-30 08:44:06,187 INFO  [c.c.v.VmWorkJobHandlerProxy]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>>> Rethrow exception com.cloud.exception.InsufficientServerCapacityException:
>>> Unable to find a server to migrate to.Scope=interface
>>> com.cloud.org.Cluster; id=25
>>> 2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done with run
>>> of VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin:
>>> 15393
>>> 2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Unable to
>>> complete AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType:
>>> null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>> AAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ, cmdVersion: 0,
>>> status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>> initMsid: 345049103441, completeMsid: null, lastUpdated: null, lastPolled:
>>> null, created: Mon Jan 30 08:44:03 CET 2017}, job origin:15393
>>> com.cloud.exception.InsufficientServerCapacityException: Unable to find
>>> a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
>>>         at com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAwa
>>> y(VirtualMachineManagerImpl.java:2390)
>>>         at com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAwa
>>> y(VirtualMachineManagerImpl.java:4520)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
>>> ssorImpl.java:57)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>> thodAccessorImpl.java:43)
>>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>>         at com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJob
>>> HandlerProxy.java:107)
>>>         at com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(Virtu
>>> alMachineManagerImpl.java:4636)
>>>         at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.
>>> java:103)
>>>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImp
>>> l$5.runInContext(AsyncJobManagerImpl.java:537)
>>>         at org.apache.cloudstack.managed.context.ManagedContextRunnable
>>> $1.run(ManagedContextRunnable.java:49)
>>>         at org.apache.cloudstack.managed.context.impl.DefaultManagedCon
>>> text$1.call(DefaultManagedContext.java:56)
>>>         at org.apache.cloudstack.managed.context.impl.DefaultManagedCon
>>> text.callWithContext(DefaultManagedContext.java:103)
>>>         at org.apache.cloudstack.managed.context.impl.DefaultManagedCon
>>> text.runWithContext(DefaultManagedContext.java:53)
>>>         at org.apache.cloudstack.managed.context.ManagedContextRunnable
>>> .run(ManagedContextRunnable.java:46)
>>>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImp
>>> l$5.run(AsyncJobManagerImpl.java:494)
>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executor
>>> s.java:471)
>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool
>>> Executor.java:1145)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoo
>>> lExecutor.java:615)
>>>         at java.lang.Thread.run(Thread.java:745)
>>> 2017-01-30 08:44:06,188 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued job-192991
>>> 2017-01-30 08:44:06,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
>>> joined with job-192986 and disjoin all subjobs created from job- 192986
>>> 2017-01-30 08:44:06,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Complete async
>>> job-192989, jobStatus: FAILED, resultCode: 0, result:
>>> rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZl
>>> ckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vw
>>> c0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRD
>>> YXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5n
>>> L0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xv
>>> dWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vy
>>> cm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2
>>> YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4a
>>> OxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNh
>>> dXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQA
>>> EkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFu
>>> Zy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QA
>>> EExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBz
>>> ZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFj
>>> ZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNr
>>> VHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFy
>>> aW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-
>>> AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2Vy
>>> SW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hl
>>> c3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EA
>>> fgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1w
>>> bHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNx
>>> AH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5y
>>> ZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0
>>> aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhq
>>> YXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EA
>>> fgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0
>>> ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pv
>>> YnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNs
>>> b3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRj
>>> aGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91
>>> ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1w
>>> bCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0
>>> c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNv
>>> bnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRl
>>> eHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUu
>>> Y2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdl
>>> ZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2Fs
>>> bHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5j
>>> b250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxs
>>> V2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250
>>> ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2Vk
>>> LmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-
>>> ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5j
>>> b25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRv
>>> cnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50
>>> LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0
>>> ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdU
>>> aHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0
>>> AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29y
>>> a2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAAL
>>> VGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5t
>>> b2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEu
>>> dXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte
>>> 9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEu
>>> dXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4
>>> cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1
>>> ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMA
>>> AAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAFdmFs
>>> dWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-
>>> omaW2FAIAAHhwAA
>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>> Deploy avoids pods: [], clusters: [], hosts: [3]
>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
>>> sPlanner@4fdf86ce
>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>> Trying to allocate a host and storage pools from dc:14, pod:17,cluster:17,
>>> requested cpu: 4200, requested ram: 4294967296
>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) Is
>>> ROOT volume READY (pool already allocated)?: Yes
>>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>> This VM has last host_id specified, trying to choose the same host: 213
>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Publish async
>>> job-192989 complete on message bus
>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
>>> related to job-192989
>>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Update db
>>> status for job-192989
>>> 2017-01-30 08:44:06,219 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>>> Checking resources in Cluster: 35 under Pod: 31
>>> 2017-01-30 08:44:06,227 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>> FirstFitRoutingAllocator) Looking for hosts in dc: 23  pod:31  cluster:35
>>> 2017-01-30 08:44:06,229 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Executing
>>> AsyncJobVO {id:192992, userId: 1, accountId: 1, instanceType: null,
>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>> AAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw, cmdVersion: 0,
>>> status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>>> null, created: Mon Jan 30 08:44:05 CET 2017}
>>> 2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher]
>>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Run VM work
>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1729, job origin: 14953
>>> 2017-01-30 08:44:06,230 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>>> SyncQueueItemVO {id:63630, queueId: 67180, contentType: AsyncJob,
>>> contentId: 192990, lastProcessMsid: 345049098498, lastprocessNumber: 61,
>>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
>>> CET 2017}
>>> 2017-01-30 08:44:06,244 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>> The last host of this VM is not UP or is not enabled, host status is:
>>> Disconnected, host resource state is: Enabled
>>> 2017-01-30 08:44:06,262 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>> Cannot choose the last host to deploy this VM
>>> 2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner]
>>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>>> Searching resources only under specified Cluster: 17
>>> 2017-01-30 08:44:06,262 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Add job-192991
>>> into job monitoring
>>> 2017-01-30 08:44:06,246 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Done executing
>>> com.cloud.vm.VmWorkMigrateAway for job-192986
>>> 2017-01-30 08:44:06,244 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
>>> joined with job-192989 and disjoin all subjobs created from job- 192989
>>> 2017-01-30 08:44:06,269 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Remove
>>> job-192986 from job monitoring
>>> 2017-01-30 08:44:06,279 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done executing
>>> com.cloud.vm.VmWorkMigrateAway for job-192989
>>> 2017-01-30 08:44:06,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Executing
>>> AsyncJobVO {id:192991, userId: 1, accountId: 1, instanceType: null,
>>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>>> AAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw, cmdVersion: 0,
>>> status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>>> null, created: Mon Jan 30 08:44:05 CET 2017}
>>> 2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher]
>>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Run VM work
>>> job: com.cloud.vm.VmWorkMigrateAway for VM 1605, job origin: 14965
>>> 2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl]
>>> (AgentManager-Handler-13:null) Ping from 258
>>> 2017-01-30 08:44:06,319 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>> FirstFitRoutingAllocator) List of hosts in ascending order of number of
>>> VMs: [224, 19, 21, 15]
>>> 2017-01-30 08:44:06,322 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>>> FirstFitRoutingAllocator) FirstFitAllocator has 0 hosts to check for
>>> allocation: []
>>> 2017-01-30 08:44:06,322 INFO  [c.c.a.m.AgentManagerImpl]
>>> (AgentManager-Handler-13:null) PingMap for agent: 258 will not be updated
>>> because agent is no longer in the PingMap
>>> 2017-01-30 08:44:06,326 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Remove
>>> job-192989 from job monitoring
>>>
>>> 2017-01-30 08:54:38,139 INFO [c.c.v.VirtualMachineManagerImpl]
>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>> Migration cancelled because state has changed: VM[User|i-189-1178-VM]
>>> 2017-01-30 08:54:38,139 DEBUG [c.c.v.VirtualMachineManagerImpl]
>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>> Unable to migrate VM due to: Migration cancelled because state has changed:
>>> VM[User|i-189-1178-VM]
>>> 2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl]
>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>> VM state transitted from :Running to Stopping with event: StopRequestedvm's
>>> original host id: 29 new host id: 33 host id before state transition: 33
>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.DirectAgentAttache]
>>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Response
>>> Received:
>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Processing:  {
>>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10,
>>> [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-
>>> 1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,
>>> "result":true,"details":"","wait":0}}] }
>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>>> (RouterMonitor-1:ctx-2e80f405) Seq 19-2355664080091480198: Received:  {
>>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10, {
>>> NetworkUsageAnswer } }
>>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl]
>>> (RouterMonitor-1:ctx-2e80f405) Details from executing class
>>> com.cloud.agent.api.NetworkUsageCommand:
>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>> Seq 33-1917970491306410035: Sending  { Cmd , MgmtId: 345049098498, via:
>>> 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>> }
>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>>> Seq 33-1917970491306410035: Executing:  { Cmd , MgmtId: 345049098498, via:
>>> 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>>> }
>>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.m.DirectAgentAttache]
>>> (DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035: Executing request
>>> 2017-01-30 08:54:38,202 DEBUG [c.c.a.m.DirectAgentAttache]
>>> (DirectAgent-326:ctx-4fb88aa7) Seq 21-1035546439318503505: Executing request
>>>
>>> 2017-01-30 09:05:55,628 INFO [c.c.h.HighAvailabilityManagerImpl]
>>> (HA-Worker-4:ctx-36aae733 work-2049) Processing
>>> HAWork[2049-Migration-1178-Stopped-Migrating]
>>> 2017-01-30 09:05:55,633 WARN [o.a.c.f.j.AsyncJobExecutionContext]
>>> (HA-Worker-4:ctx-36aae733 work-2049) Job is executed without a context,
>>> setup psudo job for the executing thread
>>> 2017-01-30 09:05:55,645 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> (HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045 execution on object
>>> VmWorkJobQueue.1178
>>>
>>>
>>>
>>>
>>> On 31.01.2017 00:42, Rafael Weingärtner wrote:
>>>
>>>> This seems to be one of those nasty bugs that are very hard to debug. It
>>>> seems to be caused by a combination of different factors, such as
>>>> planetary alignment ;)
>>>>
>>>> The code flow that triggers the migration starts in class
>>>> “VirtualMachineManagerImpl” at line 2395 (method
>>>> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>>> long, DeploymentPlanner)”). The point here is that, if an error occurs
>>>> during the migration, the VM will be stopped; for some reason this is
>>>> done only for HA-enabled VMs.
>>>>
>>>> Going a little bit further, we find the point that triggers the error in
>>>> the method “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO,
>>>> long, DeployDestination)” at line 1941. This line calls the method
>>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
>>>> Long, ItWorkVO, Step)” to change the VM state to migrating/prepare
>>>> migration/or something like that (I am guessing here), but for some
>>>> reason the state of the VM is not the one expected, and this causes a
>>>> “ConcurrentOperationException” to be thrown. The problem is that
>>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
>>>> Long, ItWorkVO, Step)” does not seem to produce many useful log entries.
>>>>
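To make the flow described above a bit easier to follow, here is a minimal,
self-contained Java sketch of the pattern under discussion: an expected-state
check that fails because a concurrent job already touched the VM, followed by
a stop fallback that is applied only to HA-enabled VMs. This is an
illustration only; the class, field and method names below are invented and
are not taken from the CloudStack 4.5 source.

    // Minimal model of the behaviour described above -- NOT the actual
    // CloudStack code. Every name here is invented for illustration only.
    public class MigrateAwayModel {

        enum State { RUNNING, MIGRATING, STOPPING }

        static class Vm {
            State state = State.RUNNING;
            boolean haEnabled = true;   // the "VM HA Flag" discussed in this thread
        }

        // Pretend "changeState": it succeeds only if the VM is still in the
        // state the caller expects; otherwise a concurrent job already touched it.
        static boolean changeState(Vm vm, State expected, State next) {
            if (vm.state != expected) {
                return false;
            }
            vm.state = next;
            return true;
        }

        static void migrateAway(Vm vm) {
            if (!changeState(vm, State.RUNNING, State.MIGRATING)) {
                // The expected-state check failed, so the migration is aborted ...
                System.out.println("Migration cancelled because state has changed");
                if (vm.haEnabled) {
                    // ... and the fallback for an HA-enabled VM is a stop request,
                    // matching the "StopRequested" transitions seen in the MS logs.
                    vm.state = State.STOPPING;
                    System.out.println("HA fallback: requesting a stop of the VM");
                }
                return;
            }
            System.out.println("Live migration proceeds");   // happy path
        }

        public static void main(String[] args) {
            Vm vm = new Vm();
            vm.state = State.MIGRATING; // simulate a competing HA work item racing with us
            migrateAway(vm);
        }
    }
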
>>>> Just to enable us to discuss the problem a little further, could you
>>>> tell us a little bit about your environment settings? How many ACS MSs
>>>> do you have? What is the VM workload being migrated (I mean, the number
>>>> of VMs and their computing resources)? Do the other servers have enough
>>>> computing resources to support these VMs? Could you also check the
>>>> XenServer logs and see what happened on both the source and destination
>>>> hosts when the migration for VM “i-638-1735-VM” was triggered?
>>>>
>>>> Do you have the knowledge/skills to use an IDE (e.g. Eclipse) to connect
>>>> to a running JVM and debug the code? Because for me it will not be
>>>> possible to go further without looking at the variables in a running
>>>> environment.
>>>>
>>>>
>>>> On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
>>>> francois.scheurer@everyware.ch> wrote:
>>>>
>>>> Dear Rafael
>>>>>
>>>>>
>>>>> Sorry for the late answer (was sick a couple of days).
>>>>>
>>>>> I confirm we are migrating inside the same xenserver pool (within a CS
>>>>> Cluster).
>>>>>
>>>>>
>>>>> Best Regards
>>>>> Francois
>>>>>
>>>>>
>>>>>
>>>>> On 26.01.2017 04:35, Rafael Weingärtner wrote:
>>>>>
>>>>> Found the code that is executed.
>>>>>> I need you to confirm if you are migrating within a cluster or across
>>>>>> clusters?
>>>>>>
>>>>>> On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
>>>>>> rafaelweingartner@gmail.com> wrote:
>>>>>>
>>>>>> I do not have access to ACS code right now, but I suggest starting
>>>>>>
>>>>>>> debugging here:
>>>>>>>
>>>>>>> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
>>>>>>>
>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>>> ctx-175d37df)
>>>>>>>> VM state transitted from :Running to Stopping with event:
>>>>>>>> StopRequestedvm's
>>>>>>>> original host id: 224 new host id: 15 host id before state
>>>>>>>> transition:
>>>>>>>> 15
>>>>>>>>
>>>>>>>> I would try to understand first why ACS requested the hypervisor to
>>>>>>>> stop
>>>>>>>>
>>>>>>> the VM. I mean the following, check the source code for conditions
>>>>>>> that
>>>>>>> would make ACS request the shutdown of VMs.
>>>>>>> BTW: are you migrating within a cluster or across clusters?
>>>>>>>
>>>>>>> I will be able to analyze this further only after work hours, so if
>>>>>>> you
>>>>>>> find anything, keep me posted.
>>>>>>>
>>>>>>> On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
>>>>>>> francois.scheurer@everyware.ch> wrote:
>>>>>>>
>>>>>>> Hello Rafael
>>>>>>>
>>>>>>>>
>>>>>>>> I think the important log lines are these:
>>>>>>>>
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>> 86f5e
>>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because
>>>>>>>> state has
>>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>> 86f5e
>>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>>> Migration
>>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from
>>>>>>>> :Running to
>>>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new
>>>>>>>> host
>>>>>>>> id:
>>>>>>>> 15 host id before state transition: 15
>>>>>>>>
>>>>>>>> The really disturbing thing is that CS stops the VM again if we
>>>>>>>> start it manually, apparently because the migrating job is still
>>>>>>>> running.
>>>>>>>>
>>>>>>>> Note that the VM HA flag is enabled.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>>
>>>>>>>> Best Regards.
>>>>>>>>
>>>>>>>> Francois
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> more details:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>>>>>>>> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
>>>>>>>> VM[User|i-638-1736-VM]
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>>> job-14953/job-191351) Done with run of VM work job:
>>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8b
>>>>>>>> acc64
>>>>>>>> job-14953/job-191351) Done executing com.cloud.vm.VmWorkMigrateAway
>>>>>>>> for
>>>>>>>> job-191351
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>>> job-14953/job-191351) Remove job-191351 from job monitoring
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>>>>>>>> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c
>>>>>>>> work-1989)
>>>>>>>> Completed HAWork[1989-Migration-1411-Running-Migrating]
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375:
>>>>>>>> Forwarding
>>>>>>>> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
>>>>>>>> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
>>>>>>>> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>>>>>> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>>>>>> ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>>>>>> 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>>>>>> (64-bit)","platformEmulator":"Other install
>>>>>>>> media","bootArgs":"","enable
>>>>>>>> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>>>>>> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>>>>>> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>>>>>> rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>>>>>> age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio"
>>>>>>>> :"4.0","
>>>>>>>> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>>>>>> -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>>>>>> stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>>>>>> 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8
>>>>>>>> 751b89-00f
>>>>>>>> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>>>>>> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>>>>>> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>>>>>> 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>>>>>> ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"
>>>>>>>> path":"d3
>>>>>>>> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>>>>>> :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>>>>>> gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>>>>>> XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>>>>>> -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>>>>>> storagePort":"0",
>>>>>>>> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>>>>>> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>>>>>> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>>>>>> ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>>>>>> ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","
>>>>>>>> id":64,"
>>>>>>>> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>>>>>> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>>>>>> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>>>>>> UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>>>>>> DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a
>>>>>>>> 32b-ee3e2
>>>>>>>> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>>>>>> ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>>>>>> deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>>>>>> path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>>>>>> K","_details":{"managed":"false","storagePort":"0","stor
>>>>>>>> ageHost":"localhos
>>>>>>>> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>>>>>> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>>>>>> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>>>>>> :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"
>>>>>>>> PreSetup","
>>>>>>>> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>>>>>> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>>>>>> -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>>>>>>
>>>>>>>> e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>>>>>> ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>>>>>> ,"volumeId":86
>>>>>>>> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>>>>>> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>>>>>> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>>>>>> 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>>>>>> false","storagePort":"0","storageHost":"localhost","volumeSi
>>>>>>>> ze":"42949
>>>>>>>> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>>>>>> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>>>>>> id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>>>>>> ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS 4.5
>>>>>>>> (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>>>>>> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>>>>>> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>>>>>> d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>>>>>> a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>>>>>> gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>>>>>> .8.8","dns2":"8.8
>>>>>>>> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>>>>>> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>>>>>> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] } to
>>>>>>>> 345049098498
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375:
>>>>>>>> Processing:
>>>>>>>> {
>>>>>>>> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
>>>>>>>> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>>>>>> true,"wait":0}}]
>>>>>>>> }
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>>> [c.c.a.m.AgentAttache]
>>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No more
>>>>>>>> commands
>>>>>>>> found
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>>> ctx-175d37df)
>>>>>>>> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId:
>>>>>>>> 345049103441,
>>>>>>>> via:
>>>>>>>> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>>>>>>>>
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>> 86f5e
>>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because
>>>>>>>> state has
>>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>>> 86f5e
>>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>>> Migration
>>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from
>>>>>>>> :Running to
>>>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new
>>>>>>>> host
>>>>>>>> id:
>>>>>>>> 15 host id before state transition: 15
>>>>>>>>
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643:
>>>>>>>> Forwarding
>>>>>>>> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
>>>>>>>> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>>>>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>>>>>> 638-1736-VM","wait":0}}]
>>>>>>>> } to 345049098498
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>>> 254-800498:
>>>>>>>> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver: v1,
>>>>>>>> Flags:
>>>>>>>> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>>>>>> proxyVmId":1530,"_loadInfo":"{\n
>>>>>>>> \"connections\": []\n}","wait":0}}] }
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>>> 254-800498:
>>>>>>>> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via: 254,
>>>>>>>> Ver:
>>>>>>>> v1,
>>>>>>>> Flags: 100010, [{"com.cloud.agent.api.AgentCo
>>>>>>>> ntrolAnswer":{"result":true,"wait":0}}] }
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>>>>>>>> (AgentManager-Handler-11:null) Seq 15-6889944479923437632:
>>>>>>>> Processing:
>>>>>>>> {
>>>>>>>> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
>>>>>>>> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>>>>>> e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>>>>>> ":true,"details":"Stop
>>>>>>>> VM i-638-1735-VM Succeed","wait":0}}] }
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>>>>>>>> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379
>>>>>>>> ctx-bbabfeae)
>>>>>>>> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId:
>>>>>>>> 345049103441,
>>>>>>>> via:
>>>>>>>> 15, Ver: v1, Flags: 10, { StopAnswer } }
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>> db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is
>>>>>>>> stopped on
>>>>>>>> the
>>>>>>>> host. Proceeding to release resource held.
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>> supported
>>>>>>>> in the network id=519
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Changing active number of nics
>>>>>>>> for
>>>>>>>> network id=519 on -1
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to
>>>>>>>> release
>>>>>>>> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>>>>>> 92.168.1.11-null
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>> db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released network
>>>>>>>> resources
>>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>> db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released storage
>>>>>>>> resources
>>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM state transitted from
>>>>>>>> :Stopping to
>>>>>>>> Stopped with event: OperationSucceededvm's original host id: 15 new
>>>>>>>> host
>>>>>>>> id: null host id before state transition: 15
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU: 82976
>>>>>>>> and
>>>>>>>> CPU
>>>>>>>> after applying overprovisioning: 331904
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM:
>>>>>>>> 261961927808
>>>>>>>> and RAM after applying overprovisioning: 261961924608
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old
>>>>>>>> used:
>>>>>>>> 71300,reserved: 2100, actual total: 82976, total with
>>>>>>>> overprovisioning:
>>>>>>>> 331904; new used: 67100,reserved:6300; movedfromreserved:
>>>>>>>> false,moveToReserveredtrue
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old
>>>>>>>> used:
>>>>>>>> 98784247808,reserved: 1073741824, total: 261961924608; new used:
>>>>>>>> 94489280512,reserved:5368709120;
>>>>>>>> movedfromreserved:
>>>>>>>>
>>>>>>>> false,moveToReserveredtrue
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>> supported
>>>>>>>> in the network id=519
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>>> supported
>>>>>>>> in the network id=519
>>>>>>>>
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused by:
>>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>>>>>>> VM[User|i-638-1735-VM]
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
>>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>>>>>>> VM[User|i-638-1735-VM]
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379) Done with run of VM work job:
>>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379,
>>>>>>>> userId:
>>>>>>>> 1,
>>>>>>>> accountId: 1, instanceType: null, instanceId: null, cmd:
>>>>>>>> com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>>>>>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>>>>>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>>>>>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>>>>>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>>>>>> FuYWdlckltcGwAAAAAAAAA4A,
>>>>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>>>>>>>> result: null, initMsid: 345049098498, completeMsid: null,
>>>>>>>> lastUpdated:
>>>>>>>> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017}, job
>>>>>>>> origin:13733
>>>>>>>> ewcstack-man02-prod: com.cloud.utils.exception.Clou
>>>>>>>> dRuntimeException:
>>>>>>>> Unable to migrate VM[User|i-638-1735-VM]
>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.ja
>>>>>>>> va:2405)
>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.ja
>>>>>>>> va:4517)
>>>>>>>> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAcc
>>>>>>>> essor402.invoke(Unknown
>>>>>>>> Source)
>>>>>>>> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
>>>>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
>>>>>>>> e(Method.java:606)
>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
>>>>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
>>>>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>>>>> erImpl.java:537)
>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.
>>>>>>>> java:49)
>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>>>>> ext.java:56)
>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>>>>> nagedContext.java:103)
>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>>>>> agedContext.java:53)
>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.Executors
>>>>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
>>>>>>>> k.run(FutureTask.java:262)
>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>> db528
>>>>>>>> job-13733/job-191379) Complete async job-191379, jobStatus: FAILED,
>>>>>>>> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>>>>>> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>>>>>> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>>>>>> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>>>>>> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>>>>>> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>>>>>> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>>>>>> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>>>>>> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>>>>>> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>>>>>> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>>>>>> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>>>>>> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>>>>>> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>>>>>> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>>>>>> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>>>>>> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>>>>>> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>>>>>> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>>>>>> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>>>>>> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>>>>>> ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>>>>>> 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>>>>>> AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>>>>>> cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>>>>>> AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>>>>>> YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>>>>>> ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>>>>>> YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>>>>>> dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>>>>>> dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>>>>>> Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>>>>>> dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>>>>>> cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>>>>>> TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>>>>>> ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>>>>>> LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>>>>>> dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>>>>>> ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>>>>>> JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>>>>>> AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>>>>>> ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>>>>>> YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>>>>>> AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>>>>>> JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>>>>>> jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>>>>>> bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>>>>>> dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>>>>>> x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>> db528
>>>>>>>> job-13733/job-191379) Publish async job-191379 complete on message
>>>>>>>> bus
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>> db528
>>>>>>>> job-13733/job-191379) Wake up jobs related to job-191379
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>> db528
>>>>>>>> job-13733/job-191379) Update db status for job-191379
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>> db528
>>>>>>>> job-13733/job-191379) Wake up jobs joined with job-191379 and
>>>>>>>> disjoin
>>>>>>>> all
>>>>>>>> subjobs created from job- 191379
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>>> db528
>>>>>>>> job-13733/job-191379) Done executing com.cloud.vm.VmWorkMigrateAway
>>>>>>>> for
>>>>>>>> job-191379
>>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>>> job-13733/job-191379) Remove job-191379 from job monitoring
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>>>>>>>
>>>>>>>> Did you check the ACS log files looking for some exceptions when the
>>>>>>>>
>>>>>>>>> migrations for this specific VM started?
>>>>>>>>>
>>>>>>>>> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
>>>>>>>>> francois.scheurer@everyware.ch> wrote:
>>>>>>>>>
>>>>>>>>> Dear CS contributors
>>>>>>>>>
>>>>>>>>> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared
>>>>>>>>>> primary
>>>>>>>>>> storage.
>>>>>>>>>>
>>>>>>>>>> When we put a host in maintenance, cs evacuates all VM's on other
>>>>>>>>>> hosts.
>>>>>>>>>>
>>>>>>>>>> But it happens regularly that some VM's are stopped instead of
>>>>>>>>>> live
>>>>>>>>>> migrated.
>>>>>>>>>>
>>>>>>>>>> If we restart that stopped VM's, CS will then stop them again
>>>>>>>>>> later.
>>>>>>>>>>
>>>>>>>>>> We need to start them several times until they stays up.
>>>>>>>>>>
>>>>>>>>>> Is it a known issue? Is it fixed on CS 4.9.2 ?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Migrating manually all VM's is working but impracticable because
>>>>>>>>>> too
>>>>>>>>>> much
>>>>>>>>>> time consuming.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Many thanks in advance for your help.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Best Regards
>>>>>>>>>> Francois
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> EveryWare AG
>>>>>>>>>> François Scheurer
>>>>>>>>>> Senior Systems Engineer
>>>>>>>>>> Zurlindenstrasse 52a
>>>>>>>>>> CH-8003 Zürich
>>>>>>>>>>
>>>>>>>>>> tel: +41 44 466 60 00
>>>>>>>>>> fax: +41 44 466 60 10
>>>>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>>>>> web: http://www.everyware.ch
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>
>>>>>>>>
>>>>>>>> EveryWare AG
>>>>>>>> François Scheurer
>>>>>>>> Senior Systems Engineer
>>>>>>>> Zurlindenstrasse 52a
>>>>>>>> CH-8003 Zürich
>>>>>>>>
>>>>>>>> tel: +41 44 466 60 00
>>>>>>>> fax: +41 44 466 60 10
>>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>>> web: http://www.everyware.ch
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>> Rafael Weingärtner
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>
>>>>>
>>>>> EveryWare AG
>>>>> François Scheurer
>>>>> Senior Systems Engineer
>>>>> Zurlindenstrasse 52a
>>>>> CH-8003 Zürich
>>>>>
>>>>> tel: +41 44 466 60 00
>>>>> fax: +41 44 466 60 10
>>>>> mail: francois.scheurer@everyware.ch
>>>>> web: http://www.everyware.ch
>>>>>
>>>>>
>>>>
>>>>
>>> --
>>>
>>>
>>> EveryWare AG
>>> François Scheurer
>>> Senior Systems Engineer
>>> Zurlindenstrasse 52a
>>> CH-8003 Zürich
>>>
>>> tel: +41 44 466 60 00
>>> fax: +41 44 466 60 10
>>> mail: francois.scheurer@everyware.ch
>>> web: http://www.everyware.ch
>>>
>>>
>>
>>
>> --
>> Rafael Weingärtner
>>
>
>
>
> --
> Rafael Weingärtner
>
>
> --
>
>
> EveryWare AG
> François Scheurer
> Senior Systems Engineer
> Zurlindenstrasse 52a
> CH-8003 Zürich
>
> tel: +41 44 466 60 00
> fax: +41 44 466 60 10
> mail: francois.scheurer@everyware.ch
> web: http://www.everyware.ch
>
>


-- 
Rafael Weingärtner

Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Francois Scheurer <fr...@everyware.ch>.
Dear Rafael / Dear CS contributors


I tried the following:

1.     stop all MS
2.     backup DB
3.     select * from op_ha_work where type="Migration" and 
mgmt_server_id is null and taken is null;
4.     delete from op_ha_work where type="Migration" and mgmt_server_id 
is null and taken is null;
5.     start all MS

But actually, after the MS's were stopped, the op_ha_work table was
automatically cleaned, making my steps useless.
After the MS's were started, the op_ha_work table was filled again and the
related VM's were stopped several times (3-7 times), requiring close
attention to restart them manually after each stop.

I later found another table that could be the cause of the issue:

mysql> select op_it_work.*,vm_instance.instance_name,vm_instance.removed 
from op_it_work join vm_instance where op_it_work.step!="Done" and 
op_it_work.instance_id=vm_instance.id order by instance_id;
+--------------------------------------+----------------+------------+-----------------------+-----------+--------------------+-----------+------------+-------------+---------------+-------------+---------------+---------------------+
| id                                   | mgmt_server_id | created_at | 
thread                | type      | vm_type            | step      | 
updated_at | instance_id | resource_type | resource_id | instance_name | 
removed             |
+--------------------------------------+----------------+------------+-----------------------+-----------+--------------------+-----------+------------+-------------+---------------+-------------+---------------+---------------------+
| be4d0dd2-7d4f-47d3-ba49-623d2f115bbf |   345049098498 | 1436187965 | 
consoleproxy-1        | Starting  | ConsoleProxy       | Release   | 
1436188265 |         402 | NULL          |           0 | v-402-VM      | 
2015-07-06 13:23:22 |
| fb4d1844-eb92-478c-9f50-d666790f43af |   345049098498 | 1436187965 | 
secstorage-1          | Starting  | SecondaryStorageVm | Release   | 
1436188265 |         687 | NULL          |           0 | s-687-VM      | 
2015-07-06 13:11:34 |
| 7dfeb25a-a705-42e8-907c-79d1e69697ed |   345049098498 | 1436964548 | 
Work-Job-Executor-165 | Migrating | User               | Migrating | 
1436964548 |         776 | Host          |          29 | i-51-776-VM   | 
2015-07-15 12:55:38 |
| 2018ac61-b11c-4101-8ef5-07a39509f2bd |   345049098498 | 1486020173 | 
Work-Job-Executor-19  | Migrating | User               | Prepare   | 
1486020173 |        1177 | Host          |          25 | i-189-1177-VM | 
NULL                |
| 25430cd5-4346-4643-9c7e-7b39b8a876a2 |   345049098122 | 1486019693 | 
Work-Job-Executor-6   | Migrating | User               | Prepare   | 
1486019693 |        1177 | Host          |          25 | i-189-1177-VM | 
NULL                |
| 3276d52e-cbd9-44c5-8ab6-c70784337ff9 |   345049098498 | 1486020953 | 
Work-Job-Executor-33  | Migrating | User               | Prepare   | 
1486020953 |        1177 | Host          |          33 | i-189-1177-VM | 
NULL                |
| 6462da33-83f2-4575-af38-0e52d2a48488 |   345049098122 | 1486021433 | 
Work-Job-Executor-12  | Migrating | User               | Prepare   | 
1486021433 |        1177 | Host          |          25 | i-189-1177-VM | 
NULL                |
| 688e16c8-c39f-4e88-9e8f-36c0282f3df1 |   345049098122 | 1486020773 | 
Work-Job-Executor-11  | Migrating | User               | Prepare   | 
1486020773 |        1177 | Host          |          25 | i-189-1177-VM | 
NULL                |
| d8a18e9d-ac6f-4273-923e-e401533e2cf7 |   345049098498 | 1486020353 | 
Work-Job-Executor-25  | Migrating | User               | Prepare   | 
1486020353 |        1177 | Host          |          25 | i-189-1177-VM | 
NULL                |
| 1222ece8-399d-43d8-8ff8-d24bdebca2ed |   345049103441 | 1486018924 | 
Work-Job-Executor-2   | Migrating | User               | Prepare   | 
1486018924 |        1178 | Host          |          29 | i-189-1178-VM | 
NULL                |
| 2ac5ca4a-c20d-433a-8379-40e0de3b3ba8 |   345049098498 | 1486021613 | 
Work-Job-Executor-41  | Migrating | User               | Prepare   | 
1486021613 |        1178 | Host          |          29 | i-189-1178-VM | 
NULL                |
| 3e3b95a0-2af4-4268-bfdd-f3399a85bcaa |   345049103441 | 1486019584 | 
Work-Job-Executor-5   | Migrating | User               | Prepare   | 
1486019584 |        1178 | Host          |          29 | i-189-1178-VM | 
NULL                |
| 53745a2d-056c-4a1b-9267-eb97f077a6c8 |   345049103441 | 1486020244 | 
Work-Job-Executor-6   | Migrating | User               | Prepare   | 
1486020244 |        1178 | Host          |          29 | i-189-1178-VM | 
NULL                |
| e3555b49-7c18-4272-b7f4-ba9351b3bbd1 |   345049098498 | 1486020893 | 
Work-Job-Executor-32  | Migrating | User               | Prepare   | 
1486020893 |        1178 | Host          |          29 | i-189-1178-VM | 
NULL                |
| 1dcd5204-4f9a-4342-a23b-329a48596e13 |   345049098498 | 1486020953 | 
Work-Job-Executor-34  | Migrating | User               | Prepare   | 
1486020953 |        1240 | Host          |           3 | i-388-1240-VM | 
NULL                |
| ec1668b3-fa3c-4673-b593-3d607b169a24 |   345049098498 | 1486019693 | 
Work-Job-Executor-14  | Migrating | User               | Prepare   | 
1486019693 |        1240 | Host          |           3 | i-388-1240-VM | 
NULL                |
| fa7b2dec-ed7f-4a67-a055-e6e6bc1631ca |   345049098498 | 1486020353 | 
Work-Job-Executor-26  | Migrating | User               | Prepare   | 
1486020353 |        1240 | Host          |           3 | i-388-1240-VM | 
NULL                |
| 8cfe6d3e-5b7b-4be8-ac10-c865a492aab9 |   345049098498 | 1465387271 | 
Work-Job-Executor-18  | Running   | User               | Prepare   | 
1465387271 |        1388 | Host          |           7 | i-361-1388-VM | 
NULL                |
| 6e941b7e-6f19-4717-bf82-974c59737dbc |   345049103441 | 1486019524 | 
Work-Job-Executor-4   | Migrating | User               | Prepare   | 
1486019524 |        1690 | Host          |          15 | i-648-1690-VM | 
NULL                |
| 718578a4-0dd6-42d9-be10-ed7680650108 |   345049098122 | 1486021433 | 
Work-Job-Executor-13  | Migrating | User               | Prepare   | 
1486021433 |        1690 | Host          |          15 | i-648-1690-VM | 
NULL                |
| 9acf3908-ef13-4b03-8c2b-1a7e55e18491 |   345049098498 | 1486020173 | 
Work-Job-Executor-20  | Migrating | User               | Prepare   | 
1486020173 |        1690 | Host          |          15 | i-648-1690-VM | 
NULL                |
| a50c420b-ae52-4caa-83db-704b52b299a4 |   345049098498 | 1486020773 | 
Work-Job-Executor-28  | Migrating | User               | Prepare   | 
1486020773 |        1690 | Host          |          15 | i-648-1690-VM | 
NULL                |
| 2b5f4179-5975-4326-9db7-643c3ede63fe |   345049103441 | 1484224516 | 
Work-Job-Executor-161 | Running   | User               | Prepare   | 
1484224516 |        1703 | Host          |           3 | i-51-1703-VM  | 
NULL                |
| 2f3c6c44-b2a0-4d10-88d2-ac143d2dd379 |   345049098498 | 1484223910 | 
Work-Job-Executor-163 | Running   | User               | Prepare   | 
1484223910 |        1703 | Host          |           3 | i-51-1703-VM  | 
NULL                |
| 4126a23b-1cde-49bc-8636-c0d063b85cf6 |   345049103441 | 1484224636 | 
Work-Job-Executor-86  | Running   | User               | Prepare   | 
1484224636 |        1703 | Host          |           3 | i-51-1703-VM  | 
NULL                |
| 5dad5645-766b-48c9-9a85-b77b3c101390 |   345049103441 | 1484224636 | 
Work-Job-Executor-92  | Running   | User               | Prepare   | 
1484224636 |        1703 | Host          |           3 | i-51-1703-VM  | 
NULL                |
| 8c1dfff9-e55d-4f81-a22f-45f72accc416 |   345049103441 | 1484224576 | 
Work-Job-Executor-95  | Running   | User               | Prepare   | 
1484224576 |        1703 | Host          |           3 | i-51-1703-VM  | 
NULL                |
| a703d2d7-b1e3-42ce-a551-831982990ffb |   345049098498 | 1484224630 | 
Work-Job-Executor-13  | Running   | User               | Prepare   | 
1484224630 |        1703 | Host          |           3 | i-51-1703-VM  | 
NULL                |
| 0e194cb5-95b7-41cb-969a-ae37d2ad449a |   345049098498 | 1485763613 | 
Work-Job-Executor-28  | Migrating | User               | Prepare   | 
1485763613 |        1735 | Host          |          19 | i-638-1735-VM | 
2017-01-30 15:24:10 |
| 74f180b9-be0f-429b-9a45-597ca2ca997d |   345049098498 | 1485763013 | 
Work-Job-Executor-21  | Migrating | User               | Prepare   | 
1485763013 |        1735 | Host          |          19 | i-638-1735-VM | 
2017-01-30 15:24:10 |
| feaa421c-88df-4493-9a4f-bddb6ee1b75b |   345049098122 | 1485764285 | 
Work-Job-Executor-19  | Migrating | User               | Prepare   | 
1485764285 |        1735 | Host          |          19 | i-638-1735-VM | 
2017-01-30 15:24:10 |
| 32138a36-ad18-42b5-8478-76ad63caf4da |   345049098122 | 1486020293 | 
Work-Job-Executor-9   | Migrating | User               | Prepare   | 
1486020293 |        1736 | Host          |          19 | i-638-1736-VM | 
NULL                |
| 5306ab1f-3c92-4f95-bbff-283b43e6862d |   345049098498 | 1486019573 | 
Work-Job-Executor-11  | Migrating | User               | Prepare   | 
1486019573 |        1736 | Host          |          19 | i-638-1736-VM | 
NULL                |
| 843bdf9c-247d-432c-ac19-1a4f7858631b |   345049098498 | 1486021493 | 
Work-Job-Executor-38  | Migrating | User               | Prepare   | 
1486021493 |        1736 | Host          |          19 | i-638-1736-VM | 
NULL                |
| 8d400afb-766f-4ac5-89ee-854f25a38eba |   345049098498 | 1486020173 | 
Work-Job-Executor-21  | Migrating | User               | Prepare   | 
1486020173 |        1736 | Host          |          19 | i-638-1736-VM | 
NULL                |
| f0ae5612-b6ef-4882-94a2-40a1e6dce342 |   345049098498 | 1486020833 | 
Work-Job-Executor-30  | Migrating | User               | Prepare   | 
1486020833 |        1736 | Host          |          19 | i-638-1736-VM | 
NULL                |
+--------------------------------------+----------------+------------+-----------------------+-----------+--------------------+-----------+------------+-------------+---------------+-------------+---------------+---------------------+

As you can see, the problematic VMs are all present in that table:

    i-189-1177-VM
    i-189-1178-VM
    i-388-1240-VM
    i-648-1690-VM
    i-638-1736-VM
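
For a quicker overview, the same data can also be aggregated per VM (just a 
sketch, using the same tables and columns as the query above; created_at is 
a unix epoch, hence the from_unixtime):

mysql> select vm_instance.instance_name, op_it_work.step, count(*) as pending, 
from_unixtime(min(op_it_work.created_at)) as oldest from op_it_work join 
vm_instance on op_it_work.instance_id=vm_instance.id where 
op_it_work.step!="Done" group by vm_instance.instance_name, op_it_work.step;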

As the MSs are running the production environments, we cannot debug 
much and just need the shortest path to a resolution.
For example, restarting the MSs has an impact on our customers, so we 
need to keep the number of restarts to a minimum.

I have disabled HA on all problematic VMs except i-189-1177-VM, 
which is less critical.

The next steps could be:

1.     stop all MS
2.     backup DB
3.     select op_it_work.*,vm_instance.instance_name,vm_instance.removed 
from op_it_work join vm_instance where op_it_work.step!="Done" and 
op_it_work.instance_id=vm_instance.id order by instance_id;
4.     update op_it_work set step="Done" where step!="Done";
5.     start all MS

What do you think about step 4: safe or risky?
Some of these entries are related to deleted VMs and seem safe to 
mark as "Done".
All other entries are related to the problematic VMs that keep 
stopping... so marking them as "Done" may (hopefully) eliminate the 
cause of the Migrating/Stopping.
But we are not sure whether marking them as "Done" also requires changes 
to other tables for consistency.
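
A more conservative variant of step 4. could be (just a sketch, untested on 
our side) to limit the update to the rows shown above, i.e. the known 
problematic instance ids plus the entries belonging to already removed VMs, 
instead of touching the whole table:

update op_it_work set step="Done" where step!="Done" and 
instance_id in (1177, 1178, 1240, 1690, 1736);

update op_it_work join vm_instance on op_it_work.instance_id=vm_instance.id 
set op_it_work.step="Done" where op_it_work.step!="Done" and 
vm_instance.removed is not null;

That would leave the remaining "Prepare" entries (e.g. for instances 1388 
and 1703) untouched for a later check.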


Thank you for your help.


Best Regards
Francois Scheurer





On 01.02.2017 03:01, Rafael Weingärtner wrote:
> Ok, I dug a little further into the code.
>
>     If we put a host in maintenance, it may unfortunately happen that
>     a timeout occurs and CS aborts the migration. As far as we can
>     interpret the logs, it seems (we speculate) that CS detects that
>     the VM is then still running on the original host, which is in
>     maintenance, and decides somehow to stop it because of the "host
>     not up" status.
>
>
> From what I have seen in the code, this may happen (as you speculated). 
> If a timeout or some other event interrupts migrations, the VM will be 
> stopped. This happens in the method 
> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String, 
> long, DeploymentPlanner)” at line 2396 and further. This was 
> implemented on purpose; if the migration does not happen; the VM is 
> stopped on the source host. Probably it was implemented this way on 
> that philosophy of gradient descendent of artificial intelligence:
>
>     “pick a random point in the field, update the values according to
>     the slope and hopefully after successfully iterations you converge
>     to a minimum point.” (the problem is that sometimes the function
>     diverges).
>
>
> I mean, probably this was done this way because the HA was supposed to 
> see that a VM with HA is stopped, and should try to start it again 
> somewhere else. The log you posted do not provide any other details to 
> debug this further because the log is after an MS restart, right?
>
>
>     After restarting the MSs I guess that a similar issue may happen:
>     the hosts are disconnected during the early stage and then
>     reconnects. Maybe CS then again stops the VMs because the "host
>     not up" status?
>     Cf. 2017-01-30 08:44:06,157 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>     ctx-6e9c753f) The last host of this VM is not UP or is not
>     enabled, host status is: Disconnected, host resource state is: Enabled
>
>
> Here is something interesting that you have found. I checked the code, 
> but without a live debug it is pretty hard for me to get the right 
> flow of execution. By looking at the code and at the logs you 
> provided, your deduction is right. After ACS boots up, the check for 
> hosts that are shutdown/deactivated starts, and the method 
> “com.cloud.agent.manager.AgentManagerImpl.executeUserRequest(long, 
> Event)” is executed. The problem is that all of the hosts are 
> considered as disconnected (ACS may not have finished connecting to 
> all of them yet), and then a whole process of HA is kicked. The 
> problem is that there is no host with status “UP”. That is why this 
> happen “No suitable hosts found under this Cluster: 25”, and then the 
> whole flow of HA, which shuts down the VM is kicked.
>
> In theory, the HA worker should see that a VM with HA is down, and 
> should try to start it. I am not understanding why this is not 
> working. If you decide to debug the running ACS, I may be able to help 
> you, just keep me posted.
>
>
> On Tue, Jan 31, 2017 at 10:26 AM, Rafael Weingärtner 
> <rafaelweingartner@gmail.com> wrote:
>
>     I will be able to re-check some code and answer you with new
>     thoughts on this problem tonight (the log files you provided today
>     have some very insightful information regarding this problem). I
>     believe this is a bug in HA, a very tricky one that has not been
>     reported yet (at least I do not remember seeing anything similar).
>     And thanks to randomness/destiny now you have found/uncovered a
>     way to reproduce the problem (bug?!).
>
>     I do not suggest you changing any data at all directly in the data
>     base (unless you know ACS code well and are 100% of what you are
>     doing). This can cause even bigger and more complicated problems.
>     If you remove the HA feature from this VMs, this problem will
>     stop. However, if you can/are willing to maintain the HA until we
>     fully uncover the problem, map and report it, this would be great.
>     BTW: why are you restarting MSs?
>     What are the configurations of your MSs?
>
>     Thanks for your time and for sharing information with our community.
>
>
>
>     On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer
>     <francois.scheurer@everyware.ch> wrote:
>
>         Dear Rafael
>
>
>         Many thanks for your deep analysis.
>         Nice humor with the planets alignments BTW ;)
>
>         We also suspected the HA Flag to be related this issue, as we
>         never got this issue in the past, until we enabled the HA on
>         all VMs and Offerings.
>
>         Here is also the workers list:
>
>         mysql> select * from op_ha_work;
>         +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>         | id   | instance_id | type      | vm_type     | state    |
>         mgmt_server_id | host_id | created             | tried | taken
>          | step  | time_to_try | updated |
>         +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>         |  105 |         343 | HA        | User    | Running  | 
>          345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28
>         03:30:04 | Error |  1394056013 |       3 |
>         |  109 |         407 | HA        | ConsoleProxy    | Running 
>         |   345049098498 |      33 | 2015-03-28 03:19:03 | 2 |
>         2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>         |  113 |         407 | HA        | ConsoleProxy    | Running 
>         |   345049098498 |      33 | 2015-03-28 03:19:03 | 2 |
>         2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>         |  117 |         407 | HA        | ConsoleProxy    | Running 
>         |   345049098498 |      33 | 2015-03-28 03:19:03 | 2 |
>         2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>         |  121 |         343 | HA        | User    | Running  | 
>          345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28
>         03:30:04 | Error |  1394056013 |       3 |
>         |  125 |         405 | HA        | SecondaryStorageVm |
>         Running  |   345049098498 |     29 | 2015-03-28 03:19:03 | 2 |
>         2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>         |  129 |         393 | HA        | User    | Running  | 
>          345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28
>         03:30:04 | Error |  1394056014 |      30 |
>         |  133 |         405 | HA        | SecondaryStorageVm |
>         Running  |   345049098498 |     29 | 2015-03-28 03:19:03 | 2 |
>         2015-03-28 03:30:04 | Error |  1394056014 |       3 |
>         |  137 |         402 | HA        | ConsoleProxy    | Running 
>         |   345049098498 |      25 | 2015-03-28 03:19:03 | 2 |
>         2015-03-28 03:30:04 | Error |  1394056014 |       3 |
>         |  345 |         402 | HA        | ConsoleProxy    | Starting
>         |   345049098498 |      25 | 2015-07-06 13:09:26 | 1 |
>         2015-07-06 13:09:26 | Error |  1402527506 |     770 |
>         |  349 |         687 | HA        | SecondaryStorageVm |
>         Starting |   345049098498 |     25 | 2015-07-06 13:09:26 | 1 |
>         2015-07-06 13:09:26 | Error |  1402527506 |       2 |
>         | 2057 |        1178 | Migration | User    | Running  | 
>          345049098498 |      33 | 2017-01-31 07:56:08 | 0 | 2017-01-31
>         07:56:08 | Done  |  1451024774 |      95 |
>         | 2058 |        1736 | Migration | User    | Running  | 
>          345049098498 |     224 | 2017-01-31 07:56:08 | 0 | 2017-01-31
>         07:56:08 | Done  |  1451024774 |      45 |
>         | 2059 |        1240 | Migration | User    | Running  | 
>          345049098498 |       7 | 2017-01-31 07:56:08 | 0 | 2017-01-31
>         07:56:08 | Done  |  1451024774 |      61 |
>         | 2060 |        1178 | Migration | User    | Running  | 
>          345049098122 |      33 | 2017-01-31 07:56:08 | 1 | 2017-01-31
>         08:06:37 | Done  |  1451025374 |      95 |
>         | 2061 |        1178 | Migration | User    | Stopped  |       
>            NULL |      33 | 2017-01-31 07:56:09 | 5 | NULL           
>             | Done  | 1451027900 |     109 |
>         | 2062 |        1690 | Migration | User    | Running  | 
>          345049098498 |     224 | 2017-01-31 07:56:09 | 0 | 2017-01-31
>         07:56:09 | Done  |  1451024774 |     109 |
>         | 2063 |        1177 | Migration | User    | Running  | 
>          345049098498 |      29 | 2017-01-31 07:56:09 | 0 | 2017-01-31
>         07:56:09 | Done  |  1451024774 |      82 |
>         | 2064 |        1690 | Migration | User    | Running  | 
>          345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31
>         07:58:18 | Done  |  1451024900 |     109 |
>         | 2065 |        1240 | Migration | User    | Running  | 
>          345049098122 |       7 | 2017-01-31 07:56:09 | 1 | 2017-01-31
>         07:58:18 | Done  |  1451024900 |      61 |
>         | 2066 |        1736 | Migration | User    | Running  | 
>          345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31
>         07:58:18 | Done  |  1451024900 |      45 |
>         | 2067 |        1240 | Migration | User    | Stopped  |       
>            NULL |       7 | 2017-01-31 07:56:09 | 5 | NULL           
>             | Done  | 1451028003 |      80 |
>         | 2068 |        1178 | Migration | User    | Running  | 
>          345049098122 |      33 | 2017-01-31 07:56:16 | 0 | 2017-01-31
>         07:56:16 | Done  |  1451024781 |      95 |
>         | 2069 |        1736 | Migration | User    | Running  | 
>          345049098122 |     224 | 2017-01-31 07:56:16 | 0 | 2017-01-31
>         07:56:16 | Done  |  1451024781 |      45 |
>         | 2070 |        1240 | Migration | User    | Running  | 
>          345049098122 |       7 | 2017-01-31 07:56:16 | 0 | 2017-01-31
>         07:56:16 | Done  |  1451024781 |      61 |
>         | 2071 |        1177 | Migration | User    | Running  | 
>          345049098122 |      29 | 2017-01-31 07:56:16 | 0 | 2017-01-31
>         07:56:16 | Done  |  1451024781 |      82 |
>         | 2072 |        1177 | Migration | User    | Stopped  |       
>            NULL |      29 | 2017-01-31 07:56:16 | 5 | NULL           
>             | Done  | 1451027973 |     101 |
>         | 2073 |        1177 | Migration | User    | Running  | 
>          345049098122 |      29 | 2017-01-31 07:56:16 | 1 | 2017-01-31
>         07:58:22 | Done  |  1451024904 |      82 |
>         | 2074 |        1690 | Migration | User    | Stopped  |       
>            NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL           
>             | Done  | 1451027464 |     123 |
>         | 2075 |        1736 | Migration | User    | Stopped  |       
>            NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL           
>             | Done  | 1451027477 |      59 |
>         +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>
>         If we put a host in maintenance, it may unfortunately happen
>         that a timeout occurs and CS aborts the migration.
>         As far as we can interpret the logs, it seems (we speculate)
>         that CS detects that the VM is then still running on the
>         original host, which is in maintenance, and decides somehow to
>         stop it because of the "host not up" status.
>         We then need to start several times the VM, until CS gives up
>         the stopping.
>
>         After restarting the MSs I guess that a similar issue may
>         happen: the hosts are disconnected during the early stage and
>         then reconnects. Maybe CS then again stops the VMs because the
>         "host not up" status?
>         Cf. 2017-01-30 08:44:06,157 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>         ctx-6e9c753f) The last host of this VM is not UP or is not
>         enabled, host status is: Disconnected, host resource state is:
>         Enabled
>
>         But the problem after the MSs restart is definitely only
>         impacting VMs listed in the Table op_ha_work.
>
>
>         To your questions:
>                 We have 3 ACS MSs managing several Zones. Each zone
>         has a single pod and cluster.
>                 The Cluster of i-638-1735-VM has this summary:
>                  Total number of VMs: 121
>                  Total number hypervisors: 4
>                  Total allocated RAM: 522.0 / 972 GB (53.7 %)
>                  Total allocated cores: 211
>                  Total allocated storage: 11285 GB
>                 Used Memory on the hosts is now varying from 28 to
>         77%, so there is enough RAM.
>                 I did not check the XS logs. The same Problem has
>         arisen this morning again (shortly after a restart of all
>         MSs), so I will look in the XS logs now.
>
>                 I have some fair experience with Eclipse and Java, I
>         will ask my boss if we can go this way.
>
>         The main issue for now, is that restarting the MSs is
>         reproducing the problem, which is quite annoying for the
>         impacted VM's.
>         I thought to stop the MSs, delete all HA/Migration entries in
>         op_ha_work and start the MSs.
>         Is that a viable option? (or should I delete the Migration
>         entries only? or never touch this table?)
>         We will also very probably remove the HA Flag on all VMs if it
>         does not work reliably.
>
>         Best Regards
>         Francois Scheurer
>
>
>
>         Here are new logs regarding the same issue of today.
>         I tried to get the important parts related to the stop of
>         i-189-1178-VM.
>
>         regexp:
>         (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRequestedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae733|work-2049)
>
>         2017-01-30 08:44:04,440 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>         Executing AsyncJobVO {id:192989, userId: 1, accountId: 1,
>         instanceType: null, instanceId: null, cmd:
>         com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>         rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>         F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>         cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resu
>         ltCode: 0, result: null, initMsid: 345049103441, completeMsid:
>         null, lastUpdated: null, lastPolled: null, created: Mon Jan 30
>         08:44:03 CET 2017}
>         2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Run VM
>         work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job
>         origin: 15393
>
>         2017-01-30 08:44:05,217 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Deploy avoids pods: [], clusters: [], hosts: [29]
>         2017-01-30 08:44:05,220 DEBUG
>         [c.c.a.m.ClusteredAgentManagerImpl]
>         (AgentConnectTaskPool-4:ctx-5b61f2e5) create
>         ClusteredAgentAttache for 232
>         2017-01-30 08:44:05,220 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) DeploymentPlanner allocation algorithm:
>         com.cloud.deploy.UserDispersingPlanner@1734b2e2
>         2017-01-30 08:44:05,220 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Trying to allocate a host and storage pools from
>         dc:17, pod:25,cluster:25, requested cpu: 4200, requested ram:
>         4294967296
>         2017-01-30 08:44:05,220 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Is ROOT volume READY (pool already allocated)?: Yes
>         2017-01-30 08:44:05,220 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) This VM has last host_id specified, trying to
>         choose the same host: 25
>
>         2017-01-30 08:44:05,614 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) The last host of this VM is not UP or is not
>         enabled, host status is: Disconnected, host resource state is:
>         Enabled
>         2017-01-30 08:44:05,614 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Cannot choose the last host to deploy this VM
>         2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Searching resources only under specified Cluster: 25
>
>         2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea FirstFitRoutingAllocator) Found 0 hosts for
>         allocation after prioritization: []
>         2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea FirstFitRoutingAllocator) Looking for
>         speed=4200Mhz, Ram=4096
>         2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea FirstFitRoutingAllocator) Host Allocator
>         returning 0 suitable hosts
>         2017-01-30 08:44:05,777 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) No suitable hosts found
>         2017-01-30 08:44:05,777 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) No suitable hosts found under this Cluster: 25
>         2017-01-30 08:44:05,787 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Could not find suitable Deployment Destination
>         for this VM under any clusters, returning.
>         2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Searching resources only under specified Cluster: 25
>         2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) The specified cluster is in avoid set, returning.
>         2017-01-30 08:44:05,798 DEBUG
>         [c.c.v.VirtualMachineManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Unable to find destination for migrating the vm
>         VM[User|i-189-1178-VM]
>         2017-01-30 08:44:05,798 WARN [c.c.v.VirtualMachineManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Failed to deploy vm 1178 with original planner,
>         sending HAPlanner
>
>         2017-01-30 08:44:05,985 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Deploy avoids pods: [], clusters: [], hosts: [29]
>         2017-01-30 08:44:05,985 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) DeploymentPlanner allocation algorithm:
>         com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>         2017-01-30 08:44:05,985 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Trying to allocate a host and storage pools from
>         dc:17, pod:25,cluster:25, requested cpu: 4200, requested ram:
>         4294967296
>         2017-01-30 08:44:05,985 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Is ROOT volume READY (pool already allocated)?: Yes
>         2017-01-30 08:44:05,985 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) This VM has last host_id specified, trying to
>         choose the same host: 25
>
>         2017-01-30 08:44:06,022 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) The last host of this VM is not UP or is not
>         enabled, host status is: Disconnected, host resource state is:
>         Enabled
>         2017-01-30 08:44:06,022 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Cannot choose the last host to deploy this VM
>         2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Searching resources only under specified Cluster: 25
>         2017-01-30 08:44:06,039 DEBUG [c.c.d.SkipHeuresticsPlanner]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Deploying vm during HA process, so skipping
>         disable threshold check
>
>         2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea FirstFitRoutingAllocator) Found 0 hosts for
>         allocation after prioritization: []
>         2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea FirstFitRoutingAllocator) Looking for
>         speed=4200Mhz, Ram=4096
>         2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea FirstFitRoutingAllocator) Host Allocator
>         returning 0 suitable hosts
>         2017-01-30 08:44:06,151 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) No suitable hosts found
>         2017-01-30 08:44:06,151 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) No suitable hosts found under this Cluster: 25
>         2017-01-30 08:44:06,157 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986)
>         Publish async job-192986 complete on message bus
>         2017-01-30 08:44:06,157 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake
>         up jobs related to job-192986
>         2017-01-30 08:44:06,157 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Update
>         db status for job-192986
>         2017-01-30 08:44:06,157 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>         ctx-6e9c753f) The last host of this VM is not UP or is not
>         enabled, host status is: Disconnected, host resource state is:
>         Enabled
>         2017-01-30 08:44:06,157 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>         ctx-6e9c753f) Cannot choose the last host to deploy this VM
>         2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner]
>         (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>         ctx-6e9c753f) Searching resources only under specified Cluster: 35
>         2017-01-30 08:44:06,160 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Could not find suitable Deployment Destination
>         for this VM under any clusters, returning.
>         2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Searching resources only under specified Cluster: 25
>         2017-01-30 08:44:06,163 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue
>         item: SyncQueueItemVO {id:63629, queueId: 64423, contentType:
>         AsyncJob, contentId: 192991, lastProcessMsid: 345049098498,
>         lastprocessNumber: 27, lastProcessTime: Mon Jan 30 08:44:06
>         CET 2017, created: Mon Jan 30 08:44:05 CET 2017}
>         2017-01-30 08:44:06,169 INFO [o.a.c.f.j.i.AsyncJobMonitor]
>         (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Add
>         job-192992 into job monitoring
>         2017-01-30 08:44:06,184 DEBUG [c.c.d.SkipHeuresticsPlanner]
>         (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>         ctx-6e9c753f) Deploying vm during HA process, so skipping
>         disable threshold check
>         2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) The specified cluster is in avoid set, returning.
>         2017-01-30 08:44:06,186 DEBUG
>         [c.c.v.VirtualMachineManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Unable to find destination for migrating the vm
>         VM[User|i-189-1178-VM]
>         2017-01-30 08:44:06,186 ERROR [c.c.v.VmWorkJobHandlerProxy]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Invocation exception, caused by:
>         com.cloud.exception.InsufficientServerCapacityException:
>         Unable to find a server to migrate to.Scope=interface
>         com.cloud.org.Cluster; id=25
>         2017-01-30 08:44:06,187 INFO [c.c.v.VmWorkJobHandlerProxy]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>         ctx-ea83d6ea) Rethrow exception
>         com.cloud.exception.InsufficientServerCapacityException:
>         Unable to find a server to migrate to.Scope=interface
>         com.cloud.org.Cluster; id=25
>         2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done
>         with run of VM work job: com.cloud.vm.VmWorkMigrateAway for VM
>         1178, job origin: 15393
>         2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Unable
>         to complete AsyncJobVO {id:192989, userId: 1, accountId: 1,
>         instanceType: null, instanceId: null, cmd:
>         com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>         rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>         cmdVersion: 0, status: IN_PROGRESS, processStatus: 0,
>         resultCode: 0, result: null, initMsid: 345049103441,
>         completeMsid: null, lastUpdated: null, lastPolled: null,
>         created: Mon Jan 30 08:44:03 CET 2017}, job origin:15393
>         com.cloud.exception.InsufficientServerCapacityException:
>         Unable to find a server to migrate to.Scope=interface
>         com.cloud.org.Cluster; id=25
>                 at
>         com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2390)
>                 at
>         com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4520)
>                 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>         Method)
>                 at
>         sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>                 at
>         sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>                 at java.lang.reflect.Method.invoke(Method.java:606)
>                 at
>         com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>                 at
>         com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>                 at
>         com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:103)
>                 at
>         org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
>                 at
>         org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>                 at
>         org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>                 at
>         org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>                 at
>         org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>                 at
>         org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>                 at
>         org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>                 at
>         java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>                 at
>         java.util.concurrent.FutureTask.run(FutureTask.java:262)
>                 at
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>                 at
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>                 at java.lang.Thread.run(Thread.java:745)
>         2017-01-30 08:44:06,188 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued job-192991
>         2017-01-30 08:44:06,192 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake
>         up jobs joined with job-192986 and disjoin all subjobs created
>         from job- 192986
>         2017-01-30 08:44:06,196 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>         Complete async job-192989, jobStatus: FAILED, resultCode: 0,
>         result:
>         rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZlckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vwc0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRDYXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5nL0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xvdWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vycm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4aOxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNhdXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQAEkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFuZy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QAEExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBzZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFjZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFyaW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2VySW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hlc3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EAfgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNxAH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5yZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhqYXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EAfgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pvYnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNsb3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRjaGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1wbCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRleHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdlZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2FsbHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5jb250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxsV2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5jb25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRvcnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdUaHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29ya2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAALVGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5tb2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEudXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMAAAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAFdmFsdWV4c
QB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-omaW2FAIAAHhwAA
>         2017-01-30 08:44:06,201 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>         ctx-4c93bc31) Deploy avoids pods: [], clusters: [], hosts: [3]
>         2017-01-30 08:44:06,201 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>         ctx-4c93bc31) DeploymentPlanner allocation algorithm:
>         com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>         2017-01-30 08:44:06,201 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>         ctx-4c93bc31) Trying to allocate a host and storage pools from
>         dc:14, pod:17,cluster:17, requested cpu: 4200, requested ram:
>         4294967296
>         2017-01-30 08:44:06,201 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>         ctx-4c93bc31) Is ROOT volume READY (pool already allocated)?: Yes
>         2017-01-30 08:44:06,201 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>         ctx-4c93bc31) This VM has last host_id specified, trying to
>         choose the same host: 213
>         2017-01-30 08:44:06,213 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989)
>         Publish async job-192989 complete on message bus
>         2017-01-30 08:44:06,213 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake
>         up jobs related to job-192989
>         2017-01-30 08:44:06,213 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Update
>         db status for job-192989
>         2017-01-30 08:44:06,219 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>         ctx-6e9c753f) Checking resources in Cluster: 35 under Pod: 31
>         2017-01-30 08:44:06,227 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>         (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>         ctx-6e9c753f FirstFitRoutingAllocator) Looking for hosts in
>         dc: 23  pod:31  cluster:35
>         2017-01-30 08:44:06,229 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992)
>         Executing AsyncJobVO {id:192992, userId: 1, accountId: 1,
>         instanceType: null, instanceId: null, cmd:
>         com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>         rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw,
>         cmdVersion: 0, status: IN_PROGRESS, processStatus: 0,
>         resultCode: 0, result: null, initMsid: 345049098122,
>         completeMsid: null, lastUpdated: null, lastPolled: null,
>         created: Mon Jan 30 08:44:05 CET 2017}
>         2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher]
>         (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Run VM
>         work job: com.cloud.vm.VmWorkMigrateAway for VM 1729, job
>         origin: 14953
>         2017-01-30 08:44:06,230 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue
>         item: SyncQueueItemVO {id:63630, queueId: 67180, contentType:
>         AsyncJob, contentId: 192990, lastProcessMsid: 345049098498,
>         lastprocessNumber: 61, lastProcessTime: Mon Jan 30 08:44:06
>         CET 2017, created: Mon Jan 30 08:44:05 CET 2017}
>         2017-01-30 08:44:06,244 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>         ctx-4c93bc31) The last host of this VM is not UP or is not
>         enabled, host status is: Disconnected, host resource state is:
>         Enabled
>         2017-01-30 08:44:06,262 DEBUG
>         [c.c.d.DeploymentPlanningManagerImpl]
>         (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>         ctx-4c93bc31) Cannot choose the last host to deploy this VM
>         2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner]
>         (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>         ctx-4c93bc31) Searching resources only under specified Cluster: 17
>         2017-01-30 08:44:06,262 INFO [o.a.c.f.j.i.AsyncJobMonitor]
>         (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Add
>         job-192991 into job monitoring
>         2017-01-30 08:44:06,246 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Done
>         executing com.cloud.vm.VmWorkMigrateAway for job-192986
>         2017-01-30 08:44:06,244 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake
>         up jobs joined with job-192989 and disjoin all subjobs created
>         from job- 192989
>         2017-01-30 08:44:06,269 INFO [o.a.c.f.j.i.AsyncJobMonitor]
>         (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Remove
>         job-192986 from job monitoring
>         2017-01-30 08:44:06,279 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done
>         executing com.cloud.vm.VmWorkMigrateAway for job-192989
>         2017-01-30 08:44:06,305 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl]
>         (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991)
>         Executing AsyncJobVO {id:192991, userId: 1, accountId: 1,
>         instanceType: null, instanceId: null, cmd:
>         com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>         rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw,
>         cmdVersion: 0, status: IN_PROGRESS, processStatus: 0,
>         resultCode: 0, result: null, initMsid: 345049098122,
>         completeMsid: null, lastUpdated: null, lastPolled: null,
>         created: Mon Jan 30 08:44:05 CET 2017}
>         2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher]
>         (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Run VM
>         work job: com.cloud.vm.VmWorkMigrateAway for VM 1605, job
>         origin: 14965
>         2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl]
>         (AgentManager-Handler-13:null) Ping from 258
>         2017-01-30 08:44:06,319 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>         (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>         ctx-6e9c753f FirstFitRoutingAllocator) List of hosts in
>         ascending order of number of VMs: [224, 19, 21, 15]
>         2017-01-30 08:44:06,322 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>         (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>         ctx-6e9c753f FirstFitRoutingAllocator) FirstFitAllocator has 0
>         hosts to check for allocation: []
>         2017-01-30 08:44:06,322 INFO [c.c.a.m.AgentManagerImpl]
>         (AgentManager-Handler-13:null) PingMap for agent: 258 will not
>         be updated because agent is no longer in the PingMap
>         2017-01-30 08:44:06,326 INFO [o.a.c.f.j.i.AsyncJobMonitor]
>         (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Remove
>         job-192989 from job monitoring
>
>         2017-01-30 08:54:38,139 INFO [c.c.v.VirtualMachineManagerImpl]
>         (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>         ctx-052eb4f5) Migration cancelled because state has changed:
>         VM[User|i-189-1178-VM]
>         2017-01-30 08:54:38,139 DEBUG
>         [c.c.v.VirtualMachineManagerImpl]
>         (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>         ctx-052eb4f5) Unable to migrate VM due to: Migration cancelled
>         because state has changed: VM[User|i-189-1178-VM]
>         2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl]
>         (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>         ctx-052eb4f5) VM state transitted from :Running to Stopping
>         with event: StopRequestedvm's original host id: 29 new host
>         id: 33 host id before state transition: 33
>         2017-01-30 08:54:38,177 DEBUG [c.c.a.m.DirectAgentAttache]
>         (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198:
>         Response Received:
>         2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>         (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198:
>         Processing:  { Ans: , MgmtId: 345049098498, via: 19, Ver: v1,
>         Flags: 10,
>         [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,"result":true,"details":"","wait":0}}]
>         }
>         2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>         (RouterMonitor-1:ctx-2e80f405) Seq 19-2355664080091480198:
>         Received:  { Ans: , MgmtId: 345049098498, via: 19, Ver: v1,
>         Flags: 10, { NetworkUsageAnswer } }
>         2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl]
>         (RouterMonitor-1:ctx-2e80f405) Details from executing class
>         com.cloud.agent.api.NetworkUsageCommand:
>         2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>         (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>         ctx-052eb4f5) Seq 33-1917970491306410035: Sending  { Cmd ,
>         MgmtId: 345049098498, via: 33(ewcstack-vh003-prod), Ver: v1,
>         Flags: 100011,
>         [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>         }
>         2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>         (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>         ctx-052eb4f5) Seq 33-1917970491306410035: Executing:  { Cmd ,
>         MgmtId: 345049098498, via: 33(ewcstack-vh003-prod), Ver: v1,
>         Flags: 100011,
>         [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>         }
>         2017-01-30 08:54:38,178 DEBUG [c.c.a.m.DirectAgentAttache]
>         (DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035:
>         Executing request
>         2017-01-30 08:54:38,202 DEBUG [c.c.a.m.DirectAgentAttache]
>         (DirectAgent-326:ctx-4fb88aa7) Seq 21-1035546439318503505:
>         Executing request
>
>         2017-01-30 09:05:55,628 INFO
>         [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-4:ctx-36aae733
>         work-2049) Processing
>         HAWork[2049-Migration-1178-Stopped-Migrating]
>         2017-01-30 09:05:55,633 WARN
>         [o.a.c.f.j.AsyncJobExecutionContext] (HA-Worker-4:ctx-36aae733
>         work-2049) Job is executed without a context, setup psudo job
>         for the executing thread
>         2017-01-30 09:05:55,645 DEBUG
>         [o.a.c.f.j.i.AsyncJobManagerImpl] (HA-Worker-4:ctx-36aae733
>         work-2049) Sync job-193045 execution on object
>         VmWorkJobQueue.1178
>
>
>
>
>         On 31.01.2017 00:42, Rafael Weingärtner wrote:
>
>             This seems to be one of those nasty bugs that are very
>             hard to debug. It
>             seems to be cause to a combination of different factors,
>             such as planets
>             alignments ;)
>
>             At class “VirtualMachineManagerImpl” line 2395 (method
>             “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>             long, DeploymentPlanner)”) is the start of the code flow
>             that triggers the
>             migration. The point here is that, if an error occurs
>             during the migration,
>             the VM will be stopped; For some reason this is executed
>             only for HA VMs.
>
>             Going a little bit further, we find the point that is
>             triggering the error
>             in the method
>             “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO,
>             long, DeployDestination)” at line 1941. This line will
>             call the method
>             “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>             Event,
>             Long, ItWorkVO, Step)” to change the VM state to
>             migrating/prepare
>             migration/or something like that (I am guessing here), but
>             for some reason,
>             the state of the VM is not the one expected and this is
>             causing a
>             “ConcurrentOperationException” to be thrown. The problem
>             is that the
>             “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>             Event,
>             Long, ItWorkVO, Step)” does not seem to provide much
>             useful log entries.
>
>             Just to enable us to discuss the problem a little further,
>             could you tell
>             us a little bit about your environment setting? How many
>             ACS MSs do you
>             have? What is the VM workload being migrated (I mean, the
>             number of VMs and
>             computing resources)? Do the other servers have enough
>             computing resources
>             to support these VMs? Could you also check the XenServer
>             logs, and check
>             what happened on both, source and destination hosts when
>             the migration for
>             VM “i-638-1735-VM” was triggered?
>
>             Do you have the knowledge/skills to use an IDE (e.g.
>             Eclipse) to connect to
>             a running JVM and debug a code? Because for me it will not
>             be possible to
>             go further without looking at the variables in a running
>             environment.
>
>
>             On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
>             francois.scheurer@everyware.ch> wrote:
>
>                 Dear Rafael
>
>
>                 Sorry for the late answer (was sick a couple of days).
>
>                 I confirm we are migrating inside the same xenserver
>                 pool (within a CS
>                 Cluster).
>
>
>                 Best Regards
>                 Francois
>
>
>
>                 On 26.01.2017 04:35, Rafael Weingärtner wrote:
>
>                     Found the code that is executed.
>                     I need you to confirm if you are migrating within
>                     a cluster or across
>                     clusters?
>
>                     On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
>                     rafaelweingartner@gmail.com> wrote:
>
>                     I do not have access to ACS code right now, but I
>                     suggest starting
>
>                         debugging here:
>
>                         2017-01-24 18:05:28,427 DEBUG
>                         [c.c.c.CapacityManagerImpl]
>
>                             (Work-Job-Executor-156:ctx-14b86f5e
>                             job-60216/job-191380 ctx-175d37df)
>                             VM state transitted from :Running to
>                             Stopping with event:
>                             StopRequestedvm's
>                             original host id: 224 new host id: 15 host
>                             id before state transition:
>                             15
>
>                             I would try to understand first why ACS
>                             requested the hypervisor to stop
>
>                         the VM. I mean the following, check the source
>                         code for conditions that
>                         would make ACS request the shutdown of VMs.
>                         BTW: are you migrating within a cluster or
>                         across clusters?
>
>                         I will be able to analyze this further only
>                         after work hours, so if you
>                         find anything, keep me posted.
>
>                         On Wed, Jan 25, 2017 at 11:45 AM, Francois
>                         Scheurer <
>                         francois.scheurer@everyware.ch> wrote:
>
>                         Hello Rafael
>
>
>                             I think the important log lines are these:
>
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,407 INFO
>                             [c.c.v.VirtualMachineManagerImpl]
>                             (Work-Job-Executor-156:ctx-14b86f5e
>                             job-60216/job-191380 ctx-175d37df)
>                             Migration cancelled because state has
>                             changed: VM[User|i-638-1736-VM]
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,407 DEBUG
>                             [c.c.v.VirtualMachineManagerImpl]
>                             (Work-Job-Executor-156:ctx-14b86f5e
>                             job-60216/job-191380 ctx-175d37df) Unable
>                             to migrate VM due to:
>                             Migration
>                             cancelled because state has changed:
>                             VM[User|i-638-1736-VM]
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,427 DEBUG
>                             [c.c.c.CapacityManagerImpl]
>                             (Work-Job-Executor-156:ctx-14b86f5e
>                             job-60216/job-191380 ctx-175d37df) VM
>                             state transitted from :Running to
>                             Stopping with event: StopRequestedvm's
>                             original host id: 224 new host
>                             id:
>                             15 host id before state transition: 15
>
>                             The really disturbing thing is that CS
>                             stops the VM again if we start it
>                             manually, apparently because the
>                             migrating job is still running.
>
>                             Note that the VM HA Flag is enabled.
>
>
>
>                             Thank you.
>
>
>                             Best Regards.
>
>                             Francois
>
>
>
>
>
>
>                             more details:
>
>
>
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,148 DEBUG
>                             [o.a.c.e.o.VolumeOrchestrator]
>                             (Work-Job-Executor-156:ctx-14b86f5e
>                             job-60216/job-191380 ctx-175d37df)
>                             Preparing 3 volumes for
>                             VM[User|i-638-1736-VM]
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,155 DEBUG
>                             [c.c.v.VmWorkJobDispatcher]
>                             (Work-Job-Executor-122:ctx-a8bacc64
>                             job-14953/job-191351) Done with run of VM
>                             work job:
>                             com.cloud.vm.VmWorkMigrateAway for VM
>                             1411, job origin: 14953
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,155 DEBUG
>                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>                             (Work-Job-Executor-122:ctx-a8bacc64
>                             job-14953/job-191351) Done executing
>                             com.cloud.vm.VmWorkMigrateAway for
>                             job-191351
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,160 INFO
>                             [o.a.c.f.j.i.AsyncJobMonitor]
>                             (Work-Job-Executor-122:ctx-a8bacc64
>                             job-14953/job-191351) Remove job-191351
>                             from job monitoring
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,168 INFO
>                             [c.c.h.HighAvailabilityManagerImpl]
>                             (HA-Worker-2:ctx-f3bcf43c
>                             work-1989)
>                             Completed
>                             HAWork[1989-Migration-1411-Running-Migrating]
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,246 DEBUG
>                             [c.c.a.m.ClusteredAgentAttache]
>                             (Work-Job-Executor-156:ctx-14b86f5e
>                             job-60216/job-191380 ctx-175d37df) Seq
>                             19-8526158519542516375:
>                             Forwarding
>                             Seq 19-8526158519542516375:  { Cmd ,
>                             MgmtId: 345049103441, via:
>                             19(ewcstack-vh010-prod), Ver: v1, Flags:
>                             100111, [{
>                             "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>                             :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>                             ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>                             869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>                             (64-bit)","platformEmulator":"Other install
>                             media","bootArgs":"","enable
>                             HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>                             "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>                             ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>                             rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>                             age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
>                             hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>                             -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>                             stack.storage.to
>                             .VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>                             4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>                             apache.cloudstack.storage.to
>                             .PrimaryDataStoreTO":{"uuid":"b8751b89-00f
>                             6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>                             t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>                             ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>                             7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>                             ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3
>                             f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>                             :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>                             gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>                             XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>                             -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>                             storagePort":"0",
>                             "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>                             a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>                             ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>                             ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>                             ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
>                             poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>                             f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>                             ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>                             UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>                             DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2
>                             2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>                             ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>                             deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>                             path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>                             K","_details":{"managed":"false","storagePort":"0","stor
>                             ageHost":"localhos
>                             t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>                             tack.storage.to
>                             .VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>                             8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>                             apache.cloudstack.storage.to
>                             .PrimaryDataStoreTO":{"uuid"
>                             :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup","
>                             host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>                             13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>                             -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>
>                             e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>                             ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>                             ,"volumeId":86
>                             57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>                             provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>                             ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>                             20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>                             false","storagePort":"0","storageHost":"localhost","volumeSize":"42949
>                             6729600"}},{"data":{"org.apache.cloudstack.storage.to
>                             .Templa
>                             teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>                             id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>                             ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS
>                             4.5
>                             (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>                             ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>                             Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>                             d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>                             a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>                             gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>                             .8.8","dns2":"8.8
>                             .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>                             vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>                             abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}]
>                             } to
>                             345049098498
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,399 DEBUG [c.c.a.t.Request]
>                             (AgentManager-Handler-10:null) Seq
>                             19-8526158519542516375: Processing:
>                             {
>                             Ans: , MgmtId: 345049103441, via: 19, Ver:
>                             v1, Flags: 110,
>                             [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>                             true,"wait":0}}]
>                             }
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,399 DEBUG
>                             [c.c.a.m.AgentAttache]
>                             (AgentManager-Handler-10:null) Seq
>                             19-8526158519542516375: No more
>                             commands
>                             found
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,399 DEBUG [c.c.a.t.Request]
>                             (Work-Job-Executor-156:ctx-14b86f5e
>                             job-60216/job-191380 ctx-175d37df)
>                             Seq 19-8526158519542516375: Received:  {
>                             Ans: , MgmtId: 345049103441,
>                             via:
>                             19, Ver: v1, Flags: 110, {
>                             PrepareForMigrationAnswer } }
>
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,407 INFO
>                             [c.c.v.VirtualMachineManagerImpl]
>                             (Work-Job-Executor-156:ctx-14b86f5e
>                             job-60216/job-191380 ctx-175d37df)
>                             Migration cancelled because state has
>                             changed: VM[User|i-638-1736-VM]
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,407 DEBUG
>                             [c.c.v.VirtualMachineManagerImpl]
>                             (Work-Job-Executor-156:ctx-14b86f5e
>                             job-60216/job-191380 ctx-175d37df) Unable
>                             to migrate VM due to:
>                             Migration
>                             cancelled because state has changed:
>                             VM[User|i-638-1736-VM]
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,427 DEBUG
>                             [c.c.c.CapacityManagerImpl]
>                             (Work-Job-Executor-156:ctx-14b86f5e
>                             job-60216/job-191380 ctx-175d37df) VM
>                             state transitted from :Running to
>                             Stopping with event: StopRequestedvm's
>                             original host id: 224 new host
>                             id:
>                             15 host id before state transition: 15
>
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:28,438 DEBUG
>                             [c.c.a.m.ClusteredAgentAttache]
>                             (Work-Job-Executor-156:ctx-14b86f5e
>                             job-60216/job-191380 ctx-175d37df) Seq
>                             15-6889944479923437643:
>                             Forwarding
>                             Seq 15-6889944479923437643:  { Cmd ,
>                             MgmtId: 345049103441, via:
>                             15(ewcstack-vh011-prod), Ver: v1, Flags:
>                             100011,
>                             [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>                             eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>                             638-1736-VM","wait":0}}]
>                             } to 345049098498
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:30,335 DEBUG
>                             [c.c.a.m.AgentManagerImpl]
>                             (AgentManager-Handler-1:null) SeqA
>                             254-800498:
>                             Processing Seq 254-800498:  { Cmd ,
>                             MgmtId: -1, via: 254, Ver: v1,
>                             Flags:
>                             11,
>                             [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>                             proxyVmId":1530,"_loadInfo":"{\n
>                             \"connections\": []\n}","wait":0}}] }
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:30,340 DEBUG
>                             [c.c.a.m.AgentManagerImpl]
>                             (AgentManager-Handler-1:null) SeqA
>                             254-800498:
>                             Sending Seq 254-800498:  { Ans: , MgmtId:
>                             345049103441, via: 254, Ver:
>                             v1,
>                             Flags: 100010, [{"com.cloud.agent.api.AgentCo
>                             ntrolAnswer":{"result":true,"wait":0}}] }
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,447 DEBUG [c.c.a.t.Request]
>                             (AgentManager-Handler-11:null) Seq
>                             15-6889944479923437632: Processing:
>                             {
>                             Ans: , MgmtId: 345049103441, via: 15, Ver:
>                             v1, Flags: 10,
>                             [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>                             e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>                             ":true,"details":"Stop
>                             VM i-638-1735-VM Succeed","wait":0}}] }
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,447 DEBUG [c.c.a.t.Request]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae)
>                             Seq 15-6889944479923437632: Received:  {
>                             Ans: , MgmtId: 345049103441,
>                             via:
>                             15, Ver: v1, Flags: 10, { StopAnswer } }
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,470 DEBUG
>                             [c.c.v.VirtualMachineManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae)
>                             VM[User|i-638-1735-VM] is stopped on
>                             the
>                             host. Proceeding to release resource held.
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,475 DEBUG
>                             [c.c.n.NetworkModelImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae) Service
>                             SecurityGroup is not
>                             supported
>                             in the network id=519
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,478 DEBUG
>                             [o.a.c.e.o.NetworkOrchestrator]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae)
>                             Changing active number of nics for
>                             network id=519 on -1
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,488 DEBUG
>                             [o.a.c.e.o.NetworkOrchestrator]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae) Asking
>                             VpcVirtualRouter to release
>                             NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>                             92.168.1.11-null
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,488 DEBUG
>                             [c.c.v.VirtualMachineManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae)
>                             Successfully released network
>                             resources
>                             for the vm VM[User|i-638-1735-VM]
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,488 DEBUG
>                             [c.c.v.VirtualMachineManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae)
>                             Successfully released storage
>                             resources
>                             for the vm VM[User|i-638-1735-VM]
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,496 DEBUG
>                             [c.c.c.CapacityManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae) VM
>                             state transitted from :Stopping to
>                             Stopped with event: OperationSucceededvm's
>                             original host id: 15 new host
>                             id: null host id before state transition: 15
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,505 DEBUG
>                             [c.c.c.CapacityManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae) Hosts's
>                             actual total CPU: 82976 and
>                             CPU
>                             after applying overprovisioning: 331904
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,505 DEBUG
>                             [c.c.c.CapacityManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae) Hosts's
>                             actual total RAM:
>                             261961927808
>                             and RAM after applying overprovisioning:
>                             261961924608
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,505 DEBUG
>                             [c.c.c.CapacityManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae) release
>                             cpu from host: 15, old used:
>                             71300,reserved: 2100, actual total: 82976,
>                             total with overprovisioning:
>                             331904; new used: 67100,reserved:6300;
>                             movedfromreserved:
>                             false,moveToReserveredtrue
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,505 DEBUG
>                             [c.c.c.CapacityManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae) release
>                             mem from host: 15, old used:
>                             98784247808,reserved: 1073741824, total:
>                             261961924608; new used:
>                             94489280512,reserved:5368709120;
>                             movedfromreserved:
>                             false,moveToReserveredtrue
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,538 DEBUG
>                             [c.c.n.NetworkModelImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae) Service
>                             SecurityGroup is not
>                             supported
>                             in the network id=519
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,557 DEBUG
>                             [c.c.n.NetworkModelImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae) Service
>                             SecurityGroup is not
>                             supported
>                             in the network id=519
>
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,588 ERROR
>                             [c.c.v.VmWorkJobHandlerProxy]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae)
>                             Invocation exception, caused by:
>                             com.cloud.utils.exception.CloudRuntimeException:
>                             Unable to migrate
>                             VM[User|i-638-1735-VM]
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,589 INFO
>                             [c.c.v.VmWorkJobHandlerProxy]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379 ctx-bbabfeae) Rethrow
>                             exception
>                             com.cloud.utils.exception.CloudRuntimeException:
>                             Unable to migrate
>                             VM[User|i-638-1735-VM]
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,589 DEBUG
>                             [c.c.v.VmWorkJobDispatcher]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379) Done with run of VM
>                             work job:
>                             com.cloud.vm.VmWorkMigrateAway for VM
>                             1735, job origin: 13733
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,589 ERROR
>                             [c.c.v.VmWorkJobDispatcher]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379) Unable to complete
>                             AsyncJobVO {id:191379, userId:
>                             1,
>                             accountId: 1, instanceType: null,
>                             instanceId: null, cmd:
>                             com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>                             rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>                             1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>                             0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>                             VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>                             t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>                             FuYWdlckltcGwAAAAAAAAA4A,
>                             cmdVersion: 0, status: IN_PROGRESS,
>                             processStatus: 0, resultCode: 0,
>                             result: null, initMsid: 345049098498,
>                             completeMsid: null, lastUpdated:
>                             null, lastPolled: null, created: Tue Jan
>                             24 18:05:00 CET 2017}, job
>                             origin:13733
>                             ewcstack-man02-prod:
>                             com.cloud.utils.exception.CloudRuntimeException:
>                             Unable to migrate VM[User|i-638-1735-VM]
>                             ewcstack-man02-prod:    at
>                             com.cloud.vm.VirtualMachineMan
>                             agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
>                             ewcstack-man02-prod:    at
>                             com.cloud.vm.VirtualMachineMan
>                             agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
>                             ewcstack-man02-prod:    at
>                             sun.reflect.GeneratedMethodAcc
>                             essor402.invoke(Unknown
>                             Source)
>                             ewcstack-man02-prod:    at
>                             sun.reflect.DelegatingMethodAc
>                             cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>                             ewcstack-man02-prod:    at
>                             java.lang.reflect.Method.invok
>                             e(Method.java:606)
>                             ewcstack-man02-prod:    at
>                             com.cloud.vm.VmWorkJobHandlerP
>                             roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>                             ewcstack-man02-prod:    at
>                             com.cloud.vm.VirtualMachineMan
>                             agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>                             ewcstack-man02-prod:    at
>                             com.cloud.vm.VmWorkJobDispatch
>                             er.runJob(VmWorkJobDispatcher.java:103)
>                             ewcstack-man02-prod:    at
>                             org.apache.cloudstack.framewor
>                             k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>                             erImpl.java:537)
>                             ewcstack-man02-prod:    at
>                             org.apache.cloudstack.managed.
>                             context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>                             ewcstack-man02-prod:    at
>                             org.apache.cloudstack.managed.
>                             context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>                             ext.java:56)
>                             ewcstack-man02-prod:    at
>                             org.apache.cloudstack.managed.
>                             context.impl.DefaultManagedContext.callWithContext(DefaultMa
>                             nagedContext.java:103)
>                             ewcstack-man02-prod:    at
>                             org.apache.cloudstack.managed.
>                             context.impl.DefaultManagedContext.runWithContext(DefaultMan
>                             agedContext.java:53)
>                             ewcstack-man02-prod:    at
>                             org.apache.cloudstack.managed.
>                             context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>                             ewcstack-man02-prod:    at
>                             org.apache.cloudstack.framewor
>                             k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>                             ewcstack-man02-prod:    at
>                             java.util.concurrent.Executors
>                             $RunnableAdapter.call(Executors.java:471)
>                             ewcstack-man02-prod:    at
>                             java.util.concurrent.FutureTas
>                             k.run(FutureTask.java:262)
>                             ewcstack-man02-prod:    at
>                             java.util.concurrent.ThreadPoo
>                             lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>                             ewcstack-man02-prod:    at
>                             java.util.concurrent.ThreadPoo
>                             lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>                             ewcstack-man02-prod:    at
>                             java.lang.Thread.run(Thread.java:745)
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,590 DEBUG
>                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379) Complete async
>                             job-191379, jobStatus: FAILED,
>                             resultCode: 0, result:
>                             rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>                             MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>                             ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>                             ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>                             ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>                             F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>                             FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>                             JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>                             RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>                             ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>                             50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>                             xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>                             NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>                             lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>                             5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>                             1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>                             90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>                             ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>                             hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>                             wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>                             hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>                             ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>                             94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>                             AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>                             cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>                             AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>                             YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>                             ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>                             YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>                             dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>                             dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>                             Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>                             dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>                             cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>                             TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>                             ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>                             LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>                             dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>                             ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>                             JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>                             AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>                             ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>                             ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>                             YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>                             ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>                             AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>                             JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>                             jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>                             bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>                             dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>                             x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,592 DEBUG
>                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379) Publish async
>                             job-191379 complete on message bus
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,592 DEBUG
>                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379) Wake up jobs related
>                             to job-191379
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,592 DEBUG
>                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379) Update db status for
>                             job-191379
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,594 DEBUG
>                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379) Wake up jobs joined
>                             with job-191379 and disjoin
>                             all
>                             subjobs created from job- 191379
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,604 DEBUG
>                             [o.a.c.f.j.i.AsyncJobManagerImpl]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379) Done executing
>                             com.cloud.vm.VmWorkMigrateAway for
>                             job-191379
>                             ewcstack-man02-prod: 2017-01-24
>                             18:05:32,608 INFO
>                             [o.a.c.f.j.i.AsyncJobMonitor]
>                             (Work-Job-Executor-115:ctx-0d5db528
>                             job-13733/job-191379) Remove job-191379
>                             from job monitoring
>
>
>
>                             On 25.01.2017 15:53, Rafael Weingärtner wrote:
>
>                             Did you check the ACS log files looking
>                             for some exceptions when the
>
>                                 migrations for this specific VM started?
>
>                                 On Wed, Jan 25, 2017 at 9:49 AM,
>                                 Francois Scheurer <
>                                 francois.scheurer@everyware.ch>
>                                 wrote:
>
>                                 Dear CS contributors
>
>                                     We use CS 4.5.1 on a 3 Clusters
>                                     with XenServer 6.5. and shared primary
>                                     storage.
>
>                                     When we put a host in maintenance,
>                                     cs evacuates all VM's on other
>                                     hosts.
>
>                                     But it happens regularly that some
>                                     VM's are stopped instead of live
>                                     migrated.
>
>                                     If we restart that stopped VM's,
>                                     CS will then stop them again later.
>
>                                     We need to start them several
>                                     times until they stays up.
>
>                                     Is it a known issue? Is it fixed
>                                     on CS 4.9.2 ?
>
>
>                                     Migrating manually all VM's is
>                                     working but impracticable because too
>                                     much
>                                     time consuming.
>
>
>                                     Many thanks in advance for your help.
>
>
>
>                                     Best Regards
>                                     Francois
>
>
>
>
>
>
>                                     --
>
>
>                                     EveryWare AG
>                                     François Scheurer
>                                     Senior Systems Engineer
>                                     Zurlindenstrasse 52a
>                                     CH-8003 Zürich
>
>                                     tel: +41 44 466 60 00
>                                     fax: +41 44 466 60 10
>                                     mail: francois.scheurer@everyware.ch
>                                     web: http://www.everyware.ch
>
>
>
>                                 --
>
>
>                             EveryWare AG
>                             François Scheurer
>                             Senior Systems Engineer
>                             Zurlindenstrasse 52a
>                             CH-8003 Zürich
>
>                             tel: +41 44 466 60 00
>                             fax: +41 44 466 60 10
>                             mail: francois.scheurer@everyware.ch
>                             web: http://www.everyware.ch
>
>
>
>                         --
>                         Rafael Weingärtner
>
>
>
>                 --
>
>
>                 EveryWare AG
>                 François Scheurer
>                 Senior Systems Engineer
>                 Zurlindenstrasse 52a
>                 CH-8003 Zürich
>
>                 tel: +41 44 466 60 00
>                 fax: +41 44 466 60 10
>                 mail: francois.scheurer@everyware.ch
>                 web: http://www.everyware.ch
>
>
>
>
>         -- 
>
>
>         EveryWare AG
>         François Scheurer
>         Senior Systems Engineer
>         Zurlindenstrasse 52a
>         CH-8003 Zürich
>
>         tel: +41 44 466 60 00
>         fax: +41 44 466 60 10
>         mail: francois.scheurer@everyware.ch
>         web: http://www.everyware.ch
>
>
>
>
>     -- 
>     Rafael Weingärtner
>
>
>
>
> -- 
> Rafael Weingärtner

-- 


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheurer@everyware.ch
web: http://www.everyware.ch


Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Rafael Weingärtner <ra...@gmail.com>.
Ok, I dug a little further into the code.

If we put a host in maintenance, it may unfortunately happen that a timeout
> occurs and CS aborts the migration. As far as we can interpret the logs, it
> seems (we speculate) that CS detects that the VM is then still running on
> the original host, which is in maintenance, and decides somehow to stop it
> because of the "host not up" status.
>

From what I have seen in the code, this may happen (as you speculated). If a
timeout or some other event interrupts the migration, the VM will be stopped.
This happens in the method
“com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
long, DeploymentPlanner)” from line 2396 onwards. This was implemented on
purpose: if the migration does not happen, the VM is stopped on the source
host. Probably it was implemented this way following that philosophy of
gradient descent from artificial intelligence:

“pick a random point in the field, update the values according to the slope
> and hopefully after successful iterations you converge to a minimum
> point.” (the problem is that sometimes the function diverges).
>

I mean, probably this was done this way because the HA was supposed to see
that a VM with HA is stopped, and should try to start it again somewhere
else. The log you posted does not provide any other details to debug this
further because the log is after an MS restart, right?
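
To make that flow a bit more concrete, here is a minimal, self-contained
Java sketch of the behavior I am describing (this is NOT the CloudStack
source; the class, the queue standing in for op_ha_work and all names in it
are only illustrative assumptions):

import java.util.ArrayDeque;
import java.util.Queue;

public class MigrateAwaySketch {

    static class Vm {
        final String name;
        final boolean haEnabled;
        String state = "Running";
        Vm(String name, boolean haEnabled) { this.name = name; this.haEnabled = haEnabled; }
    }

    // Stand-in for the op_ha_work table: pending HA work items.
    static final Queue<String> haWorkQueue = new ArrayDeque<>();

    static void migrateAway(Vm vm, String destHost) {
        try {
            migrate(vm, destHost); // may fail on a timeout or an unexpected state change
        } catch (IllegalStateException e) {
            System.out.println("Unable to migrate " + vm.name + ": " + e.getMessage());
            if (vm.haEnabled) {
                // Mirrors the behavior seen in the logs: the HA-enabled VM is
                // stopped on the source host and an HA work item is left behind,
                // on the assumption that the HA worker will start it elsewhere.
                vm.state = "Stopped";
                haWorkQueue.add("HAWork[Migration-" + vm.name + "-Stopped]");
            } else {
                throw e; // a non-HA VM would just surface the error
            }
        }
    }

    static void migrate(Vm vm, String destHost) {
        // Simulates "Migration cancelled because state has changed".
        throw new IllegalStateException("state has changed for " + vm.name);
    }

    public static void main(String[] args) {
        Vm vm = new Vm("i-638-1735-VM", true);
        migrateAway(vm, "ewcstack-vh010-prod");
        System.out.println(vm.name + " is now " + vm.state + "; pending: " + haWorkQueue);
    }
}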


After restarting the MSs I guess that a similar issue may happen: the hosts
> are disconnected during the early stage and then reconnect. Maybe CS then
> again stops the VMs because of the "host not up" status?
> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) The
> last host of this VM is not UP or is not enabled, host status is:
> Disconnected, host resource state is: Enabled
>

Here is something interesting that you have found. I checked the code, but
without a live debug session it is pretty hard for me to get the right flow
of execution. By looking at the code and at the logs you provided, your
deduction is right. After ACS boots up, the check for hosts that are
shutdown/deactivated starts, and the method
“com.cloud.agent.manager.AgentManagerImpl.executeUserRequest(long, Event)”
is executed. The problem is that all of the hosts are considered
disconnected (ACS may not have finished connecting to all of them yet), and
then a whole HA process is kicked off. Since there is no host with status
“UP”, you see “No suitable hosts found under this Cluster: 25”, and then the
whole HA flow, which shuts down the VM, is kicked off.

In theory, the HA worker should see that a VM with HA is down, and should
try to start it. I do not understand why this is not working. If you
decide to debug the running ACS, I may be able to help you, just keep me
posted.
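
To illustrate why the timing right after the restart matters, here is another
small, runnable sketch (again, not the real CloudStack code; host names,
statuses and the selection logic are illustrative assumptions): the pending
HA/migrate-away work runs while every host still reports Disconnected, so no
suitable destination is found and the VM stays stopped, whereas the very same
work item would succeed a little later once the agents have reconnected:

import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class PostRestartHaSketch {

    enum HostStatus { UP, DISCONNECTED }

    static class Host {
        final String name;
        final HostStatus status;
        Host(String name, HostStatus status) { this.name = name; this.status = status; }
    }

    // Stand-in for the deployment planner: only hosts with status UP qualify.
    static Optional<Host> planDeployment(List<Host> hosts) {
        return hosts.stream().filter(h -> h.status == HostStatus.UP).findFirst();
    }

    static String processHaWork(String vmName, List<Host> hosts) {
        Optional<Host> dest = planDeployment(hosts);
        if (!dest.isPresent()) {
            // The problematic branch right after the MS restart:
            // "No suitable hosts found under this Cluster".
            return vmName + " left Stopped (no suitable host found)";
        }
        return vmName + " started on " + dest.get().name;
    }

    public static void main(String[] args) {
        // Just after the MS restart: the agents have not reconnected yet.
        List<Host> early = Arrays.asList(new Host("ewcstack-vh003-prod", HostStatus.DISCONNECTED));
        System.out.println(processHaWork("i-189-1178-VM", early));

        // A little later, once the host is back to UP, the same work succeeds.
        List<Host> later = Arrays.asList(new Host("ewcstack-vh003-prod", HostStatus.UP));
        System.out.println(processHaWork("i-189-1178-VM", later));
    }
}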


On Tue, Jan 31, 2017 at 10:26 AM, Rafael Weingärtner <
rafaelweingartner@gmail.com> wrote:

> I will be able to re-check some code and answer you with new thoughts on
> this problem tonight (the log files you provided today have some very
> insightful information regarding this problem). I believe this is a bug in
> HA, a very tricky one that has not been reported yet (at least I do not
> remember seeing anything similar). And thanks to randomness/destiny now you
> have found/uncovered a way to reproduce the problem (bug?!).
>
> I do not suggest changing any data at all directly in the database
> (unless you know the ACS code well and are 100% sure of what you are doing).
> This can cause even bigger and more complicated problems. If you remove the
> HA feature from these VMs, this problem will stop. However, if you can/are
> willing to maintain the HA until we fully uncover the problem, map and
> report it, this would be great.
> BTW: why are you restarting MSs?
> What are the configurations of your MSs?
>
> Thanks for your time and for sharing information with our community.
>
>
>
> On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer <
> francois.scheurer@everyware.ch> wrote:
>
>> Dear Rafael
>>
>>
>> Many thanks for your deep analysis.
>> Nice humor with the planets alignments BTW ;)
>>
>> We also suspected the HA Flag to be related to this issue, as we never got
>> this issue in the past, until we enabled HA on all VMs and Offerings.
>>
>> Here is also the workers list:
>>
>> mysql> select * from op_ha_work;
>> +------+-------------+-----------+--------------------+-----
>> -----+----------------+---------+---------------------+-----
>> --+---------------------+-------+-------------+---------+
>> | id   | instance_id | type      | vm_type            | state    |
>> mgmt_server_id | host_id | created             | tried | taken
>>  | step  | time_to_try | updated |
>> +------+-------------+-----------+--------------------+-----
>> -----+----------------+---------+---------------------+-----
>> --+---------------------+-------+-------------+---------+
>> |  105 |         343 | HA        | User               | Running  |
>>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>> Error |  1394056013 |       3 |
>> |  109 |         407 | HA        | ConsoleProxy       | Running  |
>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
>> Error |  1394056013 |       3 |
>> |  113 |         407 | HA        | ConsoleProxy       | Running  |
>>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>> |  117 |         407 | HA        | ConsoleProxy       | Running  |  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>> |  121 |         343 | HA        | User               | Running  |  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>> |  125 |         405 | HA        | SecondaryStorageVm | Running  |  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
>> |  129 |         393 | HA        | User               | Running  |  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error |  1394056014 |      30 |
>> |  133 |         405 | HA        | SecondaryStorageVm | Running  |  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error |  1394056014 |       3 |
>> |  137 |         402 | HA        | ConsoleProxy       | Running  |  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 | Error |  1394056014 |       3 |
>> |  345 |         402 | HA        | ConsoleProxy       | Starting |  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 | Error |  1402527506 |     770 |
>> |  349 |         687 | HA        | SecondaryStorageVm | Starting |  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 | Error |  1402527506 |       2 |
>> | 2057 |        1178 | Migration | User               | Running  |  345049098498 |      33 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 | Done  |  1451024774 |      95 |
>> | 2058 |        1736 | Migration | User               | Running  |  345049098498 |     224 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 | Done  |  1451024774 |      45 |
>> | 2059 |        1240 | Migration | User               | Running  |  345049098498 |       7 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 | Done  |  1451024774 |      61 |
>> | 2060 |        1178 | Migration | User               | Running  |  345049098122 |      33 | 2017-01-31 07:56:08 | 1 | 2017-01-31 08:06:37 | Done  |  1451025374 |      95 |
>> | 2061 |        1178 | Migration | User               | Stopped  |           NULL |      33 | 2017-01-31 07:56:09 | 5 | NULL                | Done  |  1451027900 |     109 |
>> | 2062 |        1690 | Migration | User               | Running  |  345049098498 |     224 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 | Done  |  1451024774 |     109 |
>> | 2063 |        1177 | Migration | User               | Running  |  345049098498 |      29 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 | Done  |  1451024774 |      82 |
>> | 2064 |        1690 | Migration | User               | Running  |  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 | Done  |  1451024900 |     109 |
>> | 2065 |        1240 | Migration | User               | Running  |  345049098122 |       7 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 | Done  |  1451024900 |      61 |
>> | 2066 |        1736 | Migration | User               | Running  |  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 | Done  |  1451024900 |      45 |
>> | 2067 |        1240 | Migration | User               | Stopped  |           NULL |       7 | 2017-01-31 07:56:09 | 5 | NULL                | Done  |  1451028003 |      80 |
>> | 2068 |        1178 | Migration | User               | Running  |  345049098122 |      33 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 | Done  |  1451024781 |      95 |
>> | 2069 |        1736 | Migration | User               | Running  |  345049098122 |     224 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 | Done  |  1451024781 |      45 |
>> | 2070 |        1240 | Migration | User               | Running  |  345049098122 |       7 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 | Done  |  1451024781 |      61 |
>> | 2071 |        1177 | Migration | User               | Running  |  345049098122 |      29 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 | Done  |  1451024781 |      82 |
>> | 2072 |        1177 | Migration | User               | Stopped  |           NULL |      29 | 2017-01-31 07:56:16 | 5 | NULL                | Done  |  1451027973 |     101 |
>> | 2073 |        1177 | Migration | User               | Running  |  345049098122 |      29 | 2017-01-31 07:56:16 | 1 | 2017-01-31 07:58:22 | Done  |  1451024904 |      82 |
>> | 2074 |        1690 | Migration | User               | Stopped  |           NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                | Done  |  1451027464 |     123 |
>> | 2075 |        1736 | Migration | User               | Stopped  |           NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                | Done  |  1451027477 |      59 |
>> +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+---+---------------------+-------+-------------+---------+
>>
>> If we put a host in maintenance, it may unfortunately happen that a
>> timeout occurs and CS aborts the migration.
>> As far as we can interpret the logs, it seems (we speculate) that CS
>> detects that the VM is then still running on the original host, which is in
>> maintenance, and decides somehow to stop it because of the "host not up"
>> status.
>> We then need to start the VM several times until CS gives up stopping it.
>>
>> After restarting the MSs I guess that a similar issue may happen: the
>> hosts are disconnected during the early stage and then reconnect. Maybe CS
>> then stops the VMs again because of the "host not up" status?
>> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) The
>> last host of this VM is not UP or is not enabled, host status is:
>> Disconnected, host resource state is: Enabled
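>>
>> (As a side note, the host status that the planner logs here can also be
>> read directly from the database; a minimal sketch, assuming the standard
>> MySQL "cloud" schema, with host id 29 taken from the avoid set in these
>> logs:)
>>
>>   -- status/resource_state that CloudStack currently records for the host
>>   SELECT id, name, status, resource_state FROM cloud.host WHERE id = 29;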
>>
>> But the problem after the MSs restart definitely only impacts VMs listed
>> in the op_ha_work table.
>>
>>
>> To your questions:
>>         We have 3 ACS MSs managing several Zones. Each zone has a single
>> pod and cluster.
>>         The Cluster of i-638-1735-VM has this summary:
>>          Total number of VMs: 121
>>          Total number of hypervisors: 4
>>          Total allocated RAM: 522.0 / 972 GB (53.7 %)
>>          Total allocated cores: 211
>>          Total allocated storage: 11285 GB
>>         Used memory on the hosts currently varies from 28% to 77%, so there
>> is enough RAM.
>>         I did not check the XS logs yet. The same problem arose again this
>> morning (shortly after a restart of all MSs), so I will look in the XS
>> logs now.
>>
>>         I have a fair amount of experience with Eclipse and Java; I will
>> ask my boss if we can go this way.
>>
>> The main issue for now is that restarting the MSs reproduces the problem,
>> which is quite annoying for the impacted VMs.
>> My idea is to stop the MSs, delete all HA/Migration entries in op_ha_work
>> and then start the MSs again.
>> Is that a viable option? (Or should I delete only the Migration entries?
>> Or never touch this table at all?)
>> We will also very probably remove the HA flag on all VMs if it does not
>> work reliably.
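>>
>> (To make the question concrete, the inspection and cleanup I have in mind
>> would look roughly like this; only a sketch, assuming the standard MySQL
>> "cloud" schema, to be run with all MSs stopped and after a DB backup:)
>>
>>   -- list pending HA/Migration work items and the VMs they reference
>>   SELECT w.id, w.instance_id, i.instance_name, w.type, w.state, w.step
>>   FROM cloud.op_ha_work w
>>   JOIN cloud.vm_instance i ON i.id = w.instance_id
>>   WHERE w.step != 'Done';
>>
>>   -- the cleanup in question, removing only the Migration-type entries:
>>   -- DELETE FROM cloud.op_ha_work WHERE type = 'Migration' AND step != 'Done';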
>>
>> Best Regards
>> Francois Scheurer
>>
>>
>>
>> Here are new logs regarding the same issue of today.
>> I tried to get the important parts related to the stop of i-189-1178-VM.
>>
>> regexp: (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRe
>> questedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae733|work-2049)
>>
>> 2017-01-30 08:44:04,440 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Executing
>> AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType: null,
>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>> F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm
>> 1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTA
>> ALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAA
>> AAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resu
>> ltCode: 0, result: null, initMsid: 345049103441, completeMsid: null,
>> lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 2017}
>> 2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Run VM work job:
>> com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
>>
>> 2017-01-30 08:44:05,217 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Deploy avoids pods: [], clusters: [], hosts: [29]
>> 2017-01-30 08:44:05,220 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
>> (AgentConnectTaskPool-4:ctx-5b61f2e5) create ClusteredAgentAttache for
>> 232
>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> DeploymentPlanner allocation algorithm: com.cloud.deploy.UserDispersin
>> gPlanner@1734b2e2
>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
>> requested cpu: 4200, requested ram: 4294967296
>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Is
>> ROOT volume READY (pool already allocated)?: Yes
>> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> This VM has last host_id specified, trying to choose the same host: 25
>>
>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The
>> last host of this VM is not UP or is not enabled, host status is:
>> Disconnected, host resource state is: Enabled
>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Cannot choose the last host to deploy this VM
>> 2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Searching resources only under specified Cluster: 25
>>
>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>> prioritization: []
>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No
>> suitable hosts found
>> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No
>> suitable hosts found under this Cluster: 25
>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Could not find suitable Deployment Destination for this VM under any
>> clusters, returning.
>> 2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Searching resources only under specified Cluster: 25
>> 2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The
>> specified cluster is in avoid set, returning.
>> 2017-01-30 08:44:05,798 DEBUG [c.c.v.VirtualMachineManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
>> 2017-01-30 08:44:05,798 WARN [c.c.v.VirtualMachineManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Failed to deploy vm 1178 with original planner, sending HAPlanner
>>
>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Deploy avoids pods: [], clusters: [], hosts: [29]
>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
>> sPlanner@4fdf86ce
>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
>> requested cpu: 4200, requested ram: 4294967296
>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Is
>> ROOT volume READY (pool already allocated)?: Yes
>> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> This VM has last host_id specified, trying to choose the same host: 25
>>
>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The
>> last host of this VM is not UP or is not enabled, host status is:
>> Disconnected, host resource state is: Enabled
>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Cannot choose the last host to deploy this VM
>> 2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Searching resources only under specified Cluster: 25
>> 2017-01-30 08:44:06,039 DEBUG [c.c.d.SkipHeuresticsPlanner]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Deploying vm during HA process, so skipping disable threshold check
>>
>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>> FirstFitRoutingAllocator) Found 0 hosts for allocation after
>> prioritization: []
>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
>> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
>> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No
>> suitable hosts found
>> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No
>> suitable hosts found under this Cluster: 25
>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Publish async
>> job-192986 complete on message bus
>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
>> related to job-192986
>> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Update db status
>> for job-192986
>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) The
>> last host of this VM is not UP or is not enabled, host status is:
>> Disconnected, host resource state is: Enabled
>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>> Cannot choose the last host to deploy this VM
>> 2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner]
>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>> Searching resources only under specified Cluster: 35
>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Could not find suitable Deployment Destination for this VM under any
>> clusters, returning.
>> 2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Searching resources only under specified Cluster: 25
>> 2017-01-30 08:44:06,163 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>> SyncQueueItemVO {id:63629, queueId: 64423, contentType: AsyncJob,
>> contentId: 192991, lastProcessMsid: 345049098498, lastprocessNumber: 27,
>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
>> CET 2017}
>> 2017-01-30 08:44:06,169 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Add job-192992
>> into job monitoring
>> 2017-01-30 08:44:06,184 DEBUG [c.c.d.SkipHeuresticsPlanner]
>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>> Deploying vm during HA process, so skipping disable threshold check
>> 2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The
>> specified cluster is in avoid set, returning.
>> 2017-01-30 08:44:06,186 DEBUG [c.c.v.VirtualMachineManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
>> 2017-01-30 08:44:06,186 ERROR [c.c.v.VmWorkJobHandlerProxy]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Invocation exception, caused by: com.cloud.exception.InsufficientServerCapacityException:
>> Unable to find a server to migrate to.Scope=interface
>> com.cloud.org.Cluster; id=25
>> 2017-01-30 08:44:06,187 INFO  [c.c.v.VmWorkJobHandlerProxy]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
>> Rethrow exception com.cloud.exception.InsufficientServerCapacityException:
>> Unable to find a server to migrate to.Scope=interface
>> com.cloud.org.Cluster; id=25
>> 2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done with run of
>> VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
>> 2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Unable to
>> complete AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType:
>> null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>> AAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ, cmdVersion: 0,
>> status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>> initMsid: 345049103441, completeMsid: null, lastUpdated: null, lastPolled:
>> null, created: Mon Jan 30 08:44:03 CET 2017}, job origin:15393
>> com.cloud.exception.InsufficientServerCapacityException: Unable to find
>> a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
>>         at com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAwa
>> y(VirtualMachineManagerImpl.java:2390)
>>         at com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAwa
>> y(VirtualMachineManagerImpl.java:4520)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
>> ssorImpl.java:57)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>> thodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJob
>> HandlerProxy.java:107)
>>         at com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(Virtu
>> alMachineManagerImpl.java:4636)
>>         at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.
>> java:103)
>>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImp
>> l$5.runInContext(AsyncJobManagerImpl.java:537)
>>         at org.apache.cloudstack.managed.context.ManagedContextRunnable
>> $1.run(ManagedContextRunnable.java:49)
>>         at org.apache.cloudstack.managed.context.impl.DefaultManagedCon
>> text$1.call(DefaultManagedContext.java:56)
>>         at org.apache.cloudstack.managed.context.impl.DefaultManagedCon
>> text.callWithContext(DefaultManagedContext.java:103)
>>         at org.apache.cloudstack.managed.context.impl.DefaultManagedCon
>> text.runWithContext(DefaultManagedContext.java:53)
>>         at org.apache.cloudstack.managed.context.ManagedContextRunnable
>> .run(ManagedContextRunnable.java:46)
>>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImp
>> l$5.run(AsyncJobManagerImpl.java:494)
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executor
>> s.java:471)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool
>> Executor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoo
>> lExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:745)
>> 2017-01-30 08:44:06,188 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued job-192991
>> 2017-01-30 08:44:06,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
>> joined with job-192986 and disjoin all subjobs created from job- 192986
>> 2017-01-30 08:44:06,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Complete async
>> job-192989, jobStatus: FAILED, resultCode: 0, result:
>> rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZl
>> ckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vw
>> c0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRD
>> YXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5n
>> L0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xv
>> dWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vy
>> cm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2
>> YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4a
>> OxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNh
>> dXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQA
>> EkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFu
>> Zy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QA
>> EExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBz
>> ZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFj
>> ZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNr
>> VHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFy
>> aW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-
>> AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2Vy
>> SW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hl
>> c3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EA
>> fgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1w
>> bHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNx
>> AH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5y
>> ZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0
>> aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhq
>> YXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EA
>> fgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0
>> ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pv
>> YnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNs
>> b3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRj
>> aGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91
>> ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1w
>> bCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0
>> c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNv
>> bnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRl
>> eHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUu
>> Y2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdl
>> ZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2Fs
>> bHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5j
>> b250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxs
>> V2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250
>> ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2Vk
>> LmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-
>> ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5j
>> b25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRv
>> cnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50
>> LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0
>> ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdU
>> aHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0
>> AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29y
>> a2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAAL
>> VGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5t
>> b2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEu
>> dXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte
>> 9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEu
>> dXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4
>> cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1
>> ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-
>> AFMAAAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAF
>> dmFsdWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKq
>> oV-omaW2FAIAAHhwAA
>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>> Deploy avoids pods: [], clusters: [], hosts: [3]
>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
>> sPlanner@4fdf86ce
>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>> Trying to allocate a host and storage pools from dc:14, pod:17,cluster:17,
>> requested cpu: 4200, requested ram: 4294967296
>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) Is
>> ROOT volume READY (pool already allocated)?: Yes
>> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>> This VM has last host_id specified, trying to choose the same host: 213
>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Publish async
>> job-192989 complete on message bus
>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
>> related to job-192989
>> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Update db status
>> for job-192989
>> 2017-01-30 08:44:06,219 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
>> Checking resources in Cluster: 35 under Pod: 31
>> 2017-01-30 08:44:06,227 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>> FirstFitRoutingAllocator) Looking for hosts in dc: 23  pod:31  cluster:35
>> 2017-01-30 08:44:06,229 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Executing
>> AsyncJobVO {id:192992, userId: 1, accountId: 1, instanceType: null,
>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>> AAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw, cmdVersion: 0,
>> status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>> null, created: Mon Jan 30 08:44:05 CET 2017}
>> 2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher]
>> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Run VM work job:
>> com.cloud.vm.VmWorkMigrateAway for VM 1729, job origin: 14953
>> 2017-01-30 08:44:06,230 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>> SyncQueueItemVO {id:63630, queueId: 67180, contentType: AsyncJob,
>> contentId: 192990, lastProcessMsid: 345049098498, lastprocessNumber: 61,
>> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
>> CET 2017}
>> 2017-01-30 08:44:06,244 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) The
>> last host of this VM is not UP or is not enabled, host status is:
>> Disconnected, host resource state is: Enabled
>> 2017-01-30 08:44:06,262 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>> Cannot choose the last host to deploy this VM
>> 2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner]
>> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
>> Searching resources only under specified Cluster: 17
>> 2017-01-30 08:44:06,262 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Add job-192991
>> into job monitoring
>> 2017-01-30 08:44:06,246 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Done executing
>> com.cloud.vm.VmWorkMigrateAway for job-192986
>> 2017-01-30 08:44:06,244 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
>> joined with job-192989 and disjoin all subjobs created from job- 192989
>> 2017-01-30 08:44:06,269 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Remove
>> job-192986 from job monitoring
>> 2017-01-30 08:44:06,279 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done executing
>> com.cloud.vm.VmWorkMigrateAway for job-192989
>> 2017-01-30 08:44:06,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Executing
>> AsyncJobVO {id:192991, userId: 1, accountId: 1, instanceType: null,
>> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
>> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
>> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
>> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
>> AAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw, cmdVersion: 0,
>> status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
>> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
>> null, created: Mon Jan 30 08:44:05 CET 2017}
>> 2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher]
>> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Run VM work job:
>> com.cloud.vm.VmWorkMigrateAway for VM 1605, job origin: 14965
>> 2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl]
>> (AgentManager-Handler-13:null) Ping from 258
>> 2017-01-30 08:44:06,319 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>> FirstFitRoutingAllocator) List of hosts in ascending order of number of
>> VMs: [224, 19, 21, 15]
>> 2017-01-30 08:44:06,322 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
>> FirstFitRoutingAllocator) FirstFitAllocator has 0 hosts to check for
>> allocation: []
>> 2017-01-30 08:44:06,322 INFO  [c.c.a.m.AgentManagerImpl]
>> (AgentManager-Handler-13:null) PingMap for agent: 258 will not be updated
>> because agent is no longer in the PingMap
>> 2017-01-30 08:44:06,326 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Remove
>> job-192989 from job monitoring
>>
>> 2017-01-30 08:54:38,139 INFO [c.c.v.VirtualMachineManagerImpl]
>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>> Migration cancelled because state has changed: VM[User|i-189-1178-VM]
>> 2017-01-30 08:54:38,139 DEBUG [c.c.v.VirtualMachineManagerImpl]
>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>> Unable to migrate VM due to: Migration cancelled because state has changed:
>> VM[User|i-189-1178-VM]
>> 2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl]
>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) VM
>> state transitted from :Running to Stopping with event: StopRequestedvm's
>> original host id: 29 new host id: 33 host id before state transition: 33
>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.DirectAgentAttache]
>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Response
>> Received:
>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Processing:  {
>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10,
>> [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-
>> 1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,
>> "result":true,"details":"","wait":0}}] }
>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>> (RouterMonitor-1:ctx-2e80f405) Seq 19-2355664080091480198: Received:  {
>> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10, {
>> NetworkUsageAnswer } }
>> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl]
>> (RouterMonitor-1:ctx-2e80f405) Details from executing class
>> com.cloud.agent.api.NetworkUsageCommand:
>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>> Seq 33-1917970491306410035: Sending  { Cmd , MgmtId: 345049098498, via:
>> 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>> }
>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
>> Seq 33-1917970491306410035: Executing:  { Cmd , MgmtId: 345049098498, via:
>> 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>> }
>> 2017-01-30 08:54:38,178 DEBUG [c.c.a.m.DirectAgentAttache]
>> (DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035: Executing request
>> 2017-01-30 08:54:38,202 DEBUG [c.c.a.m.DirectAgentAttache]
>> (DirectAgent-326:ctx-4fb88aa7) Seq 21-1035546439318503505: Executing request
>>
>> 2017-01-30 09:05:55,628 INFO [c.c.h.HighAvailabilityManagerImpl]
>> (HA-Worker-4:ctx-36aae733 work-2049) Processing
>> HAWork[2049-Migration-1178-Stopped-Migrating]
>> 2017-01-30 09:05:55,633 WARN [o.a.c.f.j.AsyncJobExecutionContext]
>> (HA-Worker-4:ctx-36aae733 work-2049) Job is executed without a context,
>> setup psudo job for the executing thread
>> 2017-01-30 09:05:55,645 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045 execution on object
>> VmWorkJobQueue.1178
>>
>>
>>
>>
>> On 31.01.2017 00:42, Rafael Weingärtner wrote:
>>
>>> This seems to be one of those nasty bugs that are very hard to debug. It
>>> seems to be caused by a combination of different factors, such as the
>>> planets aligning ;)
>>>
>>> Class “VirtualMachineManagerImpl”, line 2395 (method
>>> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>>> long, DeploymentPlanner)”), is the start of the code flow that triggers the
>>> migration. The point here is that, if an error occurs during the migration,
>>> the VM will be stopped; for some reason this is done only for HA-enabled VMs.
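>>>
>>> (A quick way to see which VMs fall into this HA-only code path is to check
>>> the HA flag in the database; just a sketch, assuming the standard
>>> "cloud".vm_instance schema:)
>>>
>>>   -- running VMs with HA enabled, i.e. candidates for this stop-on-failed-migration behavior
>>>   SELECT id, instance_name, state, ha_enabled
>>>   FROM cloud.vm_instance
>>>   WHERE removed IS NULL AND ha_enabled = 1 AND state = 'Running';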
>>>
>>> Going a little bit further, we find the point that triggers the error in
>>> the method “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO,
>>> long, DeployDestination)” at line 1941. This line calls the method
>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
>>> Long, ItWorkVO, Step)” to change the VM state to migrating/prepare
>>> migration/something like that (I am guessing here), but for some reason
>>> the state of the VM is not the one expected, and this causes a
>>> “ConcurrentOperationException” to be thrown. The problem is that
>>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
>>> Long, ItWorkVO, Step)” does not seem to produce many useful log entries.
>>>
>>> Just to enable us to discuss the problem a little further, could you tell
>>> us a little bit about your environment? How many ACS MSs do you have? What
>>> is the VM workload being migrated (I mean, the number of VMs and their
>>> computing resources)? Do the other servers have enough computing resources
>>> to support these VMs? Could you also check the XenServer logs on both the
>>> source and destination hosts to see what happened when the migration for
>>> VM “i-638-1735-VM” was triggered?
>>>
>>> Do you have the knowledge/skills to use an IDE (e.g. Eclipse) to connect to
>>> a running JVM and debug the code? For me it will not be possible to go
>>> further without looking at the variables in a running environment.
>>>
>>>
>>> On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
>>> francois.scheurer@everyware.ch> wrote:
>>>
>>> Dear Rafael
>>>>
>>>>
>>>> Sorry for the late answer (I was sick for a couple of days).
>>>>
>>>> I confirm we are migrating inside the same xenserver pool (within a CS
>>>> Cluster).
>>>>
>>>>
>>>> Best Regards
>>>> Francois
>>>>
>>>>
>>>>
>>>> On 26.01.2017 04:35, Rafael Weingärtner wrote:
>>>>
>>>> Found the code that is executed.
>>>>> I need you to confirm if you are migrating within a cluster or across
>>>>> clusters?
>>>>>
>>>>> On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
>>>>> rafaelweingartner@gmail.com> wrote:
>>>>>
>>>>>> I do not have access to the ACS code right now, but I suggest starting
>>>>>> debugging here:
>>>>>>
>>>>>> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
>>>>>>
>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>> ctx-175d37df)
>>>>>>> VM state transitted from :Running to Stopping with event:
>>>>>>> StopRequestedvm's
>>>>>>> original host id: 224 new host id: 15 host id before state
>>>>>>> transition:
>>>>>>> 15
>>>>>>>
>>>>>> I would try to understand first why ACS requested the hypervisor to stop
>>>>>> the VM. I mean the following: check the source code for conditions that
>>>>>> would make ACS request the shutdown of VMs.
>>>>>> BTW: are you migrating within a cluster or across clusters?
>>>>>>
>>>>>> I will be able to analyze this further only after work hours, so if
>>>>>> you
>>>>>> find anything, keep me posted.
>>>>>>
>>>>>> On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
>>>>>> francois.scheurer@everyware.ch> wrote:
>>>>>>
>>>>>> Hello Rafael
>>>>>>
>>>>>>>
>>>>>>> I think the important log lines are these:
>>>>>>>
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>> 86f5e
>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because state
>>>>>>> has
>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>> 86f5e
>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>> Migration
>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running
>>>>>>> to
>>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new host
>>>>>>> id:
>>>>>>> 15 host id before state transition: 15
>>>>>>>
>>>>>>> The really disturbing thing is that CS stops the VM again if we start it
>>>>>>> manually, apparently because the migration job is still running.
>>>>>>>
>>>>>>> Note that the VM HA flag is enabled.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>>
>>>>>>> Best Regards.
>>>>>>>
>>>>>>> Francois
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> more details:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>>>>>>> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
>>>>>>> VM[User|i-638-1736-VM]
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>> job-14953/job-191351) Done with run of VM work job:
>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8b
>>>>>>> acc64
>>>>>>> job-14953/job-191351) Done executing com.cloud.vm.VmWorkMigrateAway
>>>>>>> for
>>>>>>> job-191351
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>>> job-14953/job-191351) Remove job-191351 from job monitoring
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>>>>>>> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c
>>>>>>> work-1989)
>>>>>>> Completed HAWork[1989-Migration-1411-Running-Migrating]
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375:
>>>>>>> Forwarding
>>>>>>> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
>>>>>>> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
>>>>>>> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>>>>> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>>>>> ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>>>>> 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>>>>> (64-bit)","platformEmulator":"Other install
>>>>>>> media","bootArgs":"","enable
>>>>>>> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>>>>> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>>>>> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>>>>> rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>>>>> age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
>>>>>>> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>>>>> -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>>>>> stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>>>>> 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8
>>>>>>> 751b89-00f
>>>>>>> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>>>>> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>>>>> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>>>>> 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>>>>> ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"
>>>>>>> path":"d3
>>>>>>> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>>>>> :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>>>>> gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>>>>> XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>>>>> -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>>>>> storagePort":"0",
>>>>>>> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>>>>> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>>>>> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>>>>> ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>>>>> ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
>>>>>>> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>>>>> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>>>>> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>>>>> UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>>>>> DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a
>>>>>>> 32b-ee3e2
>>>>>>> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>>>>> ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>>>>> deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>>>>> path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>>>>> K","_details":{"managed":"false","storagePort":"0","stor
>>>>>>> ageHost":"localhos
>>>>>>> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>>>>> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>>>>> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>>>>> :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"
>>>>>>> PreSetup","
>>>>>>> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>>>>> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>>>>> -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>>>>>
>>>>>>> e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>>>>> ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>>>>> ,"volumeId":86
>>>>>>> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>>>>> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>>>>> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>>>>> 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>>>>> false","storagePort":"0","storageHost":"localhost","volumeSi
>>>>>>> ze":"42949
>>>>>>> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>>>>> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>>>>> id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>>>>> ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS 4.5
>>>>>>> (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>>>>> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>>>>> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>>>>> d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>>>>> a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>>>>> gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>>>>> .8.8","dns2":"8.8
>>>>>>> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>>>>> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>>>>> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] } to
>>>>>>> 345049098498
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375:
>>>>>>> Processing:
>>>>>>> {
>>>>>>> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
>>>>>>> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>>>>> true,"wait":0}}]
>>>>>>> }
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>>> [c.c.a.m.AgentAttache]
>>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No more
>>>>>>> commands
>>>>>>> found
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>>> ctx-175d37df)
>>>>>>> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId: 345049103441,
>>>>>>> via:
>>>>>>> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>>>>>>>
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>> 86f5e
>>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because state
>>>>>>> has
>>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b
>>>>>>> 86f5e
>>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>>> Migration
>>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running
>>>>>>> to
>>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new host
>>>>>>> id:
>>>>>>> 15 host id before state transition: 15
>>>>>>>
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>>> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643:
>>>>>>> Forwarding
>>>>>>> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
>>>>>>> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>>>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>>>>> 638-1736-VM","wait":0}}]
>>>>>>> } to 345049098498
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>> 254-800498:
>>>>>>> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver: v1,
>>>>>>> Flags:
>>>>>>> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>>>>> proxyVmId":1530,"_loadInfo":"{\n
>>>>>>> \"connections\": []\n}","wait":0}}] }
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>>> 254-800498:
>>>>>>> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via: 254,
>>>>>>> Ver:
>>>>>>> v1,
>>>>>>> Flags: 100010, [{"com.cloud.agent.api.AgentCo
>>>>>>> ntrolAnswer":{"result":true,"wait":0}}] }
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>>>>>>> (AgentManager-Handler-11:null) Seq 15-6889944479923437632:
>>>>>>> Processing:
>>>>>>> {
>>>>>>> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
>>>>>>> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>>>>> e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>>>>> ":true,"details":"Stop
>>>>>>> VM i-638-1735-VM Succeed","wait":0}}] }
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>>>>>>> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379
>>>>>>> ctx-bbabfeae)
>>>>>>> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId: 345049103441,
>>>>>>> via:
>>>>>>> 15, Ver: v1, Flags: 10, { StopAnswer } }
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>> db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is stopped
>>>>>>> on
>>>>>>> the
>>>>>>> host. Proceeding to release resource held.
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>> supported
>>>>>>> in the network id=519
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) Changing active number of nics for
>>>>>>> network id=519 on -1
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to release
>>>>>>> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>>>>> 92.168.1.11-null
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>> db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released network
>>>>>>> resources
>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>> db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released storage
>>>>>>> resources
>>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) VM state transitted from
>>>>>>> :Stopping to
>>>>>>> Stopped with event: OperationSucceededvm's original host id: 15 new
>>>>>>> host
>>>>>>> id: null host id before state transition: 15
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU: 82976
>>>>>>> and
>>>>>>> CPU
>>>>>>> after applying overprovisioning: 331904
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM:
>>>>>>> 261961927808
>>>>>>> and RAM after applying overprovisioning: 261961924608
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old
>>>>>>> used:
>>>>>>> 71300,reserved: 2100, actual total: 82976, total with
>>>>>>> overprovisioning:
>>>>>>> 331904; new used: 67100,reserved:6300; movedfromreserved:
>>>>>>> false,moveToReserveredtrue
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old
>>>>>>> used:
>>>>>>> 98784247808,reserved: 1073741824, total: 261961924608; new used:
>>>>>>> 94489280512,reserved:5368709120;
>>>>>>> movedfromreserved:
>>>>>>>
>>>>>>> false,moveToReserveredtrue
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>> supported
>>>>>>> in the network id=519
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>>> supported
>>>>>>> in the network id=519
>>>>>>>
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused by:
>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>>>>>> VM[User|i-638-1735-VM]
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
>>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>>>>>> VM[User|i-638-1735-VM]
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379) Done with run of VM work job:
>>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379,
>>>>>>> userId:
>>>>>>> 1,
>>>>>>> accountId: 1, instanceType: null, instanceId: null, cmd:
>>>>>>> com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>>>>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>>>>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>>>>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>>>>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>>>>> FuYWdlckltcGwAAAAAAAAA4A,
>>>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>>>>>>> result: null, initMsid: 345049098498, completeMsid: null,
>>>>>>> lastUpdated:
>>>>>>> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017}, job
>>>>>>> origin:13733
>>>>>>> ewcstack-man02-prod: com.cloud.utils.exception.Clou
>>>>>>> dRuntimeException:
>>>>>>> Unable to migrate VM[User|i-638-1735-VM]
>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
>>>>>>> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAcc
>>>>>>> essor402.invoke(Unknown
>>>>>>> Source)
>>>>>>> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
>>>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
>>>>>>> e(Method.java:606)
>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
>>>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
>>>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>>>> erImpl.java:537)
>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>>>> ext.java:56)
>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>>>> nagedContext.java:103)
>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>>>> agedContext.java:53)
>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.Executors
>>>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
>>>>>>> k.run(FutureTask.java:262)
>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>> db528
>>>>>>> job-13733/job-191379) Complete async job-191379, jobStatus: FAILED,
>>>>>>> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>>>>> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>>>>> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>>>>> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>>>>> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>>>>> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>>>>> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>>>>> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>>>>> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>>>>> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>>>>> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>>>>> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>>>>> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>>>>> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>>>>> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>>>>> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>>>>> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>>>>> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>>>>> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>>>>> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>>>>> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>>>>> ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>>>>> 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>>>>> AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>>>>> cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>>>>> AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>>>>> YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>>>>> ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>>>>> YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>>>>> dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>>>>> dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>>>>> Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>>>>> dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>>>>> cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>>>>> TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>>>>> ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>>>>> LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>>>>> dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>>>>> ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>>>>> JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>>>>> AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>>>>> ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>>>>> YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>>>>> AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>>>>> JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>>>>> jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>>>>> bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>>>>> dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>>>>> x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>> db528
>>>>>>> job-13733/job-191379) Publish async job-191379 complete on message
>>>>>>> bus
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>> db528
>>>>>>> job-13733/job-191379) Wake up jobs related to job-191379
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>> db528
>>>>>>> job-13733/job-191379) Update db status for job-191379
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>> db528
>>>>>>> job-13733/job-191379) Wake up jobs joined with job-191379 and disjoin
>>>>>>> all
>>>>>>> subjobs created from job- 191379
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5
>>>>>>> db528
>>>>>>> job-13733/job-191379) Done executing com.cloud.vm.VmWorkMigrateAway
>>>>>>> for
>>>>>>> job-191379
>>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528
>>>>>>> job-13733/job-191379) Remove job-191379 from job monitoring
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>>>>>>
>>>>>>> Did you check the ACS log files looking for some exceptions when the
>>>>>>>
>>>>>>>> migrations for this specific VM started?
>>>>>>>>
>>>>>>>> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
>>>>>>>> francois.scheurer@everyware.ch> wrote:
>>>>>>>>
>>>>>>>> Dear CS contributors
>>>>>>>>
>>>>>>>> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared
>>>>>>>>> primary
>>>>>>>>> storage.
>>>>>>>>>
>>>>>>>>> When we put a host in maintenance, cs evacuates all VM's on other
>>>>>>>>> hosts.
>>>>>>>>>
>>>>>>>>> But it happens regularly that some VM's are stopped instead of live
>>>>>>>>> migrated.
>>>>>>>>>
>>>>>>>>> If we restart that stopped VM's, CS will then stop them again
>>>>>>>>> later.
>>>>>>>>>
>>>>>>>>> We need to start them several times until they stays up.
>>>>>>>>>
>>>>>>>>> Is it a known issue? Is it fixed on CS 4.9.2 ?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Migrating manually all VM's is working but impracticable because
>>>>>>>>> too
>>>>>>>>> much
>>>>>>>>> time consuming.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Many thanks in advance for your help.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best Regards
>>>>>>>>> Francois
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> EveryWare AG
>>>>>>>>> François Scheurer
>>>>>>>>> Senior Systems Engineer
>>>>>>>>> Zurlindenstrasse 52a
>>>>>>>>> CH-8003 Zürich
>>>>>>>>>
>>>>>>>>> tel: +41 44 466 60 00
>>>>>>>>> fax: +41 44 466 60 10
>>>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>>>> web: http://www.everyware.ch
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>
>>>>>>>
>>>>>>> EveryWare AG
>>>>>>> François Scheurer
>>>>>>> Senior Systems Engineer
>>>>>>> Zurlindenstrasse 52a
>>>>>>> CH-8003 Zürich
>>>>>>>
>>>>>>> tel: +41 44 466 60 00
>>>>>>> fax: +41 44 466 60 10
>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>> web: http://www.everyware.ch
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>> Rafael Weingärtner
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>>
>>>>
>>>> EveryWare AG
>>>> François Scheurer
>>>> Senior Systems Engineer
>>>> Zurlindenstrasse 52a
>>>> CH-8003 Zürich
>>>>
>>>> tel: +41 44 466 60 00
>>>> fax: +41 44 466 60 10
>>>> mail: francois.scheurer@everyware.ch
>>>> web: http://www.everyware.ch
>>>>
>>>>
>>>
>>>
>> --
>>
>>
>> EveryWare AG
>> François Scheurer
>> Senior Systems Engineer
>> Zurlindenstrasse 52a
>> CH-8003 Zürich
>>
>> tel: +41 44 466 60 00
>> fax: +41 44 466 60 10
>> mail: francois.scheurer@everyware.ch
>> web: http://www.everyware.ch
>>
>>
>
>
> --
> Rafael Weingärtner
>



-- 
Rafael Weingärtner

Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Francois Scheurer <fr...@everyware.ch>.
Dear Rafael


We needed to restart the MS's to apply new global parameters.


Best Regards

Francois





On 31.01.2017 16:26, Rafael Weingärtner wrote:
> I will be able to re-check some code and answer you with new thoughts 
> on this problem tonight (the log files you provided today have some 
> very insightful information regarding this problem). I believe this is 
> a bug in HA, a very tricky one that has not been reported yet (at 
> least I do not remember seeing anything similar). And thanks to 
> randomness/destiny now you have found/uncovered a way to reproduce the 
> problem (bug?!).
>
> I do not suggest changing any data at all directly in the database
> (unless you know the ACS code well and are 100% sure of what you are
> doing). This can cause even bigger and more complicated problems. If
> you remove the HA feature from these VMs, this problem will stop.
> However, if you can/are willing to maintain HA until we fully
> uncover the problem, map it and report it, this would be great.
> BTW: why are you restarting MSs?
> What are the configurations of your MSs?
>
> Thanks for your time and for sharing information with our community.
>
>
>
> On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer 
> <francois.scheurer@everyware.ch> wrote:
>
>     Dear Rafael
>
>
>     Many thanks for your deep analysis.
>     Nice humor with the planet alignments BTW ;)
>
>     We also suspected the HA flag to be related to this issue, as we
>     never had this issue in the past, until we enabled HA on all
>     VMs and Offerings.
>
>     Here is also the workers list:
>
>     mysql> select * from op_ha_work;
>     +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>     | id   | instance_id | type      | vm_type            | state    |
>     mgmt_server_id | host_id | created             | tried | taken   
>                | step  | time_to_try | updated |
>     +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
>     |  105 |         343 | HA        | User               | Running 
>     |   345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28
>     03:30:04 | Error |  1394056013 |       3 |
>     |  109 |         407 | HA        | ConsoleProxy       | Running 
>     |   345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28
>     03:30:04 | Error |  1394056013 |       3 |
>     |  113 |         407 | HA        | ConsoleProxy       | Running 
>     |   345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28
>     03:30:04 | Error |  1394056013 |       3 |
>     |  117 |         407 | HA        | ConsoleProxy       | Running 
>     |   345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28
>     03:30:04 | Error |  1394056013 |       3 |
>     |  121 |         343 | HA        | User               | Running 
>     |   345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28
>     03:30:04 | Error |  1394056013 |       3 |
>     |  125 |         405 | HA        | SecondaryStorageVm | Running 
>     |   345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28
>     03:30:04 | Error |  1394056013 |       3 |
>     |  129 |         393 | HA        | User               | Running 
>     |   345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28
>     03:30:04 | Error |  1394056014 |      30 |
>     |  133 |         405 | HA        | SecondaryStorageVm | Running 
>     |   345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28
>     03:30:04 | Error |  1394056014 |       3 |
>     |  137 |         402 | HA        | ConsoleProxy       | Running 
>     |   345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28
>     03:30:04 | Error |  1394056014 |       3 |
>     |  345 |         402 | HA        | ConsoleProxy       | Starting
>     |   345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06
>     13:09:26 | Error |  1402527506 |     770 |
>     |  349 |         687 | HA        | SecondaryStorageVm | Starting
>     |   345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06
>     13:09:26 | Error |  1402527506 |       2 |
>     | 2057 |        1178 | Migration | User               | Running 
>     |   345049098498 |      33 | 2017-01-31 07:56:08 | 0 | 2017-01-31
>     07:56:08 | Done  |  1451024774 |      95 |
>     | 2058 |        1736 | Migration | User               | Running 
>     |   345049098498 |     224 | 2017-01-31 07:56:08 | 0 | 2017-01-31
>     07:56:08 | Done  |  1451024774 |      45 |
>     | 2059 |        1240 | Migration | User               | Running 
>     |   345049098498 |       7 | 2017-01-31 07:56:08 | 0 | 2017-01-31
>     07:56:08 | Done  |  1451024774 |      61 |
>     | 2060 |        1178 | Migration | User               | Running 
>     |   345049098122 |      33 | 2017-01-31 07:56:08 | 1 | 2017-01-31
>     08:06:37 | Done  |  1451025374 |      95 |
>     | 2061 |        1178 | Migration | User               | Stopped 
>     |           NULL |      33 | 2017-01-31 07:56:09 | 5 | NULL       
>             | Done  |  1451027900 |     109 |
>     | 2062 |        1690 | Migration | User               | Running 
>     |   345049098498 |     224 | 2017-01-31 07:56:09 | 0 | 2017-01-31
>     07:56:09 | Done  |  1451024774 |     109 |
>     | 2063 |        1177 | Migration | User               | Running 
>     |   345049098498 |      29 | 2017-01-31 07:56:09 | 0 | 2017-01-31
>     07:56:09 | Done  |  1451024774 |      82 |
>     | 2064 |        1690 | Migration | User               | Running 
>     |   345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31
>     07:58:18 | Done  |  1451024900 |     109 |
>     | 2065 |        1240 | Migration | User               | Running 
>     |   345049098122 |       7 | 2017-01-31 07:56:09 | 1 | 2017-01-31
>     07:58:18 | Done  |  1451024900 |      61 |
>     | 2066 |        1736 | Migration | User               | Running 
>     |   345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31
>     07:58:18 | Done  |  1451024900 |      45 |
>     | 2067 |        1240 | Migration | User               | Stopped 
>     |           NULL |       7 | 2017-01-31 07:56:09 | 5 | NULL       
>             | Done  |  1451028003 |      80 |
>     | 2068 |        1178 | Migration | User               | Running 
>     |   345049098122 |      33 | 2017-01-31 07:56:16 | 0 | 2017-01-31
>     07:56:16 | Done  |  1451024781 |      95 |
>     | 2069 |        1736 | Migration | User               | Running 
>     |   345049098122 |     224 | 2017-01-31 07:56:16 | 0 | 2017-01-31
>     07:56:16 | Done  |  1451024781 |      45 |
>     | 2070 |        1240 | Migration | User               | Running 
>     |   345049098122 |       7 | 2017-01-31 07:56:16 | 0 | 2017-01-31
>     07:56:16 | Done  |  1451024781 |      61 |
>     | 2071 |        1177 | Migration | User               | Running 
>     |   345049098122 |      29 | 2017-01-31 07:56:16 | 0 | 2017-01-31
>     07:56:16 | Done  |  1451024781 |      82 |
>     | 2072 |        1177 | Migration | User               | Stopped 
>     |           NULL |      29 | 2017-01-31 07:56:16 | 5 | NULL       
>             | Done  |  1451027973 |     101 |
>     | 2073 |        1177 | Migration | User               | Running 
>     |   345049098122 |      29 | 2017-01-31 07:56:16 | 1 | 2017-01-31
>     07:58:22 | Done  |  1451024904 |      82 |
>     | 2074 |        1690 | Migration | User               | Stopped 
>     |           NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL       
>             | Done  |  1451027464 |     123 |
>     | 2075 |        1736 | Migration | User               | Stopped 
>     |           NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL       
>             | Done  |  1451027477 |      59 |
>     +------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
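>
>     To narrow that dump down to the migration-related rows only, a
>     filter like the following should work (the column names are the
>     ones visible in the output above, so this is only an illustrative
>     query, not something we depend on):
>
>     select id, instance_id, vm_type, state, tried, taken, step, created
>     from op_ha_work
>     where type = 'Migration'
>     order by created;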
>
>     If we put a host in maintenance, it may unfortunately happen that
>     a timeout occurs and CS aborts the migration.
>     As far as we can interpret the logs, it seems (we speculate) that
>     CS detects that the VM is then still running on the original host,
>     which is in maintenance, and decides somehow to stop it because of
>     the "host not up" status.
>     We then need to start the VM several times, until CS gives up
>     stopping it.
>
>     After restarting the MSs I guess that a similar issue may happen:
>     the hosts are disconnected during the early stage and then
>     reconnect. Maybe CS then stops the VMs again because of the "host
>     not up" status?
>     Cf. 2017-01-30 08:44:06,157 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>     ctx-6e9c753f) The last host of this VM is not UP or is not
>     enabled, host status is: Disconnected, host resource state is: Enabled
>
>     But the problem after the MSs restart definitely impacts only the
>     VMs listed in the table op_ha_work.
>
>
>     To your questions:
>             We have 3 ACS MSs managing several Zones. Each zone has a
>     single pod and cluster.
>             The Cluster of i-638-1735-VM has this summary:
>              Total number of VMs: 121
>              Total number hypervisors: 4
>              Total allocated RAM: 522.0 / 972 GB (53.7 %)
>              Total allocated cores: 211
>              Total allocated storage: 11285 GB
>             Used memory on the hosts currently varies from 28% to 77%, so
>     there is enough RAM.
>             I did not check the XS logs. The same problem arose again this
>     morning (shortly after a restart of all MSs), so I will look in the
>     XS logs now.
>
>             I have some fair experience with Eclipse and Java; I will ask
>     my boss if we can go this way.
>
>     The main issue for now is that restarting the MSs reproduces the
>     problem, which is quite annoying for the impacted VMs.
>     My idea is to stop the MSs, delete all HA/Migration entries in
>     op_ha_work and then start the MSs again.
>     Is that a viable option? (Or should I delete only the Migration
>     entries? Or never touch this table at all?)
>     We will also very probably remove the HA flag on all VMs if HA does
>     not work reliably.
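>
>     For the narrower variant (deleting only the Migration entries while
>     all MSs are stopped), I imagine something like this -- purely a
>     sketch, we have not run it yet, and it assumes op_ha_work can be
>     edited safely while the MSs are down:
>
>     -- check first what would be removed
>     select id, instance_id, type, state, step, created
>     from op_ha_work where type = 'Migration';
>
>     -- then, with every MS stopped, remove those rows
>     delete from op_ha_work where type = 'Migration';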
>
>     Best Regards
>     Francois Scheurer
>
>
>
>     Here are new logs regarding the same issue from today.
>     I tried to extract the important parts related to the stop of
>     i-189-1178-VM.
>
>     regexp:
>     (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRequestedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae733|work-2049)
>
>     2017-01-30 08:44:04,440 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Executing
>     AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType:
>     null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway,
>     cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
>     F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>     cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resu
>     ltCode: 0, result: null, initMsid: 345049103441, completeMsid:
>     null, lastUpdated: null, lastPolled: null, created: Mon Jan 30
>     08:44:03 CET 2017}
>     2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Run VM
>     work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin:
>     15393
>
>     2017-01-30 08:44:05,217 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Deploy avoids pods: [], clusters: [], hosts: [29]
>     2017-01-30 08:44:05,220 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
>     (AgentConnectTaskPool-4:ctx-5b61f2e5) create ClusteredAgentAttache
>     for 232
>     2017-01-30 08:44:05,220 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) DeploymentPlanner allocation algorithm:
>     com.cloud.deploy.UserDispersingPlanner@1734b2e2
>     2017-01-30 08:44:05,220 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Trying to allocate a host and storage pools from
>     dc:17, pod:25,cluster:25, requested cpu: 4200, requested ram:
>     4294967296
>     2017-01-30 08:44:05,220 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Is ROOT volume READY (pool already allocated)?: Yes
>     2017-01-30 08:44:05,220 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) This VM has last host_id specified, trying to choose
>     the same host: 25
>
>     2017-01-30 08:44:05,614 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) The last host of this VM is not UP or is not
>     enabled, host status is: Disconnected, host resource state is: Enabled
>     2017-01-30 08:44:05,614 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Cannot choose the last host to deploy this VM
>     2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Searching resources only under specified Cluster: 25
>
>     2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea FirstFitRoutingAllocator) Found 0 hosts for
>     allocation after prioritization: []
>     2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea FirstFitRoutingAllocator) Looking for speed=4200Mhz,
>     Ram=4096
>     2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea FirstFitRoutingAllocator) Host Allocator returning 0
>     suitable hosts
>     2017-01-30 08:44:05,777 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) No suitable hosts found
>     2017-01-30 08:44:05,777 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) No suitable hosts found under this Cluster: 25
>     2017-01-30 08:44:05,787 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Could not find suitable Deployment Destination for
>     this VM under any clusters, returning.
>     2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Searching resources only under specified Cluster: 25
>     2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) The specified cluster is in avoid set, returning.
>     2017-01-30 08:44:05,798 DEBUG [c.c.v.VirtualMachineManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Unable to find destination for migrating the vm
>     VM[User|i-189-1178-VM]
>     2017-01-30 08:44:05,798 WARN [c.c.v.VirtualMachineManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Failed to deploy vm 1178 with original planner,
>     sending HAPlanner
>
>     2017-01-30 08:44:05,985 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Deploy avoids pods: [], clusters: [], hosts: [29]
>     2017-01-30 08:44:05,985 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) DeploymentPlanner allocation algorithm:
>     com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>     2017-01-30 08:44:05,985 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Trying to allocate a host and storage pools from
>     dc:17, pod:25,cluster:25, requested cpu: 4200, requested ram:
>     4294967296
>     2017-01-30 08:44:05,985 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Is ROOT volume READY (pool already allocated)?: Yes
>     2017-01-30 08:44:05,985 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) This VM has last host_id specified, trying to choose
>     the same host: 25
>
>     2017-01-30 08:44:06,022 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) The last host of this VM is not UP or is not
>     enabled, host status is: Disconnected, host resource state is: Enabled
>     2017-01-30 08:44:06,022 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Cannot choose the last host to deploy this VM
>     2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Searching resources only under specified Cluster: 25
>     2017-01-30 08:44:06,039 DEBUG [c.c.d.SkipHeuresticsPlanner]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Deploying vm during HA process, so skipping disable
>     threshold check
>
>     2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea FirstFitRoutingAllocator) Found 0 hosts for
>     allocation after prioritization: []
>     2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea FirstFitRoutingAllocator) Looking for speed=4200Mhz,
>     Ram=4096
>     2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea FirstFitRoutingAllocator) Host Allocator returning 0
>     suitable hosts
>     2017-01-30 08:44:06,151 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) No suitable hosts found
>     2017-01-30 08:44:06,151 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) No suitable hosts found under this Cluster: 25
>     2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Publish
>     async job-192986 complete on message bus
>     2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up
>     jobs related to job-192986
>     2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Update db
>     status for job-192986
>     2017-01-30 08:44:06,157 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>     ctx-6e9c753f) The last host of this VM is not UP or is not
>     enabled, host status is: Disconnected, host resource state is: Enabled
>     2017-01-30 08:44:06,157 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>     ctx-6e9c753f) Cannot choose the last host to deploy this VM
>     2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner]
>     (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>     ctx-6e9c753f) Searching resources only under specified Cluster: 35
>     2017-01-30 08:44:06,160 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Could not find suitable Deployment Destination for
>     this VM under any clusters, returning.
>     2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Searching resources only under specified Cluster: 25
>     2017-01-30 08:44:06,163 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>     SyncQueueItemVO {id:63629, queueId: 64423, contentType: AsyncJob,
>     contentId: 192991, lastProcessMsid: 345049098498,
>     lastprocessNumber: 27, lastProcessTime: Mon Jan 30 08:44:06 CET
>     2017, created: Mon Jan 30 08:44:05 CET 2017}
>     2017-01-30 08:44:06,169 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>     (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Add
>     job-192992 into job monitoring
>     2017-01-30 08:44:06,184 DEBUG [c.c.d.SkipHeuresticsPlanner]
>     (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>     ctx-6e9c753f) Deploying vm during HA process, so skipping disable
>     threshold check
>     2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) The specified cluster is in avoid set, returning.
>     2017-01-30 08:44:06,186 DEBUG [c.c.v.VirtualMachineManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Unable to find destination for migrating the vm
>     VM[User|i-189-1178-VM]
>     2017-01-30 08:44:06,186 ERROR [c.c.v.VmWorkJobHandlerProxy]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Invocation exception, caused by:
>     com.cloud.exception.InsufficientServerCapacityException: Unable to
>     find a server to migrate to.Scope=interface com.cloud.org.Cluster;
>     id=25
>     2017-01-30 08:44:06,187 INFO  [c.c.v.VmWorkJobHandlerProxy]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989
>     ctx-ea83d6ea) Rethrow exception
>     com.cloud.exception.InsufficientServerCapacityException: Unable to
>     find a server to migrate to.Scope=interface com.cloud.org.Cluster;
>     id=25
>     2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done with
>     run of VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1178,
>     job origin: 15393
>     2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Unable to
>     complete AsyncJobVO {id:192989, userId: 1, accountId: 1,
>     instanceType: null, instanceId: null, cmd:
>     com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>     rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
>     cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode:
>     0, result: null, initMsid: 345049103441, completeMsid: null,
>     lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:03
>     CET 2017}, job origin:15393
>     com.cloud.exception.InsufficientServerCapacityException: Unable to
>     find a server to migrate to.Scope=interface com.cloud.org.Cluster;
>     id=25
>             at
>     com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2390)
>             at
>     com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4520)
>             at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>             at
>     sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>             at
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>             at java.lang.reflect.Method.invoke(Method.java:606)
>             at
>     com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>             at
>     com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>             at
>     com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:103)
>             at
>     org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
>             at
>     org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>             at
>     org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>             at
>     org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>             at
>     org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>             at
>     org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>             at
>     org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>             at
>     java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>             at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>             at
>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>             at
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>             at java.lang.Thread.run(Thread.java:745)
>     2017-01-30 08:44:06,188 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued job-192991
>     2017-01-30 08:44:06,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up
>     jobs joined with job-192986 and disjoin all subjobs created from
>     job- 192986
>     2017-01-30 08:44:06,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Complete
>     async job-192989, jobStatus: FAILED, resultCode: 0, result:
>     rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZlckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vwc0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRDYXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5nL0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xvdWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vycm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4aOxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNhdXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQAEkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFuZy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QAEExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBzZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFjZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFyaW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2VySW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hlc3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EAfgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNxAH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5yZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhqYXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EAfgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pvYnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNsb3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRjaGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1wbCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRleHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdlZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2FsbHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5jb250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxsV2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5jb25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRvcnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdUaHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29ya2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAALVGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5tb2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEudXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMAAAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAFdmFsdWV4cQB-A
FYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-omaW2FAIAAHhwAA
>     2017-01-30 08:44:06,201 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>     ctx-4c93bc31) Deploy avoids pods: [], clusters: [], hosts: [3]
>     2017-01-30 08:44:06,201 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>     ctx-4c93bc31) DeploymentPlanner allocation algorithm:
>     com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
>     2017-01-30 08:44:06,201 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>     ctx-4c93bc31) Trying to allocate a host and storage pools from
>     dc:14, pod:17,cluster:17, requested cpu: 4200, requested ram:
>     4294967296
>     2017-01-30 08:44:06,201 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>     ctx-4c93bc31) Is ROOT volume READY (pool already allocated)?: Yes
>     2017-01-30 08:44:06,201 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>     ctx-4c93bc31) This VM has last host_id specified, trying to choose
>     the same host: 213
>     2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Publish
>     async job-192989 complete on message bus
>     2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up
>     jobs related to job-192989
>     2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Update db
>     status for job-192989
>     2017-01-30 08:44:06,219 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>     ctx-6e9c753f) Checking resources in Cluster: 35 under Pod: 31
>     2017-01-30 08:44:06,227 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>     (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>     ctx-6e9c753f FirstFitRoutingAllocator) Looking for hosts in dc:
>     23  pod:31  cluster:35
>     2017-01-30 08:44:06,229 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Executing
>     AsyncJobVO {id:192992, userId: 1, accountId: 1, instanceType:
>     null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway,
>     cmdInfo:
>     rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw,
>     cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode:
>     0, result: null, initMsid: 345049098122, completeMsid: null,
>     lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:05
>     CET 2017}
>     2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher]
>     (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Run VM
>     work job: com.cloud.vm.VmWorkMigrateAway for VM 1729, job origin:
>     14953
>     2017-01-30 08:44:06,230 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
>     SyncQueueItemVO {id:63630, queueId: 67180, contentType: AsyncJob,
>     contentId: 192990, lastProcessMsid: 345049098498,
>     lastprocessNumber: 61, lastProcessTime: Mon Jan 30 08:44:06 CET
>     2017, created: Mon Jan 30 08:44:05 CET 2017}
>     2017-01-30 08:44:06,244 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>     ctx-4c93bc31) The last host of this VM is not UP or is not
>     enabled, host status is: Disconnected, host resource state is: Enabled
>     2017-01-30 08:44:06,262 DEBUG
>     [c.c.d.DeploymentPlanningManagerImpl]
>     (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>     ctx-4c93bc31) Cannot choose the last host to deploy this VM
>     2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner]
>     (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987
>     ctx-4c93bc31) Searching resources only under specified Cluster: 17
>     2017-01-30 08:44:06,262 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>     (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Add
>     job-192991 into job monitoring
>     2017-01-30 08:44:06,246 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Done
>     executing com.cloud.vm.VmWorkMigrateAway for job-192986
>     2017-01-30 08:44:06,244 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up
>     jobs joined with job-192989 and disjoin all subjobs created from
>     job- 192989
>     2017-01-30 08:44:06,269 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>     (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Remove
>     job-192986 from job monitoring
>     2017-01-30 08:44:06,279 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done
>     executing com.cloud.vm.VmWorkMigrateAway for job-192989
>     2017-01-30 08:44:06,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Executing
>     AsyncJobVO {id:192991, userId: 1, accountId: 1, instanceType:
>     null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway,
>     cmdInfo:
>     rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw,
>     cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode:
>     0, result: null, initMsid: 345049098122, completeMsid: null,
>     lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:05
>     CET 2017}
>     2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher]
>     (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Run VM
>     work job: com.cloud.vm.VmWorkMigrateAway for VM 1605, job origin:
>     14965
>     2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl]
>     (AgentManager-Handler-13:null) Ping from 258
>     2017-01-30 08:44:06,319 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>     (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>     ctx-6e9c753f FirstFitRoutingAllocator) List of hosts in ascending
>     order of number of VMs: [224, 19, 21, 15]
>     2017-01-30 08:44:06,322 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
>     (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988
>     ctx-6e9c753f FirstFitRoutingAllocator) FirstFitAllocator has 0
>     hosts to check for allocation: []
>     2017-01-30 08:44:06,322 INFO  [c.c.a.m.AgentManagerImpl]
>     (AgentManager-Handler-13:null) PingMap for agent: 258 will not be
>     updated because agent is no longer in the PingMap
>     2017-01-30 08:44:06,326 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>     (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Remove
>     job-192989 from job monitoring
>
>     2017-01-30 08:54:38,139 INFO [c.c.v.VirtualMachineManagerImpl]
>     (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>     ctx-052eb4f5) Migration cancelled because state has changed:
>     VM[User|i-189-1178-VM]
>     2017-01-30 08:54:38,139 DEBUG [c.c.v.VirtualMachineManagerImpl]
>     (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>     ctx-052eb4f5) Unable to migrate VM due to: Migration cancelled
>     because state has changed: VM[User|i-189-1178-VM]
>     2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl]
>     (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>     ctx-052eb4f5) VM state transitted from :Running to Stopping with
>     event: StopRequestedvm's original host id: 29 new host id: 33 host
>     id before state transition: 33
>     2017-01-30 08:54:38,177 DEBUG [c.c.a.m.DirectAgentAttache]
>     (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198:
>     Response Received:
>     2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>     (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198:
>     Processing:  { Ans: , MgmtId: 345049098498, via: 19, Ver: v1,
>     Flags: 10,
>     [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,"result":true,"details":"","wait":0}}]
>     }
>     2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
>     (RouterMonitor-1:ctx-2e80f405) Seq 19-2355664080091480198:
>     Received:  { Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags:
>     10, { NetworkUsageAnswer } }
>     2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl]
>     (RouterMonitor-1:ctx-2e80f405) Details from executing class
>     com.cloud.agent.api.NetworkUsageCommand:
>     2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>     (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>     ctx-052eb4f5) Seq 33-1917970491306410035: Sending  { Cmd , MgmtId:
>     345049098498, via: 33(ewcstack-vh003-prod), Ver: v1, Flags:
>     100011,
>     [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>     }
>     2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
>     (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019
>     ctx-052eb4f5) Seq 33-1917970491306410035: Executing:  { Cmd ,
>     MgmtId: 345049098498, via: 33(ewcstack-vh003-prod), Ver: v1,
>     Flags: 100011,
>     [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
>     }
>     2017-01-30 08:54:38,178 DEBUG [c.c.a.m.DirectAgentAttache]
>     (DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035:
>     Executing request
>     2017-01-30 08:54:38,202 DEBUG [c.c.a.m.DirectAgentAttache]
>     (DirectAgent-326:ctx-4fb88aa7) Seq 21-1035546439318503505:
>     Executing request
>
>     2017-01-30 09:05:55,628 INFO [c.c.h.HighAvailabilityManagerImpl]
>     (HA-Worker-4:ctx-36aae733 work-2049) Processing
>     HAWork[2049-Migration-1178-Stopped-Migrating]
>     2017-01-30 09:05:55,633 WARN [o.a.c.f.j.AsyncJobExecutionContext]
>     (HA-Worker-4:ctx-36aae733 work-2049) Job is executed without a
>     context, setup psudo job for the executing thread
>     2017-01-30 09:05:55,645 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045 execution on
>     object VmWorkJobQueue.1178
>
>
>
>
>     On 31.01.2017 00:42, Rafael Weingärtner wrote:
>
>         This seems to be one of those nasty bugs that are very hard to
>         debug. It seems to be caused by a combination of different
>         factors, such as planet alignments ;)
>
>         In class “VirtualMachineManagerImpl”, line 2395 (method
>         “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>         long, DeploymentPlanner)”) is the start of the code flow that
>         triggers the migration. The point here is that, if an error
>         occurs during the migration, the VM will be stopped; for some
>         reason this is executed only for HA VMs.
>
>         Going a little bit further, we find the point that is
>         triggering the error
>         in the method
>         “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO,
>         long, DeployDestination)” at line 1941. This line will call
>         the method
>         “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>         Event,
>         Long, ItWorkVO, Step)” to change the VM state to migrating/prepare
>         migration/or something like that (I am guessing here), but for
>         some reason,
>         the state of the VM is not the one expected and this is causing a
>         “ConcurrentOperationException” to be thrown. The problem is
>         that the
>         “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO,
>         Event,
>         Long, ItWorkVO, Step)” does not seem to provide many useful
>         log entries.
>
>         Just to enable us to discuss the problem a little further,
>         could you tell
>         us a little bit about your environment setting? How many ACS
>         us a little bit about your environment settings? How many ACS
>         have? What is the VM workload being migrated (I mean, the
>         number of VMs and
>         computing resources)? Do the other servers have enough
>         computing resources
>         to support these VMs? Could you also check the XenServer logs,
>         and check
>         what happened on both the source and destination hosts when the
>         migration for
>         VM “i-638-1735-VM” was triggered?
>
>         Do you have the knowledge/skills to use an IDE (e.g. Eclipse)
>         to connect to
>         a running JVM and debug the code? Because for me it will not be
>         possible to
>         go further without looking at the variables in a running
>         environment.
>
>
>         On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
>         francois.scheurer@everyware.ch> wrote:
>
>             Dear Rafael
>
>
>             Sorry for the late answer (I was sick for a couple of days).
>
>             I confirm we are migrating inside the same xenserver pool
>             (within a CS
>             Cluster).
>
>
>             Best Regards
>             Francois
>
>
>
>             On 26.01.2017 04:35, Rafael Weingärtner wrote:
>
>                 Found the code that is executed.
>                 I need you to confirm whether you are migrating within a
>                 cluster or across clusters.
>
>                 On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
>                 rafaelweingartner@gmail.com> wrote:
>
>                 I do not have access to ACS code right now, but I
>                 suggest starting
>
>                     debugging here:
>
>                     2017-01-24 18:05:28,427 DEBUG
>                     [c.c.c.CapacityManagerImpl]
>
>                         (Work-Job-Executor-156:ctx-14b86f5e
>                         job-60216/job-191380 ctx-175d37df)
>                         VM state transitted from :Running to Stopping
>                         with event:
>                         StopRequestedvm's
>                         original host id: 224 new host id: 15 host id
>                         before state transition:
>                         15
>
>                         I would try to understand first why ACS
>                         requested the hypervisor to stop
>
>                     the VM. I mean the following: check the source
>                     code for conditions that
>                     would make ACS request the shutdown of VMs.
>                     BTW: are you migrating within a cluster or across
>                     clusters?
>
>                     I will be able to analyze this further only after
>                     work hours, so if you
>                     find anything, keep me posted.
>
>                     On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
>                     francois.scheurer@everyware.ch> wrote:
>
>                     Hello Rafael
>
>
>                         I think the important log lines are these:
>
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>                         [c.c.v.VirtualMachineManagerImpl]
>                         (Work-Job-Executor-156:ctx-14b86f5e
>                         job-60216/job-191380 ctx-175d37df) Migration
>                         cancelled because state has
>                         changed: VM[User|i-638-1736-VM]
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>                         [c.c.v.VirtualMachineManagerImpl]
>                         (Work-Job-Executor-156:ctx-14b86f5e
>                         job-60216/job-191380 ctx-175d37df) Unable to
>                         migrate VM due to:
>                         Migration
>                         cancelled because state has changed:
>                         VM[User|i-638-1736-VM]
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>                         [c.c.c.CapacityManagerImpl]
>                         (Work-Job-Executor-156:ctx-14b86f5e
>                         job-60216/job-191380 ctx-175d37df) VM state
>                         transitted from :Running to
>                         Stopping with event: StopRequestedvm's
>                         original host id: 224 new host
>                         id:
>                         15 host id before state transition: 15
>
>                         The really disturbing thing is that CS stops the
>                         VM again if we start it manually, apparently
>                         because the migration job is still running.
>
>                         Note that the VM HA flag is enabled.
>
>
>
>                         Thank you.
>
>
>                         Best Regards.
>
>                         Francois
>
>
>
>
>
>
>                         more details:
>
>
>
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>                         [o.a.c.e.o.VolumeOrchestrator]
>                         (Work-Job-Executor-156:ctx-14b86f5e
>                         job-60216/job-191380 ctx-175d37df) Preparing 3
>                         volumes for
>                         VM[User|i-638-1736-VM]
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>                         [c.c.v.VmWorkJobDispatcher]
>                         (Work-Job-Executor-122:ctx-a8bacc64
>                         job-14953/job-191351) Done with run of VM work
>                         job:
>                         com.cloud.vm.VmWorkMigrateAway for VM 1411,
>                         job origin: 14953
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>                         (Work-Job-Executor-122:ctx-a8bacc64
>                         job-14953/job-191351) Done executing
>                         com.cloud.vm.VmWorkMigrateAway for
>                         job-191351
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>                         [o.a.c.f.j.i.AsyncJobMonitor]
>                         (Work-Job-Executor-122:ctx-a8bacc64
>                         job-14953/job-191351) Remove job-191351 from
>                         job monitoring
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>                         [c.c.h.HighAvailabilityManagerImpl]
>                         (HA-Worker-2:ctx-f3bcf43c
>                         work-1989)
>                         Completed
>                         HAWork[1989-Migration-1411-Running-Migrating]
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>                         [c.c.a.m.ClusteredAgentAttache]
>                         (Work-Job-Executor-156:ctx-14b86f5e
>                         job-60216/job-191380 ctx-175d37df) Seq
>                         19-8526158519542516375:
>                         Forwarding
>                         Seq 19-8526158519542516375:  { Cmd , MgmtId:
>                         345049103441, via:
>                         19(ewcstack-vh010-prod), Ver: v1, Flags:
>                         100111, [{
>                         "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>                         :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>                         ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>                         869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>                         (64-bit)","platformEmulator":"Other install
>                         media","bootArgs":"","enable
>                         HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>                         "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>                         ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>                         rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>                         age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
>                         hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>                         -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>                         stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>                         4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>                         apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f
>                         6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>                         t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>                         ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>                         7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>                         ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3
>                         f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>                         :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>                         gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>                         XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>                         -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>                         storagePort":"0",
>                         "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>                         a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>                         ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>                         ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>                         ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
>                         poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>                         f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>                         ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>                         UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>                         DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2
>                         2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>                         ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>                         deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>                         path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>                         K","_details":{"managed":"false","storagePort":"0","stor
>                         ageHost":"localhos
>                         t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>                         tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>                         8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>                         apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>                         :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup","
>                         host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>                         13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>                         -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>
>                         e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>                         ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>                         ,"volumeId":86
>                         57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>                         provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>                         ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>                         20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>                         false","storagePort":"0","storageHost":"localhost","volumeSize":"42949
>                         6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>                         teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>                         id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>                         ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS
>                         4.5
>                         (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>                         ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>                         Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>                         d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>                         a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>                         gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>                         .8.8","dns2":"8.8
>                         .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>                         vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>                         abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}]
>                         } to
>                         345049098498
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,399
>                         DEBUG [c.c.a.t.Request]
>                         (AgentManager-Handler-10:null) Seq
>                         19-8526158519542516375: Processing:
>                         {
>                         Ans: , MgmtId: 345049103441, via: 19, Ver: v1,
>                         Flags: 110,
>                         [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>                         true,"wait":0}}]
>                         }
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>                         [c.c.a.m.AgentAttache]
>                         (AgentManager-Handler-10:null) Seq
>                         19-8526158519542516375: No more
>                         commands
>                         found
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,399
>                         DEBUG [c.c.a.t.Request]
>                         (Work-Job-Executor-156:ctx-14b86f5e
>                         job-60216/job-191380 ctx-175d37df)
>                         Seq 19-8526158519542516375: Received:  { Ans:
>                         , MgmtId: 345049103441,
>                         via:
>                         19, Ver: v1, Flags: 110, {
>                         PrepareForMigrationAnswer } }
>
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>                         [c.c.v.VirtualMachineManagerImpl]
>                         (Work-Job-Executor-156:ctx-14b86f5e
>                         job-60216/job-191380 ctx-175d37df) Migration
>                         cancelled because state has
>                         changed: VM[User|i-638-1736-VM]
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>                         [c.c.v.VirtualMachineManagerImpl]
>                         (Work-Job-Executor-156:ctx-14b86f5e
>                         job-60216/job-191380 ctx-175d37df) Unable to
>                         migrate VM due to:
>                         Migration
>                         cancelled because state has changed:
>                         VM[User|i-638-1736-VM]
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>                         [c.c.c.CapacityManagerImpl]
>                         (Work-Job-Executor-156:ctx-14b86f5e
>                         job-60216/job-191380 ctx-175d37df) VM state
>                         transitted from :Running to
>                         Stopping with event: StopRequestedvm's
>                         original host id: 224 new host
>                         id:
>                         15 host id before state transition: 15
>
>                         ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>                         [c.c.a.m.ClusteredAgentAttache]
>                         (Work-Job-Executor-156:ctx-14b86f5e
>                         job-60216/job-191380 ctx-175d37df) Seq
>                         15-6889944479923437643:
>                         Forwarding
>                         Seq 15-6889944479923437643:  { Cmd , MgmtId:
>                         345049103441, via:
>                         15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>                         [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>                         eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>                         638-1736-VM","wait":0}}]
>                         } to 345049098498
>                         ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>                         [c.c.a.m.AgentManagerImpl]
>                         (AgentManager-Handler-1:null) SeqA
>                         254-800498:
>                         Processing Seq 254-800498:  { Cmd , MgmtId:
>                         -1, via: 254, Ver: v1,
>                         Flags:
>                         11,
>                         [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>                         proxyVmId":1530,"_loadInfo":"{\n
>                         \"connections\": []\n}","wait":0}}] }
>                         ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>                         [c.c.a.m.AgentManagerImpl]
>                         (AgentManager-Handler-1:null) SeqA
>                         254-800498:
>                         Sending Seq 254-800498:  { Ans: , MgmtId:
>                         345049103441, via: 254, Ver:
>                         v1,
>                         Flags: 100010, [{"com.cloud.agent.api.AgentCo
>                         ntrolAnswer":{"result":true,"wait":0}}] }
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,447
>                         DEBUG [c.c.a.t.Request]
>                         (AgentManager-Handler-11:null) Seq
>                         15-6889944479923437632: Processing:
>                         {
>                         Ans: , MgmtId: 345049103441, via: 15, Ver: v1,
>                         Flags: 10,
>                         [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>                         e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>                         ":true,"details":"Stop
>                         VM i-638-1735-VM Succeed","wait":0}}] }
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,447
>                         DEBUG [c.c.a.t.Request]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae)
>                         Seq 15-6889944479923437632: Received:  { Ans:
>                         , MgmtId: 345049103441,
>                         via:
>                         15, Ver: v1, Flags: 10, { StopAnswer } }
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>                         [c.c.v.VirtualMachineManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae)
>                         VM[User|i-638-1735-VM] is stopped on
>                         the
>                         host. Proceeding to release resource held.
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>                         [c.c.n.NetworkModelImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae) Service
>                         SecurityGroup is not
>                         supported
>                         in the network id=519
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>                         [o.a.c.e.o.NetworkOrchestrator]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae) Changing
>                         active number of nics for
>                         network id=519 on -1
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>                         [o.a.c.e.o.NetworkOrchestrator]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae) Asking
>                         VpcVirtualRouter to release
>                         NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>                         92.168.1.11-null
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>                         [c.c.v.VirtualMachineManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae)
>                         Successfully released network
>                         resources
>                         for the vm VM[User|i-638-1735-VM]
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>                         [c.c.v.VirtualMachineManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae)
>                         Successfully released storage
>                         resources
>                         for the vm VM[User|i-638-1735-VM]
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>                         [c.c.c.CapacityManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae) VM state
>                         transitted from :Stopping to
>                         Stopped with event: OperationSucceededvm's
>                         original host id: 15 new host
>                         id: null host id before state transition: 15
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>                         [c.c.c.CapacityManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae) Hosts's
>                         actual total CPU: 82976 and
>                         CPU
>                         after applying overprovisioning: 331904
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>                         [c.c.c.CapacityManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae) Hosts's
>                         actual total RAM:
>                         261961927808
>                         and RAM after applying overprovisioning:
>                         261961924608
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>                         [c.c.c.CapacityManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae) release cpu
>                         from host: 15, old used:
>                         71300,reserved: 2100, actual total: 82976,
>                         total with overprovisioning:
>                         331904; new used: 67100,reserved:6300;
>                         movedfromreserved:
>                         false,moveToReserveredtrue
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>                         [c.c.c.CapacityManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae) release mem
>                         from host: 15, old used:
>                         98784247808,reserved: 1073741824, total:
>                         261961924608; new used:
>                         94489280512,reserved:5368709120;
>                         movedfromreserved:
>
>                         false,moveToReserveredtrue
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>                         [c.c.n.NetworkModelImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae) Service
>                         SecurityGroup is not
>                         supported
>                         in the network id=519
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>                         [c.c.n.NetworkModelImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae) Service
>                         SecurityGroup is not
>                         supported
>                         in the network id=519
>
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>                         [c.c.v.VmWorkJobHandlerProxy]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae) Invocation
>                         exception, caused by:
>                         com.cloud.utils.exception.CloudRuntimeException:
>                         Unable to migrate
>                         VM[User|i-638-1735-VM]
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>                         [c.c.v.VmWorkJobHandlerProxy]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379 ctx-bbabfeae) Rethrow
>                         exception
>                         com.cloud.utils.exception.CloudRuntimeException:
>                         Unable to migrate
>                         VM[User|i-638-1735-VM]
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>                         [c.c.v.VmWorkJobDispatcher]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379) Done with run of VM work
>                         job:
>                         com.cloud.vm.VmWorkMigrateAway for VM 1735,
>                         job origin: 13733
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>                         [c.c.v.VmWorkJobDispatcher]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379) Unable to complete
>                         AsyncJobVO {id:191379, userId:
>                         1,
>                         accountId: 1, instanceType: null, instanceId:
>                         null, cmd:
>                         com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>                         rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>                         1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>                         0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>                         VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>                         t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>                         FuYWdlckltcGwAAAAAAAAA4A,
>                         cmdVersion: 0, status: IN_PROGRESS,
>                         processStatus: 0, resultCode: 0,
>                         result: null, initMsid: 345049098498,
>                         completeMsid: null, lastUpdated:
>                         null, lastPolled: null, created: Tue Jan 24
>                         18:05:00 CET 2017}, job
>                         origin:13733
>                         ewcstack-man02-prod:
>                         com.cloud.utils.exception.CloudRuntimeException:
>                         Unable to migrate VM[User|i-638-1735-VM]
>                         ewcstack-man02-prod:    at
>                         com.cloud.vm.VirtualMachineMan
>                         agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
>                         ewcstack-man02-prod:    at
>                         com.cloud.vm.VirtualMachineMan
>                         agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
>                         ewcstack-man02-prod:    at
>                         sun.reflect.GeneratedMethodAcc
>                         essor402.invoke(Unknown
>                         Source)
>                         ewcstack-man02-prod:    at
>                         sun.reflect.DelegatingMethodAc
>                         cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>                         ewcstack-man02-prod:    at
>                         java.lang.reflect.Method.invok
>                         e(Method.java:606)
>                         ewcstack-man02-prod:    at
>                         com.cloud.vm.VmWorkJobHandlerP
>                         roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>                         ewcstack-man02-prod:    at
>                         com.cloud.vm.VirtualMachineMan
>                         agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>                         ewcstack-man02-prod:    at
>                         com.cloud.vm.VmWorkJobDispatch
>                         er.runJob(VmWorkJobDispatcher.java:103)
>                         ewcstack-man02-prod:    at
>                         org.apache.cloudstack.framewor
>                         k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>                         erImpl.java:537)
>                         ewcstack-man02-prod:    at
>                         org.apache.cloudstack.managed.
>                         context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>                         ewcstack-man02-prod:    at
>                         org.apache.cloudstack.managed.
>                         context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>                         ext.java:56)
>                         ewcstack-man02-prod:    at
>                         org.apache.cloudstack.managed.
>                         context.impl.DefaultManagedContext.callWithContext(DefaultMa
>                         nagedContext.java:103)
>                         ewcstack-man02-prod:    at
>                         org.apache.cloudstack.managed.
>                         context.impl.DefaultManagedContext.runWithContext(DefaultMan
>                         agedContext.java:53)
>                         ewcstack-man02-prod:    at
>                         org.apache.cloudstack.managed.
>                         context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>                         ewcstack-man02-prod:    at
>                         org.apache.cloudstack.framewor
>                         k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>                         ewcstack-man02-prod:    at
>                         java.util.concurrent.Executors
>                         $RunnableAdapter.call(Executors.java:471)
>                         ewcstack-man02-prod:    at
>                         java.util.concurrent.FutureTas
>                         k.run(FutureTask.java:262)
>                         ewcstack-man02-prod:    at
>                         java.util.concurrent.ThreadPoo
>                         lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>                         ewcstack-man02-prod:    at
>                         java.util.concurrent.ThreadPoo
>                         lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>                         ewcstack-man02-prod:    at
>                         java.lang.Thread.run(Thread.java:745)
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379) Complete async
>                         job-191379, jobStatus: FAILED,
>                         resultCode: 0, result:
>                         rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>                         MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>                         ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>                         ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>                         ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>                         F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>                         FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>                         JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>                         RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>                         ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>                         50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>                         xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>                         NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>                         lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>                         5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>                         1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>                         90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>                         ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>                         hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>                         wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>                         hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>                         ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>                         94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>                         AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>                         cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>                         AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>                         YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>                         ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>                         YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>                         dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>                         dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>                         Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>                         dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>                         cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>                         TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>                         ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>                         LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>                         dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>                         ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>                         JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>                         AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>                         ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>                         ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>                         YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>                         ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>                         AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>                         JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>                         jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>                         bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>                         dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>                         x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379) Publish async job-191379
>                         complete on message bus
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379) Wake up jobs related to
>                         job-191379
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379) Update db status for
>                         job-191379
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379) Wake up jobs joined with
>                         job-191379 and disjoin
>                         all
>                         subjobs created from job- 191379
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>                         [o.a.c.f.j.i.AsyncJobManagerImpl]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379) Done executing
>                         com.cloud.vm.VmWorkMigrateAway for
>                         job-191379
>                         ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>                         [o.a.c.f.j.i.AsyncJobMonitor]
>                         (Work-Job-Executor-115:ctx-0d5db528
>                         job-13733/job-191379) Remove job-191379 from
>                         job monitoring
>
>
>
>                         On 25.01.2017 15:53, Rafael Weingärtner wrote:
>
>                         Did you check the ACS log files looking for
>                         some exceptions when the
>
>                             migrations for this specific VM started?
>
>                             On Wed, Jan 25, 2017 at 9:49 AM, Francois
>                             Scheurer <
>                             francois.scheurer@everyware.ch> wrote:
>
>                             Dear CS contributors
>
>                                 We use CS 4.5.1 on a 3 Clusters with
>                                 XenServer 6.5. and shared primary
>                                 storage.
>
>                                 When we put a host in maintenance, cs
>                                 evacuates all VM's on other
>                                 hosts.
>
>                                 But it happens regularly that some
>                                 VM's are stopped instead of live
>                                 migrated.
>
>                                 If we restart that stopped VM's, CS
>                                 will then stop them again later.
>
>                                 We need to start them several times
>                                 until they stays up.
>
>                                 Is it a known issue? Is it fixed on CS
>                                 4.9.2 ?
>
>
>                                 Migrating manually all VM's is working
>                                 but impracticable because too
>                                 much
>                                 time consuming.
>
>
>                                 Many thanks in advance for your help.
>
>
>
>                                 Best Regards
>                                 Francois
>
>
>
>
>
>
>                                 --
>
>
>                                 EveryWare AG
>                                 François Scheurer
>                                 Senior Systems Engineer
>                                 Zurlindenstrasse 52a
>                                 CH-8003 Zürich
>
>                                 tel: +41 44 466 60 00
>                                 fax: +41 44 466 60 10
>                                 mail: francois.scheurer@everyware.ch
>                                 web: http://www.everyware.ch
>
>
>
>                             --
>
>
>                         EveryWare AG
>                         François Scheurer
>                         Senior Systems Engineer
>                         Zurlindenstrasse 52a
>                         CH-8003 Zürich
>
>                         tel: +41 44 466 60 00
>                         fax: +41 44 466 60 10
>                         mail: francois.scheurer@everyware.ch
>                         web: http://www.everyware.ch
>
>
>
>                     --
>                     Rafael Weingärtner
>
>
>
>             --
>
>
>             EveryWare AG
>             François Scheurer
>             Senior Systems Engineer
>             Zurlindenstrasse 52a
>             CH-8003 Zürich
>
>             tel: +41 44 466 60 00
>             fax: +41 44 466 60 10
>             mail: francois.scheurer@everyware.ch
>             web: http://www.everyware.ch
>
>
>
>
>     -- 
>
>
>     EveryWare AG
>     François Scheurer
>     Senior Systems Engineer
>     Zurlindenstrasse 52a
>     CH-8003 Zürich
>
>     tel: +41 44 466 60 00
>     fax: +41 44 466 60 10
>     mail: francois.scheurer@everyware.ch
>     web: http://www.everyware.ch
>
>
>
>
> -- 
> Rafael Weingärtner

-- 


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheurer@everyware.ch
web: http://www.everyware.ch


Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Rafael Weingärtner <ra...@gmail.com>.
I will be able to re-check some code and answer you with new thoughts on
this problem tonight (the log files you provided today have some very
insightful information regarding this problem). I believe this is a bug in
HA, a very tricky one that has not been reported yet (at least I do not
remember seeing anything similar). And thanks to randomness/destiny now you
have found/uncovered a way to reproduce the problem (bug?!).

I do not suggest changing any data at all directly in the database
(unless you know the ACS code well and are 100% sure of what you are doing). This
can cause even bigger and more complicated problems. If you remove the HA
feature from these VMs, this problem will stop. However, if you can/are
willing to keep HA enabled until we fully uncover the problem, map it and
report it, this would be great.
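
As a starting point for deciding where to disable HA, a minimal read-only sketch
against the cloud database can list the instances that currently carry the flag
(assuming the stock schema, where vm_instance has a ha_enabled column):

    -- instances that still have the HA flag set
    SELECT id, name, instance_name, state, ha_enabled
    FROM vm_instance
    WHERE removed IS NULL
      AND ha_enabled = 1;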
BTW: why are you restarting MSs?
What are the configurations of your MSs?

Thanks for your time and for sharing information with our community.



On Tue, Jan 31, 2017 at 8:53 AM, Francois Scheurer <
francois.scheurer@everyware.ch> wrote:

> Dear Rafael
>
>
> Many thanks for your deep analysis.
> Nice humor with the planet alignments BTW ;)
>
> We also suspected the HA flag to be related to this issue, as we never had
> this issue in the past, until we enabled HA on all VMs and offerings.
>
> Here is also the workers list:
>
> mysql> select * from op_ha_work;
> +------+-------------+-----------+--------------------+-----
> -----+----------------+---------+---------------------+-----
> --+---------------------+-------+-------------+---------+
> | id   | instance_id | type      | vm_type            | state    |
> mgmt_server_id | host_id | created             | tried | taken
>  | step  | time_to_try | updated |
> +------+-------------+-----------+--------------------+-----
> -----+----------------+---------+---------------------+-----
> --+---------------------+-------+-------------+---------+
> |  105 |         343 | HA        | User               | Running  |
>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
> Error |  1394056013 |       3 |
> |  109 |         407 | HA        | ConsoleProxy       | Running  |
>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
> Error |  1394056013 |       3 |
> |  113 |         407 | HA        | ConsoleProxy       | Running  |
>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
> Error |  1394056013 |       3 |
> |  117 |         407 | HA        | ConsoleProxy       | Running  |
>  345049098498 |      33 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
> Error |  1394056013 |       3 |
> |  121 |         343 | HA        | User               | Running  |
>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
> Error |  1394056013 |       3 |
> |  125 |         405 | HA        | SecondaryStorageVm | Running  |
>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
> Error |  1394056013 |       3 |
> |  129 |         393 | HA        | User               | Running  |
>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
> Error |  1394056014 |      30 |
> |  133 |         405 | HA        | SecondaryStorageVm | Running  |
>  345049098498 |      29 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
> Error |  1394056014 |       3 |
> |  137 |         402 | HA        | ConsoleProxy       | Running  |
>  345049098498 |      25 | 2015-03-28 03:19:03 | 2 | 2015-03-28 03:30:04 |
> Error |  1394056014 |       3 |
> |  345 |         402 | HA        | ConsoleProxy       | Starting |
>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
> Error |  1402527506 |     770 |
> |  349 |         687 | HA        | SecondaryStorageVm | Starting |
>  345049098498 |      25 | 2015-07-06 13:09:26 | 1 | 2015-07-06 13:09:26 |
> Error |  1402527506 |       2 |
> | 2057 |        1178 | Migration | User               | Running  |
>  345049098498 |      33 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
> Done  |  1451024774 |      95 |
> | 2058 |        1736 | Migration | User               | Running  |
>  345049098498 |     224 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
> Done  |  1451024774 |      45 |
> | 2059 |        1240 | Migration | User               | Running  |
>  345049098498 |       7 | 2017-01-31 07:56:08 | 0 | 2017-01-31 07:56:08 |
> Done  |  1451024774 |      61 |
> | 2060 |        1178 | Migration | User               | Running  |
>  345049098122 |      33 | 2017-01-31 07:56:08 | 1 | 2017-01-31 08:06:37 |
> Done  |  1451025374 |      95 |
> | 2061 |        1178 | Migration | User               | Stopped  |
>    NULL |      33 | 2017-01-31 07:56:09 | 5 | NULL                | Done
> |  1451027900 |     109 |
> | 2062 |        1690 | Migration | User               | Running  |
>  345049098498 |     224 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
> Done  |  1451024774 |     109 |
> | 2063 |        1177 | Migration | User               | Running  |
>  345049098498 |      29 | 2017-01-31 07:56:09 | 0 | 2017-01-31 07:56:09 |
> Done  |  1451024774 |      82 |
> | 2064 |        1690 | Migration | User               | Running  |
>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
> Done  |  1451024900 |     109 |
> | 2065 |        1240 | Migration | User               | Running  |
>  345049098122 |       7 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
> Done  |  1451024900 |      61 |
> | 2066 |        1736 | Migration | User               | Running  |
>  345049098122 |     224 | 2017-01-31 07:56:09 | 1 | 2017-01-31 07:58:18 |
> Done  |  1451024900 |      45 |
> | 2067 |        1240 | Migration | User               | Stopped  |
>    NULL |       7 | 2017-01-31 07:56:09 | 5 | NULL                | Done
> |  1451028003 |      80 |
> | 2068 |        1178 | Migration | User               | Running  |
>  345049098122 |      33 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
> Done  |  1451024781 |      95 |
> | 2069 |        1736 | Migration | User               | Running  |
>  345049098122 |     224 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
> Done  |  1451024781 |      45 |
> | 2070 |        1240 | Migration | User               | Running  |
>  345049098122 |       7 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
> Done  |  1451024781 |      61 |
> | 2071 |        1177 | Migration | User               | Running  |
>  345049098122 |      29 | 2017-01-31 07:56:16 | 0 | 2017-01-31 07:56:16 |
> Done  |  1451024781 |      82 |
> | 2072 |        1177 | Migration | User               | Stopped  |
>    NULL |      29 | 2017-01-31 07:56:16 | 5 | NULL                | Done
> |  1451027973 |     101 |
> | 2073 |        1177 | Migration | User               | Running  |
>  345049098122 |      29 | 2017-01-31 07:56:16 | 1 | 2017-01-31 07:58:22 |
> Done  |  1451024904 |      82 |
> | 2074 |        1690 | Migration | User               | Stopped  |
>    NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                | Done
> |  1451027464 |     123 |
> | 2075 |        1736 | Migration | User               | Stopped  |
>    NULL |     224 | 2017-01-31 07:56:16 | 5 | NULL                | Done
> |  1451027477 |      59 |
> +------+-------------+-----------+--------------------+-----
> -----+----------------+---------+---------------------+-----
> --+---------------------+-------+-------------+---------+
>
> If we put a host in maintenance, it can unfortunately happen that a
> timeout occurs and CS aborts the migration.
> As far as we can interpret the logs, it seems (we speculate) that CS
> detects that the VM is then still running on the original host, which is in
> maintenance, and somehow decides to stop it because of the "host not up"
> status.
> We then need to start the VM several times, until CS gives up stopping it.
>
> After restarting the MSs I guess that a similar issue may happen: the
> hosts are disconnected during the early stage and then reconnect. Maybe CS
> then stops the VMs again because of the "host not up" status?
> Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) The
> last host of this VM is not UP or is not enabled, host status is:
> Disconnected, host resource state is: Enabled
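
To double-check the host state that the deployment planner is complaining about
("not UP or is not enabled"), a small read-only sketch against the cloud database
(assuming the stock host table layout):

    -- hypervisor hosts and their current agent/resource state
    SELECT id, name, status, resource_state
    FROM host
    WHERE type = 'Routing'
      AND removed IS NULL;

Hosts that are still 'Disconnected' or not 'Enabled' right after the MS restart
would explain why the planner refuses to use the last host.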
>
> But the problem after the MS restart definitely only impacts VMs
> listed in the op_ha_work table.
>
>
> To your questions:
>         We have 3 ACS MSs managing several Zones. Each zone has a single
> pod and cluster.
>         The Cluster of i-638-1735-VM has this summary:
>          Total number of VMs: 121
>          Total number hypervisors: 4
>          Total allocated RAM: 522.0 / 972 GB (53.7 %)
>          Total allocated cores: 211
>          Total allocated storage: 11285 GB
>         Used memory on the hosts currently varies from 28 to 77%, so there
> is enough RAM.
>         I did not check the XS logs. The same problem arose again this
> morning (shortly after a restart of all MSs), so I will look at the
> XS logs now.
>
>         I have some fair experience with Eclipse and Java, I will ask my
> boss if we can go this way.
>
> The main issue for now is that restarting the MSs reproduces the
> problem, which is quite annoying for the impacted VMs.
> I thought of stopping the MSs, deleting all HA/Migration entries in op_ha_work
> and then starting the MSs again.
> Is that a viable option? (Or should I delete the Migration entries only,
> or never touch this table?)
> We will also very probably remove the HA flag on all VMs if it does not
> work reliably.
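
Before touching op_ha_work at all, a read-only sketch (using the columns shown in
the listing above) can show what a "Migration-only" cleanup would actually
affect; this is an illustration, not a recommendation:

    -- overview of outstanding work items by type and step
    SELECT type, step, COUNT(*) AS entries
    FROM op_ha_work
    GROUP BY type, step;

    -- the rows a Migration-only cleanup would target
    SELECT id, instance_id, state, step, tried, taken
    FROM op_ha_work
    WHERE type = 'Migration';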
>
> Best Regards
> Francois Scheurer
>
>
>
> Here are new logs regarding the same issue, from today.
> I tried to extract the important parts related to the stop of i-189-1178-VM.
>
> regexp: (i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|
> StopRequestedvm|job-14965|job-193019|Stopped-Migrating|ctx-
> 36aae733|work-2049)
>
> 2017-01-30 08:44:04,440 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Executing
> AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType: null,
> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
> F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm
> 1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTA
> ALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAA
> AAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ,
> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resu
> ltCode: 0, result: null, initMsid: 345049103441, completeMsid: null,
> lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 2017}
> 2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Run VM work job:
> com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
>
> 2017-01-30 08:44:05,217 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Deploy avoids pods: [], clusters: [], hosts: [29]
> 2017-01-30 08:44:05,220 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> (AgentConnectTaskPool-4:ctx-5b61f2e5) create ClusteredAgentAttache for 232
> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> DeploymentPlanner allocation algorithm: com.cloud.deploy.UserDispersin
> gPlanner@1734b2e2
> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
> requested cpu: 4200, requested ram: 4294967296
> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Is
> ROOT volume READY (pool already allocated)?: Yes
> 2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) This
> VM has last host_id specified, trying to choose the same host: 25
>
> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The
> last host of this VM is not UP or is not enabled, host status is:
> Disconnected, host resource state is: Enabled
> 2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Cannot choose the last host to deploy this VM
> 2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Searching resources only under specified Cluster: 25
>
> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
> FirstFitRoutingAllocator) Found 0 hosts for allocation after
> prioritization: []
> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
> 2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No
> suitable hosts found
> 2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No
> suitable hosts found under this Cluster: 25
> 2017-01-30 08:44:05,787 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Could not find suitable Deployment Destination for this VM under any
> clusters, returning.
> 2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Searching resources only under specified Cluster: 25
> 2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The
> specified cluster is in avoid set, returning.
> 2017-01-30 08:44:05,798 DEBUG [c.c.v.VirtualMachineManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
> 2017-01-30 08:44:05,798 WARN [c.c.v.VirtualMachineManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Failed to deploy vm 1178 with original planner, sending HAPlanner
>
> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Deploy avoids pods: [], clusters: [], hosts: [29]
> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
> sPlanner@4fdf86ce
> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Trying to allocate a host and storage pools from dc:17, pod:25,cluster:25,
> requested cpu: 4200, requested ram: 4294967296
> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Is
> ROOT volume READY (pool already allocated)?: Yes
> 2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) This
> VM has last host_id specified, trying to choose the same host: 25
>
> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The
> last host of this VM is not UP or is not enabled, host status is:
> Disconnected, host resource state is: Enabled
> 2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Cannot choose the last host to deploy this VM
> 2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Searching resources only under specified Cluster: 25
> 2017-01-30 08:44:06,039 DEBUG [c.c.d.SkipHeuresticsPlanner]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Deploying vm during HA process, so skipping disable threshold check
>
> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
> FirstFitRoutingAllocator) Found 0 hosts for allocation after
> prioritization: []
> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
> FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
> 2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea
> FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No
> suitable hosts found
> 2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No
> suitable hosts found under this Cluster: 25
> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Publish async
> job-192986 complete on message bus
> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
> related to job-192986
> 2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Update db status
> for job-192986
> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) The
> last host of this VM is not UP or is not enabled, host status is:
> Disconnected, host resource state is: Enabled
> 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
> Cannot choose the last host to deploy this VM
> 2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner]
> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
> Searching resources only under specified Cluster: 35
> 2017-01-30 08:44:06,160 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Could not find suitable Deployment Destination for this VM under any
> clusters, returning.
> 2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Searching resources only under specified Cluster: 25
> 2017-01-30 08:44:06,163 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
> SyncQueueItemVO {id:63629, queueId: 64423, contentType: AsyncJob,
> contentId: 192991, lastProcessMsid: 345049098498, lastprocessNumber: 27,
> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
> CET 2017}
> 2017-01-30 08:44:06,169 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Add job-192992
> into job monitoring
> 2017-01-30 08:44:06,184 DEBUG [c.c.d.SkipHeuresticsPlanner]
> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
> Deploying vm during HA process, so skipping disable threshold check
> 2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The
> specified cluster is in avoid set, returning.
> 2017-01-30 08:44:06,186 DEBUG [c.c.v.VirtualMachineManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
> 2017-01-30 08:44:06,186 ERROR [c.c.v.VmWorkJobHandlerProxy]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Invocation exception, caused by: com.cloud.exception.InsufficientServerCapacityException:
> Unable to find a server to migrate to.Scope=interface
> com.cloud.org.Cluster; id=25
> 2017-01-30 08:44:06,187 INFO  [c.c.v.VmWorkJobHandlerProxy]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea)
> Rethrow exception com.cloud.exception.InsufficientServerCapacityException:
> Unable to find a server to migrate to.Scope=interface
> com.cloud.org.Cluster; id=25
> 2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done with run of
> VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
> 2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Unable to
> complete AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType:
> null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
> AAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ, cmdVersion: 0,
> status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
> initMsid: 345049103441, completeMsid: null, lastUpdated: null, lastPolled:
> null, created: Mon Jan 30 08:44:03 CET 2017}, job origin:15393
> com.cloud.exception.InsufficientServerCapacityException: Unable to find a
> server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
>         at com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAwa
> y(VirtualMachineManagerImpl.java:2390)
>         at com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAwa
> y(VirtualMachineManagerImpl.java:4520)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
> ssorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
> thodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJob
> HandlerProxy.java:107)
>         at com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(Virtu
> alMachineManagerImpl.java:4636)
>         at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.
> java:103)
>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImp
> l$5.runInContext(AsyncJobManagerImpl.java:537)
>         at org.apache.cloudstack.managed.context.ManagedContextRunnable
> $1.run(ManagedContextRunnable.java:49)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedCon
> text$1.call(DefaultManagedContext.java:56)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedCon
> text.callWithContext(DefaultManagedContext.java:103)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedCon
> text.runWithContext(DefaultManagedContext.java:53)
>         at org.apache.cloudstack.managed.context.ManagedContextRunnable
> .run(ManagedContextRunnable.java:46)
>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImp
> l$5.run(AsyncJobManagerImpl.java:494)
>         at java.util.concurrent.Executors$RunnableAdapter.call(
> Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool
> Executor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoo
> lExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 2017-01-30 08:44:06,188 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued job-192991
> 2017-01-30 08:44:06,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs
> joined with job-192986 and disjoin all subjobs created from job- 192986
> 2017-01-30 08:44:06,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Complete async
> job-192989, jobStatus: FAILED, resultCode: 0, result:
> rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZl
> ckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vw
> c0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRD
> YXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5n
> L0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xv
> dWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vy
> cm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2
> YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4a
> OxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNh
> dXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQA
> EkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFu
> Zy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QA
> EExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBz
> ZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFj
> ZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNr
> VHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFy
> aW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-
> AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2Vy
> SW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hl
> c3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EA
> fgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1w
> bHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNx
> AH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5y
> ZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0
> aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhq
> YXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EA
> fgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0
> ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pv
> YnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNs
> b3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRj
> aGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG
> 91ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW
> 1wbCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZX
> h0c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLm
> NvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbn
> RleHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaG
> UuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYW
> dlZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2
> FsbHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC
> 5jb250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYW
> xsV2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bl
> dpdGhDb250ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay
> 5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQ
> B-ADNzcQB-ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdm
> EudXRpbC5jb25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA
> 5FeGVjdXRvcnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb2
> 5jdXJyZW50LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH
> 4AEQAABHl0ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3
> V0b3J0ABdUaHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH
> 4AEQAAAmd0AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3
> V0b3IkV29ya2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVG
> hyZWFkdAALVGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdG
> lvbnMkVW5tb2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeH
> IALGphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW
> 9uGUIAgMte9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3
> IAE2phdmEudXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAA
> B3BAAAAAB4cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAA
> FJAAV2YWx1ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO
> 9zcQB-AFMAAAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAg
> ABSgAFdmFsdWV4cQB-AFYAAAAAAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdX
> N0ZXKqoV-omaW2FAIAAHhwAA
> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
> Deploy avoids pods: [], clusters: [], hosts: [3]
> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
> DeploymentPlanner allocation algorithm: com.cloud.deploy.SkipHeurestic
> sPlanner@4fdf86ce
> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
> Trying to allocate a host and storage pools from dc:14, pod:17,cluster:17,
> requested cpu: 4200, requested ram: 4294967296
> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) Is
> ROOT volume READY (pool already allocated)?: Yes
> 2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) This
> VM has last host_id specified, trying to choose the same host: 213
> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Publish async
> job-192989 complete on message bus
> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
> related to job-192989
> 2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Update db status
> for job-192989
> 2017-01-30 08:44:06,219 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f)
> Checking resources in Cluster: 35 under Pod: 31
> 2017-01-30 08:44:06,227 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
> FirstFitRoutingAllocator) Looking for hosts in dc: 23  pod:31  cluster:35
> 2017-01-30 08:44:06,229 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Executing
> AsyncJobVO {id:192992, userId: 1, accountId: 1, instanceType: null,
> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
> AAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw, cmdVersion: 0,
> status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
> null, created: Mon Jan 30 08:44:05 CET 2017}
> 2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher]
> (Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Run VM work job:
> com.cloud.vm.VmWorkMigrateAway for VM 1729, job origin: 14953
> 2017-01-30 08:44:06,230 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item:
> SyncQueueItemVO {id:63630, queueId: 67180, contentType: AsyncJob,
> contentId: 192990, lastProcessMsid: 345049098498, lastprocessNumber: 61,
> lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 08:44:05
> CET 2017}
> 2017-01-30 08:44:06,244 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) The
> last host of this VM is not UP or is not enabled, host status is:
> Disconnected, host resource state is: Enabled
> 2017-01-30 08:44:06,262 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
> Cannot choose the last host to deploy this VM
> 2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner]
> (Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31)
> Searching resources only under specified Cluster: 17
> 2017-01-30 08:44:06,262 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Add job-192991
> into job monitoring
> 2017-01-30 08:44:06,246 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Done executing
> com.cloud.vm.VmWorkMigrateAway for job-192986
> 2017-01-30 08:44:06,244 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs
> joined with job-192989 and disjoin all subjobs created from job- 192989
> 2017-01-30 08:44:06,269 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
> (Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Remove job-192986
> from job monitoring
> 2017-01-30 08:44:06,279 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done executing
> com.cloud.vm.VmWorkMigrateAway for job-192989
> 2017-01-30 08:44:06,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Executing
> AsyncJobVO {id:192991, userId: 1, accountId: 1, instanceType: null,
> instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo:
> rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcE
> mwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAl
> Z2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5h
> bWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAA
> AAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw, cmdVersion: 0,
> status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null,
> initMsid: 345049098122, completeMsid: null, lastUpdated: null, lastPolled:
> null, created: Mon Jan 30 08:44:05 CET 2017}
> 2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher]
> (Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Run VM work job:
> com.cloud.vm.VmWorkMigrateAway for VM 1605, job origin: 14965
> 2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentManager-Handler-13:null) Ping from 258
> 2017-01-30 08:44:06,319 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
> FirstFitRoutingAllocator) List of hosts in ascending order of number of
> VMs: [224, 19, 21, 15]
> 2017-01-30 08:44:06,322 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
> (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f
> FirstFitRoutingAllocator) FirstFitAllocator has 0 hosts to check for
> allocation: []
> 2017-01-30 08:44:06,322 INFO  [c.c.a.m.AgentManagerImpl]
> (AgentManager-Handler-13:null) PingMap for agent: 258 will not be updated
> because agent is no longer in the PingMap
> 2017-01-30 08:44:06,326 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
> (Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Remove job-192989
> from job monitoring
>
> 2017-01-30 08:54:38,139 INFO [c.c.v.VirtualMachineManagerImpl]
> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
> Migration cancelled because state has changed: VM[User|i-189-1178-VM]
> 2017-01-30 08:54:38,139 DEBUG [c.c.v.VirtualMachineManagerImpl]
> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5)
> Unable to migrate VM due to: Migration cancelled because state has changed:
> VM[User|i-189-1178-VM]
> 2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl]
> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) VM
> state transitted from :Running to Stopping with event: StopRequestedvm's
> original host id: 29 new host id: 33 host id before state transition: 33
> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.DirectAgentAttache]
> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Response
> Received:
> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
> (DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Processing:  {
> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-
> 1706-VM","bytesSent":12276848087,"bytesReceived":
> 5226621450,"result":true,"details":"","wait":0}}] }
> 2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request]
> (RouterMonitor-1:ctx-2e80f405) Seq 19-2355664080091480198: Received:  {
> Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10, {
> NetworkUsageAnswer } }
> 2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl]
> (RouterMonitor-1:ctx-2e80f405) Details from executing class
> com.cloud.agent.api.NetworkUsageCommand:
> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) Seq
> 33-1917970491306410035: Sending  { Cmd , MgmtId: 345049098498, via:
> 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"
> executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
> }
> 2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request]
> (Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) Seq
> 33-1917970491306410035: Executing:  { Cmd , MgmtId: 345049098498, via:
> 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"
> executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}]
> }
> 2017-01-30 08:54:38,178 DEBUG [c.c.a.m.DirectAgentAttache]
> (DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035: Executing request
> 2017-01-30 08:54:38,202 DEBUG [c.c.a.m.DirectAgentAttache]
> (DirectAgent-326:ctx-4fb88aa7) Seq 21-1035546439318503505: Executing request
>
> 2017-01-30 09:05:55,628 INFO [c.c.h.HighAvailabilityManagerImpl]
> (HA-Worker-4:ctx-36aae733 work-2049) Processing
> HAWork[2049-Migration-1178-Stopped-Migrating]
> 2017-01-30 09:05:55,633 WARN [o.a.c.f.j.AsyncJobExecutionContext]
> (HA-Worker-4:ctx-36aae733 work-2049) Job is executed without a context,
> setup psudo job for the executing thread
> 2017-01-30 09:05:55,645 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045 execution on object
> VmWorkJobQueue.1178
>
>
>
>
> On 31.01.2017 00:42, Rafael Weingärtner wrote:
>
>> This seems to be one of those nasty bugs that are very hard to debug. It
>> seems to be cause to a combination of different factors, such as planets
>> alignments ;)
>>
>> At class “VirtualMachineManagerImpl” line 2395 (method
>> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
>> long, DeploymentPlanner)”) is the start of the code flow that triggers the
>> migration. The point here is that, if an error occurs during the
>> migration,
>> the VM will be stopped; For some reason this is executed only for HA VMs.
>>
>> Going a little bit further, we find the point that is triggering the error
>> in the method “com.cloud.vm.VirtualMachineManagerImpl.migrate(
>> VMInstanceVO,
>> long, DeployDestination)” at line 1941. This line will call the method
>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
>> Long, ItWorkVO, Step)” to change the VM state to migrating/prepare
>> migration/or something like that (I am guessing here), but for some
>> reason,
>> the state of the VM is not the one expected and this is causing a
>> “ConcurrentOperationException” to be thrown. The problem is that the
>> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
>> Long, ItWorkVO, Step)” does not seem to provide much useful log entries.
>>
>> Just to enable us to discuss the problem a little further, could you tell
>> us a little bit about your environment setting? How many ACS MSs do you
>> have? What is the VM workload being migrated (I mean, the number of VMs
>> and
>> computing resources)? Do the other servers have enough computing resources
>> to support these VMs? Could you also check the XenServer logs, and check
>> what happened on both, source and destination hosts when the migration for
>> VM “i-638-1735-VM” was triggered?
>>
>> Do you have the knowledge/skills to use an IDE (e.g. Eclipse) to connect
>> to
>> a running JVM and debug a code? Because for me it will not be possible to
>> go further without looking at the variables in a running environment.
>>
>>
>> On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
>> francois.scheurer@everyware.ch> wrote:
>>
>> Dear Rafael
>>>
>>>
>>> Sorry for the late answer (was sick a couple of days).
>>>
>>> I confirm we are migrating inside the same xenserver pool (within a CS
>>> Cluster).
>>>
>>>
>>> Best Regards
>>> Francois
>>>
>>>
>>>
>>> On 26.01.2017 04:35, Rafael Weingärtner wrote:
>>>
>>> Found the code that is executed.
>>>> I need you to confirm if you are migrating within a cluster or across
>>>> clusters?
>>>>
>>>> On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
>>>> rafaelweingartner@gmail.com> wrote:
>>>>
>>>> I do not have access to ACS code right now, but I suggest starting
>>>>
>>>>> debugging here:
>>>>>
>>>>> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
>>>>>
>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>> ctx-175d37df)
>>>>>> VM state transitted from :Running to Stopping with event:
>>>>>> StopRequestedvm's
>>>>>> original host id: 224 new host id: 15 host id before state transition:
>>>>>> 15
>>>>>>
>>>>>> I would try to understand first why ACS requested the hypervisor to
>>>>>> stop
>>>>>>
>>>>> the VM. I mean the following, check the source code for conditions that
>>>>> would make ACS request the shutdown of VMs.
>>>>> BTW: are you migrating within a cluster or across clusters?
>>>>>
>>>>> I will be able to analyze this further only after work hours, so if you
>>>>> find anything, keep me posted.
>>>>>
>>>>> On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
>>>>> francois.scheurer@everyware.ch> wrote:
>>>>>
>>>>> Hello Rafael
>>>>>
>>>>>>
>>>>>> I think the important log lines are these:
>>>>>>
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because state
>>>>>> has
>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>> Migration
>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running
>>>>>> to
>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new host
>>>>>> id:
>>>>>> 15 host id before state transition: 15
>>>>>>
>>>>>> The really disturbing thing is that CS stop again the VM if we start
>>>>>> it
>>>>>> manually, apparently because the migratingjob is still running.
>>>>>>
>>>>>> Note the the VM HA Flag is enabled.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>>
>>>>>> Best Regards.
>>>>>>
>>>>>> Francois
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> more details:
>>>>>>
>>>>>>
>>>>>>
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>>>>>> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
>>>>>> VM[User|i-638-1736-VM]
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>> job-14953/job-191351) Done with run of VM work job:
>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>> job-14953/job-191351) Done executing com.cloud.vm.VmWorkMigrateAway
>>>>>> for
>>>>>> job-191351
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64
>>>>>> job-14953/job-191351) Remove job-191351 from job monitoring
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>>>>>> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c
>>>>>> work-1989)
>>>>>> Completed HAWork[1989-Migration-1411-Running-Migrating]
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375:
>>>>>> Forwarding
>>>>>> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
>>>>>> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
>>>>>> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>>>> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>>>> ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>>>> 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>>>> (64-bit)","platformEmulator":"Other install
>>>>>> media","bootArgs":"","enable
>>>>>> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>>>> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>>>> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>>>> rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>>>> age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
>>>>>> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>>>> -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>>>> stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>>>> 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8
>>>>>> 751b89-00f
>>>>>> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>>>> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>>>> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>>>> 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>>>> ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3
>>>>>> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>>>> :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>>>> gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>>>> XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>>>> -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>>>> storagePort":"0",
>>>>>> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>>>> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>>>> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>>>> ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>>>> ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
>>>>>> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>>>> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>>>> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>>>> UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>>>> DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2
>>>>>> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>>>> ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>>>> deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>>>> path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>>>> K","_details":{"managed":"false","storagePort":"0","stor
>>>>>> ageHost":"localhos
>>>>>> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>>>> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>>>> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>>>> :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"
>>>>>> PreSetup","
>>>>>> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>>>> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>>>> -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>>>>
>>>>>> e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>>>> ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>>>> ,"volumeId":86
>>>>>> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>>>> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>>>> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>>>> 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>>>> false","storagePort":"0","storageHost":"localhost","volumeSi
>>>>>> ze":"42949
>>>>>> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>>>> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>>>> id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>>>> ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS 4.5
>>>>>> (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>>>> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>>>> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>>>> d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>>>> a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>>>> gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>>>> .8.8","dns2":"8.8
>>>>>> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>>>> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>>>> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] } to
>>>>>> 345049098498
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: Processing:
>>>>>> {
>>>>>> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
>>>>>> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>>>> true,"wait":0}}]
>>>>>> }
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>>> [c.c.a.m.AgentAttache]
>>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No more
>>>>>> commands
>>>>>> found
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380
>>>>>> ctx-175d37df)
>>>>>> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId: 345049103441,
>>>>>> via:
>>>>>> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>>>>>>
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because state
>>>>>> has
>>>>>> changed: VM[User|i-638-1736-VM]
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>>> Migration
>>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running
>>>>>> to
>>>>>> Stopping with event: StopRequestedvm's original host id: 224 new host
>>>>>> id:
>>>>>> 15 host id before state transition: 15
>>>>>>
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>>>>>> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643:
>>>>>> Forwarding
>>>>>> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
>>>>>> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>>>> 638-1736-VM","wait":0}}]
>>>>>> } to 345049098498
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>> 254-800498:
>>>>>> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver: v1,
>>>>>> Flags:
>>>>>> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>>>> proxyVmId":1530,"_loadInfo":"{\n
>>>>>> \"connections\": []\n}","wait":0}}] }
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>>> 254-800498:
>>>>>> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via: 254, Ver:
>>>>>> v1,
>>>>>> Flags: 100010, [{"com.cloud.agent.api.AgentCo
>>>>>> ntrolAnswer":{"result":true,"wait":0}}] }
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>>>>>> (AgentManager-Handler-11:null) Seq 15-6889944479923437632: Processing:
>>>>>> {
>>>>>> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
>>>>>> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>>>> e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>>>> ":true,"details":"Stop
>>>>>> VM i-638-1735-VM Succeed","wait":0}}] }
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>>>>>> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379
>>>>>> ctx-bbabfeae)
>>>>>> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId: 345049103441,
>>>>>> via:
>>>>>> 15, Ver: v1, Flags: 10, { StopAnswer } }
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is stopped
>>>>>> on
>>>>>> the
>>>>>> host. Proceeding to release resource held.
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>> supported
>>>>>> in the network id=519
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) Changing active number of nics for
>>>>>> network id=519 on -1
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to release
>>>>>> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>>>> 92.168.1.11-null
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released network
>>>>>> resources
>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released storage
>>>>>> resources
>>>>>> for the vm VM[User|i-638-1735-VM]
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) VM state transitted from :Stopping
>>>>>> to
>>>>>> Stopped with event: OperationSucceededvm's original host id: 15 new
>>>>>> host
>>>>>> id: null host id before state transition: 15
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU: 82976 and
>>>>>> CPU
>>>>>> after applying overprovisioning: 331904
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM:
>>>>>> 261961927808
>>>>>> and RAM after applying overprovisioning: 261961924608
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old
>>>>>> used:
>>>>>> 71300,reserved: 2100, actual total: 82976, total with
>>>>>> overprovisioning:
>>>>>> 331904; new used: 67100,reserved:6300; movedfromreserved:
>>>>>> false,moveToReserveredtrue
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old
>>>>>> used:
>>>>>> 98784247808,reserved: 1073741824, total: 261961924608; new used:
>>>>>> 94489280512,reserved:5368709120;
>>>>>> movedfromreserved:
>>>>>>
>>>>>> false,moveToReserveredtrue
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>> supported
>>>>>> in the network id=519
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>>> supported
>>>>>> in the network id=519
>>>>>>
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused by:
>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>>>>> VM[User|i-638-1735-VM]
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
>>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>>>>> VM[User|i-638-1735-VM]
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379) Done with run of VM work job:
>>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379,
>>>>>> userId:
>>>>>> 1,
>>>>>> accountId: 1, instanceType: null, instanceId: null, cmd:
>>>>>> com.cloud.vm.VmWorkMigrateAway, cmdInfo:
>>>>>> rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>>>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>>>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>>>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>>>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>>>> FuYWdlckltcGwAAAAAAAAA4A,
>>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>>>>>> result: null, initMsid: 345049098498, completeMsid: null, lastUpdated:
>>>>>> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017}, job
>>>>>> origin:13733
>>>>>> ewcstack-man02-prod: com.cloud.utils.exception.CloudRuntimeException:
>>>>>> Unable to migrate VM[User|i-638-1735-VM]
>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
>>>>>> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAcc
>>>>>> essor402.invoke(Unknown
>>>>>> Source)
>>>>>> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
>>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
>>>>>> e(Method.java:606)
>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
>>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
>>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>>> erImpl.java:537)
>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>>> ext.java:56)
>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>>> nagedContext.java:103)
>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>>> agedContext.java:53)
>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>>>> ewcstack-man02-prod:    at java.util.concurrent.Executors
>>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>>> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
>>>>>> k.run(FutureTask.java:262)
>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379) Complete async job-191379, jobStatus: FAILED,
>>>>>> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>>>> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>>>> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>>>> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>>>> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>>>> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>>>> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>>>> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>>>> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>>>> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>>>> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>>>> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>>>> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>>>> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>>>> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>>>> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>>>> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>>>> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>>>> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>>>> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>>>> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>>>> ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>>>> 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>>>> AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>>>> cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>>>> AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>>>> YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>>>> ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>>>> YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>>>> dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>>>> dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>>>> Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>>>> dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>>>> cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>>>> TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>>>> ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>>>> LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>>>> dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>>>> ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>>>> JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>>>> AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>>>> ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>>>> YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>>>> AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>>>> JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>>>> jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>>>> bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>>>> dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>>>> x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379) Publish async job-191379 complete on message bus
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379) Wake up jobs related to job-191379
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379) Update db status for job-191379
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379) Wake up jobs joined with job-191379 and disjoin
>>>>>> all
>>>>>> subjobs created from job- 191379
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379) Done executing com.cloud.vm.VmWorkMigrateAway
>>>>>> for
>>>>>> job-191379
>>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528
>>>>>> job-13733/job-191379) Remove job-191379 from job monitoring
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>>>>>
>>>>>> Did you check the ACS log files looking for some exceptions when the
>>>>>>
>>>>>>> migrations for this specific VM started?
>>>>>>>
>>>>>>> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
>>>>>>> francois.scheurer@everyware.ch> wrote:
>>>>>>>
>>>>>>> Dear CS contributors
>>>>>>>
>>>>>>> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared
>>>>>>>> primary
>>>>>>>> storage.
>>>>>>>>
>>>>>>>> When we put a host in maintenance, cs evacuates all VM's on other
>>>>>>>> hosts.
>>>>>>>>
>>>>>>>> But it happens regularly that some VM's are stopped instead of live
>>>>>>>> migrated.
>>>>>>>>
>>>>>>>> If we restart that stopped VM's, CS will then stop them again later.
>>>>>>>>
>>>>>>>> We need to start them several times until they stays up.
>>>>>>>>
>>>>>>>> Is it a known issue? Is it fixed on CS 4.9.2 ?
>>>>>>>>
>>>>>>>>
>>>>>>>> Migrating manually all VM's is working but impracticable because too
>>>>>>>> much
>>>>>>>> time consuming.
>>>>>>>>
>>>>>>>>
>>>>>>>> Many thanks in advance for your help.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Best Regards
>>>>>>>> Francois
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>>
>>>>>>>> EveryWare AG
>>>>>>>> François Scheurer
>>>>>>>> Senior Systems Engineer
>>>>>>>> Zurlindenstrasse 52a
>>>>>>>> CH-8003 Zürich
>>>>>>>>
>>>>>>>> tel: +41 44 466 60 00
>>>>>>>> fax: +41 44 466 60 10
>>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>>> web: http://www.everyware.ch
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>
>>>>>>
>>>>>> EveryWare AG
>>>>>> François Scheurer
>>>>>> Senior Systems Engineer
>>>>>> Zurlindenstrasse 52a
>>>>>> CH-8003 Zürich
>>>>>>
>>>>>> tel: +41 44 466 60 00
>>>>>> fax: +41 44 466 60 10
>>>>>> mail: francois.scheurer@everyware.ch
>>>>>> web: http://www.everyware.ch
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>> Rafael Weingärtner
>>>>>
>>>>>
>>>>>
>>>> --
>>>
>>>
>>> EveryWare AG
>>> François Scheurer
>>> Senior Systems Engineer
>>> Zurlindenstrasse 52a
>>> CH-8003 Zürich
>>>
>>> tel: +41 44 466 60 00
>>> fax: +41 44 466 60 10
>>> mail: francois.scheurer@everyware.ch
>>> web: http://www.everyware.ch
>>>
>>>
>>
>>
> --
>
>
> EveryWare AG
> François Scheurer
> Senior Systems Engineer
> Zurlindenstrasse 52a
> CH-8003 Zürich
>
> tel: +41 44 466 60 00
> fax: +41 44 466 60 10
> mail: francois.scheurer@everyware.ch
> web: http://www.everyware.ch
>
>


-- 
Rafael Weingärtner

Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Francois Scheurer <fr...@everyware.ch>.
Dear Rafael


Many thanks for your deep analysis.
Nice humor with the planets alignments BTW ;)

We also suspected the HA Flag to be related to this issue, as we never saw
this problem in the past, until we enabled HA on all VMs and Offerings.
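
For reference, we double-check that flag with a query along these lines (untested from here, and assuming the 4.5.x schema still has the ha_enabled column on vm_instance):

select id, instance_name, state, ha_enabled
from vm_instance
where type = 'User' and removed is null;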

Here is also the workers list:

mysql> select * from op_ha_work;
+------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
| id   | instance_id | type      | vm_type            | state    | mgmt_server_id | host_id | created             | tried | taken               | step  | time_to_try | updated |
+------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+
|  105 |         343 | HA        | User               | Running  |   345049098498 |      29 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
|  109 |         407 | HA        | ConsoleProxy       | Running  |   345049098498 |      33 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
|  113 |         407 | HA        | ConsoleProxy       | Running  |   345049098498 |      33 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
|  117 |         407 | HA        | ConsoleProxy       | Running  |   345049098498 |      33 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
|  121 |         343 | HA        | User               | Running  |   345049098498 |      29 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
|  125 |         405 | HA        | SecondaryStorageVm | Running  |   345049098498 |      29 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056013 |       3 |
|  129 |         393 | HA        | User               | Running  |   345049098498 |      25 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056014 |      30 |
|  133 |         405 | HA        | SecondaryStorageVm | Running  |   345049098498 |      29 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056014 |       3 |
|  137 |         402 | HA        | ConsoleProxy       | Running  |   345049098498 |      25 | 2015-03-28 03:19:03 |     2 | 2015-03-28 03:30:04 | Error |  1394056014 |       3 |
|  345 |         402 | HA        | ConsoleProxy       | Starting |   345049098498 |      25 | 2015-07-06 13:09:26 |     1 | 2015-07-06 13:09:26 | Error |  1402527506 |     770 |
|  349 |         687 | HA        | SecondaryStorageVm | Starting |   345049098498 |      25 | 2015-07-06 13:09:26 |     1 | 2015-07-06 13:09:26 | Error |  1402527506 |       2 |
| 2057 |        1178 | Migration | User               | Running  |   345049098498 |      33 | 2017-01-31 07:56:08 |     0 | 2017-01-31 07:56:08 | Done  |  1451024774 |      95 |
| 2058 |        1736 | Migration | User               | Running  |   345049098498 |     224 | 2017-01-31 07:56:08 |     0 | 2017-01-31 07:56:08 | Done  |  1451024774 |      45 |
| 2059 |        1240 | Migration | User               | Running  |   345049098498 |       7 | 2017-01-31 07:56:08 |     0 | 2017-01-31 07:56:08 | Done  |  1451024774 |      61 |
| 2060 |        1178 | Migration | User               | Running  |   345049098122 |      33 | 2017-01-31 07:56:08 |     1 | 2017-01-31 08:06:37 | Done  |  1451025374 |      95 |
| 2061 |        1178 | Migration | User               | Stopped  |           NULL |      33 | 2017-01-31 07:56:09 |     5 | NULL                | Done  |  1451027900 |     109 |
| 2062 |        1690 | Migration | User               | Running  |   345049098498 |     224 | 2017-01-31 07:56:09 |     0 | 2017-01-31 07:56:09 | Done  |  1451024774 |     109 |
| 2063 |        1177 | Migration | User               | Running  |   345049098498 |      29 | 2017-01-31 07:56:09 |     0 | 2017-01-31 07:56:09 | Done  |  1451024774 |      82 |
| 2064 |        1690 | Migration | User               | Running  |   345049098122 |     224 | 2017-01-31 07:56:09 |     1 | 2017-01-31 07:58:18 | Done  |  1451024900 |     109 |
| 2065 |        1240 | Migration | User               | Running  |   345049098122 |       7 | 2017-01-31 07:56:09 |     1 | 2017-01-31 07:58:18 | Done  |  1451024900 |      61 |
| 2066 |        1736 | Migration | User               | Running  |   345049098122 |     224 | 2017-01-31 07:56:09 |     1 | 2017-01-31 07:58:18 | Done  |  1451024900 |      45 |
| 2067 |        1240 | Migration | User               | Stopped  |           NULL |       7 | 2017-01-31 07:56:09 |     5 | NULL                | Done  |  1451028003 |      80 |
| 2068 |        1178 | Migration | User               | Running  |   345049098122 |      33 | 2017-01-31 07:56:16 |     0 | 2017-01-31 07:56:16 | Done  |  1451024781 |      95 |
| 2069 |        1736 | Migration | User               | Running  |   345049098122 |     224 | 2017-01-31 07:56:16 |     0 | 2017-01-31 07:56:16 | Done  |  1451024781 |      45 |
| 2070 |        1240 | Migration | User               | Running  |   345049098122 |       7 | 2017-01-31 07:56:16 |     0 | 2017-01-31 07:56:16 | Done  |  1451024781 |      61 |
| 2071 |        1177 | Migration | User               | Running  |   345049098122 |      29 | 2017-01-31 07:56:16 |     0 | 2017-01-31 07:56:16 | Done  |  1451024781 |      82 |
| 2072 |        1177 | Migration | User               | Stopped  |           NULL |      29 | 2017-01-31 07:56:16 |     5 | NULL                | Done  |  1451027973 |     101 |
| 2073 |        1177 | Migration | User               | Running  |   345049098122 |      29 | 2017-01-31 07:56:16 |     1 | 2017-01-31 07:58:22 | Done  |  1451024904 |      82 |
| 2074 |        1690 | Migration | User               | Stopped  |           NULL |     224 | 2017-01-31 07:56:16 |     5 | NULL                | Done  |  1451027464 |     123 |
| 2075 |        1736 | Migration | User               | Stopped  |           NULL |     224 | 2017-01-31 07:56:16 |     5 | NULL                | Done  |  1451027477 |      59 |
+------+-------------+-----------+--------------------+----------+----------------+---------+---------------------+-------+---------------------+-------+-------------+---------+

If we put a host in maintenance, it may unfortunately happen that a timeout occurs and CS aborts the migration.
As far as we can interpret the logs, it seems (we speculate) that CS detects that the VM is still running on the original host, which is in maintenance, and somehow decides to stop it because of the "host not up" status.
We then need to start the VM several times, until CS gives up stopping it.

After restarting the MSs I guess that a similar issue may happen: the hosts are disconnected during the early stage and then reconnect. Maybe CS then stops the VMs again because of the "host not up" status?
Cf. 2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) The last host of this VM is not UP or is not enabled, host status is: Disconnected, host resource state is: Enabled

But the problem after the MS restart definitely only impacts VMs listed in the table op_ha_work.
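For reference, this is roughly how we map the instance_id values in the dump above to the VM names seen in the logs (just a quick sketch; the column names type/state/step in op_ha_work and instance_name in vm_instance are our assumption, based on the standard schema):

  SELECT h.id, h.instance_id, v.instance_name, h.type, h.state, h.step
  FROM op_ha_work h
  JOIN vm_instance v ON v.id = h.instance_id
  ORDER BY h.id;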


To your questions:
	We have 3 ACS MSs managing several Zones. Each zone has a single pod and cluster.
	The Cluster of i-638-1735-VM has this summary:
	 Total number of VMs: 121
	 Total number of hypervisors: 4
	 Total allocated RAM: 522.0 / 972 GB (53.7 %)
	 Total allocated cores: 211
	 Total allocated storage: 11285 GB
	Used memory on the hosts currently varies from 28% to 77%, so there is enough RAM.
	I did not check the XS logs yet. The same problem arose again this morning (shortly after a restart of all MSs), so I will look into the XS logs now.

	I have fair experience with Eclipse and Java; I will ask my boss if we can go this way.

The main issue for now is that restarting the MSs reproduces the problem, which is quite annoying for the impacted VMs.
I thought of stopping the MSs, deleting all HA/Migration entries in op_ha_work, and then starting the MSs again.
Is that a viable option? (Or should I delete only the Migration entries? Or never touch this table at all?)
We will also very probably remove the HA flag on all VMs if it does not work reliably.
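Concretely, something like this is what we have in mind (only a sketch, to be run with all MSs stopped; the op_ha_work type column and the ha_enabled/type/removed columns of vm_instance are assumptions based on the standard schema):

  -- drop the pending HA/Migration work items while the MSs are stopped
  DELETE FROM op_ha_work WHERE type IN ('HA', 'Migration');

  -- optionally switch off the HA flag on all (non-removed) user VMs
  UPDATE vm_instance SET ha_enabled = 0 WHERE type = 'User' AND removed IS NULL;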

Best Regards
Francois Scheurer



Here are new logs regarding the same issue from today.
I tried to extract the important parts related to the stop of i-189-1178-VM.

regexp: 
(i-189-1178-VM|VmWorkMigrateAway|job-15393|job-192989|StopRequestedvm|job-14965|job-193019|Stopped-Migrating|ctx-36aae733|work-2049)

2017-01-30 08:44:04,440 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Executing 
AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType: null, 
instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo: 
rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZU
F3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ, 
cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resu
ltCode: 0, result: null, initMsid: 345049103441, completeMsid: null, 
lastUpdated: null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 2017}
2017-01-30 08:44:04,440 DEBUG [c.c.v.VmWorkJobDispatcher] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Run VM work job: 
com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393

2017-01-30 08:44:05,217 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Deploy avoids pods: [], clusters: [], hosts: [29]
2017-01-30 08:44:05,220 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
(AgentConnectTaskPool-4:ctx-5b61f2e5) create ClusteredAgentAttache for 232
2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
DeploymentPlanner allocation algorithm: 
com.cloud.deploy.UserDispersingPlanner@1734b2e2
2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Trying to allocate a host and storage pools from dc:17, 
pod:25,cluster:25, requested cpu: 4200, requested ram: 4294967296
2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Is 
ROOT volume READY (pool already allocated)?: Yes
2017-01-30 08:44:05,220 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
This VM has last host_id specified, trying to choose the same host: 25

2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The 
last host of this VM is not UP or is not enabled, host status is: 
Disconnected, host resource state is: Enabled
2017-01-30 08:44:05,614 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Cannot choose the last host to deploy this VM
2017-01-30 08:44:05,614 DEBUG [c.c.d.FirstFitPlanner] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Searching resources only under specified Cluster: 25

2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea 
FirstFitRoutingAllocator) Found 0 hosts for allocation after 
prioritization: []
2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea 
FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
2017-01-30 08:44:05,777 DEBUG [c.c.a.m.a.i.FirstFitAllocator] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea 
FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No 
suitable hosts found
2017-01-30 08:44:05,777 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No 
suitable hosts found under this Cluster: 25
2017-01-30 08:44:05,787 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Could not find suitable Deployment Destination for this VM under any 
clusters, returning.
2017-01-30 08:44:05,787 DEBUG [c.c.d.FirstFitPlanner] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Searching resources only under specified Cluster: 25
2017-01-30 08:44:05,798 DEBUG [c.c.d.FirstFitPlanner] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The 
specified cluster is in avoid set, returning.
2017-01-30 08:44:05,798 DEBUG [c.c.v.VirtualMachineManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
2017-01-30 08:44:05,798 WARN [c.c.v.VirtualMachineManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Failed to deploy vm 1178 with original planner, sending HAPlanner

2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Deploy avoids pods: [], clusters: [], hosts: [29]
2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
DeploymentPlanner allocation algorithm: 
com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Trying to allocate a host and storage pools from dc:17, 
pod:25,cluster:25, requested cpu: 4200, requested ram: 4294967296
2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) Is 
ROOT volume READY (pool already allocated)?: Yes
2017-01-30 08:44:05,985 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
This VM has last host_id specified, trying to choose the same host: 25

2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The 
last host of this VM is not UP or is not enabled, host status is: 
Disconnected, host resource state is: Enabled
2017-01-30 08:44:06,022 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Cannot choose the last host to deploy this VM
2017-01-30 08:44:06,022 DEBUG [c.c.d.FirstFitPlanner] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Searching resources only under specified Cluster: 25
2017-01-30 08:44:06,039 DEBUG [c.c.d.SkipHeuresticsPlanner] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Deploying vm during HA process, so skipping disable threshold check

2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea 
FirstFitRoutingAllocator) Found 0 hosts for allocation after 
prioritization: []
2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea 
FirstFitRoutingAllocator) Looking for speed=4200Mhz, Ram=4096
2017-01-30 08:44:06,151 DEBUG [c.c.a.m.a.i.FirstFitAllocator] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea 
FirstFitRoutingAllocator) Host Allocator returning 0 suitable hosts
2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No 
suitable hosts found
2017-01-30 08:44:06,151 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) No 
suitable hosts found under this Cluster: 25
2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Publish async 
job-192986 complete on message bus
2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs 
related to job-192986
2017-01-30 08:44:06,157 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Update db status 
for job-192986
2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) The 
last host of this VM is not UP or is not enabled, host status is: 
Disconnected, host resource state is: Enabled
2017-01-30 08:44:06,157 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) 
Cannot choose the last host to deploy this VM
2017-01-30 08:44:06,157 DEBUG [c.c.d.FirstFitPlanner] 
(Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) 
Searching resources only under specified Cluster: 35
2017-01-30 08:44:06,160 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Could not find suitable Deployment Destination for this VM under any 
clusters, returning.
2017-01-30 08:44:06,160 DEBUG [c.c.d.FirstFitPlanner] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Searching resources only under specified Cluster: 25
2017-01-30 08:44:06,163 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item: 
SyncQueueItemVO {id:63629, queueId: 64423, contentType: AsyncJob, 
contentId: 192991, lastProcessMsid: 345049098498, lastprocessNumber: 27, 
lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 
08:44:05 CET 2017}
2017-01-30 08:44:06,169 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
(Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Add job-192992 
into job monitoring
2017-01-30 08:44:06,184 DEBUG [c.c.d.SkipHeuresticsPlanner] 
(Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) 
Deploying vm during HA process, so skipping disable threshold check
2017-01-30 08:44:06,186 DEBUG [c.c.d.FirstFitPlanner] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) The 
specified cluster is in avoid set, returning.
2017-01-30 08:44:06,186 DEBUG [c.c.v.VirtualMachineManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Unable to find destination for migrating the vm VM[User|i-189-1178-VM]
2017-01-30 08:44:06,186 ERROR [c.c.v.VmWorkJobHandlerProxy] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Invocation exception, caused by: 
com.cloud.exception.InsufficientServerCapacityException: Unable to find 
a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
2017-01-30 08:44:06,187 INFO  [c.c.v.VmWorkJobHandlerProxy] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989 ctx-ea83d6ea) 
Rethrow exception 
com.cloud.exception.InsufficientServerCapacityException: Unable to find 
a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
2017-01-30 08:44:06,187 DEBUG [c.c.v.VmWorkJobDispatcher] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done with run of 
VM work job: com.cloud.vm.VmWorkMigrateAway for VM 1178, job origin: 15393
2017-01-30 08:44:06,187 ERROR [c.c.v.VmWorkJobDispatcher] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Unable to 
complete AsyncJobVO {id:192989, userId: 1, accountId: 1, instanceType: 
null, instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo: 
rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAEmnQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAHQ, 
cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, 
result: null, initMsid: 345049103441, completeMsid: null, lastUpdated: 
null, lastPolled: null, created: Mon Jan 30 08:44:03 CET 2017}, job 
origin:15393
com.cloud.exception.InsufficientServerCapacityException: Unable to find 
a server to migrate to.Scope=interface com.cloud.org.Cluster; id=25
         at 
com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2390)
         at 
com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4520)
         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
         at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:606)
         at 
com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
         at 
com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
         at 
com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:103)
         at 
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
         at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
         at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
         at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
         at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
         at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
         at 
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
         at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
         at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
         at java.lang.Thread.run(Thread.java:745)
2017-01-30 08:44:06,188 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Schedule queued job-192991
2017-01-30 08:44:06,192 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Wake up jobs 
joined with job-192986 and disjoin all subjobs created from job- 192986
2017-01-30 08:44:06,196 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Complete async 
job-192989, jobStatus: FAILED, resultCode: 0, result: 
rO0ABXNyADdjb20uY2xvdWQuZXhjZXB0aW9uLkluc3VmZmljaWVudFNlcnZlckNhcGFjaXR5RXhjZXB0aW9uAAAAAFZNT3ACAAFaABVhZmZpbml0eUdyb3Vwc0FwcGxpZWR4cgAxY29tLmNsb3VkLmV4Y2VwdGlvbi5JbnN1ZmZpY2llbnRDYXBhY2l0eUV4Y2VwdGlvbgAAAABWTU91AgACTAACaWR0ABBMamF2YS9sYW5nL0xvbmc7TAAFc2NvcGV0ABFMamF2YS9sYW5nL0NsYXNzO3hyACJjb20uY2xvdWQuZXhjZXB0aW9uLkNsb3VkRXhjZXB0aW9ueeiOG_HkRMcCAAJMAAtjc0Vycm9yQ29kZXQAE0xqYXZhL2xhbmcvSW50ZWdlcjtMAAZpZExpc3R0ABVMamF2YS91dGlsL0FycmF5TGlzdDt4cgATamF2YS5sYW5nLkV4Y2VwdGlvbtD9Hz4aOxzEAgAAeHIAE2phdmEubGFuZy5UaHJvd2FibGXVxjUnOXe4ywMABEwABWNhdXNldAAVTGphdmEvbGFuZy9UaHJvd2FibGU7TAANZGV0YWlsTWVzc2FnZXQAEkxqYXZhL2xhbmcvU3RyaW5nO1sACnN0YWNrVHJhY2V0AB5bTGphdmEvbGFuZy9TdGFja1RyYWNlRWxlbWVudDtMABRzdXBwcmVzc2VkRXhjZXB0aW9uc3QAEExqYXZhL3V0aWwvTGlzdDt4cHEAfgANdAAmVW5hYmxlIHRvIGZpbmQgYSBzZXJ2ZXIgdG8gbWlncmF0ZSB0by51cgAeW0xqYXZhLmxhbmcuU3RhY2tUcmFjZUVsZW1lbnQ7AkYqPDz9IjkCAAB4cAAAABVzcgAbamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW50YQnFmiY23YUCAARJAApsaW5lTnVtYmVyTAAOZGVjbGFyaW5nQ2xhc3NxAH4ACkwACGZpbGVOYW1lcQB-AApMAAptZXRob2ROYW1lcQB-AAp4cAAACVZ0ACZjb20uY2xvdWQudm0uVmlydHVhbE1hY2hpbmVNYW5hZ2VySW1wbHQAHlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwuamF2YXQAFm9yY2hlc3RyYXRlTWlncmF0ZUF3YXlzcQB-ABEAABGocQB-ABNxAH4AFHEAfgAVc3EAfgAR_____nQAJHN1bi5yZWZsZWN0Lk5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbHQAHU5hdGl2ZU1ldGhvZEFjY2Vzc29ySW1wbC5qYXZhdAAHaW52b2tlMHNxAH4AEQAAADlxAH4AGHEAfgAZdAAGaW52b2tlc3EAfgARAAAAK3QAKHN1bi5yZWZsZWN0LkRlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGx0ACFEZWxlZ2F0aW5nTWV0aG9kQWNjZXNzb3JJbXBsLmphdmFxAH4AHHNxAH4AEQAAAl50ABhqYXZhLmxhbmcucmVmbGVjdC5NZXRob2R0AAtNZXRob2QuamF2YXEAfgAcc3EAfgARAAAAa3QAImNvbS5jbG91ZC52bS5WbVdvcmtKb2JIYW5kbGVyUHJveHl0ABpWbVdvcmtKb2JIYW5kbGVyUHJveHkuamF2YXQAD2hhbmRsZVZtV29ya0pvYnNxAH4AEQAAEhxxAH4AE3EAfgAUcQB-ACZzcQB-ABEAAABndAAgY29tLmNsb3VkLnZtLlZtV29ya0pvYkRpc3BhdGNoZXJ0ABhWbVdvcmtKb2JEaXNwYXRjaGVyLmphdmF0AAZydW5Kb2JzcQB-ABEAAAIZdAA_b3JnLmFwYWNoZS5jbG91ZHN0YWNrLmZyYW1ld29yay5qb2JzLmltcGwuQXN5bmNKb2JNYW5hZ2VySW1wbCQ1dAAYQXN5bmNKb2JNYW5hZ2VySW1wbC5qYXZhdAAMcnVuSW5Db250ZXh0c3EAfgARAAAAMXQAPm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZSQxdAAbTWFuYWdlZENvbnRleHRSdW5uYWJsZS5qYXZhdAADcnVuc3EAfgARAAAAOHQAQm9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdlZENvbnRleHQkMXQAGkRlZmF1bHRNYW5hZ2VkQ29udGV4dC5qYXZhdAAEY2FsbHNxAH4AEQAAAGd0AEBvcmcuYXBhY2hlLmNsb3Vkc3RhY2subWFuYWdlZC5jb250ZXh0LmltcGwuRGVmYXVsdE1hbmFnZWRDb250ZXh0cQB-ADZ0AA9jYWxsV2l0aENvbnRleHRzcQB-ABEAAAA1cQB-ADlxAH4ANnQADnJ1bldpdGhDb250ZXh0c3EAfgARAAAALnQAPG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuTWFuYWdlZENvbnRleHRSdW5uYWJsZXEAfgAycQB-ADNzcQB-ABEAAAHucQB-AC1xAH4ALnEAfgAzc3EAfgARAAAB13QALmphdmEudXRpbC5jb25jdXJyZW50LkV4ZWN1dG9ycyRSdW5uYWJsZUFkYXB0ZXJ0AA5FeGVjdXRvcnMuamF2YXEAfgA3c3EAfgARAAABBnQAH2phdmEudXRpbC5jb25jdXJyZW50LkZ1dHVyZVRhc2t0AA9GdXR1cmVUYXNrLmphdmFxAH4AM3NxAH4AEQAABHl0ACdqYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3J0ABdUaHJlYWRQb29sRXhlY3V0b3IuamF2YXQACXJ1bldvcmtlcnNxAH4AEQAAAmd0AC5qYXZhLnV0aWwuY29uY3VycmVudC5UaHJlYWRQb29sRXhlY3V0b3IkV29ya2VycQB-AEhxAH4AM3NxAH4AEQAAAul0ABBqYXZhLmxhbmcuVGhyZWFkdAALVGhyZWFkLmphdmFxAH4AM3NyACZqYXZhLnV0aWwuQ29sbGVjdGlvbnMkVW5tb2RpZmlhYmxlTGlzdPwPJTG17I4QAgABTAAEbGlzdHEAfgAMeHIALGphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVDb2xsZWN0aW9uGUIAgMte9x4CAAFMAAFjdAAWTGphdmEvdXRpbC9Db2xsZWN0aW9uO3hwc3IAE2phdmEudXRpbC5BcnJheUxpc3R4gdIdmcdhnQMAAUkABHNpemV4cAAAAAB3BAAAAAB4cQB-AFR4c3IAEWphdmEubGFuZy5JbnRlZ2VyEuKgpPeBhzgCAAFJAAV2YWx1ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAAEO9zcQB-AFMAAAAAdwQAAAAAeHNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAFdmFsdWV4cQB-AFYAAAA
AAAAAGXZyABVjb20uY2xvdWQub3JnLkNsdXN0ZXKqoV-omaW2FAIAAHhwAA
2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) 
Deploy avoids pods: [], clusters: [], hosts: [3]
2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) 
DeploymentPlanner allocation algorithm: 
com.cloud.deploy.SkipHeuresticsPlanner@4fdf86ce
2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) 
Trying to allocate a host and storage pools from dc:14, 
pod:17,cluster:17, requested cpu: 4200, requested ram: 4294967296
2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) Is 
ROOT volume READY (pool already allocated)?: Yes
2017-01-30 08:44:06,201 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) 
This VM has last host_id specified, trying to choose the same host: 213
2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Publish async 
job-192989 complete on message bus
2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs 
related to job-192989
2017-01-30 08:44:06,213 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Update db status 
for job-192989
2017-01-30 08:44:06,219 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f) 
Checking resources in Cluster: 35 under Pod: 31
2017-01-30 08:44:06,227 DEBUG [c.c.a.m.a.i.FirstFitAllocator] 
(Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f 
FirstFitRoutingAllocator) Looking for hosts in dc: 23  pod:31  cluster:35
2017-01-30 08:44:06,229 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Executing 
AsyncJobVO {id:192992, userId: 1, accountId: 1, instanceType: null, 
instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo: 
rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGwXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAEw, 
cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, 
result: null, initMsid: 345049098122, completeMsid: null, lastUpdated: 
null, lastPolled: null, created: Mon Jan 30 08:44:05 CET 2017}
2017-01-30 08:44:06,229 DEBUG [c.c.v.VmWorkJobDispatcher] 
(Work-Job-Executor-8:ctx-381e8d1a job-14953/job-192992) Run VM work job: 
com.cloud.vm.VmWorkMigrateAway for VM 1729, job origin: 14953
2017-01-30 08:44:06,230 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(AsyncJobMgr-Heartbeat-1:ctx-eb81fc0b) Execute sync-queue item: 
SyncQueueItemVO {id:63630, queueId: 67180, contentType: AsyncJob, 
contentId: 192990, lastProcessMsid: 345049098498, lastprocessNumber: 61, 
lastProcessTime: Mon Jan 30 08:44:06 CET 2017, created: Mon Jan 30 
08:44:05 CET 2017}
2017-01-30 08:44:06,244 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) The 
last host of this VM is not UP or is not enabled, host status is: 
Disconnected, host resource state is: Enabled
2017-01-30 08:44:06,262 DEBUG [c.c.d.DeploymentPlanningManagerImpl] 
(Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) 
Cannot choose the last host to deploy this VM
2017-01-30 08:44:06,263 DEBUG [c.c.d.FirstFitPlanner] 
(Work-Job-Executor-7:ctx-bcf531e9 job-13729/job-192987 ctx-4c93bc31) 
Searching resources only under specified Cluster: 17
2017-01-30 08:44:06,262 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
(Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Add job-192991 
into job monitoring
2017-01-30 08:44:06,246 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Done executing 
com.cloud.vm.VmWorkMigrateAway for job-192986
2017-01-30 08:44:06,244 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Wake up jobs 
joined with job-192989 and disjoin all subjobs created from job- 192989
2017-01-30 08:44:06,269 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
(Work-Job-Executor-4:ctx-017064e7 job-13733/job-192986) Remove 
job-192986 from job monitoring
2017-01-30 08:44:06,279 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Done executing 
com.cloud.vm.VmWorkMigrateAway for job-192989
2017-01-30 08:44:06,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Executing 
AsyncJobVO {id:192991, userId: 1, accountId: 1, instanceType: null, 
instanceId: null, cmd: com.cloud.vm.VmWorkMigrateAway, cmdInfo: 
rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGRXQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAABw, 
cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, 
result: null, initMsid: 345049098122, completeMsid: null, lastUpdated: 
null, lastPolled: null, created: Mon Jan 30 08:44:05 CET 2017}
2017-01-30 08:44:06,305 DEBUG [c.c.v.VmWorkJobDispatcher] 
(Work-Job-Executor-9:ctx-f4d09efb job-14965/job-192991) Run VM work job: 
com.cloud.vm.VmWorkMigrateAway for VM 1605, job origin: 14965
2017-01-30 08:44:06,306 DEBUG [c.c.a.m.AgentManagerImpl] 
(AgentManager-Handler-13:null) Ping from 258
2017-01-30 08:44:06,319 DEBUG [c.c.a.m.a.i.FirstFitAllocator] 
(Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f 
FirstFitRoutingAllocator) List of hosts in ascending order of number of 
VMs: [224, 19, 21, 15]
2017-01-30 08:44:06,322 DEBUG [c.c.a.m.a.i.FirstFitAllocator] 
(Work-Job-Executor-6:ctx-0d738a8e job-13737/job-192988 ctx-6e9c753f 
FirstFitRoutingAllocator) FirstFitAllocator has 0 hosts to check for 
allocation: []
2017-01-30 08:44:06,322 INFO  [c.c.a.m.AgentManagerImpl] 
(AgentManager-Handler-13:null) PingMap for agent: 258 will not be 
updated because agent is no longer in the PingMap
2017-01-30 08:44:06,326 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
(Work-Job-Executor-5:ctx-32f1e087 job-15393/job-192989) Remove 
job-192989 from job monitoring

2017-01-30 08:54:38,139 INFO [c.c.v.VirtualMachineManagerImpl] 
(Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) 
Migration cancelled because state has changed: VM[User|i-189-1178-VM]
2017-01-30 08:54:38,139 DEBUG [c.c.v.VirtualMachineManagerImpl] 
(Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) 
Unable to migrate VM due to: Migration cancelled because state has 
changed: VM[User|i-189-1178-VM]
2017-01-30 08:54:38,167 DEBUG [c.c.c.CapacityManagerImpl] 
(Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) VM 
state transitted from :Running to Stopping with event: StopRequestedvm's 
original host id: 29 new host id: 33 host id before state transition: 33
2017-01-30 08:54:38,177 DEBUG [c.c.a.m.DirectAgentAttache] 
(DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Response 
Received:
2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request] 
(DirectAgent-325:ctx-e540d7f5) Seq 19-2355664080091480198: Processing:  
{ Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10, 
[{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-1706-VM","bytesSent":12276848087,"bytesReceived":5226621450,"result":true,"details":"","wait":0}}] 
}
2017-01-30 08:54:38,177 DEBUG [c.c.a.t.Request] 
(RouterMonitor-1:ctx-2e80f405) Seq 19-2355664080091480198: Received:  { 
Ans: , MgmtId: 345049098498, via: 19, Ver: v1, Flags: 10, { 
NetworkUsageAnswer } }
2017-01-30 08:54:38,177 DEBUG [c.c.a.m.AgentManagerImpl] 
(RouterMonitor-1:ctx-2e80f405) Details from executing class 
com.cloud.agent.api.NetworkUsageCommand:
2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request] 
(Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) 
Seq 33-1917970491306410035: Sending  { Cmd , MgmtId: 345049098498, via: 
33(ewcstack-vh003-prod), Ver: v1, Flags: 100011, 
[{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}] 
}
2017-01-30 08:54:38,178 DEBUG [c.c.a.t.Request] 
(Work-Job-Executor-20:ctx-a988090f job-14965/job-193019 ctx-052eb4f5) 
Seq 33-1917970491306410035: Executing:  { Cmd , MgmtId: 345049098498, 
via: 33(ewcstack-vh003-prod), Ver: v1, Flags: 100011, 
[{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-189-1178-VM","wait":0}}] 
}
2017-01-30 08:54:38,178 DEBUG [c.c.a.m.DirectAgentAttache] 
(DirectAgent-43:ctx-da30f109) Seq 33-1917970491306410035: Executing request
2017-01-30 08:54:38,202 DEBUG [c.c.a.m.DirectAgentAttache] 
(DirectAgent-326:ctx-4fb88aa7) Seq 21-1035546439318503505: Executing request

2017-01-30 09:05:55,628 INFO [c.c.h.HighAvailabilityManagerImpl] 
(HA-Worker-4:ctx-36aae733 work-2049) Processing 
HAWork[2049-Migration-1178-Stopped-Migrating]
2017-01-30 09:05:55,633 WARN [o.a.c.f.j.AsyncJobExecutionContext] 
(HA-Worker-4:ctx-36aae733 work-2049) Job is executed without a context, 
setup psudo job for the executing thread
2017-01-30 09:05:55,645 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(HA-Worker-4:ctx-36aae733 work-2049) Sync job-193045 execution on object 
VmWorkJobQueue.1178



On 31.01.2017 00:42, Rafael Weingärtner wrote:
> This seems to be one of those nasty bugs that are very hard to debug. It
> seems to be caused by a combination of different factors, such as planet
> alignments ;)
>
> At class “VirtualMachineManagerImpl” line 2395 (method
> “com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
> long, DeploymentPlanner)”) is the start of the code flow that triggers the
> migration. The point here is that, if an error occurs during the migration,
> the VM will be stopped; For some reason this is executed only for HA VMs.
>
> Going a little bit further, we find the point that is triggering the error
> in the method “com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO,
> long, DeployDestination)” at line 1941. This line will call the method
> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
> Long, ItWorkVO, Step)” to change the VM state to migrating/prepare
> migration/or something like that (I am guessing here), but for some reason,
> the state of the VM is not the one expected and this is causing a
> “ConcurrentOperationException” to be thrown. The problem is that the
> “com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
> Long, ItWorkVO, Step)” does not seem to provide much useful log entries.
>
> Just to enable us to discuss the problem a little further, could you tell
> us a little bit about your environment setting? How many ACS MSs do you
> have? What is the VM workload being migrated (I mean, the number of VMs and
> computing resources)? Do the other servers have enough computing resources
> to support these VMs? Could you also check the XenServer logs, and check
> what happened on both, source and destination hosts when the migration for
> VM “i-638-1735-VM” was triggered?
>
> Do you have the knowledge/skills to use an IDE (e.g. Eclipse) to connect to
> a running JVM and debug a code? Because for me it will not be possible to
> go further without looking at the variables in a running environment.
>
>
> On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
> francois.scheurer@everyware.ch> wrote:
>
>> Dear Rafael
>>
>>
>> Sorry for the late answer (was sick a couple of days).
>>
>> I confirm we are migrating inside the same xenserver pool (within a CS
>> Cluster).
>>
>>
>> Best Regards
>> Francois
>>
>>
>>
>> On 26.01.2017 04:35, Rafael Weingärtner wrote:
>>
>>> Found the code that is executed.
>>> I need you to confirm if you are migrating within a cluster or across
>>> clusters?
>>>
>>> On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
>>> rafaelweingartner@gmail.com> wrote:
>>>
>>> I do not have access to ACS code right now, but I suggest starting
>>>> debugging here:
>>>>
>>>> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df)
>>>>> VM state transitted from :Running to Stopping with event:
>>>>> StopRequestedvm's
>>>>> original host id: 224 new host id: 15 host id before state transition:
>>>>> 15
>>>>>
>>>>> I would try to understand first why ACS requested the hypervisor to stop
>>>> the VM. I mean the following, check the source code for conditions that
>>>> would make ACS request the shutdown of VMs.
>>>> BTW: are you migrating within a cluster or across clusters?
>>>>
>>>> I will be able to analyze this further only after work hours, so if you
>>>> find anything, keep me posted.
>>>>
>>>> On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
>>>> francois.scheurer@everyware.ch> wrote:
>>>>
>>>> Hello Rafael
>>>>>
>>>>> I think the important log lines are these:
>>>>>
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because state has
>>>>> changed: VM[User|i-638-1736-VM]
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>> Migration
>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to
>>>>> Stopping with event: StopRequestedvm's original host id: 224 new host
>>>>> id:
>>>>> 15 host id before state transition: 15
>>>>>
>>>>> The really disturbing thing is that CS stop again the VM if we start it
>>>>> manually, apparently because the migratingjob is still running.
>>>>>
>>>>> Note the the VM HA Flag is enabled.
>>>>>
>>>>>
>>>>>
>>>>> Thank you.
>>>>>
>>>>>
>>>>> Best Regards.
>>>>>
>>>>> Francois
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> more details:
>>>>>
>>>>>
>>>>>
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>>>>> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b86f5e
>>>>> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
>>>>> VM[User|i-638-1736-VM]
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
>>>>> job-14953/job-191351) Done with run of VM work job:
>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8bacc64
>>>>> job-14953/job-191351) Done executing com.cloud.vm.VmWorkMigrateAway for
>>>>> job-191351
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64
>>>>> job-14953/job-191351) Remove job-191351 from job monitoring
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>>>>> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c
>>>>> work-1989)
>>>>> Completed HAWork[1989-Migration-1411-Running-Migrating]
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>>>>> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375:
>>>>> Forwarding
>>>>> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
>>>>> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
>>>>> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>>> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>>> ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>>> 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>>> (64-bit)","platformEmulator":"Other install
>>>>> media","bootArgs":"","enable
>>>>> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>>> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>>> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>>> rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>>> age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
>>>>> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>>> -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>>> stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>>> 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f
>>>>> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>>> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>>> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>>> 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>>> ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3
>>>>> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>>> :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>>> gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>>> XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>>> -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>>> storagePort":"0",
>>>>> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>>> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>>> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>>> ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>>> ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
>>>>> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>>> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>>> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>>> UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>>> DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2
>>>>> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>>> ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>>> deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>>> path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>>> K","_details":{"managed":"false","storagePort":"0","stor
>>>>> ageHost":"localhos
>>>>> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>>> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>>> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>>> :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup","
>>>>> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>>> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>>> -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>>>
>>>>> e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>>> ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>>> ,"volumeId":86
>>>>> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>>> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>>> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>>> 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>>> false","storagePort":"0","storageHost":"localhost","volumeSize":"42949
>>>>> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>>> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>>> id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>>> ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS 4.5
>>>>> (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>>> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>>> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>>> d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>>> a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>>> gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>>> .8.8","dns2":"8.8
>>>>> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>>> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>>> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] } to
>>>>> 345049098498
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: Processing:
>>>>> {
>>>>> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
>>>>> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>>> true,"wait":0}}]
>>>>> }
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>>> [c.c.a.m.AgentAttache]
>>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No more
>>>>> commands
>>>>> found
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df)
>>>>> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId: 345049103441,
>>>>> via:
>>>>> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>>>>>
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because state has
>>>>> changed: VM[User|i-638-1736-VM]
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>>> Migration
>>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to
>>>>> Stopping with event: StopRequestedvm's original host id: 224 new host
>>>>> id:
>>>>> 15 host id before state transition: 15
>>>>>
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>>>>> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643:
>>>>> Forwarding
>>>>> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
>>>>> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>>> 638-1736-VM","wait":0}}]
>>>>> } to 345049098498
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>> 254-800498:
>>>>> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver: v1,
>>>>> Flags:
>>>>> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>>> proxyVmId":1530,"_loadInfo":"{\n
>>>>> \"connections\": []\n}","wait":0}}] }
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>>> 254-800498:
>>>>> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via: 254, Ver:
>>>>> v1,
>>>>> Flags: 100010, [{"com.cloud.agent.api.AgentCo
>>>>> ntrolAnswer":{"result":true,"wait":0}}] }
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>>>>> (AgentManager-Handler-11:null) Seq 15-6889944479923437632: Processing:
>>>>> {
>>>>> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
>>>>> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>>> e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>>> ":true,"details":"Stop
>>>>> VM i-638-1735-VM Succeed","wait":0}}] }
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>>>>> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae)
>>>>> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId: 345049103441,
>>>>> via:
>>>>> 15, Ver: v1, Flags: 10, { StopAnswer } }
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is stopped on
>>>>> the
>>>>> host. Proceeding to release resource held.
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>> supported
>>>>> in the network id=519
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) Changing active number of nics for
>>>>> network id=519 on -1
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to release
>>>>> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>>> 92.168.1.11-null
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released network
>>>>> resources
>>>>> for the vm VM[User|i-638-1735-VM]
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released storage
>>>>> resources
>>>>> for the vm VM[User|i-638-1735-VM]
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) VM state transitted from :Stopping to
>>>>> Stopped with event: OperationSucceededvm's original host id: 15 new host
>>>>> id: null host id before state transition: 15
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU: 82976 and
>>>>> CPU
>>>>> after applying overprovisioning: 331904
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM:
>>>>> 261961927808
>>>>> and RAM after applying overprovisioning: 261961924608
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old used:
>>>>> 71300,reserved: 2100, actual total: 82976, total with overprovisioning:
>>>>> 331904; new used: 67100,reserved:6300; movedfromreserved:
>>>>> false,moveToReserveredtrue
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old used:
>>>>> 98784247808,reserved: 1073741824, total: 261961924608; new used:
>>>>> 94489280512,reserved:5368709120; movedfromreserved:
>>>>> false,moveToReserveredtrue
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>> supported
>>>>> in the network id=519
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>>> supported
>>>>> in the network id=519
>>>>>
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused by:
>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>>>> VM[User|i-638-1735-VM]
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
>>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>>>> VM[User|i-638-1735-VM]
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379) Done with run of VM work job:
>>>>> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379, userId:
>>>>> 1,
>>>>> accountId: 1, instanceType: null, instanceId: null, cmd:
>>>>> com.cloud.vm.VmWorkMigrateAway, cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>>> FuYWdlckltcGwAAAAAAAAA4A,
>>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>>>>> result: null, initMsid: 345049098498, completeMsid: null, lastUpdated:
>>>>> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017}, job
>>>>> origin:13733
>>>>> ewcstack-man02-prod: com.cloud.utils.exception.CloudRuntimeException:
>>>>> Unable to migrate VM[User|i-638-1735-VM]
>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
>>>>> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAcc
>>>>> essor402.invoke(Unknown
>>>>> Source)
>>>>> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
>>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
>>>>> e(Method.java:606)
>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
>>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
>>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>>> erImpl.java:537)
>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>>> ext.java:56)
>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>>> nagedContext.java:103)
>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>>> agedContext.java:53)
>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>>> ewcstack-man02-prod:    at java.util.concurrent.Executors
>>>>> $RunnableAdapter.call(Executors.java:471)
>>>>> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
>>>>> k.run(FutureTask.java:262)
>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379) Complete async job-191379, jobStatus: FAILED,
>>>>> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>>> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>>> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>>> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>>> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>>> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>>> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>>> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>>> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>>> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>>> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>>> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>>> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>>> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>>> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>>> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>>> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>>> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>>> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>>> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>>> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>>> ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>>> 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>>> AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>>> cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>>> AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>>> YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>>> ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>>> YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>>> dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>>> dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>>> Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>>> dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>>> cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>>> TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>>> ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>>> LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>>> dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>>> ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>>> JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>>> AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>>> ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>>> YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>>> AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>>> JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>>> jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>>> bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>>> dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>>> x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379) Publish async job-191379 complete on message bus
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379) Wake up jobs related to job-191379
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379) Update db status for job-191379
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379) Wake up jobs joined with job-191379 and disjoin
>>>>> all
>>>>> subjobs created from job- 191379
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379) Done executing com.cloud.vm.VmWorkMigrateAway for
>>>>> job-191379
>>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528
>>>>> job-13733/job-191379) Remove job-191379 from job monitoring
>>>>>
>>>>>
>>>>>
>>>>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>>>>
>>>>> Did you check the ACS log files looking for some exceptions when the
>>>>>> migrations for this specific VM started?
>>>>>>
>>>>>> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
>>>>>> francois.scheurer@everyware.ch> wrote:
>>>>>>
>>>>>> Dear CS contributors
>>>>>>
>>>>>>> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared primary
>>>>>>> storage.
>>>>>>>
>>>>>>> When we put a host in maintenance, cs evacuates all VM's on other
>>>>>>> hosts.
>>>>>>>
>>>>>>> But it happens regularly that some VM's are stopped instead of live
>>>>>>> migrated.
>>>>>>>
>>>>>>> If we restart that stopped VM's, CS will then stop them again later.
>>>>>>>
>>>>>>> We need to start them several times until they stays up.
>>>>>>>
>>>>>>> Is it a known issue? Is it fixed on CS 4.9.2 ?
>>>>>>>
>>>>>>>
>>>>>>> Migrating manually all VM's is working but impracticable because too
>>>>>>> much
>>>>>>> time consuming.
>>>>>>>
>>>>>>>
>>>>>>> Many thanks in advance for your help.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Best Regards
>>>>>>> Francois
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>> EveryWare AG
>>>>>>> François Scheurer
>>>>>>> Senior Systems Engineer
>>>>>>> Zurlindenstrasse 52a
>>>>>>> CH-8003 Zürich
>>>>>>>
>>>>>>> tel: +41 44 466 60 00
>>>>>>> fax: +41 44 466 60 10
>>>>>>> mail: francois.scheurer@everyware.ch
>>>>>>> web: http://www.everyware.ch
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>
>>>>> EveryWare AG
>>>>> François Scheurer
>>>>> Senior Systems Engineer
>>>>> Zurlindenstrasse 52a
>>>>> CH-8003 Zürich
>>>>>
>>>>> tel: +41 44 466 60 00
>>>>> fax: +41 44 466 60 10
>>>>> mail: francois.scheurer@everyware.ch
>>>>> web: http://www.everyware.ch
>>>>>
>>>>>
>>>>>
>>>> --
>>>> Rafael Weingärtner
>>>>
>>>>
>>>
>> --
>>
>>
>> EveryWare AG
>> François Scheurer
>> Senior Systems Engineer
>> Zurlindenstrasse 52a
>> CH-8003 Zürich
>>
>> tel: +41 44 466 60 00
>> fax: +41 44 466 60 10
>> mail: francois.scheurer@everyware.ch
>> web: http://www.everyware.ch
>>
>
>

-- 


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheurer@everyware.ch
web: http://www.everyware.ch


Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Rafael Weingärtner <ra...@gmail.com>.
This seems to be one of those nasty bugs that are very hard to debug. It
looks like it is caused by a combination of different factors, such as the
planets aligning ;)

The code flow that triggers the migration starts in the class
“VirtualMachineManagerImpl” at line 2395 (method
“com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(String,
long, DeploymentPlanner)”). The point here is that, if an error occurs
during the migration, the VM is stopped; for some reason this stop is
executed only for HA-enabled VMs.
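
To make that flow concrete, here is a minimal, self-contained sketch of the
pattern I am describing. The class and method names are invented for the
example; this is not the real CloudStack code, just my reading of it:

// Illustration only: "migrate away, and stop the VM if migration fails".
// All names below are made up for this sketch.
public class MigrateAwaySketch {

    static class Vm {
        final String name;
        final boolean haEnabled;
        Vm(String name, boolean haEnabled) {
            this.name = name;
            this.haEnabled = haEnabled;
        }
    }

    static class MigrationFailedException extends RuntimeException {
        MigrationFailedException(String msg) { super(msg); }
    }

    // Stand-in for the live migration call; here it always fails, the way
    // your logs show ("Migration cancelled because state has changed").
    static void liveMigrate(Vm vm) {
        throw new MigrationFailedException("state has changed: " + vm.name);
    }

    static void stop(Vm vm) {
        System.out.println("StopCommand sent for " + vm.name);
    }

    // The pattern: if migrating an HA-enabled VM fails, the VM is stopped so
    // that the HA machinery can pick it up later, and the work job fails.
    static void migrateAway(Vm vm) {
        try {
            liveMigrate(vm);
        } catch (MigrationFailedException e) {
            System.out.println("Unable to migrate VM due to: " + e.getMessage());
            if (vm.haEnabled) {
                stop(vm); // this is the stop you see in your logs
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        try {
            migrateAway(new Vm("i-638-1735-VM", true));
        } catch (MigrationFailedException e) {
            System.out.println("job FAILED: " + e.getMessage());
        }
    }
}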

Going a little further, the point that triggers the error is in the method
“com.cloud.vm.VirtualMachineManagerImpl.migrate(VMInstanceVO,
long, DeployDestination)” at line 1941. That line calls the method
“com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
Long, ItWorkVO, Step)” to change the VM state to migrating/prepare
migration/or something like that (I am guessing here), but for some reason
the state of the VM is not the expected one, and this causes a
“ConcurrentOperationException” to be thrown. The problem is that
“com.cloud.vm.VirtualMachineManagerImpl.changeState(VMInstanceVO, Event,
Long, ItWorkVO, Step)” does not seem to produce many useful log entries.
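
Conceptually, I suspect the guard inside that state change behaves like the
following sketch (again, invented names, only meant to show how a concurrent
state change ends up as a “ConcurrentOperationException”):

// Illustration only: a compare-and-set style state guard. If another job
// already moved the VM out of the expected state, the transition is refused.
import java.util.concurrent.atomic.AtomicReference;

public class StateGuardSketch {

    enum State { RUNNING, MIGRATING, STOPPING, STOPPED }

    static class ConcurrentOperationException extends Exception {
        ConcurrentOperationException(String msg) { super(msg); }
    }

    // One shared state per VM; two jobs racing on the same VM both see it.
    static final AtomicReference<State> vmState =
            new AtomicReference<>(State.RUNNING);

    static void changeState(State expected, State next)
            throws ConcurrentOperationException {
        // The transition only succeeds if nobody else changed the state first.
        if (!vmState.compareAndSet(expected, next)) {
            throw new ConcurrentOperationException(
                "Migration cancelled because state has changed: expected "
                + expected + " but found " + vmState.get());
        }
    }

    public static void main(String[] args) throws Exception {
        changeState(State.RUNNING, State.MIGRATING);     // first job wins
        try {
            changeState(State.RUNNING, State.MIGRATING); // concurrent job loses
        } catch (ConcurrentOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}

If another job (for instance a pending HA MigrateAway work item) already moved
the VM out of the expected state, the second transition is refused and the
caller sees exactly the “state has changed” message from your logs.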

Just to enable us to discuss the problem a little further, could you tell
us a bit about your environment setup? How many ACS MSs do you have? What
is the VM workload being migrated (I mean, the number of VMs and their
computing resources)? Do the other servers have enough computing resources
to support these VMs? Could you also check the XenServer logs and see what
happened on both the source and destination hosts when the migration for
VM “i-638-1735-VM” was triggered?

Do you have the knowledge/skills to use an IDE (e.g. Eclipse) to connect to
a running JVM and debug the code? For me it will not be possible to go
further without looking at the variables in a running environment.
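
In case it helps: assuming you can restart the management server with extra
JVM options, a common way to do that is to add something like
“-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8787” to the
management server's JVM arguments and then attach Eclipse via “Debug
Configurations -> Remote Java Application” on that port (the port number here
is just an example).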


On Mon, Jan 30, 2017 at 4:28 AM, Francois Scheurer <
francois.scheurer@everyware.ch> wrote:

> Dear Rafael
>
>
> Sorry for the late answer (was sick a couple of days).
>
> I confirm we are migrating inside the same xenserver pool (within a CS
> Cluster).
>
>
> Best Regards
> Francois
>
>
>
> On 26.01.2017 04:35, Rafael Weingärtner wrote:
>
>> Found the code that is executed.
>> I need you to confirm if you are migrating within a cluster or across
>> clusters?
>>
>> On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
>> rafaelweingartner@gmail.com> wrote:
>>
>> I do not have access to ACS code right now, but I suggest starting
>>> debugging here:
>>>
>>> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df)
>>>> VM state transitted from :Running to Stopping with event:
>>>> StopRequestedvm's
>>>> original host id: 224 new host id: 15 host id before state transition:
>>>> 15
>>>>
>>>> I would try to understand first why ACS requested the hypervisor to stop
>>> the VM. I mean the following, check the source code for conditions that
>>> would make ACS request the shutdown of VMs.
>>> BTW: are you migrating within a cluster or across clusters?
>>>
>>> I will be able to analyze this further only after work hours, so if you
>>> find anything, keep me posted.
>>>
>>> On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
>>> francois.scheurer@everyware.ch> wrote:
>>>
>>> Hello Rafael
>>>>
>>>>
>>>> I think the important log lines are these:
>>>>
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because state has
>>>> changed: VM[User|i-638-1736-VM]
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>> Migration
>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to
>>>> Stopping with event: StopRequestedvm's original host id: 224 new host
>>>> id:
>>>> 15 host id before state transition: 15
>>>>
>>>> The really disturbing thing is that CS stop again the VM if we start it
>>>> manually, apparently because the migratingjob is still running.
>>>>
>>>> Note the the VM HA Flag is enabled.
>>>>
>>>>
>>>>
>>>> Thank you.
>>>>
>>>>
>>>> Best Regards.
>>>>
>>>> Francois
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> more details:
>>>>
>>>>
>>>>
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>>>> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b86f5e
>>>> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
>>>> VM[User|i-638-1736-VM]
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
>>>> job-14953/job-191351) Done with run of VM work job:
>>>> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8bacc64
>>>> job-14953/job-191351) Done executing com.cloud.vm.VmWorkMigrateAway for
>>>> job-191351
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64
>>>> job-14953/job-191351) Remove job-191351 from job monitoring
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>>>> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c
>>>> work-1989)
>>>> Completed HAWork[1989-Migration-1411-Running-Migrating]
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>>>> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375:
>>>> Forwarding
>>>> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
>>>> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
>>>> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>>> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>>> ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>>> 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>>> (64-bit)","platformEmulator":"Other install
>>>> media","bootArgs":"","enable
>>>> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>>> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>>> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>>> rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>>> age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
>>>> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>>> -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>>> stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>>> 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f
>>>> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>>> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>>> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>>> 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>>> ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3
>>>> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>>> :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>>> gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>>> XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>>> -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>>> storagePort":"0",
>>>> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>>> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>>> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>>> ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>>> ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
>>>> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>>> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>>> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>>> UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>>> DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2
>>>> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>>> ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>>> deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>>> path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>>> K","_details":{"managed":"false","storagePort":"0","stor
>>>> ageHost":"localhos
>>>> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>>> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>>> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>>> :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup","
>>>> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>>> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>>> -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>>>
>>>> e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>>> ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>>> ,"volumeId":86
>>>> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>>> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>>> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>>> 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>>> false","storagePort":"0","storageHost":"localhost","volumeSize":"42949
>>>> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>>> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>>> id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>>> ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS 4.5
>>>> (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>>> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>>> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>>> d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>>> a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>>> gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>>> .8.8","dns2":"8.8
>>>> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>>> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>>> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] } to
>>>> 345049098498
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: Processing:
>>>> {
>>>> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
>>>> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":
>>>> true,"wait":0}}]
>>>> }
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG
>>>> [c.c.a.m.AgentAttache]
>>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No more
>>>> commands
>>>> found
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df)
>>>> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId: 345049103441,
>>>> via:
>>>> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>>>>
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because state has
>>>> changed: VM[User|i-638-1736-VM]
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to:
>>>> Migration
>>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>>> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to
>>>> Stopping with event: StopRequestedvm's original host id: 224 new host
>>>> id:
>>>> 15 host id before state transition: 15
>>>>
>>>> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>>>> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643:
>>>> Forwarding
>>>> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
>>>> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-
>>>> 638-1736-VM","wait":0}}]
>>>> } to 345049098498
>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>> 254-800498:
>>>> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver: v1,
>>>> Flags:
>>>> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
>>>> proxyVmId":1530,"_loadInfo":"{\n
>>>> \"connections\": []\n}","wait":0}}] }
>>>> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA
>>>> 254-800498:
>>>> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via: 254, Ver:
>>>> v1,
>>>> Flags: 100010, [{"com.cloud.agent.api.AgentCo
>>>> ntrolAnswer":{"result":true,"wait":0}}] }
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>>>> (AgentManager-Handler-11:null) Seq 15-6889944479923437632: Processing:
>>>> {
>>>> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
>>>> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>>> e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result
>>>> ":true,"details":"Stop
>>>> VM i-638-1735-VM Succeed","wait":0}}] }
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>>>> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae)
>>>> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId: 345049103441,
>>>> via:
>>>> 15, Ver: v1, Flags: 10, { StopAnswer } }
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is stopped on
>>>> the
>>>> host. Proceeding to release resource held.
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>> supported
>>>> in the network id=519
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) Changing active number of nics for
>>>> network id=519 on -1
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to release
>>>> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>>> 92.168.1.11-null
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released network
>>>> resources
>>>> for the vm VM[User|i-638-1735-VM]
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) Successfully released storage
>>>> resources
>>>> for the vm VM[User|i-638-1735-VM]
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) VM state transitted from :Stopping to
>>>> Stopped with event: OperationSucceededvm's original host id: 15 new host
>>>> id: null host id before state transition: 15
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU: 82976 and
>>>> CPU
>>>> after applying overprovisioning: 331904
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM:
>>>> 261961927808
>>>> and RAM after applying overprovisioning: 261961924608
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old used:
>>>> 71300,reserved: 2100, actual total: 82976, total with overprovisioning:
>>>> 331904; new used: 67100,reserved:6300; movedfromreserved:
>>>> false,moveToReserveredtrue
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old used:
>>>> 98784247808,reserved: 1073741824, total: 261961924608; new used:
>>>> 94489280512,reserved:5368709120; movedfromreserved:
>>>> false,moveToReserveredtrue
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>> supported
>>>> in the network id=519
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not
>>>> supported
>>>> in the network id=519
>>>>
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused by:
>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>>> VM[User|i-638-1735-VM]
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
>>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>>> VM[User|i-638-1735-VM]
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379) Done with run of VM work job:
>>>> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379, userId:
>>>> 1,
>>>> accountId: 1, instanceType: null, instanceId: null, cmd:
>>>> com.cloud.vm.VmWorkMigrateAway, cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTW
>>>> FuYWdlckltcGwAAAAAAAAA4A,
>>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>>>> result: null, initMsid: 345049098498, completeMsid: null, lastUpdated:
>>>> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017}, job
>>>> origin:13733
>>>> ewcstack-man02-prod: com.cloud.utils.exception.CloudRuntimeException:
>>>> Unable to migrate VM[User|i-638-1735-VM]
>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
>>>> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAcc
>>>> essor402.invoke(Unknown
>>>> Source)
>>>> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
>>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
>>>> e(Method.java:606)
>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
>>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
>>>> er.runJob(VmWorkJobDispatcher.java:103)
>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>>> erImpl.java:537)
>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>> context.impl.DefaultManagedContext$1.call(DefaultManagedCont
>>>> ext.java:56)
>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>>> nagedContext.java:103)
>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>>> agedContext.java:53)
>>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>>> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>>> ewcstack-man02-prod:    at java.util.concurrent.Executors
>>>> $RunnableAdapter.call(Executors.java:471)
>>>> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
>>>> k.run(FutureTask.java:262)
>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379) Complete async job-191379, jobStatus: FAILED,
>>>> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>>> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>>> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>>> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>>> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>>> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>>> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>>> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>>> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>>> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>>> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>>> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>>> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>>> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>>> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>>> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>>> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>>> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>>> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>>> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>>> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>>> ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>>> 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>>> AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>>> cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>>> AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>>> YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>>> ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>>> YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>>> dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>>> dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>>> Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>>> dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>>> cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>>> TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>>> ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>>> LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>>> dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>>> ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>>> JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>>> AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>>> ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>>> YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>>> AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>>> JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>>> jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>>> bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>>> dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>>> x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379) Publish async job-191379 complete on message bus
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379) Wake up jobs related to job-191379
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379) Update db status for job-191379
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379) Wake up jobs joined with job-191379 and disjoin
>>>> all
>>>> subjobs created from job- 191379
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379) Done executing com.cloud.vm.VmWorkMigrateAway for
>>>> job-191379
>>>> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528
>>>> job-13733/job-191379) Remove job-191379 from job monitoring
>>>>
>>>>
>>>>
>>>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>>>
>>>> Did you check the ACS log files looking for some exceptions when the
>>>>> migrations for this specific VM started?
>>>>>
>>>>> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
>>>>> francois.scheurer@everyware.ch> wrote:
>>>>>
>>>>> Dear CS contributors
>>>>>
>>>>>>
>>>>>> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared primary
>>>>>> storage.
>>>>>>
>>>>>> When we put a host in maintenance, cs evacuates all VM's on other
>>>>>> hosts.
>>>>>>
>>>>>> But it happens regularly that some VM's are stopped instead of live
>>>>>> migrated.
>>>>>>
>>>>>> If we restart that stopped VM's, CS will then stop them again later.
>>>>>>
>>>>>> We need to start them several times until they stays up.
>>>>>>
>>>>>> Is it a known issue? Is it fixed on CS 4.9.2 ?
>>>>>>
>>>>>>
>>>>>> Migrating manually all VM's is working but impracticable because too
>>>>>> much
>>>>>> time consuming.
>>>>>>
>>>>>>
>>>>>> Many thanks in advance for your help.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Best Regards
>>>>>> Francois
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>>
>>>>>> EveryWare AG
>>>>>> François Scheurer
>>>>>> Senior Systems Engineer
>>>>>> Zurlindenstrasse 52a
>>>>>> CH-8003 Zürich
>>>>>>
>>>>>> tel: +41 44 466 60 00
>>>>>> fax: +41 44 466 60 10
>>>>>> mail: francois.scheurer@everyware.ch
>>>>>> web: http://www.everyware.ch
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>>
>>>>
>>>> EveryWare AG
>>>> François Scheurer
>>>> Senior Systems Engineer
>>>> Zurlindenstrasse 52a
>>>> CH-8003 Zürich
>>>>
>>>> tel: +41 44 466 60 00
>>>> fax: +41 44 466 60 10
>>>> mail: francois.scheurer@everyware.ch
>>>> web: http://www.everyware.ch
>>>>
>>>>
>>>>
>>> --
>>> Rafael Weingärtner
>>>
>>>
>>
>>
> --
>
>
> EveryWare AG
> François Scheurer
> Senior Systems Engineer
> Zurlindenstrasse 52a
> CH-8003 Zürich
>
> tel: +41 44 466 60 00
> fax: +41 44 466 60 10
> mail: francois.scheurer@everyware.ch
> web: http://www.everyware.ch
>



-- 
Rafael Weingärtner

Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Francois Scheurer <fr...@everyware.ch>.
Dear Rafael


Sorry for the late answer (I was sick for a couple of days).

I confirm we are migrating inside the same XenServer pool (within a CS
Cluster).


Best Regards
Francois


On 26.01.2017 04:35, Rafael Weingärtner wrote:
> Found the code that is executed.
> I need you to confirm if you are migrating within a cluster or across
> clusters?
>
> On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
> rafaelweingartner@gmail.com> wrote:
>
>> I do not have access to ACS code right now, but I suggest starting
>> debugging here:
>>
>>> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df)
>>> VM state transitted from :Running to Stopping with event: StopRequestedvm's
>>> original host id: 224 new host id: 15 host id before state transition: 15
>>>
>> I would try to understand first why ACS requested the hypervisor to stop
>> the VM. I mean the following, check the source code for conditions that
>> would make ACS request the shutdown of VMs.
>> BTW: are you migrating within a cluster or across clusters?
>>
>> I will be able to analyze this further only after work hours, so if you
>> find anything, keep me posted.
>>
>> On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
>> francois.scheurer@everyware.ch> wrote:
>>
>>> Hello Rafael
>>>
>>>
>>> I think the important log lines are these:
>>>
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because state has
>>> changed: VM[User|i-638-1736-VM]
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to: Migration
>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to
>>> Stopping with event: StopRequestedvm's original host id: 224 new host id:
>>> 15 host id before state transition: 15
>>>
>>> The really disturbing thing is that CS stop again the VM if we start it
>>> manually, apparently because the migratingjob is still running.
>>>
>>> Note the the VM HA Flag is enabled.
>>>
>>>
>>>
>>> Thank you.
>>>
>>>
>>> Best Regards.
>>>
>>> Francois
>>>
>>>
>>>
>>>
>>>
>>>
>>> more details:
>>>
>>>
>>>
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>>> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b86f5e
>>> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
>>> VM[User|i-638-1736-VM]
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
>>> job-14953/job-191351) Done with run of VM work job:
>>> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8bacc64
>>> job-14953/job-191351) Done executing com.cloud.vm.VmWorkMigrateAway for
>>> job-191351
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64
>>> job-14953/job-191351) Remove job-191351 from job monitoring
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>>> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c work-1989)
>>> Completed HAWork[1989-Migration-1411-Running-Migrating]
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>>> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375: Forwarding
>>> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
>>> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
>>> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>>> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>>> ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>>> 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>>> (64-bit)","platformEmulator":"Other install media","bootArgs":"","enable
>>> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>>> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>>> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>>> rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>>> age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
>>> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>>> -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>>> stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>>> 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f
>>> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>>> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>>> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>>> 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>>> ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3
>>> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>>> :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>>> gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>>> XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>>> -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>>> storagePort":"0",
>>> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>>> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>>> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>>> ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>>> ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
>>> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>>> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>>> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>>> UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>>> DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2
>>> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>>> ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>>> deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>>> path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>>> K","_details":{"managed":"false","storagePort":"0","stor
>>> ageHost":"localhos
>>> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>>> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>>> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>>> :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup","
>>> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>>> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>>> -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>>> e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>>> ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>>> ,"volumeId":86
>>> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>>> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>>> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>>> 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>>> false","storagePort":"0","storageHost":"localhost","volumeSize":"42949
>>> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>>> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>>> id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>>> ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS 4.5
>>> (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>>> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>>> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>>> d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>>> a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>>> gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>>> .8.8","dns2":"8.8
>>> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>>> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>>> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] } to
>>> 345049098498
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: Processing:  {
>>> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
>>> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":true,"wait":0}}]
>>> }
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.m.AgentAttache]
>>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No more commands
>>> found
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df)
>>> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId: 345049103441, via:
>>> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>>>
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>> job-60216/job-191380 ctx-175d37df) Migration cancelled because state has
>>> changed: VM[User|i-638-1736-VM]
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to: Migration
>>> cancelled because state has changed: VM[User|i-638-1736-VM]
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>>> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to
>>> Stopping with event: StopRequestedvm's original host id: 224 new host id:
>>> 15 host id before state transition: 15
>>>
>>> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>>> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643: Forwarding
>>> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
>>> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-638-1736-VM","wait":0}}]
>>> } to 345049098498
>>> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA 254-800498:
>>> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver: v1, Flags:
>>> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":1530,"_loadInfo":"{\n
>>> \"connections\": []\n}","wait":0}}] }
>>> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA 254-800498:
>>> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via: 254, Ver: v1,
>>> Flags: 100010, [{"com.cloud.agent.api.AgentCo
>>> ntrolAnswer":{"result":true,"wait":0}}] }
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>>> (AgentManager-Handler-11:null) Seq 15-6889944479923437632: Processing:  {
>>> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
>>> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>>> e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result":true,"details":"Stop
>>> VM i-638-1735-VM Succeed","wait":0}}] }
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>>> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae)
>>> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId: 345049103441, via:
>>> 15, Ver: v1, Flags: 10, { StopAnswer } }
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is stopped on the
>>> host. Proceeding to release resource held.
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not supported
>>> in the network id=519
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) Changing active number of nics for
>>> network id=519 on -1
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to release
>>> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>>> 92.168.1.11-null
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) Successfully released network resources
>>> for the vm VM[User|i-638-1735-VM]
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) Successfully released storage resources
>>> for the vm VM[User|i-638-1735-VM]
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) VM state transitted from :Stopping to
>>> Stopped with event: OperationSucceededvm's original host id: 15 new host
>>> id: null host id before state transition: 15
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU: 82976 and CPU
>>> after applying overprovisioning: 331904
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM: 261961927808
>>> and RAM after applying overprovisioning: 261961924608
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old used:
>>> 71300,reserved: 2100, actual total: 82976, total with overprovisioning:
>>> 331904; new used: 67100,reserved:6300; movedfromreserved:
>>> false,moveToReserveredtrue
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old used:
>>> 98784247808,reserved: 1073741824, total: 261961924608; new used:
>>> 94489280512,reserved:5368709120; movedfromreserved:
>>> false,moveToReserveredtrue
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not supported
>>> in the network id=519
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not supported
>>> in the network id=519
>>>
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused by:
>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>> VM[User|i-638-1735-VM]
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
>>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>>> VM[User|i-638-1735-VM]
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379) Done with run of VM work job:
>>> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379, userId: 1,
>>> accountId: 1, instanceType: null, instanceId: null, cmd:
>>> com.cloud.vm.VmWorkMigrateAway, cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAA4A,
>>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>>> result: null, initMsid: 345049098498, completeMsid: null, lastUpdated:
>>> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017}, job
>>> origin:13733
>>> ewcstack-man02-prod: com.cloud.utils.exception.CloudRuntimeException:
>>> Unable to migrate VM[User|i-638-1735-VM]
>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
>>> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAccessor402.invoke(Unknown
>>> Source)
>>> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
>>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
>>> e(Method.java:606)
>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
>>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
>>> er.runJob(VmWorkJobDispatcher.java:103)
>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>>> erImpl.java:537)
>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>> context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>>> nagedContext.java:103)
>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>>> agedContext.java:53)
>>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>>> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>>> ewcstack-man02-prod:    at java.util.concurrent.Executors
>>> $RunnableAdapter.call(Executors.java:471)
>>> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
>>> k.run(FutureTask.java:262)
>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379) Complete async job-191379, jobStatus: FAILED,
>>> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>>> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>>> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>>> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>>> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>>> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>>> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>>> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>>> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>>> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>>> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>>> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>>> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>>> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>>> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>>> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>>> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>>> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>>> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>>> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>>> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>>> ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>>> 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>>> AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>>> cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>>> AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>>> YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>>> ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>>> YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>>> dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>>> dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>>> Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>>> dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>>> cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>>> TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>>> ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>>> LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>>> dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>>> ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>>> JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>>> AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>>> ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>>> YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>>> ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>>> AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>>> JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>>> jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>>> bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>>> dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>>> x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379) Publish async job-191379 complete on message bus
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379) Wake up jobs related to job-191379
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379) Update db status for job-191379
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379) Wake up jobs joined with job-191379 and disjoin all
>>> subjobs created from job- 191379
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379) Done executing com.cloud.vm.VmWorkMigrateAway for
>>> job-191379
>>> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528
>>> job-13733/job-191379) Remove job-191379 from job monitoring
>>>
>>>
>>>
>>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>>
>>>> Did you check the ACS log files looking for some exceptions when the
>>>> migrations for this specific VM started?
>>>>
>>>> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
>>>> francois.scheurer@everyware.ch> wrote:
>>>>
>>>> Dear CS contributors
>>>>>
>>>>> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared primary
>>>>> storage.
>>>>>
>>>>> When we put a host in maintenance, cs evacuates all VM's on other hosts.
>>>>>
>>>>> But it happens regularly that some VM's are stopped instead of live
>>>>> migrated.
>>>>>
>>>>> If we restart that stopped VM's, CS will then stop them again later.
>>>>>
>>>>> We need to start them several times until they stays up.
>>>>>
>>>>> Is it a known issue? Is it fixed on CS 4.9.2 ?
>>>>>
>>>>>
>>>>> Migrating manually all VM's is working but impracticable because too
>>>>> much
>>>>> time consuming.
>>>>>
>>>>>
>>>>> Many thanks in advance for your help.
>>>>>
>>>>>
>>>>>
>>>>> Best Regards
>>>>> Francois
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>> EveryWare AG
>>>>> François Scheurer
>>>>> Senior Systems Engineer
>>>>> Zurlindenstrasse 52a
>>>>> CH-8003 Zürich
>>>>>
>>>>> tel: +41 44 466 60 00
>>>>> fax: +41 44 466 60 10
>>>>> mail: francois.scheurer@everyware.ch
>>>>> web: http://www.everyware.ch
>>>>>
>>>>>
>>>>
>>> --
>>>
>>>
>>> EveryWare AG
>>> François Scheurer
>>> Senior Systems Engineer
>>> Zurlindenstrasse 52a
>>> CH-8003 Zürich
>>>
>>> tel: +41 44 466 60 00
>>> fax: +41 44 466 60 10
>>> mail: francois.scheurer@everyware.ch
>>> web: http://www.everyware.ch
>>>
>>>
>>
>> --
>> Rafael Weingärtner
>>
>
>

-- 


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheurer@everyware.ch
web: http://www.everyware.ch

Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Rafael Weingärtner <ra...@gmail.com>.
Found the code that is executed.
I need you to confirm: are you migrating within a cluster or across
clusters?
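
One way to confirm which work jobs actually hit that code path, and which
source and destination host ids were involved (which also helps answer the
intra- vs. cross-cluster question), is to scan the management-server log for
the two messages quoted in this thread. A minimal, self-contained sketch,
assuming the usual log location and the job-id format seen in the quoted
lines; adjust both to your setup:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Lists work jobs whose migration was cancelled and that then logged a
// StopRequested transition (those lines also carry the old/new host ids).
public class MigrationCancelScanner {

    // Job ids appear as "job-60216/job-191380" in the quoted log lines.
    private static final Pattern JOB = Pattern.compile("(job-\\d+/job-\\d+)");

    public static void main(String[] args) throws IOException {
        // Default path is an assumption; pass your own log file as an argument.
        String log = args.length > 0 ? args[0]
                : "/var/log/cloudstack/management/management-server.log";

        Set<String> cancelled = new HashSet<>();
        for (String line : Files.readAllLines(Paths.get(log))) {
            Matcher m = JOB.matcher(line);
            if (!m.find()) {
                continue;
            }
            String job = m.group(1);
            if (line.contains("Migration cancelled because state has changed")) {
                cancelled.add(job);
            } else if (cancelled.contains(job) && line.contains("StopRequested")) {
                System.out.println(line);   // shows "original host id: X new host id: Y"
            }
        }
    }
}

Run against the maintenance window, this should list exactly the VMs that were
stopped instead of live migrated, together with the host ids involved.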

On Wed, Jan 25, 2017 at 11:57 AM, Rafael Weingärtner <
rafaelweingartner@gmail.com> wrote:

> I do not have access to ACS code right now, but I suggest starting
> debugging here:
>
>> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df)
>> VM state transitted from :Running to Stopping with event: StopRequestedvm's
>> original host id: 224 new host id: 15 host id before state transition: 15
>>
>
> I would try to understand first why ACS requested the hypervisor to stop
> the VM. I mean the following: check the source code for conditions that
> would make ACS request the shutdown of VMs.
> BTW: are you migrating within a cluster or across clusters?
>
> I will be able to analyze this further only after work hours, so if you
> find anything, keep me posted.
>
> On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
> francois.scheurer@everyware.ch> wrote:
>
>> Hello Rafael
>>
>>
>> I think the important log lines are these:
>>
>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>> job-60216/job-191380 ctx-175d37df) Migration cancelled because state has
>> changed: VM[User|i-638-1736-VM]
>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to: Migration
>> cancelled because state has changed: VM[User|i-638-1736-VM]
>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to
>> Stopping with event: StopRequestedvm's original host id: 224 new host id:
>> 15 host id before state transition: 15
>>
>> The really disturbing thing is that CS stops the VM again if we start it
>> manually, apparently because the migrating job is still running.
>>
>> Note that the VM HA flag is enabled.
>>
>>
>>
>> Thank you.
>>
>>
>> Best Regards.
>>
>> Francois
>>
>>
>>
>>
>>
>>
>> more details:
>>
>>
>>
>> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
>> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b86f5e
>> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
>> VM[User|i-638-1736-VM]
>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
>> job-14953/job-191351) Done with run of VM work job:
>> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
>> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8bacc64
>> job-14953/job-191351) Done executing com.cloud.vm.VmWorkMigrateAway for
>> job-191351
>> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64
>> job-14953/job-191351) Remove job-191351 from job monitoring
>> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
>> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c work-1989)
>> Completed HAWork[1989-Migration-1411-Running-Migrating]
>> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375: Forwarding
>> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
>> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
>> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
>> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"U
>> ser","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179
>> 869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
>> (64-bit)","platformEmulator":"Other install media","bootArgs":"","enable
>> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
>> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
>> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:t
>> rue;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Mess
>> age.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
>> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033
>> -4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloud
>> stack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-
>> 4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.
>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f
>> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
>> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
>> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed
>> 7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-
>> ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3
>> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
>> :"i-638-1736-VM","accountId":638,"format":"VHD","provisionin
>> gType":"THIN","id":8655,"deviceId":0,"hypervisorType":"
>> XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93
>> -609d5b980a34","type":"ROOT","_details":{"managed":"false","
>> storagePort":"0",
>> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
>> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
>> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATAD
>> ISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryD
>> ataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
>> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
>> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
>> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STORE
>> UUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-
>> DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2
>> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
>> ":638,"format":"VHD","provisioningType":"THIN","id":8656,"
>> deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"
>> path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADIS
>> K","_details":{"managed":"false","storagePort":"0","stor
>> ageHost":"localhos
>> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
>> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
>> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.
>> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid"
>> :"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup","
>> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
>> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0
>> -3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-
>> e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size
>> ":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894"
>> ,"volumeId":86
>> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
>> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
>> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b1
>> 20-062ee7a67894","type":"DATADISK","_details":{"managed":"
>> false","storagePort":"0","storageHost":"localhost","volumeSize":"42949
>> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
>> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
>> id":201,"format":"ISO","accountId":1,"hvm":true,"displayText
>> ":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS 4.5
>> (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
>> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
>> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-a
>> d99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3
>> a8","ip":"192.168.1.42","netmask":"255.255.255.0","
>> gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8
>> .8.8","dns2":"8.8
>> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
>> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
>> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] } to
>> 345049098498
>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: Processing:  {
>> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
>> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":true,"wait":0}}]
>> }
>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.m.AgentAttache]
>> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No more commands
>> found
>> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
>> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df)
>> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId: 345049103441, via:
>> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>>
>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>> job-60216/job-191380 ctx-175d37df) Migration cancelled because state has
>> changed: VM[User|i-638-1736-VM]
>> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to: Migration
>> cancelled because state has changed: VM[User|i-638-1736-VM]
>> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
>> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to
>> Stopping with event: StopRequestedvm's original host id: 224 new host id:
>> 15 host id before state transition: 15
>>
>> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
>> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
>> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643: Forwarding
>> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
>> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
>> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"execut
>> eInSequence":false,"checkBeforeCleanup":false,"vmName":"i-638-1736-VM","wait":0}}]
>> } to 345049098498
>> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA 254-800498:
>> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver: v1, Flags:
>> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":1530,"_loadInfo":"{\n
>> \"connections\": []\n}","wait":0}}] }
>> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
>> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA 254-800498:
>> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via: 254, Ver: v1,
>> Flags: 100010, [{"com.cloud.agent.api.AgentCo
>> ntrolAnswer":{"result":true,"wait":0}}] }
>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>> (AgentManager-Handler-11:null) Seq 15-6889944479923437632: Processing:  {
>> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
>> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:tru
>> e;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result":true,"details":"Stop
>> VM i-638-1735-VM Succeed","wait":0}}] }
>> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
>> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae)
>> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId: 345049103441, via:
>> 15, Ver: v1, Flags: 10, { StopAnswer } }
>> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is stopped on the
>> host. Proceeding to release resource held.
>> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not supported
>> in the network id=519
>> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) Changing active number of nics for
>> network id=519 on -1
>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to release
>> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-1
>> 92.168.1.11-null
>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) Successfully released network resources
>> for the vm VM[User|i-638-1735-VM]
>> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
>> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) Successfully released storage resources
>> for the vm VM[User|i-638-1735-VM]
>> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) VM state transitted from :Stopping to
>> Stopped with event: OperationSucceededvm's original host id: 15 new host
>> id: null host id before state transition: 15
>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU: 82976 and CPU
>> after applying overprovisioning: 331904
>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM: 261961927808
>> and RAM after applying overprovisioning: 261961924608
>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old used:
>> 71300,reserved: 2100, actual total: 82976, total with overprovisioning:
>> 331904; new used: 67100,reserved:6300; movedfromreserved:
>> false,moveToReserveredtrue
>> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
>> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old used:
>> 98784247808,reserved: 1073741824, total: 261961924608; new used:
>> 94489280512,reserved:5368709120; movedfromreserved:
>> false,moveToReserveredtrue
>> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not supported
>> in the network id=519
>> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
>> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not supported
>> in the network id=519
>>
>> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused by:
>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>> VM[User|i-638-1735-VM]
>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
>> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
>> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
>> VM[User|i-638-1735-VM]
>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379) Done with run of VM work job:
>> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
>> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
>> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379, userId: 1,
>> accountId: 1, instanceType: null, instanceId: null, cmd:
>> com.cloud.vm.VmWorkMigrateAway, cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm
>> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
>> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
>> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
>> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAA4A,
>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>> result: null, initMsid: 345049098498, completeMsid: null, lastUpdated:
>> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017}, job
>> origin:13733
>> ewcstack-man02-prod: com.cloud.utils.exception.CloudRuntimeException:
>> Unable to migrate VM[User|i-638-1735-VM]
>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
>> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAccessor402.invoke(Unknown
>> Source)
>> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
>> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
>> e(Method.java:606)
>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
>> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
>> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
>> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
>> er.runJob(VmWorkJobDispatcher.java:103)
>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
>> erImpl.java:537)
>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>> context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>> context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>> context.impl.DefaultManagedContext.callWithContext(DefaultMa
>> nagedContext.java:103)
>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>> context.impl.DefaultManagedContext.runWithContext(DefaultMan
>> agedContext.java:53)
>> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
>> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
>> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>> ewcstack-man02-prod:    at java.util.concurrent.Executors
>> $RunnableAdapter.call(Executors.java:471)
>> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
>> k.run(FutureTask.java:262)
>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
>> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
>> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379) Complete async job-191379, jobStatus: FAILED,
>> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
>> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
>> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
>> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
>> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
>> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
>> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
>> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
>> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
>> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
>> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
>> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
>> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
>> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
>> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
>> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
>> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
>> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
>> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
>> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
>> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLn
>> ZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm
>> 94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-
>> AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlz
>> cGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNx
>> AH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpv
>> YnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFn
>> ZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFw
>> YWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4
>> dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANy
>> dW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQu
>> Y29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVs
>> dE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5h
>> cGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0
>> TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAA
>> ADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3Jn
>> LmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29u
>> dGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-
>> ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3Jz
>> JFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwA
>> AAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVy
>> ZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJy
>> ZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5q
>> YXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJy
>> ZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAM
>> AAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IA
>> JmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXs
>> jhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVu
>> bW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91
>> dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Z
>> x2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379) Publish async job-191379 complete on message bus
>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379) Wake up jobs related to job-191379
>> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379) Update db status for job-191379
>> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379) Wake up jobs joined with job-191379 and disjoin all
>> subjobs created from job- 191379
>> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
>> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379) Done executing com.cloud.vm.VmWorkMigrateAway for
>> job-191379
>> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
>> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528
>> job-13733/job-191379) Remove job-191379 from job monitoring
>>
>>
>>
>> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>>
>>> Did you check the ACS log files looking for some exceptions when the
>>> migrations for this specific VM started?
>>>
>>> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
>>> francois.scheurer@everyware.ch> wrote:
>>>
>>> Dear CS contributors
>>>>
>>>>
>>>> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared primary
>>>> storage.
>>>>
>>>> When we put a host in maintenance, cs evacuates all VM's on other hosts.
>>>>
>>>> But it happens regularly that some VM's are stopped instead of live
>>>> migrated.
>>>>
>>>> If we restart that stopped VM's, CS will then stop them again later.
>>>>
>>>> We need to start them several times until they stays up.
>>>>
>>>> Is it a known issue? Is it fixed on CS 4.9.2 ?
>>>>
>>>>
>>>> Migrating manually all VM's is working but impracticable because too
>>>> much
>>>> time consuming.
>>>>
>>>>
>>>> Many thanks in advance for your help.
>>>>
>>>>
>>>>
>>>> Best Regards
>>>> Francois
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> EveryWare AG
>>>> François Scheurer
>>>> Senior Systems Engineer
>>>> Zurlindenstrasse 52a
>>>> CH-8003 Zürich
>>>>
>>>> tel: +41 44 466 60 00
>>>> fax: +41 44 466 60 10
>>>> mail: francois.scheurer@everyware.ch
>>>> web: http://www.everyware.ch
>>>>
>>>>
>>>
>>>
>> --
>>
>>
>> EveryWare AG
>> François Scheurer
>> Senior Systems Engineer
>> Zurlindenstrasse 52a
>> CH-8003 Zürich
>>
>> tel: +41 44 466 60 00
>> fax: +41 44 466 60 10
>> mail: francois.scheurer@everyware.ch
>> web: http://www.everyware.ch
>>
>>
>
>
> --
> Rafael Weingärtner
>



-- 
Rafael Weingärtner

Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Rafael Weingärtner <ra...@gmail.com>.
I do not have access to ACS code right now, but I suggest starting
debugging here:

> 2017-01-24 18:05:28,427 DEBUG [c.c.c.CapacityManagerImpl]
> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) VM
> state transitted from :Running to Stopping with event: StopRequestedvm's
> original host id: 224 new host id: 15 host id before state transition: 15
>

I would try to understand first why ACS requested the hypervisor to stop
the VM. I mean the following: check the source code for conditions that
would make ACS request the shutdown of VMs.
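
To make that condition concrete: the stack traces later in the thread put the
failure in VirtualMachineManagerImpl.orchestrateMigrateAway(), and the quoted
log shows "Migration cancelled because state has changed" immediately followed
by a StopRequested transition for the same work job. That suggests the
migrate-away path checks whether the VM is still exactly as it was when the
job was queued (presumably a state/update-count comparison), cancels the
migration on a mismatch, and then falls back to stopping the VM rather than
leaving it on the host entering maintenance. The class below is only a toy
model of that decision, not CloudStack source; every name in it is invented
for the illustration:

// Toy model of the cancel-then-stop decision; not CloudStack source code.
public class MigrateAwayModel {

    enum State { RUNNING, STOPPING, STOPPED, MIGRATING }

    static class Vm {
        final String name;
        State state;
        int updated;   // bumped by every concurrent operation on the VM

        Vm(String name, State state, int updated) {
            this.name = name;
            this.state = state;
            this.updated = updated;
        }
    }

    /** Returns true if the VM was live-migrated, false if it was stopped instead. */
    static boolean migrateAway(Vm vm, int updatedWhenJobWasQueued) {
        if (vm.state != State.RUNNING || vm.updated != updatedWhenJobWasQueued) {
            // Mirrors "Migration cancelled because state has changed: VM[...]".
            System.out.println("Migration cancelled because state has changed: " + vm.name);
            // The fallback mirrors the StopRequested transition in the quoted log:
            // a VM that cannot be moved off the host is stopped instead.
            vm.state = State.STOPPED;
            return false;
        }
        vm.state = State.MIGRATING;   // the live migration would run here...
        vm.state = State.RUNNING;     // ...and finish on the destination host
        return true;
    }

    public static void main(String[] args) {
        // Another job touched the VM after the migrate-away work was queued,
        // so the snapshot no longer matches and the VM ends up stopped.
        Vm vm = new Vm("i-638-1736-VM", State.RUNNING, 5);
        boolean migrated = migrateAway(vm, 4);
        System.out.println("migrated=" + migrated + ", final state=" + vm.state);
    }
}
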
BTW: are you migrating within a cluster or across clusters?

I will be able to analyze this further only after work hours, so if you
find anything, keep me posted.

On Wed, Jan 25, 2017 at 11:45 AM, Francois Scheurer <
francois.scheurer@everyware.ch> wrote:

> Hello Rafael
>
>
> I think the important log lines are these:
>
> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
> job-60216/job-191380 ctx-175d37df) Migration cancelled because state has
> changed: VM[User|i-638-1736-VM]
> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to: Migration
> cancelled because state has changed: VM[User|i-638-1736-VM]
> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to
> Stopping with event: StopRequestedvm's original host id: 224 new host id:
> 15 host id before state transition: 15
>
> The really disturbing thing is that CS stops the VM again if we start it
> manually, apparently because the migrating job is still running.
>
> Note that the VM HA flag is enabled.
>
>
>
> Thank you.
>
>
> Best Regards.
>
> Francois
>
>
>
>
>
>
> more details:
>
>
>
> ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG
> [o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b86f5e
> job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for
> VM[User|i-638-1736-VM]
> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64
> job-14953/job-191351) Done with run of VM work job:
> com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
> ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG
> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8bacc64
> job-14953/job-191351) Done executing com.cloud.vm.VmWorkMigrateAway for
> job-191351
> ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO
> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64
> job-14953/job-191351) Remove job-191351 from job monitoring
> ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO
> [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c work-1989)
> Completed HAWork[1989-Migration-1411-Running-Migrating]
> ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG
> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
> job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375: Forwarding
> Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, via:
> 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
> "com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id"
> :1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"
> User","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":
> 17179869184,"maxRam":17179869184,"arch":"x86_64","os":"Other
> (64-bit)","platformEmulator":"Other install media","bootArgs":"","enable
> HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,
> "vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOv
> ercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:
> true;pae:true;nx:true;timeoffset:-3601","keyboard":"us","
> Message.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
> hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-
> c033-4925-9e47-30dd14539272","disks":[{"data":{"org.apache.
> cloudstack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-4d85-8e0d-
> 9768184a1349","volumeType":"ROOT","dataStore":{"org.
> apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f
> 6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","hos
> t":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563
> ","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-
> ed7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-
> 00f6-5079-ed7a-c073579c7563"}},"name":"ROOT-1736","size":
> 53687091200,"path":"d3
> f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName"
> :"i-638-1736-VM","accountId":638,"format":"VHD","
> provisioningType":"THIN","id":8655,"deviceId":0,"hypervisorT
> ype":"XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-
> aa93-609d5b980a34","type":"ROOT","_details":{"managed":"
> false","storagePort":"0",
> "storageHost":"localhost","volumeSize":"53687091200"}},{"dat
> a":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid
> ":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"
> DATADISK","dataStore":{"org.apache.cloudstack.storage.to.P
> rimaryDataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c
> 7563","id":64,"
> poolType":"PreSetup","host":"localhost","path":"/b8751b89-00
> f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localh
> ost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&
> STOREUUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"
> WELLA-DB1-NCSA_D","size":268435456000,"path":"265f168a-
> bb92-4591-a32b-ee3e2
> 2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId
> ":638,"format":"VHD","provisioningType":"THIN","id":
> 8656,"deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":
> 1,"path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"
> DATADISK","_details":{"managed":"false","storagePort":"0","
> storageHost":"localhos
> t","volumeSize":"268435456000"}},{"data":{"org.apache.clouds
> tack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-
> 8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.apache.
> cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"09df3ca6-
> 0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup","
> host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a
> 13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-
> e4f0-3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-
> 0a20-e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","
> size":429496729600,"path":"612cb586-a70c-4896-b120-
> 062ee7a67894","volumeId":86
> 57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","
> provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorT
> ype":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-
> b120-062ee7a67894","type":"DATADISK","_details":{"managed
> ":"false","storagePort":"0","storageHost":"localhost","volumeSize":"42949
> 6729600"}},{"data":{"org.apache.cloudstack.storage.to.Templa
> teObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","
> id":201,"format":"ISO","accountId":1,"hvm":true,"
> displayText":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS
> 4.5 (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
> ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"default
> Nic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-
> ad99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-
> 9a79b263b3a8","ip":"192.168.1.42","netmask":"255.255.255.0",
> "gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.
> 8.8.8","dns2":"8.8
> .4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"
> vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEn
> abled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] } to
> 345049098498
> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: Processing:  {
> Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110,
> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":true,"wait":0}}]
> }
> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.m.AgentAttache]
> (AgentManager-Handler-10:null) Seq 19-8526158519542516375: No more commands
> found
> ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request]
> (Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df)
> Seq 19-8526158519542516375: Received:  { Ans: , MgmtId: 345049103441, via:
> 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }
>
> ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO
> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
> job-60216/job-191380 ctx-175d37df) Migration cancelled because state has
> changed: VM[User|i-638-1736-VM]
> ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG
> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
> job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to: Migration
> cancelled because state has changed: VM[User|i-638-1736-VM]
> ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG
> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e
> job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to
> Stopping with event: StopRequestedvm's original host id: 224 new host id:
> 15 host id before state transition: 15
>
> ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG
> [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e
> job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643: Forwarding
> Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, via:
> 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"
> executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-638-1736-VM","wait":0}}]
> } to 345049098498
> ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG
> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA 254-800498:
> Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, Ver: v1, Flags:
> 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_
> proxyVmId":1530,"_loadInfo":"{\n \"connections\": []\n}","wait":0}}] }
> ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG
> [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA 254-800498:
> Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via: 254, Ver: v1,
> Flags: 100010, [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}]
> }
> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
> (AgentManager-Handler-11:null) Seq 15-6889944479923437632: Processing:  {
> Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:
> true;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result":true,"details":"Stop
> VM i-638-1735-VM Succeed","wait":0}}] }
> ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request]
> (Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae)
> Seq 15-6889944479923437632: Received:  { Ans: , MgmtId: 345049103441, via:
> 15, Ver: v1, Flags: 10, { StopAnswer } }
> ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG
> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is stopped on the
> host. Proceeding to release resource held.
> ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG
> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not supported
> in the network id=519
> ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG
> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) Changing active number of nics for
> network id=519 on -1
> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
> [o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to release
> NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-
> 192.168.1.11-null
> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) Successfully released network resources
> for the vm VM[User|i-638-1735-VM]
> ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG
> [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) Successfully released storage resources
> for the vm VM[User|i-638-1735-VM]
> ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG
> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) VM state transitted from :Stopping to
> Stopped with event: OperationSucceededvm's original host id: 15 new host
> id: null host id before state transition: 15
> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU: 82976 and CPU
> after applying overprovisioning: 331904
> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM: 261961927808
> and RAM after applying overprovisioning: 261961924608
> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old used:
> 71300,reserved: 2100, actual total: 82976, total with overprovisioning:
> 331904; new used: 67100,reserved:6300; movedfromreserved:
> false,moveToReserveredtrue
> ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG
> [c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old used:
> 98784247808,reserved: 1073741824, total: 261961924608; new used:
> 94489280512,reserved:5368709120; movedfromreserved:
> false,moveToReserveredtrue
> ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG
> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not supported
> in the network id=519
> ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG
> [c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not supported
> in the network id=519
>
> ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR
> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused by:
> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
> VM[User|i-638-1735-VM]
> ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO
> [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379 ctx-bbabfeae) Rethrow exception
> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate
> VM[User|i-638-1735-VM]
> ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG
> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379) Done with run of VM work job:
> com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
> ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR
> [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379) Unable to complete AsyncJobVO {id:191379, userId: 1,
> accountId: 1, instanceType: null, instanceId: null, cmd:
> com.cloud.vm.VmWorkMigrateAway, cmdInfo: rO0ABXNyAB5jb20uY2xvdWQudm0uVm
> 1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb2
> 0uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2
> VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZz
> t4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAA4A,
> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> result: null, initMsid: 345049098498, completeMsid: null, lastUpdated:
> null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017}, job
> origin:13733
> ewcstack-man02-prod: com.cloud.utils.exception.CloudRuntimeException:
> Unable to migrate VM[User|i-638-1735-VM]
> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
> agerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
> ewcstack-man02-prod:    at sun.reflect.GeneratedMethodAccessor402.invoke(Unknown
> Source)
> ewcstack-man02-prod:    at sun.reflect.DelegatingMethodAc
> cessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> ewcstack-man02-prod:    at java.lang.reflect.Method.invok
> e(Method.java:606)
> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobHandlerP
> roxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
> ewcstack-man02-prod:    at com.cloud.vm.VirtualMachineMan
> agerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
> ewcstack-man02-prod:    at com.cloud.vm.VmWorkJobDispatch
> er.runJob(VmWorkJobDispatcher.java:103)
> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
> k.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManag
> erImpl.java:537)
> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
> context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
> context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
> context.impl.DefaultManagedContext.callWithContext(DefaultMa
> nagedContext.java:103)
> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
> context.impl.DefaultManagedContext.runWithContext(DefaultMan
> agedContext.java:53)
> ewcstack-man02-prod:    at org.apache.cloudstack.managed.
> context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> ewcstack-man02-prod:    at org.apache.cloudstack.framewor
> k.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
> ewcstack-man02-prod:    at java.util.concurrent.Executors
> $RunnableAdapter.call(Executors.java:471)
> ewcstack-man02-prod:    at java.util.concurrent.FutureTas
> k.run(FutureTask.java:262)
> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
> lExecutor.runWorker(ThreadPoolExecutor.java:1145)
> ewcstack-man02-prod:    at java.util.concurrent.ThreadPoo
> lExecutor$Worker.run(ThreadPoolExecutor.java:615)
> ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
> ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG
> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379) Complete async job-191379, jobStatus: FAILED,
> resultCode: 0, result: rO0ABXNyAC9jb20uY2xvdWQudXRpbH
> MuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQ
> ALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXw
> ZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cg
> ATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMam
> F2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbG
> FuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVH
> JhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdX
> RpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfG
> ktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW
> 50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRW
> xlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYX
> NzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAA
> lldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB
> 5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU
> 1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____
> 90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AA
> ZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldG
> hvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcG
> wuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldG
> hvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3
> VkLnZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZX
> JQcm94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQ
> B-AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRG
> lzcGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYn
> NxAH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLm
> pvYnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbm
> FnZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLm
> FwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udG
> V4dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AA
> NydW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZW
> QuY29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYX
> VsdE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy
> 5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdW
> x0TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADA
> AAADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3
> JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ2
> 9udGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQ
> B-ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3
> JzJFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AA
> wAAAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dH
> VyZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdX
> JyZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci
> 5qYXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdX
> JyZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfg
> AMAAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3
> IAJmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMb
> XsjhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJF
> VubW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS
> 91dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h
> 2Zx2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379) Publish async job-191379 complete on message bus
> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379) Wake up jobs related to job-191379
> ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG
> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379) Update db status for job-191379
> ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG
> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379) Wake up jobs joined with job-191379 and disjoin all
> subjobs created from job- 191379
> ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG
> [o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379) Done executing com.cloud.vm.VmWorkMigrateAway for
> job-191379
> ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO
> [o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528
> job-13733/job-191379) Remove job-191379 from job monitoring
>
>
>
> On 25.01.2017 15:53, Rafael Weingärtner wrote:
>
>> Did you check the ACS log files looking for some exceptions when the
>> migrations for this specific VM started?
>>
>> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
>> francois.scheurer@everyware.ch> wrote:
>>
>> Dear CS contributors
>>>
>>>
>>> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared primary
>>> storage.
>>>
>>> When we put a host in maintenance, cs evacuates all VM's on other hosts.
>>>
>>> But it happens regularly that some VM's are stopped instead of live
>>> migrated.
>>>
>>> If we restart that stopped VM's, CS will then stop them again later.
>>>
>>> We need to start them several times until they stays up.
>>>
>>> Is it a known issue? Is it fixed on CS 4.9.2 ?
>>>
>>>
>>> Migrating manually all VM's is working but impracticable because too much
>>> time consuming.
>>>
>>>
>>> Many thanks in advance for your help.
>>>
>>>
>>>
>>> Best Regards
>>> Francois
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>>
>>> EveryWare AG
>>> François Scheurer
>>> Senior Systems Engineer
>>> Zurlindenstrasse 52a
>>> CH-8003 Zürich
>>>
>>> tel: +41 44 466 60 00
>>> fax: +41 44 466 60 10
>>> mail: francois.scheurer@everyware.ch
>>> web: http://www.everyware.ch
>>>
>>>
>>
>>
> --
>
>
> EveryWare AG
> François Scheurer
> Senior Systems Engineer
> Zurlindenstrasse 52a
> CH-8003 Zürich
>
> tel: +41 44 466 60 00
> fax: +41 44 466 60 10
> mail: francois.scheurer@everyware.ch
> web: http://www.everyware.ch
>
>


-- 
Rafael Weingärtner

Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Francois Scheurer <fr...@everyware.ch>.
Hello Rafael


I think the important log lines are these:

ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO 
[c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e 
job-60216/job-191380 ctx-175d37df) Migration cancelled because state has 
changed: VM[User|i-638-1736-VM]
ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG 
[c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e 
job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to: 
Migration cancelled because state has changed: VM[User|i-638-1736-VM]
ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG 
[c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e 
job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to 
Stopping with event: StopRequestedvm's original host id: 224 new host 
id: 15 host id before state transition: 15

The really disturbing thing is that CS stops the VM again if we start it
manually, apparently because the migrating job is still running.

Note that the VM HA flag is enabled.
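
That fits the log line "Completed HAWork[1989-Migration-1411-Running-Migrating]"
further down: host maintenance appears to queue an HA "Migration" work item per
VM, and as long as such an item is still pending, every retry that cannot live
migrate the VM ends in another stop, even if the VM was just started by hand.
The snippet below is only a toy model of that loop, not CloudStack code, with
all names invented for the illustration:

import java.util.ArrayDeque;
import java.util.Queue;

// Toy model of the restart/stop loop described above; not CloudStack code.
public class HaRetryLoopDemo {

    public static void main(String[] args) {
        Queue<String> haWork = new ArrayDeque<>();
        haWork.add("HAWork[Migration-i-638-1736-VM]");   // queued by host maintenance

        boolean vmRunning = true;
        int retriesLeft = 3;   // pretend the pending item is retried a few times

        while (!haWork.isEmpty()) {
            String item = haWork.peek();
            boolean migrationSucceeded = false;   // in the quoted logs it is always cancelled
            if (migrationSucceeded || retriesLeft-- == 0) {
                haWork.poll();                    // only then is the VM left alone
            } else {
                vmRunning = false;                // fallback stop, as in the StopCommand
                System.out.println(item + ": migration cancelled -> VM stopped");
                vmRunning = true;                 // the operator starts it again by hand
                System.out.println("operator restarts the VM; the work item is still pending");
            }
        }
        System.out.println("work queue drained; VM stays " + (vmRunning ? "running" : "stopped"));
    }
}

Until the pending item is gone (or no longer re-triggered), manual starts keep
being undone, which is exactly the behaviour described above.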



Thank you.


Best Regards.

Francois






more details:



ewcstack-man02-prod: 2017-01-24 18:05:28,148 DEBUG 
[o.a.c.e.o.VolumeOrchestrator] (Work-Job-Executor-156:ctx-14b86f5e 
job-60216/job-191380 ctx-175d37df) Preparing 3 volumes for 
VM[User|i-638-1736-VM]
ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG 
[c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-122:ctx-a8bacc64 
job-14953/job-191351) Done with run of VM work job: 
com.cloud.vm.VmWorkMigrateAway for VM 1411, job origin: 14953
ewcstack-man02-prod: 2017-01-24 18:05:28,155 DEBUG 
[o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-122:ctx-a8bacc64 
job-14953/job-191351) Done executing com.cloud.vm.VmWorkMigrateAway for 
job-191351
ewcstack-man02-prod: 2017-01-24 18:05:28,160 INFO 
[o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-122:ctx-a8bacc64 
job-14953/job-191351) Remove job-191351 from job monitoring
ewcstack-man02-prod: 2017-01-24 18:05:28,168 INFO 
[c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-f3bcf43c work-1989) 
Completed HAWork[1989-Migration-1411-Running-Migrating]
ewcstack-man02-prod: 2017-01-24 18:05:28,246 DEBUG 
[c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e 
job-60216/job-191380 ctx-175d37df) Seq 19-8526158519542516375: 
Forwarding Seq 19-8526158519542516375:  { Cmd , MgmtId: 345049103441, 
via: 19(ewcstack-vh010-prod), Ver: v1, Flags: 100111, [{
"com.cloud.agent.api.PrepareForMigrationCommand":{"vm":{"id":1736,"name":"i-638-1736-VM","bootloader":"PyGrub","type":"User","cpus":4,"minSpeed":2100,"maxSpeed":2100,"minRam":17179869184,"maxRam":17179869184,"arch":"x86_64","os":"Other 
(64-bit)","platformEmulator":"Other install media","bootArgs":"","enable
HA":true,"limitCpuUse":true,"enableDynamicallyScaleVm":true,"vncPassword":"pC1WUC7h1DBH9J36q3H9pg==","params":{"memoryOvercommitRatio":"1.0","platform":"viridian:true;acpi:1;apic:true;pae:true;nx:true;timeoffset:-3601","keyboard":"us","Message.ReservedCapacityFreed.Flag":"false","cpuOvercommitRatio":"4.0","
hypervisortoolsversion":"xenserver61"},"uuid":"5bff01de-c033-4925-9e47-30dd14539272","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"1ab514b2-f2de-4d85-8e0d-9768184a1349","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f
6-5079-ed7a-c073579c7563","id":64,"poolType":"PreSetup","host":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"ROOT-1736","size":53687091200,"path":"d3
f87e3c-1efe-42ad-aa93-609d5b980a34","volumeId":8655,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","provisioningType":"THIN","id":8655,"deviceId":0,"hypervisorType":"XenServer"}},"diskSeq":0,"path":"d3f87e3c-1efe-42ad-aa93-609d5b980a34","type":"ROOT","_details":{"managed":"false","storagePort":"0",
"storageHost":"localhost","volumeSize":"53687091200"}},{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"733df9dc-5e24-4ab9-b22b-bce78322428e","volumeType":"DATADISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"b8751b89-00f6-5079-ed7a-c073579c7563","id":64,"
poolType":"PreSetup","host":"localhost","path":"/b8751b89-00f6-5079-ed7a-c073579c7563","port":0,"url":"PreSetup://localhost/b8751b89-00f6-5079-ed7a-c073579c7563/?ROLE=Primary&STOREUUID=b8751b89-00f6-5079-ed7a-c073579c7563"}},"name":"WELLA-DB1-NCSA_D","size":268435456000,"path":"265f168a-bb92-4591-a32b-ee3e2
2c8cb68","volumeId":8656,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","provisioningType":"THIN","id":8656,"deviceId":1,"hypervisorType":"XenServer"}},"diskSeq":1,"path":"265f168a-bb92-4591-a32b-ee3e22c8cb68","type":"DATADISK","_details":{"managed":"false","storagePort":"0","storageHost":"localhos
t","volumeSize":"268435456000"}},{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"bda6f335-b8a3-479c-8c0d-41b344d2a50d","volumeType":"DATADISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","id":89,"poolType":"PreSetup","
host":"localhost","path":"/09df3ca6-0a20-e4f0-3d79-12a6ed67a13d","port":0,"url":"PreSetup://localhost/09df3ca6-0a20-e4f0-3d79-12a6ed67a13d/?ROLE=Primary&STOREUUID=09df3ca6-0a20-e4f0-3d79-12a6ed67a13d"}},"name":"WELLA-DB1-NCSA_E","size":429496729600,"path":"612cb586-a70c-4896-b120-062ee7a67894","volumeId":86
57,"vmName":"i-638-1736-VM","accountId":638,"format":"VHD","provisioningType":"THIN","id":8657,"deviceId":2,"hypervisorType":"XenServer"}},"diskSeq":2,"path":"612cb586-a70c-4896-b120-062ee7a67894","type":"DATADISK","_details":{"managed":"false","storagePort":"0","storageHost":"localhost","volumeSize":"42949
6729600"}},{"data":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"uuid":"0e5bf4ff-caca-4b08-abed-7bbcdf3c42ef","id":201,"format":"ISO","accountId":1,"hvm":true,"displayText":"xen-pv-drv-iso","name":"xs-tools.iso","guestOsType":"CentOS 
4.5 (32-bit)","hypervisorType":"XenServer"}},"diskSeq":3,"type":"
ISO"}],"nics":[{"deviceId":0,"networkRateMbps":1000,"defaultNic":true,"pxeDisable":false,"nicUuid":"3523c3d3-9f92-4076-ad99-f8864d0f0edd","uuid":"e6fbb257-16e1-415f-88fb-9a79b263b3a8","ip":"192.168.1.42","netmask":"255.255.255.0","gateway":"192.168.1.1","mac":"02:00:0c:ae:00:0f","dns1":"8.8.8.8","dns2":"8.8
.4.4","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://2285","isolationUri":"vlan://2285","isSecurityGroupEnabled":false,"name":"bond0"}],"vcpuMaxLimit":16},"wait":0}}] 
} to 345049098498
ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request] 
(AgentManager-Handler-10:null) Seq 19-8526158519542516375: Processing:  
{ Ans: , MgmtId: 345049103441, via: 19, Ver: v1, Flags: 110, 
[{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":true,"wait":0}}] 
}
ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG 
[c.c.a.m.AgentAttache] (AgentManager-Handler-10:null) Seq 
19-8526158519542516375: No more commands found
ewcstack-man02-prod: 2017-01-24 18:05:28,399 DEBUG [c.c.a.t.Request] 
(Work-Job-Executor-156:ctx-14b86f5e job-60216/job-191380 ctx-175d37df) 
Seq 19-8526158519542516375: Received:  { Ans: , MgmtId: 345049103441, 
via: 19, Ver: v1, Flags: 110, { PrepareForMigrationAnswer } }

ewcstack-man02-prod: 2017-01-24 18:05:28,407 INFO 
[c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e 
job-60216/job-191380 ctx-175d37df) Migration cancelled because state has 
changed: VM[User|i-638-1736-VM]
ewcstack-man02-prod: 2017-01-24 18:05:28,407 DEBUG 
[c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e 
job-60216/job-191380 ctx-175d37df) Unable to migrate VM due to: 
Migration cancelled because state has changed: VM[User|i-638-1736-VM]
ewcstack-man02-prod: 2017-01-24 18:05:28,427 DEBUG 
[c.c.c.CapacityManagerImpl] (Work-Job-Executor-156:ctx-14b86f5e 
job-60216/job-191380 ctx-175d37df) VM state transitted from :Running to 
Stopping with event: StopRequestedvm's original host id: 224 new host 
id: 15 host id before state transition: 15

ewcstack-man02-prod: 2017-01-24 18:05:28,438 DEBUG 
[c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-156:ctx-14b86f5e 
job-60216/job-191380 ctx-175d37df) Seq 15-6889944479923437643: 
Forwarding Seq 15-6889944479923437643:  { Cmd , MgmtId: 345049103441, 
via: 15(ewcstack-vh011-prod), Ver: v1, Flags: 100011, 
[{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"checkBeforeCleanup":false,"vmName":"i-638-1736-VM","wait":0}}] 
} to 345049098498
ewcstack-man02-prod: 2017-01-24 18:05:30,335 DEBUG 
[c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA 
254-800498: Processing Seq 254-800498:  { Cmd , MgmtId: -1, via: 254, 
Ver: v1, Flags: 11, 
[{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":1530,"_loadInfo":"{\n 
\"connections\": []\n}","wait":0}}] }
ewcstack-man02-prod: 2017-01-24 18:05:30,340 DEBUG 
[c.c.a.m.AgentManagerImpl] (AgentManager-Handler-1:null) SeqA 
254-800498: Sending Seq 254-800498:  { Ans: , MgmtId: 345049103441, via: 
254, Ver: v1, Flags: 100010, 
[{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request] 
(AgentManager-Handler-11:null) Seq 15-6889944479923437632: Processing:  
{ Ans: , MgmtId: 345049103441, via: 15, Ver: v1, Flags: 10, 
[{"com.cloud.agent.api.StopAnswer":{"platform":"viridian:true;acpi:1;apic:true;pae:true;nx:true;timeoffset:3601","result":true,"details":"Stop 
VM i-638-1735-VM Succeed","wait":0}}] }
ewcstack-man02-prod: 2017-01-24 18:05:32,447 DEBUG [c.c.a.t.Request] 
(Work-Job-Executor-115:ctx-0d5db528 job-13733/job-191379 ctx-bbabfeae) 
Seq 15-6889944479923437632: Received:  { Ans: , MgmtId: 345049103441, 
via: 15, Ver: v1, Flags: 10, { StopAnswer } }
ewcstack-man02-prod: 2017-01-24 18:05:32,470 DEBUG 
[c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) VM[User|i-638-1735-VM] is stopped on 
the host. Proceeding to release resource held.
ewcstack-man02-prod: 2017-01-24 18:05:32,475 DEBUG 
[c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not 
supported in the network id=519
ewcstack-man02-prod: 2017-01-24 18:05:32,478 DEBUG 
[o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) Changing active number of nics for 
network id=519 on -1
ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG 
[o.a.c.e.o.NetworkOrchestrator] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) Asking VpcVirtualRouter to release 
NicProfile[17403-1735-3364083b-dcc0-4eb9-a7f9-f3e85bf8c7d3-192.168.1.11-null
ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG 
[c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) Successfully released network 
resources for the vm VM[User|i-638-1735-VM]
ewcstack-man02-prod: 2017-01-24 18:05:32,488 DEBUG 
[c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) Successfully released storage 
resources for the vm VM[User|i-638-1735-VM]
ewcstack-man02-prod: 2017-01-24 18:05:32,496 DEBUG 
[c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) VM state transitted from :Stopping to 
Stopped with event: OperationSucceededvm's original host id: 15 new host 
id: null host id before state transition: 15
ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG 
[c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) Hosts's actual total CPU: 82976 and 
CPU after applying overprovisioning: 331904
ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG 
[c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) Hosts's actual total RAM: 
261961927808 and RAM after applying overprovisioning: 261961924608
ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG 
[c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) release cpu from host: 15, old used: 
71300,reserved: 2100, actual total: 82976, total with overprovisioning: 
331904; new used: 67100,reserved:6300; movedfromreserved: 
false,moveToReserveredtrue
ewcstack-man02-prod: 2017-01-24 18:05:32,505 DEBUG 
[c.c.c.CapacityManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) release mem from host: 15, old used: 
98784247808,reserved: 1073741824, total: 261961924608; new used: 
94489280512,reserved:5368709120; movedfromreserved: 
false,moveToReserveredtrue
ewcstack-man02-prod: 2017-01-24 18:05:32,538 DEBUG 
[c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not 
supported in the network id=519
ewcstack-man02-prod: 2017-01-24 18:05:32,557 DEBUG 
[c.c.n.NetworkModelImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) Service SecurityGroup is not 
supported in the network id=519

ewcstack-man02-prod: 2017-01-24 18:05:32,588 ERROR 
[c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) Invocation exception, caused by: 
com.cloud.utils.exception.CloudRuntimeException: Unable to migrate 
VM[User|i-638-1735-VM]
ewcstack-man02-prod: 2017-01-24 18:05:32,589 INFO 
[c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379 ctx-bbabfeae) Rethrow exception 
com.cloud.utils.exception.CloudRuntimeException: Unable to migrate 
VM[User|i-638-1735-VM]
ewcstack-man02-prod: 2017-01-24 18:05:32,589 DEBUG 
[c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379) Done with run of VM work job: 
com.cloud.vm.VmWorkMigrateAway for VM 1735, job origin: 13733
ewcstack-man02-prod: 2017-01-24 18:05:32,589 ERROR 
[c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379) Unable to complete AsyncJobVO {id:191379, userId: 
1, accountId: 1, instanceType: null, instanceId: null, cmd: 
com.cloud.vm.VmWorkMigrateAway, cmdInfo: 
rO0ABXNyAB5jb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZUF3YXmt4MX4jtcEmwIAAUoACXNyY0hvc3RJZHhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAABAAAAAAAAAAEAAAAAAAAGx3QAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAA4A, 
cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, 
result: null, initMsid: 345049098498, completeMsid: null, lastUpdated: 
null, lastPolled: null, created: Tue Jan 24 18:05:00 CET 2017}, job 
origin:13733
ewcstack-man02-prod: com.cloud.utils.exception.CloudRuntimeException: 
Unable to migrate VM[User|i-638-1735-VM]
ewcstack-man02-prod:    at 
com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:2405)
ewcstack-man02-prod:    at 
com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrateAway(VirtualMachineManagerImpl.java:4517)
ewcstack-man02-prod:    at 
sun.reflect.GeneratedMethodAccessor402.invoke(Unknown Source)
ewcstack-man02-prod:    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
ewcstack-man02-prod:    at java.lang.reflect.Method.invoke(Method.java:606)
ewcstack-man02-prod:    at 
com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
ewcstack-man02-prod:    at 
com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4636)
ewcstack-man02-prod:    at 
com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:103)
ewcstack-man02-prod:    at 
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
ewcstack-man02-prod:    at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
ewcstack-man02-prod:    at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
ewcstack-man02-prod:    at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
ewcstack-man02-prod:    at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
ewcstack-man02-prod:    at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
ewcstack-man02-prod:    at 
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
ewcstack-man02-prod:    at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
ewcstack-man02-prod:    at 
java.util.concurrent.FutureTask.run(FutureTask.java:262)
ewcstack-man02-prod:    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
ewcstack-man02-prod:    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
ewcstack-man02-prod:    at java.lang.Thread.run(Thread.java:745)
ewcstack-man02-prod: 2017-01-24 18:05:32,590 DEBUG 
[o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379) Complete async job-191379, jobStatus: FAILED, 
resultCode: 0, result: 
rO0ABXNyAC9jb20uY2xvdWQudXRpbHMuZXhjZXB0aW9uLkNsb3VkUnVudGltZUV4Y2VwdGlvbgAAAABWTU9yAwABSQALY3NFcnJvckNvZGV4cgAaamF2YS5sYW5nLlJ1bnRpbWVFeGNlcHRpb26eXwZHCjSD5QIAAHhyABNqYXZhLmxhbmcuRXhjZXB0aW9u0P0fPho7HMQCAAB4cgATamF2YS5sYW5nLlRocm93YWJsZdXGNSc5d7jLAwAETAAFY2F1c2V0ABVMamF2YS9sYW5nL1Rocm93YWJsZTtMAA1kZXRhaWxNZXNzYWdldAASTGphdmEvbGFuZy9TdHJpbmc7WwAKc3RhY2tUcmFjZXQAHltMamF2YS9sYW5nL1N0YWNrVHJhY2VFbGVtZW50O0wAFHN1cHByZXNzZWRFeGNlcHRpb25zdAAQTGphdmEvdXRpbC9MaXN0O3hwcQB-AAh0AChVbmFibGUgdG8gbWlncmF0ZSBWTVtVc2VyfGktNjM4LTE3MzUtVk1ddXIAHltMamF2YS5sYW5nLlN0YWNrVHJhY2VFbGVtZW50OwJGKjw8_SI5AgAAeHAAAAAUc3IAG2phdmEubGFuZy5TdGFja1RyYWNlRWxlbWVudGEJxZomNt2FAgAESQAKbGluZU51bWJlckwADmRlY2xhcmluZ0NsYXNzcQB-AAVMAAhmaWxlTmFtZXEAfgAFTAAKbWV0aG9kTmFtZXEAfgAFeHAAAAlldAAmY29tLmNsb3VkLnZtLlZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGx0AB5WaXJ0dWFsTWFjaGluZU1hbmFnZXJJbXBsLmphdmF0ABZvcmNoZXN0cmF0ZU1pZ3JhdGVBd2F5c3EAfgAMAAARpXEAfgAOcQB-AA9xAH4AEHNxAH4ADP____90ACZzdW4ucmVmbGVjdC5HZW5lcmF0ZWRNZXRob2RBY2Nlc3NvcjQwMnB0AAZpbnZva2VzcQB-AAwAAAArdAAoc3VuLnJlZmxlY3QuRGVsZWdhdGluZ01ldGhvZEFjY2Vzc29ySW1wbHQAIURlbGVnYXRpbmdNZXRob2RBY2Nlc3NvckltcGwuamF2YXEAfgAUc3EAfgAMAAACXnQAGGphdmEubGFuZy5yZWZsZWN0Lk1ldGhvZHQAC01ldGhvZC5qYXZhcQB-ABRzcQB-AAwAAABrdAAiY29tLmNsb3VkLnZtLlZtV29ya0pvYkhhbmRsZXJQcm94eXQAGlZtV29ya0pvYkhhbmRsZXJQcm94eS5qYXZhdAAPaGFuZGxlVm1Xb3JrSm9ic3EAfgAMAAASHHEAfgAOcQB-AA9xAH4AHnNxAH4ADAAAAGd0ACBjb20uY2xvdWQudm0uVm1Xb3JrSm9iRGlzcGF0Y2hlcnQAGFZtV29ya0pvYkRpc3BhdGNoZXIuamF2YXQABnJ1bkpvYnNxAH4ADAAAAhl0AD9vcmcuYXBhY2hlLmNsb3Vkc3RhY2suZnJhbWV3b3JrLmpvYnMuaW1wbC5Bc3luY0pvYk1hbmFnZXJJbXBsJDV0ABhBc3luY0pvYk1hbmFnZXJJbXBsLmphdmF0AAxydW5JbkNvbnRleHRzcQB-AAwAAAAxdAA-b3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4dFJ1bm5hYmxlJDF0ABtNYW5hZ2VkQ29udGV4dFJ1bm5hYmxlLmphdmF0AANydW5zcQB-AAwAAAA4dABCb3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5pbXBsLkRlZmF1bHRNYW5hZ2VkQ29udGV4dCQxdAAaRGVmYXVsdE1hbmFnZWRDb250ZXh0LmphdmF0AARjYWxsc3EAfgAMAAAAZ3QAQG9yZy5hcGFjaGUuY2xvdWRzdGFjay5tYW5hZ2VkLmNvbnRleHQuaW1wbC5EZWZhdWx0TWFuYWdlZENvbnRleHRxAH4ALnQAD2NhbGxXaXRoQ29udGV4dHNxAH4ADAAAADVxAH4AMXEAfgAudAAOcnVuV2l0aENvbnRleHRzcQB-AAwAAAAudAA8b3JnLmFwYWNoZS5jbG91ZHN0YWNrLm1hbmFnZWQuY29udGV4dC5NYW5hZ2VkQ29udGV4dFJ1bm5hYmxlcQB-ACpxAH4AK3NxAH4ADAAAAe5xAH4AJXEAfgAmcQB-ACtzcQB-AAwAAAHXdAAuamF2YS51dGlsLmNvbmN1cnJlbnQuRXhlY3V0b3JzJFJ1bm5hYmxlQWRhcHRlcnQADkV4ZWN1dG9ycy5qYXZhcQB-AC9zcQB-AAwAAAEGdAAfamF2YS51dGlsLmNvbmN1cnJlbnQuRnV0dXJlVGFza3QAD0Z1dHVyZVRhc2suamF2YXEAfgArc3EAfgAMAAAEeXQAJ2phdmEudXRpbC5jb25jdXJyZW50LlRocmVhZFBvb2xFeGVjdXRvcnQAF1RocmVhZFBvb2xFeGVjdXRvci5qYXZhdAAJcnVuV29ya2Vyc3EAfgAMAAACZ3QALmphdmEudXRpbC5jb25jdXJyZW50LlRocmVhZFBvb2xFeGVjdXRvciRXb3JrZXJxAH4AQHEAfgArc3EAfgAMAAAC6XQAEGphdmEubGFuZy5UaHJlYWR0AAtUaHJlYWQuamF2YXEAfgArc3IAJmphdmEudXRpbC5Db2xsZWN0aW9ucyRVbm1vZGlmaWFibGVMaXN0_A8lMbXsjhACAAFMAARsaXN0cQB-AAd4cgAsamF2YS51dGlsLkNvbGxlY3Rpb25zJFVubW9kaWZpYWJsZUNvbGxlY3Rpb24ZQgCAy173HgIAAUwAAWN0ABZMamF2YS91dGlsL0NvbGxlY3Rpb247eHBzcgATamF2YS51dGlsLkFycmF5TGlzdHiB0h2Zx2GdAwABSQAEc2l6ZXhwAAAAAHcEAAAAAHhxAH4ATHgAABCadwgAAAAAAAAAAHg
ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG 
[o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379) Publish async job-191379 complete on message bus
ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG 
[o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379) Wake up jobs related to job-191379
ewcstack-man02-prod: 2017-01-24 18:05:32,592 DEBUG 
[o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379) Update db status for job-191379
ewcstack-man02-prod: 2017-01-24 18:05:32,594 DEBUG 
[o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379) Wake up jobs joined with job-191379 and disjoin 
all subjobs created from job- 191379
ewcstack-man02-prod: 2017-01-24 18:05:32,604 DEBUG 
[o.a.c.f.j.i.AsyncJobManagerImpl] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379) Done executing com.cloud.vm.VmWorkMigrateAway for 
job-191379
ewcstack-man02-prod: 2017-01-24 18:05:32,608 INFO 
[o.a.c.f.j.i.AsyncJobMonitor] (Work-Job-Executor-115:ctx-0d5db528 
job-13733/job-191379) Remove job-191379 from job monitoring
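
For completeness, a query along these lines should list which user VMs 
still have the HA flag set and would therefore be handled by the HA 
Migration worker during a host maintenance. Again just a sketch against 
the 4.5.x schema, untested:

  select id, instance_name, state, host_id
  from vm_instance
  where type = 'User'
    and ha_enabled = 1
    and removed is null;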


On 25.01.2017 15:53, Rafael Weingärtner wrote:
> Did you check the ACS log files looking for some exceptions when the
> migrations for this specific VM started?
>
> On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
> francois.scheurer@everyware.ch> wrote:
>
>> Dear CS contributors
>>
>>
>> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared primary
>> storage.
>>
>> When we put a host in maintenance, cs evacuates all VM's on other hosts.
>>
>> But it happens regularly that some VM's are stopped instead of live
>> migrated.
>>
>> If we restart that stopped VM's, CS will then stop them again later.
>>
>> We need to start them several times until they stays up.
>>
>> Is it a known issue? Is it fixed on CS 4.9.2 ?
>>
>>
>> Migrating manually all VM's is working but impracticable because too much
>> time consuming.
>>
>>
>> Many thanks in advance for your help.
>>
>>
>>
>> Best Regards
>> Francois
>>
>>
>>
>>
>>
>>
>> --
>>
>>
>> EveryWare AG
>> François Scheurer
>> Senior Systems Engineer
>> Zurlindenstrasse 52a
>> CH-8003 Zürich
>>
>> tel: +41 44 466 60 00
>> fax: +41 44 466 60 10
>> mail: francois.scheurer@everyware.ch
>> web: http://www.everyware.ch
>>
>
>

-- 


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheurer@everyware.ch
web: http://www.everyware.ch


Re: cs 4.5.1, vm stopped when putting xen hosts in maintenance

Posted by Rafael Weingärtner <ra...@gmail.com>.
Did you check the ACS log files looking for some exceptions when the
migrations for this specific VM started?

On Wed, Jan 25, 2017 at 9:49 AM, Francois Scheurer <
francois.scheurer@everyware.ch> wrote:

> Dear CS contributors
>
>
> We use CS 4.5.1 on a 3 Clusters with XenServer 6.5. and shared primary
> storage.
>
> When we put a host in maintenance, cs evacuates all VM's on other hosts.
>
> But it happens regularly that some VM's are stopped instead of live
> migrated.
>
> If we restart that stopped VM's, CS will then stop them again later.
>
> We need to start them several times until they stays up.
>
> Is it a known issue? Is it fixed on CS 4.9.2 ?
>
>
> Migrating manually all VM's is working but impracticable because too much
> time consuming.
>
>
> Many thanks in advance for your help.
>
>
>
> Best Regards
> Francois
>
>
>
>
>
>
> --
>
>
> EveryWare AG
> François Scheurer
> Senior Systems Engineer
> Zurlindenstrasse 52a
> CH-8003 Zürich
>
> tel: +41 44 466 60 00
> fax: +41 44 466 60 10
> mail: francois.scheurer@everyware.ch
> web: http://www.everyware.ch
>



-- 
Rafael Weingärtner