Posted to dev@cloudstack.apache.org by John Burwell <jo...@shapeblue.com> on 2016/09/26 06:10:14 UTC

4.8, 4.9, and master Branches Frozen for Testing

All,

Per our release schedule [1], the 4.8, 4.9, and master branches are frozen for testing.  There are some straggling PRs that Rajani and I are working to merge.  Is it acceptable to everyone that for the next two (2) weeks, all PRs require not only 2 LGTMs, but also approval by Rajani or me to be merged to these branches?  To be clear, we don’t have to perform the merges, simply give a thumbs up.

Thanks,
-John
john.burwell@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
@shapeblue
  
 


Re: 4.8, 4.9, and master Testing Status

Posted by Syed Ahmed <sa...@cloudops.com>.
@Paul

I'll test this out on my XS7 setup. I need a little help in getting started
with the details of how you are running them.

-Syed

On Wed, Oct 19, 2016 at 9:03 AM, Paul Angus <pa...@shapeblue.com>
wrote:

> To give a specific example,
>
> Running XenServer7 smoke tests results in a large number of failures due to
> problems with the tests.  If we're going to make any real forward progress,
> we need to be able to get meaningful test results.
>
> Pleeeeaaaaasssse review/test these PRs....
>
>
>
> pretty please (with sugar on top).
>
>
> Kind regards,
>
> Paul Angus
>
> paul.angus@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
>

RE: 4.8, 4.9, and master Testing Status

Posted by Paul Angus <pa...@shapeblue.com>.
To give a specific example,

Running XenServer7 smoke tests results in a large number of failures due to problems with the tests.  If we're going to make any real forward progress, we need to be able to get meaningful test results.

Pleeeeaaaaasssse review/test these PRs....



pretty please (with sugar on top).


Kind regards,

Paul Angus

paul.angus@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


-----Original Message-----
From: Rohit Yadav [mailto:rohit.yadav@shapeblue.com] 
Sent: 19 October 2016 12:39
To: dev@cloudstack.apache.org
Subject: Re: 4.8, 4.9, and master Testing Status

All,


At least one more test/review LGTM is required on the following PRs. Please help with your reviews/tests, as they will help us work towards cutting the RCs:


https://github.com/apache/cloudstack/pull/1692

https://github.com/apache/cloudstack/pull/1703

https://github.com/apache/cloudstack/pull/1708


As John shared in the previous email, we still have outstanding failures, and we'll be working towards fixing all three of them. If we can get these PRs merged before the end of the week, we can trigger tests on them again to learn the test status of each of the 4.8, 4.9, and master branches, which will help us determine the quality of each branch and work towards RCs.


Regards.

________________________________
From: John Burwell <jo...@shapeblue.com>
Sent: 14 October 2016 12:11:04
To: dev@cloudstack.apache.org
Subject: Re: 4.8, 4.9, and master Testing Status

All,

We have made great strides stabilizing the 4.8 [1] and 4.9 [2] smoke tests.  While we are not super green, the following remaining failures/issues are isolated to the VPC VR and secondary storage.

        * CLOUDSTACK-9541: redundant VPC VR: issues when master and backup switch happens on failover [3]
        * CLOUDSTACK-9540: createPrivateGateway create private network does not create proper VLAN network on XenServer [4]
        * CLOUDSTACK-9528: SSVM Downloads (built-in) template multiple times

Therefore, I would like to merge these two PRs so that we can begin the process of rebasing and retesting the PRs slotted for 4.8 and 4.9 that are not affected by these issues (i.e. PRs unrelated to secondary storage or the VR).  Our hope is that we can correct these issues quickly, and by the time we have worked through the backlog of pending PRs, these issues will be addressed and we can move those impacted forward.

Unfortunately, the master PR [5] has 6 failures and 4 errors on XenServer [6] that we are currently analyzing.  We hope to have these resolved shortly in order to begin progressing PRs targeting master.

I would like to get 1692 [1] and 1703 [2] merged in the next 24 hours.  We need to complete the following actions in order to accomplish this goal:

        * Obtain at least one code review LGTM on PR #1692 [1]
        * Obtain at least one code review LGTM on PR #1703 [2]
        * Obtain at least one test review LGTM on PR #1703 [2]

Once these PRs are merged, I will be updating PRs slotted for 4.8 and 4.9 to ping authors for a rebase.  Following each rebase, we will trigger blueorangutan to retest each one.

Thanks again for your patience and assistance,
-John

[1]: https://github.com/apache/cloudstack/pull/1692
[2]: https://github.com/apache/cloudstack/pull/1703
[3]: https://issues.apache.org/jira/browse/CLOUDSTACK-9541
[4]: https://issues.apache.org/jira/browse/CLOUDSTACK-9540
[5]: https://github.com/apache/cloudstack/pull/1708
[6]: https://github.com/apache/cloudstack/pull/1708#issuecomment-253698099

> On Oct 7, 2016, at 10:12 AM, Will Stevens <ws...@cloudops.com> wrote:
>
> Great work everyone.  Don't worry about the sporadic updates, that is 
> just the nature of the beast when working through stuff like this.  
> Well done so far...
>
> *Will STEVENS*
> Lead Developer
>
> *CloudOps* *| *Cloud Solutions Experts
> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6 w cloudops.com *|* tw 
> @CloudOps_
>
> On Fri, Oct 7, 2016 at 9:53 AM, John Burwell 
> <jo...@shapeblue.com>
> wrote:
>
>> All,
>>
>> Thank you Ilya and Haijiao for your words of encouragement.  In
>> addition to the efforts of Paul, Rohit, Murali, Abhi, and Bobby, 
>> Sergey Levitskiy has been providing great help testing VMware.
>>
>> I apologize for my sporadic status updates.  We have made significant 
>> progress in getting smoke tests to pass on KVM, XenServer, and VMware.
>> Currently, we have the following number of failures and errors:
>>
>>        * KVM: 0
>>        * VMware: 4
>>        * XenServer: 8
>>
>> The outstanding failures and errors seem to be caused by the
>> following issues:
>>
>>        1. On VMware and XenServer, guest VMs in VPCs start but don’t
>> acquire IP addresses, causing tests that rely on SSH connectivity
>> to fail.  The issue does not occur on KVM, occurs intermittently on
>> VMware, and occurs consistently on XenServer.  This issue affects the
>> test_vpc_redundant, test_privategw_acl, and test_vpc_vpn test suites.
>> We believe that this issue may be caused either by the guest VMs'
>> startup/DHCP wait period winning the race with the VPC VR
>> configuration or by a problem on the VPC VR assigning IP addresses.
>> We are currently investigating and expect to identify the root cause
>> shortly.
>>        2. SSVM downloads are being restarted due to ping timeouts on
>> XenServer and VMware.  We are seeing messages such as the following
>> in the Management Server logs:
>>
>>                com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:5,com.cloud.exception.OperationTimedoutException: Commands 9042102151853113352 to Host 5 timed out after 2400
>>
>>          Our initial investigation discovered different time zones
>> being used by the system VM templates and Management Server.  To
>> address this discrepancy, we have modified Trillian to ensure
>> consistent configuration of time zones across a cluster, and are
>> preparing another run for XenServer and VMware.  KVM is not affected
>> by this time zone issue because KVM hosts use the same CentOS template
>> as CentOS based Management Servers -- creating time zone consistency
>> by side effect.
>>
>> Reports of each test run are available on PR #1692 [1].  We have
>> kicked off a new round of tests on KVM, VMware, and XenServer with the
>> time zone fix and additional instrumentation to run down the VPC VR
>> race condition.
>>
>> Instead of directly forward merging these changes, we plan to open a
>> PR for each forward merge.  Since we are very close to having 4.8
>> resolved, Rohit has opened PR 1703 [2] for the 4.9 forward merge and
>> kicked off a test run.  While we cannot close this PR until 1692 is
>> complete, we are hoping to get a head start on any issues in the 4.9
>> branch.
>>
>> Thank you again for your patience,
>> -John
>>
>> [1]: https://github.com/apache/cloudstack/pull/1692
>> [2]: https://github.com/apache/cloudstack/pull/1703
>>
>>> On Oct 5, 2016, at 4:32 AM, Haijiao <18...@163.com> wrote:
>>>
>>> Though I am one of the silent majority, I would like to thank John and
>>> the dev team for your continuous effort; you keep ACS alive and better!
>>>
>>>
>>> Just heard that one of the biggest finance companies in China is running
>>> 10,000+ VMs on ACS 4.4 for production/dev/QAS; you guys should be proud
>>> of that.
>>> Salute to you!
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Oct 5, 2016, at 03:09, "ilya" <il...@gmail.com> wrote:
>>>
>>> John and Team
>>>
>>> Thanks for amazing work and contributing back.
>>>
>>> Regards,
>>> ilya
>>>
>>> On 10/3/16 9:48 PM, John Burwell wrote:
>>>> All,
>>>>
>>>> A quick update on our progress to pass all smoke tests aka super
>>>> green.  We have reduced the failures and errors for XenServer from 93
>>>> to 9 and for VMware from 51 to 14.  A CentOS 6/CentOS 6 KVM run is
>>>> currently executing.  Based on manual tests/fixes, we are expecting
>>>> it to be the first super green configuration.  We have also found the
>>>> following additional defects:
>>>>
>>>> * CLOUDSTACK-9528 [2]: SSVM Downloads (built-in) Template Multiple Times
>>>> * CLOUDSTACK-9529 [3]: Marvin Tests Do Not Clean Up Properly
>>>>
>>>> 9528 is causing XenServer environments to fail to install and start
>>>> up cleanly.  A lack of cleanup described in 9529 is causing XenServer
>>>> to exhaust available resources before a test run completes.  We
>>>> believe that resolution of these issues will address most, if not
>>>> all, of the XenServer issues.
>>>>
>>>> Thanks,
>>>> -John
>>>>
>>>> [1]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020
>>>> [2]: https://issues.apache.org/jira/browse/CLOUDSTACK-9528
>>>> [3]: https://issues.apache.org/jira/browse/CLOUDSTACK-9529
>>>>
>>>> On Sep 30, 2016, at 2:40 AM, John Burwell <jo...@shapeblue.com> wrote:
>>>>>
>>>>> All,
>>>>>
>>>>> Using blueorangutan, Rohit, Murali, Boris, Paul, Abhi, and I are
>>>>> executing the smoke tests for the 4.8, 4.9, and master branches
>>>>> against the following environments:
>>>>>
>>>>>   * CentOS 7.2 Management Server + VMware 5.5u3 + NFS Primary/Secondary Storage
>>>>>   * CentOS 7.2 Management Server + XenServer 6.5SP1 + NFS Primary/Secondary Storage
>>>>>   * CentOS 7.2 Management Server + CentOS 7.2 KVM + NFS Primary/Secondary Storage
>>>>>
>>>>> Thus far, we have found seven (7) test case and/or CloudStack defects
>>>>> in the VMware run for the 4.8 branch [1].  We are currently triaging
>>>>> fifty-one (51) new issues from the XenServer run to determine which
>>>>> issues are environmental and which are defects.  This triage work
>>>>> should be completed today (30 Sept 2016).  Finally, we are awaiting
>>>>> the results of the KVM run.
>>>>>
>>>>> We are using PR #1692 [2] as the master tracking PR to fix all defects
>>>>> in the 4.8 branch.  Our goal is to get all non-skip tests to pass and
>>>>> then merge this PR to 4.8, 4.9, and master.  For each bug, we are
>>>>> creating a JIRA ticket and adding a commit to the PR.  Currently, the
>>>>> branch for this PR is in the shapeblue repo (the branch started with a
>>>>> much smaller fix from Paul and we just kept using it).  However, if
>>>>> others are interested in picking up defects, we will move it to the
>>>>> ASF repo.  Once the 4.8 branch is stabilized, we plan to re-execute
>>>>> these tests on the 4.9 and master branches, as we expect that they
>>>>> will have additional issues.
>>>>>
>>>>> Since we are in a test freeze, I propose that no further PRs are
>>>>> merged to the 4.8, 4.9, and master branches until they are stabilized.
>>>>> The following PRs will be re-based, re-tested, and merged to 4.8,
>>>>> 4.9.1.0, and/or 4.10.0.0 post-stabilization:
>>>>>
>>>>>   * 1696
>>>>>   * 1694
>>>>>   * 1684
>>>>>   * 1681
>>>>>   * 1680
>>>>>   * 1678
>>>>>   * 1677
>>>>>   * 1676
>>>>>   * 1674
>>>>>   * 1673
>>>>>   * 1642
>>>>>   * 1624
>>>>>   * 1615
>>>>>   * 1600
>>>>>   * 1545
>>>>>   * 1542
>>>>>
>>>>> I recognize that this is a large backlog of contributions ready to
>>>>> merge, and apologize for asking folks to wait.  However, given the
>>>>> current state of the release branches, merging them before we finish
>>>>> fixing the smoke tests would create a moving target that would
>>>>> further delay stabilization.
>>>>>
>>>>> Obviously, it is unlikely we will make the 10 October 2016 release
>>>>> date for the 4.8.2.0, 4.9.1.0, and 4.10.0.0 releases.  At this point,
>>>>> it is difficult to estimate the size of the schedule slip because we
>>>>> still have issues to triage and test runs to complete.  I have created
>>>>> a wiki page [2] to track progress on this effort.
>>>>>
>>>>> Does this approach sound reasonable?  Any suggestions to speed up this
>>>>> process will be greatly appreciated, as stabilizing and re-opening
>>>>> these branches ASAP is critical for the community.
>>>>>
>>>>> Thanks,
>>>>> -John
>>>>>
>>>>> [1]: https://issues.apache.org/jira/browse/CLOUDSTACK-9518?jql=project%20%3D%20CLOUDSTACK%20AND%20fixVersion%20in%20(4.8.2.0)%20AND%20labels%20in%20(4.8.2.0-smoke-test-failure)
>>>>> [2]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020
>>>>>
>>>>>> On Sep 26, 2016, at 8:38 AM, Will Stevens <ws...@cloudops.com> wrote:
>>>>>>
>>>>>> Yes, I think it is important that you or Rajani sign off on anything
>>>>>> that gets in while branches are frozen so you guys can stay on top of
>>>>>> what goes in.
>>>>>>
>>>>>> Thanks for all the hard work team.  :)
>>>>>>
>>>>>> *Will STEVENS*
>>>>>> Lead Developer
>>>>>>
>>>>>> *CloudOps* *| *Cloud Solutions Experts
>>>>>> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
>>>>>> w cloudops.com *|* tw @CloudOps_
>>>>>>
>>>>>> On Mon, Sep 26, 2016 at 2:10 AM, John Burwell <john.burwell@shapeblue.com>
>>>>>> wrote:
>>>>>>
>>>>>>> All,
>>>>>>>
>>>>>>> Per our release schedule [1], the 4.8, 4.9, and master branches are
>>>>>>> frozen for testing.  There are some straggling PRs that Rajani and I
>>>>>>> are working to merge.  Is it acceptable to everyone that for the next
>>>>>>> two (2) weeks, all PRs require not only 2 LGTMs, but also approval by
>>>>>>> Rajani or me to be merged to these branches?  To be clear, we don’t
>>>>>>> have to perform the merges, simply give a thumbs up.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -John
>>>>>>> john.burwell@shapeblue.com
>>>>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>>>>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>>>>>>> @shapeblue




rohit.yadav@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


Re: 4.8, 4.9, and master Testing Status

Posted by Rohit Yadav <ro...@shapeblue.com>.
All,


At least one more test/review LGTM is required on the following PRs. Please help with your reviews/tests, as they will help us work towards cutting the RCs:


https://github.com/apache/cloudstack/pull/1692

https://github.com/apache/cloudstack/pull/1703

https://github.com/apache/cloudstack/pull/1708


As John shared in the previous email, we still have outstanding failures, and we'll be working towards fixing all three of them. If we can get these PRs merged before the end of the week, we can trigger tests on them again to learn the test status of each of the 4.8, 4.9, and master branches, which will help us determine the quality of each branch and work towards RCs.


Regards.

________________________________
From: John Burwell <jo...@shapeblue.com>
Sent: 14 October 2016 12:11:04
To: dev@cloudstack.apache.org
Subject: Re: 4.8, 4.9, and master Testing Status

All,

We have made great strides stabilizing the 4.8 [1] and 4.9 [2] smoke tests.  While we are not super green, the following remaining failures/issues are isolated to the VPC VR and secondary storage.

        * CLOUDSTACK-9541: redundant VPC VR: issues when master and backup switch happens on failover [3]
        * CLOUDSTACK-9540: createPrivateGateway create private network does not create proper VLAN network on XenServer [4]
        * CLOUDSTACK-9528: SSVM Downloads (built-in) template multiple times

Therefore, I would like to merge these two PRs so that we can begin the process of rebasing and retesting the PRs slotted for 4.8 and 4.9 that are not affected by these issues (i.e. PRs unrelated to secondary storage or the VR).  Our hope is that we can correct these issues quickly, and by the time we have worked through the backlog of pending PRs, these issues will be addressed and we can move those impacted forward.

Unfortunately, the master PR [5] has 6 failures and 4 errors on XenServer [6] that we are currently analyzing.  We hope to have these resolved shortly in order to begin progressing PRs targeting master.

I would like to get 1692 [1] and 1703 [2] merged in the next 24 hours.  We need to complete the following actions in order to accomplish this goal:

        * Obtain at least one code review LGTM on PR #1692 [1]
        * Obtain at least one code review LGTM on PR #1703 [2]
        * Obtain at least one test review LGTM on PR #1703 [2]

Once these PRs, I will be updating PRs slotted for 4.8 and 4.9 to ping authors for a rebase.  Following each rebase, we will trigger blueorangutan to retest each one.

Thank again for your patience and assistance,
-John

[1]: https://github.com/apache/cloudstack/pull/1692
[2]: https://github.com/apache/cloudstack/pull/1703
[3]: https://issues.apache.org/jira/browse/CLOUDSTACK-9541
[4]: https://issues.apache.org/jira/browse/CLOUDSTACK-9540
[5]: https://github.com/apache/cloudstack/pull/1708
[6]: https://github.com/apache/cloudstack/pull/1708#issuecomment-253698099

> On Oct 7, 2016, at 10:12 AM, Will Stevens <ws...@cloudops.com> wrote:
>
> Great work everyone.  Don't worry about the sporadic updates, that is just
> the nature of the beast when working through stuff like this.  Well done so
> far...
>
> *Will STEVENS*
> Lead Developer
>
> *CloudOps* *| *Cloud Solutions Experts
> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
> w cloudops.com *|* tw @CloudOps_
>
> On Fri, Oct 7, 2016 at 9:53 AM, John Burwell <jo...@shapeblue.com>
> wrote:
>
>> All,
>>
>> Thank you Ilya and Haijao for your words of encouragement.  In addition to
>> the efforts of Paul, Rohit, Murali, Abhi, and Bobby, Sergey Levitskiy has
>> been providing great help testing VMware.
>>
>> I apologize for my sporadic status updates.  We have made significant
>> progress in getting smoke tests to pass on KVM, XenServer, and VMware.
>> Currently, we have the following number of failures and errors:
>>
>>        * KVM: 0
>>        * VMware: 4
>>        * XenServer: 8
>>
>> The outstanding failures and errors seem to be the caused by the following
>> issues:
>>
>>        1. On VMware and XenServer, guest VMs in VPCs start but don’t
>> acquire IP addresses causing tests relying on SSH connectivity tests to
>> fail.  The issue occurs does not occur on KVM, intermittently on VMware,
>> and consistently on XenServer.  This issue affects the test_vpc_redundant,
>> test_privategw_acl, and test_vpc_vpn test suites.   We believe that this
>> issue may be caused by either the guest VMs startup/DHCP wait period
>> winning the race with the VPC VR configuration or there is a problem on the
>> VPC VR assigning IP addresses.  We are currently investigating and expect
>> to identify the root cause shortly.
>>        2. SSVM downloads str being restarted due to ping timeouts on
>> XenServer and VMware.  We are seeing the following messages such as the
>> following in the Management Server logs:
>>
>>                com.cloud.utils.exception.CloudRuntimeException: Failed
>> to send command, due to Agent:5,com.cloud.exception.OperationTimedoutException:
>> Commands
>>                9042102151853113352 to Host 5 timed out after 2400
>>
>>          Our initial investigation discovered different timezones being
>> used by the system VM templates and Management Server.  This discrepancy We
>> have modified Trillian to ensure consistent configuration of time zones
>> across a cluster, and are preparing another run for XenServer and VMware.
>> KVM is not affected by this time zone issue because KVM hosts use the same
>> CentOS template as CentOS based Management Servers -- creating time zone
>> consistency by side effect.
>>
>> Reports of each test run are available on PR #1692 [1].  We have kicked a
>> new round of tests on KVM, VMware, and XenServer with the time zone fix and
>> additional instrumentation to run down the VPC VR race condition.
>>
>> Instead of directly forward merging these changes, we plan to open a PR
>> for each forward merge.  Since we are very close to having 4.8 resolved,
>> Rohit has open PR 1703 [2] for the 4.9 forward merge and kicked off a test
>> run.  While we cannot close this PR until 1692 is complete, we are hoping
>> to get a head start on any issues in the 4.9 branch.
>>
>> Thank you again for your patience,
>> -John
>>
>> [1]: https://github.com/apache/cloudstack/pull/1692
>> [2]: https://github.com/apache/cloudstack/pull/1703
>>
>>> On Oct 5, 2016, at 4:32 AM, Haijiao <18...@163.com> wrote:
>>>
>>> Though I am one of the silent majority, I would thank John the dev team
>> for your continuous effort, you keep ACS alive and better !
>>>
>>>
>>> Just heard one of biggest finance company in China running 10,000+ VMs
>> on ACS 4.4 for production/dev/QAS,  you guys should be proud of that.
>>> Salute to you!
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> 在2016年10月05 03时09分, "ilya"<il...@gmail.com>写道:
>>>
>>> John and Team
>>>
>>> Thanks for amazing work and contributing back.
>>>
>>> Regards,
>>> ilya
>>>
>>> On 10/3/16 9:48 PM, John Burwell wrote:
>>>> All,
>>>>
>>>> A quick update on our progress to pass all smoke tests aka super
>> green.  We have reduced the failures and errors for XenServer from 93 to 9
>> and for VMware from 51 to 14.  A CentOS 6/CentOS 6 KVM run is currently
>> executing.  Based on manual tests/fixes, we are expecting to be the first
>> super green configuration.  We have also found the following additional
>> defects:
>>>>
>>>> * CLOUDSTACK-9528 [2]: SSVM Downloads (built-in) Template Multiple
>> Times
>>>> * CLOUDSTACK-9529 [3]: Marvin Tests Do Not Clean Up Properly
>>>>
>>>> 9528 is causing XenServer environments to fail to install and startup
>> cleanly.  A lack of cleanup described in 9529 is causing XenServer to
>> exhaust available resources before a test run completes.  We believe that
>> resolution of these issues will address most, if not all, of the XenServer
>> issues.
>>>>
>>>> Thanks,
>>>> -John
>>>>
>>>> [1]: https://cwiki.apache.org/confluence/pages/viewpage.
>> action?pageId=65873020
>>>> [2]: https://issues.apache.org/jira/browse/CLOUDSTACK-9528
>>>> [3]: https://issues.apache.org/jira/browse/CLOUDSTACK-9529
>>>>
>>>>>
>>>> john.burwell@shapeblue.com
>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>>>> @shapeblue
>>>>
>>>>
>>>>
>>>> On Sep 30, 2016, at 2:40 AM, John Burwell <jo...@shapeblue.com>
>> wrote:
>>>>>
>>>>> All,
>>>>>
>>>>> Using blueorganutan, Rohit, Murali, Boris, Paul, Abhi, and I are
>> executing the smoke tests for the 4.8, 4.9, and master branches against the
>> following environments:
>>>>>
>>>>>   * CentOS 7.2 Management Server + VMware 5.5u3 + NFS
>> Primary/Secondary Storage
>>>>>   * CentOS 7.2 Management Server + XenServer 6.5SP1 + NFS
>> Primary/Secondary Storage
>>>>>   * CentOS 7.2 Management Server + CentOS 7.2 KVM + NFS
>> Primary/Secondary Storage
>>>>>
>>>>> Thus far, we have found seven (7) test case and/or CloudStack defects
>> in the VMware run for the 4.8 branch [1].  We are currently triaging
>> fifty-one (51) new issues from the XenServer run to determine which issues
>> were environmental and defects.  This triage work should be completed today
>> (30 Sept 2016).  Finally, we are awaiting the results of the KVM run.
>>>>>
>>>>> We are using PR #1692 [2] as the master tracking PR to fix all defects
>> in the 4.8 branch.  Our goal is to get all non-skip tests to pass and then
>> merge this PR to the 4.8, 4.9, and master.  For each bug, we are creating a
>> JIRA ticket and adding a commit to the PR.  Currently, the branch for this
>> PR is in the shapeblue repo (the branch started with a much smaller fix
>> from Paul and we just kept using it).  However, if others are interested in
>> picking up defects, we will move it to ASF repo.  Once the 4.8 branch is
>> stabilized, we plan to re-execute these tests on the 4.9 and master
>> branches as we expect that the 4.9 and master branches will have additional
>> issues.
>>>>>
>>>>> Since we are in a test freeze, I propose that no further PRs are
>> merged to the 4.8, 4.9, and master branches until they are stabilized.  The
>> following PRs will be re-based, re-tested, and merged to 4.8, 4.9.1.0,
>> and/or 4.10.0.0 post-stabilization:
>>>>>
>>>>>   * 1696
>>>>>   * 1694
>>>>>   * 1684
>>>>>    * 1681
>>>>>   * 1680
>>>>>   * 1678
>>>>>   * 1677
>>>>>   * 1676
>>>>>   * 1674
>>>>>   * 1673
>>>>>   * 1642
>>>>>   * 1624
>>>>>   * 1615
>>>>>   * 1600
>>>>>   * 1545
>>>>>   * 1542
>>>>>
>>>>> I recognize that this a large backlog of contributions ready to merge,
>> and apologize for asking folks to wait.  However, given current state of
>> the release branches, merging them before we complete fixing the smoke
>> tests would create a moving target that further delay stabilization.
>>>>>
>>>>> Obviously, it is unlikely we will make the 10 October 2016 release
>> date for the 4.8.2.0, 4.9.1.0, and 4.10.0.0 releases.  At this point, it is
>> difficult to estimate the size of the schedule slip because we still have
>> issues to triage and test runs to complete.  I have created a wiki page [2]
>> to track progress on this effort.
>>>>>
>>>>> Does this approach sound reasonable?  Any suggestions to speed up this
>> process will be greatly appreciated as stabilizing and re-opening these
>> branches stable ASAP is critical for the community.
>>>>>
>>>>> Thanks,
>>>>> -John
>>>>>
>>>>> [1]: https://issues.apache.org/jira/browse/CLOUDSTACK-9518?
>> jql=project%20%3D%20CLOUDSTACK%20AND%20fixVersion%20in%20(4.8.2.0)%
>> 20AND%20labels%20in%20(4.8.2.0-smoke-test-failure)
>>>>> [2]: https://cwiki.apache.org/confluence/pages/viewpage.
>> action?pageId=65873020
>>>>>
>>>>>> On Sep 26, 2016, at 8:38 AM, Will Stevens <ws...@cloudops.com>
>> wrote:
>>>>>>
>>>>>> Yes, I think it is important that you or Rajani sign off on anything
>> that
>>>>>> gets in while branches are frozen so you guys can stay on top of what
>> goes
>>>>>> in.
>>>>>>
>>>>>> Thanks for all the hard work team.  :)
>>>>>>
>>>>>> *Will STEVENS*
>>>>>> Lead Developer
>>>>>>
>>>>>> *CloudOps* *| *Cloud Solutions Experts
>>>>>> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
>>>>>> w cloudops.com *|* tw @CloudOps_
>>>>>>
>>>>>> On Mon, Sep 26, 2016 at 2:10 AM, John Burwell <
>> john.burwell@shapeblue.com>
>>>>>> wrote:
>>>>>>
>>>>>>> All,
>>>>>>>
>>>>>>> Per our release schedule [1], the 4.8, 4.9, and master branches are
>> frozen
>>>>>>> for testing.  There are some straggling PRs that Rajani and I are
>> working
>>>>>>> to merge.  Is it acceptable to everyone that for the next two (2)
>> weeks,
>>>>>>> all PRs require not only 2 LGTMs, but approval by Rajani or I to be
>> merged
>>>>>>> to these branches?  To be clear, we don’t have to perform the merges,
>>>>>>> simply give a thumbs up.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -John
>>>>>>> john.burwell@shapeblue.com
>>>>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>>>>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>>>>>>> @shapeblue
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>> john.burwell@shapeblue.com
>>>>> www.shapeblue.com<http://www.shapeblue.com>
>>>>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>>>>> @shapeblue
>>>>>
>>>>>
>>>>>
>>>>
>>
>>
>> john.burwell@shapeblue.com
>> www.shapeblue.com<http://www.shapeblue.com>
>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>> @shapeblue
>>
>>
>>
>>


john.burwell@shapeblue.com
www.shapeblue.com<http://www.shapeblue.com>
53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
@shapeblue




rohit.yadav@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


Re: 4.8, 4.9, and master Testing Status

Posted by John Burwell <jo...@shapeblue.com>.
All,

We have made great strides stabilizing the 4.8 [1] and 4.9 [2] smoke tests.  While we are not yet super green, the remaining failures/issues are isolated to the VPC VR and secondary storage:

	* CLOUDSTACK-9541: redundant VPC VR: issues when master and backup switch happens on failover [3]
	* CLOUDSTACK-9540: createPrivateGateway create private network does not create proper VLAN network on XenServer [4]
	* CLOUDSTACK-9528: SSVM Downloads (built-in) template multiple times

Therefore, I would like to merge these two PRs (1692 [1] and 1703 [2]) so that we can begin the process of rebasing and retesting the PRs slotted for 4.8 and 4.9 that are not affected by these issues (i.e. PRs unrelated to secondary storage or the VR).  Our hope is that we can correct these issues quickly, so that by the time we have worked through the backlog of pending PRs, these issues will be addressed and we can move the impacted PRs forward.

Unfortunately, the master PR [5] has 6 failures and 4 errors on XenServer [6] that we are currently analyzing.  We hope to have these resolved shortly in order to begin progressing PRs targeting master.

I would like to get 1692 [1] and 1703 [2] merged in the next 24 hours.  We need to complete the following actions in order to accomplish this goal:

	* Obtain at least one code review LGTM on PR #1692 [1]
	* Obtain at least one code review LGTM on PR #1703 [2]
	* Obtain at least one test review LGTM on PR #1703 [2]

Once these PRs are merged, I will update the PRs slotted for 4.8 and 4.9 to ping their authors for a rebase.  Following each rebase, we will trigger blueorangutan to retest each one.

Thanks again for your patience and assistance,
-John

[1]: https://github.com/apache/cloudstack/pull/1692
[2]: https://github.com/apache/cloudstack/pull/1703
[3]: https://issues.apache.org/jira/browse/CLOUDSTACK-9541
[4]: https://issues.apache.org/jira/browse/CLOUDSTACK-9540
[5]: https://github.com/apache/cloudstack/pull/1708
[6]: https://github.com/apache/cloudstack/pull/1708#issuecomment-253698099



john.burwell@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
@shapeblue
  
 


Re: 4.8, 4.9, and master Testing Status

Posted by Will Stevens <ws...@cloudops.com>.
Great work everyone.  Don't worry about the sporadic updates, that is just
the nature of the beast when working through stuff like this.  Well done so
far...

*Will STEVENS*
Lead Developer

*CloudOps* *| *Cloud Solutions Experts
420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
w cloudops.com *|* tw @CloudOps_

On Fri, Oct 7, 2016 at 9:53 AM, John Burwell <jo...@shapeblue.com>
wrote:

> All,
>
> Thank you Ilya and Haijiao for your words of encouragement.  In addition to
> the efforts of Paul, Rohit, Murali, Abhi, and Bobby, Sergey Levitskiy has
> been providing great help testing VMware.
>
> I apologize for my sporadic status updates.  We have made significant
> progress in getting smoke tests to pass on KVM, XenServer, and VMware.
> Currently, we have the following number of failures and errors:
>
>         * KVM: 0
>         * VMware: 4
>         * XenServer: 8
>
> The outstanding failures and errors seem to be caused by the following
> issues:
>
>         1. On VMware and XenServer, guest VMs in VPCs start but don’t
> acquire IP addresses, causing tests that rely on SSH connectivity to
> fail.  The issue does not occur on KVM, occurs intermittently on VMware,
> and occurs consistently on XenServer.  This issue affects the test_vpc_redundant,
> test_privategw_acl, and test_vpc_vpn test suites.   We believe that this
> issue may be caused by either the guest VMs' startup/DHCP wait period
> winning the race with the VPC VR configuration or a problem with the
> VPC VR assigning IP addresses.  We are currently investigating and expect
> to identify the root cause shortly.
>         2. SSVM downloads are being restarted due to ping timeouts on
> XenServer and VMware.  We are seeing messages such as the
> following in the Management Server logs:
>
>                 com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:5,com.cloud.exception.OperationTimedoutException: Commands 9042102151853113352 to Host 5 timed out after 2400
>
>           Our initial investigation discovered different time zones being
> used by the system VM templates and Management Server.  To address this discrepancy, we
> have modified Trillian to ensure consistent configuration of time zones
> across a cluster, and are preparing another run for XenServer and VMware.
> KVM is not affected by this time zone issue because KVM hosts use the same
> CentOS template as CentOS-based Management Servers -- creating time zone
> consistency by side effect.
>
> Reports of each test run are available on PR #1692 [1].  We have kicked off a
> new round of tests on KVM, VMware, and XenServer with the time zone fix and
> additional instrumentation to run down the VPC VR race condition.
>
> Instead of directly forward merging these changes, we plan to open a PR
> for each forward merge.  Since we are very close to having 4.8 resolved,
> Rohit has opened PR 1703 [2] for the 4.9 forward merge and kicked off a test
> run.  While we cannot close this PR until 1692 is complete, we are hoping
> to get a head start on any issues in the 4.9 branch.
>
> Thank you again for your patience,
> -John
>
> [1]: https://github.com/apache/cloudstack/pull/1692
> [2]: https://github.com/apache/cloudstack/pull/1703
>
> > On Oct 5, 2016, at 4:32 AM, Haijiao <18...@163.com> wrote:
> >
> > Though I am one of the silent majority, I would like to thank John and the dev team
> for your continuous effort; you keep ACS alive and make it better!
> >
> >
> > Just heard that one of the biggest finance companies in China is running 10,000+ VMs
> on ACS 4.4 for production/dev/QAS; you guys should be proud of that.
> > Salute to you!
> >
> >
> >
> >
> >
> >
> >
> > On 2016-10-05 at 03:09, "ilya" <il...@gmail.com> wrote:
> >
> > John and Team
> >
> > Thanks for amazing work and contributing back.
> >
> > Regards,
> > ilya
> >
> > On 10/3/16 9:48 PM, John Burwell wrote:
> >> All,
> >>
> >> A quick update on our progress to pass all smoke tests aka super
> green.  We have reduced the failures and errors for XenServer from 93 to 9
> and for VMware from 51 to 14.  A CentOS 6/CentOS 6 KVM run is currently
> executing.  Based on manual tests/fixes, we are expecting it to be the first
> super green configuration.  We have also found the following additional
> defects:
> >>
> >>  * CLOUDSTACK-9528 [2]: SSVM Downloads (built-in) Template Multiple
> Times
> >>  * CLOUDSTACK-9529 [3]: Marvin Tests Do Not Clean Up Properly
> >>
> >> 9528 is causing XenServer environments to fail to install and startup
> cleanly.  A lack of cleanup described in 9529 is causing XenServer to
> exhaust available resources before a test run completes.  We believe that
> resolution of these issues will address most, if not all, of the XenServer
> issues.
> >>
> >> Thanks,
> >> -John
> >>
> >> [1]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020
> >> [2]: https://issues.apache.org/jira/browse/CLOUDSTACK-9528
> >> [3]: https://issues.apache.org/jira/browse/CLOUDSTACK-9529
> >>
> >>>
> >> john.burwell@shapeblue.com
> >> www.shapeblue.com
> >> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
> >> @shapeblue
> >>
> >>
> >>
> >> On Sep 30, 2016, at 2:40 AM, John Burwell <jo...@shapeblue.com>
> wrote:
> >>>
> >>> All,
> >>>
> >>> Using blueorangutan, Rohit, Murali, Boris, Paul, Abhi, and I are
> executing the smoke tests for the 4.8, 4.9, and master branches against the
> following environments:
> >>>
> >>>    * CentOS 7.2 Management Server + VMware 5.5u3 + NFS
> Primary/Secondary Storage
> >>>    * CentOS 7.2 Management Server + XenServer 6.5SP1 + NFS
> Primary/Secondary Storage
> >>>    * CentOS 7.2 Management Server + CentOS 7.2 KVM + NFS
> Primary/Secondary Storage
> >>>
> >>> Thus far, we have found seven (7) test case and/or CloudStack defects
> in the VMware run for the 4.8 branch [1].  We are currently triaging
> fifty-one (51) new issues from the XenServer run to determine which issues
> were environmental and which were defects.  This triage work should be completed today
> (30 Sept 2016).  Finally, we are awaiting the results of the KVM run.
> >>>
> >>> We are using PR #1692 [2] as the master tracking PR to fix all defects
> in the 4.8 branch.  Our goal is to get all non-skip tests to pass and then
> merge this PR to the 4.8, 4.9, and master branches.  For each bug, we are creating a
> JIRA ticket and adding a commit to the PR.  Currently, the branch for this
> PR is in the shapeblue repo (the branch started with a much smaller fix
> from Paul and we just kept using it).  However, if others are interested in
> picking up defects, we will move it to the ASF repo.  Once the 4.8 branch is
> stabilized, we plan to re-execute these tests on the 4.9 and master
> branches as we expect that the 4.9 and master branches will have additional
> issues.
> >>>
> >>> Since we are in a test freeze, I propose that no further PRs are
> merged to the 4.8, 4.9, and master branches until they are stabilized.  The
> following PRs will be re-based, re-tested, and merged to 4.8, 4.9.1.0,
> and/or 4.10.0.0 post-stabilization:
> >>>
> >>>    * 1696
> >>>    * 1694
> >>>    * 1684
> >>>    * 1681
> >>>    * 1680
> >>>    * 1678
> >>>    * 1677
> >>>    * 1676
> >>>    * 1674
> >>>    * 1673
> >>>    * 1642
> >>>    * 1624
> >>>    * 1615
> >>>    * 1600
> >>>    * 1545
> >>>    * 1542
> >>>
> >>> I recognize that this is a large backlog of contributions ready to merge,
> and apologize for asking folks to wait.  However, given the current state of
> the release branches, merging them before we complete fixing the smoke
> tests would create a moving target that would further delay stabilization.
> >>>
> >>> Obviously, it is unlikely we will make the 10 October 2016 release
> date for the 4.8.2.0, 4.9.1.0, and 4.10.0.0 releases.  At this point, it is
> difficult to estimate the size of the schedule slip because we still have
> issues to triage and test runs to complete.  I have created a wiki page [2]
> to track progress on this effort.
> >>>
> >>> Does this approach sound reasonable?  Any suggestions to speed up this
> process will be greatly appreciated as stabilizing and re-opening these
> branches stable ASAP is critical for the community.
> >>>
> >>> Thanks,
> >>> -John
> >>>
> >>> [1]: https://issues.apache.org/jira/browse/CLOUDSTACK-9518?
> jql=project%20%3D%20CLOUDSTACK%20AND%20fixVersion%20in%20(4.8.2.0)%
> 20AND%20labels%20in%20(4.8.2.0-smoke-test-failure)
> >>> [2]: https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=65873020
> >>>
> >>>> On Sep 26, 2016, at 8:38 AM, Will Stevens <ws...@cloudops.com>
> wrote:
> >>>>
> >>>> Yes, I think it is important that you or Rajani sign off on anything
> that
> >>>> gets in while branches are frozen so you guys can stay on top of what
> goes
> >>>> in.
> >>>>
> >>>> Thanks for all the hard work team.  :)
> >>>>
> >>>> *Will STEVENS*
> >>>> Lead Developer
> >>>>
> >>>> *CloudOps* *| *Cloud Solutions Experts
> >>>> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
> >>>> w cloudops.com *|* tw @CloudOps_
> >>>>
> >>>> On Mon, Sep 26, 2016 at 2:10 AM, John Burwell <
> john.burwell@shapeblue.com>
> >>>> wrote:
> >>>>
> >>>>> All,
> >>>>>
> >>>>> Per our release schedule [1], the 4.8, 4.9, and master branches are
> frozen
> >>>>> for testing.  There are some straggling PRs that Rajani and I are
> working
> >>>>> to merge.  Is it acceptable to everyone that for the next two (2)
> weeks,
> >>>>> all PRs require not only 2 LGTMs, but approval by Rajani or I to be
> merged
> >>>>> to these branches?  To be clear, we don’t have to perform the merges,
> >>>>> simply give a thumbs up.
> >>>>>
> >>>>> Thanks,
> >>>>> -John
> >>>>> john.burwell@shapeblue.com
> >>>>> www.shapeblue.com
> >>>>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
> >>>>> @shapeblue
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>
> >>>
> >>> john.burwell@shapeblue.com
> >>> www.shapeblue.com
> >>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
> >>> @shapeblue
> >>>
> >>>
> >>>
> >>
>
>
> john.burwell@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
> @shapeblue
>
>
>
>

Re: 4.8, 4.9, and master Testing Status

Posted by John Burwell <jo...@shapeblue.com>.
All,

Thank you, Ilya and Haijiao, for your words of encouragement.  In addition to the efforts of Paul, Rohit, Murali, Abhi, and Bobby, Sergey Levitskiy has been providing great help testing VMware.

I apologize for my sporadic status updates.  We have made significant progress in getting smoke tests to pass on KVM, XenServer, and VMware.  Currently, we have the following number of failures and errors:

	* KVM: 0
	* VMware: 4
	* XenServer: 8

The outstanding failures and errors seem to be caused by the following issues:

	1. On VMware and XenServer, guest VMs in VPCs start but don’t acquire IP addresses, causing tests that rely on SSH connectivity to fail.  The issue does not occur on KVM, occurs intermittently on VMware, and occurs consistently on XenServer.  This issue affects the test_vpc_redundant, test_privategw_acl, and test_vpc_vpn test suites.  We believe that this issue may be caused either by the guest VM's startup/DHCP wait period winning the race with the VPC VR configuration or by a problem with the VPC VR assigning IP addresses.  We are currently investigating and expect to identify the root cause shortly.
	2. SSVM downloads are being restarted due to ping timeouts on XenServer and VMware.  We are seeing messages such as the following in the Management Server logs:

		com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:5,com.cloud.exception.OperationTimedoutException: Commands 
		9042102151853113352 to Host 5 timed out after 2400

	  Our initial investigation discovered that the system VM templates and the Management Server were using different time zones.  We have modified Trillian to ensure consistent configuration of time zones across a cluster, and are preparing another run for XenServer and VMware.  KVM is not affected by this time zone issue because KVM hosts use the same CentOS template as CentOS-based Management Servers, which creates time zone consistency as a side effect.
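
To illustrate how a time zone mismatch of the kind described above could produce spurious timeouts, here is a minimal Python sketch.  It is not CloudStack code: the function name, the chosen zones, and the reading of 2400 as a number of seconds are all assumptions made for illustration, and the actual mechanism behind the SSVM ping timeouts has not been confirmed.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold; the unit (seconds) is an assumption, the value is
# borrowed from the "timed out after 2400" log message.
PING_TIMEOUT = timedelta(seconds=2400)


def is_timed_out(last_ping: datetime, now: datetime) -> bool:
    """Return True when the gap between the last ping and now exceeds the timeout."""
    return now - last_ping > PING_TIMEOUT


# Both hosts observe the same instant, but their clocks are configured for
# different zones (UTC and UTC+5:30 are hypothetical choices).
instant = datetime(2016, 10, 19, 12, 0, 0, tzinfo=timezone.utc)
server_tz = timezone(timedelta(hours=5, minutes=30))

# Naive wall-clock values, as each host would record them without zone info.
agent_wall = instant.replace(tzinfo=None)
server_wall = instant.astimezone(server_tz).replace(tzinfo=None)

# Comparing naive values yields a phantom 5.5-hour gap and a false timeout.
naive_result = is_timed_out(agent_wall, server_wall)

# Comparing zone-aware values of the same instant yields a zero gap.
aware_result = is_timed_out(instant, instant.astimezone(server_tz))

print(naive_result, aware_result)
```

Under these assumptions, the naive comparison reports a timeout even though no time has elapsed, while the zone-aware comparison does not.  Configuring every host in a cluster with the same time zone, as the Trillian change does, avoids the first case.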

Reports for each test run are available on PR #1692 [1].  We have kicked off a new round of tests on KVM, VMware, and XenServer with the time zone fix and additional instrumentation to track down the VPC VR race condition.

Instead of directly forward merging these changes, we plan to open a PR for each forward merge.  Since we are very close to having 4.8 resolved, Rohit has opened PR 1703 [2] for the 4.9 forward merge and kicked off a test run.  While we cannot close this PR until 1692 is complete, we are hoping to get a head start on any issues in the 4.9 branch.

Thank you again for your patience,
-John

[1]: https://github.com/apache/cloudstack/pull/1692
[2]: https://github.com/apache/cloudstack/pull/1703

> On Oct 5, 2016, at 4:32 AM, Haijiao <18...@163.com> wrote:
> 
> Though I am one of the silent majority, I would like to thank John and the dev team for your continuous effort; you keep ACS alive and make it better!
> 
> 
> I just heard that one of the biggest finance companies in China is running 10,000+ VMs on ACS 4.4 for production/dev/QAS; you guys should be proud of that.
> Salute to you!
> 
> 
> 
> 
> 
> 
> 
> On 2016-10-05 at 03:09, "ilya" <il...@gmail.com> wrote:
> 
> John and Team
> 
> Thanks for amazing work and contributing back.
> 
> Regards,
> ilya
> 
> On 10/3/16 9:48 PM, John Burwell wrote:
>> All,
>> 
>> A quick update on our progress to pass all smoke tests aka super green.  We have reduced the failures and errors for XenServer from 93 to 9 and for VMware from 51 to 14.  A CentOS 6/CentOS 6 KVM run is currently executing.  Based on manual tests/fixes, we expect it to be the first super green configuration.  We have also found the following additional defects:
>> 
>>  * CLOUDSTACK-9528 [2]: SSVM Downloads (built-in) Template Multiple Times
>>  * CLOUDSTACK-9529 [3]: Marvin Tests Do Not Clean Up Properly
>> 
>> 9528 is causing XenServer environments to fail to install and start up cleanly.  The lack of cleanup described in 9529 is causing XenServer to exhaust available resources before a test run completes.  We believe that resolution of these issues will address most, if not all, of the XenServer issues.
>> 
>> Thanks,
>> -John
>> 
>> [1]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020
>> [2]: https://issues.apache.org/jira/browse/CLOUDSTACK-9528
>> [3]: https://issues.apache.org/jira/browse/CLOUDSTACK-9529
>> 
>>> 
>> john.burwell@shapeblue.com
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>> @shapeblue
>> 
>> 
>> 
>> On Sep 30, 2016, at 2:40 AM, John Burwell <jo...@shapeblue.com> wrote:
>>> 
>>> All,
>>> 
>>> Using blueorganutan, Rohit, Murali, Boris, Paul, Abhi, and I are executing the smoke tests for the 4.8, 4.9, and master branches against the following environments:
>>> 
>>>    * CentOS 7.2 Management Server + VMware 5.5u3 + NFS Primary/Secondary Storage
>>>    * CentOS 7.2 Management Server + XenServer 6.5SP1 + NFS Primary/Secondary Storage
>>>    * CentOS 7.2 Management Server + CentOS 7.2 KVM + NFS Primary/Secondary Storage
>>> 
>>> Thus far, we have found seven (7) test case and/or CloudStack defects in the VMware run for the 4.8 branch [1].  We are currently triaging fifty-one (51) new issues from the XenServer run to determine which issues were environmental and defects.  This triage work should be completed today (30 Sept 2016).  Finally, we are awaiting the results of the KVM run.  
>>> 
>>> We are using PR #1692 [2] as the master tracking PR to fix all defects in the 4.8 branch.  Our goal is to get all non-skip tests to pass and then merge this PR to the 4.8, 4.9, and master.  For each bug, we are creating a JIRA ticket and adding a commit to the PR.  Currently, the branch for this PR is in the shapeblue repo (the branch started with a much smaller fix from Paul and we just kept using it).  However, if others are interested in picking up defects, we will move it to ASF repo.  Once the 4.8 branch is stabilized, we plan to re-execute these tests on the 4.9 and master branches as we expect that the 4.9 and master branches will have additional issues.
>>> 
>>> Since we are in a test freeze, I propose that no further PRs are merged to the 4.8, 4.9, and master branches until they are stabilized.  The following PRs will be re-based, re-tested, and merged to 4.8, 4.9.1.0, and/or 4.10.0.0 post-stabilization:
>>> 
>>>    * 1696
>>>    * 1694
>>>    * 1684
>>>     * 1681
>>>    * 1680
>>>    * 1678
>>>    * 1677
>>>    * 1676
>>>    * 1674
>>>    * 1673
>>>    * 1642
>>>    * 1624
>>>    * 1615
>>>    * 1600
>>>    * 1545
>>>    * 1542
>>> 
>>> I recognize that this a large backlog of contributions ready to merge, and apologize for asking folks to wait.  However, given current state of the release branches, merging them before we complete fixing the smoke tests would create a moving target that further delay stabilization.  
>>> 
>>> Obviously, it is unlikely we will make the 10 October 2016 release date for the 4.8.2.0, 4.9.1.0, and 4.10.0.0 releases.  At this point, it is difficult to estimate the size of the schedule slip because we still have issues to triage and test runs to complete.  I have created a wiki page [2] to track progress on this effort.  
>>> 
>>> Does this approach sound reasonable?  Any suggestions to speed up this process will be greatly appreciated as stabilizing and re-opening these branches stable ASAP is critical for the community.
>>> 
>>> Thanks,
>>> -John
>>> 
>>> [1]: https://issues.apache.org/jira/browse/CLOUDSTACK-9518?jql=project%20%3D%20CLOUDSTACK%20AND%20fixVersion%20in%20(4.8.2.0)%20AND%20labels%20in%20(4.8.2.0-smoke-test-failure)
>>> [2]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020
>>> 
>>>> On Sep 26, 2016, at 8:38 AM, Will Stevens <ws...@cloudops.com> wrote:
>>>> 
>>>> Yes, I think it is important that you or Rajani sign off on anything that
>>>> gets in while branches are frozen so you guys can stay on top of what goes
>>>> in.
>>>> 
>>>> Thanks for all the hard work team.  :)
>>>> 
>>>> *Will STEVENS*
>>>> Lead Developer
>>>> 
>>>> *CloudOps* *| *Cloud Solutions Experts
>>>> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
>>>> w cloudops.com *|* tw @CloudOps_
>>>> 
>>>> On Mon, Sep 26, 2016 at 2:10 AM, John Burwell <jo...@shapeblue.com>
>>>> wrote:
>>>> 
>>>>> All,
>>>>> 
>>>>> Per our release schedule [1], the 4.8, 4.9, and master branches are frozen
>>>>> for testing.  There are some straggling PRs that Rajani and I are working
>>>>> to merge.  Is it acceptable to everyone that for the next two (2) weeks,
>>>>> all PRs require not only 2 LGTMs, but approval by Rajani or I to be merged
>>>>> to these branches?  To be clear, we don’t have to perform the merges,
>>>>> simply give a thumbs up.
>>>>> 
>>>>> Thanks,
>>>>> -John
>>>>> john.burwell@shapeblue.com
>>>>> www.shapeblue.com
>>>>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>>>>> @shapeblue
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
>>> john.burwell@shapeblue.com
>>> www.shapeblue.com
>>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>>> @shapeblue
>>> 
>>> 
>>> 
>> 


john.burwell@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
@shapeblue
  
 


Re:Re: 4.8, 4.9, and master Testing Status

Posted by Haijiao <18...@163.com>.
Though I am one of the silent majority, I would like to thank John and the dev team for your continuous effort; you keep ACS alive and make it better!


I just heard that one of the biggest finance companies in China is running 10,000+ VMs on ACS 4.4 for production/dev/QAS; you guys should be proud of that.
Salute to you!







On 2016-10-05 at 03:09, "ilya" <il...@gmail.com> wrote:

John and Team

Thanks for amazing work and contributing back.

Regards,
ilya

On 10/3/16 9:48 PM, John Burwell wrote:
> All,
>
> A quick update on our progress to pass all smoke tests aka super green.  We have reduced the failures and errors for XenServer from 93 to 9 and for VMware from 51 to 14.  A CentOS 6/CentOS 6 KVM run is currently executing.  Based on manual tests/fixes, we expect it to be the first super green configuration.  We have also found the following additional defects:
>
>   * CLOUDSTACK-9528 [2]: SSVM Downloads (built-in) Template Multiple Times
>   * CLOUDSTACK-9529 [3]: Marvin Tests Do Not Clean Up Properly
>
> 9528 is causing XenServer environments to fail to install and start up cleanly.  The lack of cleanup described in 9529 is causing XenServer to exhaust available resources before a test run completes.  We believe that resolution of these issues will address most, if not all, of the XenServer issues.
>
> Thanks,
> -John
>
> [1]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020
> [2]: https://issues.apache.org/jira/browse/CLOUDSTACK-9528
> [3]: https://issues.apache.org/jira/browse/CLOUDSTACK-9529
>
>>
> john.burwell@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
> @shapeblue
>  
>  
>
> On Sep 30, 2016, at 2:40 AM, John Burwell <jo...@shapeblue.com> wrote:
>>
>> All,
>>
>> Using blueorganutan, Rohit, Murali, Boris, Paul, Abhi, and I are executing the smoke tests for the 4.8, 4.9, and master branches against the following environments:
>>
>>     * CentOS 7.2 Management Server + VMware 5.5u3 + NFS Primary/Secondary Storage
>>     * CentOS 7.2 Management Server + XenServer 6.5SP1 + NFS Primary/Secondary Storage
>>     * CentOS 7.2 Management Server + CentOS 7.2 KVM + NFS Primary/Secondary Storage
>>
>> Thus far, we have found seven (7) test case and/or CloudStack defects in the VMware run for the 4.8 branch [1].  We are currently triaging fifty-one (51) new issues from the XenServer run to determine which issues were environmental and defects.  This triage work should be completed today (30 Sept 2016).  Finally, we are awaiting the results of the KVM run.  
>>
>> We are using PR #1692 [2] as the master tracking PR to fix all defects in the 4.8 branch.  Our goal is to get all non-skip tests to pass and then merge this PR to the 4.8, 4.9, and master.  For each bug, we are creating a JIRA ticket and adding a commit to the PR.  Currently, the branch for this PR is in the shapeblue repo (the branch started with a much smaller fix from Paul and we just kept using it).  However, if others are interested in picking up defects, we will move it to ASF repo.  Once the 4.8 branch is stabilized, we plan to re-execute these tests on the 4.9 and master branches as we expect that the 4.9 and master branches will have additional issues.
>>
>> Since we are in a test freeze, I propose that no further PRs are merged to the 4.8, 4.9, and master branches until they are stabilized.  The following PRs will be re-based, re-tested, and merged to 4.8, 4.9.1.0, and/or 4.10.0.0 post-stabilization:
>>
>>     * 1696
>>     * 1694
>>     * 1684
>>      * 1681
>>     * 1680
>>     * 1678
>>     * 1677
>>     * 1676
>>     * 1674
>>     * 1673
>>     * 1642
>>     * 1624
>>     * 1615
>>     * 1600
>>     * 1545
>>     * 1542
>>
>> I recognize that this a large backlog of contributions ready to merge, and apologize for asking folks to wait.  However, given current state of the release branches, merging them before we complete fixing the smoke tests would create a moving target that further delay stabilization.  
>>
>> Obviously, it is unlikely we will make the 10 October 2016 release date for the 4.8.2.0, 4.9.1.0, and 4.10.0.0 releases.  At this point, it is difficult to estimate the size of the schedule slip because we still have issues to triage and test runs to complete.  I have created a wiki page [2] to track progress on this effort.  
>>
>> Does this approach sound reasonable?  Any suggestions to speed up this process will be greatly appreciated as stabilizing and re-opening these branches stable ASAP is critical for the community.
>>
>> Thanks,
>> -John
>>
>> [1]: https://issues.apache.org/jira/browse/CLOUDSTACK-9518?jql=project%20%3D%20CLOUDSTACK%20AND%20fixVersion%20in%20(4.8.2.0)%20AND%20labels%20in%20(4.8.2.0-smoke-test-failure)
>> [2]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020
>>
>>> On Sep 26, 2016, at 8:38 AM, Will Stevens <ws...@cloudops.com> wrote:
>>>
>>> Yes, I think it is important that you or Rajani sign off on anything that
>>> gets in while branches are frozen so you guys can stay on top of what goes
>>> in.
>>>
>>> Thanks for all the hard work team.  :)
>>>
>>> *Will STEVENS*
>>> Lead Developer
>>>
>>> *CloudOps* *| *Cloud Solutions Experts
>>> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
>>> w cloudops.com *|* tw @CloudOps_
>>>
>>> On Mon, Sep 26, 2016 at 2:10 AM, John Burwell <jo...@shapeblue.com>
>>> wrote:
>>>
>>>> All,
>>>>
>>>> Per our release schedule [1], the 4.8, 4.9, and master branches are frozen
>>>> for testing.  There are some straggling PRs that Rajani and I are working
>>>> to merge.  Is it acceptable to everyone that for the next two (2) weeks,
>>>> all PRs require not only 2 LGTMs, but approval by Rajani or I to be merged
>>>> to these branches?  To be clear, we don’t have to perform the merges,
>>>> simply give a thumbs up.
>>>>
>>>> Thanks,
>>>> -John
>>>> john.burwell@shapeblue.com
>>>> www.shapeblue.com
>>>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>>>> @shapeblue
>>>>
>>>>
>>>>
>>>>
>>
>>
>> john.burwell@shapeblue.com
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>> @shapeblue
>>
>>
>>
>

Re: 4.8, 4.9, and master Testing Status

Posted by ilya <il...@gmail.com>.
John and Team

Thanks for amazing work and contributing back.

Regards,
ilya

On 10/3/16 9:48 PM, John Burwell wrote:
> All,
> 
> A quick update on our progress to pass all smoke tests aka super green.  We have reduced the failures and errors for XenServer from 93 to 9 and for VMware from 51 to 14.  A CentOS 6/CentOS 6 KVM run is currently executing.  Based on manual tests/fixes, we expect it to be the first super green configuration.  We have also found the following additional defects:
> 
>   * CLOUDSTACK-9528 [2]: SSVM Downloads (built-in) Template Multiple Times 
>   * CLOUDSTACK-9529 [3]: Marvin Tests Do Not Clean Up Properly
> 
> 9528 is causing XenServer environments to fail to install and start up cleanly.  The lack of cleanup described in 9529 is causing XenServer to exhaust available resources before a test run completes.  We believe that resolution of these issues will address most, if not all, of the XenServer issues.
> 
> Thanks,
> -John
> 
> [1]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020
> [2]: https://issues.apache.org/jira/browse/CLOUDSTACK-9528
> [3]: https://issues.apache.org/jira/browse/CLOUDSTACK-9529
> 
>>
> john.burwell@shapeblue.com 
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
> @shapeblue
>   
>  
> 
> On Sep 30, 2016, at 2:40 AM, John Burwell <jo...@shapeblue.com> wrote:
>>
>> All,
>>
>> Using blueorganutan, Rohit, Murali, Boris, Paul, Abhi, and I are executing the smoke tests for the 4.8, 4.9, and master branches against the following environments:
>>
>> 	* CentOS 7.2 Management Server + VMware 5.5u3 + NFS Primary/Secondary Storage
>> 	* CentOS 7.2 Management Server + XenServer 6.5SP1 + NFS Primary/Secondary Storage
>> 	* CentOS 7.2 Management Server + CentOS 7.2 KVM + NFS Primary/Secondary Storage
>>
>> Thus far, we have found seven (7) test case and/or CloudStack defects in the VMware run for the 4.8 branch [1].  We are currently triaging fifty-one (51) new issues from the XenServer run to determine which issues were environmental and defects.  This triage work should be completed today (30 Sept 2016).  Finally, we are awaiting the results of the KVM run.  
>>
>> We are using PR #1692 [2] as the master tracking PR to fix all defects in the 4.8 branch.  Our goal is to get all non-skip tests to pass and then merge this PR to the 4.8, 4.9, and master.  For each bug, we are creating a JIRA ticket and adding a commit to the PR.  Currently, the branch for this PR is in the shapeblue repo (the branch started with a much smaller fix from Paul and we just kept using it).  However, if others are interested in picking up defects, we will move it to ASF repo.  Once the 4.8 branch is stabilized, we plan to re-execute these tests on the 4.9 and master branches as we expect that the 4.9 and master branches will have additional issues.
>>
>> Since we are in a test freeze, I propose that no further PRs are merged to the 4.8, 4.9, and master branches until they are stabilized.  The following PRs will be re-based, re-tested, and merged to 4.8, 4.9.1.0, and/or 4.10.0.0 post-stabilization:
>>
>> 	* 1696
>> 	* 1694
>> 	* 1684
>>  	* 1681
>> 	* 1680
>> 	* 1678
>> 	* 1677
>> 	* 1676
>> 	* 1674
>> 	* 1673
>> 	* 1642
>> 	* 1624
>> 	* 1615
>> 	* 1600
>> 	* 1545
>> 	* 1542
>>
>> I recognize that this a large backlog of contributions ready to merge, and apologize for asking folks to wait.  However, given current state of the release branches, merging them before we complete fixing the smoke tests would create a moving target that further delay stabilization.  
>>
>> Obviously, it is unlikely we will make the 10 October 2016 release date for the 4.8.2.0, 4.9.1.0, and 4.10.0.0 releases.  At this point, it is difficult to estimate the size of the schedule slip because we still have issues to triage and test runs to complete.  I have created a wiki page [2] to track progress on this effort.  
>>
>> Does this approach sound reasonable?  Any suggestions to speed up this process will be greatly appreciated as stabilizing and re-opening these branches stable ASAP is critical for the community.
>>
>> Thanks,
>> -John
>>
>> [1]: https://issues.apache.org/jira/browse/CLOUDSTACK-9518?jql=project%20%3D%20CLOUDSTACK%20AND%20fixVersion%20in%20(4.8.2.0)%20AND%20labels%20in%20(4.8.2.0-smoke-test-failure)
>> [2]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020
>>
>>> On Sep 26, 2016, at 8:38 AM, Will Stevens <ws...@cloudops.com> wrote:
>>>
>>> Yes, I think it is important that you or Rajani sign off on anything that
>>> gets in while branches are frozen so you guys can stay on top of what goes
>>> in.
>>>
>>> Thanks for all the hard work team.  :)
>>>
>>> *Will STEVENS*
>>> Lead Developer
>>>
>>> *CloudOps* *| *Cloud Solutions Experts
>>> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
>>> w cloudops.com *|* tw @CloudOps_
>>>
>>> On Mon, Sep 26, 2016 at 2:10 AM, John Burwell <jo...@shapeblue.com>
>>> wrote:
>>>
>>>> All,
>>>>
>>>> Per our release schedule [1], the 4.8, 4.9, and master branches are frozen
>>>> for testing.  There are some straggling PRs that Rajani and I are working
>>>> to merge.  Is it acceptable to everyone that for the next two (2) weeks,
>>>> all PRs require not only 2 LGTMs, but approval by Rajani or I to be merged
>>>> to these branches?  To be clear, we don’t have to perform the merges,
>>>> simply give a thumbs up.
>>>>
>>>> Thanks,
>>>> -John
>>>> john.burwell@shapeblue.com
>>>> www.shapeblue.com
>>>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>>>> @shapeblue
>>>>
>>>>
>>>>
>>>>
>>
>>
>> john.burwell@shapeblue.com 
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>> @shapeblue
>>
>>
>>
> 

RE: 4.8, 4.9, and master Testing Status

Posted by Marty Godsey <ma...@gonsource.com>.
Is XenServer 7 on the roadmap? I have seen significant performance differences between my ACS cluster running XS 6.5 SP1 and a separate XS 7 cluster (non-ACS) connecting to the same storage array. Other than the XS hosts being different versions, everything is the same: same switches, same storage array, etc.

I am just curious.

Regards,
Marty Godsey
nSource Solutions

-----Original Message-----
From: John Burwell [mailto:john.burwell@shapeblue.com] 
Sent: Tuesday, October 4, 2016 12:48 AM
To: dev@cloudstack.apache.org
Cc: Murali Reddy <mu...@shapeblue.com>; Rohit Yadav <ro...@shapeblue.com>; Paul Angus <pa...@shapeblue.com>; Boris Stoyanov <bo...@shapeblue.com>; Abhinandan Prateek <ab...@shapeblue.com>
Subject: Re: 4.8, 4.9, and master Testing Status

All,

A quick update on our progress to pass all smoke tests aka super green.  We have reduced the failures and errors for XenServer from 93 to 9 and for VMware from 51 to 14.  A CentOS 6/CentOS 6 KVM run is currently executing.  Based on manual tests/fixes, we expect it to be the first super green configuration.  We have also found the following additional defects:

  * CLOUDSTACK-9528 [2]: SSVM Downloads (built-in) Template Multiple Times
  * CLOUDSTACK-9529 [3]: Marvin Tests Do Not Clean Up Properly

9528 is causing XenServer environments to fail to install and start up cleanly.  The lack of cleanup described in 9529 is causing XenServer to exhaust available resources before a test run completes.  We believe that resolution of these issues will address most, if not all, of the XenServer issues.

Thanks,
-John

[1]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020
[2]: https://issues.apache.org/jira/browse/CLOUDSTACK-9528
[3]: https://issues.apache.org/jira/browse/CLOUDSTACK-9529

> 
john.burwell@shapeblue.com
www.shapeblue.com
53 Chandos Place, Covent Garden, London VA WC2N 4HSUK @shapeblue
  
 

On Sep 30, 2016, at 2:40 AM, John Burwell <jo...@shapeblue.com> wrote:
> 
> All,
> 
> Using blueorganutan, Rohit, Murali, Boris, Paul, Abhi, and I are executing the smoke tests for the 4.8, 4.9, and master branches against the following environments:
> 
> 	* CentOS 7.2 Management Server + VMware 5.5u3 + NFS Primary/Secondary Storage
> 	* CentOS 7.2 Management Server + XenServer 6.5SP1 + NFS Primary/Secondary Storage
> 	* CentOS 7.2 Management Server + CentOS 7.2 KVM + NFS 
> Primary/Secondary Storage
> 
> Thus far, we have found seven (7) test case and/or CloudStack defects in the VMware run for the 4.8 branch [1].  We are currently triaging fifty-one (51) new issues from the XenServer run to determine which issues were environmental and defects.  This triage work should be completed today (30 Sept 2016).  Finally, we are awaiting the results of the KVM run.  
> 
> We are using PR #1692 [2] as the master tracking PR to fix all defects in the 4.8 branch.  Our goal is to get all non-skip tests to pass and then merge this PR to the 4.8, 4.9, and master.  For each bug, we are creating a JIRA ticket and adding a commit to the PR.  Currently, the branch for this PR is in the shapeblue repo (the branch started with a much smaller fix from Paul and we just kept using it).  However, if others are interested in picking up defects, we will move it to ASF repo.  Once the 4.8 branch is stabilized, we plan to re-execute these tests on the 4.9 and master branches as we expect that the 4.9 and master branches will have additional issues.
> 
> Since we are in a test freeze, I propose that no further PRs are merged to the 4.8, 4.9, and master branches until they are stabilized.  The following PRs will be re-based, re-tested, and merged to 4.8, 4.9.1.0, and/or 4.10.0.0 post-stabilization:
> 
> 	* 1696
> 	* 1694
> 	* 1684
>  	* 1681
> 	* 1680
> 	* 1678
> 	* 1677
> 	* 1676
> 	* 1674
> 	* 1673
> 	* 1642
> 	* 1624
> 	* 1615
> 	* 1600
> 	* 1545
> 	* 1542
> 
> I recognize that this a large backlog of contributions ready to merge, and apologize for asking folks to wait.  However, given current state of the release branches, merging them before we complete fixing the smoke tests would create a moving target that further delay stabilization.  
> 
> Obviously, it is unlikely we will make the 10 October 2016 release date for the 4.8.2.0, 4.9.1.0, and 4.10.0.0 releases.  At this point, it is difficult to estimate the size of the schedule slip because we still have issues to triage and test runs to complete.  I have created a wiki page [2] to track progress on this effort.  
> 
> Does this approach sound reasonable?  Any suggestions to speed up this process will be greatly appreciated as stabilizing and re-opening these branches stable ASAP is critical for the community.
> 
> Thanks,
> -John
> 
> [1]: 
> https://issues.apache.org/jira/browse/CLOUDSTACK-9518?jql=project%20%3
> D%20CLOUDSTACK%20AND%20fixVersion%20in%20(4.8.2.0)%20AND%20labels%20in
> %20(4.8.2.0-smoke-test-failure)
> [2]: 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873
> 020
> 
>> On Sep 26, 2016, at 8:38 AM, Will Stevens <ws...@cloudops.com> wrote:
>> 
>> Yes, I think it is important that you or Rajani sign off on anything 
>> that gets in while branches are frozen so you guys can stay on top of 
>> what goes in.
>> 
>> Thanks for all the hard work team.  :)
>> 
>> *Will STEVENS*
>> Lead Developer
>> 
>> *CloudOps* *| *Cloud Solutions Experts
>> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6 w cloudops.com *|* tw 
>> @CloudOps_
>> 
>> On Mon, Sep 26, 2016 at 2:10 AM, John Burwell 
>> <jo...@shapeblue.com>
>> wrote:
>> 
>>> All,
>>> 
>>> Per our release schedule [1], the 4.8, 4.9, and master branches are 
>>> frozen for testing.  There are some straggling PRs that Rajani and I 
>>> are working to merge.  Is it acceptable to everyone that for the 
>>> next two (2) weeks, all PRs require not only 2 LGTMs, but approval 
>>> by Rajani or I to be merged to these branches?  To be clear, we 
>>> don’t have to perform the merges, simply give a thumbs up.
>>> 
>>> Thanks,
>>> -John
>>> john.burwell@shapeblue.com
>>> www.shapeblue.com
>>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK @shapeblue
>>> 
>>> 
>>> 
>>> 
> 
> 
> john.burwell@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK @shapeblue
> 
> 
> 


Re: 4.8, 4.9, and master Testing Status

Posted by John Burwell <jo...@shapeblue.com>.
All,

A quick update on our progress to pass all smoke tests aka super green.  We have reduced the failures and errors for XenServer from 93 to 9 and for VMware from 51 to 14.  A CentOS 6/CentOS 6 KVM run is currently executing.  Based on manual tests/fixes, we expect it to be the first super green configuration.  We have also found the following additional defects:

  * CLOUDSTACK-9528 [2]: SSVM Downloads (built-in) Template Multiple Times 
  * CLOUDSTACK-9529 [3]: Marvin Tests Do Not Clean Up Properly

9528 is causing XenServer environments to fail to install and start up cleanly.  The lack of cleanup described in 9529 is causing XenServer to exhaust available resources before a test run completes.  We believe that resolution of these issues will address most, if not all, of the XenServer issues.

Thanks,
-John

[1]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020
[2]: https://issues.apache.org/jira/browse/CLOUDSTACK-9528
[3]: https://issues.apache.org/jira/browse/CLOUDSTACK-9529

> 
john.burwell@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
@shapeblue
  
 

On Sep 30, 2016, at 2:40 AM, John Burwell <jo...@shapeblue.com> wrote:
> 
> All,
> 
> Using blueorganutan, Rohit, Murali, Boris, Paul, Abhi, and I are executing the smoke tests for the 4.8, 4.9, and master branches against the following environments:
> 
> 	* CentOS 7.2 Management Server + VMware 5.5u3 + NFS Primary/Secondary Storage
> 	* CentOS 7.2 Management Server + XenServer 6.5SP1 + NFS Primary/Secondary Storage
> 	* CentOS 7.2 Management Server + CentOS 7.2 KVM + NFS Primary/Secondary Storage
> 
> Thus far, we have found seven (7) test case and/or CloudStack defects in the VMware run for the 4.8 branch [1].  We are currently triaging fifty-one (51) new issues from the XenServer run to determine which issues were environmental and defects.  This triage work should be completed today (30 Sept 2016).  Finally, we are awaiting the results of the KVM run.  
> 
> We are using PR #1692 (https://github.com/apache/cloudstack/pull/1692) as the master tracking PR to fix all defects in the 4.8 branch.  Our goal is to get all non-skip tests to pass and then merge this PR to the 4.8, 4.9, and master branches.  For each bug, we are creating a JIRA ticket and adding a commit to the PR.  Currently, the branch for this PR is in the shapeblue repo (the branch started with a much smaller fix from Paul and we just kept using it).  However, if others are interested in picking up defects, we will move it to the ASF repo.  Once the 4.8 branch is stabilized, we plan to re-execute these tests on the 4.9 and master branches, as we expect those branches to have additional issues.
> 
> Since we are in a test freeze, I propose that no further PRs are merged to the 4.8, 4.9, and master branches until they are stabilized.  The following PRs will be re-based, re-tested, and merged to 4.8, 4.9.1.0, and/or 4.10.0.0 post-stabilization:
> 
> 	* 1696
> 	* 1694
> 	* 1684
> 	* 1681
> 	* 1680
> 	* 1678
> 	* 1677
> 	* 1676
> 	* 1674
> 	* 1673
> 	* 1642
> 	* 1624
> 	* 1615
> 	* 1600
> 	* 1545
> 	* 1542
> 
> I recognize that this is a large backlog of contributions ready to merge, and apologize for asking folks to wait.  However, given the current state of the release branches, merging them before we finish fixing the smoke tests would create a moving target that would further delay stabilization.
> 
> Obviously, it is unlikely we will make the 10 October 2016 release date for the 4.8.2.0, 4.9.1.0, and 4.10.0.0 releases.  At this point, it is difficult to estimate the size of the schedule slip because we still have issues to triage and test runs to complete.  I have created a wiki page [2] to track progress on this effort.  
> 
> Does this approach sound reasonable?  Any suggestions to speed up this process will be greatly appreciated, as stabilizing and re-opening these branches ASAP is critical for the community.
> 
> Thanks,
> -John
> 
> [1]: https://issues.apache.org/jira/browse/CLOUDSTACK-9518?jql=project%20%3D%20CLOUDSTACK%20AND%20fixVersion%20in%20(4.8.2.0)%20AND%20labels%20in%20(4.8.2.0-smoke-test-failure)
> [2]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020
> 
>> On Sep 26, 2016, at 8:38 AM, Will Stevens <ws...@cloudops.com> wrote:
>> 
>> Yes, I think it is important that you or Rajani sign off on anything that
>> gets in while branches are frozen so you guys can stay on top of what goes
>> in.
>> 
>> Thanks for all the hard work team.  :)
>> 
>> *Will STEVENS*
>> Lead Developer
>> 
>> *CloudOps* *| *Cloud Solutions Experts
>> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
>> w cloudops.com *|* tw @CloudOps_
>> 
>> On Mon, Sep 26, 2016 at 2:10 AM, John Burwell <jo...@shapeblue.com>
>> wrote:
>> 
>>> All,
>>> 
>>> Per our release schedule [1], the 4.8, 4.9, and master branches are frozen
>>> for testing.  There are some straggling PRs that Rajani and I are working
>>> to merge.  Is it acceptable to everyone that for the next two (2) weeks,
>>> all PRs require not only 2 LGTMs, but approval by Rajani or me to be merged
>>> to these branches?  To be clear, we don’t have to perform the merges,
>>> simply give a thumbs up.
>>> 
>>> Thanks,
>>> -John
>>> john.burwell@shapeblue.com
>>> www.shapeblue.com
>>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>>> @shapeblue
>>> 
>>> 
>>> 
>>> 
> 
> 
> john.burwell@shapeblue.com 
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
> @shapeblue
> 
> 
> 


4.8, 4.9, and master Testing Status

Posted by John Burwell <jo...@shapeblue.com>.
All,

Using blueorangutan, Rohit, Murali, Boris, Paul, Abhi, and I are executing the smoke tests for the 4.8, 4.9, and master branches against the following environments:

	* CentOS 7.2 Management Server + VMware 5.5u3 + NFS Primary/Secondary Storage
	* CentOS 7.2 Management Server + XenServer 6.5SP1 + NFS Primary/Secondary Storage
	* CentOS 7.2 Management Server + CentOS 7.2 KVM + NFS Primary/Secondary Storage

Thus far, we have found seven (7) test case and/or CloudStack defects in the VMware run for the 4.8 branch [1].  We are currently triaging fifty-one (51) new issues from the XenServer run to determine which issues are environmental and which are defects.  This triage work should be completed today (30 Sept 2016).  Finally, we are awaiting the results of the KVM run.

We are using PR #1692 (https://github.com/apache/cloudstack/pull/1692) as the master tracking PR to fix all defects in the 4.8 branch.  Our goal is to get all non-skip tests to pass and then merge this PR to the 4.8, 4.9, and master branches.  For each bug, we are creating a JIRA ticket and adding a commit to the PR.  Currently, the branch for this PR is in the shapeblue repo (the branch started with a much smaller fix from Paul and we just kept using it).  However, if others are interested in picking up defects, we will move it to the ASF repo.  Once the 4.8 branch is stabilized, we plan to re-execute these tests on the 4.9 and master branches, as we expect those branches to have additional issues.

Since we are in a test freeze, I propose that no further PRs are merged to the 4.8, 4.9, and master branches until they are stabilized.  The following PRs will be re-based, re-tested, and merged to 4.8, 4.9.1.0, and/or 4.10.0.0 post-stabilization:

	* 1696
	* 1694
	* 1684
	* 1681
	* 1680
	* 1678
	* 1677
	* 1676
	* 1674
	* 1673
	* 1642
	* 1624
	* 1615
	* 1600
	* 1545
	* 1542

I recognize that this is a large backlog of contributions ready to merge, and apologize for asking folks to wait.  However, given the current state of the release branches, merging them before we finish fixing the smoke tests would create a moving target that would further delay stabilization.

Obviously, it is unlikely we will make the 10 October 2016 release date for the 4.8.2.0, 4.9.1.0, and 4.10.0.0 releases.  At this point, it is difficult to estimate the size of the schedule slip because we still have issues to triage and test runs to complete.  I have created a wiki page [2] to track progress on this effort.  

Does this approach sound reasonable?  Any suggestions to speed up this process will be greatly appreciated, as stabilizing and re-opening these branches ASAP is critical for the community.

Thanks,
-John

[1]: https://issues.apache.org/jira/browse/CLOUDSTACK-9518?jql=project%20%3D%20CLOUDSTACK%20AND%20fixVersion%20in%20(4.8.2.0)%20AND%20labels%20in%20(4.8.2.0-smoke-test-failure)
[2]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65873020

> On Sep 26, 2016, at 8:38 AM, Will Stevens <ws...@cloudops.com> wrote:
> 
> Yes, I think it is important that you or Rajani sign off on anything that
> gets in while branches are frozen so you guys can stay on top of what goes
> in.
> 
> Thanks for all the hard work team.  :)
> 
> *Will STEVENS*
> Lead Developer
> 
> *CloudOps* *| *Cloud Solutions Experts
> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
> w cloudops.com *|* tw @CloudOps_
> 
> On Mon, Sep 26, 2016 at 2:10 AM, John Burwell <jo...@shapeblue.com>
> wrote:
> 
>> All,
>> 
>> Per our release schedule [1], the 4.8, 4.9, and master branches are frozen
>> for testing.  There are some straggling PRs that Rajani and I are working
>> to merge.  Is it acceptable to everyone that for the next two (2) weeks,
>> all PRs require not only 2 LGTMs, but approval by Rajani or me to be merged
>> to these branches?  To be clear, we don’t have to perform the merges,
>> simply give a thumbs up.
>> 
>> Thanks,
>> -John
>> john.burwell@shapeblue.com
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>> @shapeblue
>> 
>> 
>> 
>> 




Re: 4.8, 4.9, and master Branches Frozen for Testing

Posted by Will Stevens <ws...@cloudops.com>.
Yes, I think it is important that you or Rajani sign off on anything that
gets in while branches are frozen so you guys can stay on top of what goes
in.

Thanks for all the hard work team.  :)

*Will STEVENS*
Lead Developer

*CloudOps* *| *Cloud Solutions Experts
420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
w cloudops.com *|* tw @CloudOps_

On Mon, Sep 26, 2016 at 2:10 AM, John Burwell <jo...@shapeblue.com>
wrote:

> All,
>
> Per our release schedule [1], the 4.8, 4.9, and master branches are frozen
> for testing.  There are some straggling PRs that Rajani and I are working
> to merge.  Is it acceptable to everyone that for the next two (2) weeks,
> all PRs require not only 2 LGTMs, but approval by Rajani or me to be merged
> to these branches?  To be clear, we don’t have to perform the merges,
> simply give a thumbs up.
>
> Thanks,
> -John
> john.burwell@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
> @shapeblue
>
>
>
>