You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cloudstack.apache.org by miguelaferreira <gi...@git.apache.org> on 2015/12/05 18:02:01 UTC

[GitHub] cloudstack pull request: Prevent live-lock in NSX API client

GitHub user miguelaferreira opened a pull request:

    https://github.com/apache/cloudstack/pull/1178

    Prevent live-lock in NSX API client

    The NSX api client relies on a sequence of responses to identify the need to authenticate and to follow redirects. In order to avoid live-locks, the NSX API client has a counter that will abort the execution after 5 consecutive requests that do not produce a Success (200) response.
    When a NSX controller enters a faulty state it can allow authentication requests but deny any other API call. In such cases the NSX API client will consider the denied request a reason to follow a redirect and will enter into a live-lock (because the actual redirection is not happening).
    This PR changes the NSX API client to no reset it's counter on a Success response from an authentication request. That is, the counter is only reset if another type of API call yields a Success response.
    
    In addition, this PR also:
    * changes the configuration of the license-maven-plugin to ignore files generated by pmd
    * moves the NSX marvin test to a plugins folder
    * refactors the NSX marvin test to reduce duplication
    * adds an extra test case to the NSX marvin test that checks that NSX tunnels are properly created

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/miguelaferreira/cloudstack nsx-livelock

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cloudstack/pull/1178.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1178
    
----
commit d248e61a3128fe285a18c9836d1c7d18b5b946e9
Author: Miguel Ferreira <mi...@me.com>
Date:   2015-12-02T13:53:26Z

    Add test for NSX plugin that simulates a live lock

commit cfe14462a3380c053ad174e0016632e6024e3573
Author: Miguel Ferreira <mi...@me.com>
Date:   2015-12-02T13:53:56Z

    Fix NSX rest client to not reset execution counter after a login

commit 86e836196ed3374912da805f598961b3fe9c9033
Author: Miguel Ferreira <mi...@me.com>
Date:   2015-12-02T14:42:38Z

    Ignore pmd generated files during license check
    
    This has been happening for the NSX and BigSwitch plugins.
    The maven build prints something like:
      [WARNING] Unknown file extension: .../plugins/network-elements/nicira-nvp/.pmd
      [WARNING] Missing header in: .../plugins/network-elements/nicira-nvp/.pmdruleset.xml

commit bafc2313bb872efa07c01dca2fdff4bec9b8dbb6
Author: Miguel Ferreira <mi...@me.com>
Date:   2015-12-04T10:40:09Z

    Move NSX integrationt test to new plugins folder

commit 1e820e3906023ae43dbd3f01d632fbef5bb77430
Author: Miguel Ferreira <mi...@me.com>
Date:   2015-12-04T10:46:08Z

    Use logger to print debug messages during test

commit a633ef811324a8dbf868c85922235c16a08cb1a0
Author: Miguel Ferreira <mi...@me.com>
Date:   2015-12-04T11:10:55Z

    Refactor test cases to reduce duplication

commit 811f254e4ddcc8de0bc0375f35783eea4e04ef6d
Author: Miguel Ferreira <mi...@me.com>
Date:   2015-12-05T14:51:07Z

    Test NSX tunnel in guest network

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: Prevent live-lock in NSX API client

Posted by miguelaferreira <gi...@git.apache.org>.
Github user miguelaferreira commented on the pull request:

    https://github.com/apache/cloudstack/pull/1178#issuecomment-162225661
  
    As for testing, I've built the entire project executing all unit tests (including the one I've added to prevent a regression on this issue), the existing marvin tests and a new one that checks that a VM in a NSX backed network has connectivity via a NSX managed tunnel between two hosts.
    
    maven build:
    ```bash
    mvn clean install -T4 -Pdeveloper,systemvm
    
    (...)
    
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD SUCCESS
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 08:43 min (Wall Clock)
    [INFO] Finished at: 2015-12-05T18:27:47+01:00
    [INFO] Final Memory: 110M/1614M
    [INFO] ------------------------------------------------------------------------
    ```
    
    marvin tests:
    ```bash
    nosetests --with-marvin --marvin-config=mct-zone1-kvm1-kvm2-NVP.cfg -s -a tags=advanced,required_hardware=true test/integration/plugins/test_nicira_controller.py
    
    (...)
    
    === TestName: test_01_nicira_controller | Status : SUCCESS ===
    ok
    === TestName: test_02_nicira_controller_redirect | Status : SUCCESS ===
    ok
    === TestName: test_03_nicira_tunnel_guest_network | Status : SUCCESS ===
    ok
    
    ----------------------------------------------------------------------
    Ran 3 tests in 1204.495s
    
    OK
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: Prevent live-lock in NSX API client

Posted by remibergsma <gi...@git.apache.org>.
Github user remibergsma commented on the pull request:

    https://github.com/apache/cloudstack/pull/1178#issuecomment-162287978
  
    LGTM based on these tests:
    
    ```
    nosetests --with-marvin --marvin-config=${marvinCfg} -s -a tags=advanced,required_hardware=true \
    component/test_vpc_redundant.py \
    component/test_routers_iptables_default_policy.py \
    component/test_routers_network_ops.py \
    component/test_vpc_router_nics.py \
    smoke/test_loadbalance.py \
    smoke/test_internal_lb.py \
    smoke/test_ssvm.py \
    smoke/test_network.py
    
    ```
    
    Result:
    
    ```
    Create a redundant VPC with two networks with two VMs in each network ... === TestName: test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL | Status : SUCCESS ===
    ok
    Create a redundant VPC with two networks with two VMs in each network and check default routes ... === TestName: test_02_redundant_VPC_default_routes | Status : SUCCESS ===
    ok
    Create a redundant VPC with two networks with two VMs in each network ... === TestName: test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers | Status : SUCCESS ===
    ok
    Test iptables default INPUT/FORWARD policy on RouterVM ... === TestName: test_02_routervm_iptables_policies | Status : SUCCESS ===
    ok
    Test iptables default INPUT/FORWARD policies on VPC router ... === TestName: test_01_single_VPC_iptables_policies | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_01_isolate_network_FW_PF_default_routes_egress_true | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_02_isolate_network_FW_PF_default_routes_egress_false | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_01_RVR_Network_FW_PF_SSH_default_routes_egress_true | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_02_RVR_Network_FW_PF_SSH_default_routes_egress_false | Status : SUCCESS ===
    ok
    Create a VPC with two networks with one VM in each network and test nics after destroy ... === TestName: test_01_VPC_nics_after_destroy | Status : SUCCESS ===
    ok
    Create a VPC with two networks with one VM in each network and test default routes ... === TestName: test_02_VPC_default_routes | Status : SUCCESS ===
    ok
    Check the password file in the Router VM ... === TestName: test_isolate_network_password_server | Status : SUCCESS ===
    ok
    Check that the /etc/dhcphosts.txt doesn't contain duplicate IPs ... === TestName: test_router_dhcphosts | Status : SUCCESS ===
    ok
    Test to create Load balancing rule with source NAT ... === TestName: test_01_create_lb_rule_src_nat | Status : SUCCESS ===
    ok
    Test to create Load balancing rule with non source NAT ... === TestName: test_02_create_lb_rule_non_nat | Status : SUCCESS ===
    ok
    Test for assign & removing load balancing rule ... === TestName: test_assign_and_removal_lb | Status : SUCCESS ===
    ok
    Test to verify access to loadbalancer haproxy admin stats page ... === TestName: test02_internallb_haproxy_stats_on_all_interfaces | Status : SUCCESS ===
    ok
    Test create, assign, remove of an Internal LB with roundrobin http traffic to 3 vm's ... === TestName: test_01_internallb_roundrobin_1VPC_3VM_HTTP_port80 | Status : SUCCESS ===
    ok
    Test SSVM Internals ... === TestName: test_03_ssvm_internals | Status : SUCCESS ===
    ok
    Test CPVM Internals ... === TestName: test_04_cpvm_internals | Status : SUCCESS ===
    ok
    Test stop SSVM ... === TestName: test_05_stop_ssvm | Status : SUCCESS ===
    ok
    Test stop CPVM ... === TestName: test_06_stop_cpvm | Status : SUCCESS ===
    ok
    Test reboot SSVM ... === TestName: test_07_reboot_ssvm | Status : SUCCESS ===
    ok
    Test reboot CPVM ... === TestName: test_08_reboot_cpvm | Status : SUCCESS ===
    ok
    Test destroy SSVM ... === TestName: test_09_destroy_ssvm | Status : SUCCESS ===
    ok
    Test destroy CPVM ... === TestName: test_10_destroy_cpvm | Status : SUCCESS ===
    ok
    Test Remote Access VPN in VPC ... === TestName: test_vpc_remote_access_vpn | Status : SUCCESS ===
    ok
    Test VPN in VPC ... === TestName: test_vpc_site2site_vpn | Status : SUCCESS ===
    ok
    Test for port forwarding on source NAT ... === TestName: test_01_port_fwd_on_src_nat | Status : SUCCESS ===
    ok
    Test for port forwarding on non source NAT ... === TestName: test_02_port_fwd_on_non_src_nat | Status : SUCCESS ===
    ok
    Test for reboot router ... === TestName: test_reboot_router | Status : SUCCESS ===
    ok
    Test for Router rules for network rules on acquired public IP ... === TestName: test_network_rules_acquired_public_ip_1_static_nat_rule | Status : SUCCESS ===
    ok
    Test for Router rules for network rules on acquired public IP ... === TestName: test_network_rules_acquired_public_ip_2_nat_rule | Status : SUCCESS ===
    ok
    Test for Router rules for network rules on acquired public IP ... === TestName: test_network_rules_acquired_public_ip_3_Load_Balancer_Rule | Status : SUCCESS ===
    ok
    
    ----------------------------------------------------------------------
    Ran 34 tests in 16284.125s
    
    OK
    ```
    
    
    And:
    
    ```
    nosetests --with-marvin --marvin-config=${marvinCfg} -s -a tags=advanced,required_hardware=false \
    smoke/test_routers.py \
    smoke/test_network_acl.py \
    smoke/test_privategw_acl.py \
    smoke/test_reset_vm_on_reboot.py \
    smoke/test_vm_life_cycle.py \
    smoke/test_vpc_vpn.py \
    smoke/test_service_offerings.py \
    component/test_vpc_offerings.py \
    component/test_vpc_routers.py
    ```
    
    Result:
    
    ```
    Test router internal advanced zone ... === TestName: test_02_router_internal_adv | Status : SUCCESS ===
    ok
    Test restart network ... === TestName: test_03_restart_network_cleanup | Status : SUCCESS ===
    ok
    Test router basic setup ... === TestName: test_05_router_basic | Status : SUCCESS ===
    ok
    Test router advanced setup ... === TestName: test_06_router_advanced | Status : SUCCESS ===
    ok
    Test stop router ... === TestName: test_07_stop_router | Status : SUCCESS ===
    ok
    Test start router ... === TestName: test_08_start_router | Status : SUCCESS ===
    ok
    Test reboot router ... === TestName: test_09_reboot_router | Status : SUCCESS ===
    ok
    Test reset virtual machine on reboot ... === TestName: test_01_reset_vm_on_reboot | Status : SUCCESS ===
    ok
    Test advanced zone virtual router ... === TestName: test_advZoneVirtualRouter | Status : SUCCESS ===
    ok
    Test Deploy Virtual Machine ... === TestName: test_deploy_vm | Status : SUCCESS ===
    ok
    Test Multiple Deploy Virtual Machine ... === TestName: test_deploy_vm_multiple | Status : SUCCESS ===
    ok
    Test Stop Virtual Machine ... === TestName: test_01_stop_vm | Status : SUCCESS ===
    ok
    Test Start Virtual Machine ... === TestName: test_02_start_vm | Status : SUCCESS ===
    ok
    Test Reboot Virtual Machine ... === TestName: test_03_reboot_vm | Status : SUCCESS ===
    ok
    Test destroy Virtual Machine ... === TestName: test_06_destroy_vm | Status : SUCCESS ===
    ok
    Test recover Virtual Machine ... === TestName: test_07_restore_vm | Status : SUCCESS ===
    ok
    Test migrate VM ... === TestName: test_08_migrate_vm | Status : SUCCESS ===
    ok
    Test destroy(expunge) Virtual Machine ... === TestName: test_09_expunge_vm | Status : SUCCESS ===
    ok
    Test to create service offering ... === TestName: test_01_create_service_offering | Status : SUCCESS ===
    ok
    Test to update existing service offering ... === TestName: test_02_edit_service_offering | Status : SUCCESS ===
    ok
    Test to delete service offering ... === TestName: test_03_delete_service_offering | Status : SUCCESS ===
    ok
    Test for delete account ... === TestName: test_delete_account | Status : SUCCESS ===
    ok
    Test for Associate/Disassociate public IP address for admin account ... === TestName: test_public_ip_admin_account | Status : SUCCESS ===
    ok
    Test for Associate/Disassociate public IP address for user account ... === TestName: test_public_ip_user_account | Status : SUCCESS ===
    ok
    Test for release public IP address ... === TestName: test_releaseIP | Status : SUCCESS ===
    ok
    Test create VPC offering ... === TestName: test_01_create_vpc_offering | Status : SUCCESS ===
    ok
    Test VPC offering without load balancing service ... === TestName: test_03_vpc_off_without_lb | Status : SUCCESS ===
    ok
    Test VPC offering without static NAT service ... === TestName: test_04_vpc_off_without_static_nat | Status : SUCCESS ===
    ok
    Test VPC offering without port forwarding service ... === TestName: test_05_vpc_off_without_pf | Status : SUCCESS ===
    ok
    Test VPC offering with invalid services ... === TestName: test_06_vpc_off_invalid_services | Status : SUCCESS ===
    ok
    Test update VPC offering ... === TestName: test_07_update_vpc_off | Status : SUCCESS ===
    ok
    Test list VPC offering ... === TestName: test_08_list_vpc_off | Status : SUCCESS ===
    ok
    test_09_create_redundant_vpc_offering (integration.component.test_vpc_offerings.TestVPCOffering) ... === TestName: test_09_create_redundant_vpc_offering | Status : SUCCESS ===
    ok
    Test start/stop of router after addition of one guest network ... === TestName: test_01_start_stop_router_after_addition_of_one_guest_network | Status : SUCCESS ===
    ok
    Test reboot of router after addition of one guest network ... === TestName: test_02_reboot_router_after_addition_of_one_guest_network | Status : SUCCESS ===
    ok
    Test to change service offering of router after addition of one guest network ... === TestName: test_04_chg_srv_off_router_after_addition_of_one_guest_network | Status : SUCCESS ===
    ok
    Test destroy of router after addition of one guest network ... === TestName: test_05_destroy_router_after_addition_of_one_guest_network | Status : SUCCESS ===
    ok
    Test to stop and start router after creation of VPC ... === TestName: test_01_stop_start_router_after_creating_vpc | Status : SUCCESS ===
    ok
    Test to reboot the router after creating a VPC ... === TestName: test_02_reboot_router_after_creating_vpc | Status : SUCCESS ===
    ok
    Tests to change service offering of the Router after ... === TestName: test_04_change_service_offerring_vpc | Status : SUCCESS ===
    ok
    Test to destroy the router after creating a VPC ... === TestName: test_05_destroy_router_after_creating_vpc | Status : SUCCESS ===
    ok
    
    ----------------------------------------------------------------------
    Ran 41 tests in 7917.791s
    
    OK
    ```
    
    I didn't test the actual feature. This just shows it didn't break anything else. Will be running your tests as well!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: Prevent live-lock in NSX API client

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/cloudstack/pull/1178


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: Prevent live-lock in NSX API client

Posted by wilderrodrigues <gi...@git.apache.org>.
Github user wilderrodrigues commented on the pull request:

    https://github.com/apache/cloudstack/pull/1178#issuecomment-162616790
  
    @miguelaferreira gave me some explanations about the problem and how he was going to fix it. I went through the code and also checked the tests he wrote.
    
    Based on his tests results, the tests @remibergsma execute and the review I did, this PR LGMT :+1: 
    
    Cheers,
    Wilder


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: Prevent live-lock in NSX API client

Posted by nvazquez <gi...@git.apache.org>.
Github user nvazquez commented on the pull request:

    https://github.com/apache/cloudstack/pull/1178#issuecomment-162231638
  
    Ok, I'll post results later


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: Prevent live-lock in NSX API client

Posted by miguelaferreira <gi...@git.apache.org>.
Github user miguelaferreira commented on the pull request:

    https://github.com/apache/cloudstack/pull/1178#issuecomment-162225746
  
    ping @nvazquez would you be able to test this PR?
    
    The new marvin test I added requires two hosts in order to test the NSX tunnel between them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---