You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cloudstack.apache.org by koushik-das <gi...@git.apache.org> on 2015/12/22 09:27:58 UTC

[GitHub] cloudstack pull request: CLOUDSTACK-9195: Cancelled/failed async j...

GitHub user koushik-das opened a pull request:

    https://github.com/apache/cloudstack/pull/1272

    CLOUDSTACK-9195: Cancelled/failed async jobs not getting cleaned up f…

    …rom DB
    
    Setting 'job_complete_msid' field in async_job table when marking job as cancelled during MS restart.
    This will ensure that the job is cleaned up as part of GC thread.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/koushik-das/cloudstack CLOUDSTACK-9195

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cloudstack/pull/1272.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1272
    
----
commit a6f9181ee15c64ce6358be855f2f5474ca1cb7ef
Author: Koushik Das <ko...@apache.org>
Date:   2015-12-22T08:23:29Z

    CLOUDSTACK-9195: Cancelled/failed async jobs not getting cleaned up from DB
    Setting 'job_complete_msid' field in async_job table when marking job as cancelled during MS restart.
    This will ensure that the job is cleaned up as part of GC thread.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: CLOUDSTACK-9195: Cancelled/failed async j...

Posted by remibergsma <gi...@git.apache.org>.
Github user remibergsma commented on the pull request:

    https://github.com/apache/cloudstack/pull/1272#issuecomment-172318575
  
    Ping @DaanHoogland @borisroman @wilderrodrigues Can you guys have a look at this one-line change please? Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: CLOUDSTACK-9195: Cancelled/failed async j...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/cloudstack/pull/1272


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: CLOUDSTACK-9195: Cancelled/failed async j...

Posted by koushik-das <gi...@git.apache.org>.
Github user koushik-das commented on the pull request:

    https://github.com/apache/cloudstack/pull/1272#issuecomment-166552880
  
    Steps to test:
    
    - Prior to fix check for jobs in async_job table where job_status = 2 and job_complete_msid is NULL.
    - These entries won't get cleaned up even after "job.expire.minutes" global config as part of GC thread.
    - After the fix the entries should get cleaned up.
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: CLOUDSTACK-9195: Cancelled/failed async j...

Posted by koushik-das <gi...@git.apache.org>.
Github user koushik-das commented on the pull request:

    https://github.com/apache/cloudstack/pull/1272#issuecomment-172421544
  
    @DaanHoogland Thanks for reviewing the code. On MS shutdown/restart the incomplete job entry will always get marked as cancelled in DB.
    You are right that an async job might continue to run even after MS shutdown. For e.g. MS is stopped in middle of deploy VM operation. If the command to deploy VM has reached HV before MS stop, then it will get started. The DB will be synced to correct state based on vm sync once the MS is started again.
    For volume and other operations (snapshot etc.), there is no recovery in these scenarios. Something like storage sync needs to be implemented which can update DB state based on the actual content/state of primary and secondary storages.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: CLOUDSTACK-9195: Cancelled/failed async j...

Posted by DaanHoogland <gi...@git.apache.org>.
Github user DaanHoogland commented on the pull request:

    https://github.com/apache/cloudstack/pull/1272#issuecomment-172323097
  
    @koushik-das your change makes perfect sense to me from a db housekeeping point of view. I have one (little) doubt; what are the semantic ramifications. An asynchronous job might have continued after MS shutdown and its result might be actually success or failure. I am not sure if diving into this field is appropriate for this PR. Just checking with you as you might have already considered it.
    
    code LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: CLOUDSTACK-9195: Cancelled/failed async j...

Posted by remibergsma <gi...@git.apache.org>.
Github user remibergsma commented on the pull request:

    https://github.com/apache/cloudstack/pull/1272#issuecomment-173205246
  
    LGTM based on these tests:
    
    ```
    nosetests --with-marvin --marvin-config=${marvinCfg} -s -a tags=advanced,required_hardware=true \
    component/test_vpc_redundant.py \
    component/test_routers_iptables_default_policy.py \
    component/test_routers_network_ops.py \
    component/test_vpc_router_nics.py \
    smoke/test_loadbalance.py \
    smoke/test_internal_lb.py \
    smoke/test_ssvm.py \
    smoke/test_network.py
    
    ```
    
    Result:
    
    ```
    Check the password file in the Router VM ... === TestName: test_isolate_network_password_server | Status : SUCCESS ===
    ok
    Create a redundant VPC with two networks with two VMs in each network ... === TestName: test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL | Status : SUCCESS ===
    ok
    Create a redundant VPC with two networks with two VMs in each network and check default routes ... === TestName: test_02_redundant_VPC_default_routes | Status : SUCCESS ===
    ok
    Create a redundant VPC with two networks with two VMs in each network ... === TestName: test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers | Status : SUCCESS ===
    ok
    Create a redundant VPC with 1 Tier, 1 VM, 1 ACL, 1 PF and test Network GC Nics ... === TestName: test_04_rvpc_network_garbage_collector_nics | Status : SUCCESS ===
    ok
    Create a redundant VPC with 1 Tier, 1 VM, 1 ACL, 1 PF and test Network GC Nics ... === TestName: test_05_rvpc_multi_tiers | Status : SUCCESS ===
    ok
    Test iptables default INPUT/FORWARD policy on RouterVM ... === TestName: test_02_routervm_iptables_policies | Status : SUCCESS ===
    ok
    Test iptables default INPUT/FORWARD policies on VPC router ... === TestName: test_01_single_VPC_iptables_policies | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_01_isolate_network_FW_PF_default_routes_egress_true | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_02_isolate_network_FW_PF_default_routes_egress_false | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_01_RVR_Network_FW_PF_SSH_default_routes_egress_true | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_02_RVR_Network_FW_PF_SSH_default_routes_egress_false | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_03_RVR_Network_check_router_state | Status : SUCCESS ===
    ok
    Create a VPC with two networks with one VM in each network and test nics after destroy ... === TestName: test_01_VPC_nics_after_destroy | Status : SUCCESS ===
    ok
    Create a VPC with two networks with one VM in each network and test default routes ... === TestName: test_02_VPC_default_routes | Status : SUCCESS ===
    ok
    Check that the /etc/dhcphosts.txt doesn't contain duplicate IPs ... === TestName: test_router_dhcphosts | Status : SUCCESS ===
    ok
    Test to create Load balancing rule with source NAT ... === TestName: test_01_create_lb_rule_src_nat | Status : SUCCESS ===
    ok
    Test to create Load balancing rule with non source NAT ... === TestName: test_02_create_lb_rule_non_nat | Status : SUCCESS ===
    ok
    Test for assign & removing load balancing rule ... === TestName: test_assign_and_removal_lb | Status : SUCCESS ===
    ok
    Test create, assign, remove of an Internal LB with roundrobin http traffic to 3 vm's in a Single VPC ... === TestName: test_01_internallb_roundrobin_1VPC_3VM_HTTP_port80 | Status : SUCCESS ===
    ok
    Test create, assign, remove of an Internal LB with roundrobin http traffic to 3 vm's in a Redundant VPC ... === TestName: test_02_internallb_roundrobin_1RVPC_3VM_HTTP_port80 | Status : SUCCESS ===
    ok
    Test to verify access to loadbalancer haproxy admin stats page ... === TestName: test_03_vpc_internallb_haproxy_stats_on_all_interfaces | Status : SUCCESS ===
    ok
    Test to verify access to loadbalancer haproxy admin stats page ... === TestName: test_04_rvpc_internallb_haproxy_stats_on_all_interfaces | Status : SUCCESS ===
    ok
    Test SSVM Internals ... === TestName: test_03_ssvm_internals | Status : SUCCESS ===
    ok
    Test CPVM Internals ... === TestName: test_04_cpvm_internals | Status : SUCCESS ===
    ok
    Test stop SSVM ... === TestName: test_05_stop_ssvm | Status : SUCCESS ===
    ok
    Test stop CPVM ... === TestName: test_06_stop_cpvm | Status : SUCCESS ===
    ok
    Test reboot SSVM ... === TestName: test_07_reboot_ssvm | Status : SUCCESS ===
    ok
    Test reboot CPVM ... === TestName: test_08_reboot_cpvm | Status : SUCCESS ===
    ok
    Test destroy SSVM ... === TestName: test_09_destroy_ssvm | Status : SUCCESS ===
    ok
    ok
    Create a redundant VPC with two networks with two VMs in each network ... === TestName: test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL | Status : SUCCESS ===
    ok
    Create a redundant VPC with two networks with two VMs in each network and check default routes ... === TestName: test_02_redundant_VPC_default_routes | Status : SUCCESS ===
    ok
    Create a redundant VPC with two networks with two VMs in each network ... === TestName: test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers | Status : SUCCESS ===
    ok
    Create a redundant VPC with 1 Tier, 1 VM, 1 ACL, 1 PF and test Network GC Nics ... === TestName: test_04_rvpc_network_garbage_collector_nics | Status : SUCCESS ===
    ok
    Create a redundant VPC with 1 Tier, 1 VM, 1 ACL, 1 PF and test Network GC Nics ... === TestName: test_05_rvpc_multi_tiers | Status : SUCCESS ===
    ok
    Test iptables default INPUT/FORWARD policy on RouterVM ... === TestName: test_02_routervm_iptables_policies | Status : SUCCESS ===
    ok
    Test iptables default INPUT/FORWARD policies on VPC router ... === TestName: test_01_single_VPC_iptables_policies | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_01_isolate_network_FW_PF_default_routes_egress_true | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_02_isolate_network_FW_PF_default_routes_egress_false | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_01_RVR_Network_FW_PF_SSH_default_routes_egress_true | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_02_RVR_Network_FW_PF_SSH_default_routes_egress_false | Status : SUCCESS ===
    ok
    Test redundant router internals ... === TestName: test_03_RVR_Network_check_router_state | Status : SUCCESS ===
    ok
    Create a VPC with two networks with one VM in each network and test nics after destroy ... === TestName: test_01_VPC_nics_after_destroy | Status : SUCCESS ===
    ok
    Create a VPC with two networks with one VM in each network and test default routes ... === TestName: test_02_VPC_default_routes | Status : SUCCESS ===
    ok
    Check that the /etc/dhcphosts.txt doesn't contain duplicate IPs ... === TestName: test_router_dhcphosts | Status : SUCCESS ===
    ok
    Test to create Load balancing rule with source NAT ... === TestName: test_01_create_lb_rule_src_nat | Status : SUCCESS ===
    ok
    Test to create Load balancing rule with non source NAT ... === TestName: test_02_create_lb_rule_non_nat | Status : SUCCESS ===
    ok
    Test for assign & removing load balancing rule ... === TestName: test_assign_and_removal_lb | Status : SUCCESS ===
    ok
    Test create, assign, remove of an Internal LB with roundrobin http traffic to 3 vm's in a Single VPC ... === TestName: test_01_internallb_roundrobin_1VPC_3VM_HTTP_port80 | Status : SUCCESS ===
    ok
    Test create, assign, remove of an Internal LB with roundrobin http traffic to 3 vm's in a Redundant VPC ... === TestName: test_02_internallb_roundrobin_1RVPC_3VM_HTTP_port80 | Status : SUCCESS ===
    ok
    Test to verify access to loadbalancer haproxy admin stats page ... === TestName: test_03_vpc_internallb_haproxy_stats_on_all_interfaces | Status : SUCCESS ===
    ok
    Test to verify access to loadbalancer haproxy admin stats page ... === TestName: test_04_rvpc_internallb_haproxy_stats_on_all_interfaces | Status : SUCCESS ===
    ok
    Test SSVM Internals ... === TestName: test_03_ssvm_internals | Status : SUCCESS ===
    ok
    Test CPVM Internals ... === TestName: test_04_cpvm_internals | Status : SUCCESS ===
    ok
    Test stop SSVM ... === TestName: test_05_stop_ssvm | Status : SUCCESS ===
    ok
    Test stop CPVM ... === TestName: test_06_stop_cpvm | Status : SUCCESS ===
    ok
    Test reboot SSVM ... === TestName: test_07_reboot_ssvm | Status : SUCCESS ===
    ok
    Test reboot CPVM ... === TestName: test_08_reboot_cpvm | Status : SUCCESS ===
    ok
    Test destroy SSVM ... === TestName: test_09_destroy_ssvm | Status : SUCCESS ===
    ok
    Test destroy CPVM ... === TestName: test_10_destroy_cpvm | Status : SUCCESS ===
    
    ----------------------------------------------------------------------
    Ran 44 tests in 20950.689s
    
    OK
    ```
    
    
    And:
    
    ```
    nosetests --with-marvin --marvin-config=${marvinCfg} -s -a tags=advanced,required_hardware=false \
    smoke/test_routers.py \
    smoke/test_network_acl.py \
    smoke/test_privategw_acl.py \
    smoke/test_reset_vm_on_reboot.py \
    smoke/test_vm_life_cycle.py \
    smoke/test_vpc_vpn.py \
    smoke/test_service_offerings.py \
    component/test_vpc_offerings.py \
    component/test_vpc_routers.py
    ```
    
    Result:
    
    ```
    Test router internal advanced zone ... === TestName: test_02_router_internal_adv | Status : SUCCESS ===
    ok
    Test restart network ... === TestName: test_03_restart_network_cleanup | Status : SUCCESS ===
    ok
    Test router basic setup ... === TestName: test_05_router_basic | Status : SUCCESS ===
    ok
    Test router advanced setup ... === TestName: test_06_router_advanced | Status : SUCCESS ===
    ok
    Test stop router ... === TestName: test_07_stop_router | Status : SUCCESS ===
    ok
    Test start router ... === TestName: test_08_start_router | Status : SUCCESS ===
    ok
    Test reboot router ... === TestName: test_09_reboot_router | Status : SUCCESS ===
    ok
    Test reset virtual machine on reboot ... === TestName: test_01_reset_vm_on_reboot | Status : SUCCESS ===
    ok
    Test advanced zone virtual router ... === TestName: test_advZoneVirtualRouter | Status : SUCCESS ===
    ok
    Test Deploy Virtual Machine ... === TestName: test_deploy_vm | Status : SUCCESS ===
    ok
    Test Multiple Deploy Virtual Machine ... === TestName: test_deploy_vm_multiple | Status : SUCCESS ===
    ok
    Test Stop Virtual Machine ... === TestName: test_01_stop_vm | Status : SUCCESS ===
    ok
    Test Start Virtual Machine ... === TestName: test_02_start_vm | Status : SUCCESS ===
    ok
    Test Reboot Virtual Machine ... === TestName: test_03_reboot_vm | Status : SUCCESS ===
    ok
    Test destroy Virtual Machine ... === TestName: test_06_destroy_vm | Status : SUCCESS ===
    ok
    Test recover Virtual Machine ... === TestName: test_07_restore_vm | Status : SUCCESS ===
    ok
    Test migrate VM ... === TestName: test_08_migrate_vm | Status : SUCCESS ===
    ok
    Test destroy(expunge) Virtual Machine ... === TestName: test_09_expunge_vm | Status : SUCCESS ===
    ok
    Test to create service offering ... === TestName: test_01_create_service_offering | Status : SUCCESS ===
    ok
    Test to update existing service offering ... === TestName: test_02_edit_service_offering | Status : SUCCESS ===
    ok
    Test to delete service offering ... === TestName: test_03_delete_service_offering | Status : SUCCESS ===
    ok
    Test for delete account ... === TestName: test_delete_account | Status : SUCCESS ===
    ok
    Test for Associate/Disassociate public IP address for admin account ... === TestName: test_public_ip_admin_account | Status : SUCCESS ===
    ok
    Test for Associate/Disassociate public IP address for user account ... === TestName: test_public_ip_user_account | Status : SUCCESS ===
    ok
    Test for release public IP address ... === TestName: test_releaseIP | Status : SUCCESS ===
    ok
    Test create VPC offering ... === TestName: test_01_create_vpc_offering | Status : SUCCESS ===
    ok
    Test VPC offering without load balancing service ... === TestName: test_03_vpc_off_without_lb | Status : SUCCESS ===
    ok
    Test VPC offering without static NAT service ... === TestName: test_04_vpc_off_without_static_nat | Status : SUCCESS ===
    ok
    Test VPC offering without port forwarding service ... === TestName: test_05_vpc_off_without_pf | Status : SUCCESS ===
    ok
    Test VPC offering with invalid services ... === TestName: test_06_vpc_off_invalid_services | Status : SUCCESS ===
    ok
    Test update VPC offering ... === TestName: test_07_update_vpc_off | Status : SUCCESS ===
    ok
    Test list VPC offering ... === TestName: test_08_list_vpc_off | Status : SUCCESS ===
    ok
    test_09_create_redundant_vpc_offering (integration.component.test_vpc_offerings.TestVPCOffering) ... === TestName: test_09_create_redundant_vpc_offering | Status : SUCCESS ===
    ok
    Test start/stop of router after addition of one guest network ... === TestName: test_01_start_stop_router_after_addition_of_one_guest_network | Status : SUCCESS ===
    ok
    Test reboot of router after addition of one guest network ... === TestName: test_02_reboot_router_after_addition_of_one_guest_network | Status : SUCCESS ===
    ok
    Test to change service offering of router after addition of one guest network ... === TestName: test_04_chg_srv_off_router_after_addition_of_one_guest_network | Status : SUCCESS ===
    ok
    Test destroy of router after addition of one guest network ... === TestName: test_05_destroy_router_after_addition_of_one_guest_network | Status : SUCCESS ===
    ok
    Test to stop and start router after creation of VPC ... === TestName: test_01_stop_start_router_after_creating_vpc | Status : SUCCESS ===
    ok
    Test to reboot the router after creating a VPC ... === TestName: test_02_reboot_router_after_creating_vpc | Status : SUCCESS ===
    ok
    Tests to change service offering of the Router after ... === TestName: test_04_change_service_offerring_vpc | Status : SUCCESS ===
    ok
    Test to destroy the router after creating a VPC ... === TestName: test_05_destroy_router_after_creating_vpc | Status : SUCCESS ===
    ok
    
    ----------------------------------------------------------------------
    Ran 41 tests in 7891.299s
    
    OK
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---