Posted to commits@cloudstack.apache.org by GitBox <gi...@apache.org> on 2018/10/29 07:45:58 UTC

[GitHub] wido opened a new issue #2978: Router aggregate timeout does not seem to be honored

wido opened a new issue #2978: Router aggregate timeout does not seem to be honored
URL: https://github.com/apache/cloudstack/issues/2978
 
 
   ##### ISSUE TYPE
    * Bug Report
   
   ##### COMPONENT NAME
   ~~~
   Virtual Router
   ~~~
   
   ##### CLOUDSTACK VERSION
   ~~~
   4.11.1
   ~~~
   
   ##### CONFIGURATION
   ~~~
   router.aggregation.command.each.timeout = 6000
   ~~~
   
   ##### OS / ENVIRONMENT
   Basic Networking
   
   
   ##### SUMMARY
   Router gets killed on Start due to timeout before configuration has completed
   
   
   ##### STEPS TO REPRODUCE
   ~~~
   Deploy a Virtual Router with ~600 DHCP entries
   ~~~
   
   ##### EXPECTED RESULTS
   ~~~
   VR should deploy properly
   ~~~
   
   ##### ACTUAL RESULTS
   ~~~
   Timeout was reached
   ~~~
   
   
   
   The story is that during an upgrade from 4.10 to 4.11.1 we (PCextreme) encountered a problem where Virtual Routers would not start.
   
   During their Start and configuration they ran into a timeout which caused the VR to get killed.
   
   For example we saw in the logs:
   
   <pre>
   2018-10-29 06:38:07,041 DEBUG [resource.virtualnetwork.VirtualRoutingResource] (agentRequest-Handler-6:null) (logid:ded92662) Aggregate action timeout in seconds is 665
   2018-10-29 06:38:07,041 DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-6:null) (logid:ded92662) Creating file in VR, with ip: 169.254.3.223, file: VR-d09aa357-27e3-4176-a283-9a7afedbae27.cfg
   2018-10-29 06:38:07,464 DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-6:null) (logid:ded92662) Executing: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh vr_cfg.sh 169.254.3.223 -c /var/cache/cloud/VR-d09aa357-27e3-4176-a283-9a7afedbae27.cfg 
   2018-10-29 06:38:07,466 DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-6:null) (logid:ded92662) Executing while with timeout : 665700
   </pre>
   
   So in this case the timeout was 665 seconds, about 11 minutes.
   
   We tried to increase *router.aggregation.command.each.timeout* both on the Management Server side and in *agent.properties*, but that did not seem to make any difference.
   
   For each DHCP entry a ~1 second timeout seems to be calculated. This VR has *609* DHCP entries:
   
   <pre>
   root@r-32727-VM:~# wc -l /etc/dhcphosts.txt 
   609 /etc/dhcphosts.txt
   root@r-32727-VM:~#
   </pre>
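
As a quick sanity check of the "~1 second per entry" estimate, here is a minimal sketch that only uses the numbers quoted above (the 665700 ms timeout from the agent log and the 609 lines in /etc/dhcphosts.txt). It assumes the timeout is spread evenly over the DHCP entries, which is a guess based on these numbers and not something confirmed from the CloudStack code; the function name is made up for illustration.

```python
# Values copied from the agent log and the wc output in this report.
TIMEOUT_MS = 665_700   # "Executing while with timeout : 665700"
DHCP_ENTRIES = 609     # lines in /etc/dhcphosts.txt


def per_entry_seconds(timeout_ms: int, entries: int) -> float:
    """Average timeout budget per DHCP entry, in seconds.

    Assumption: the total timeout is divided evenly across entries.
    """
    return timeout_ms / 1000 / entries


timeout_s = TIMEOUT_MS / 1000
print(f"total: {timeout_s:.1f} s ({timeout_s / 60:.1f} min)")
print(f"per entry: {per_entry_seconds(TIMEOUT_MS, DHCP_ENTRIES):.2f} s")
```

This gives roughly 1.09 seconds per entry and about 11 minutes in total, which matches the figures mentioned in this report.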
   
   10 minutes is a long time, and that needs improving as well, but apart from that it just would not start.
   
   My colleague created PR #2977, as this fixed the issue for us. We now need to investigate whether his fix is the proper one or whether the (default) timeout should be increased.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services