You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cloudstack.apache.org by GitBox <gi...@apache.org> on 2021/01/14 17:19:52 UTC

[GitHub] [cloudstack] GutoVeronezi opened a new pull request #4585: Externalize kvm agent storage timeout configuration

GutoVeronezi opened a new pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585


   ### Description
   On KVMHAMonitor, the heartbeat timeout is hardcoded to `60000` ms; therefore, operators do not have options to choose which timeout they want/need. 
   
   This PR intends to externalize the heartbeat timeout configuration on KVMHAMonitor.
   
   Depending on the use case, this timeout might need increasing/decreasing. Therefore, it is interesting to externalize such configurations for cloud operators. The hardcoded value is taken as the default one. Therefore, the current behavior is maintained.
   
   ### Types of changes
   - [ ] Breaking change (fix or feature that would cause existing functionality to change)
   - [ ] New feature (non-breaking change which adds functionality)
   - [ ] Bug fix (non-breaking change which fixes an issue)
   - [x] Enhancement (improves an existing feature and functioanlity)
   - [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
   
   ### Feature/Enhancement Scale or Bug Severity
   
   #### Feature/Enhancement Scale
   - [x] Major
   - [ ] Minor
   
   ### How Has This Been Tested?
   It has been tested locally on a test lab.
   1. I changed my log4j to debug mode.
   2. I added and changed `hearbeat.update.timeout` setting to `agent.properties` file several times.
   3. I restarted agent and watched the log.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on a change in pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on a change in pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#discussion_r643397715



##########
File path: plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/KVMHAMonitor.java
##########
@@ -32,153 +33,143 @@
 import java.util.concurrent.ConcurrentHashMap;
 
 public class KVMHAMonitor extends KVMHABase implements Runnable {
+
     private static final Logger s_logger = Logger.getLogger(KVMHAMonitor.class);
-    private final Map<String, NfsStoragePool> _storagePool = new ConcurrentHashMap<String, NfsStoragePool>();
+    private final Map<String, NfsStoragePool> storagePool = new ConcurrentHashMap<>();
 
-    private final String _hostIP; /* private ip address */
+    private final String hostPrivateIp;
 
     public KVMHAMonitor(NfsStoragePool pool, String host, String scriptPath) {
         if (pool != null) {
-            _storagePool.put(pool._poolUUID, pool);
+            storagePool.put(pool._poolUUID, pool);
         }
-        _hostIP = host;
+        hostPrivateIp = host;
         configureHeartBeatPath(scriptPath);
+
+        _heartBeatUpdateTimeout = AgentPropertiesFileHandler.getPropertyValue(AgentProperties.HEARTBEAT_UPDATE_TIMEOUT);
     }
 
     private static synchronized void configureHeartBeatPath(String scriptPath) {
         KVMHABase.s_heartBeatPath = scriptPath;
     }
 
     public void addStoragePool(NfsStoragePool pool) {
-        synchronized (_storagePool) {
-            _storagePool.put(pool._poolUUID, pool);
+        synchronized (storagePool) {
+            storagePool.put(pool._poolUUID, pool);
         }
     }
 
     public void removeStoragePool(String uuid) {
-        synchronized (_storagePool) {
-            NfsStoragePool pool = _storagePool.get(uuid);
+        synchronized (storagePool) {
+            NfsStoragePool pool = storagePool.get(uuid);
             if (pool != null) {
                 Script.runSimpleBashScript("umount " + pool._mountDestPath);
-                _storagePool.remove(uuid);
+                storagePool.remove(uuid);
             }
         }
     }
 
     public List<NfsStoragePool> getStoragePools() {
-        synchronized (_storagePool) {
-            return new ArrayList<NfsStoragePool>(_storagePool.values());
+        synchronized (storagePool) {
+            return new ArrayList<>(storagePool.values());
         }
     }
 
     public NfsStoragePool getStoragePool(String uuid) {
-        synchronized (_storagePool) {
-            return _storagePool.get(uuid);
+        synchronized (storagePool) {
+            return storagePool.get(uuid);
         }
     }
 
-    private class Monitor extends ManagedContextRunnable {
+    protected void runHeartBeat() {
+        synchronized (storagePool) {
+            Set<String> removedPools = new HashSet<>();
+            for (String uuid : storagePool.keySet()) {
+                NfsStoragePool primaryStoragePool = storagePool.get(uuid);
+                StoragePool storage;
+                try {
+                    Connect conn = LibvirtConnection.getConnection();
+                    storage = conn.storagePoolLookupByUUIDString(uuid);
+                    if (storage == null || storage.getInfo().state != StoragePoolState.VIR_STORAGE_POOL_RUNNING) {
+                        if (storage == null) {
+                            s_logger.debug(String.format("Libvirt storage pool [%s] not found, removing from HA list.", uuid));
+                        } else {
+                            s_logger.debug(String.format("Libvirt storage pool [%s] found, but not running, removing from HA list.", uuid));
+                        }
 
-        @Override
-        protected void runInContext() {
-            synchronized (_storagePool) {
-                Set<String> removedPools = new HashSet<String>();
-                for (String uuid : _storagePool.keySet()) {
-                    NfsStoragePool primaryStoragePool = _storagePool.get(uuid);
+                        removedPools.add(uuid);
+                        continue;
+                    }
 
-                    // check for any that have been deregistered with libvirt and
-                    // skip,remove them
+                    s_logger.debug(String.format("Found NFS storage pool [%s] in libvirt, continuing.", uuid));
 
-                    StoragePool storage = null;
-                    try {
-                        Connect conn = LibvirtConnection.getConnection();
-                        storage = conn.storagePoolLookupByUUIDString(uuid);
-                        if (storage == null) {
-                            s_logger.debug("Libvirt storage pool " + uuid + " not found, removing from HA list");
-                            removedPools.add(uuid);
-                            continue;
+                } catch (LibvirtException e) {
+                    s_logger.debug(String.format("Failed to lookup libvirt storage pool [%s].", uuid), e);
 
-                        } else if (storage.getInfo().state != StoragePoolState.VIR_STORAGE_POOL_RUNNING) {
-                            s_logger.debug("Libvirt storage pool " + uuid + " found, but not running, removing from HA list");
+                    if (e.toString().contains("pool not found")) {
+                        s_logger.debug(String.format("Removing pool [%s] from HA monitor since it was deleted.", uuid));
+                        removedPools.add(uuid);
+                        continue;
+                    }
 
-                            removedPools.add(uuid);
-                            continue;
-                        }
-                        s_logger.debug("Found NFS storage pool " + uuid + " in libvirt, continuing");
+                }
 
-                    } catch (LibvirtException e) {
-                        s_logger.debug("Failed to lookup libvirt storage pool " + uuid + " due to: " + e);
+                String result = null;
+                for (int i = 1; i <= _heartBeatUpdateMaxTries; i++) {
+                    Script cmd = createHeartBeatCommand(primaryStoragePool);
+                    cmd.add("-h", hostPrivateIp);
+                    result = cmd.execute();
 
-                        // we only want to remove pool if it's not found, not if libvirt
-                        // connection fails
-                        if (e.toString().contains("pool not found")) {
-                            s_logger.debug("removing pool from HA monitor since it was deleted");
-                            removedPools.add(uuid);
-                            continue;
-                        }
-                    }
+                    s_logger.debug(String.format("The command (%s), to the pool [%s], has the result [%s].", cmd.toString(), uuid, result));
 
-                    String result = null;
-                    // Try multiple times, but sleep in between tries to ensure it isn't a short lived transient error
-                    for (int i = 1; i <= _heartBeatUpdateMaxTries; i++) {
-                        Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
-                        cmd.add("-i", primaryStoragePool._poolIp);
-                        cmd.add("-p", primaryStoragePool._poolMountSourcePath);
-                        cmd.add("-m", primaryStoragePool._mountDestPath);
-                        cmd.add("-h", _hostIP);
-                        result = cmd.execute();
-                        if (result != null) {
-                            s_logger.warn("write heartbeat failed: " + result + ", try: " + i + " of " + _heartBeatUpdateMaxTries);
-                            try {
-                                Thread.sleep(_heartBeatUpdateRetrySleep);
-                            } catch (InterruptedException e) {
-                                s_logger.debug("[ignored] interupted between heartbeat retries.");
-                            }
-                        } else {
-                            break;
+                    if (result != null) {
+                        s_logger.warn(String.format("Write heartbeat for pool [%s] failed: %s; try: %s of %s.", uuid, result, i, _heartBeatUpdateMaxTries));
+                        try {
+                            Thread.sleep(_heartBeatUpdateRetrySleep);
+                        } catch (InterruptedException e) {
+                            s_logger.debug("[IGNORED] Interrupted between heartbeat retries.", e);
                         }
+                    } else {
+                        break;
                     }
 
-                    if (result != null) {
-                        // Stop cloudstack-agent if can't write to heartbeat file.
-                        // This will raise an alert on the mgmt server
-                        s_logger.warn("write heartbeat failed: " + result + "; stopping cloudstack-agent");
-                        Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
-                        cmd.add("-i", primaryStoragePool._poolIp);
-                        cmd.add("-p", primaryStoragePool._poolMountSourcePath);
-                        cmd.add("-m", primaryStoragePool._mountDestPath);
-                        cmd.add("-c");
-                        result = cmd.execute();
-                    }
                 }
 
-                if (!removedPools.isEmpty()) {
-                    for (String uuid : removedPools) {
-                        removeStoragePool(uuid);
-                    }
+                if (result != null) {
+                    s_logger.warn(String.format("Write heartbeat for pool [%s] failed: %s; stopping cloudstack-agent.", uuid, result));
+                    Script cmd = createHeartBeatCommand(primaryStoragePool);
+                    cmd.add("-c");
+                    result = cmd.execute();
                 }
             }
 
+            if (!removedPools.isEmpty()) {
+                for (String uuid : removedPools) {
+                    removeStoragePool(uuid);
+                }
+            }
         }
+
+    }
+
+    protected Script createHeartBeatCommand(NfsStoragePool primaryStoragePool) {

Review comment:
       done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-763743473


   > > @ravening this parameter exists already but it is hardcoded in `KVMHABase.java`, it is set as:
   > > `protected long _heartBeatUpdateTimeout = 60000;`
   > 
   > @GabrielBrascher true but I know that Wei added several global settings to configure the values as well the preferred action as well in a much simpler way. I see in this PR that some new files are added to handle the properties which are not needed
   
   @ravening can you show me where this is externalized (if it is)? As far as we have seen in the code, there is no way for operators to configure this right now. Therefore, we need this patch to allow operators to configure the storage check timeout.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] ravening commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
ravening commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-762809491


   @ustcweizhou do we have this param already? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GabrielBrascher commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GabrielBrascher commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-762817748


   @ravening this parameter exists already but it is hardcoded in `KVMHABase.java`, it is set as:
   `protected long _heartBeatUpdateTimeout = 60000;`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-854526484


   > About the question:
   > This PR intends to externalize the heartbeat timeout configuration on `KVMHAMonitor` via `agent.properties`; And the purpose of these classes is to retrieve the properties from file; So in the situation we are working, we always want the value from file.
   > Is there any other situation that I should consider?
   
   I don't know but I could immagine there could be properties that require a restart on change and forbid a re-read from file. things like connection settings for instance. I am not sure. It was a legitimate question.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-760588911


   @GabrielBrascher sorry, my bad. 
   
   I forgot to add the new variable to the file.
   
   I'll add later.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] RodrigoDLopez commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
RodrigoDLopez commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-881539086


   Hey @GutoVeronezi 
   Thank you for your contribution. You're doing an pretty job.
   
   @DaanHoogland @GabrielBrascher @rhtyd 
   
   I ran some manual tests on this branch, and as far as I could verify everything works correctly.
   Before upgrading to this branch, the agents log was always the same. Something like:
   
   ```
   DEBUG [kvm.resource.KVMHAMonitor] Executing: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh -i 201.200.102.10 -p /export/primary2 -m /mnt/8c8fbcc3-eeef-33c1-adf6-51b915df61ca -h 201.200.101.4 
   DEBUG [kvm.resource.KVMHAMonitor] Executing while with timeout : 60000
   DEBUG [kvm.resource.KVMHAMonitor] Execution is successful.
   ```
   
   After updating, I ran some tests changing the parameter value and checking if the default value would be used.
   
   * with the default value of heartbeat.update.timeout (commented or missing parameter)
   ```
   DEBUG [kvm.resource.KVMHAMonitor] Executing: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh -i 201.200.102.10 -p /export/primary2 -m /mnt/8c8fbcc3-eeef-33c1-adf6-51b915df61ca -h 201.200.101.4 
   DEBUG [kvm.resource.KVMHAMonitor] Executing while with timeout : 60000
   DEBUG [kvm.resource.KVMHAMonitor] Execution is successful.
   ```
   
   * heartbeat.update.timeout=30000
   ```
   DEBUG [kvm.resource.KVMHAMonitor] Executing: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh -i 201.200.102.10 -p /export/primary2 -m /mnt/8c8fbcc3-eeef-33c1-adf6-51b915df61ca -h 201.200.101.4 
   DEBUG [kvm.resource.KVMHAMonitor] Executing while with timeout : 30000
   DEBUG [kvm.resource.KVMHAMonitor] Execution is successful.
   ```
   
   * heartbeat.update.timeout=10000
   ```
   DEBUG [kvm.resource.KVMHAMonitor] Executing: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh -i 201.200.102.10 -p /export/primary2 -m /mnt/8c8fbcc3-eeef-33c1-adf6-51b915df61ca -h 201.200.101.4 
   DEBUG [kvm.resource.KVMHAMonitor] Executing while with timeout : 10000
   DEBUG [kvm.resource.KVMHAMonitor] Execution is successful.
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] blueorangutan commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
blueorangutan commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-857313839


   <b>Trillian test result (tid-897)</b>
   Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
   Total time taken: 54700 seconds
   Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4585-t897-kvm-centos7.zip
   Intermittent failure detected: /marvin/tests/smoke/test_reset_vm_on_reboot.py
   Intermittent failure detected: /marvin/tests/smoke/test_resource_accounting.py
   Intermittent failure detected: /marvin/tests/smoke/test_router_dhcphosts.py
   Intermittent failure detected: /marvin/tests/smoke/test_router_dns.py
   Intermittent failure detected: /marvin/tests/smoke/test_router_dnsservice.py
   Intermittent failure detected: /marvin/tests/smoke/test_routers_network_ops.py
   Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
   Smoke tests completed. 82 look OK, 6 have error(s)
   Only failed tests results shown below:
   
   
   Test | Result | Time (s) | Test File
   --- | --- | --- | ---
   ContextSuite context=TestResetVmOnReboot>:setup | `Error` | 0.00 | test_reset_vm_on_reboot.py
   ContextSuite context=TestRAMCPUResourceAccounting>:setup | `Error` | 0.00 | test_resource_accounting.py
   ContextSuite context=TestRouterDHCPHosts>:setup | `Error` | 0.00 | test_router_dhcphosts.py
   ContextSuite context=TestRouterDHCPOpts>:setup | `Error` | 0.00 | test_router_dhcphosts.py
   ContextSuite context=TestRouterDns>:setup | `Error` | 0.00 | test_router_dns.py
   test_01_RVR_Network_FW_PF_SSH_default_routes_egress_true | `Failure` | 334.77 | test_routers_network_ops.py
   test_02_RVR_Network_FW_PF_SSH_default_routes_egress_false | `Failure` | 334.37 | test_routers_network_ops.py
   test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers | `Failure` | 458.49 | test_vpc_redundant.py
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland edited a comment on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland edited a comment on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882565561


   I found an issue after merging this. addHost fails in several environments:
   
   I think we'll need to revert (which I'm testing now in a lab env)
   the smoke tests succeeded so the problem is not this PR in isolation, but something got merged that conflicted with this probably.
   
   cc @GutoVeronezi @GabrielBrascher @RodrigoDLopez @nvazquez 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] blueorangutan commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
blueorangutan commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-856613755


   Packaging result: :heavy_check_mark: centos7 :heavy_check_mark: centos8 :heavy_check_mark: debian. SL-JID 188


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland merged pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland merged pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] nvazquez commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
nvazquez commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882594625


   Thanks @DaanHoogland seems like reverting on the internal branch made addHost work again. Please update once tests are completed so this can be reverted 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on a change in pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on a change in pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#discussion_r642609935



##########
File path: plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/KVMHAMonitor.java
##########
@@ -32,153 +33,141 @@
 import java.util.concurrent.ConcurrentHashMap;
 
 public class KVMHAMonitor extends KVMHABase implements Runnable {
+
     private static final Logger s_logger = Logger.getLogger(KVMHAMonitor.class);
-    private final Map<String, NfsStoragePool> _storagePool = new ConcurrentHashMap<String, NfsStoragePool>();
+    private final Map<String, NfsStoragePool> storagePool = new ConcurrentHashMap<>();
 
-    private final String _hostIP; /* private ip address */
+    private final String hostPrivateIp;
 
     public KVMHAMonitor(NfsStoragePool pool, String host, String scriptPath) {
         if (pool != null) {
-            _storagePool.put(pool._poolUUID, pool);
+            storagePool.put(pool._poolUUID, pool);
         }
-        _hostIP = host;
+        hostPrivateIp = host;
         configureHeartBeatPath(scriptPath);
+
+        _heartBeatUpdateTimeout = AgentPropertiesFileHandler.getPropertyValue(AgentProperties.HEARTBEAT_UPDATE_TIMEOUT);
     }
 
     private static synchronized void configureHeartBeatPath(String scriptPath) {
         KVMHABase.s_heartBeatPath = scriptPath;
     }
 
     public void addStoragePool(NfsStoragePool pool) {
-        synchronized (_storagePool) {
-            _storagePool.put(pool._poolUUID, pool);
+        synchronized (storagePool) {
+            storagePool.put(pool._poolUUID, pool);
         }
     }
 
     public void removeStoragePool(String uuid) {
-        synchronized (_storagePool) {
-            NfsStoragePool pool = _storagePool.get(uuid);
+        synchronized (storagePool) {
+            NfsStoragePool pool = storagePool.get(uuid);
             if (pool != null) {
                 Script.runSimpleBashScript("umount " + pool._mountDestPath);
-                _storagePool.remove(uuid);
+                storagePool.remove(uuid);
             }
         }
     }
 
     public List<NfsStoragePool> getStoragePools() {
-        synchronized (_storagePool) {
-            return new ArrayList<NfsStoragePool>(_storagePool.values());
+        synchronized (storagePool) {
+            return new ArrayList<>(storagePool.values());
         }
     }
 
     public NfsStoragePool getStoragePool(String uuid) {
-        synchronized (_storagePool) {
-            return _storagePool.get(uuid);
+        synchronized (storagePool) {
+            return storagePool.get(uuid);
         }
     }
 
-    private class Monitor extends ManagedContextRunnable {
+    protected void runHeartBeat() {
+        synchronized (storagePool) {
+            Set<String> removedPools = new HashSet<>();
+            for (String uuid : storagePool.keySet()) {
+                NfsStoragePool primaryStoragePool = storagePool.get(uuid);
+                StoragePool storage;
+                try {
+                    Connect conn = LibvirtConnection.getConnection();
+                    storage = conn.storagePoolLookupByUUIDString(uuid);
+                    if (storage == null || storage.getInfo().state != StoragePoolState.VIR_STORAGE_POOL_RUNNING) {
+                        if (storage == null) {
+                            s_logger.debug(String.format("Libvirt storage pool [%s] not found, removing from HA list.", uuid));
+                        } else {
+                            s_logger.debug(String.format("Libvirt storage pool [%s] found, but not running, removing from HA list.", uuid));
+                        }
 
-        @Override
-        protected void runInContext() {
-            synchronized (_storagePool) {
-                Set<String> removedPools = new HashSet<String>();
-                for (String uuid : _storagePool.keySet()) {
-                    NfsStoragePool primaryStoragePool = _storagePool.get(uuid);
+                        removedPools.add(uuid);
+                        continue;
+                    }
 
-                    // check for any that have been deregistered with libvirt and
-                    // skip,remove them
+                    s_logger.debug(String.format("Found NFS storage pool [%s] in libvirt, continuing.", uuid));
 
-                    StoragePool storage = null;
-                    try {
-                        Connect conn = LibvirtConnection.getConnection();
-                        storage = conn.storagePoolLookupByUUIDString(uuid);
-                        if (storage == null) {
-                            s_logger.debug("Libvirt storage pool " + uuid + " not found, removing from HA list");
-                            removedPools.add(uuid);
-                            continue;
+                } catch (LibvirtException e) {
+                    s_logger.debug(String.format("Failed to lookup libvirt storage pool [%s].", uuid), e);
 
-                        } else if (storage.getInfo().state != StoragePoolState.VIR_STORAGE_POOL_RUNNING) {
-                            s_logger.debug("Libvirt storage pool " + uuid + " found, but not running, removing from HA list");
+                    if (e.toString().contains("pool not found")) {
+                        s_logger.debug(String.format("Removing pool [%s] from HA monitor since it was deleted.", uuid));
+                        removedPools.add(uuid);
+                        continue;
+                    }
 
-                            removedPools.add(uuid);
-                            continue;
-                        }
-                        s_logger.debug("Found NFS storage pool " + uuid + " in libvirt, continuing");
+                }
 
-                    } catch (LibvirtException e) {
-                        s_logger.debug("Failed to lookup libvirt storage pool " + uuid + " due to: " + e);
+                String result = null;
+                for (int i = 1; i <= _heartBeatUpdateMaxTries; i++) {
+                    Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
+                    cmd.add("-i", primaryStoragePool._poolIp);
+                    cmd.add("-p", primaryStoragePool._poolMountSourcePath);
+                    cmd.add("-m", primaryStoragePool._mountDestPath);
+                    cmd.add("-h", hostPrivateIp);
+                    result = cmd.execute();
 
-                        // we only want to remove pool if it's not found, not if libvirt
-                        // connection fails
-                        if (e.toString().contains("pool not found")) {
-                            s_logger.debug("removing pool from HA monitor since it was deleted");
-                            removedPools.add(uuid);
-                            continue;
-                        }
-                    }
+                    s_logger.debug(String.format("The command (%s), to the pool [%s], has the result [%s].", cmd.toString(), uuid, result));
 
-                    String result = null;
-                    // Try multiple times, but sleep in between tries to ensure it isn't a short lived transient error
-                    for (int i = 1; i <= _heartBeatUpdateMaxTries; i++) {
-                        Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
-                        cmd.add("-i", primaryStoragePool._poolIp);
-                        cmd.add("-p", primaryStoragePool._poolMountSourcePath);
-                        cmd.add("-m", primaryStoragePool._mountDestPath);
-                        cmd.add("-h", _hostIP);
-                        result = cmd.execute();
-                        if (result != null) {
-                            s_logger.warn("write heartbeat failed: " + result + ", try: " + i + " of " + _heartBeatUpdateMaxTries);
-                            try {
-                                Thread.sleep(_heartBeatUpdateRetrySleep);
-                            } catch (InterruptedException e) {
-                                s_logger.debug("[ignored] interupted between heartbeat retries.");
-                            }
-                        } else {
-                            break;
+                    if (result != null) {
+                        s_logger.warn(String.format("Write heartbeat for pool [%s] failed: %s; try: %s of %s.", uuid, result, i, _heartBeatUpdateMaxTries));
+                        try {
+                            Thread.sleep(_heartBeatUpdateRetrySleep);
+                        } catch (InterruptedException e) {
+                            s_logger.debug("[IGNORED] Interrupted between heartbeat retries.", e);
                         }
+                    } else {
+                        break;
                     }
 
-                    if (result != null) {
-                        // Stop cloudstack-agent if can't write to heartbeat file.
-                        // This will raise an alert on the mgmt server
-                        s_logger.warn("write heartbeat failed: " + result + "; stopping cloudstack-agent");
-                        Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
-                        cmd.add("-i", primaryStoragePool._poolIp);
-                        cmd.add("-p", primaryStoragePool._poolMountSourcePath);
-                        cmd.add("-m", primaryStoragePool._mountDestPath);
-                        cmd.add("-c");
-                        result = cmd.execute();
-                    }
                 }
 
-                if (!removedPools.isEmpty()) {
-                    for (String uuid : removedPools) {
-                        removeStoragePool(uuid);
-                    }
+                if (result != null) {
+                    s_logger.warn(String.format("Write heartbeat for pool [%s] failed: %s; stopping cloudstack-agent.", uuid, result));
+                    Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
+                    cmd.add("-i", primaryStoragePool._poolIp);
+                    cmd.add("-p", primaryStoragePool._poolMountSourcePath);
+                    cmd.add("-m", primaryStoragePool._mountDestPath);

Review comment:
       @GabrielBrascher done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882565561






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-850866396


   @RodrigoDLopez @GabrielBrascher @DaanHoogland @sureshanaparti I'm pinging you here because you all reviewed PR [4586](https://github.com/apache/cloudstack/pull/4586).
   
   First I want to apologize for this inconvenience. This PR and [4586](https://github.com/apache/cloudstack/pull/4586) are some of my first PRs and I messed up they a little bit.
   
   Both PRs proposes differents changes, but they are linked by the new classes `AgentPropertiesFileHandler` and `AgentProperties`. These new classes were introducted in this PR and [4586](https://github.com/apache/cloudstack/pull/4586) was proposed based on it; I should have waited for this to be merged and then propose that.
   
   I marked [4586](https://github.com/apache/cloudstack/pull/4586) as draft and applied yours suggestions in this PR. If it be merged, I will rebase [4586](https://github.com/apache/cloudstack/pull/4586).
   
   Could you review this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] blueorangutan commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
blueorangutan commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882639088


   <b>Trillian Build Failed (tid-1318)<b/>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland edited a comment on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland edited a comment on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882565561


   I found an issue after merging this. addHost fails in several environments:
   
   I think we'll need to revert (which I'm testing now in a lab env)
   the smoke tests succeeded so the problem is not this PR in isolation, but something got merged that conflicted with this probably.
   
   cc @GutoVeronezi @GabrielBrascher @RodrigoDLopez @nvazquez 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882649288


   @DaanHoogland np, let's get this working 😊.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GabrielBrascher commented on a change in pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GabrielBrascher commented on a change in pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#discussion_r642490468



##########
File path: agent/src/main/java/com/cloud/agent/properties/AgentPropertiesFileHandler.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Copyright 2021 The Apache Software Foundation.

Review comment:
       Same here :-)

##########
File path: agent/src/main/java/com/cloud/agent/properties/AgentProperties.java
##########
@@ -0,0 +1,53 @@
+/*
+ * Copyright 2021 The Apache Software Foundation.

Review comment:
       @GutoVeronezi can you please remove these Copyright lines on the source code header?

##########
File path: plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/KVMHAMonitor.java
##########
@@ -32,153 +33,141 @@
 import java.util.concurrent.ConcurrentHashMap;
 
 public class KVMHAMonitor extends KVMHABase implements Runnable {
+
     private static final Logger s_logger = Logger.getLogger(KVMHAMonitor.class);
-    private final Map<String, NfsStoragePool> _storagePool = new ConcurrentHashMap<String, NfsStoragePool>();
+    private final Map<String, NfsStoragePool> storagePool = new ConcurrentHashMap<>();
 
-    private final String _hostIP; /* private ip address */
+    private final String hostPrivateIp;
 
     public KVMHAMonitor(NfsStoragePool pool, String host, String scriptPath) {
         if (pool != null) {
-            _storagePool.put(pool._poolUUID, pool);
+            storagePool.put(pool._poolUUID, pool);
         }
-        _hostIP = host;
+        hostPrivateIp = host;
         configureHeartBeatPath(scriptPath);
+
+        _heartBeatUpdateTimeout = AgentPropertiesFileHandler.getPropertyValue(AgentProperties.HEARTBEAT_UPDATE_TIMEOUT);
     }
 
     private static synchronized void configureHeartBeatPath(String scriptPath) {
         KVMHABase.s_heartBeatPath = scriptPath;
     }
 
     public void addStoragePool(NfsStoragePool pool) {
-        synchronized (_storagePool) {
-            _storagePool.put(pool._poolUUID, pool);
+        synchronized (storagePool) {
+            storagePool.put(pool._poolUUID, pool);
         }
     }
 
     public void removeStoragePool(String uuid) {
-        synchronized (_storagePool) {
-            NfsStoragePool pool = _storagePool.get(uuid);
+        synchronized (storagePool) {
+            NfsStoragePool pool = storagePool.get(uuid);
             if (pool != null) {
                 Script.runSimpleBashScript("umount " + pool._mountDestPath);
-                _storagePool.remove(uuid);
+                storagePool.remove(uuid);
             }
         }
     }
 
     public List<NfsStoragePool> getStoragePools() {
-        synchronized (_storagePool) {
-            return new ArrayList<NfsStoragePool>(_storagePool.values());
+        synchronized (storagePool) {
+            return new ArrayList<>(storagePool.values());
         }
     }
 
     public NfsStoragePool getStoragePool(String uuid) {
-        synchronized (_storagePool) {
-            return _storagePool.get(uuid);
+        synchronized (storagePool) {
+            return storagePool.get(uuid);
         }
     }
 
-    private class Monitor extends ManagedContextRunnable {
+    protected void runHeartBeat() {
+        synchronized (storagePool) {
+            Set<String> removedPools = new HashSet<>();
+            for (String uuid : storagePool.keySet()) {
+                NfsStoragePool primaryStoragePool = storagePool.get(uuid);
+                StoragePool storage;
+                try {
+                    Connect conn = LibvirtConnection.getConnection();
+                    storage = conn.storagePoolLookupByUUIDString(uuid);
+                    if (storage == null || storage.getInfo().state != StoragePoolState.VIR_STORAGE_POOL_RUNNING) {
+                        if (storage == null) {
+                            s_logger.debug(String.format("Libvirt storage pool [%s] not found, removing from HA list.", uuid));
+                        } else {
+                            s_logger.debug(String.format("Libvirt storage pool [%s] found, but not running, removing from HA list.", uuid));
+                        }
 
-        @Override
-        protected void runInContext() {
-            synchronized (_storagePool) {
-                Set<String> removedPools = new HashSet<String>();
-                for (String uuid : _storagePool.keySet()) {
-                    NfsStoragePool primaryStoragePool = _storagePool.get(uuid);
+                        removedPools.add(uuid);
+                        continue;
+                    }
 
-                    // check for any that have been deregistered with libvirt and
-                    // skip,remove them
+                    s_logger.debug(String.format("Found NFS storage pool [%s] in libvirt, continuing.", uuid));
 
-                    StoragePool storage = null;
-                    try {
-                        Connect conn = LibvirtConnection.getConnection();
-                        storage = conn.storagePoolLookupByUUIDString(uuid);
-                        if (storage == null) {
-                            s_logger.debug("Libvirt storage pool " + uuid + " not found, removing from HA list");
-                            removedPools.add(uuid);
-                            continue;
+                } catch (LibvirtException e) {
+                    s_logger.debug(String.format("Failed to lookup libvirt storage pool [%s].", uuid), e);
 
-                        } else if (storage.getInfo().state != StoragePoolState.VIR_STORAGE_POOL_RUNNING) {
-                            s_logger.debug("Libvirt storage pool " + uuid + " found, but not running, removing from HA list");
+                    if (e.toString().contains("pool not found")) {
+                        s_logger.debug(String.format("Removing pool [%s] from HA monitor since it was deleted.", uuid));
+                        removedPools.add(uuid);
+                        continue;
+                    }
 
-                            removedPools.add(uuid);
-                            continue;
-                        }
-                        s_logger.debug("Found NFS storage pool " + uuid + " in libvirt, continuing");
+                }
 
-                    } catch (LibvirtException e) {
-                        s_logger.debug("Failed to lookup libvirt storage pool " + uuid + " due to: " + e);
+                String result = null;
+                for (int i = 1; i <= _heartBeatUpdateMaxTries; i++) {
+                    Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
+                    cmd.add("-i", primaryStoragePool._poolIp);
+                    cmd.add("-p", primaryStoragePool._poolMountSourcePath);
+                    cmd.add("-m", primaryStoragePool._mountDestPath);
+                    cmd.add("-h", hostPrivateIp);
+                    result = cmd.execute();
 
-                        // we only want to remove pool if it's not found, not if libvirt
-                        // connection fails
-                        if (e.toString().contains("pool not found")) {
-                            s_logger.debug("removing pool from HA monitor since it was deleted");
-                            removedPools.add(uuid);
-                            continue;
-                        }
-                    }
+                    s_logger.debug(String.format("The command (%s), to the pool [%s], has the result [%s].", cmd.toString(), uuid, result));
 
-                    String result = null;
-                    // Try multiple times, but sleep in between tries to ensure it isn't a short lived transient error
-                    for (int i = 1; i <= _heartBeatUpdateMaxTries; i++) {
-                        Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
-                        cmd.add("-i", primaryStoragePool._poolIp);
-                        cmd.add("-p", primaryStoragePool._poolMountSourcePath);
-                        cmd.add("-m", primaryStoragePool._mountDestPath);
-                        cmd.add("-h", _hostIP);
-                        result = cmd.execute();
-                        if (result != null) {
-                            s_logger.warn("write heartbeat failed: " + result + ", try: " + i + " of " + _heartBeatUpdateMaxTries);
-                            try {
-                                Thread.sleep(_heartBeatUpdateRetrySleep);
-                            } catch (InterruptedException e) {
-                                s_logger.debug("[ignored] interupted between heartbeat retries.");
-                            }
-                        } else {
-                            break;
+                    if (result != null) {
+                        s_logger.warn(String.format("Write heartbeat for pool [%s] failed: %s; try: %s of %s.", uuid, result, i, _heartBeatUpdateMaxTries));
+                        try {
+                            Thread.sleep(_heartBeatUpdateRetrySleep);
+                        } catch (InterruptedException e) {
+                            s_logger.debug("[IGNORED] Interrupted between heartbeat retries.", e);
                         }
+                    } else {
+                        break;
                     }
 
-                    if (result != null) {
-                        // Stop cloudstack-agent if can't write to heartbeat file.
-                        // This will raise an alert on the mgmt server
-                        s_logger.warn("write heartbeat failed: " + result + "; stopping cloudstack-agent");
-                        Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
-                        cmd.add("-i", primaryStoragePool._poolIp);
-                        cmd.add("-p", primaryStoragePool._poolMountSourcePath);
-                        cmd.add("-m", primaryStoragePool._mountDestPath);
-                        cmd.add("-c");
-                        result = cmd.execute();
-                    }
                 }
 
-                if (!removedPools.isEmpty()) {
-                    for (String uuid : removedPools) {
-                        removeStoragePool(uuid);
-                    }
+                if (result != null) {
+                    s_logger.warn(String.format("Write heartbeat for pool [%s] failed: %s; stopping cloudstack-agent.", uuid, result));
+                    Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
+                    cmd.add("-i", primaryStoragePool._poolIp);
+                    cmd.add("-p", primaryStoragePool._poolMountSourcePath);
+                    cmd.add("-m", primaryStoragePool._mountDestPath);

Review comment:
       I know that these lines are duplicated just twice, but maybe it can be a nice thing to externalize these lines into a method such as `createHeartbeatCommand` or any name that is self-explaining. No problem if you don't proceed that way, but it might be a nice thing to add.
   
   From:
   ```
   Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
   cmd.add("-i", primaryStoragePool._poolIp);
   cmd.add("-p", primaryStoragePool._poolMountSourcePath);
   cmd.add("-m", primaryStoragePool._mountDestPath);
   ```
   To :arrow_down: 
   
   ```
   /**
    * Optional, but normally nice to have :-)      
    */
   private Script createHeartBeatCommand(String heartBeatPath, long heartBeatUpdateTimeout, Logger logger) {
        Script cmd = new Script(heartBeatPath, heartBeatUpdateTimeout, logger);
        cmd.add("-i", primaryStoragePool._poolIp);
        cmd.add("-p", primaryStoragePool._poolMountSourcePath);
        cmd.add("-m", primaryStoragePool._mountDestPath);
        return cmd;
   }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi edited a comment on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi edited a comment on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-760588911


   @GabrielBrascher sorry, my bad. 
   
   I forgot to add the new variable to the file.
   
   I'll add it later.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882634342


   the revert indeed works, I re-tested this PR with this ^^ latest build but am reverting this as other work is being blocked. @GutoVeronezi please rebase your code and submit a new PR after testing it. sorry about this.
   
   ```
   2021-07-19 15:00:12,092 DEBUG [c.c.a.ApiServlet] (qtp1233705144-19:ctx-27a6578a) (logid:4d64198f) ===START===  10.0.35.79 -- GET  apiKey=LIN6rqXuaJwMPfGYFh13qDwYz5VNNz1J2J6qIOWcd3oLQOq0WtD4CwRundBL6rzXToa3lQOC_vKjI3nkHtiD8Q&clusterid=fead03fb-7b43-4b88-9996-1213244c666b&command=addHost&hypervisor=KVM&podid=6b29cd4a-67c7-4dad-b388-146d4db00283&response=json&url=http%3A%2F%2F10.0.32.159&username=root&zoneid=333cd252-9808-4573-9a3e-f8133cd80024&signature=Soqqt4PW5siERQw%2FLaWG%2B27pAL0%3D
   2021-07-19 15:00:12,095 DEBUG [c.c.a.ApiServer] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605) (logid:4d64198f) CIDRs from which account 'Acct[36d8ca2c-e8a1-11eb-8bd8-1e008a0003e2-admin] -- Account {"id": 2, "name": "admin", "uuid": "36d8ca2c-e8a1-11eb-8bd8-1e008a0003e2"}' is allowed to perform API calls: 0.0.0.0/0,::/0
   2021-07-19 15:00:12,111 INFO  [c.c.r.ResourceManagerImpl] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Trying to add a new host at http://10.0.32.159 in data center 1
   2021-07-19 15:00:12,328 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Executing cmd: ls /dev/kvm
   2021-07-19 15:00:13,544 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) SSH command: ls /dev/kvm
   SSH command output:/dev/kvm
   
   
   2021-07-19 15:00:13,545 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Executing cmd: rpm -qa | grep -i ovmf
   2021-07-19 15:00:13,593 DEBUG [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-5:ctx-d5f0f4cc) (logid:f64b89cf) HA health check task is running...
   2021-07-19 15:00:15,224 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Executing cmd: rpm -qa | grep -i ovmf
   2021-07-19 15:00:15,333 DEBUG [c.c.s.StatsCollector] (StatsCollector-4:ctx-d0938266) (logid:7fcaea1f) AutoScaling Monitor is running...
   2021-07-19 15:00:15,334 DEBUG [c.c.s.StatsCollector] (StatsCollector-2:ctx-ebeff815) (logid:797af13f) HostStatsCollector is running...
   2021-07-19 15:00:15,341 DEBUG [c.c.s.StatsCollector] (StatsCollector-5:ctx-c89bff8b) (logid:80c2d343) StorageCollector is running...
   2021-07-19 15:00:16,906 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Executing cmd: rpm -qa | grep -i ovmf
   2021-07-19 15:00:17,594 DEBUG [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-1:ctx-615114d4) (logid:491f8fc6) HA health check task is running...
   2021-07-19 15:00:18,618 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Executing cmd: sudo /usr/share/cloudstack-common/scripts/util/keystore-setup /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/
   2021-07-19 15:00:18,818 DEBUG [c.c.a.ApiServlet] (qtp1233705144-18:ctx-05850c1a) (logid:ad1635fe) ===START===  10.0.35.79 -- GET  apiKey=LIN6rqXuaJwMPfGYFh13qDwYz5VNNz1J2J6qIOWcd3oLQOq0WtD4CwRundBL6rzXToa3lQOC_vKjI3nkHtiD8Q&command=listHosts&response=json&signature=BYVU5RhMdLyIIO33ThMJS7BwqWk%3D
   2021-07-19 15:00:18,821 DEBUG [c.c.a.ApiServer] (qtp1233705144-18:ctx-05850c1a ctx-7f427059) (logid:ad1635fe) CIDRs from which account 'Acct[36d8ca2c-e8a1-11eb-8bd8-1e008a0003e2-admin] -- Account {"id": 2, "name": "admin", "uuid": "36d8ca2c-e8a1-11eb-8bd8-1e008a0003e2"}' is allowed to perform API calls: 0.0.0.0/0,::/0
   2021-07-19 15:00:18,824 DEBUG [c.c.a.q.QueryManagerImpl] (qtp1233705144-18:ctx-05850c1a ctx-7f427059 ctx-1a31f824) (logid:ad1635fe) >>>Searching for hosts>>>
   2021-07-19 15:00:18,828 DEBUG [c.c.a.q.QueryManagerImpl] (qtp1233705144-18:ctx-05850c1a ctx-7f427059 ctx-1a31f824) (logid:ad1635fe) >>>Generating Response>>>
   2021-07-19 15:00:18,829 DEBUG [c.c.a.ApiServlet] (qtp1233705144-18:ctx-05850c1a ctx-7f427059 ctx-1a31f824) (logid:ad1635fe) ===END===  10.0.35.79 -- GET  apiKey=LIN6rqXuaJwMPfGYFh13qDwYz5VNNz1J2J6qIOWcd3oLQOq0WtD4CwRundBL6rzXToa3lQOC_vKjI3nkHtiD8Q&command=listHosts&response=json&signature=BYVU5RhMdLyIIO33ThMJS7BwqWk%3D
   2021-07-19 15:00:21,231 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) SSH command: sudo /usr/share/cloudstack-common/scripts/util/keystore-setup /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/
   SSH command output:
   
   2021-07-19 15:00:21,314 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Executing cmd: sudo /usr/share/cloudstack-common/scripts/util/keystore-cert-import /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/
   2021-07-19 15:00:21,595 DEBUG [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-6:ctx-bcbc4ac3) (logid:8233c09d) HA health check task is running...
   2021-07-19 15:00:21,900 DEBUG [c.c.c.ConsoleProxyManagerImpl] (consoleproxy-1:ctx-26411329) (logid:d1390140) Skip capacity scan as there is no Primary Storage in 'Up' state
   2021-07-19 15:00:23,717 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) SSH command: sudo /usr/share/cloudstack-common/scripts/util/keystore-cert-import /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/
   SSH command output:
   chmod: cannot access ‘/etc/cloudstack/agent/cloud.key’: No such file or directory
   
   2021-07-19 15:00:23,717 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Executing cmd: sudo /usr/share/cloudstack-common/scripts/util/keystore-cert-import /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/
   2021-07-19 15:00:25,596 DEBUG [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-5:ctx-a24575f0) (logid:38c28e87) HA health check task is running...
   2021-07-19 15:00:26,194 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) SSH command: sudo /usr/share/cloudstack-common/scripts/util/keystore-cert-import /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/
   SSH command output:
   chmod: cannot access ‘/etc/cloudstack/agent/cloud.key’: No such file or directory
   
   2021-07-19 15:00:26,195 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Executing cmd: sudo /usr/share/cloudstack-common/scripts/util/keystore-cert-import /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/
   2021-07-19 15:00:28,584 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) SSH command: sudo /usr/share/cloudstack-common/scripts/util/keystore-cert-import /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/
   SSH command output:
   chmod: cannot access ‘/etc/cloudstack/agent/cloud.key’: No such file or directory
   
   2021-07-19 15:00:28,585 WARN  [c.c.h.k.d.LibvirtServerDiscoverer] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f)  can't setup agent, due to com.cloud.utils.exception.CloudRuntimeException: Failed to setup certificate in the KVM agent's keystore file, please see logs and configure manually! - Failed to setup certificate in the KVM agent's keystore file, please see logs and configure manually!
   2021-07-19 15:00:28,588 WARN  [c.c.r.ResourceManagerImpl] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Unable to find the server resources at http://10.0.32.159
   2021-07-19 15:00:28,588 INFO  [c.c.u.e.CSExceptionErrorCode] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Could not find exception: com.cloud.exception.DiscoveryException in error code list for exceptions
   2021-07-19 15:00:28,588 WARN  [o.a.c.a.c.a.h.AddHostCmd] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Exception: 
   com.cloud.exception.DiscoveryException: Unable to add the host
           at com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:845)
           at com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:631)
           at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.base/java.lang.reflect.Method.invoke(Method.java:566)
           at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
           at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
           at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
           at com.sun.proxy.$Proxy197.discoverHosts(Unknown Source)
           at org.apache.cloudstack.api.command.admin.host.AddHostCmd.execute(AddHostCmd.java:142)
           at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:156)
           at com.cloud.api.ApiServer.queueCommand(ApiServer.java:764)
           at com.cloud.api.ApiServer.handleRequest(ApiServer.java:588)
           at com.cloud.api.ApiServlet.processRequestInContext(ApiServlet.java:321)
           at com.cloud.api.ApiServlet$1.run(ApiServlet.java:134)
           at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
           at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
           at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
           at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:131)
           at com.cloud.api.ApiServlet.doGet(ApiServlet.java:93)
           at javax.servlet.http.HttpServlet.service(HttpServlet.java:645)
           at javax.servlet.http.HttpServlet.service(HttpServlet.java:750)
           at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1443)
           at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:791)
           at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:550)
           at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
           at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
           at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
           at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
           at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
           at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
           at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
           at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
           at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
           at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
           at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
           at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
           at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:766)
           at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
           at org.eclipse.jetty.server.Server.handle(Server.java:516)
           at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
           at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
           at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
           at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
           at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
           at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
           at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
           at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
           at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
           at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
           at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
           at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
           at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773)
           at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905)
           at java.base/java.lang.Thread.run(Thread.java:829)
   2021-07-19 15:00:28,590 INFO  [c.c.a.ApiServer] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) Unable to add the host
   2021-07-19 15:00:28,590 DEBUG [c.c.a.ApiServlet] (qtp1233705144-19:ctx-27a6578a ctx-7dc75605 ctx-d460a62f) (logid:4d64198f) ===END===  10.0.35.79 -- GET  apiKey=LIN6rqXuaJwMPfGYFh13qDwYz5VNNz1J2J6qIOWcd3oLQOq0WtD4CwRundBL6rzXToa3lQOC_vKjI3nkHtiD8Q&clusterid=fead03fb-7b43-4b88-9996-1213244c666b&command=addHost&hypervisor=KVM&podid=6b29cd4a-67c7-4dad-b388-146d4db00283&response=json&url=http%3A%2F%2F10.0.32.159&username=root&zoneid=333cd252-9808-4573-9a3e-f8133cd80024&signature=Soqqt4PW5siERQw%2FLaWG%2B27pAL0%3D
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] blueorangutan commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
blueorangutan commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882603417






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-859426545


   > There are some errors on Marvin's tests. Not sure if they are related to this PR, though.
   
   started a second run, we'll see if the errors are consistent.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland commented on a change in pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on a change in pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#discussion_r642358272



##########
File path: plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/KVMHAMonitor.java
##########
@@ -32,153 +33,141 @@
 import java.util.concurrent.ConcurrentHashMap;
 
 public class KVMHAMonitor extends KVMHABase implements Runnable {
+
     private static final Logger s_logger = Logger.getLogger(KVMHAMonitor.class);
-    private final Map<String, NfsStoragePool> _storagePool = new ConcurrentHashMap<String, NfsStoragePool>();
+    private final Map<String, NfsStoragePool> storagePool = new ConcurrentHashMap<>();
 
-    private final String _hostIP; /* private ip address */
+    private final String hostPrivateIp;

Review comment:
       👍 

##########
File path: agent/src/main/java/com/cloud/agent/properties/AgentProperties.java
##########
@@ -0,0 +1,52 @@
+/*
+ * Copyright 2021 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package com.cloud.agent.properties;
+
+/**
+ * Class of constant agent's properties available to configure on
+ * "agent.properties".
+ *<br><br>
+ * Not all available agent properties are defined here, but we should work to
+ * migrate them on demand to this class.
+ *
+ * @param <T> type of the default value.

Review comment:
       👍 useful javadoc, nice to see

##########
File path: agent/conf/agent.properties
##########
@@ -263,3 +263,7 @@ iscsi.session.cleanup.enabled=false
 # Automatically clean up iscsi sessions not attached to any VM.
 # Should be enabled for users using managed storage for example solidfire.
 # Should be disabled for users with unmanaged iscsi connections on their hosts
+
+# This parameter specifies the heartbeat update timeout in ms.
+# Depending on the use case, this timeout might need increasing/decreasing.
+# heartbeat.update.timeout=60000

Review comment:
       I suppose 60000 is the default (1 min) wouldn't hurt to make that explicit in here for operator convenience.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-854699740


   > > About the question:
   > > This PR intends to externalize the heartbeat timeout configuration on `KVMHAMonitor` via `agent.properties`; And the purpose of these classes is to retrieve the properties from file; So in the situation we are working, we always want the value from file.
   > > Is there any other situation that I should consider?
   > 
   > I don't know but I could immagine there could be properties that require a restart on change and forbid a re-read from file. things like connection settings for instance. I am not sure. It was a legitimate question.
   
   @DaanHoogland It's a good use case.
   In both cases, in some point of time, we will have to retrieve the property from the file; Maybe the question is when implement the call to `getPropertyValue`. `static final` would call it only once and, for example, in this PR, the method `getPropertyValue` is called once in constructor of  `KVMHAMonitor`, which is instantiated only once in `LibvirtComputingResource.configure`.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-869831142


   Any update about this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] blueorangutan commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
blueorangutan commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882603417


   Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian. SL-JID 589


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-860746579


   @DaanHoogland @GabrielBrascher is there anything else to do?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland edited a comment on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland edited a comment on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882565561


   I found an issue after merging this. addHost fails in several environments:
   
   I think we'll need to revert (which I'm testing now in a lab env)
   the smoke tests succeeded so the problem is not this PR in isolation, but something got merged that conflicted with this probably.
   
   cc @GutoVeronezi @GabrielBrascher @RodrigoDLopez @nvazquez 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882565561


   I found an issue after merging this. addHost fails in several environments:
   ```
   /agent.properties /etc/cloudstack/agent/
   SSH command output:
   cat: /etc/cloudstack/agent/cloud.csr: No such file or directory
   chmod: cannot access '/etc/cloudstack/agent/cloud.csr': No such file or directory
   2021-07-19 13:46:10,300 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-20:ctx-9944ed37 ctx-cd993147) (logid:b0400373) Executing cmd: sudo /usr/share/cloudstack-common/scripts/util/keystore-setup /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/
   2021-07-19 13:46:11,624 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-11:null) (logid:) SeqA 3-437166: Processing Seq 3-437166:  { Cmd , MgmtId: -1, via: 3, Ver: v1, Flags: 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":"2","_loadInfo":"{
     "connections": []
   }","wait":"0","bypassHostMaintenance":"false"}}] }
   2021-07-19 13:46:11,627 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-11:null) (logid:) SeqA 3-437166: Sending Seq 3-437166:  { Ans: , MgmtId: 32987898970999, via: 3, Ver: v1, Flags: 100010, [{"com.cloud.agent.api.AgentControlAnswer":{"result":"true","wait":"0","bypassHostMaintenance":"false"}}] }
   2021-07-19 13:46:12,333 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-20:ctx-9944ed37 ctx-cd993147) (logid:b0400373) SSH command: sudo /usr/share/cloudstack-common/scripts/util/keystore-setup /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/
   SSH command output:
   cat: /etc/cloudstack/agent/cloud.csr: No such file or directory
   chmod: cannot access '/etc/cloudstack/agent/cloud.csr': No such file or directory
   2021-07-19 13:46:12,333 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-20:ctx-9944ed37 ctx-cd993147) (logid:b0400373) Executing cmd: sudo /usr/share/cloudstack-common/scripts/util/keystore-setup /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/
   2021-07-19 13:46:13,330 DEBUG [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-6:ctx-c7b3d575) (logid:a7a1aba2) HA health check task is running...
   2021-07-19 13:46:14,362 DEBUG [c.c.u.s.SSHCmdHelper] (qtp1233705144-20:ctx-9944ed37 ctx-cd993147) (logid:b0400373) SSH command: sudo /usr/share/cloudstack-common/scripts/util/keystore-setup /etc/cloudstack/agent/agent.properties /etc/cloudstack/agent/
   SSH command output:
   cat: /etc/cloudstack/agent/cloud.csr: No such file or directory
   chmod: cannot access '/etc/cloudstack/agent/cloud.csr': No such file or directory
   2021-07-19 13:46:14,363 WARN  [c.c.h.k.d.LibvirtServerDiscoverer] (qtp1233705144-20:ctx-9944ed37 ctx-cd993147) (logid:b0400373)  can't setup agent, due to com.cloud.utils.exception.CloudRuntimeException: Failed to setup keystore on the KVM host: 10.0.34.223 - Failed to setup keystore on the KVM host: 10.0.34.223
   2021-07-19 13:46:14,364 WARN  [c.c.r.ResourceManagerImpl] (qtp1233705144-20:ctx-9944ed37 ctx-cd993147) (logid:b0400373) Unable to find the server resources at http://10.0.34.223
   2021-07-19 13:46:14,364 INFO  [c.c.u.e.CSExceptionErrorCode] (qtp1233705144-20:ctx-9944ed37 ctx-cd993147) (logid:b0400373) Could not find exception: com.cloud.exception.DiscoveryException in error code list for exceptions
   2021-07-19 13:46:14,365 WARN  [o.a.c.a.c.a.h.AddHostCmd] (qtp1233705144-20:ctx-9944ed37 ctx-cd993147) (logid:b0400373) Exception: 
   com.cloud.exception.DiscoveryException: Unable to add the host
   	at com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:845)
   	at com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:631)
   	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
   	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
   	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
   	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
   	at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
   	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
   	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
   	at com.sun.proxy.$Proxy194.discoverHosts(Unknown Source)
   	at org.apache.cloudstack.api.command.admin.host.AddHostCmd.execute(AddHostCmd.java:142)
   	at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:156)
   	at com.cloud.api.ApiServer.queueCommand(ApiServer.java:764)
   	at com.cloud.api.ApiServer.handleRequest(ApiServer.java:588)
   	at com.cloud.api.ApiServlet.processRequestInContext(ApiServlet.java:321)
   	at com.cloud.api.ApiServlet$1.run(ApiServlet.java:134)
   	at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
   	at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
   	at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
   	at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:131)
   	at com.cloud.api.ApiServlet.doPost(ApiServlet.java:98)
   	at javax.servlet.http.HttpServlet.service(HttpServlet.java:665)
   	at javax.servlet.http.HttpServlet.service(HttpServlet.java:750)
   	at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1443)
   	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:791)
   	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:550)
   	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
   	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
   	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
   	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
   	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
   	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
   	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
   	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
   	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
   	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
   	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
   	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
   	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
   	at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:766)
   	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
   	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
   	at org.eclipse.jetty.server.Server.handle(Server.java:516)
   	at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
   	at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
   	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
   	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
   	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
   	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
   	at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
   	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
   	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
   	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
   	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
   	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
   	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773)
   	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905)
   	at java.base/java.lang.Thread.run(Thread.java:829)
   ```
   I think we'll need to revert (which I'm testing now in a lab env)
   the smoke tests succeeded so the problem is not this PR in isolation, but something got merged that conflicted with this probably.
   
   cc @GutoVeronezi @GabrielBrascher @RodrigoDLopez @nvazquez 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on a change in pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on a change in pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#discussion_r642483269



##########
File path: agent/conf/agent.properties
##########
@@ -263,3 +263,7 @@ iscsi.session.cleanup.enabled=false
 # Automatically clean up iscsi sessions not attached to any VM.
 # Should be enabled for users using managed storage for example solidfire.
 # Should be disabled for users with unmanaged iscsi connections on their hosts
+
+# This parameter specifies the heartbeat update timeout in ms.
+# Depending on the use case, this timeout might need increasing/decreasing.
+# heartbeat.update.timeout=60000

Review comment:
       @DaanHoogland done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-856593942


   > > > About the question:
   > > > This PR intends to externalize the heartbeat timeout configuration on `KVMHAMonitor` via `agent.properties`; And the purpose of these classes is to retrieve the properties from file; So in the situation we are working, we always want the value from file.
   > > > Is there any other situation that I should consider?
   > > 
   > > 
   > > I don't know but I could immagine there could be properties that require a restart on change and forbid a re-read from file. things like connection settings for instance. I am not sure. It was a legitimate question.
   > 
   > @DaanHoogland It's a good use case.
   > In both cases, in some point of time, we will have to retrieve the property from the file; Maybe the question is when implement the call to `getPropertyValue`. `static final` would call it only once and, for example, in this PR, the method `getPropertyValue` is called once in constructor of `KVMHAMonitor`, which is instantiated only once in `LibvirtComputingResource.configure`.
   
   You are right and also, let's solve the issue when it really arises. this is good for now at least.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GabrielBrascher commented on a change in pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GabrielBrascher commented on a change in pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#discussion_r643371437



##########
File path: plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/KVMHAMonitor.java
##########
@@ -32,153 +33,143 @@
 import java.util.concurrent.ConcurrentHashMap;
 
 public class KVMHAMonitor extends KVMHABase implements Runnable {
+
     private static final Logger s_logger = Logger.getLogger(KVMHAMonitor.class);
-    private final Map<String, NfsStoragePool> _storagePool = new ConcurrentHashMap<String, NfsStoragePool>();
+    private final Map<String, NfsStoragePool> storagePool = new ConcurrentHashMap<>();
 
-    private final String _hostIP; /* private ip address */
+    private final String hostPrivateIp;
 
     public KVMHAMonitor(NfsStoragePool pool, String host, String scriptPath) {
         if (pool != null) {
-            _storagePool.put(pool._poolUUID, pool);
+            storagePool.put(pool._poolUUID, pool);
         }
-        _hostIP = host;
+        hostPrivateIp = host;
         configureHeartBeatPath(scriptPath);
+
+        _heartBeatUpdateTimeout = AgentPropertiesFileHandler.getPropertyValue(AgentProperties.HEARTBEAT_UPDATE_TIMEOUT);
     }
 
     private static synchronized void configureHeartBeatPath(String scriptPath) {
         KVMHABase.s_heartBeatPath = scriptPath;
     }
 
     public void addStoragePool(NfsStoragePool pool) {
-        synchronized (_storagePool) {
-            _storagePool.put(pool._poolUUID, pool);
+        synchronized (storagePool) {
+            storagePool.put(pool._poolUUID, pool);
         }
     }
 
     public void removeStoragePool(String uuid) {
-        synchronized (_storagePool) {
-            NfsStoragePool pool = _storagePool.get(uuid);
+        synchronized (storagePool) {
+            NfsStoragePool pool = storagePool.get(uuid);
             if (pool != null) {
                 Script.runSimpleBashScript("umount " + pool._mountDestPath);
-                _storagePool.remove(uuid);
+                storagePool.remove(uuid);
             }
         }
     }
 
     public List<NfsStoragePool> getStoragePools() {
-        synchronized (_storagePool) {
-            return new ArrayList<NfsStoragePool>(_storagePool.values());
+        synchronized (storagePool) {
+            return new ArrayList<>(storagePool.values());
         }
     }
 
     public NfsStoragePool getStoragePool(String uuid) {
-        synchronized (_storagePool) {
-            return _storagePool.get(uuid);
+        synchronized (storagePool) {
+            return storagePool.get(uuid);
         }
     }
 
-    private class Monitor extends ManagedContextRunnable {
+    protected void runHeartBeat() {
+        synchronized (storagePool) {
+            Set<String> removedPools = new HashSet<>();
+            for (String uuid : storagePool.keySet()) {
+                NfsStoragePool primaryStoragePool = storagePool.get(uuid);
+                StoragePool storage;
+                try {
+                    Connect conn = LibvirtConnection.getConnection();
+                    storage = conn.storagePoolLookupByUUIDString(uuid);
+                    if (storage == null || storage.getInfo().state != StoragePoolState.VIR_STORAGE_POOL_RUNNING) {
+                        if (storage == null) {
+                            s_logger.debug(String.format("Libvirt storage pool [%s] not found, removing from HA list.", uuid));
+                        } else {
+                            s_logger.debug(String.format("Libvirt storage pool [%s] found, but not running, removing from HA list.", uuid));
+                        }
 
-        @Override
-        protected void runInContext() {
-            synchronized (_storagePool) {
-                Set<String> removedPools = new HashSet<String>();
-                for (String uuid : _storagePool.keySet()) {
-                    NfsStoragePool primaryStoragePool = _storagePool.get(uuid);
+                        removedPools.add(uuid);
+                        continue;
+                    }
 
-                    // check for any that have been deregistered with libvirt and
-                    // skip,remove them
+                    s_logger.debug(String.format("Found NFS storage pool [%s] in libvirt, continuing.", uuid));
 
-                    StoragePool storage = null;
-                    try {
-                        Connect conn = LibvirtConnection.getConnection();
-                        storage = conn.storagePoolLookupByUUIDString(uuid);
-                        if (storage == null) {
-                            s_logger.debug("Libvirt storage pool " + uuid + " not found, removing from HA list");
-                            removedPools.add(uuid);
-                            continue;
+                } catch (LibvirtException e) {
+                    s_logger.debug(String.format("Failed to lookup libvirt storage pool [%s].", uuid), e);
 
-                        } else if (storage.getInfo().state != StoragePoolState.VIR_STORAGE_POOL_RUNNING) {
-                            s_logger.debug("Libvirt storage pool " + uuid + " found, but not running, removing from HA list");
+                    if (e.toString().contains("pool not found")) {
+                        s_logger.debug(String.format("Removing pool [%s] from HA monitor since it was deleted.", uuid));
+                        removedPools.add(uuid);
+                        continue;
+                    }
 
-                            removedPools.add(uuid);
-                            continue;
-                        }
-                        s_logger.debug("Found NFS storage pool " + uuid + " in libvirt, continuing");
+                }
 
-                    } catch (LibvirtException e) {
-                        s_logger.debug("Failed to lookup libvirt storage pool " + uuid + " due to: " + e);
+                String result = null;
+                for (int i = 1; i <= _heartBeatUpdateMaxTries; i++) {
+                    Script cmd = createHeartBeatCommand(primaryStoragePool);
+                    cmd.add("-h", hostPrivateIp);
+                    result = cmd.execute();
 
-                        // we only want to remove pool if it's not found, not if libvirt
-                        // connection fails
-                        if (e.toString().contains("pool not found")) {
-                            s_logger.debug("removing pool from HA monitor since it was deleted");
-                            removedPools.add(uuid);
-                            continue;
-                        }
-                    }
+                    s_logger.debug(String.format("The command (%s), to the pool [%s], has the result [%s].", cmd.toString(), uuid, result));
 
-                    String result = null;
-                    // Try multiple times, but sleep in between tries to ensure it isn't a short lived transient error
-                    for (int i = 1; i <= _heartBeatUpdateMaxTries; i++) {
-                        Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
-                        cmd.add("-i", primaryStoragePool._poolIp);
-                        cmd.add("-p", primaryStoragePool._poolMountSourcePath);
-                        cmd.add("-m", primaryStoragePool._mountDestPath);
-                        cmd.add("-h", _hostIP);
-                        result = cmd.execute();
-                        if (result != null) {
-                            s_logger.warn("write heartbeat failed: " + result + ", try: " + i + " of " + _heartBeatUpdateMaxTries);
-                            try {
-                                Thread.sleep(_heartBeatUpdateRetrySleep);
-                            } catch (InterruptedException e) {
-                                s_logger.debug("[ignored] interupted between heartbeat retries.");
-                            }
-                        } else {
-                            break;
+                    if (result != null) {
+                        s_logger.warn(String.format("Write heartbeat for pool [%s] failed: %s; try: %s of %s.", uuid, result, i, _heartBeatUpdateMaxTries));
+                        try {
+                            Thread.sleep(_heartBeatUpdateRetrySleep);
+                        } catch (InterruptedException e) {
+                            s_logger.debug("[IGNORED] Interrupted between heartbeat retries.", e);
                         }
+                    } else {
+                        break;
                     }
 
-                    if (result != null) {
-                        // Stop cloudstack-agent if can't write to heartbeat file.
-                        // This will raise an alert on the mgmt server
-                        s_logger.warn("write heartbeat failed: " + result + "; stopping cloudstack-agent");
-                        Script cmd = new Script(s_heartBeatPath, _heartBeatUpdateTimeout, s_logger);
-                        cmd.add("-i", primaryStoragePool._poolIp);
-                        cmd.add("-p", primaryStoragePool._poolMountSourcePath);
-                        cmd.add("-m", primaryStoragePool._mountDestPath);
-                        cmd.add("-c");
-                        result = cmd.execute();
-                    }
                 }
 
-                if (!removedPools.isEmpty()) {
-                    for (String uuid : removedPools) {
-                        removeStoragePool(uuid);
-                    }
+                if (result != null) {
+                    s_logger.warn(String.format("Write heartbeat for pool [%s] failed: %s; stopping cloudstack-agent.", uuid, result));
+                    Script cmd = createHeartBeatCommand(primaryStoragePool);
+                    cmd.add("-c");
+                    result = cmd.execute();
                 }
             }
 
+            if (!removedPools.isEmpty()) {
+                for (String uuid : removedPools) {
+                    removeStoragePool(uuid);
+                }
+            }
         }
+
+    }
+
+    protected Script createHeartBeatCommand(NfsStoragePool primaryStoragePool) {

Review comment:
       Nice!
   
   I would just change access from protected to private, especially considering that it uses some variables accessible just in this class `s_heartBeatPath`, `_heartBeatUpdateTimeout`, `s_logger`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882649288


   @DaanHoogland np, let's get this working 😊.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882649288


   @DaanHoogland np, let's get this working 😊.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] blueorangutan commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
blueorangutan commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-859942511


   <b>Trillian test result (tid-915)</b>
   Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
   Total time taken: 46008 seconds
   Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4585-t915-kvm-centos7.zip
   Intermittent failure detected: /marvin/tests/smoke/test_kubernetes_clusters.py
   Intermittent failure detected: /marvin/tests/smoke/test_snapshots.py
   Intermittent failure detected: /marvin/tests/smoke/test_volumes.py
   Intermittent failure detected: /marvin/tests/smoke/test_hostha_kvm.py
   Smoke tests completed. 87 look OK, 1 have error(s)
   Only failed tests results shown below:
   
   
   Test | Result | Time (s) | Test File
   --- | --- | --- | ---
   test_07_deploy_kubernetes_ha_cluster | `Failure` | 3664.49 | test_kubernetes_clusters.py
   test_08_deploy_and_upgrade_kubernetes_ha_cluster | `Failure` | 0.05 | test_kubernetes_clusters.py
   test_09_delete_kubernetes_ha_cluster | `Failure` | 0.04 | test_kubernetes_clusters.py
   ContextSuite context=TestKubernetesCluster>:teardown | `Error` | 77.38 | test_kubernetes_clusters.py
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-851495871


   > looks good, one question and one remark towards design;
   > q: properties are always read from file. Is that always what we want?
   > r: the use of the `AgentProperties<>` (plural) is to instantiate a single property. I think it makes sense to split the class in a container `AgentProperties`, and an element `AgentProperty<>`
   > 
   > code looks good
   
   @DaanHoogland 
   
   About the remark:
   As you suggested, `AgentProperties` now is a container of `AgentProperty<>`. I turned `AgentProperty<>` an inner class of `AgentProperties` to keep accessibility policy (private constructor).
   
   About the question:
   This PR intends to externalize the heartbeat timeout configuration on `KVMHAMonitor` via `agent.properties`; And the purpose of these classes is to retrieve the properties from file; So in the situation we are working, we always want the value from file.
   Is there any other situation that I should consider?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882565561






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi edited a comment on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi edited a comment on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-850866396


   @RodrigoDLopez @GabrielBrascher @DaanHoogland @sureshanaparti I'm pinging you here because you all reviewed PR [4586](https://github.com/apache/cloudstack/pull/4586).
   
   First I want to apologize for this inconvenience. This PR and [4586](https://github.com/apache/cloudstack/pull/4586) are some of my first PRs and I messed up they a little bit.
   
   Both PRs proposes differents changes, but they are linked by the new classes `AgentPropertiesFileHandler` and `AgentProperties`. These new classes were introducted in this PR and [4586](https://github.com/apache/cloudstack/pull/4586) was proposed based on it; I should have waited for this to be merged and then propose that.
   
   I marked [4586](https://github.com/apache/cloudstack/pull/4586) as draft and applied yours suggestions in this PR. Also I added unit tests to the handler. If it be merged, I will rebase [4586](https://github.com/apache/cloudstack/pull/4586).
   
   Could you review this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-861855274


   @GabrielBrascher  regression tests look ok, did you test?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GabrielBrascher commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GabrielBrascher commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-858256580


   There are some errors on Marvin's tests. Not sure if they are related to this PR, though.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] DaanHoogland commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
DaanHoogland commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-863955383


   Note that the threading model has changed @rhtyd , so yes manual testing would be needed, i'd say. I do not know any if automated tests exist for storage timeouts.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] ravening commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
ravening commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-762826636


   > @ravening this parameter exists already but it is hardcoded in `KVMHABase.java`, it is set as:
   > `protected long _heartBeatUpdateTimeout = 60000;`
   
   @GabrielBrascher true but I know that Wei added several global settings to configure the values as well the preferred action as well in a much simpler way. I see in this PR that some new files are added to handle the properties which are not needed


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] nvazquez commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
nvazquez commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882594625


   Thanks @DaanHoogland seems like reverting on the internal branch made addHost work again. Please update once tests are completed so this can be reverted 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] nvazquez commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
nvazquez commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882594625


   Thanks @DaanHoogland seems like reverting on the internal branch made addHost work again. Please update once tests are completed so this can be reverted 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] blueorangutan commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
blueorangutan commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-882603417






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [cloudstack] GutoVeronezi commented on pull request #4585: Externalize kvm agent storage timeout configuration

Posted by GitBox <gi...@apache.org>.
GutoVeronezi commented on pull request #4585:
URL: https://github.com/apache/cloudstack/pull/4585#issuecomment-869831142


   Any update about this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@cloudstack.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org