You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@ambari.apache.org by Nahappan Somasundaram <ns...@hortonworks.com> on 2016/03/18 00:00:10 UTC

Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/
-----------------------------------------------------------

Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.


Bugs: AMBARI-15474
    https://issues.apache.org/jira/browse/AMBARI-15474


Repository: ambari


Description
-------

AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.

In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.

**Changes**:
1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
5. Added new unit tests to test each configuration change event.


Diffs
-----

  ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
  ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
  ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
  ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
  ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
  ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
  ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
  ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangeEvent.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
  ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
  ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
  ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
  ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
  ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
  ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 

Diff: https://reviews.apache.org/r/44986/diff/


Testing
-------

**1. mvn clean install **


[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Ambari Main ....................................... SUCCESS [5.864s]
[INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
[INFO] Ambari Web ........................................ SUCCESS [23.683s]
[INFO] Ambari Views ...................................... SUCCESS [1.069s]
[INFO] Ambari Admin View ................................. SUCCESS [5.949s]
[INFO] ambari-metrics .................................... SUCCESS [0.534s]
[INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
[INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
[INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
[INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
[INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
[INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
[INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
[INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
[INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
[INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
[INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
[INFO] Ambari Agent ...................................... SUCCESS [21.659s]
[INFO] Ambari Client ..................................... SUCCESS [0.043s]
[INFO] Ambari Python Client .............................. SUCCESS [0.829s]
[INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
[INFO] Ambari Shell ...................................... SUCCESS [0.039s]
[INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
[INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4:42.176s
[INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
[INFO] Final Memory: 261M/1122M
[INFO] ------------------------------------------------------------------------

**2. mvn test -DskipSurefireTests **

Ran 253 tests in 6.633s

OK
----------------------------------------------------------------------
Total run:926
Total errors:0
Total failures:0
OK
INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
INFO: Return code from stack upgrade command, retcode = 0
StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
Returning DefaultStackAdvisor implementation
StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
Returning XYZ101StackAdvisor implementation
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 52.037s
[INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
[INFO] Final Memory: 57M/1008M
[INFO] ------------------------------------------------------------------------

**3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
Running org.apache.ambari.server.agent.TestHeartbeatHandler
Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest

Results :

Tests run: 36, Failures: 0, Errors: 0, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:59.372s
[INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
[INFO] Final Memory: 55M/1018M
[INFO] ------------------------------------------------------------------------

**4. Manual tests **

*Setup*
*  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
*  Replaced **agent python** files from development.

*Tests*
curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'

In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.


*Verification using the debugger*
Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.

For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.


Thanks,

Nahappan Somasundaram


Re: Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

Posted by Nate Cole <nc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/#review124222
-----------------------------------------------------------


Ship it!




Ship It!

- Nate Cole


On March 17, 2016, 7 p.m., Nahappan Somasundaram wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44986/
> -----------------------------------------------------------
> 
> (Updated March 17, 2016, 7 p.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15474
>     https://issues.apache.org/jira/browse/AMBARI-15474
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.
> 
> In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.
> 
> **Changes**:
> 1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
> 2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
> 3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
> 4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
> 5. Added new unit tests to test each configuration change event.
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
>   ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
>   ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
>   ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangeEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
>   ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
>   ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
>   ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
>   ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 
> 
> Diff: https://reviews.apache.org/r/44986/diff/
> 
> 
> Testing
> -------
> 
> **1. mvn clean install **
> 
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Ambari Main ....................................... SUCCESS [5.864s]
> [INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
> [INFO] Ambari Web ........................................ SUCCESS [23.683s]
> [INFO] Ambari Views ...................................... SUCCESS [1.069s]
> [INFO] Ambari Admin View ................................. SUCCESS [5.949s]
> [INFO] ambari-metrics .................................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
> [INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
> [INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
> [INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
> [INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
> [INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
> [INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
> [INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
> [INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
> [INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
> [INFO] Ambari Agent ...................................... SUCCESS [21.659s]
> [INFO] Ambari Client ..................................... SUCCESS [0.043s]
> [INFO] Ambari Python Client .............................. SUCCESS [0.829s]
> [INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
> [INFO] Ambari Shell ...................................... SUCCESS [0.039s]
> [INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
> [INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 4:42.176s
> [INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
> [INFO] Final Memory: 261M/1122M
> [INFO] ------------------------------------------------------------------------
> 
> **2. mvn test -DskipSurefireTests **
> 
> Ran 253 tests in 6.633s
> 
> OK
> ----------------------------------------------------------------------
> Total run:926
> Total errors:0
> Total failures:0
> OK
> INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
> INFO: Return code from stack upgrade command, retcode = 0
> StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
> Returning DefaultStackAdvisor implementation
> StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
> StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
> Returning XYZ101StackAdvisor implementation
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 52.037s
> [INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
> [INFO] Final Memory: 57M/1008M
> [INFO] ------------------------------------------------------------------------
> 
> **3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **
> 
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.agent.TestHeartbeatHandler
> Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> 
> Results :
> 
> Tests run: 36, Failures: 0, Errors: 0, Skipped: 0
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 1:59.372s
> [INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
> [INFO] Final Memory: 55M/1018M
> [INFO] ------------------------------------------------------------------------
> 
> **4. Manual tests **
> 
> *Setup*
> *  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
> *  Replaced **agent python** files from development.
> 
> *Tests*
> curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'
> 
> In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.
> 
> 
> *Verification using the debugger*
> Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.
> 
> For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.
> 
> 
> Thanks,
> 
> Nahappan Somasundaram
> 
>


Re: Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

Posted by Nahappan Somasundaram <ns...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/
-----------------------------------------------------------

(Updated March 22, 2016, 7:19 a.m.)


Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.


Changes
-------

1. isConfigStale() - throw exception if cluster name or host name is null.
2. Added comments.


Bugs: AMBARI-15474
    https://issues.apache.org/jira/browse/AMBARI-15474


Repository: ambari


Description
-------

AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.

In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.

**Changes**:
1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
5. Added new unit tests to test each configuration change event.


Diffs (updated)
-----

  ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
  ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
  ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
  ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
  ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
  ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
  ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
  ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
  ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
  ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
  ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
  ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
  ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
  ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 

Diff: https://reviews.apache.org/r/44986/diff/


Testing
-------

**1. mvn clean install **


[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Ambari Main ....................................... SUCCESS [5.864s]
[INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
[INFO] Ambari Web ........................................ SUCCESS [23.683s]
[INFO] Ambari Views ...................................... SUCCESS [1.069s]
[INFO] Ambari Admin View ................................. SUCCESS [5.949s]
[INFO] ambari-metrics .................................... SUCCESS [0.534s]
[INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
[INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
[INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
[INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
[INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
[INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
[INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
[INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
[INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
[INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
[INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
[INFO] Ambari Agent ...................................... SUCCESS [21.659s]
[INFO] Ambari Client ..................................... SUCCESS [0.043s]
[INFO] Ambari Python Client .............................. SUCCESS [0.829s]
[INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
[INFO] Ambari Shell ...................................... SUCCESS [0.039s]
[INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
[INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4:42.176s
[INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
[INFO] Final Memory: 261M/1122M
[INFO] ------------------------------------------------------------------------

**2. mvn test -DskipSurefireTests **

Ran 253 tests in 6.633s

OK
----------------------------------------------------------------------
Total run:926
Total errors:0
Total failures:0
OK
INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
INFO: Return code from stack upgrade command, retcode = 0
StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
Returning DefaultStackAdvisor implementation
StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
Returning XYZ101StackAdvisor implementation
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 52.037s
[INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
[INFO] Final Memory: 57M/1008M
[INFO] ------------------------------------------------------------------------

**3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
Running org.apache.ambari.server.agent.TestHeartbeatHandler
Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest

Results :

Tests run: 36, Failures: 0, Errors: 0, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:59.372s
[INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
[INFO] Final Memory: 55M/1018M
[INFO] ------------------------------------------------------------------------

**4. Manual tests **

*Setup*
*  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
*  Replaced **agent python** files from development.

*Tests*
curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'

In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.


*Verification using the debugger*
Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.

For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.


Thanks,

Nahappan Somasundaram


Re: Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

Posted by Nahappan Somasundaram <ns...@hortonworks.com>.

> On March 21, 2016, 5:38 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java, line 286
> > <https://reviews.apache.org/r/44986/diff/3/?file=1308892#file1308892line286>
> >
> >     It's possible for this to be called if clusterName is still null.
> 
> Nahappan Somasundaram wrote:
>     Yes, there is a possibility. Before creating a cluster, if we register the agent using the API and start it, it will attempt to register with the server. At this time the cluster name will be null. And subsequent in subsequent heartbeats, isConfigStale() is called. Though, hostname will not be null.

Thinking about this, we should not get the recovery configuration if the cluster name is not available. I'll go ahead and make cluster name and host name mandatory for the recovery configuration calls. Thanks.


- Nahappan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/#review124692
-----------------------------------------------------------


On March 21, 2016, 12:24 p.m., Nahappan Somasundaram wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44986/
> -----------------------------------------------------------
> 
> (Updated March 21, 2016, 12:24 p.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15474
>     https://issues.apache.org/jira/browse/AMBARI-15474
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.
> 
> In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.
> 
> **Changes**:
> 1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
> 2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
> 3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
> 4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
> 5. Added new unit tests to test each configuration change event.
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
>   ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
>   ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
>   ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
>   ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
>   ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
>   ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
>   ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 
> 
> Diff: https://reviews.apache.org/r/44986/diff/
> 
> 
> Testing
> -------
> 
> **1. mvn clean install **
> 
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Ambari Main ....................................... SUCCESS [5.864s]
> [INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
> [INFO] Ambari Web ........................................ SUCCESS [23.683s]
> [INFO] Ambari Views ...................................... SUCCESS [1.069s]
> [INFO] Ambari Admin View ................................. SUCCESS [5.949s]
> [INFO] ambari-metrics .................................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
> [INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
> [INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
> [INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
> [INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
> [INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
> [INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
> [INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
> [INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
> [INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
> [INFO] Ambari Agent ...................................... SUCCESS [21.659s]
> [INFO] Ambari Client ..................................... SUCCESS [0.043s]
> [INFO] Ambari Python Client .............................. SUCCESS [0.829s]
> [INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
> [INFO] Ambari Shell ...................................... SUCCESS [0.039s]
> [INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
> [INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 4:42.176s
> [INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
> [INFO] Final Memory: 261M/1122M
> [INFO] ------------------------------------------------------------------------
> 
> **2. mvn test -DskipSurefireTests **
> 
> Ran 253 tests in 6.633s
> 
> OK
> ----------------------------------------------------------------------
> Total run:926
> Total errors:0
> Total failures:0
> OK
> INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
> INFO: Return code from stack upgrade command, retcode = 0
> StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
> Returning DefaultStackAdvisor implementation
> StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
> StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
> Returning XYZ101StackAdvisor implementation
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 52.037s
> [INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
> [INFO] Final Memory: 57M/1008M
> [INFO] ------------------------------------------------------------------------
> 
> **3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **
> 
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.agent.TestHeartbeatHandler
> Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> 
> Results :
> 
> Tests run: 36, Failures: 0, Errors: 0, Skipped: 0
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 1:59.372s
> [INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
> [INFO] Final Memory: 55M/1018M
> [INFO] ------------------------------------------------------------------------
> 
> **4. Manual tests **
> 
> *Setup*
> *  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
> *  Replaced **agent python** files from development.
> 
> *Tests*
> curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'
> 
> In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.
> 
> 
> *Verification using the debugger*
> Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.
> 
> For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.
> 
> 
> Thanks,
> 
> Nahappan Somasundaram
> 
>


Re: Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

Posted by Nahappan Somasundaram <ns...@hortonworks.com>.

> On March 21, 2016, 5:38 p.m., Alejandro Fernandez wrote:
> > ambari-agent/src/main/python/ambari_agent/Heartbeat.py, line 82
> > <https://reviews.apache.org/r/44986/diff/3/?file=1308889#file1308889line82>
> >
> >     What does repr of these 2 objects look like?

The only addition here was str(recovery_timestamp). Anyways, here's how the log looks like (before the change):

INFO 2016-03-22 00:01:24,987 Heartbeat.py:78 - Building Heartbeat: {responseId = 6, timestamp = 1458604884987, commandsInProgress = True, componentsMapped = False}

More documentation https://docs.python.org/2/library/repr.html


> On March 21, 2016, 5:38 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java, line 286
> > <https://reviews.apache.org/r/44986/diff/3/?file=1308892#file1308892line286>
> >
> >     It's possible for this to be called if clusterName is still null.

Yes, there is a possibility. Before creating a cluster, if we register the agent using the API and start it, it will attempt to register with the server. At this time the cluster name will be null. And subsequent in subsequent heartbeats, isConfigStale() is called. Though, hostname will not be null.


> On March 21, 2016, 5:38 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java, line 90
> > <https://reviews.apache.org/r/44986/diff/3/?file=1308895#file1308895line90>
> >
> >     Does this really have to be a concurrent hashmap?

With several hosts, we will endup with several heartbeats, which can occur at the same time. At the minimum, we need a read lock in isConfigStale().


> On March 21, 2016, 5:38 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java, line 133
> > <https://reviews.apache.org/r/44986/diff/3/?file=1308895#file1308895line133>
> >
> >     Why should empty cluster or host return true instead of raising an exception.

There is a potential for cluster name to be empty, as explained in a previous comment. However, hostname cannot be null.


> On March 21, 2016, 5:38 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java, line 23
> > <https://reviews.apache.org/r/44986/diff/3/?file=1308899#file1308899line23>
> >
> >     Please add a description of what "Recovery Enabled" means

Updated the comment to say auto start instead of recovery.


- Nahappan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/#review124692
-----------------------------------------------------------


On March 21, 2016, 12:24 p.m., Nahappan Somasundaram wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44986/
> -----------------------------------------------------------
> 
> (Updated March 21, 2016, 12:24 p.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15474
>     https://issues.apache.org/jira/browse/AMBARI-15474
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.
> 
> In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.
> 
> **Changes**:
> 1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
> 2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
> 3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
> 4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
> 5. Added new unit tests to test each configuration change event.
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
>   ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
>   ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
>   ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
>   ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
>   ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
>   ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
>   ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 
> 
> Diff: https://reviews.apache.org/r/44986/diff/
> 
> 
> Testing
> -------
> 
> **1. mvn clean install **
> 
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Ambari Main ....................................... SUCCESS [5.864s]
> [INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
> [INFO] Ambari Web ........................................ SUCCESS [23.683s]
> [INFO] Ambari Views ...................................... SUCCESS [1.069s]
> [INFO] Ambari Admin View ................................. SUCCESS [5.949s]
> [INFO] ambari-metrics .................................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
> [INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
> [INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
> [INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
> [INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
> [INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
> [INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
> [INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
> [INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
> [INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
> [INFO] Ambari Agent ...................................... SUCCESS [21.659s]
> [INFO] Ambari Client ..................................... SUCCESS [0.043s]
> [INFO] Ambari Python Client .............................. SUCCESS [0.829s]
> [INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
> [INFO] Ambari Shell ...................................... SUCCESS [0.039s]
> [INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
> [INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 4:42.176s
> [INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
> [INFO] Final Memory: 261M/1122M
> [INFO] ------------------------------------------------------------------------
> 
> **2. mvn test -DskipSurefireTests **
> 
> Ran 253 tests in 6.633s
> 
> OK
> ----------------------------------------------------------------------
> Total run:926
> Total errors:0
> Total failures:0
> OK
> INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
> INFO: Return code from stack upgrade command, retcode = 0
> StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
> Returning DefaultStackAdvisor implementation
> StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
> StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
> Returning XYZ101StackAdvisor implementation
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 52.037s
> [INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
> [INFO] Final Memory: 57M/1008M
> [INFO] ------------------------------------------------------------------------
> 
> **3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **
> 
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.agent.TestHeartbeatHandler
> Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> 
> Results :
> 
> Tests run: 36, Failures: 0, Errors: 0, Skipped: 0
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 1:59.372s
> [INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
> [INFO] Final Memory: 55M/1018M
> [INFO] ------------------------------------------------------------------------
> 
> **4. Manual tests **
> 
> *Setup*
> *  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
> *  Replaced **agent python** files from development.
> 
> *Tests*
> curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'
> 
> In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.
> 
> 
> *Verification using the debugger*
> Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.
> 
> For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.
> 
> 
> Thanks,
> 
> Nahappan Somasundaram
> 
>


Re: Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/#review124692
-----------------------------------------------------------


Fix it, then Ship it!




Fix It and Ship It


ambari-agent/src/main/python/ambari_agent/Heartbeat.py (line 81)
<https://reviews.apache.org/r/44986/#comment187314>

    What does repr of these 2 objects look like?



ambari-agent/src/main/python/ambari_agent/RecoveryManager.py (line 595)
<https://reviews.apache.org/r/44986/#comment187315>

    Perhaps add a message that -1 is an acceptable value.



ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java (line 277)
<https://reviews.apache.org/r/44986/#comment187316>

    I wouldn't worry too much about this case.



ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java (line 286)
<https://reviews.apache.org/r/44986/#comment187317>

    It's possible for this to be called if clusterName is still null.



ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java (line 88)
<https://reviews.apache.org/r/44986/#comment187318>

    Does this really have to be a concurrent hashmap?



ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java (line 131)
<https://reviews.apache.org/r/44986/#comment187319>

    Why should empty cluster or host return true instead of raising an exception.



ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java (line 23)
<https://reviews.apache.org/r/44986/#comment187320>

    Please add a description of what "Recovery Enabled" means


- Alejandro Fernandez


On March 21, 2016, 7:24 p.m., Nahappan Somasundaram wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44986/
> -----------------------------------------------------------
> 
> (Updated March 21, 2016, 7:24 p.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15474
>     https://issues.apache.org/jira/browse/AMBARI-15474
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.
> 
> In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.
> 
> **Changes**:
> 1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
> 2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
> 3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
> 4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
> 5. Added new unit tests to test each configuration change event.
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
>   ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
>   ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
>   ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
>   ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
>   ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
>   ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
>   ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 
> 
> Diff: https://reviews.apache.org/r/44986/diff/
> 
> 
> Testing
> -------
> 
> **1. mvn clean install **
> 
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Ambari Main ....................................... SUCCESS [5.864s]
> [INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
> [INFO] Ambari Web ........................................ SUCCESS [23.683s]
> [INFO] Ambari Views ...................................... SUCCESS [1.069s]
> [INFO] Ambari Admin View ................................. SUCCESS [5.949s]
> [INFO] ambari-metrics .................................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
> [INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
> [INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
> [INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
> [INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
> [INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
> [INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
> [INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
> [INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
> [INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
> [INFO] Ambari Agent ...................................... SUCCESS [21.659s]
> [INFO] Ambari Client ..................................... SUCCESS [0.043s]
> [INFO] Ambari Python Client .............................. SUCCESS [0.829s]
> [INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
> [INFO] Ambari Shell ...................................... SUCCESS [0.039s]
> [INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
> [INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 4:42.176s
> [INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
> [INFO] Final Memory: 261M/1122M
> [INFO] ------------------------------------------------------------------------
> 
> **2. mvn test -DskipSurefireTests **
> 
> Ran 253 tests in 6.633s
> 
> OK
> ----------------------------------------------------------------------
> Total run:926
> Total errors:0
> Total failures:0
> OK
> INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
> INFO: Return code from stack upgrade command, retcode = 0
> StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
> Returning DefaultStackAdvisor implementation
> StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
> StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
> Returning XYZ101StackAdvisor implementation
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 52.037s
> [INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
> [INFO] Final Memory: 57M/1008M
> [INFO] ------------------------------------------------------------------------
> 
> **3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **
> 
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.agent.TestHeartbeatHandler
> Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> 
> Results :
> 
> Tests run: 36, Failures: 0, Errors: 0, Skipped: 0
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 1:59.372s
> [INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
> [INFO] Final Memory: 55M/1018M
> [INFO] ------------------------------------------------------------------------
> 
> **4. Manual tests **
> 
> *Setup*
> *  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
> *  Replaced **agent python** files from development.
> 
> *Tests*
> curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'
> 
> In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.
> 
> 
> *Verification using the debugger*
> Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.
> 
> For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.
> 
> 
> Thanks,
> 
> Nahappan Somasundaram
> 
>


Re: Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

Posted by Nahappan Somasundaram <ns...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/
-----------------------------------------------------------

(Updated March 21, 2016, 12:24 p.m.)


Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.


Changes
-------

**Code review fixes:** Added documentation to the newly added functions.


Bugs: AMBARI-15474
    https://issues.apache.org/jira/browse/AMBARI-15474


Repository: ambari


Description
-------

AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.

In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.

**Changes**:
1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
5. Added new unit tests to test each configuration change event.


Diffs (updated)
-----

  ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
  ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
  ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
  ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
  ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
  ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
  ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
  ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
  ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
  ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
  ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
  ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
  ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
  ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 

Diff: https://reviews.apache.org/r/44986/diff/


Testing
-------

**1. mvn clean install **


[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Ambari Main ....................................... SUCCESS [5.864s]
[INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
[INFO] Ambari Web ........................................ SUCCESS [23.683s]
[INFO] Ambari Views ...................................... SUCCESS [1.069s]
[INFO] Ambari Admin View ................................. SUCCESS [5.949s]
[INFO] ambari-metrics .................................... SUCCESS [0.534s]
[INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
[INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
[INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
[INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
[INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
[INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
[INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
[INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
[INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
[INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
[INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
[INFO] Ambari Agent ...................................... SUCCESS [21.659s]
[INFO] Ambari Client ..................................... SUCCESS [0.043s]
[INFO] Ambari Python Client .............................. SUCCESS [0.829s]
[INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
[INFO] Ambari Shell ...................................... SUCCESS [0.039s]
[INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
[INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4:42.176s
[INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
[INFO] Final Memory: 261M/1122M
[INFO] ------------------------------------------------------------------------

**2. mvn test -DskipSurefireTests **

Ran 253 tests in 6.633s

OK
----------------------------------------------------------------------
Total run:926
Total errors:0
Total failures:0
OK
INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
INFO: Return code from stack upgrade command, retcode = 0
StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
Returning DefaultStackAdvisor implementation
StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
Returning XYZ101StackAdvisor implementation
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 52.037s
[INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
[INFO] Final Memory: 57M/1008M
[INFO] ------------------------------------------------------------------------

**3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
Running org.apache.ambari.server.agent.TestHeartbeatHandler
Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest

Results :

Tests run: 36, Failures: 0, Errors: 0, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:59.372s
[INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
[INFO] Final Memory: 55M/1018M
[INFO] ------------------------------------------------------------------------

**4. Manual tests **

*Setup*
*  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
*  Replaced **agent python** files from development.

*Tests*
curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'

In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.


*Verification using the debugger*
Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.

For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.


Thanks,

Nahappan Somasundaram


Re: Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

Posted by Jonathan Hurley <jh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/#review124606
-----------------------------------------------------------




ambari-agent/src/main/python/ambari_agent/RecoveryManager.py (lines 581 - 583)
<https://reviews.apache.org/r/44986/#comment187219>

    Hah! Sorry; Reviewboard formatting strikes me again. I can see it's not a no-op function now.


- Jonathan Hurley


On March 18, 2016, 2:07 p.m., Nahappan Somasundaram wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44986/
> -----------------------------------------------------------
> 
> (Updated March 18, 2016, 2:07 p.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15474
>     https://issues.apache.org/jira/browse/AMBARI-15474
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.
> 
> In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.
> 
> **Changes**:
> 1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
> 2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
> 3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
> 4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
> 5. Added new unit tests to test each configuration change event.
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
>   ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
>   ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
>   ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
>   ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
>   ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
>   ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
>   ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 
> 
> Diff: https://reviews.apache.org/r/44986/diff/
> 
> 
> Testing
> -------
> 
> **1. mvn clean install **
> 
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Ambari Main ....................................... SUCCESS [5.864s]
> [INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
> [INFO] Ambari Web ........................................ SUCCESS [23.683s]
> [INFO] Ambari Views ...................................... SUCCESS [1.069s]
> [INFO] Ambari Admin View ................................. SUCCESS [5.949s]
> [INFO] ambari-metrics .................................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
> [INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
> [INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
> [INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
> [INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
> [INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
> [INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
> [INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
> [INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
> [INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
> [INFO] Ambari Agent ...................................... SUCCESS [21.659s]
> [INFO] Ambari Client ..................................... SUCCESS [0.043s]
> [INFO] Ambari Python Client .............................. SUCCESS [0.829s]
> [INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
> [INFO] Ambari Shell ...................................... SUCCESS [0.039s]
> [INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
> [INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 4:42.176s
> [INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
> [INFO] Final Memory: 261M/1122M
> [INFO] ------------------------------------------------------------------------
> 
> **2. mvn test -DskipSurefireTests **
> 
> Ran 253 tests in 6.633s
> 
> OK
> ----------------------------------------------------------------------
> Total run:926
> Total errors:0
> Total failures:0
> OK
> INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
> INFO: Return code from stack upgrade command, retcode = 0
> StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
> Returning DefaultStackAdvisor implementation
> StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
> StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
> Returning XYZ101StackAdvisor implementation
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 52.037s
> [INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
> [INFO] Final Memory: 57M/1008M
> [INFO] ------------------------------------------------------------------------
> 
> **3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **
> 
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.agent.TestHeartbeatHandler
> Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> 
> Results :
> 
> Tests run: 36, Failures: 0, Errors: 0, Skipped: 0
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 1:59.372s
> [INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
> [INFO] Final Memory: 55M/1018M
> [INFO] ------------------------------------------------------------------------
> 
> **4. Manual tests **
> 
> *Setup*
> *  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
> *  Replaced **agent python** files from development.
> 
> *Tests*
> curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'
> 
> In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.
> 
> 
> *Verification using the debugger*
> Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.
> 
> For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.
> 
> 
> Thanks,
> 
> Nahappan Somasundaram
> 
>


Re: Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

Posted by Nahappan Somasundaram <ns...@hortonworks.com>.

> On March 21, 2016, 5:44 a.m., Jonathan Hurley wrote:
> > ambari-agent/src/main/python/ambari_agent/RecoveryManager.py, lines 581-583
> > <https://reviews.apache.org/r/44986/diff/2/?file=1306274#file1306274line581>
> >
> >     a no-op function?

No, it is not. The definition of **update_config()** is right after the **pass** statement. The *self.update_config()* a function call. The **pass** after *self.update_config* indicates end of the **update_configuration_from_registration** function.


- Nahappan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/#review124538
-----------------------------------------------------------


On March 18, 2016, 11:07 a.m., Nahappan Somasundaram wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44986/
> -----------------------------------------------------------
> 
> (Updated March 18, 2016, 11:07 a.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15474
>     https://issues.apache.org/jira/browse/AMBARI-15474
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.
> 
> In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.
> 
> **Changes**:
> 1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
> 2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
> 3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
> 4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
> 5. Added new unit tests to test each configuration change event.
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
>   ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
>   ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
>   ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
>   ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
>   ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
>   ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
>   ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 
> 
> Diff: https://reviews.apache.org/r/44986/diff/
> 
> 
> Testing
> -------
> 
> **1. mvn clean install **
> 
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Ambari Main ....................................... SUCCESS [5.864s]
> [INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
> [INFO] Ambari Web ........................................ SUCCESS [23.683s]
> [INFO] Ambari Views ...................................... SUCCESS [1.069s]
> [INFO] Ambari Admin View ................................. SUCCESS [5.949s]
> [INFO] ambari-metrics .................................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
> [INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
> [INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
> [INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
> [INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
> [INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
> [INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
> [INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
> [INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
> [INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
> [INFO] Ambari Agent ...................................... SUCCESS [21.659s]
> [INFO] Ambari Client ..................................... SUCCESS [0.043s]
> [INFO] Ambari Python Client .............................. SUCCESS [0.829s]
> [INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
> [INFO] Ambari Shell ...................................... SUCCESS [0.039s]
> [INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
> [INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 4:42.176s
> [INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
> [INFO] Final Memory: 261M/1122M
> [INFO] ------------------------------------------------------------------------
> 
> **2. mvn test -DskipSurefireTests **
> 
> Ran 253 tests in 6.633s
> 
> OK
> ----------------------------------------------------------------------
> Total run:926
> Total errors:0
> Total failures:0
> OK
> INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
> INFO: Return code from stack upgrade command, retcode = 0
> StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
> Returning DefaultStackAdvisor implementation
> StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
> StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
> Returning XYZ101StackAdvisor implementation
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 52.037s
> [INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
> [INFO] Final Memory: 57M/1008M
> [INFO] ------------------------------------------------------------------------
> 
> **3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **
> 
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.agent.TestHeartbeatHandler
> Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> 
> Results :
> 
> Tests run: 36, Failures: 0, Errors: 0, Skipped: 0
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 1:59.372s
> [INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
> [INFO] Final Memory: 55M/1018M
> [INFO] ------------------------------------------------------------------------
> 
> **4. Manual tests **
> 
> *Setup*
> *  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
> *  Replaced **agent python** files from development.
> 
> *Tests*
> curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'
> 
> In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.
> 
> 
> *Verification using the debugger*
> Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.
> 
> For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.
> 
> 
> Thanks,
> 
> Nahappan Somasundaram
> 
>


Re: Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

Posted by Jonathan Hurley <jh...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/#review124538
-----------------------------------------------------------


Fix it, then Ship it!




Nice use of the events. Just an FYI; remember that the AmbariEventPublisher (although it's asynchronous) is probably single-threaded so that events being pushed onto the bus are delivered to listeners one-at-a-time. Just wanted to make sure you knew that and didn't rely on concurrency here for any type of performance.


ambari-agent/src/main/python/ambari_agent/RecoveryManager.py (lines 581 - 583)
<https://reviews.apache.org/r/44986/#comment187140>

    a no-op function?



ambari-agent/src/main/python/ambari_agent/RecoveryManager.py (lines 586 - 590)
<https://reviews.apache.org/r/44986/#comment187141>

    Can you doc the params here; there are a lot of them.



ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java (line 71)
<https://reviews.apache.org/r/44986/#comment187142>

    Doc.



ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java (line 75)
<https://reviews.apache.org/r/44986/#comment187143>

    Doc.



ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java (lines 272 - 275)
<https://reviews.apache.org/r/44986/#comment187144>

    Well, for now :)



ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java (line 115)
<https://reviews.apache.org/r/44986/#comment187145>

    Doc.



ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java (line 119)
<https://reviews.apache.org/r/44986/#comment187146>

    Doc.



ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java (line 104)
<https://reviews.apache.org/r/44986/#comment187147>

    Doc.



ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java (line 108)
<https://reviews.apache.org/r/44986/#comment187148>

    Doc.



ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java (line 115)
<https://reviews.apache.org/r/44986/#comment187149>

    Doc how this is being calculated.



ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java (line 208)
<https://reviews.apache.org/r/44986/#comment187150>

    Doc.


- Jonathan Hurley


On March 18, 2016, 2:07 p.m., Nahappan Somasundaram wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44986/
> -----------------------------------------------------------
> 
> (Updated March 18, 2016, 2:07 p.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15474
>     https://issues.apache.org/jira/browse/AMBARI-15474
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.
> 
> In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.
> 
> **Changes**:
> 1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
> 2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
> 3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
> 4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
> 5. Added new unit tests to test each configuration change event.
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
>   ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
>   ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
>   ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
>   ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
>   ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
>   ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
>   ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 
> 
> Diff: https://reviews.apache.org/r/44986/diff/
> 
> 
> Testing
> -------
> 
> **1. mvn clean install **
> 
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Ambari Main ....................................... SUCCESS [5.864s]
> [INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
> [INFO] Ambari Web ........................................ SUCCESS [23.683s]
> [INFO] Ambari Views ...................................... SUCCESS [1.069s]
> [INFO] Ambari Admin View ................................. SUCCESS [5.949s]
> [INFO] ambari-metrics .................................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
> [INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
> [INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
> [INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
> [INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
> [INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
> [INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
> [INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
> [INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
> [INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
> [INFO] Ambari Agent ...................................... SUCCESS [21.659s]
> [INFO] Ambari Client ..................................... SUCCESS [0.043s]
> [INFO] Ambari Python Client .............................. SUCCESS [0.829s]
> [INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
> [INFO] Ambari Shell ...................................... SUCCESS [0.039s]
> [INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
> [INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 4:42.176s
> [INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
> [INFO] Final Memory: 261M/1122M
> [INFO] ------------------------------------------------------------------------
> 
> **2. mvn test -DskipSurefireTests **
> 
> Ran 253 tests in 6.633s
> 
> OK
> ----------------------------------------------------------------------
> Total run:926
> Total errors:0
> Total failures:0
> OK
> INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
> INFO: Return code from stack upgrade command, retcode = 0
> StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
> Returning DefaultStackAdvisor implementation
> StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
> StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
> Returning XYZ101StackAdvisor implementation
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 52.037s
> [INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
> [INFO] Final Memory: 57M/1008M
> [INFO] ------------------------------------------------------------------------
> 
> **3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **
> 
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.agent.TestHeartbeatHandler
> Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> 
> Results :
> 
> Tests run: 36, Failures: 0, Errors: 0, Skipped: 0
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 1:59.372s
> [INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
> [INFO] Final Memory: 55M/1018M
> [INFO] ------------------------------------------------------------------------
> 
> **4. Manual tests **
> 
> *Setup*
> *  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
> *  Replaced **agent python** files from development.
> 
> *Tests*
> curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'
> 
> In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.
> 
> 
> *Verification using the debugger*
> Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.
> 
> For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.
> 
> 
> Thanks,
> 
> Nahappan Somasundaram
> 
>


Re: Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

Posted by Nahappan Somasundaram <ns...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/
-----------------------------------------------------------

(Updated March 18, 2016, 11:07 a.m.)


Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.


Changes
-------

1. Fixed Log.info() to use {}
2. Renamed ServiceComponentRecoveryChangeEvent to ServiceComponentRecoveryChangedEvent


Bugs: AMBARI-15474
    https://issues.apache.org/jira/browse/AMBARI-15474


Repository: ambari


Description
-------

AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.

In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.

**Changes**:
1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
5. Added new unit tests to test each configuration change event.


Diffs (updated)
-----

  ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
  ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
  ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
  ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
  ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
  ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
  ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
  ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java PRE-CREATION 
  ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
  ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
  ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
  ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
  ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
  ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
  ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 

Diff: https://reviews.apache.org/r/44986/diff/


Testing
-------

**1. mvn clean install **


[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Ambari Main ....................................... SUCCESS [5.864s]
[INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
[INFO] Ambari Web ........................................ SUCCESS [23.683s]
[INFO] Ambari Views ...................................... SUCCESS [1.069s]
[INFO] Ambari Admin View ................................. SUCCESS [5.949s]
[INFO] ambari-metrics .................................... SUCCESS [0.534s]
[INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
[INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
[INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
[INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
[INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
[INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
[INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
[INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
[INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
[INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
[INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
[INFO] Ambari Agent ...................................... SUCCESS [21.659s]
[INFO] Ambari Client ..................................... SUCCESS [0.043s]
[INFO] Ambari Python Client .............................. SUCCESS [0.829s]
[INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
[INFO] Ambari Shell ...................................... SUCCESS [0.039s]
[INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
[INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4:42.176s
[INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
[INFO] Final Memory: 261M/1122M
[INFO] ------------------------------------------------------------------------

**2. mvn test -DskipSurefireTests **

Ran 253 tests in 6.633s

OK
----------------------------------------------------------------------
Total run:926
Total errors:0
Total failures:0
OK
INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
INFO: Return code from stack upgrade command, retcode = 0
StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
Returning DefaultStackAdvisor implementation
StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
Returning XYZ101StackAdvisor implementation
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 52.037s
[INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
[INFO] Final Memory: 57M/1008M
[INFO] ------------------------------------------------------------------------

**3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
Running org.apache.ambari.server.agent.TestHeartbeatHandler
Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest

Results :

Tests run: 36, Failures: 0, Errors: 0, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:59.372s
[INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
[INFO] Final Memory: 55M/1018M
[INFO] ------------------------------------------------------------------------

**4. Manual tests **

*Setup*
*  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
*  Replaced **agent python** files from development.

*Tests*
curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'

In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.


*Verification using the debugger*
Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.

For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.


Thanks,

Nahappan Somasundaram


Re: Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

Posted by Nate Cole <nc...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/#review124220
-----------------------------------------------------------




ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java (lines 287 - 289)
<https://reviews.apache.org/r/44986/#comment186680>

    Use {} format for log statements like so: LOG.info("Recovery configuration set to {}", response.getRecoveryConfig());


- Nate Cole


On March 17, 2016, 7 p.m., Nahappan Somasundaram wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44986/
> -----------------------------------------------------------
> 
> (Updated March 17, 2016, 7 p.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15474
>     https://issues.apache.org/jira/browse/AMBARI-15474
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.
> 
> In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.
> 
> **Changes**:
> 1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
> 2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
> 3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
> 4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
> 5. Added new unit tests to test each configuration change event.
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
>   ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
>   ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
>   ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangeEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
>   ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
>   ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
>   ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
>   ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 
> 
> Diff: https://reviews.apache.org/r/44986/diff/
> 
> 
> Testing
> -------
> 
> **1. mvn clean install **
> 
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Ambari Main ....................................... SUCCESS [5.864s]
> [INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
> [INFO] Ambari Web ........................................ SUCCESS [23.683s]
> [INFO] Ambari Views ...................................... SUCCESS [1.069s]
> [INFO] Ambari Admin View ................................. SUCCESS [5.949s]
> [INFO] ambari-metrics .................................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
> [INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
> [INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
> [INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
> [INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
> [INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
> [INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
> [INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
> [INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
> [INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
> [INFO] Ambari Agent ...................................... SUCCESS [21.659s]
> [INFO] Ambari Client ..................................... SUCCESS [0.043s]
> [INFO] Ambari Python Client .............................. SUCCESS [0.829s]
> [INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
> [INFO] Ambari Shell ...................................... SUCCESS [0.039s]
> [INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
> [INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 4:42.176s
> [INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
> [INFO] Final Memory: 261M/1122M
> [INFO] ------------------------------------------------------------------------
> 
> **2. mvn test -DskipSurefireTests **
> 
> Ran 253 tests in 6.633s
> 
> OK
> ----------------------------------------------------------------------
> Total run:926
> Total errors:0
> Total failures:0
> OK
> INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
> INFO: Return code from stack upgrade command, retcode = 0
> StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
> Returning DefaultStackAdvisor implementation
> StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
> StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
> Returning XYZ101StackAdvisor implementation
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 52.037s
> [INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
> [INFO] Final Memory: 57M/1008M
> [INFO] ------------------------------------------------------------------------
> 
> **3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **
> 
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.agent.TestHeartbeatHandler
> Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> 
> Results :
> 
> Tests run: 36, Failures: 0, Errors: 0, Skipped: 0
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 1:59.372s
> [INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
> [INFO] Final Memory: 55M/1018M
> [INFO] ------------------------------------------------------------------------
> 
> **4. Manual tests **
> 
> *Setup*
> *  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
> *  Replaced **agent python** files from development.
> 
> *Tests*
> curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'
> 
> In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.
> 
> 
> *Verification using the debugger*
> Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.
> 
> For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.
> 
> 
> Thanks,
> 
> Nahappan Somasundaram
> 
>


Re: Review Request 44986: AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats

Posted by Nahappan Somasundaram <ns...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/#review124119
-----------------------------------------------------------




ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangeEvent.java (line 1)
<https://reviews.apache.org/r/44986/#comment186500>

    Rename to ServiceComponentChangedEvent to be generic. It can be used whenever a property changes.


- Nahappan Somasundaram


On March 17, 2016, 4 p.m., Nahappan Somasundaram wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44986/
> -----------------------------------------------------------
> 
> (Updated March 17, 2016, 4 p.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid Wagle.
> 
> 
> Bugs: AMBARI-15474
>     https://issues.apache.org/jira/browse/AMBARI-15474
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> AMBARI-15474: Listen for changes to auto-start configuration and send them to the agent during heartbeats.
> 
> In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to the agent. Instead of pulling the recovery configuration from the application cache, implement change events and get notified whenever there are changes to the recovery configuration.
> 
> **Changes**:
> 1. Instead of sending the recovery configuration to the agent during every heartbeat (10 seconds interval), send the configuration only when changes are detected.
> 2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is the subscriber which listens to changes to the configuration.
> 3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which is the last updated timestamp of the recovery configuration. This timestamp is used as a token and is sent to the agent with the configuration. The agent provides this timestamp during each heartbeat. If the timestamp from the heartbeat is the same as the one on the server side, no configuration is sent to the agent.
> 4. Whenever changes are detected, the timestamp is invalidated. During the next heartbeat, the recovery configuration is read from the DB and sent to the agent along with the latest timestamp.
> 5. Added new unit tests to test each configuration change event.
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
>   ambari-agent/src/main/python/ambari_agent/Heartbeat.py b28644c834a2387d2fb0ad17d104224e830a0245 
>   ambari-agent/src/main/python/ambari_agent/RecoveryManager.py ed537ca6928ce376c5f3e906f7a47c3f1919e309 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java 617b04b5b8768ba28a3c8ecb6e5e00601f153396 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java 92a622117c71ddfc3bd2562c04ee525491bd6ffd 
>   ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 78731854fe80c3d6087a73ebf00b0ffe9e287354 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java 2ce01236b3bb1ab4934fe456e5669fa474987426 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangeEvent.java PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java 19712cdb18592816f82898cd7aa08a3600c7e1f4 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java 4be1c21bf41416681be836716ce38305cc3f287e 
>   ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java 3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
>   ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java b1f94e38f97550123dd366984438b0bbf9c2826c 
>   ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
>   ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java 1bf9d8336533bc677384314f998f10aa2f10504e 
> 
> Diff: https://reviews.apache.org/r/44986/diff/
> 
> 
> Testing
> -------
> 
> **1. mvn clean install **
> 
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Ambari Main ....................................... SUCCESS [5.864s]
> [INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
> [INFO] Ambari Web ........................................ SUCCESS [23.683s]
> [INFO] Ambari Views ...................................... SUCCESS [1.069s]
> [INFO] Ambari Admin View ................................. SUCCESS [5.949s]
> [INFO] ambari-metrics .................................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
> [INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
> [INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
> [INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
> [INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
> [INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
> [INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
> [INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
> [INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
> [INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
> [INFO] Ambari Agent ...................................... SUCCESS [21.659s]
> [INFO] Ambari Client ..................................... SUCCESS [0.043s]
> [INFO] Ambari Python Client .............................. SUCCESS [0.829s]
> [INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
> [INFO] Ambari Shell ...................................... SUCCESS [0.039s]
> [INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
> [INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 4:42.176s
> [INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
> [INFO] Final Memory: 261M/1122M
> [INFO] ------------------------------------------------------------------------
> 
> **2. mvn test -DskipSurefireTests **
> 
> Ran 253 tests in 6.633s
> 
> OK
> ----------------------------------------------------------------------
> Total run:926
> Total errors:0
> Total failures:0
> OK
> INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
> INFO: Return code from stack upgrade command, retcode = 0
> StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
> Returning DefaultStackAdvisor implementation
> StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
> StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
> Returning XYZ101StackAdvisor implementation
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 52.037s
> [INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
> [INFO] Final Memory: 57M/1008M
> [INFO] ------------------------------------------------------------------------
> 
> **3. mvn test -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest -DskipPythonTests **
> 
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.agent.TestHeartbeatHandler
> Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - in org.apache.ambari.server.agent.TestHeartbeatHandler
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m -Djava.awt.headless=true
> Running org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - in org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> 
> Results :
> 
> Tests run: 36, Failures: 0, Errors: 0, Skipped: 0
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 1:59.372s
> [INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
> [INFO] Final Memory: 55M/1018M
> [INFO] ------------------------------------------------------------------------
> 
> **4. Manual tests **
> 
> *Setup*
> *  Deployed a VM with the latest **trunk** bits and replaced **ambari-server.jar** from the development build.
> *  Replaced **agent python** files from development.
> 
> *Tests*
> curl -u admin:admin -H "X-Requested-By: ambari" -X PUT http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'
> 
> In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.
> 
> 
> *Verification using the debugger*
> Before these tests were run, it was verified that the recovery configuration was sent only during registration and not during the heartbeats since nothing was modified.
> 
> For both the tests above, it was verified that events were published for the above changes and that RecoveryConfigHelper captured those events and invalidated the timestamp. During the next heartbeat, the new recovery configuration information was pulled from the DB and send to the agent. It was also verified that for subsequent heartbeats, the recovery configuration was not sent.
> 
> 
> Thanks,
> 
> Nahappan Somasundaram
> 
>