Posted to dev@ambari.apache.org by Dmitro Lisnichenko <dl...@hortonworks.com> on 2015/06/05 13:02:34 UTC

Review Request 35121: Ambari-agent died when trying to auto restart itself

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35121/
-----------------------------------------------------------

Review request for Ambari, Alejandro Fernandez and Andrew Onischuk.


Bugs: AMBARI-11717
    https://issues.apache.org/jira/browse/AMBARI-11717


Repository: ambari


Description
-------

INFO 2015-05-19 15:49:39,909 NetUtil.py:60 - Connecting to https://1d8.vm:8440/connection_info
INFO 2015-05-19 15:49:40,063 security.py:93 - SSL Connect being called.. connecting to the server
INFO 2015-05-19 15:49:40,215 security.py:55 - SSL connection established. Two-way SSL authentication is turned off on the server.
INFO 2015-05-19 15:49:40,261 Controller.py:245 - Heartbeat response received (id = 380)
ERROR 2015-05-19 15:49:40,261 Controller.py:263 - Error in responseId sequence - restarting
The out file is empty.

STR:
1. Deploy a multi-node cluster in virtual machines.
2. Make snapshots.
3. After a few hours, revert to the previous snapshots. All agents except the agent on the server host are dead.

EXPECTED:
Agents should just reconnect to the server.

While suspending an Ambari cluster in VMs is definitely not supported, we should ensure that the auto restart on an invalid response ID does not actually kill the agents.
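
For illustration only: the error in the log above comes from a sequence check on the heartbeat response id. Below is a minimal sketch of such a check, assuming the agent expects each response id to be exactly one greater than the previous one. HeartbeatTracker, handle_response and needs_reregistration are hypothetical names chosen for this example; this is not the code under review in Controller.py.

```python
class HeartbeatTracker(object):
    """Hypothetical helper that tracks the last response id seen from the server."""

    def __init__(self):
        self.response_id = -1               # -1 means "not registered yet"
        self.needs_reregistration = False

    def handle_response(self, server_response_id):
        """Return True if the id is in sequence, False if the agent must re-register."""
        expected = self.response_id + 1
        if server_response_id == expected:
            self.response_id = server_response_id
            return True
        # Out-of-sequence id (for example after a VM snapshot revert):
        # reset local state and ask the caller to re-register with the
        # server instead of terminating the agent process.
        self.response_id = -1
        self.needs_reregistration = True
        return False


tracker = HeartbeatTracker()
tracker.handle_response(0)      # in sequence
tracker.handle_response(380)    # out of sequence, flags re-registration
```

The point of the sketch is only that an unexpected id leads to a recoverable state (re-register or restart), never to a dead agent.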


Diffs
-----

  ambari-agent/src/main/python/ambari_agent/AlertSchedulerHandler.py 5d26227 
  ambari-agent/src/main/python/ambari_agent/Controller.py 4e5de6c 
  ambari-agent/src/main/python/ambari_agent/ExitHelper.py PRE-CREATION 
  ambari-agent/src/main/python/ambari_agent/apscheduler/threadpool.py 8ec47da 
  ambari-agent/src/main/python/ambari_agent/main.py 5972717 
  ambari-agent/src/test/python/ambari_agent/TestController.py a202ba4 
  ambari-agent/src/test/python/ambari_agent/TestMain.py 3c20997 

Diff: https://reviews.apache.org/r/35121/diff/


Testing
-------

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Ambari Views ...................................... SUCCESS [3.463s]
[INFO] Ambari Metrics Common ............................. SUCCESS [1.212s]
[INFO] Ambari Server ..................................... SUCCESS [53.686s]
[INFO] Ambari Agent ...................................... SUCCESS [18.695s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:20.081s
[INFO] Finished at: Fri Jun 05 14:01:02 EEST 2015
[INFO] Final Memory: 54M/396M
[INFO] ------------------------------------------------------------------------


Thanks,

Dmitro Lisnichenko


Re: Review Request 35121: Ambari-agent died when trying to auto restart itself

Posted by Andrew Onischuk <ao...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35121/#review87691
-----------------------------------------------------------

Ship it!


- Andrew Onischuk

