Posted to reviews@ambari.apache.org by Sebastian Toader <st...@hortonworks.com> on 2017/02/01 07:50:27 UTC

Re: Review Request 56133: AMBARI-19802. Debug: agent randomly losing heartbeat with the server

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56133/#review163789
-----------------------------------------------------------




ambari-agent/src/main/python/ambari_agent/Controller.py (line 303)
<https://reviews.apache.org/r/56133/#comment235288>

    This looks to be outside the scope of `if current_time - heartbeat_running_msg_timestamp > state_interval:`, so it would log on every heartbeat (every second). We don't want to flood the log file. This should be logged once per `state_interval`.



ambari-agent/src/main/python/ambari_agent/Controller.py (line 310)
<https://reviews.apache.org/r/56133/#comment235289>

    This will log on every heartbeat which is fine for DEBUG log level but not for INFO level.



ambari-agent/src/main/python/ambari_agent/Controller.py (line 319)
<https://reviews.apache.org/r/56133/#comment235290>

    This will log with every heartbeat which is fine for DEBUG logging level but not INFO.



ambari-agent/src/main/python/ambari_agent/Controller.py (line 331)
<https://reviews.apache.org/r/56133/#comment235291>

    This will log with every heartbeat which is fine for DEBUG logging level but not INFO.



ambari-agent/src/main/python/ambari_agent/Controller.py (line 340)
<https://reviews.apache.org/r/56133/#comment235292>

    This will log with every heartbeat which is fine for DEBUG logging level but not INFO.



ambari-agent/src/main/python/ambari_agent/Controller.py (line 376)
<https://reviews.apache.org/r/56133/#comment235293>

    This will log with every heartbeat which is fine for DEBUG logging level but not INFO.



ambari-agent/src/main/python/ambari_agent/Controller.py (line 410)
<https://reviews.apache.org/r/56133/#comment235296>

    How often will this be logged at INFO level?



ambari-agent/src/main/python/ambari_agent/Controller.py (line 414)
<https://reviews.apache.org/r/56133/#comment235297>

    How often will this be logged at INFO level?



ambari-agent/src/main/python/ambari_agent/Controller.py (line 428)
<https://reviews.apache.org/r/56133/#comment235298>

    How often will this be logged at INFO level?



ambari-agent/src/main/python/ambari_agent/Controller.py (line 470)
<https://reviews.apache.org/r/56133/#comment235300>

    How often will this log at INFO level?



ambari-agent/src/main/python/ambari_agent/Controller.py (line 477)
<https://reviews.apache.org/r/56133/#comment235301>

    How often will this log at INFO level?


- Sebastian Toader


On Jan. 31, 2017, 7:49 p.m., Attila Doroszlai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56133/
> -----------------------------------------------------------
> 
> (Updated Jan. 31, 2017, 7:49 p.m.)
> 
> 
> Review request for Ambari, Sandor Magyari, Sumit Mohanty, and Sebastian Toader.
> 
> 
> Bugs: AMBARI-19802
>     https://issues.apache.org/jira/browse/AMBARI-19802
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Add more logging in heartbeat cycle.
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py 63707159f00a3b6ef885849acf8233bd83cc4749 
>   ambari-agent/src/main/python/ambari_agent/StatusCommandsExecutor.py fbb29f4fe59e34e33242f81b783d39a9921c777a 
> 
> Diff: https://reviews.apache.org/r/56133/diff/
> 
> 
> Testing
> -------
> 
> Checked agent log at both INFO and DEBUG log level.
> 
> 
> Thanks,
> 
> Attila Doroszlai
> 
>


Re: Review Request 56133: AMBARI-19802. Debug: agent randomly losing heartbeat with the server

Posted by Attila Doroszlai <ad...@hortonworks.com>.

> On Feb. 1, 2017, 8:50 a.m., Sebastian Toader wrote:
> >

When `state_interval` has not yet passed, `logging_level` is set to `DEBUG`, so these messages are suppressed if the agent log level is `INFO` but printed if it is `DEBUG`. Once `state_interval` has passed, however, `logging_level` is set to `INFO`, so the messages are logged at either agent log level. This lets us avoid a bunch of `if`s.

Sample output:

```
INFO 2017-02-01 08:05:33,522 Controller.py:257 - Adding 27 status commands. Heartbeat id = 184
INFO 2017-02-01 08:05:33,523 ActionQueue.py:120 - Adding STATUS_COMMAND for component YARN_CLIENT of service YARN of cluster TEST to the queue.
...
INFO 2017-02-01 08:05:35,215 ActionQueue.py:120 - Adding STATUS_COMMAND for component MYSQL_SERVER of service HIVE of cluster TEST to the queue.
INFO 2017-02-01 08:05:37,360 Controller.py:303 - Heartbeat (response id = 185) with server is running...
INFO 2017-02-01 08:05:37,361 Controller.py:310 - Building heartbeat message
INFO 2017-02-01 08:05:37,398 Heartbeat.py:90 - Adding host info/state to heartbeat message.
INFO 2017-02-01 08:05:37,483 Hardware.py:168 - Some mount points were ignored: /, /dev/shm, /boot, /vagrant, /vagrant
INFO 2017-02-01 08:05:37,485 Controller.py:319 - Sending Heartbeat (id = 185)
INFO 2017-02-01 08:05:37,532 Controller.py:331 - Heartbeat response received (id = 186)
INFO 2017-02-01 08:05:37,532 Controller.py:340 - Heartbeat interval is 1 seconds
INFO 2017-02-01 08:05:37,532 Controller.py:376 - Updating configurations from heartbeat
INFO 2017-02-01 08:05:37,533 Controller.py:385 - Adding cancel/execution commands
INFO 2017-02-01 08:05:37,533 Controller.py:470 - Waiting 0.9 for next heartbeat
INFO 2017-02-01 08:05:38,433 Controller.py:477 - Wait for next heartbeat over
INFO 2017-02-01 08:06:34,225 Controller.py:257 - Adding 27 status commands. Heartbeat id = 246
INFO 2017-02-01 08:06:34,225 ActionQueue.py:120 - Adding STATUS_COMMAND for component YARN_CLIENT of service YARN of cluster TEST to the queue.
...
INFO 2017-02-01 08:06:35,904 ActionQueue.py:120 - Adding STATUS_COMMAND for component MYSQL_SERVER of service HIVE of cluster TEST to the queue.
INFO 2017-02-01 08:06:38,055 Controller.py:303 - Heartbeat (response id = 247) with server is running...
INFO 2017-02-01 08:06:38,055 Controller.py:310 - Building heartbeat message
INFO 2017-02-01 08:06:38,056 Heartbeat.py:90 - Adding host info/state to heartbeat message.
INFO 2017-02-01 08:06:38,123 Hardware.py:168 - Some mount points were ignored: /, /dev/shm, /boot, /vagrant, /vagrant
INFO 2017-02-01 08:06:38,123 Controller.py:319 - Sending Heartbeat (id = 247)
INFO 2017-02-01 08:06:38,168 Controller.py:331 - Heartbeat response received (id = 248)
INFO 2017-02-01 08:06:38,168 Controller.py:340 - Heartbeat interval is 1 seconds
INFO 2017-02-01 08:06:38,168 Controller.py:376 - Updating configurations from heartbeat
INFO 2017-02-01 08:06:38,168 Controller.py:385 - Adding cancel/execution commands
INFO 2017-02-01 08:06:38,169 Controller.py:470 - Waiting 0.9 for next heartbeat
INFO 2017-02-01 08:06:39,069 Controller.py:477 - Wait for next heartbeat over
```
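
For illustration, the level-switching idea described above could be sketched roughly as follows. This is a simplified stand-in, not the actual `Controller.py` code: the function names `pick_level` and `heartbeat_cycle`, the logger name, and the way the timestamp is passed around are all assumptions made for the example; only `state_interval`, `logging_level`, and the message texts are taken from this thread.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("heartbeat_sketch")


def pick_level(now, last_info_timestamp, state_interval):
    """Choose the level for one heartbeat cycle.

    Returns (level, timestamp): INFO once state_interval has passed since
    the last INFO cycle (and the timestamp is reset), DEBUG otherwise.
    """
    if now - last_info_timestamp > state_interval:
        return logging.INFO, now
    return logging.DEBUG, last_info_timestamp


def heartbeat_cycle(now, last_info_timestamp, state_interval, response_id):
    """Log one (simulated) heartbeat cycle at a single, per-cycle level."""
    logging_level, last_info_timestamp = pick_level(
        now, last_info_timestamp, state_interval)

    # Every message in the cycle uses the same level, so no per-message
    # "if" is needed: the logger's own threshold filters DEBUG records
    # when the agent is configured for INFO.
    logger.log(logging_level,
               "Heartbeat (response id = %s) with server is running...",
               response_id)
    logger.log(logging_level, "Building heartbeat message")
    logger.log(logging_level, "Sending Heartbeat (id = %s)", response_id)
    return last_info_timestamp
```

With the logger at `INFO`, a whole cycle is either fully visible (about once per `state_interval`) or fully filtered out, which matches the roughly one-minute gap between the two blocks in the sample output.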


- Attila

