Posted to dev@ambari.apache.org by "Dmitry Lysnichenko (JIRA)" <ji...@apache.org> on 2014/08/21 19:17:11 UTC

[jira] [Resolved] (AMBARI-6978) Uncatched exception at ambari agent - it may die on connection error

     [ https://issues.apache.org/jira/browse/AMBARI-6978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitry Lysnichenko resolved AMBARI-6978.
----------------------------------------

    Resolution: Fixed

Committed to trunk

> Uncatched exception at ambari agent - it may die on connection error
> --------------------------------------------------------------------
>
>                 Key: AMBARI-6978
>                 URL: https://issues.apache.org/jira/browse/AMBARI-6978
>             Project: Ambari
>          Issue Type: Task
>          Components: agent, test
>    Affects Versions: 1.7.0
>            Reporter: Dmitry Lysnichenko
>            Assignee: Dmitry Lysnichenko
>             Fix For: 1.7.0
>
>
> I ran into this situation on 2 agent hosts after upgrading ambari-server, resetting the database, and restarting ambari-server a few times. Probably there was a rare case where the agent got a connection exception during registration and the exception was not caught. As a result, agent registration failed, and I had to log in to the agent host and start the agent manually.
> The expected behavior for an agent is to ignore any server-side connection problems and stay alive.
> {code}
> INFO 2014-08-19 19:02:03,899 NetUtil.py:48 - Connecting to https://vm-0.vm:8440/connection_info
> WARNING 2014-08-19 19:02:03,900 NetUtil.py:71 - Failed to connect to https://vm-0.vm:8440/connection_info due to [Errno 111] Connection refused  
> DEBUG 2014-08-19 19:02:03,900 security.py:47 - Server two-way SSL authentication required: False
> INFO 2014-08-19 19:02:03,900 security.py:93 - SSL Connect being called.. connecting to the server
> DEBUG 2014-08-19 19:02:03,901 security.py:134 - Error in sending/receving data from the server Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 128, in request
>     req.get_data(), req.headers)
>   File "/usr/lib64/python2.6/httplib.py", line 920, in request
>     self._send_request(method, url, body, headers)
>   File "/usr/lib64/python2.6/httplib.py", line 951, in _send_request
>     self.endheaders()
>   File "/usr/lib64/python2.6/httplib.py", line 908, in endheaders
>     self._send_output()
>   File "/usr/lib64/python2.6/httplib.py", line 780, in _send_output
>     self.send(msg)
>   File "/usr/lib64/python2.6/httplib.py", line 739, in send
>     self.connect()
>   File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 53, in connect
>     sock = self.create_connection()
>   File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 94, in create_connection
>     sock = socket.create_connection((self.host, self.port), 60)
>   File "/usr/lib64/python2.6/socket.py", line 567, in create_connection
>     raise error, msg
> error: [Errno 111] Connection refused
> INFO 2014-08-19 19:02:03,901 security.py:135 - Encountered communication error. Details: error(111, 'Connection refused')
> ERROR 2014-08-19 19:02:03,901 Controller.py:115 - Request to https://vm-0.vm:8441/agent/v1/register/vm-2.vm failed due to Error occured during connecting to the server: [Errno 111] Connection refused
> INFO 2014-08-19 19:02:03,907 main.py:55 - signal received, exiting.
> INFO 2014-08-19 19:02:03,907 ProcessHelper.py:39 - Removing pid file
> INFO 2014-08-19 19:02:03,907 ProcessHelper.py:46 - Removing temp files
> {code}
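
The expected behavior described in the issue (stay alive when the server refuses connections) can be illustrated with a minimal sketch. This is not the committed fix and not the agent's actual code: the function name register_with_retry, its parameters, and the register callable are hypothetical, and the sketch is written for Python 3, whereas the agent in the log runs on Python 2.6. It only shows the general pattern of catching connection errors around a registration call and retrying instead of letting the exception end the process.

```python
import time


def register_with_retry(register, delay_seconds=10, max_attempts=None,
                        sleep=time.sleep):
    """Call register() until it succeeds, surviving connection errors.

    Hypothetical helper for illustration only; `register` stands in for
    whatever call performs the agent-to-server registration request.
    Returns the result of register(), or None if max_attempts is reached.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        try:
            return register()
        except OSError as err:
            # In Python 3, socket.error, ConnectionRefusedError and
            # ssl.SSLError are all OSError subclasses, so "[Errno 111]
            # Connection refused" lands here instead of propagating.
            print("Registration attempt %d failed: %s" % (attempt, err))
            sleep(delay_seconds)
    return None
```

With max_attempts left as None the loop keeps retrying until the server becomes reachable, which matches the "stay alive" expectation; a finite value is mainly useful for tests.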


