Posted to dev@ambari.apache.org by "Jonathan Hurley (JIRA)" <ji...@apache.org> on 2015/03/16 16:10:38 UTC

[jira] [Created] (AMBARI-10083) Ambari Agent Alerts Prevents Binding to the Ping Port Listener On Startup

Jonathan Hurley created AMBARI-10083:
----------------------------------------

             Summary: Ambari Agent Alerts Prevents Binding to the Ping Port Listener On Startup
                 Key: AMBARI-10083
                 URL: https://issues.apache.org/jira/browse/AMBARI-10083
             Project: Ambari
          Issue Type: Bug
          Components: ambari-agent
    Affects Versions: 2.0.0
            Reporter: Jonathan Hurley
            Assignee: Jonathan Hurley
            Priority: Critical
             Fix For: 2.0.0


When restarting an Ambari Agent, child processes seem to hold onto the ping port server socket that the parent agent process listens on, which prevents the restarted agent from binding to that port:

{noformat}
hdp2-02-02: ERROR: ambari-agent start failed. For more details, see
/var/log/ambari-agent/ambari-agent.out:
hdp2-02-02: ====================
hdp2-02-02: UID        PID  PPID  C STIME TTY          TIME CMD
hdp2-02-02: root     23667 23663  0 09:40 ?        00:00:00 /usr/bin/sudo
su ambari-qa -l -s /bin/bash -c export
PATH='/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/amb
ari-server/*:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/
root/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/bin/:/usr/bin/:/usr/l
ib/hive/bin/:/usr/sbin/' ; hive --hiveconf
hive.metastore.uris=thrift://hdp2-02-02.kane.homelinux.net:9083 -e 'show
databases;'
hdp2-02-02: Exception in thread Thread-1 (most likely raised during
interpreter shutdown):
hdp2-02-02: Traceback (most recent call last):
hdp2-02-02:   File "/usr/lib64/python2.6/threading.py", line 532, in
__bootstrap_inner
hdp2-02-02:   File
"/usr/lib/python2.6/site-packages/ambari_agent/DataCleaner.py", line 119,
in run
hdp2-02-02:   File "/usr/lib64/python2.6/logging/__init__.py", line 1056,
in info
hdp2-02-02:   File "/usr/lib64/python2.6/logging/__init__.py", line 1164,
in _log
hdp2-02-02:   File "/usr/lib64/python2.6/logging/__init__.py", line 1134,
in findCaller
hdp2-02-02: <type 'exceptions.AttributeError'>: 'NoneType' object has no
attribute 'path'
hdp2-02-02: ====================
hdp2-02-02: Agent out at: /var/log/ambari-agent/ambari-agent.out
hdp2-02-02: Agent log at: /var/log/ambari-agent/ambari-agent.log
hdp2-02-02: ambari-server: unrecognized service


Here is the tail of the ambari-agent log on that server:

INFO 2015-03-11 09:40:13,518 logger.py:65 - u"Execute['hive --hiveconf
hive.metastore.uris=thrift://hdp2-02-02:9083 -e 'show
databases;'']" {'path': ['/bin/
', '/usr/bin/', '/usr/lib/hive/bin/', '/usr/sbin/'], 'user': 'ambari-qa',
'timeout': 240}
INFO 2015-03-11 09:40:13,756 scheduler.py:527 - Job
"ec115aa5-8e09-454c-a4db-3f7d8ee47d84 (trigger: interval[0:01:00], next
run at: 2015-03-11 09:41:12.764254)" executed successfully
INFO 2015-03-11 09:40:19,090 Heartbeat.py:75 - Building Heartbeat:
{responseId = 3286, timestamp = 1426081219090, commandsInProgress = False,
componentsMapped = True}
INFO 2015-03-11 09:40:19,102 Controller.py:247 - Heartbeat response
received (id = 3287)
INFO 2015-03-11 09:40:19,102 Controller.py:291 - No commands sent from
hdp2-02-01.kane.homelinux.net
INFO 2015-03-11 09:40:26,771 main.py:68 - loglevel=logging.INFO
INFO 2015-03-11 09:40:29,103 Heartbeat.py:75 - Building Heartbeat:
{responseId = 3287, timestamp = 1426081229103, commandsInProgress = False,
componentsMapped = True}
INFO 2015-03-11 09:40:29,104 security.py:135 - Encountered communication
error. Details: BadStatusLine('',)
ERROR 2015-03-11 09:40:29,104 Controller.py:319 - Connection to
hdp2-02-01.kane.homelinux.net was lost (details=Request to
https://hdp2-02-01.kane.homelinux.net:8441/agent/v1/
heartbeat/hdp2-02-02.kane.homelinux.net failed due to Error occured during
connecting to the server: )
INFO 2015-03-11 09:40:33,312 main.py:68 - loglevel=logging.INFO
INFO 2015-03-11 09:40:33,313 DataCleaner.py:36 - Data cleanup thread
started
INFO 2015-03-11 09:40:33,323 DataCleaner.py:117 - Data cleanup started
ERROR 2015-03-11 09:40:33,433 main.py:243 - Failed to start ping port
listener of: Could not open port 8670 because port already used by another
process:
UID        PID  PPID  C STIME TTY          TIME CMD
root     23667 23663  0 09:40 ?        00:00:00 /usr/bin/sudo su ambari-qa
-l -s /bin/bash -c export
PATH='/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/a
mbari-server/*:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
:/root/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/bin/:/usr/bin/:/usr
/lib/hive/bin/:/usr/sbin/
' ; hive --hiveconf
hive.metastore.uris=thrift://hdp2-02-02:9083 -e 'show
databases;'

INFO 2015-03-11 09:40:33,433 PingPortListener.py:62 - Ping port listener
killed
{noformat}
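
The behavior described above is consistent with a listening socket's file descriptor being inherited by a spawned child, which then keeps the port bound after the parent exits. The following is a minimal Python 3 sketch of that mechanism on a POSIX system, not the agent's actual code and not necessarily the fix applied for this issue. The function names open_listener and child_inherits are hypothetical, and an ephemeral port is used rather than 8670.

```python
import fcntl
import socket
import subprocess
import sys


def open_listener(port=0, close_on_exec=True):
    """Open a listening TCP socket and set or clear its close-on-exec flag."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("127.0.0.1", port))  # port 0 lets the OS pick a free port
    sock.listen(1)
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFD)
    if close_on_exec:
        flags |= fcntl.FD_CLOEXEC   # descriptor is closed when a child calls exec
    else:
        flags &= ~fcntl.FD_CLOEXEC  # descriptor leaks into exec'd children
    fcntl.fcntl(sock.fileno(), fcntl.F_SETFD, flags)
    return sock


def child_inherits(sock):
    """Return True if a spawned child still sees the descriptor as an open socket."""
    probe = (
        "import os, stat, sys\n"
        "mode = os.fstat(int(sys.argv[1])).st_mode\n"
        "sys.exit(0 if stat.S_ISSOCK(mode) else 1)\n"
    )
    # close_fds=False mimics a launcher that does not close inherited descriptors.
    rc = subprocess.call(
        [sys.executable, "-c", probe, str(sock.fileno())],
        close_fds=False,
        stderr=subprocess.DEVNULL,
    )
    return rc == 0


if __name__ == "__main__":
    leaky = open_listener(close_on_exec=False)
    print("inherited without FD_CLOEXEC:", child_inherits(leaky))
    leaky.close()

    safe = open_listener(close_on_exec=True)
    print("inherited with FD_CLOEXEC:", child_inherits(safe))
    safe.close()
```

Under these assumptions, a long-running child started while the descriptor is inheritable would keep the port in use until that child exits. Marking the listener close-on-exec, or spawning children with close_fds=True, are two common ways to avoid this.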



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)