Posted to issues@ambari.apache.org by "Ritesh (JIRA)" <ji...@apache.org> on 2017/08/10 02:45:00 UTC
[jira] [Created] (AMBARI-21697) Spark thrift service was alerting for connectivity while using http mode
Ritesh created AMBARI-21697:
-------------------------------
Summary: Spark thrift service was alerting for connectivity while using http mode
Key: AMBARI-21697
URL: https://issues.apache.org/jira/browse/AMBARI-21697
Project: Ambari
Issue Type: Bug
Components: ambari-server
Affects Versions: 2.5.1
Reporter: Ritesh
Newly installed clusters keep showing an Ambari alert that the Spark thrift server is down when it runs in http mode.
The alert for the Spark thrift service appears every time a new cluster is created.
The script used by the alert is /var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package/scripts/alerts/alert_spark2_thrift_port.py
Error stack
=======
Connection failed on host hn0-salqa0.lv5aupozrfhezhozcxr3xjcwqe.dx.internal.cloudapp.net:10016 (Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package/scripts/alerts/alert_spark2_thrift_port.py", line 144, in execute
    Execute(cmd, user=hiveruser, path=[beeline_cmd], timeout=CHECK_COMMAND_TIMEOUT_DEFAULT)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
ExecutionFailed: Execution of '! beeline -u 'jdbc:hive2://hn0-salqa0.lv5aupozrfhezhozcxr3xjcwqe.dx.internal.cloudapp.net:10016/default' transportMode=http -e '' 2>&1| awk '{print}'|grep -i -e 'Connection refused' -e 'Invalid URL'' returned 1. Error: Could not open client transport with JDBC Uri: jdbc:hive2://hn0-salqa0.lv5aupozrfhezhozcxr3xjcwqe.dx.internal.cloudapp.net:10016/default: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0)
Error: Could not open client transport with JDBC Uri: jdbc:hive2://hn0-salqa0.lv5aupozrfhezhozcxr3xjcwqe.dx.internal.cloudapp.net:10016/default: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0)
It seems that the alert is checking the wrong port (10016 instead of 10002) when the service is configured in http mode (transportMode=http).
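One way to confirm which of the two ports is actually accepting connections is a plain TCP probe from the affected host. The following is only a minimal sketch for Python 3: the helper name port_is_open and the host value are hypothetical, and it checks TCP reachability only, not whether a Thrift server is answering on that port.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    host = "localhost"  # hypothetical; substitute the Spark thrift server host
    for port in (10016, 10002):
        state = "open" if port_is_open(host, port) else "closed or unreachable"
        print("%s:%d is %s" % (host, port, state))
```

If 10002 is open while 10016 is closed on a cluster running in http mode, that would match the behaviour described above.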
Reason
=====
From the logic in the script, the port is overridden only when the transport mode is binary, in which case it uses the configured HIVE_SERVER_THRIFT_PORT value, which is the same as THRIFT_PORT_DEFAULT. In http mode nothing overrides the default, so the alert always checks port 10016.
============
THRIFT_PORT_DEFAULT = 10016
HIVE_SERVER_TRANSPORT_MODE_DEFAULT = 'binary'
port = THRIFT_PORT_DEFAULT
if transport_mode.lower() == 'binary' and HIVE_SERVER_THRIFT_PORT_KEY in configurations:
    port = int(configurations[HIVE_SERVER_THRIFT_PORT_KEY])
========
Resolution
==========
We should change the default port to 10002 in the alert script.
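As an illustration only, the port selection could also be made aware of the transport mode instead of changing a single default. The sketch below is not the fix adopted in Ambari. The function select_port, the constant THRIFT_HTTP_PORT_DEFAULT, the key HIVE_SERVER_THRIFT_HTTP_PORT_KEY and both key strings are hypothetical placeholders; only THRIFT_PORT_DEFAULT, HIVE_SERVER_TRANSPORT_MODE_DEFAULT and the name HIVE_SERVER_THRIFT_PORT_KEY come from the excerpt above, and 10002 is the http port reported in this issue.

```python
THRIFT_PORT_DEFAULT = 10016        # binary-mode default, from the script excerpt
THRIFT_HTTP_PORT_DEFAULT = 10002   # assumption: http-mode port reported in this issue
HIVE_SERVER_TRANSPORT_MODE_DEFAULT = 'binary'

# Placeholder key strings (hypothetical); the real alert script defines its own.
HIVE_SERVER_THRIFT_PORT_KEY = 'thrift.port'
HIVE_SERVER_THRIFT_HTTP_PORT_KEY = 'thrift.http.port'


def select_port(configurations, transport_mode=HIVE_SERVER_TRANSPORT_MODE_DEFAULT):
    """Pick the port to probe based on the transport mode.

    A configured value wins; otherwise fall back to the default
    that belongs to the selected mode.
    """
    if transport_mode.lower() == 'http':
        key, default = HIVE_SERVER_THRIFT_HTTP_PORT_KEY, THRIFT_HTTP_PORT_DEFAULT
    else:
        key, default = HIVE_SERVER_THRIFT_PORT_KEY, THRIFT_PORT_DEFAULT
    if key in configurations:
        return int(configurations[key])
    return default
```

With an empty configuration, select_port({}, 'http') falls back to 10002 while select_port({}, 'binary') still returns 10016, so binary-mode clusters would keep their current behaviour.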