Posted to issues@ambari.apache.org by "Ritesh (JIRA)" <ji...@apache.org> on 2017/08/10 02:52:02 UTC

[jira] [Updated] (AMBARI-21697) Ambari showing false alert for STS connectivity while using http mode

     [ https://issues.apache.org/jira/browse/AMBARI-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ritesh updated AMBARI-21697:
----------------------------
    Summary: Ambari showing false alert for STS connectivity while using http mode  (was: Spark thrift service was alerting for connectivity while using http mode)

> Ambari showing false alert for STS connectivity while using http mode
> ---------------------------------------------------------------------
>
>                 Key: AMBARI-21697
>                 URL: https://issues.apache.org/jira/browse/AMBARI-21697
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.5.1
>            Reporter: Ritesh
>         Attachments: AMBARI-999.patch
>
>
> Ambari keeps showing a Spark Thrift Server down alert on newly installed clusters that use http mode.
> The alert for the Spark Thrift service appears every time a new cluster is created.
> The script used by alert is /var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package/scripts/alerts/alert_spark2_thrift_port.py
> Error stack 
> =======
> Connection failed on host hn0-salqa0.lv5aupozrfhezhozcxr3xjcwqe.dx.internal.cloudapp.net:10016 (Traceback (most recent call last): 
> File "/var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package/scripts/alerts/alert_spark2_thrift_port.py", line 144, in execute 
> Execute(cmd, user=hiveruser, path=[beeline_cmd], timeout=CHECK_COMMAND_TIMEOUT_DEFAULT) 
> File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__ 
> self.env.run() 
> File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run 
> self.run_action(resource, action) 
> File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action 
> provider_action() 
> File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run 
> tries=self.resource.tries, try_sleep=self.resource.try_sleep) 
> File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner 
> result = function(command, **kwargs) 
> File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call 
> tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy) 
> File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper 
> result = _call(command, **kwargs_copy) 
> File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call 
> raise ExecutionFailed(err_msg, code, out, err)
> *ExecutionFailed: Execution of '! beeline -u 'jdbc:hive2://hn0-salqa0.lv5aupozrfhezhozcxr3xjcwqe.dx.internal.cloudapp.net:10016/default' transportMode=http -e '' 2>&1| awk '{print}'|grep -i -e 'Connection refused' -e 'Invalid URL'' returned 1. Error: Could not open client transport with JDBC Uri: jdbc:hive2://hn0-salqa0.lv5aupozrfhezhozcxr3xjcwqe.dx.internal.cloudapp.net:10016/default: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0)*
> Error: Could not open client transport with JDBC Uri: jdbc:hive2://hn0-salqa0.lv5aupozrfhezhozcxr3xjcwqe.dx.internal.cloudapp.net:10016/default: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0)
> It seems that the alert checks the wrong port (10016 instead of 10002) when the server is configured in http mode (transportMode=http).
> Reason
> =====
> From the logic in the script, it seems the port starts at THRIFT_PORT_DEFAULT and is replaced by the configured HIVE_SERVER_THRIFT_PORT only when the transport mode is binary. In http mode nothing overrides the default, so the alert always checks port 10016.
> ============
> THRIFT_PORT_DEFAULT = 10016
> HIVE_SERVER_TRANSPORT_MODE_DEFAULT = 'binary'
> port = THRIFT_PORT_DEFAULT
> if transport_mode.lower() == 'binary' and HIVE_SERVER_THRIFT_PORT_KEY in configurations:
>     port = int(configurations[HIVE_SERVER_THRIFT_PORT_KEY])
> ========
> Resolution
> ==========
> We should change the default port to 10002 in the alert script. 
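
The port selection described under Reason can be sketched so that http mode is handled explicitly. This is only an illustration under stated assumptions, not the attached patch: it goes a step further than changing the default by branching on the transport mode. The configuration key names and the helper select_thrift_port are hypothetical, and 10002 is taken from the report as the http-mode port.

```python
# Values quoted in the report.
THRIFT_PORT_DEFAULT = 10016
HIVE_SERVER_TRANSPORT_MODE_DEFAULT = 'binary'

# Assumption: 10002 is the http-mode port named in the report.
THRIFT_HTTP_PORT_DEFAULT = 10002

# Hypothetical configuration keys, for illustration only; the real alert
# script defines its own key constants.
TRANSPORT_MODE_KEY = 'transport_mode'
THRIFT_PORT_KEY = 'thrift_port'
THRIFT_HTTP_PORT_KEY = 'thrift_http_port'


def select_thrift_port(configurations):
    """Return the port the alert should probe for the configured transport mode."""
    transport_mode = configurations.get(
        TRANSPORT_MODE_KEY, HIVE_SERVER_TRANSPORT_MODE_DEFAULT).lower()

    if transport_mode == 'http':
        # http mode: use the configured http port, else the http default.
        return int(configurations.get(THRIFT_HTTP_PORT_KEY,
                                      THRIFT_HTTP_PORT_DEFAULT))

    # binary mode (or any other value): same result as the original snippet.
    return int(configurations.get(THRIFT_PORT_KEY, THRIFT_PORT_DEFAULT))
```

With an empty configuration the helper still returns 10016, so binary-mode clusters would behave as before; only the http branch changes which port is probed.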



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)