Posted to issues@ambari.apache.org by "Hoc Phan (JIRA)" <ji...@apache.org> on 2017/12/28 00:01:00 UTC
[jira] [Created] (AMBARI-22701) hive CLI process leak on metastore alert
Hoc Phan created AMBARI-22701:
---------------------------------
Summary: hive CLI process leak on metastore alert
Key: AMBARI-22701
URL: https://issues.apache.org/jira/browse/AMBARI-22701
Project: Ambari
Issue Type: Bug
Components: alerts
Affects Versions: 2.4.0
Environment: CentOS 6.9
Ambari 2.4.0.1
Hortonworks Hadoop 2.5.0.0-1245
Hive installed
Tez installed
Reporter: Hoc Phan
alert_hive_metastore.py leaves orphaned processes behind, and they accumulate over time. Below is one example:
1001 593317 593316 0 Dec24 ? 00:00:00 -bash -c export PATH='/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/bin:/usr/bin:/var/lib/ambari-agent:/bin/:/usr/bin/:/usr/sbin/:/usr/hdp/current/hive-metastore/bin' ; export HIVE_CONF_DIR="/usr/hdp/current/hive-metastore/conf/conf.server" ; hive --hiveconf hive.metastore.uris=thrift://demo.local:9083 --hiveconf hive.metastore.client.connect.retry.delay=1 --hiveconf hive.metastore.failure.retries=1 --hiveconf hive.metastore.connect.retries=1 --hiveconf hive.metastore.client.socket.timeout=14 --hiveconf hive.execution.engine=mr -e "show databases;"
Over many months, thousands of these can accumulate on the host running Hive Metastore. To check, run the two commands below:
ps -ef | grep "[s]how databases" | wc -l
ps h -Led -o user | sort | uniq -c | sort -n
Eventually this hits the nproc limit and crashes other services on the same host.
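The same checks can be scripted. This is a minimal sketch, assuming the output of `ps -ef` has already been captured as text. The function names `count_leaked_checks` and `count_by_user` are hypothetical and not part of Ambari. It counts processes rather than threads, so it is not a drop-in replacement for the second command above.

```python
from collections import Counter


def count_leaked_checks(ps_ef_output, marker="show databases"):
    """Count process lines whose command line contains the alert's query."""
    return sum(1 for line in ps_ef_output.splitlines() if marker in line)


def count_by_user(ps_ef_output):
    """Count processes per owner (first column of `ps -ef`), skipping the header."""
    counts = Counter()
    for line in ps_ef_output.splitlines():
        fields = line.split()
        if fields and fields[0] != "UID":
            counts[fields[0]] += 1
    return counts
```

A count from `count_leaked_checks` that keeps growing between runs would suggest the alert is still leaking processes.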
The fixes are:
1. Switch to the "hive" user instead of the "ambari-qa" user:
https://issues.apache.org/jira/browse/AMBARI-22142
2. Change hive CLI to beeline:
https://issues.apache.org/jira/browse/AMBARI-17006
For some reason, the hive CLI processes are not killed and keep lingering.
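Independently of the two fixes above, one general way to keep a timed-out check from leaving children behind is to run it in its own process group and kill the whole group on timeout. The sketch below only illustrates that technique and is not how the Ambari agent runs alerts. The helper name `run_with_timeout` is hypothetical, and the code assumes Python 3 on a POSIX host.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout_seconds):
    """Run a shell command; on timeout, kill its whole process group.

    Returns (return_code, timed_out).
    """
    # start_new_session=True makes the shell a session and group leader,
    # so the shell and everything it spawns can be signalled together.
    proc = subprocess.Popen(cmd, shell=True, start_new_session=True,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    try:
        proc.communicate(timeout=timeout_seconds)
        return proc.returncode, False
    except subprocess.TimeoutExpired:
        # The process group ID equals the PID of the group leader.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.communicate()  # reap the killed shell so no zombie is left
        return proc.returncode, True
```

Killing only the shell's PID would leave a child started by that shell (here, a JVM started by `hive`) running, which matches the leftover processes in the `ps` example above.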
Proposed fix to alert_hive_metastore.py in /var/lib/ambari-server/resources/common-services/HIVE/0.12.0.2.0/package/alerts
Instructions:
1. Add the lines below after "HIVE_METASTORE_URIS_KEY = '{{hive-site/hive.metastore.uris}}'":
HIVE_SERVER_THRIFT_PORT_KEY = '{{hive-site/hive.server2.thrift.port}}'
HIVE_SERVER_THRIFT_HTTP_PORT_KEY = '{{hive-site/hive.server2.thrift.http.port}}'
HIVE_SERVER_TRANSPORT_MODE_KEY = '{{hive-site/hive.server2.transport.mode}}'
THRIFT_PORT_DEFAULT = 10000
HIVE_SERVER_TRANSPORT_MODE_DEFAULT = 'binary'
2. Change SMOKEUSER_DEFAULT = 'ambari-qa' to:
SMOKEUSER_DEFAULT = 'hive'
3. Replace this:
return (SECURITY_ENABLED_KEY,SMOKEUSER_KEYTAB_KEY,SMOKEUSER_PRINCIPAL_KEY, HIVE_METASTORE_URIS_KEY, SMOKEUSER_KEY, KERBEROS_EXECUTABLE_SEARCH_PATHS_KEY, STACK_ROOT)
with this:
return (SECURITY_ENABLED_KEY,SMOKEUSER_KEYTAB_KEY,SMOKEUSER_PRINCIPAL_KEY, HIVE_METASTORE_URIS_KEY, SMOKEUSER_KEY, KERBEROS_EXECUTABLE_SEARCH_PATHS_KEY, STACK_ROOT, HIVE_SERVER_THRIFT_PORT_KEY, HIVE_SERVER_THRIFT_HTTP_PORT_KEY, HIVE_SERVER_TRANSPORT_MODE_KEY)
4. Replace this
return (HIVE_METASTORE_URIS_KEY, HADOOPUSER_KEY)
with this:
return (HIVE_SERVER_THRIFT_PORT_KEY, HIVE_SERVER_THRIFT_HTTP_PORT_KEY, HIVE_SERVER_TRANSPORT_MODE_KEY, HIVE_METASTORE_URIS_KEY, HADOOPUSER_KEY)
5. Comment out these lines, because they keep injecting the ambari-qa user back:
#if SMOKEUSER_KEY in configurations:
#  smokeuser = configurations[SMOKEUSER_KEY]
6. Replace this code block:
cmd = format("export HIVE_CONF_DIR='{conf_dir}' ; "
             "hive --hiveconf hive.metastore.uris={metastore_uri}\
             --hiveconf hive.metastore.client.connect.retry.delay=1\
             --hiveconf hive.metastore.failure.retries=1\
             --hiveconf hive.metastore.connect.retries=1\
             --hiveconf hive.metastore.client.socket.timeout=14\
             --hiveconf hive.execution.engine=mr -e 'show databases;'")
with this block:
transport_mode = HIVE_SERVER_TRANSPORT_MODE_DEFAULT
if HIVE_SERVER_TRANSPORT_MODE_KEY in configurations:
  transport_mode = configurations[HIVE_SERVER_TRANSPORT_MODE_KEY]
port = THRIFT_PORT_DEFAULT
if transport_mode.lower() == 'binary' and HIVE_SERVER_THRIFT_PORT_KEY in configurations:
  port = int(configurations[HIVE_SERVER_THRIFT_PORT_KEY])
elif transport_mode.lower() == 'http' and HIVE_SERVER_THRIFT_HTTP_PORT_KEY in configurations:
  port = int(configurations[HIVE_SERVER_THRIFT_HTTP_PORT_KEY])
cmd = format("export HIVE_CONF_DIR='{conf_dir}' ; "
             "beeline -u jdbc:hive2://{host_name}:{port}/\
             --hiveconf hive.metastore.client.connect.retry.delay=1\
             --hiveconf hive.metastore.failure.retries=1\
             --hiveconf hive.metastore.connect.retries=1\
             --hiveconf hive.metastore.client.socket.timeout=14\
             --hiveconf hive.execution.engine=mr -e 'show databases;'")
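The port selection in step 6 can be checked in isolation. The sketch below mirrors that logic as a standalone function. The names `resolve_hiveserver2_port` and `beeline_url` are hypothetical, and `configurations` is assumed to be a plain dict keyed by the constants from step 1.

```python
HIVE_SERVER_THRIFT_PORT_KEY = '{{hive-site/hive.server2.thrift.port}}'
HIVE_SERVER_THRIFT_HTTP_PORT_KEY = '{{hive-site/hive.server2.thrift.http.port}}'
HIVE_SERVER_TRANSPORT_MODE_KEY = '{{hive-site/hive.server2.transport.mode}}'
THRIFT_PORT_DEFAULT = 10000
HIVE_SERVER_TRANSPORT_MODE_DEFAULT = 'binary'


def resolve_hiveserver2_port(configurations):
    """Pick the HiveServer2 port for the configured transport mode."""
    transport_mode = configurations.get(
        HIVE_SERVER_TRANSPORT_MODE_KEY, HIVE_SERVER_TRANSPORT_MODE_DEFAULT).lower()
    if transport_mode == 'binary' and HIVE_SERVER_THRIFT_PORT_KEY in configurations:
        return int(configurations[HIVE_SERVER_THRIFT_PORT_KEY])
    if transport_mode == 'http' and HIVE_SERVER_THRIFT_HTTP_PORT_KEY in configurations:
        return int(configurations[HIVE_SERVER_THRIFT_HTTP_PORT_KEY])
    # Same fallback as the proposed block: the binary default port.
    return THRIFT_PORT_DEFAULT


def beeline_url(host_name, configurations):
    """Build the JDBC URL that is passed to `beeline -u`."""
    return "jdbc:hive2://{0}:{1}/".format(
        host_name, resolve_hiveserver2_port(configurations))
```

For example, `beeline_url('demo.local', {})` gives `jdbc:hive2://demo.local:10000/`. In HTTP transport mode a HiveServer2 JDBC URL normally also needs `transportMode=http` and an `httpPath` setting. Neither the proposed block nor this sketch adds them, so an HTTP-mode cluster may need the URL extended.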