Posted to dev@ambari.apache.org by "Sam Mingolelli (JIRA)" <ji...@apache.org> on 2016/02/25 00:35:18 UTC

[jira] [Commented] (AMBARI-15165) HDFS Datanode won't start in secure cluster

    [ https://issues.apache.org/jira/browse/AMBARI-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166335#comment-15166335 ] 

Sam Mingolelli commented on AMBARI-15165:
-----------------------------------------

I think I figured out this particular issue. This line is key:

{quote}
java.io.IOException: Login failure for dn/host-192-168-114-49.td.local@<REDACTED KERBEROS REALM> from keytab /etc/security/keytabs/dn.service.keytab: javax.security.auth.login.LoginException: Unable to obtain password from user
{quote}
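
As a quick sanity check (a minimal sketch; the keytab path is taken from the error above and the realm is redacted here), you can compare the principals actually stored in the keytab against the FQDN the host resolves for itself:

{code}
# List the principals stored in the DataNode service keytab (path from the error above)
klist -kt /etc/security/keytabs/dn.service.keytab

# Show the fully qualified hostname this host resolves for itself
hostname -f
{code}

If the dn/<fqdn>@REALM entry in the keytab doesn't match the hostname the DataNode substitutes into its principal, the Kerberos login typically fails with exactly this "Unable to obtain password from user" message.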

For whatever reason this system was identifying itself by two different hostnames. I had used hostnamectl to set the hostname explicitly, but once Ambari + HDP + Kerberos came into play, the principal was being constructed with the hostname host-192-168-114-49.td.local. I resolved the issue by also setting the hostname explicitly in the host's /etc/hosts file.

Running hostname -A showed the offending hostname was still in use; adding the correct entry to /etc/hosts satisfied both hostname -A and Ambari, so they now report the same hostname that hostnamectl does.
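
For reference, a minimal sketch of that workaround (the FQDN below is a placeholder and the IP is only inferred from the auto-generated hostname, so substitute your host's real values):

{code}
# Set the desired static hostname explicitly (placeholder FQDN)
hostnamectl set-hostname mynode.td.local

# Map the node's address to that FQDN so hostname -A agrees with hostnamectl
# (placeholder IP and names; run as root)
echo "192.168.114.49  mynode.td.local mynode" >> /etc/hosts

# These should now report the same FQDN that the Kerberos principal expects
hostname -A
hostname -f
{code}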

> HDFS Datanode won't start in secure cluster
> -------------------------------------------
>
>                 Key: AMBARI-15165
>                 URL: https://issues.apache.org/jira/browse/AMBARI-15165
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-web
>    Affects Versions: 2.2.0
>         Environment: {code}
> $ cat /etc/redhat-release
> CentOS Linux release 7.2.1511 (Core)
> $ uname -a
> Linux dev09-ost-hivetest-h-hb02.td.local 3.10.0-327.10.1.el7.x86_64 #1 SMP Tue Feb 16 17:03:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> {code}
>            Reporter: Sam Mingolelli
>
> This issue sounds related, but I'm on a newer version that should already include this patch: https://issues.apache.org/jira/browse/AMBARI-12355
> When I attempt to Kerberize an HDP cluster, the HDFS DataNode fails to start without any obvious error. Nothing telling appears in its logs; see the ambari-agent errors log referenced below.
> {code}
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 167, in <module>
>     DataNode().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 62, in start
>     datanode(action="start")
>   File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
>     return fn(*args, **kwargs)
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_datanode.py", line 72, in datanode
>     create_log_dir=True
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 267, in service
>     Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 158, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 121, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
>     tries=self.resource.tries, try_sleep=self.resource.try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
>     tries=tries, try_sleep=try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
>     raise Fail(err_msg)
> resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh  -H -E /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start datanode' returned 1. starting datanode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-datanode-dev09-ost-hivetest-h-hb02.td.local.out
> stdout:   /var/lib/ambari-agent/data/output-228.txt
> 2016-02-24 10:51:14,841 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.3.4.0-3485
> 2016-02-24 10:51:14,841 - Checking if need to create versioned conf dir /etc/hadoop/2.3.4.0-3485/0
> 2016-02-24 10:51:14,841 - call['conf-select create-conf-dir --package hadoop --stack-version 2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
> 2016-02-24 10:51:14,877 - call returned (1, '/etc/hadoop/2.3.4.0-3485/0 exist already', '')
> 2016-02-24 10:51:14,878 - checked_call['conf-select set-conf-dir --package hadoop --stack-version 2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
> 2016-02-24 10:51:14,910 - checked_call returned (0, '/usr/hdp/2.3.4.0-3485/hadoop/conf -> /etc/hadoop/2.3.4.0-3485/0')
> 2016-02-24 10:51:14,910 - Ensuring that hadoop has the correct symlink structure
> 2016-02-24 10:51:14,910 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2016-02-24 10:51:15,091 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.3.4.0-3485
> 2016-02-24 10:51:15,091 - Checking if need to create versioned conf dir /etc/hadoop/2.3.4.0-3485/0
> 2016-02-24 10:51:15,091 - call['conf-select create-conf-dir --package hadoop --stack-version 2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
> 2016-02-24 10:51:15,120 - call returned (1, '/etc/hadoop/2.3.4.0-3485/0 exist already', '')
> 2016-02-24 10:51:15,121 - checked_call['conf-select set-conf-dir --package hadoop --stack-version 2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
> 2016-02-24 10:51:15,162 - checked_call returned (0, '/usr/hdp/2.3.4.0-3485/hadoop/conf -> /etc/hadoop/2.3.4.0-3485/0')
> 2016-02-24 10:51:15,162 - Ensuring that hadoop has the correct symlink structure
> 2016-02-24 10:51:15,162 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2016-02-24 10:51:15,164 - Group['hadoop'] {}
> 2016-02-24 10:51:15,165 - Group['users'] {}
> 2016-02-24 10:51:15,166 - Group['knox'] {}
> 2016-02-24 10:51:15,166 - User['hive'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,167 - User['zookeeper'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,168 - User['ams'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,168 - User['ambari-qa'] {'gid': 'hadoop', 'groups': [u'users']}
> 2016-02-24 10:51:15,169 - User['tez'] {'gid': 'hadoop', 'groups': [u'users']}
> 2016-02-24 10:51:15,170 - User['hdfs'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,171 - User['yarn'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,172 - User['hcat'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,172 - User['mapred'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,173 - User['hbase'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,174 - User['knox'] {'gid': 'hadoop', 'groups': [u'hadoop']}
> 2016-02-24 10:51:15,175 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
> 2016-02-24 10:51:15,177 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
> 2016-02-24 10:51:15,182 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] due to not_if
> 2016-02-24 10:51:15,183 - Directory['/tmp/hbase-hbase'] {'owner': 'hbase', 'recursive': True, 'mode': 0775, 'cd_access': 'a'}
> 2016-02-24 10:51:15,184 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
> 2016-02-24 10:51:15,185 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] {'not_if': '(test $(id -u hbase) -gt 1000) || (false)'}
> 2016-02-24 10:51:15,190 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] due to not_if
> 2016-02-24 10:51:15,191 - Group['hdfs'] {'ignore_failures': False}
> 2016-02-24 10:51:15,191 - User['hdfs'] {'ignore_failures': False, 'groups': [u'hadoop', u'hdfs']}
> 2016-02-24 10:51:15,192 - Directory['/etc/hadoop'] {'mode': 0755}
> 2016-02-24 10:51:15,210 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'root', 'group': 'hadoop'}
> 2016-02-24 10:51:15,211 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 0777}
> 2016-02-24 10:51:15,224 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}
> 2016-02-24 10:51:15,237 - Skipping Execute[('setenforce', '0')] due to not_if
> 2016-02-24 10:51:15,237 - Directory['/var/log/hadoop'] {'owner': 'root', 'mode': 0775, 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
> 2016-02-24 10:51:15,240 - Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root', 'recursive': True, 'cd_access': 'a'}
> 2016-02-24 10:51:15,240 - Changing owner for /var/run/hadoop from 1006 to root
> 2016-02-24 10:51:15,240 - Changing group for /var/run/hadoop from 1001 to root
> 2016-02-24 10:51:15,240 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'recursive': True, 'cd_access': 'a'}
> 2016-02-24 10:51:15,245 - File['/usr/hdp/current/hadoop-client/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'root'}
> 2016-02-24 10:51:15,247 - File['/usr/hdp/current/hadoop-client/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'root'}
> 2016-02-24 10:51:15,248 - File['/usr/hdp/current/hadoop-client/conf/log4j.properties'] {'content': ..., 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
> 2016-02-24 10:51:15,259 - File['/usr/hdp/current/hadoop-client/conf/hadoop-metrics2.properties'] {'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs'}
> 2016-02-24 10:51:15,260 - File['/usr/hdp/current/hadoop-client/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
> 2016-02-24 10:51:15,261 - File['/usr/hdp/current/hadoop-client/conf/configuration.xsl'] {'owner': 'hdfs', 'group': 'hadoop'}
> 2016-02-24 10:51:15,266 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop'}
> 2016-02-24 10:51:15,271 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}
> 2016-02-24 10:51:15,467 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.3.4.0-3485
> 2016-02-24 10:51:15,468 - Checking if need to create versioned conf dir /etc/hadoop/2.3.4.0-3485/0
> 2016-02-24 10:51:15,468 - call['conf-select create-conf-dir --package hadoop --stack-version 2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
> 2016-02-24 10:51:15,501 - call returned (1, '/etc/hadoop/2.3.4.0-3485/0 exist already', '')
> 2016-02-24 10:51:15,501 - checked_call['conf-select set-conf-dir --package hadoop --stack-version 2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
> 2016-02-24 10:51:15,534 - checked_call returned (0, '/usr/hdp/2.3.4.0-3485/hadoop/conf -> /etc/hadoop/2.3.4.0-3485/0')
> 2016-02-24 10:51:15,534 - Ensuring that hadoop has the correct symlink structure
> 2016-02-24 10:51:15,534 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2016-02-24 10:51:15,536 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.3.4.0-3485
> 2016-02-24 10:51:15,536 - Checking if need to create versioned conf dir /etc/hadoop/2.3.4.0-3485/0
> 2016-02-24 10:51:15,537 - call['conf-select create-conf-dir --package hadoop --stack-version 2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
> 2016-02-24 10:51:15,565 - call returned (1, '/etc/hadoop/2.3.4.0-3485/0 exist already', '')
> 2016-02-24 10:51:15,566 - checked_call['conf-select set-conf-dir --package hadoop --stack-version 2.3.4.0-3485 --conf-version 0'] {'logoutput': False, 'sudo': True, 'quiet': False}
> 2016-02-24 10:51:15,595 - checked_call returned (0, '/usr/hdp/2.3.4.0-3485/hadoop/conf -> /etc/hadoop/2.3.4.0-3485/0')
> 2016-02-24 10:51:15,596 - Ensuring that hadoop has the correct symlink structure
> 2016-02-24 10:51:15,596 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2016-02-24 10:51:15,605 - Directory['/etc/security/limits.d'] {'owner': 'root', 'group': 'root', 'recursive': True}
> 2016-02-24 10:51:15,612 - File['/etc/security/limits.d/hdfs.conf'] {'content': Template('hdfs.conf.j2'), 'owner': 'root', 'group': 'root', 'mode': 0644}
> 2016-02-24 10:51:15,613 - XmlConfig['hadoop-policy.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'configuration_attributes': {}, 'configurations': ...}
> 2016-02-24 10:51:15,626 - Generating config: /usr/hdp/current/hadoop-client/conf/hadoop-policy.xml
> 2016-02-24 10:51:15,627 - File['/usr/hdp/current/hadoop-client/conf/hadoop-policy.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
> 2016-02-24 10:51:15,638 - XmlConfig['ssl-client.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'configuration_attributes': {}, 'configurations': ...}
> 2016-02-24 10:51:15,649 - Generating config: /usr/hdp/current/hadoop-client/conf/ssl-client.xml
> 2016-02-24 10:51:15,650 - File['/usr/hdp/current/hadoop-client/conf/ssl-client.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
> 2016-02-24 10:51:15,657 - Directory['/usr/hdp/current/hadoop-client/conf/secure'] {'owner': 'root', 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
> 2016-02-24 10:51:15,658 - XmlConfig['ssl-client.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf/secure', 'configuration_attributes': {}, 'configurations': ...}
> 2016-02-24 10:51:15,669 - Generating config: /usr/hdp/current/hadoop-client/conf/secure/ssl-client.xml
> 2016-02-24 10:51:15,669 - File['/usr/hdp/current/hadoop-client/conf/secure/ssl-client.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
> 2016-02-24 10:51:15,677 - XmlConfig['ssl-server.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'configuration_attributes': {}, 'configurations': ...}
> 2016-02-24 10:51:15,688 - Generating config: /usr/hdp/current/hadoop-client/conf/ssl-server.xml
> 2016-02-24 10:51:15,689 - File['/usr/hdp/current/hadoop-client/conf/ssl-server.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
> 2016-02-24 10:51:15,697 - XmlConfig['hdfs-site.xml'] {'owner': 'hdfs', 'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'configuration_attributes': {}, 'configurations': ...}
> 2016-02-24 10:51:15,708 - Generating config: /usr/hdp/current/hadoop-client/conf/hdfs-site.xml
> 2016-02-24 10:51:15,709 - File['/usr/hdp/current/hadoop-client/conf/hdfs-site.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None, 'encoding': 'UTF-8'}
> 2016-02-24 10:51:15,770 - XmlConfig['core-site.xml'] {'group': 'hadoop', 'conf_dir': '/usr/hdp/current/hadoop-client/conf', 'mode': 0644, 'configuration_attributes': {}, 'owner': 'hdfs', 'configurations': ...}
> 2016-02-24 10:51:15,781 - Generating config: /usr/hdp/current/hadoop-client/conf/core-site.xml
> 2016-02-24 10:51:15,782 - File['/usr/hdp/current/hadoop-client/conf/core-site.xml'] {'owner': 'hdfs', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': 0644, 'encoding': 'UTF-8'}
> 2016-02-24 10:51:15,810 - File['/usr/hdp/current/hadoop-client/conf/slaves'] {'content': Template('slaves.j2'), 'owner': 'root'}
> 2016-02-24 10:51:15,811 - Directory['/var/lib/hadoop-hdfs'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 0751, 'recursive': True}
> 2016-02-24 10:51:15,817 - Host contains mounts: ['/sys', '/proc', '/dev', '/sys/kernel/security', '/dev/shm', '/dev/pts', '/run', '/sys/fs/cgroup', '/sys/fs/cgroup/systemd', '/sys/fs/pstore', '/sys/fs/cgroup/perf_event', '/sys/fs/cgroup/memory', '/sys/fs/cgroup/devices', '/sys/fs/cgroup/cpuset', '/sys/fs/cgroup/hugetlb', '/sys/fs/cgroup/freezer', '/sys/fs/cgroup/blkio', '/sys/fs/cgroup/cpu,cpuacct', '/sys/fs/cgroup/net_cls', '/sys/kernel/config', '/', '/proc/sys/fs/binfmt_misc', '/dev/mqueue', '/sys/kernel/debug', '/dev/hugepages', '/run/user/0', '/run/user/1000', '/proc/sys/fs/binfmt_misc'].
> 2016-02-24 10:51:15,817 - Mount point for directory /hadoop/hdfs/data is /
> 2016-02-24 10:51:15,817 - File['/var/lib/ambari-agent/data/datanode/dfs_data_dir_mount.hist'] {'content': '\n# This file keeps track of the last known mount-point for each DFS data dir.\n# It is safe to delete, since it will get regenerated the next time that the DataNode starts.\n# However, it is not advised to delete this file since Ambari may\n# re-create a DFS data dir that used to be mounted on a drive but is now mounted on the root.\n# Comments begin with a hash (#) symbol\n# data_dir,mount_point\n/hadoop/hdfs/data,/\n', 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
> 2016-02-24 10:51:15,819 - Directory['/var/run/hadoop'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 0755}
> 2016-02-24 10:51:15,819 - Changing owner for /var/run/hadoop from 0 to hdfs
> 2016-02-24 10:51:15,819 - Changing group for /var/run/hadoop from 0 to hadoop
> 2016-02-24 10:51:15,819 - Directory['/var/run/hadoop/hdfs'] {'owner': 'hdfs', 'recursive': True}
> 2016-02-24 10:51:15,820 - Directory['/var/log/hadoop/hdfs'] {'owner': 'hdfs', 'recursive': True}
> 2016-02-24 10:51:15,820 - File['/var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid'] {'action': ['delete'], 'not_if': 'ambari-sudo.sh  -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid && ambari-sudo.sh  -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid'}
> 2016-02-24 10:51:15,833 - Deleting File['/var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid']
> 2016-02-24 10:51:15,833 - Execute['ambari-sudo.sh  -H -E /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start datanode'] {'environment': {'HADOOP_LIBEXEC_DIR': '/usr/hdp/current/hadoop-client/libexec'}, 'not_if': 'ambari-sudo.sh  -H -E test -f /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid && ambari-sudo.sh  -H -E pgrep -F /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid'}
> {code}
> When I attempted to run the hdfs ... datanode command directly like so:
> {code}
> strace -s 2000 -o ~/slog.txt /usr/hdp/2.3.4.0-3485/hadoop-hdfs/bin/hdfs --config /usr/hdp/current/hadoop-client/conf datanode
> {code}
> I noticed this section, which mentions two additional log files I hadn't seen before.
> {code}
> read(255, "#!/usr/bin/env bash\n\n# Licensed to the Apache Software Foundation (ASF) under one or more\n# contributor license agreements.  See the NOTICE file distributed with\n# this work for additional information regarding copyright ownership.\n# The ASF licenses this file to You under the Apache License, Version 2.0\n# (the \"License\"); you may not use this file except in compliance with\n# the License.  You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\n# Environment Variables\n#\n#   JSVC_HOME  home directory of jsvc binary.  Required for starting secure\n#              datanode.\n#\n#   JSVC_OUTFILE  path to jsvc output file.  Defaults to\n#                 $HADOOP_LOG_DIR/jsvc.out.\n#\n#   JSVC_ERRFILE  path to jsvc error file.  Defaults to $HADOOP_LOG_DIR/jsvc.err.\n\nbin=`which $0`\nbin=`dirname ${bin}`\nbin=`cd \"$bin\" > /dev/null; pwd`\n\nDEFAULT_LIBEXEC_DIR=\"$bin\"/../libexec\n\nif [ -n \"$HADOOP_HOME\" ]; then\n  DEFAULT_LIBEXEC_DIR=\"$HADOOP_HOME\"/libexec\nfi\n\nHADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}\n. $HADOOP_LIBEXEC_DIR/hdfs-config.sh\n\nfunction print_usage(){\n  echo \"Usage: hdfs [--config confdir] [--loglevel loglevel] COMMAND\"\n  echo \"       where COMMAND is one of:\"\n  echo \"  dfs                  run a filesystem command on the file systems supported in Hadoop.\"\n  echo \"  classpath            prints the classpath\"\n  echo \"  namenode -format     format the DFS filesystem\"\n  echo \"  secondarynamenode    run the DFS secondary namenode\"\n  echo \"  namenode             run the DFS namenode\"\n  echo \"  journalnode          run the DFS journalnode\"\n  echo \"  zkfc                 run the ZK Failover Controller daemon\"\n  echo"..., 8192) = 8192
> {code}
> Specifically these files:
> - /var/log/hadoop/hdfs/jsvc.out
> - /var/log/hadoop/hdfs/jsvc.err
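> A quick way to watch both files while re-running the DataNode start (paths as listed above):
> {code}
> tail -f /var/log/hadoop/hdfs/jsvc.out /var/log/hadoop/hdfs/jsvc.err
> {code}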
> Looking in the jsvc.err file, I found this:
> {code}
> STARTUP_MSG:   build = git@github.com:hortonworks/hadoop.git -r ef0582ca14b8177a3cbb6376807545272677d730; compiled by 'jenkins' on 2015-12-16T03:01Z
> STARTUP_MSG:   java = 1.8.0_60
> ************************************************************/
> 16/02/24 11:30:18 INFO datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
> 16/02/24 11:30:18 FATAL datanode.DataNode: Exception in secureMain
> java.io.IOException: Login failure for dn/host-192-168-114-49.td.local@<REDACTED KERBEROS REALM> from keytab /etc/security/keytabs/dn.service.keytab: javax.security.auth.login.LoginException: Unable to obtain password from user
>         at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:962)
>         at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:275)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2296)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2345)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2526)
>         at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:76)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
> Caused by: javax.security.auth.login.LoginException: Unable to obtain password from user
>         at com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:897)
>         at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760)
>         at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
>         at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
>         at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
>         at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
>         at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
>         at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:953)
>         ... 10 more
> 16/02/24 11:30:18 INFO util.ExitUtil: Exiting with status 1
> 16/02/24 11:30:18 INFO datanode.DataNode: SHUTDOWN_MSG:
> /************************************************************
> {code}


