Posted to dev@ambari.apache.org by "Andrew Onischuk (JIRA)" <ji...@apache.org> on 2014/10/21 17:16:33 UTC

[jira] [Resolved] (AMBARI-7882) Decommission of JobTracker fails on secure cluster

     [ https://issues.apache.org/jira/browse/AMBARI-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Onischuk resolved AMBARI-7882.
-------------------------------------
    Resolution: Fixed

Committed to trunk and branch-1.7.0

> Decommission of JobTracker fails on secure cluster
> --------------------------------------------------
>
>                 Key: AMBARI-7882
>                 URL: https://issues.apache.org/jira/browse/AMBARI-7882
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: Andrew Onischuk
>            Assignee: Andrew Onischuk
>             Fix For: 1.7.0
>
>
> Exception text:
>
>     {
>       "href" : "http://ec2-54-165-160-62.compute-1.amazonaws.com:8080/api/v1/clusters/cl1/requests/21/tasks/235",
>       "Tasks" : {
>         "attempt_cnt" : 1,
>         "cluster_name" : "cl1",
>         "command" : "CUSTOM_COMMAND",
>         "command_detail" : "DECOMMISSION, Excluded: ip-172-31-37-151.ec2.internal",
>         "custom_command_name" : "DECOMMISSION",
>         "end_time" : 1413796875994,
>         "error_log" : "/var/lib/ambari-agent/data/errors-235.txt",
>         "exit_code" : 1,
>         "host_name" : "ip-172-31-37-148.ec2.internal",
>         "id" : 235,
>         "output_log" : "/var/lib/ambari-agent/data/output-235.txt",
>         "request_id" : 21,
>         "role" : "JOBTRACKER",
>         "stage_id" : 1,
>         "start_time" : 1413796870551,
>         "status" : "FAILED",
>         "stderr" : "2014-10-20 09:21:15,291 - Error while executing command 'decommission':\nTraceback (most recent call last):\n  File \"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py\", line 122, in execute\n    method(env)\n  File \"/var/lib/ambari-agent/cache/stacks/HDP/1.3.2/services/MAPREDUCE/package/scripts/jobtracker.py\", line 78, in decommission\n    kinit_override=True)\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/base.py\", line 148, in __init__\n    self.env.run()\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 149, in run\n    self.run_action(resource, action)\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 115, in run_action\n    provider_action()\n  File \"/usr/lib/python2.6/site-packages/resource_management/libraries/providers/execute_hadoop.py\", line 50, in action_run\n    path        = self.resource.bin_dir\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/base.py\", line 148, in __init__\n    self.env.run()\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 149, in run\n    self.run_action(resource, action)\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 115, in run_action\n    provider_action()\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py\", line 237, in action_run\n    raise ex\nFail: Execution of 'hadoop --config /etc/hadoop/conf mradmin -refreshNodes' returned 255. 
14/10/20 09:21:15 ERROR security.UserGroupInformation: PriviledgedActionException as:mapred cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\n14/10/20 09:21:15 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\n14/10/20 09:21:15 ERROR security.UserGroupInformation: PriviledgedActionException as:mapred cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\nrefreshNodes: Call to ip-172-31-37-148.ec2.internal/172.31.37.148:50300 failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]",
>         "stdout" : "2014-10-20 09:21:11,334 - File['/etc/hadoop/conf/mapred.exclude'] {'owner': 'mapred', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}\n2014-10-20 09:21:11,338 - Writing File['/etc/hadoop/conf/mapred.exclude'] because contents don't match\n2014-10-20 09:21:11,339 - ExecuteHadoop['mradmin -refreshNodes'] {'conf_dir': '/etc/hadoop/conf', 'kinit_override': True, 'user': 'mapred'}\n2014-10-20 09:21:11,341 - Execute['hadoop --config /etc/hadoop/conf mradmin -refreshNodes'] {'logoutput': False, 'path': [], 'tries': 1, 'user': 'mapred', 'try_sleep': 0}\n2014-10-20 09:21:15,291 - Error while executing command 'decommission':\nTraceback (most recent call last):\n  File \"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py\", line 122, in execute\n    method(env)\n  File \"/var/lib/ambari-agent/cache/stacks/HDP/1.3.2/services/MAPREDUCE/package/scripts/jobtracker.py\", line 78, in decommission\n    kinit_override=True)\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/base.py\", line 148, in __init__\n    self.env.run()\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 149, in run\n    self.run_action(resource, action)\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 115, in run_action\n    provider_action()\n  File \"/usr/lib/python2.6/site-packages/resource_management/libraries/providers/execute_hadoop.py\", line 50, in action_run\n    path        = self.resource.bin_dir\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/base.py\", line 148, in __init__\n    self.env.run()\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 149, in run\n    self.run_action(resource, action)\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 115, in run_action\n    provider_action()\n  File 
\"/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py\", line 237, in action_run\n    raise ex\nFail: Execution of 'hadoop --config /etc/hadoop/conf mradmin -refreshNodes' returned 255. 14/10/20 09:21:15 ERROR security.UserGroupInformation: PriviledgedActionException as:mapred cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\n14/10/20 09:21:15 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\n14/10/20 09:21:15 ERROR security.UserGroupInformation: PriviledgedActionException as:mapred cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\nrefreshNodes: Call to ip-172-31-37-148.ec2.internal/172.31.37.148:50300 failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]",
>         "structured_out" : { }
>       }
>     }
>     
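
For triage, the failing command and the underlying Kerberos reason can be pulled out of a task response like the one quoted above. The following is a minimal sketch, not part of Ambari: the helper name summarize_task_failure and the SAMPLE_* values are hypothetical, and the stderr text is a shortened excerpt of the log in this issue. It assumes only the "Tasks", "status", "role" and "stderr" fields shown in the response.

```python
import json
import re


def summarize_task_failure(task_json):
    """Return (role, command, exit_status, causes) for a FAILED task, else None.

    Hypothetical helper for illustration; it relies only on the fields
    visible in the task response quoted in this issue.
    """
    task = json.loads(task_json)["Tasks"]
    if task.get("status") != "FAILED":
        return None
    stderr = task.get("stderr", "")

    # The agent log reports the failing shell command in the form:
    #   Fail: Execution of '<command>' returned <status>.
    match = re.search(r"Fail: Execution of '([^']*)' returned (\d+)", stderr)
    command = match.group(1) if match else None
    exit_status = int(match.group(2)) if match else None

    # GSS failures carry their reason inside "(Mechanism level: ...)".
    # The same reason repeats several times in the log, so de-duplicate.
    causes = sorted(set(re.findall(r"Mechanism level: ([^)]*)\)", stderr)))
    return task.get("role"), command, exit_status, causes


# Shortened excerpt of the stderr from this issue, for demonstration only.
SAMPLE_STDERR = (
    "Fail: Execution of 'hadoop --config /etc/hadoop/conf mradmin "
    "-refreshNodes' returned 255. 14/10/20 09:21:15 ERROR "
    "security.UserGroupInformation: PriviledgedActionException as:mapred "
    "cause:javax.security.sasl.SaslException: GSS initiate failed "
    "[Caused by GSSException: No valid credentials provided "
    "(Mechanism level: Failed to find any Kerberos tgt)]"
)
SAMPLE_RESPONSE = json.dumps(
    {"Tasks": {"role": "JOBTRACKER", "status": "FAILED", "stderr": SAMPLE_STDERR}}
)

if __name__ == "__main__":
    print(summarize_task_failure(SAMPLE_RESPONSE))
```

Applied to the full response, this should point at the same place the traceback does: 'mradmin -refreshNodes' was run as user mapred with no Kerberos ticket available.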



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)