Posted to dev@ambari.apache.org by "Alejandro Fernandez (JIRA)" <ji...@apache.org> on 2015/06/19 21:54:02 UTC
[jira] [Commented] (AMBARI-12013) Datanode failed to restart during
RU because the shutdownDatanode -upgrade command can fail sometimes
[ https://issues.apache.org/jira/browse/AMBARI-12013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593868#comment-14593868 ]
Alejandro Fernandez commented on AMBARI-12013:
----------------------------------------------
Pushed to trunk in commit 519461587799668fea4f90d2efce5e002a02c890
> Datanode failed to restart during RU because the shutdownDatanode -upgrade command can fail sometimes
> -----------------------------------------------------------------------------------------------------
>
> Key: AMBARI-12013
> URL: https://issues.apache.org/jira/browse/AMBARI-12013
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Affects Versions: 2.1.0
> Reporter: Alejandro Fernandez
> Assignee: Alejandro Fernandez
> Priority: Critical
> Fix For: 2.1.0
>
> Attachments: AMBARI-12013.branch-2.1.patch, AMBARI-12013.patch
>
>
> Deploy Test with RU from HDP 2.2.0.0-2041 to HDP-2.3.0.0-2398
> Failed on: Restarting DataNode on ip-172-31-44-83.ec2.internal
> {code}
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 151, in <module>
>     DataNode().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 216, in execute
>     method(env)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 437, in restart
>     self.stop(env, rolling_restart=rolling_restart)
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 55, in stop
>     datanode_upgrade.pre_upgrade_shutdown()
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode_upgrade.py", line 43, in pre_upgrade_shutdown
>     Execute(command, user=params.hdfs_user, tries=1 )
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 254, in action_run
>     tries=self.resource.tries, try_sleep=self.resource.try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
>     tries=tries, try_sleep=try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
>     raise Fail(err_msg)
> resource_management.core.exceptions.Fail: Execution of 'hdfs dfsadmin -shutdownDatanode 0.0.0.0:8010 upgrade' returned 255. shutdownDatanode: Shutdown already in progress.
> {code}
> There's a known issue in HDP 2.2.0.0 (HDFS-7533) where shutting down the DataNode does not work because not all writers have a responder running, yet sendOOB() is attempted anyway.
> If the shutdown command fails with the output "Shutdown already in progress", then Ambari should fall back to calling datanode(action="stop"), which under the hood calls "hadoop-daemon.sh stop datanode".
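
The fallback described in the issue could look roughly like the sketch below. This is only an illustration of the control flow, not the code from the attached patch: the function name `shutdown_for_upgrade` and the `run_command` and `stop_datanode` callables are hypothetical stand-ins (for something like Execute and datanode(action="stop")), injected so the logic can be exercised without a cluster. The command string and the error text are taken from the traceback above.

```python
# Hypothetical sketch of the fallback; names below are illustrative only.
ALREADY_IN_PROGRESS = "Shutdown already in progress"
SHUTDOWN_COMMAND = "hdfs dfsadmin -shutdownDatanode 0.0.0.0:8010 upgrade"


def shutdown_for_upgrade(run_command, stop_datanode, command=SHUTDOWN_COMMAND):
    """Try the upgrade-aware shutdown; fall back to a plain stop if needed.

    run_command(command) is assumed to return (exit_code, output).
    stop_datanode() is assumed to perform a regular DataNode stop.
    Returns a short string naming the path that was taken.
    """
    code, output = run_command(command)
    if code == 0:
        return "upgrade-shutdown"
    if ALREADY_IN_PROGRESS in output:
        # The case from the issue: the DataNode reports that a shutdown is
        # already in progress, so use the regular stop path instead.
        stop_datanode()
        return "fallback-stop"
    # Any other failure is still treated as an error.
    raise RuntimeError(
        "Execution of '%s' returned %d. %s" % (command, code, output)
    )
```

Only the specific "Shutdown already in progress" output triggers the fallback here, so unrelated failures of the dfsadmin command would still surface as errors.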