Posted to dev@ambari.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2015/06/02 08:41:17 UTC
[jira] [Commented] (AMBARI-11605) Restarting HistoryServer fails during RU because NameNode is in safemode

    [ https://issues.apache.org/jira/browse/AMBARI-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568645#comment-14568645 ] 

Hadoop QA commented on AMBARI-11605:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12736722/AMBARI-11605.patch
  against trunk revision .

    {color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/Ambari-trunk-test-patch/2974//console

This message is automatically generated.

> Restarting HistoryServer fails during RU because NameNode is in safemode
> ------------------------------------------------------------------------
>
>                 Key: AMBARI-11605
>                 URL: https://issues.apache.org/jira/browse/AMBARI-11605
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.1.0
>            Reporter: Alejandro Fernandez
>            Assignee: Alejandro Fernandez
>             Fix For: 2.1.0
>
>         Attachments: AMBARI-11605.patch
>
>
> When restarting HistoryServer for the first time during the Core Masters rolling upgrade, the restart fails with the following:
> {noformat}
> 2015-05-28 20:03:32,540 - HdfsResource['/hdp/apps/2.3.0.0-2112/mapreduce'] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/2.3.0.0-2112/hadoop/bin', 'keytab': [EMPTY], 'default_fs': 'hdfs://c1ha', 'hdfs_site': ..., 'kinit_path_local': 'kinit', 'principal_name': [EMPTY], 'user': 'hdfs', 'owner': 'hdfs', 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type': 'directory', 'action': ['create_on_execute'], 'mode': 0555}
> 2015-05-28 20:03:32,600 - checked_call['curl -L -w '%{http_code}' -X GET 'http://jhurley-ru-2.c.pramod-thangali.internal:50070/webhdfs/v1/hdp/apps/2.3.0.0-2112/mapreduce?op=GETFILESTATUS&user.name=hdfs''] {'logoutput': None, 'user': 'hdfs', 'quiet': False}
> 2015-05-28 20:03:37,862 - checked_call returned (0, '{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File does not exist: /hdp/apps/2.3.0.0-2112/mapreduce"}}404')
> 2015-05-28 20:03:37,866 - checked_call['curl -L -w '%{http_code}' -X PUT 'http://jhurley-ru-2.c.pramod-thangali.internal:50070/webhdfs/v1/hdp/apps/2.3.0.0-2112/mapreduce?op=MKDIRS&user.name=hdfs''] {'logoutput': None, 'user': 'hdfs', 'quiet': False}
> 2015-05-28 20:03:37,993 - checked_call returned (0, '{"RemoteException":{"exception":"RetriableException","javaClassName":"org.apache.hadoop.ipc.RetriableException","message":"org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /hdp/apps/2.3.0.0-2112/mapreduce. Name node is in safe mode.\\nThe reported blocks 414 needs additional 77 blocks to reach the threshold 0.9900 of total blocks 495.\\nThe number of live datanodes 4 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached."}}403')
> {noformat}
> Retrying after this error fixes the problem.
> It turns out that now that the HDFS commands run faster, the standby NameNode can still be in safemode by the time the HistoryServer is restarted.
> For this reason, we must wait for both NameNodes to come out of safemode before proceeding to any other services or Service Checks.
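
The wait described above could look roughly like the following. This is a minimal sketch, not the contents of AMBARI-11605.patch: the function names, the timeout and interval defaults, and the example NameNode URIs are all hypothetical. It assumes each NameNode can be queried individually with `hdfs dfsadmin -fs <uri> -safemode get` and that the output contains "Safe mode is OFF" once that NameNode has left safemode.

```python
import subprocess
import time


def is_safemode_off(dfsadmin_output):
    """Return True if the `-safemode get` output reports that safemode is OFF."""
    return "Safe mode is OFF" in dfsadmin_output


def get_safemode_output(namenode_uri):
    """Ask one specific NameNode for its safemode state (requires the hdfs CLI)."""
    return subprocess.check_output(
        ["hdfs", "dfsadmin", "-fs", namenode_uri, "-safemode", "get"],
        universal_newlines=True,
    )


def wait_for_namenodes(namenode_uris, query=get_safemode_output,
                       timeout=600, interval=10,
                       sleep=time.sleep, clock=time.time):
    """Poll until every NameNode has left safemode.

    Returns True once all NameNodes report OFF, or False if the
    (hypothetical) timeout in seconds expires first.
    """
    deadline = clock() + timeout
    pending = list(namenode_uris)
    while pending:
        # Keep only the NameNodes that are still in safemode.
        pending = [uri for uri in pending if not is_safemode_off(query(uri))]
        if not pending:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
    return True
```

A caller would pass the URIs of both the active and the standby NameNode, for example `wait_for_namenodes(["hdfs://nn1:8020", "hdfs://nn2:8020"])` (hypothetical hosts and port), and only proceed to the HistoryServer restart when it returns True. The `query`, `sleep`, and `clock` parameters exist only so the loop can be exercised without a running cluster.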



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)