Posted to issues@ambari.apache.org by "Siddharth Wagle (JIRA)" <ji...@apache.org> on 2017/10/17 17:23:00 UTC

[jira] [Created] (AMBARI-22257) Metrics collector fails to stop after Datanode is stopped in distributed mode

Siddharth Wagle created AMBARI-22257:
----------------------------------------

             Summary: Metrics collector fails to stop after Datanode is stopped in distributed mode
                 Key: AMBARI-22257
                 URL: https://issues.apache.org/jira/browse/AMBARI-22257
             Project: Ambari
          Issue Type: Bug
          Components: ambari-metrics
    Affects Versions: 2.0.0
            Reporter: Siddharth Wagle
            Assignee: Siddharth Wagle
            Priority: Critical
             Fix For: 2.6.0


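The AMS collector failed to stop because the ams-hbase RegionServer stop timed out. The log contains many exceptions related to DataNode (DN) connection issues during the stop. The root cause is that the DataNodes were stopped before the collector, so in distributed mode the embedded HBase could no longer write to HDFS while shutting down.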

{code}
2017-10-17 14:29:10,689 ERROR [Thread-274] hdfs.DFSClient: Failed to close inode 17762
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/ams/hbase/WALs/ctr-e134-1499953498516-230429-01-000007.hwx.site,61320,1508248489809/ctr-e134-1499953498516-230429-01-000007.hwx.site%2C61320%2C1508248489809.default.1508250548392 could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)
{code}
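
The ordering problem described above can be illustrated with a minimal sketch. This is not Ambari's actual configuration or the fix for this issue: the dependency map and the `stop_order` helper are hypothetical, and the command names are only assumed labels for the two stop operations involved. The sketch shows that once "DataNode stop" is declared to depend on "Metrics collector stop", any valid ordering stops the collector first.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map (assumption, not taken from Ambari):
# each key may run only after every command in its list has finished.
STOP_DEPENDENCIES = {
    "DATANODE-STOP": ["METRICS_COLLECTOR-STOP"],
}


def stop_order(dependencies):
    """Return one valid execution order for the given stop commands."""
    # TopologicalSorter treats the listed values as predecessors of the key,
    # so predecessors always appear earlier in the resulting order.
    return list(TopologicalSorter(dependencies).static_order())


order = stop_order(STOP_DEPENDENCIES)
print(order)
```

With such a dependency in place, the collector (and its embedded ams-hbase RegionServer) would shut down while the DataNodes are still available to accept the final WAL writes.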




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)