Posted to issues@ambari.apache.org by "Siddharth Wagle (JIRA)" <ji...@apache.org> on 2017/10/17 17:26:00 UTC
[jira] [Updated] (AMBARI-22257) Metrics collector fails to stop after Datanode is stopped in distributed mode
[ https://issues.apache.org/jira/browse/AMBARI-22257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siddharth Wagle updated AMBARI-22257:
-------------------------------------
Status: Patch Available (was: Open)
> Metrics collector fails to stop after Datanode is stopped in distributed mode
> -----------------------------------------------------------------------------
>
> Key: AMBARI-22257
> URL: https://issues.apache.org/jira/browse/AMBARI-22257
> Project: Ambari
> Issue Type: Bug
> Components: ambari-metrics
> Affects Versions: 2.0.0
> Reporter: Siddharth Wagle
> Assignee: Siddharth Wagle
> Priority: Critical
> Fix For: 2.6.0
>
> Attachments: AMBARI-22257.patch
>
>
> The AMS collector stop failed because the ams-hbase regionserver stop timed out. The log contains many exceptions related to DataNode (DN) connection issues during the stop. The root cause is that the DNs were stopped before the collector.
> {code}
> 2017-10-17 14:29:10,689 ERROR [Thread-274] hdfs.DFSClient: Failed to close inode 17762
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/ams/hbase/WALs/ctr-e134-1499953498516-230429-01-000007.hwx.site,61320,1508248489809/ctr-e134-1499953498516-230429-01-000007.hwx.site%2C61320%2C1508248489809.default.1508250548392 could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)
> {code}
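
The ordering problem in the description above (the collector's embedded HBase still needs HDFS while it shuts down, so it has to stop before the DataNodes) can be illustrated with a minimal sketch. This is not Ambari's implementation and not the contents of the attached patch. The component names and the constraint table below are hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical constraint table (an assumption, not taken from Ambari):
# each key maps to the set of components that must be stopped BEFORE it.
STOP_AFTER = {
    "DATANODE": {"METRICS_COLLECTOR"},
}


def stop_order(stop_after):
    """Return a stop sequence in which every prerequisite comes first."""
    # TopologicalSorter treats the mapped values as predecessors, so
    # static_order() yields them ahead of the key that depends on them.
    return list(TopologicalSorter(stop_after).static_order())


order = stop_order(STOP_AFTER)
print(order)
```

With this constraint in place, the collector is always scheduled to stop while the DataNodes are still running, so its WAL files can still be closed on HDFS.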
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)