Posted to hdfs-dev@hadoop.apache.org by "Chao Sun (Jira)" <ji...@apache.org> on 2019/11/26 19:45:00 UTC
[jira] [Created] (HDFS-15014) [RBF] WebHdfs chooseDatanode shouldn't call getDatanodeReport
Chao Sun created HDFS-15014:
-------------------------------
Summary: [RBF] WebHdfs chooseDatanode shouldn't call getDatanodeReport
Key: HDFS-15014
URL: https://issues.apache.org/jira/browse/HDFS-15014
Project: Hadoop HDFS
Issue Type: Bug
Components: rbf
Reporter: Chao Sun
Currently the {{chooseDatanode}} call (which is shared by {{open}}, {{create}}, {{append}} and {{getFileChecksum}}) in RBF WebHDFS calls {{getDatanodeReport}} from ALL downstream namenodes:
{code}
  private DatanodeInfo chooseDatanode(final Router router,
      final String path, final HttpOpParam.Op op, final long openOffset,
      final String excludeDatanodes) throws IOException {
    // We need to get the DNs as a privileged user
    final RouterRpcServer rpcServer = getRPCServer(router);
    UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
    RouterRpcServer.setCurrentUser(loginUser);

    DatanodeInfo[] dns = null;
    try {
      dns = rpcServer.getDatanodeReport(DatanodeReportType.LIVE);
    } catch (IOException e) {
      LOG.error("Cannot get the datanodes from the RPC server", e);
    } finally {
      // Reset ugi to remote user for remaining operations.
      RouterRpcServer.resetCurrentUser();
    }

    HashSet<Node> excludes = new HashSet<Node>();
    if (excludeDatanodes != null) {
      Collection<String> collection =
          getTrimmedStringCollection(excludeDatanodes);
      for (DatanodeInfo dn : dns) {
        if (collection.contains(dn.getName())) {
          excludes.add(dn);
        }
      }
    }
    ...
{code}
The {{getDatanodeReport}} call is very expensive (particularly in a large cluster) as it needs to lock the {{DatanodeManager}}, which is also shared by calls such as heartbeat processing. See HDFS-14366 for a similar issue.
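One direction (a sketch only, not the actual fix adopted in HDFS-15014): since WebHDFS redirection does not need a perfectly fresh datanode report, the Router could cache the report for a short TTL so that repeated {{chooseDatanode}} calls stop fanning out to every downstream namenode. The {{ExpiringCache}} class and the string stand-in for the report below are hypothetical, for illustration only:

{code}
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: caches the result of an expensive call
 * (e.g. rpcServer.getDatanodeReport(DatanodeReportType.LIVE))
 * for a fixed TTL, so concurrent WebHDFS redirects reuse it.
 */
class ExpiringCache<T> {
  private final Supplier<T> loader;
  private final long ttlMillis;
  private T value;
  private long loadedAt;

  ExpiringCache(Supplier<T> loader, long ttlMillis) {
    this.loader = loader;
    this.ttlMillis = ttlMillis;
  }

  synchronized T get() {
    long now = System.currentTimeMillis();
    if (value == null || now - loadedAt >= ttlMillis) {
      value = loader.get();  // expensive call happens only on a miss
      loadedAt = now;
    }
    return value;
  }
}

public class CacheDemo {
  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    // Stand-in for the real datanode-report RPC.
    ExpiringCache<String> cache =
        new ExpiringCache<>(() -> "report-" + calls.incrementAndGet(), 60_000L);
    System.out.println(cache.get());   // loads the report
    System.out.println(cache.get());   // served from cache, no second RPC
    System.out.println(calls.get());   // prints 1
  }
}
{code}

The trade-off is staleness: a redirect could point at a datanode that died within the TTL, so the TTL would need to be short relative to the heartbeat/liveness interval.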
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org