Posted to yarn-issues@hadoop.apache.org by "Anand Srinivasan (Jira)" <ji...@apache.org> on 2021/09/23 19:13:00 UTC

[jira] [Comment Edited] (YARN-10967) setPermission() call floods HDFS NN RPC queue

    [ https://issues.apache.org/jira/browse/YARN-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419416#comment-17419416 ] 

Anand Srinivasan edited comment on YARN-10967 at 9/23/21, 7:12 PM:
-------------------------------------------------------------------

Steve, thanks for the comment. 

This particular customer runs more than 100k jobs per day, along with content summary calls (for example, du) and other high-latency calls that are killing RPC processing time. The customer wants to get rid of these setPermission() calls during this process.


was (Author: anand.srinivasan):
Steve, thanks for the comment. 

This particular customer is running > 100k per day along with content summary (For example du) and other high latency calls that is killing the RPC processing time. Customer wants to get rid of this setPermission calls during this process.

> setPermission() call floods HDFS NN RPC queue
> ---------------------------------------------
>
>                 Key: YARN-10967
>                 URL: https://issues.apache.org/jira/browse/YARN-10967
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Anand Srinivasan
>            Priority: Major
>              Labels: performance
>
> Looking at the code changes for the log aggregation feature, we can see that when the log aggregator is initialized for each app, it verifies and creates the remote dir, and in doing so makes an additional call to setPermission() even when the remote dir already exists and its permissions are set as expected.
> This code path was introduced for cloud storage, where the additional check was needed to confirm that the remote file system and the corresponding cloud storage support setting permissions.
> The upstream Jira that introduced this call:
> https://issues.apache.org/jira/browse/YARN-9030
> This additional setPermission() call for each app/job floods the HDFS NN and its RPC queue, which degrades overall performance.
> The ask here is to see whether either of the following is feasible, given that the code introduced in YARN-9030 is mainly there for cloud storage providers:
> (a) Put the code introduced via YARN-9030 behind a configuration option (perhaps set to false by default, assuming the storage used is HDFS) to bypass this code.
> (b) Check in the code whether the customer is using HDFS storage (by checking yarn.nodemanager.remote-app-log-dir) and bypass this code if the storage is indeed HDFS.
> Thanks
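
For illustration only, options (a) and (b) from the description could be combined roughly as below. This is a minimal sketch under stated assumptions, not the YARN implementation: the property name yarn.log-aggregation.remote-dir.permission-check.enabled, the class, and the method are hypothetical, a plain Map stands in for the Hadoop Configuration object, and /tmp/logs is assumed as the fallback remote log dir. Only yarn.nodemanager.remote-app-log-dir comes from the issue itself.

```java
import java.net.URI;
import java.util.Map;

/** Sketch: decide whether the per-app setPermission() probe is worth making. */
public class RemoteLogDirPermissionCheck {

    // Hypothetical switch for option (a); not an existing YARN property.
    static final String CHECK_KEY =
            "yarn.log-aggregation.remote-dir.permission-check.enabled";

    // Property named in the issue description for option (b).
    static final String REMOTE_DIR_KEY = "yarn.nodemanager.remote-app-log-dir";

    /**
     * Returns true if the extra setPermission() call should be made.
     * An explicit setting of CHECK_KEY always wins (option a); otherwise
     * the call is skipped when the remote log dir resolves to HDFS (option b).
     */
    static boolean shouldCallSetPermission(Map<String, String> conf,
                                           String defaultFsScheme) {
        String explicit = conf.get(CHECK_KEY);
        if (explicit != null) {
            return Boolean.parseBoolean(explicit);
        }
        // Assumed fallback path when the property is unset.
        String dir = conf.getOrDefault(REMOTE_DIR_KEY, "/tmp/logs");
        String scheme = URI.create(dir).getScheme();
        if (scheme == null) {
            // A bare path inherits the scheme of the default file system.
            scheme = defaultFsScheme;
        }
        return !"hdfs".equalsIgnoreCase(scheme);
    }

    public static void main(String[] args) {
        Map<String, String> onHdfs =
                Map.of(REMOTE_DIR_KEY, "hdfs://nn1:8020/app-logs");
        Map<String, String> onCloud =
                Map.of(REMOTE_DIR_KEY, "s3a://bucket/app-logs");
        System.out.println("HDFS:  " + shouldCallSetPermission(onHdfs, "hdfs"));
        System.out.println("Cloud: " + shouldCallSetPermission(onCloud, "hdfs"));
    }
}
```

Letting the explicit option override the scheme check means an operator can still force the probe on or off if the detection guesses wrong, for example on a storage layer that is reached through an hdfs:// URI but does not behave like HDFS.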



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
