Posted to yarn-issues@hadoop.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2014/08/29 19:31:54 UTC

[jira] [Commented] (YARN-1781) NM should allow users to specify max disk utilization for local disks

    [ https://issues.apache.org/jira/browse/YARN-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115534#comment-14115534 ] 

Jason Lowe commented on YARN-1781:
----------------------------------

We've run into situations where, with this new behavior, disks that are filled by containers remain full and never recover.  See YARN-2473.

YARN-90 won't help much in this case because the files that filled the disk won't be deleted.  Prior to this change the disks would auto-recover when the container completed, so this is a significant regression.

> NM should allow users to specify max disk utilization for local disks
> ---------------------------------------------------------------------
>
>                 Key: YARN-1781
>                 URL: https://issues.apache.org/jira/browse/YARN-1781
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>             Fix For: 2.4.0
>
>         Attachments: apache-yarn-1781.0.patch, apache-yarn-1781.1.patch, apache-yarn-1781.2.patch, apache-yarn-1781.3.patch, apache-yarn-1781.4.patch
>
>
> This is related to YARN-257 (it's probably a sub-task?). Currently, the NM does not detect full disks and allows containers to use them, leading to repeated failures. YARN-257 deals with graceful handling of full disks. This ticket is only about detection of full disks by the disk health checkers.
> The NM should allow users to set a maximum disk utilization for local disks and mark disks as bad once they exceed that utilization. At the very least, the NM should detect full disks.
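
For illustration only, the utilization check described in the issue could be sketched roughly as below. This is not the NodeManager's actual disk health checker: the class name DiskUtilizationSketch, the method names, and the 90% threshold are all hypothetical. Only java.io.File.getTotalSpace() and getUsableSpace() are real JDK calls.

```java
import java.io.File;

// Hypothetical sketch, not the NodeManager's actual disk health checker.
final class DiskUtilizationSketch {

    // Percentage of the disk in use, given total and usable bytes.
    static float utilizationPercent(long totalBytes, long usableBytes) {
        if (totalBytes <= 0) {
            // Assumption: treat an unreadable or zero-sized disk as full.
            return 100.0f;
        }
        long usedBytes = totalBytes - usableBytes;
        return (usedBytes * 100.0f) / totalBytes;
    }

    // True when utilization is strictly above the configured maximum.
    static boolean exceedsLimit(long totalBytes, long usableBytes, float maxPercent) {
        return utilizationPercent(totalBytes, usableBytes) > maxPercent;
    }

    // Same check, reading the sizes from the file system holding dir.
    static boolean exceedsLimit(File dir, float maxPercent) {
        return exceedsLimit(dir.getTotalSpace(), dir.getUsableSpace(), maxPercent);
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        float maxPercent = 90.0f; // assumed threshold, for illustration only
        System.out.println(dir + " over limit: " + exceedsLimit(dir, maxPercent));
    }
}
```

A check of this shape only reports a disk as usable again once its utilization drops back under the limit. If the files that filled the disk are never deleted, as the comment above describes, the check keeps failing and the disk stays marked bad.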


