Posted to issues@tez.apache.org by "Hitesh Shah (JIRA)" <ji...@apache.org> on 2014/03/11 19:59:49 UTC

[jira] [Commented] (TEZ-925) Tez job failed due to full disk

    [ https://issues.apache.org/jira/browse/TEZ-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930785#comment-13930785 ] 

Hitesh Shah commented on TEZ-925:
---------------------------------

[~yeshavora] I don't believe YARN has a way to tell an application whether a container failed due to a bad disk or a lack of space on a disk. If this is the feature you are looking for, it would be better to file a JIRA against YARN so that this information is made available to applications.

If the real issue is that Tez does not blacklist the node itself after multiple containers fail on it, that could be addressed in Tez. Could you provide any additional logs so that we can determine whether this is the underlying problem?
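
To make the node-level blacklisting idea concrete, here is a minimal sketch of per-node failure counting. It is an illustration only, not how Tez implements this: the class name NodeFailureTracker, its methods, and the failure threshold are all hypothetical, and it does not use any Tez or YARN API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not part of Tez or YARN): counts container failures
// per node and marks a node as blacklisted once a threshold is reached.
final class NodeFailureTracker {
    // Assumed policy: blacklist after this many failed containers on one node.
    private final int maxFailuresPerNode;
    private final Map<String, Integer> failuresByNode = new HashMap<>();
    private final Set<String> blacklistedNodes = new HashSet<>();

    NodeFailureTracker(int maxFailuresPerNode) {
        if (maxFailuresPerNode < 1) {
            throw new IllegalArgumentException("maxFailuresPerNode must be >= 1");
        }
        this.maxFailuresPerNode = maxFailuresPerNode;
    }

    // Records one failed container on the given node.
    // Returns true only on the call that causes the node to become blacklisted.
    boolean recordContainerFailure(String nodeId) {
        int failures = failuresByNode.merge(nodeId, 1, Integer::sum);
        return failures >= maxFailuresPerNode && blacklistedNodes.add(nodeId);
    }

    boolean isBlacklisted(String nodeId) {
        return blacklistedNodes.contains(nodeId);
    }
}
```

With a tracker like this, the application master could stop requesting containers on a node as soon as recordContainerFailure returns true for it, regardless of why the containers failed.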

Furthermore, a separate issue should probably be filed against YARN, as it is allocating and launching containers on an NM that is having such disk problems.



> Tez job failed due to full disk
> -------------------------------
>
>                 Key: TEZ-925
>                 URL: https://issues.apache.org/jira/browse/TEZ-925
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Yesha Vora
>
> Tez job tries to create a container even if the disk is full. Thus, the job fails with the error below.
>  "Can't create directory application_1393835169637_0138 in /tmp/yarn/local/usercache/user/appcache/application_1393835169637_0138 - No space left on device"
> However, Tez should be able to detect a full disk and blacklist the node.



--
This message was sent by Atlassian JIRA
(v6.2#6252)