Posted to issues@tez.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2014/10/06 21:40:33 UTC

[jira] [Updated] (TEZ-1277) Tez Spill handler should truncate files to reserve space on disk

     [ https://issues.apache.org/jira/browse/TEZ-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated TEZ-1277:
-------------------------
    Description: 
Occasionally tasks fail due to full disks because the disks had space when the task was allocating via LocalDirAllocator, but the disk space was actually promised to many tasks instead of just one.

This race condition shows up when a 1 GB spill can be done in ~10s or so.

There is no way to do this (truncate a file up front to reserve its space) via the hadoop-fs abstraction, but an SSD-based spill wastes most of its IOPS on journal updates as the file length changes.

  was:
Occasionally tasks fail due to full disks because the disks had space when the task was allocating via LocalDirAllocator, but the disk space was actually promised to many tasks instead of just one.

This race condition shows up when a 1 GB spill can be done in ~10s or so.


> Tez Spill handler should truncate files to reserve space on disk
> ----------------------------------------------------------------
>
>                 Key: TEZ-1277
>                 URL: https://issues.apache.org/jira/browse/TEZ-1277
>             Project: Apache Tez
>          Issue Type: Improvement
>    Affects Versions: 0.5.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>
> Occasionally tasks fail due to full disks because the disks had space when the task was allocating via LocalDirAllocator, but the disk space was actually promised to many tasks instead of just one.
> This race condition shows up when a 1 GB spill can be done in ~10s or so.
> There is no way to do this (truncate a file up front to reserve its space) via the hadoop-fs abstraction, but an SSD-based spill wastes most of its IOPS on journal updates as the file length changes.
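
As a rough illustration of the idea in this issue, and not the patch itself, the sketch below sets a spill file's length before any data is written, using plain java.io rather than the hadoop-fs API. The class name SpillPreallocSketch, the helper reserve(), and the 64 MB size are all hypothetical. Note the assumption it rests on: whether RandomAccessFile.setLength actually reserves disk blocks depends on the filesystem, and on many filesystems it only produces a sparse file of the requested length.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class SpillPreallocSketch {

    // Hypothetical helper, not part of Tez or Hadoop: sets the file length to
    // expectedBytes up front, so later writes within that range do not have
    // to grow the file. This may or may not reserve real blocks, depending on
    // the underlying filesystem.
    static void reserve(File spillFile, long expectedBytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(spillFile, "rw")) {
            raf.setLength(expectedBytes);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a spill file chosen by a local directory allocator.
        File spill = File.createTempFile("spill", ".out");
        spill.deleteOnExit();

        // Assumed spill size estimate of 64 MB, for illustration only.
        reserve(spill, 64L * 1024 * 1024);
        System.out.println("length after reserve: " + spill.length());
    }
}
```

A spill writer built on this would still need to truncate the file back down to the number of bytes actually written once the spill completes, since the up-front length is only an estimate.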



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)