Posted to common-dev@hadoop.apache.org by "Raghu Angadi (JIRA)" <ji...@apache.org> on 2007/03/31 05:20:25 UTC

[jira] Created: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Still seeing some unexpected 'No space left on device' exceptions
-----------------------------------------------------------------

                 Key: HADOOP-1189
                 URL: https://issues.apache.org/jira/browse/HADOOP-1189
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.12.2
            Reporter: Raghu Angadi
             Fix For: 0.13.0



One of the datanodes has one full partition (disk) out of four. The expected behaviour is that the datanode should skip this partition and use only the other three. HADOOP-990 fixed some bugs related to this. It seems to work OK, but some exceptions are still seeping through. In one case there were 33 of these out of 1200+ blocks written to this node. Not sure what caused this. I will submit a patch that prints a more useful message and throws the original exception.

Two unlikely reasons I can think of are that the 2% reserved space (8GB in this case) is not enough, or that the client somehow still reports a block size of zero in some cases. A better error message should help here.

If you see a small number of these exceptions compared to the number of blocks written, you don't need to change anything for now.
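The 2% reserve hypothesis above can be sketched as a simple volume check that subtracts the reserve from free space before accepting a block. This is an illustrative sketch only; the class and method names are hypothetical, not Hadoop's actual API.

```java
// Hypothetical sketch of a reserved-space check like the one discussed above.
// VolumeCheck and canAccept are illustrative names, not the DataNode's code.
public class VolumeCheck {
    // Fraction of capacity kept in reserve (2% in the report above).
    static final double RESERVED_FRACTION = 0.02;

    // A volume can accept a block only if free space minus the reserve
    // still covers the full block size.
    static boolean canAccept(long capacity, long available, long blockSize) {
        long reserved = (long) (capacity * RESERVED_FRACTION);
        return available - reserved >= blockSize;
    }

    public static void main(String[] args) {
        // 400 GB capacity, 10 GB free, 64 MB block: the 2% reserve is 8 GB,
        // leaving 2 GB of usable space, so the block fits.
        long gb = 1L << 30;
        System.out.println(canAccept(400 * gb, 10 * gb, 64L << 20));
    }
}
```

Note that a client reporting a block size of zero would make this check pass even when the usable space is exactly zero, which matches the second hypothesis above.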


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-1189:
---------------------------------

    Status: Open  (was: Patch Available)


I think we found the problem. Consider the following 'df .' output:

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda3            950877384 148532012 792685008  16% /export/workspace

'Capacity - Used' is 802345372 while available is 792685008. A small mismatch is expected, but it can be very large: on one node we saw a 'Capacity - Used' of 21GB while available was 0. In our code we use 'Capacity - Used' instead of MIN(Capacity - Used, Available).

I will submit a new patch.
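The fix described above can be sketched as arithmetic on df's 1K-block columns. This is a sketch of the calculation, mirroring the getAvailable() mentioned above rather than reproducing the exact Hadoop source.

```java
// Sketch of the getAvailable() fix described above: trust the smaller of
// "capacity - used" and the filesystem's own available figure.
public class DfUsage {
    static long availableKb(long capacityKb, long usedKb, long dfAvailableKb) {
        // "capacity - used" overstates free space when the filesystem
        // reserves blocks; df's "available" column already excludes them.
        return Math.min(capacityKb - usedKb, dfAvailableKb);
    }

    public static void main(String[] args) {
        // Numbers from the df output above: capacity - used is 802345372 KB,
        // but only 792685008 KB is actually available.
        System.out.println(availableKb(950877384L, 148532012L, 792685008L));
    }
}
```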





[jira] Commented: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12486499 ] 

Hadoop QA commented on HADOOP-1189:
-----------------------------------

+1, because http://issues.apache.org/jira/secure/attachment/12354781/HADOOP-1189.patch applied and successfully tested against trunk revision http://svn.apache.org/repos/asf/lucene/hadoop/trunk/525268. Results are at http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch



[jira] Commented: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488016 ] 

Tom White commented on HADOOP-1189:
-----------------------------------

Hairong, are you able to check that this patch fixes the problem for you, before this is committed? Thanks.



[jira] Updated: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-1189:
---------------------------------

    Attachment:     (was: HADOOP-1189.patch)



[jira] Updated: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-1189:
---------------------------------

    Attachment: HADOOP-1189-2.patch

Attached 2.patch. It contains the change to the log message and fixes getAvailable(). This could go into 0.12.3.



[jira] Commented: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488333 ] 

Hadoop QA commented on HADOOP-1189:
-----------------------------------

Integrated in Hadoop-Nightly #55 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/55/)



[jira] Commented: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12486789 ] 

Koji Noguchi commented on HADOOP-1189:
--------------------------------------

On one node with a full drive, df showed something like

  _____:  /dev/_            190451020 181125476         0 100% /___/____

That's total, used and available space. They don't add up because the OS reserves some space to limit disk fragmentation.
Could this be the reason?
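The mismatch in the df output above can be checked arithmetically; a quick sketch, assuming the filesystem uses a reserved-block percentage (5% is the ext2/ext3 default):

```java
// Check the df numbers quoted above against a reserved-block count.
public class ReservedBlocks {
    // Space the filesystem reports as neither used nor available.
    static long impliedReservedKb(long capacityKb, long usedKb, long availableKb) {
        return capacityKb - usedKb - availableKb;
    }

    public static void main(String[] args) {
        long reserved = impliedReservedKb(190451020L, 181125476L, 0L);
        // About 9.3 million KB is hidden from "available", i.e. roughly 4.9%
        // of capacity, close to ext2/ext3's default 5% reserve.
        System.out.println(reserved);
        System.out.printf("%.1f%%%n", 100.0 * reserved / 190451020L);
    }
}
```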





[jira] Commented: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488115 ] 

Raghu Angadi commented on HADOOP-1189:
--------------------------------------


Yes. We are actually using it in our cluster.




[jira] Updated: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-1189:
---------------------------------

    Attachment: HADOOP-1189.patch


The attached patch prints a warning and rethrows the IOException received.

The new log entry looks like this:

2007-04-02 12:59:15,940 WARN org.apache.hadoop.dfs.DataNode: No space left on device while writing blk_8638782110649810591 (length: 67108864) to /export/crawlspace/rangadi/tmp/ramfs (Cur available space : 20554389)
2007-04-02 12:59:15,943 ERROR org.apache.hadoop.dfs.DataNode: DataXCeiver java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:260)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:837)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:603)
        at java.lang.Thread.run(Thread.java:619)
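The behaviour behind that log entry (warn with context, then rethrow) follows a common catch-augment-rethrow pattern. A minimal sketch with hypothetical names and a plain System.err logger, not the actual DataNode code:

```java
import java.io.IOException;

// Sketch of the log-and-rethrow behaviour described above; the class and
// method names are stand-ins, not the actual DataNode code.
public class BlockWriter {
    static void writeBlock(String blockId, long length, String dir,
                           long availableKb) throws IOException {
        try {
            writeBytes(blockId, length, dir);
        } catch (IOException e) {
            // Add the context the original exception lacks, then rethrow
            // the original so callers still see the real failure.
            System.err.println("WARN " + e.getMessage() + " while writing "
                    + blockId + " (length: " + length + ") to " + dir
                    + " (Cur available space : " + availableKb + ")");
            throw e;
        }
    }

    // Stand-in for the actual disk write; always simulates ENOSPC here.
    static void writeBytes(String blockId, long length, String dir)
            throws IOException {
        throw new IOException("No space left on device");
    }

    public static void main(String[] args) {
        try {
            writeBlock("blk_8638782110649810591", 67108864L,
                    "/export/crawlspace/rangadi/tmp/ramfs", 20554389L);
        } catch (IOException e) {
            System.err.println("ERROR DataXCeiver " + e);
        }
    }
}
```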




[jira] Commented: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487367 ] 

Raghu Angadi commented on HADOOP-1189:
--------------------------------------

Also, this should probably go into 0.12.4. I guess it's safe to assume there will be at least one more release before 0.13.0 :-)



[jira] Commented: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487329 ] 

Hadoop QA commented on HADOOP-1189:
-----------------------------------

+1, because http://issues.apache.org/jira/secure/attachment/12354967/HADOOP-1189-3.patch applied and successfully tested against trunk revision http://svn.apache.org/repos/asf/lucene/hadoop/trunk/526215. Results are at http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch



[jira] Updated: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-1189:
------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Raghu!



[jira] Assigned: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi reassigned HADOOP-1189:
------------------------------------

    Assignee: Raghu Angadi



[jira] Updated: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-1189:
---------------------------------

    Status: Patch Available  (was: Open)

Thanks Hairong. Yes, this only improves the log message. I'm not sure how such patches are handled; I'm making this patch available, and you can change it to 'open' after the patch is submitted.




[jira] Updated: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-1189:
---------------------------------

    Attachment: HADOOP-1189-3.patch


Updated the patch to remove the changes to the log message etc.; it just fixes the calculation in getAvailable().
Hairong, could you check the patch again?




[jira] Commented: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12486487 ] 

Hairong Kuang commented on HADOOP-1189:
---------------------------------------

+1

This patch is intended to help identify the problem, so we probably still need to keep the issue open after it is committed.



[jira] Updated: (HADOOP-1189) Still seeing some unexpected 'No space left on device' exceptions

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-1189:
---------------------------------

    Status: Patch Available  (was: Open)
