Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/03/30 10:38:50 UTC

[jira] Created: (HBASE-1296) Bulk upload can overwhelm a datanode filling its disk because HDFS deletes lag

Bulk upload can overwhelm a datanode filling its disk because HDFS deletes lag
------------------------------------------------------------------------------

                 Key: HBASE-1296
                 URL: https://issues.apache.org/jira/browse/HBASE-1296
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: stack


From a private mail with Jérôme Thièvre:

"Twice I ran into the same kind of problem described by schubert zhang in the thread *HDFS unbalance issue. (HBase over HDFS).* The first time, I found one of my regionservers with a full disk, whereas the others were at 10% of their capacity. I saw a lot of move file commands at the hadoop level from the full regionserver to the others, but after checking on the hdfs, I could see that hadoop copies the file but fails to delete it. As the regionserver disk was full, it didn't work anymore, even after a hadoop and hbase restart. I had to delete all the data.

"The second time this problem occurred, I stopped the row insertion process before the disk was full. After a hadoop and hbase restart, hadoop deleted the files and the system was operational.
It seems that in some cases where the system is heavily loaded with continuous writes and compactions, hadoop can't remove files."



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HBASE-1296) Bulk upload can overwhelm a datanode filling its disk because HDFS deletes lag

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724666#action_12724666 ] 

Andrew Purtell edited comment on HBASE-1296 at 6/26/09 12:30 PM:
-----------------------------------------------------------------

Related JIRAs

* HDFS-66

* HDFS-122

* HDFS-139

* HDFS-166


      was (Author: apurtell):
    What HDFS jiras track this issue?
  


[jira] Commented: (HBASE-1296) Bulk upload can overwhelm a datanode filling its disk because HDFS deletes lag

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724666#action_12724666 ] 

Andrew Purtell commented on HBASE-1296:
---------------------------------------

What HDFS jiras track this issue?



[jira] Commented: (HBASE-1296) Bulk upload can overwhelm a datanode filling its disk because HDFS deletes lag

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724754#action_12724754 ] 

stack commented on HBASE-1296:
------------------------------

This happened to me about two weeks ago. Namenode was uncontactable -- so it appeared -- and it was all deleting all the time.



[jira] Updated: (HBASE-1296) Bulk upload can overwhelm a datanode filling its disk because HDFS deletes lag

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1296:
-------------------------

    Comment: was deleted

(was: This happened to me about two weeks ago. Namenode was uncontactable -- so it appeared -- and it was all deleting all the time.)
