Posted to dev@bookkeeper.apache.org by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org> on 2012/03/02 11:40:59 UTC

[jira] [Commented] (BOOKKEEPER-180) bookie server doesn't quit when running out of disk space

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220826#comment-13220826 ] 

Sijie Guo commented on BOOKKEEPER-180:
--------------------------------------

> I don't understand the new code in Bookie#run. Shouldn't the Deathwatcher catch this problem?

Currently, the Deathwatcher checks the running flag to decide whether the bookie is alive. If the Bookie thread hits an exception such as an IOException (for example, because no disk space is left), the Bookie thread quits, but the other threads stay alive and the running flag is never set to false, so the Deathwatcher does not notice. The new code in Bookie#run is there to shut down the other threads in that case.
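
To make the failure mode concrete, here is a minimal sketch of the pattern described above. It is not the BK-180 patch or the real Bookie class: SketchBookie, journalLoop, helperLoop and the simulated "no space left" IOException are all hypothetical names invented for illustration. The only point it shows is that the failing thread clears the running flag and stops its sibling threads in a finally block, so a watcher polling the flag can see that the process should exit.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a bookie; NOT the real org.apache.bookkeeper Bookie. */
class SketchBookie {
    // Flag a watcher thread would poll to decide whether the bookie is alive.
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final CountDownLatch stopped = new CountDownLatch(1);
    private final boolean failJournal; // assumption: simulates a full disk
    private Thread journalThread;
    private Thread helperThread;

    SketchBookie(boolean failJournal) {
        this.failJournal = failJournal;
    }

    void start() {
        running.set(true);
        helperThread = new Thread(this::helperLoop, "helper");
        journalThread = new Thread(this::journalLoop, "journal");
        helperThread.start();
        journalThread.start();
    }

    private void journalLoop() {
        try {
            while (running.get()) {
                flush();
                Thread.sleep(10);
            }
        } catch (IOException e) {
            // Without the finally block below, this thread would simply exit
            // here while the helper keeps running and the flag stays true.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            shutdown();
        }
    }

    private void flush() throws IOException {
        if (failJournal) {
            throw new IOException("No space left on device (simulated)");
        }
    }

    private void helperLoop() {
        try {
            while (running.get()) {
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Clears the flag once and interrupts both threads; safe to call repeatedly. */
    void shutdown() {
        if (running.compareAndSet(true, false)) {
            helperThread.interrupt();
            journalThread.interrupt();
            stopped.countDown();
        }
    }

    boolean isRunning() {
        return running.get();
    }

    /** Waits up to the given time for shutdown() to have taken effect. */
    boolean awaitStopped(long millis) {
        try {
            return stopped.await(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Waits up to the given time for the helper thread to exit. */
    boolean awaitHelperExit(long millis) {
        try {
            helperThread.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !helperThread.isAlive();
    }
}
```

With new SketchBookie(true).start(), the journal thread fails on its first flush, the finally block runs shutdown(), and isRunning() then reports false. If the finally block is removed, the helper thread keeps running and the flag stays true, which is the situation reported in this issue.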
                
> bookie server doesn't quit when running out of disk space
> ---------------------------------------------------------
>
>                 Key: BOOKKEEPER-180
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-180
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-180.diff, conn3.png
>
>
> We found that publish throughput drops when one bookie server runs out of disk space (we don't do log rotation, so the logs exhaust the disk).
> After some investigation, we found that the bookie server doesn't quit when it runs out of disk space, so the hub server still treats it as available. Add requests are still sent to this bookie server, and some of them are put in the journal queue to be flushed, but the journal flush thread has already quit because of the lack of disk space. These add requests therefore get no response until the bookie client hits its read timeout and chooses other bookie servers.
> As an experiment, we shut down the bookie that had run out of disk space, and publish throughput recovered quickly.
