Posted to dev@bookkeeper.apache.org by Bobby Evans <ev...@oath.com.INVALID> on 2018/02/15 17:19:45 UTC

Journal Corruption when disk is full

We recently had a situation where the disk holding a journal filled up,
leaving a partial entry at the end of the journal. The partial entry was a
METAENTRY_ID_LEDGER_KEY record, so only part of the master key was written,
and the bookie then failed on startup.

Because we are running a forked version of BookKeeper, I was wondering if
anyone had already done something to fix this. If not, I am happy to
take a crack at it, but I don't want to duplicate effort if it is already
fixed.

Thanks,

Bobby
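
A minimal sketch of how journal replay could tolerate the kind of truncated
tail described above, assuming a simplified layout of
[4-byte length][payload] records. The class name JournalTailScan and this
layout are hypothetical and are not BookKeeper's actual journal format or
replay code. The idea is only that a record whose declared length runs past
the end of the data is reported as a truncated tail instead of being parsed
as a short master key.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not BookKeeper's journal format or replay code. */
final class JournalTailScan {

    /** Outcome of a scan: the complete records and where valid data ends. */
    static final class Result {
        final List<byte[]> records = new ArrayList<>();
        int validEnd;          // offset just past the last complete record
        boolean truncatedTail; // true if trailing bytes did not form a full record
    }

    /** Scans records laid out as [4-byte big-endian length][payload]. */
    static Result scan(byte[] journal) {
        ByteBuffer buf = ByteBuffer.wrap(journal);
        Result result = new Result();
        while (buf.hasRemaining()) {
            // A partial length header means the last write was cut short.
            if (buf.remaining() < Integer.BYTES) {
                result.truncatedTail = true;
                break;
            }
            int len = buf.getInt(buf.position()); // peek without consuming
            // A length that is negative or runs past the end of the data
            // marks a record that was never fully written.
            if (len < 0 || len > buf.remaining() - Integer.BYTES) {
                result.truncatedTail = true;
                break;
            }
            buf.position(buf.position() + Integer.BYTES);
            byte[] payload = new byte[len];
            buf.get(payload);
            result.records.add(payload);
            result.validEnd = buf.position();
        }
        return result;
    }
}
```

A caller could replay result.records and ignore or truncate everything past
validEnd rather than failing startup. Dropping the tail is only safe if the
partial record was never acknowledged to a client, which should hold when
acknowledgements are sent only after a successful flush.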

Re: Journal Corruption when disk is full

Posted by Ivan Kelly <iv...@apache.org>.

On Fri, Feb 16, 2018 at 9:48 PM, Matteo Merli <ma...@gmail.com> wrote:
>> One thing that should circumvent this is that the bookie should go
>> into readonly mode when it hits 95% full disk.
>
> I think this applies only to the ledger disks, not to the journal.
That doesn't seem like the right thing to do :/

-Ivan

Re: Journal Corruption when disk is full

Posted by Matteo Merli <ma...@gmail.com>.

> One thing that should circumvent this is that the bookie should go
> into readonly mode when it hits 95% full disk.

I think this applies only to the ledger disks, not to the journal.

And, to answer Bobby: the switch to read-only mode was already
present in 4.3 (again, just for the storage device).

Matteo

-- 
Matteo Merli
<mm...@apache.org>

Re: Journal Corruption when disk is full

Posted by Ivan Kelly <iv...@apache.org>.

On Thu, Feb 15, 2018 at 9:49 PM, Bobby Evans <ev...@oath.com.invalid> wrote:
> I don't have the read only mode on disk full feature yet.  I will look at
> pulling it back to our fork, but I will also look at fixing the journaling
> in general.  Having spoken with the HDFS team here, they have seen a lot of
> scary things that appear similar to this situation when a disk starts to go
> bad. It would probably be in our best interest to guard against some of
> those things on the bookies too.
What scary things is the HDFS team seeing? One thing we do in the
journal is preallocate disk space before we write to it. As I
remember, this was originally done mostly to get smoother latency, since
the filesystem gets less involved, but it should also avoid the
situation you described in your original email (unless the filesystem
is overcommitting, or there's some strange CoW stuff going on). Also, I
recall some changes that came in from Twitter that pad each
write to the journal out to the expected block size (I don't think we
query the actual size), which ensures that you never try to
rewrite a block; a rewrite could corrupt data if it failed in the middle.
Of course, there's no guarantee that these things are
bug free, but they should have handled the situation you described.
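
A minimal sketch of the two safeguards described above: preallocating
journal space and padding each write out to a block boundary. The class and
method names, and the 512-byte alignment used in the usage note, are
hypothetical; this is not BookKeeper's actual journal code. The standard
Java API has no fallocate call, so the sketch reserves space by writing
zeros, which, as noted above, may still not reserve blocks on overcommitting
or copy-on-write filesystems.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical sketch; not BookKeeper's journal implementation. */
final class PaddedJournalWriter {

    /** Rounds size up to the next multiple of alignment. */
    static long alignUp(long size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    /**
     * Reserves [from, from + bytes) by writing zeros, so that a full disk
     * fails here and not in the middle of a later record.
     */
    static void preallocate(FileChannel ch, long from, long bytes) throws IOException {
        ByteBuffer zeros = ByteBuffer.allocate(64 * 1024);
        long pos = from;
        long end = from + bytes;
        while (pos < end) {
            zeros.clear(); // reset position and limit; contents stay zero
            zeros.limit((int) Math.min(zeros.capacity(), end - pos));
            while (zeros.hasRemaining()) {
                pos += ch.write(zeros, pos);
            }
        }
        ch.force(true); // flush data and the new file length
    }

    /**
     * Writes payload at offset, zero-padded to a multiple of alignment.
     * Returns the next aligned offset, so no block is ever rewritten.
     */
    static long writePadded(FileChannel ch, long offset, byte[] payload, int alignment)
            throws IOException {
        int padded = (int) alignUp(payload.length, alignment);
        ByteBuffer buf = ByteBuffer.allocate(padded); // zero-filled on allocation
        buf.put(payload);
        buf.clear(); // expose the whole padded block without erasing it
        long pos = offset;
        while (buf.hasRemaining()) {
            pos += ch.write(buf, pos);
        }
        return pos;
    }
}
```

For example, with a 512-byte alignment a 100-byte record written at offset 0
occupies the first block, and the next record starts at offset 512. The cost
is wasted space for small records, traded for never touching a block that
already holds acknowledged data.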

-Ivan

Re: Journal Corruption when disk is full

Posted by Bobby Evans <ev...@oath.com.INVALID>.

I don't have the read-only-mode-on-disk-full feature yet.  I will look at
pulling it back to our fork, but I will also look at fixing the journaling
in general.  Having spoken with the HDFS team here, they have seen a lot of
scary things that appear similar to this situation when a disk starts to go
bad. It would probably be in our best interest to guard against some of
those things on the bookies too.  Thanks again.

Re: Journal Corruption when disk is full

Posted by Enrico Olivelli <eo...@gmail.com>.

Il gio 15 feb 2018, 18:45 Ivan Kelly <iv...@apache.org> ha scritto:

> Hi Bobby,
>
> One thing that should circumvent this is that the bookie should go
> into readonly mode when it hits 95% full disk.
>
> This change is in your branch[1]. Have you disabled it?
>
> Regards,
> Ivan
>
> [1]
> https://github.com/yahoo/bookkeeper/blob/7fb556fa2dbc1d308b5d7ec3e2676b8b11704698/bookkeeper-server/conf/bk_server.conf#L247
>
> On Thu, Feb 15, 2018 at 6:19 PM, Bobby Evans <ev...@oath.com.invalid> wrote:
> > We recently had a situation where the disk a journal is on filled up and we
> > ended up with a partial edit in the journal.  This ended up being a
> > METAENTRY_ID_LEDGER_KEY so only a partial master key was output and this
> > ended up making the bookie fail on startup.
> >
> > Because we are running a forked version of bookkeeper I was wondering if
> > anyone had done something to fix this in the past.  If not I am happy to
> > take a crack at it, but I don't want to double up efforts if it is already
> > fixed.
> >
> > Thanks,
> >
> > Bobby
>


Nothing on my side, sorry.

I know that bookies have sometimes crashed due to lack of disk space, but my
colleagues simply dropped the bookie.
Now that bookies switch to read-only mode, the problem is mitigated a lot.
Sorry

Enrico

-- Enrico Olivelli

Re: Journal Corruption when disk is full

Posted by Ivan Kelly <iv...@apache.org>.

Hi Bobby,

One thing that should circumvent this is that the bookie should go
into readonly mode when it hits 95% full disk.

This change is in your branch[1]. Have you disabled it?

Regards,
Ivan

[1] https://github.com/yahoo/bookkeeper/blob/7fb556fa2dbc1d308b5d7ec3e2676b8b11704698/bookkeeper-server/conf/bk_server.conf#L247
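
A minimal sketch of a disk-usage threshold check of the kind discussed here,
using the 95% figure from this thread. The class and method names are
hypothetical and this is not the bookie's own disk checker. It relies only
on java.nio.file.FileStore. Passing the journal directory alongside the
ledger directories is an assumption of this sketch, not a description of
what the bookie does.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical sketch; not BookKeeper's disk checker. */
final class DiskUsageCheck {

    /** Fraction of the store in use, in the range 0.0 to 1.0. */
    static double usageRatio(long totalBytes, long usableBytes) {
        if (totalBytes <= 0) {
            return 1.0; // unknown capacity: treat as full to stay safe
        }
        return 1.0 - (double) usableBytes / totalBytes;
    }

    /** Usage of the file store that holds dir. */
    static double usageOf(Path dir) throws IOException {
        FileStore store = Files.getFileStore(dir);
        return usageRatio(store.getTotalSpace(), store.getUsableSpace());
    }

    /** True if any directory's store is at or above the threshold. */
    static boolean shouldGoReadOnly(List<Path> dirs, double threshold) throws IOException {
        for (Path dir : dirs) {
            if (usageOf(dir) >= threshold) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would run shouldGoReadOnly(dirs, 0.95) periodically and stop
accepting writes when it returns true. The check is advisory: the disk can
still fill between two checks, so it reduces but does not remove the need
for the journal to cope with a failed write.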
