You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@bookkeeper.apache.org by Venkateswara Rao Jujjuri <ju...@gmail.com> on 2019/05/29 04:54:11 UTC

Re: Changing ledger metadata to binary format

Hi all, bringing back this conversation back to life..

https://github.com/apache/bookkeeper/issues/723

What is our plan to move forward with binary format? Anyone using binary
format in production? even for new clusters?
What is blocking us from using binary format for the new clusters?

On Mon, Jul 30, 2018 at 11:02 AM Sijie Guo <gu...@gmail.com> wrote:

> JV,
>
> 1) for the real world use case, this is the problem I have hit which I have
> to manually add a flag in the ledger metadata to avoid serializing ctime.
>
> https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java#L105
>
> 2)  > We have introduced customMetadata in the past and that was not
> mandatory.
> How did we make that work with text format?
>
> by default if you don't use customMetadata, customMetadata is not
> serialized into metadata. so there is no problem for rolling back if you
> don't
> use this feature. so that's why it wasn't a problem for most of the people.
>
> the ctime field was the problem.
>
> 3)
> > At this point, I am not sure if we have identified all the pieces and if
> they fit together.
> > Enrico, are you driving the roll-back story? or whoever, can you please
> come up with a comprehensive proposal?
>
> we have a good-enough story for upgrading/rolling-back on wire-protocol and
> disk formats. however we don't have a good story on metadata part.
> that's why I sort of pulling in the metadata piece only. however if filling
> in a big picture would help discussing this problem, I can drive the whole
> picture
> for this roll-back story.
>
> - Sijie
>
> On Mon, Jul 30, 2018 at 7:55 AM Venkateswara Rao Jujjuri <
> jujjuri@gmail.com>
> wrote:
>
> > We have introduced customMetadata in the past and that was not mandatory.
> > How did we make that work with text format?
> >
>
>
>
> > Also, my preference is to wait until we have a need to introduce a new
> > field for this conversion if needed.
> >
> > JV
> >
> > On Mon, Jul 30, 2018 at 12:42 AM, Enrico Olivelli <eo...@gmail.com>
> > wrote:
> >
> > > Il giorno lun 30 lug 2018 alle ore 09:17 Ivan Kelly <iv...@apache.org>
> > ha
> > > scritto:
> > >
> > > > >> Thank you for putting this together. It is also good to put this
> as
> > a
> > > > BP,
> > > > >> since it is about the metadata layout.
> > > >
> > > > I'll put a BP up this week once I have initial feedback.
> > > >
> > > > >> > - When writing a metadata, check what is in /ledgers/LAYOUT. If
> it
> > > is
> > > > >> > as above, write using the current text protobuf. If is bumped,
> use
> > > the
> > > > >> > new binary format.
> > > > >
> > > > > isn't it too costly? Adding a zk read for each write. We could add
> a
> > > > watch
> > > > > but is has an important cost
> > > >
> > > > This cost can be amortized. In simplest case, we only read when
> client
> > > > starts (as already happens). Client are able both read and write the
> > > > old format, so clients that read the pre-upgrade version will still
> > > > write in text format, and this is fine.
> > > >
> > >
> > > For me it is fine to read it when client starts. A rolling restart of
> the
> > > client application will be enough to load the new configuration.
> > >
> > >
> > > >
> > > > However, we would want them to eventually see the new format without
> a
> > > > restart. At one extreme, we could put a watch on the znode, but this
> > > > could be quite heavy (I think the only per-client watch we have now
> is
> > > > on /ledgers/available). Alternatively, we could just read
> > > > /ledgers/LAYOUT once per day. This way clients who saw the old
> version
> > > > will eventually get updated.
> > > >
> > >
> > > I prefer NOT to use the watch, this cost will be payed forever even
> when
> > > there will be no configuration changes.
> > >
> > > I would go with the 'load at start' option, it is clear and works fine
> > >
> > >
> > >
> > > >
> > > > > What about having a client side config writeMetadataVersion ?
> > > > > We start a new metadata version, the new one  will be encoded as
> > > binary.
> > > > > By default 4.8 clients will use previous version, as we already do
> > for
> > > > > journal and fileinfo on bookies.
> > > > >
> > > > > Each ledger is independent from the others, there is no need of a
> > > global
> > > > > flag written on zk.
> > > >
> > > > There is a global property here though,which only the administrator
> > > > can control. It is "Can all clients in the cluster/application read
> > > > the new ledger metadata version?". This is why I'd like to keep it as
> > > > a centralized flag rather than allowing each client to decide for
> > > > themselves.
> > > >
> > >
> > > ok, let's keep it
> > >
> > > Enrico
> > >
> > >
> > >
> > > >
> > > > > Don't we already have some tool? (Maybe I only have seen such tool
> in
> > > my
> > > > > company applications)
> > > >
> > > > It may already exist, I haven't checked.
> > > >
> > > > -Ivan
> > > >
> > >
> >
> >
> >
> > --
> > Jvrao
> > ---
> > First they ignore you, then they laugh at you, then they fight you, then
> > you win. - Mahatma Gandhi
> >
>


-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi

Re: Changing ledger metadata to binary format

Posted by Ivan Kelly <iv...@apache.org>.
On Thu, May 30, 2019 at 12:15 AM Venkateswara Rao Jujjuri
<ju...@gmail.com> wrote:
>
> > "Let's decide this when we need it"
>
> Right. This is the time as we are trying to add fault domain info to the
> cookie.

When you say cookie, you mean under /ledgers/cookies? That falls
outside the scope of binary metadata.

> > to make a flag in the metadata store that we flip to enable it.
> Right I do recall the same. But given your proposal in this thread, all
> readers try to read in binary format,
> if that fails they fail back to text mode.

Actually, it doesn't even need this fallback, as the version of the
metadata is stored at the very start, so we switch on that.

> If this is the case, does the following make sense?
> 1. Bookies are updated first with new version. Replication workers
> recognize binary/text format and any metadata updates happen in the same
> format.
> 2. Clients are rolled out later with a caveat that the ledgers created by
> new clients can't be read by old clients. This may be ok as long as they
> don't share
>     the same ledgers.
> With these two things, do we need an explicit flip to enable?

This works, but you need to expose a mechanism in the client to tell
it to write in V3.

As long as old clients don't try and read, there should be no problem.

> I can read the patches again, but let me ask :) did we take care of bookie
> shell also to understand binary format?

Yes, bookie shell can understand it since it reads though the same interfaces.

-Ivan

Re: Changing ledger metadata to binary format

Posted by Venkateswara Rao Jujjuri <ju...@gmail.com>.
> "Let's decide this when we need it"

Right. This is the time as we are trying to add fault domain info to the
cookie.

> to make a flag in the metadata store that we flip to enable it.
Right I do recall the same. But given your proposal in this thread, all
readers try to read in binary format,
if that fails they fail back to text mode.
If this is the case, does the following make sense?
1. Bookies are updated first with new version. Replication workers
recognize binary/text format and any metadata updates happen in the same
format.
2. Clients are rolled out later with a caveat that the ledgers created by
new clients can't be read by old clients. This may be ok as long as they
don't share
    the same ledgers.
With these two things, do we need an explicit flip to enable?

> The important thing at the time was to make it so clients can _read_
binary metadata

I can read the patches again, but let me ask :) did we take care of bookie
shell also to understand binary format?

On Wed, May 29, 2019 at 3:15 AM Ivan Kelly <iv...@apache.org> wrote:

> > What is our plan to move forward with binary format?
>
> I've not plans regarding it. Moving forward with it will happen when
> someone comes with a metadata change which will break text metadata
> users (i.e. almost any metadata change).
>
> > Anyone using binary format in production? even for new clusters?
>
> We are not.
>
> > What is blocking us from using binary format for the new clusters?
>
> We need a tool to enable it. I can't remember the exact details, but
> there was a conversation about whether to make each users specify the
> metadata version in their configuration, or to make a flag in the
> metadata store that we flip to enable it. I don't think we settled on
> anything, and it turned into a "Let's decide this when we need it"
> type of thing. The important thing at the time was to make it so
> clients can _read_ binary metadata, so that these clients can then
> read future updates.
>
> -Ivan
>


-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi

Re: Changing ledger metadata to binary format

Posted by Ivan Kelly <iv...@apache.org>.
> What is our plan to move forward with binary format?

I've not plans regarding it. Moving forward with it will happen when
someone comes with a metadata change which will break text metadata
users (i.e. almost any metadata change).

> Anyone using binary format in production? even for new clusters?

We are not.

> What is blocking us from using binary format for the new clusters?

We need a tool to enable it. I can't remember the exact details, but
there was a conversation about whether to make each users specify the
metadata version in their configuration, or to make a flag in the
metadata store that we flip to enable it. I don't think we settled on
anything, and it turned into a "Let's decide this when we need it"
type of thing. The important thing at the time was to make it so
clients can _read_ binary metadata, so that these clients can then
read future updates.

-Ivan