Posted to dev@bookkeeper.apache.org by Jack Vanlightly <ja...@gmail.com> on 2021/12/11 09:53:39 UTC

Unbounded ledgers

Hi all,

Mostly as a thought exercise, I have put together the protocol changes
required to make ledgers unbounded. I have written up the changes here:
https://jack-vanlightly.com/blog/2021/12/9/tweaking-the-bookkeeper-protocol-unbounded-ledgers

The benefit of such an approach is a simplified stream protocol that
doesn't require chaining ledgers together. An individual ledger is already
segmented, so we don't lose the benefits of a segmented log if we make one
stream a single ledger.
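
To illustrate the point about a ledger already being segmented, here is a
minimal Python sketch. It is a toy in-memory model, not the BookKeeper API:
the names Fragment, UnboundedLedger, change_ensemble, add_entry and
ensemble_for are all hypothetical. It assumes that an ensemble change opens
a new fragment inside the same ledger, rather than forcing the client to
close the ledger and chain a new one.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Fragment:
    """A contiguous range of entries stored on one ensemble (hypothetical model)."""
    first_entry_id: int
    ensemble: Tuple[str, ...]


@dataclass
class UnboundedLedger:
    """Toy ledger whose metadata is a growing list of fragments."""
    fragments: List[Fragment] = field(default_factory=list)
    next_entry_id: int = 0

    def change_ensemble(self, ensemble: Tuple[str, ...]) -> None:
        # A new fragment starts at the next entry id; the ledger stays open.
        self.fragments.append(Fragment(self.next_entry_id, ensemble))

    def add_entry(self) -> int:
        # Entries can only be written once an ensemble has been selected.
        if not self.fragments:
            raise RuntimeError("no ensemble selected yet")
        entry_id = self.next_entry_id
        self.next_entry_id += 1
        return entry_id

    def ensemble_for(self, entry_id: int) -> Tuple[str, ...]:
        # The owning fragment is the last one that starts at or before entry_id.
        if not 0 <= entry_id < self.next_entry_id:
            raise KeyError(entry_id)
        owner = None
        for fragment in self.fragments:
            if fragment.first_entry_id <= entry_id:
                owner = fragment
        return owner.ensemble
```

In this model a reader only needs the ledger's fragment list to locate an
entry, so no separate log-of-ledgers index is involved.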

I open it up for discussion here so people can comment, find issues or
indicate they like the idea. It is not a pressing need so I haven't
contemplated making a BP out of it yet.

Jack

Re: Unbounded ledgers

Posted by Jack Vanlightly <jv...@splunk.com.INVALID>.
Hi Enrico,

I think if BookKeeper didn't exist and we were in the design phase, I'd
prefer ledgers to be unbounded, so that I could either have a single ledger
as a stream or, if I still wanted to combine them, as Pravega would need
to, close ledgers intentionally (rather than due to an error). In that
design, much of the logic, such as garbage collection, would operate on
fragments rather than ledgers.
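
As a rough sketch of what fragment-level garbage collection could look like,
the Python below splits a ledger's fragments into those that lie entirely
before a retention point and those that must be kept. The function name
trim_fragments and the (first_entry_id, ensemble) tuple layout are
assumptions made for illustration; they are not part of BookKeeper or of the
linked write-up.

```python
from typing import List, Tuple

# Each fragment is (first_entry_id, ensemble); the list is sorted by first_entry_id.
Fragment = Tuple[int, Tuple[str, ...]]


def trim_fragments(
    fragments: List[Fragment], trim_before: int
) -> Tuple[List[Fragment], List[Fragment]]:
    """Split fragments into (kept, deletable).

    A fragment is deletable only if every entry in it is below trim_before,
    which holds when the next fragment starts at or before trim_before.
    The last fragment is always kept because it may still receive writes.
    """
    kept: List[Fragment] = []
    deletable: List[Fragment] = []
    for index, fragment in enumerate(fragments):
        is_last = index == len(fragments) - 1
        if not is_last and fragments[index + 1][0] <= trim_before:
            deletable.append(fragment)
        else:
            kept.append(fragment)
    return kept, deletable
```

With fragments starting at entries 0, 100 and 250 and a retention point of
120, only the first fragment is deletable, because the second one still
holds entries 120 and above.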

One of the benefits of unbounded ledgers is not being forced to build a
log-of-ledgers on top, which simplifies new use cases. If I were building a
distributed database and wanted to use BookKeeper for the commit log, I'd
love for the LedgerHandle API to be enough for my needs.

In the real world we have bounded ledgers and everything that goes with
them, and there is no compelling use case right now for changing everything
to work with fragments. The big players have their own implementations of
log-of-ledgers, and there is DistributedLog for those who don't want to
build their own.

But if one day there is a compelling reason to go with unbounded ledgers,
then we know it's possible and we get all the benefits of a segmented log
because ledgers are already segmented.

Jack



On Fri, Dec 17, 2021 at 1:02 PM Enrico Olivelli <eo...@gmail.com> wrote:

> Hello Jack,
> This idea sounds really interesting.
>
> As you mention in the doc, one tricky part from my point of view is
> deleting fragments in order to release space (and to comply with legal
> requirements).
> We will also have to make the fragment a top-level concept of BookKeeper
> in order to allow maintenance of the ledgers.
>
> I wonder whether it is better to keep this here in BookKeeper or to keep
> this complexity at a higher level, as in Pulsar and Pravega, which, put
> simply, are basically abstractions over BK that implement such unbounded
> streams of data.
>
> As in Pulsar and Pravega, when you deal with an unbounded amount of data
> you will have to deal with offloading to some external (cheaper) storage,
> so that's another thing that we should move into BookKeeper.
>
> Thanks for sharing
>
> Enrico

Re: Unbounded ledgers

Posted by Enrico Olivelli <eo...@gmail.com>.
Hello Jack,
This idea sounds really interesting.

As you mention in the doc, one tricky part from my point of view is
deleting fragments in order to release space (and to comply with legal
requirements).
We will also have to make the fragment a top-level concept of BookKeeper
in order to allow maintenance of the ledgers.

I wonder whether it is better to keep this here in BookKeeper or to keep
this complexity at a higher level, as in Pulsar and Pravega, which, put
simply, are basically abstractions over BK that implement such unbounded
streams of data.

As in Pulsar and Pravega, when you deal with an unbounded amount of data
you will have to deal with offloading to some external (cheaper) storage,
so that's another thing that we should move into BookKeeper.

Thanks for sharing

Enrico

