Posted to dev@pulsar.apache.org by 丛搏 <co...@gmail.com> on 2022/09/07 03:59:48 UTC

Re: [DISCUSSION] PiP196 TransactionBuffer Multiple-snapshots

Hi Xiangying

I think this is a very good optimization. It solves the problem that
arises when users have a lot of aborted transactions.

Thanks!
Bo

Yubiao Feng <yu...@streamnative.io.invalid> wrote on Tue, Aug 16, 2022 at 11:33:
>
> Hi Xiangying
>
> >> Can the sequence id generation strategy be added to the proposal?
>
> > I think it's an implementation detail that shouldn't be exposed to the user
> > at all.
>
> OK.
>
> Thanks
> Yubiao Feng
>
> On Mon, Aug 15, 2022 at 8:11 PM Xiangying Meng <xi...@apache.org> wrote:
>
> > Hi, yubiao,
> > I think it's an implementation detail that shouldn't be exposed to the user
> > at all.
> > Yours sincerely,
> > Xiangying Meng
> >
> > On Mon, Aug 15, 2022 at 8:02 PM Yubiao Feng
> > <yu...@streamnative.io.invalid> wrote:
> >
> > > Hi Xiangying
> > >
> > > Thank you for your reply. Sorry, I have one more question:
> > >
> > > > > If these operations fail at operation 2, the old snapshots will be
> > > > > overwritten by the new large snapshot during compaction, because they
> > > > > have the same sequence ID.
> > >
> > > Can the sequence id generation strategy be added to the doc?
> > >
> > > On Mon, Aug 15, 2022 at 6:35 PM Xiangying Meng <xi...@apache.org>
> > > wrote:
> > >
> > > > Hi, yubiao,
> > > > > First of all, thanks for your attention and questions. To answer your
> > > > > three questions:
> > > > > 1.
> > > > > > Does the merge take place in memory or in BK?
> > > > > The snapshots are merged in BK. For details, see the
> > > > > *### Merge snapshot* section.
> > > > > 2.
> > > > > > How do we ensure the atomicity of the two writes? I suggest adding a
> > > > > > check
> > > > > We do not guarantee their atomicity. The position of a snapshot
> > > > > generally does not change, so the previous index remains valid. If the
> > > > > index write fails after a snapshot is written, the only consequence is
> > > > > that this snapshot attempt fails. Nothing worse happens, and
> > > > > compaction introduces no dirty data.
> > > > > 3.
> > > > > > Clean up unused aborts data
> > > > > Snapshot cleanup is described in *#### take snapshot*, under *##### How*.
> > > > > The index is cleaned up automatically by the compactor. I will
> > > > > add this to the *### Snapshot index topic* section.
> > > >
> > > > yours sincerely,
> > > > Xiangying Meng
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, Aug 15, 2022 at 3:56 PM Yubiao Feng
> > > > <yu...@streamnative.io.invalid> wrote:
> > > >
> > > > > Hi Xiangying
> > > > >
> > > > > > I think multiple snapshots for the TB are a good idea. I have these
> > > > > > questions:
> > > > >
> > > > >
> > > > > > > The number of the transactions in a snapshot can be configured, and
> > > > > > > we hope it is small, then we can merge the small snapshots into a
> > > > > > > large snapshot when it reaches a configured number.
> > > > >
> > > > > Does the merge take place in memory or in BK?
> > > > >
> > > > > > - If we merge the small snapshots in memory, can we just use the
> > > > > >   large snapshot?
> > > > > > - If we merge the small snapshots in BK, how is it done?
> > > > >
> > > > >
> > > > >
> > > > > > The index is written after each multiple-snapshot is written.
> > > > >
> > > > > Snapshot and index are stored in different topics, right?
> > > > >
> > > > > > How do we ensure the atomicity of the two writes? I suggest adding a
> > > > > > check mechanism so that a snapshot not recorded in the index is
> > > > > > treated as invalid.
> > > > >
> > > > >
> > > > >
> > > > > > #### Clean up unused aborts data
> > > > >
> > > > > > Currently, this section only has instructions for clearing snapshots.
> > > > > > I think we should also describe how to delete or overwrite the index data.
> > > > >
> > > > > Thanks
> > > > > Yubiao Feng
> > > > >
> > > > > > On Thu, Aug 4, 2022 at 10:27 AM Xiangying Meng <xiangying@apache.org>
> > > > > > wrote:
> > > > >
> > > > > > Hi, Pulsar community,
> > > > > > > I'd like to start a discussion about transaction multiple-snapshot.
> > > > > > > To get rid of the capacity limitation of the BookKeeper entry, we
> > > > > > > plan to use multiple snapshots. More details can be found here
> > > > > > > <https://github.com/apache/pulsar/issues/16913>.
> > > > > >
> > > > > > Yours sincerely,
> > > > > > Xiangying Meng
> > > > > >
> > > > >
> > > >
> > >
> >
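
The non-atomic "write the snapshot, then write the index" ordering discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not the PIP's implementation: `SnapshotStore`, `take_snapshot`, `recover`, and `fail_index_write` are hypothetical names, the two topics are modeled as in-memory Python structures, and the rule that recovery reads only indexed snapshots is the check suggested in the thread, assumed here for the sake of the example.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotStore:
    """In-memory stand-in for the snapshot topic and the index topic (hypothetical)."""
    segments: dict = field(default_factory=dict)  # position -> aborted txn IDs
    index: list = field(default_factory=list)     # positions of valid snapshots
    fail_index_write: bool = False                # hook to simulate a failed index write

    def take_snapshot(self, position, aborted_txn_ids):
        # Write 1: persist the snapshot itself.
        self.segments[position] = list(aborted_txn_ids)
        # Write 2: publish an index entry pointing at it (not atomic with write 1).
        if self.fail_index_write:
            # The snapshot is left unreferenced; the previous index stays valid.
            return False
        self.index = self.index + [position]
        return True

    def recover(self):
        # Assumption: recovery only trusts snapshots that the index references.
        return [txn for pos in self.index for txn in self.segments[pos]]


store = SnapshotStore()
store.take_snapshot(1, ["txn-1", "txn-2"])

store.fail_index_write = True          # simulate the index write failing
ok = store.take_snapshot(2, ["txn-3"])
recovered = store.recover()            # snapshot 2 is ignored on recovery
```

In this sketch a failed index write simply means the snapshot attempt did not take effect: the unreferenced snapshot at position 2 is never read, and recovery still returns the state described by the previous index.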