Posted to dev@pulsar.apache.org by Zhanpeng Wu <wu...@gmail.com> on 2022/01/14 02:36:42 UTC

[VOTE] PIP-129: Introduce intermediate state for ledger deletion

This is the voting thread for PIP-129. It will stay open for at least 48
hours.  Pasted below for quoting convenience.

----

https://github.com/apache/pulsar/issues/13526

----

## Motivation

Under the current ledger-trimming design in
`org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl#internalTrimLedgers`,
we first collect the ledgers that need to be deleted and then delete them
asynchronously and concurrently, but we never check whether each deletion
actually completes. If the metadata update succeeds but an error occurs
during the asynchronous deletion, the ledger may not be deleted, yet at the
logical level we treat the deletion as complete. The ledger's data then
remains in the storage layer (such as BookKeeper) indefinitely. The longer
a cluster runs, the more of this undeletable residual data accumulates.

To fix this, we can separate the metadata update from the ledger deletion.
During trimming, we first mark which ledgers are deletable and write the
result to the metadata store. We then delete the marked ledgers
asynchronously in the callback of the metadata update, so the original
logic is retained seamlessly. As a result, during a rolling upgrade or a
rollback, the only difference is whether a ledger is marked for deletion
before it is deleted.

To be more specific:
1. On upgrade, only the ledger marker information is added; the logical
sequence of deletion does not change.
2. On rollback, some ledgers that have been marked for deletion may not be
deleted if the broker restarts. This behavior is consistent with the
original version.

In addition, if a marked ledger is not deleted successfully, its marker is
not removed. Every time trimming is triggered, the deletion of these
ledgers is attempted again, which is equivalent to a check-and-retry
mechanism.

## Goal

We need to modify the logic in
`org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl#internalTrimLedgers`
so that ledger deletion during trimming is split into two stages: marking
and deleting. Once the marker information has been written to the metadata
store, every trimming run tries to trigger the deletion again until all
deletable ledgers have been deleted successfully.

## Implementation

This proposal separates the deletion logic in ledger trimming, so that
`ManagedLedgerImpl#internalTrimLedgers` is responsible for marking the
deletable ledgers and then performing the actual ledger deletion according
to the metadata store.

Therefore, the entire trimming process is broken down into the following
steps:

1. Mark deletable ledgers and update the ledger metadata.
2. Perform the actual ledger deletion after the metadata is updated.

For step 1, we can store the deletion markers in
`org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl#propertiesMap`. To
retrieve the ledgers marked for deletion, we can query directly by
iterating over `propertiesMap`. If this solution is not accepted, we could
instead create a new znode to store this information, but that approach
would not reuse the current design.
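
As a rough illustration of step 1, the sketch below stores one marker per
deletable ledger in a string-to-string properties map and recovers the
marked ledger ids by iterating over it. The class `DeletionMarkers`, the
key prefix `deletable-ledger-`, and the method names are assumptions made
for this example; the proposal does not specify the actual key format.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; names and key format are illustrative only. */
final class DeletionMarkers {
    // Assumed key prefix; not taken from the proposal or the Pulsar code base.
    static final String PREFIX = "deletable-ledger-";

    private DeletionMarkers() {}

    /** Records that the given ledger is deletable. */
    static void mark(Map<String, String> propertiesMap, long ledgerId) {
        propertiesMap.put(PREFIX + ledgerId, "true");
    }

    /** Removes the marker once the ledger has actually been deleted. */
    static void unmark(Map<String, String> propertiesMap, long ledgerId) {
        propertiesMap.remove(PREFIX + ledgerId);
    }

    /** Iterates over the map and returns the marked ledger ids in ascending order. */
    static List<Long> markedLedgers(Map<String, String> propertiesMap) {
        List<Long> ids = new ArrayList<>();
        for (String key : propertiesMap.keySet()) {
            if (!key.startsWith(PREFIX)) {
                continue; // an unrelated property
            }
            try {
                ids.add(Long.parseLong(key.substring(PREFIX.length())));
            } catch (NumberFormatException e) {
                // Not a marker written by mark(); ignore it.
            }
        }
        Collections.sort(ids);
        return ids;
    }
}
```

Keeping the markers as ordinary properties means they are persisted by the
same metadata update that already exists, which is the reuse the paragraph
above refers to.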

For step 2, we can delete the marked ledgers asynchronously in the callback
of the metadata update. Every trimming run also triggers the
check-and-delete for the ledgers that are still marked as deletable.
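
The two stages and the retry behavior can be modeled with a small,
self-contained sketch. This is not the `ManagedLedgerImpl` code:
`TwoStageTrimmer`, its in-memory `marked` set (standing in for the markers
in the metadata store), and the injected `deleteLedger` function (standing
in for the asynchronous delete in the storage layer) are all hypothetical.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.function.Function;

/** Hypothetical model of the two-stage trim; not the actual Pulsar code. */
final class TwoStageTrimmer {
    // Stands in for the markers persisted in the metadata store.
    private final Set<Long> marked = new ConcurrentSkipListSet<>();
    // Assumed storage-layer call that deletes one ledger asynchronously.
    private final Function<Long, CompletableFuture<Void>> deleteLedger;

    TwoStageTrimmer(Function<Long, CompletableFuture<Void>> deleteLedger) {
        this.deleteLedger = deleteLedger;
    }

    /** Stage 1: persist the markers. In this model the update always succeeds. */
    private CompletableFuture<Void> updateMetadata(List<Long> newlyDeletable) {
        marked.addAll(newlyDeletable);
        return CompletableFuture.completedFuture(null);
    }

    /** One trimming run: mark, then delete every marked ledger in the callback. */
    CompletableFuture<Void> trim(List<Long> newlyDeletable) {
        return updateMetadata(newlyDeletable).thenCompose(ignored -> {
            CompletableFuture<?>[] deletions = marked.stream()
                .map(id -> deleteLedger.apply(id)
                    // Stage 2 succeeded: only now is the marker removed.
                    .thenRun(() -> marked.remove(id))
                    // Stage 2 failed: keep the marker so the next run retries.
                    .exceptionally(ex -> null))
                .toArray(CompletableFuture[]::new);
            return CompletableFuture.allOf(deletions);
        });
    }

    /** Ledgers that are marked but not yet deleted. */
    Set<Long> pending() {
        return marked;
    }
}
```

In this model, a ledger whose deletion fails stays in `pending()` and is
attempted again on the next `trim` call, even if that call marks nothing
new.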

Related PR: https://github.com/apache/pulsar/pull/13575

Re: [VOTE] PIP-129: Introduce intermediate state for ledger deletion

Posted by PengHui Li <pe...@apache.org>.
I have added the PIP to the wiki page: https://github.com/apache/pulsar/wiki

Thanks for the great work, Zhanpeng.

Regards,
Penghui

On Thu, Jan 20, 2022 at 4:05 PM Zhanpeng Wu <wu...@gmail.com>
wrote:

> Thanks for your participation. Closing the vote with 3 binding +1s, 3
> non-binding +1s, and no -1s.
>
> On Thu, Jan 20, 2022 at 15:56, Lin Lin <li...@apache.org> wrote:
>
> > +1
> >
>

Re: [VOTE] PIP-129: Introduce intermediate state for ledger deletion

Posted by Zhanpeng Wu <wu...@gmail.com>.
Thanks for your participation. Closing the vote with 3 binding +1s, 3
non-binding +1s, and no -1s.

On Thu, Jan 20, 2022 at 15:56, Lin Lin <li...@apache.org> wrote:

> +1
>

Re: [VOTE] PIP-129: Introduce intermediate state for ledger deletion

Posted by Lin Lin <li...@apache.org>.
+1

Re: [VOTE] PIP-129: Introduce intermediate state for ledger deletion

Posted by PengHui Li <pe...@apache.org>.
+1 (binding)

On Fri, Jan 14, 2022 at 4:32 PM Aloys Zhang <al...@apache.org> wrote:

> +1 (non-binding)
>
> On Fri, Jan 14, 2022 at 16:12, Haiting Jiang <ji...@apache.org> wrote:
>
> > +1 (non)
> >
> > On 2022/01/14 03:23:37 mattison chao wrote:
> > > +1 (non-binding)
> > >
> > > Best,
> > > Mattison
> > >
> > > On Fri, 14 Jan 2022 at 11:19, Hang Chen <ch...@apache.org> wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > Best,
> > > > Hang
> > > >

Re: [VOTE] PIP-129: Introduce intermediate state for ledger deletion

Posted by Aloys Zhang <al...@apache.org>.
+1 (non-binding)

On Fri, Jan 14, 2022 at 16:12, Haiting Jiang <ji...@apache.org> wrote:

> +1 (non)
>
> On 2022/01/14 03:23:37 mattison chao wrote:
> > +1 (non-binding)
> >
> > Best,
> > Mattison
> >
> > On Fri, 14 Jan 2022 at 11:19, Hang Chen <ch...@apache.org> wrote:
> >
> > > +1 (binding)
> > >
> > > Best,
> > > Hang
> > >

Re: [VOTE] PIP-129: Introduce intermediate state for ledger deletion

Posted by Haiting Jiang <ji...@apache.org>.
+1 (non)

On 2022/01/14 03:23:37 mattison chao wrote:
> +1 (non-binding)
> 
> Best,
> Mattison
> 
> On Fri, 14 Jan 2022 at 11:19, Hang Chen <ch...@apache.org> wrote:
> 
> > +1 (binding)
> >
> > Best,
> > Hang
> >

Re: [VOTE] PIP-129: Introduce intermediate state for ledger deletion

Posted by zhangao <ga...@qq.com.INVALID>.
+1 (non-binding)




------------------ Original Message ------------------
From: "dev" <mattisonchao@gmail.com>;
Sent: Friday, January 14, 2022, 11:23 AM
To: "dev" <dev@pulsar.apache.org>;

Subject: Re: [VOTE] PIP-129: Introduce intermediate state for ledger deletion



+1 (non-binding)

Best,
Mattison

On Fri, 14 Jan 2022 at 11:19, Hang Chen <chenhang@apache.org> wrote:

> +1 (binding)
>
> Best,
> Hang
>

Re: [VOTE] PIP-129: Introduce intermediate state for ledger deletion

Posted by mattison chao <ma...@gmail.com>.
+1 (non-binding)

Best,
Mattison

On Fri, 14 Jan 2022 at 11:19, Hang Chen <ch...@apache.org> wrote:

> +1 (binding)
>
> Best,
> Hang
>

Re: [VOTE] PIP-129: Introduce intermediate state for ledger deletion

Posted by Hang Chen <ch...@apache.org>.
+1 (binding)

Best,
Hang
