You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pulsar.apache.org by Cong Zhao <zh...@apache.org> on 2023/11/07 02:55:52 UTC

[DISSCUSS] Don't retain null-key messages during topic compaction

Hi, Pulsar community

Currently, we retain all null-key messages during topic compaction, which I
don't think is necessary because when you use topic compaction,
it means that you want to retain the value according to the key, so
retaining null-key messages is meaningless.

Additionally, retaining all null-key messages will double the storage cost,
and we'll never be able to clean them up since the compacted topic has not
supported the retention policy yet.

In summary, I don't think we should retain null-key messages during topic
compaction.
Looking forward to your feedback!

Thanks,
Cong Zhao

Re: [DISSCUSS] Don't retain null-key messages during topic compaction

Posted by Cong Zhao <zh...@apache.org>.
Hi mattison,

Thanks for your suggestion, I agree with you, we can add a configuration to smooth the migrate this change.

Let's see if anyone else has any other ideas, and if everyone agrees with this approach, I'll implement it.

Thanks,
Cong Zhao

On 2023/11/07 03:03:58 mattison chao wrote:
> Hi, Cong
> 
> IMO, Please do not break the previous directly. We can migrate it smoothly.  We can add a configuration and give the
> Timeline of making this configuration default and removing it in the next release version.
> 
> For example:
> 
> - Add configuration: `compactionRemainNullKey=true` by default (current behaviour)
> - Make `compactionRemainNullKey=false` default  in the 3.2.0
> - Delete the configuration `compactionRemainNullKey` in 3.3.0.
> 
> This approach will avoid breaking changes and give our users enough time to migrate their usage.
> 
> Plus, I think it’s fair to cherry-pick it to all the previous active branches. 
> 
> 
> Thanks!
> Mattison
> 
> 
> 
> > On Nov 7, 2023, at 10:55, Cong Zhao <zh...@apache.org> wrote:
> > 
> > Hi, Pulsar community
> > 
> > Currently, we retain all null-key messages during topic compaction, which I
> > don't think is necessary because when you use topic compaction,
> > it means that you want to retain the value according to the key, so
> > retaining null-key messages is meaningless.
> > 
> > Additionally, retaining all null-key messages will double the storage cost,
> > and we'll never be able to clean them up since the compacted topic has not
> > supported the retention policy yet.
> > 
> > In summary, I don't think we should retain null-key messages during topic
> > compaction.
> > Looking forward to your feedback!
> > 
> > Thanks,
> > Cong Zhao
> 
> 

Re: [DISSCUSS] Don't retain null-key messages during topic compaction

Posted by mattison chao <ma...@gmail.com>.
Hi, Cong

IMO, Please do not break the previous directly. We can migrate it smoothly.  We can add a configuration and give the
Timeline of making this configuration default and removing it in the next release version.

For example:

- Add configuration: `compactionRemainNullKey=true` by default (current behaviour)
- Make `compactionRemainNullKey=false` default  in the 3.2.0
- Delete the configuration `compactionRemainNullKey` in 3.3.0.

This approach will avoid breaking changes and give our users enough time to migrate their usage.

Plus, I think it’s fair to cherry-pick it to all the previous active branches. 


Thanks!
Mattison



> On Nov 7, 2023, at 10:55, Cong Zhao <zh...@apache.org> wrote:
> 
> Hi, Pulsar community
> 
> Currently, we retain all null-key messages during topic compaction, which I
> don't think is necessary because when you use topic compaction,
> it means that you want to retain the value according to the key, so
> retaining null-key messages is meaningless.
> 
> Additionally, retaining all null-key messages will double the storage cost,
> and we'll never be able to clean them up since the compacted topic has not
> supported the retention policy yet.
> 
> In summary, I don't think we should retain null-key messages during topic
> compaction.
> Looking forward to your feedback!
> 
> Thanks,
> Cong Zhao


Re: [DISSCUSS] Don't retain null-key messages during topic compaction

Posted by Cong Zhao <zh...@apache.org>.
Hi Enrico,

I don't think they conflict. We can do them separately.

We can add the policy to the per topic and per namespace again when we need it in the future, and the current PIP can just add a configuration to broker.conf first so that we can not retain the null-key messages and cherry-pick this fix into other branches with `compactionRemainNullKey=true` default to keep compatibility.

Thanks,
Cong Zhao

On 2023/11/08 07:26:30 Enrico Olivelli wrote:
> This is a good point.
> 
> Is it worth to make it configurable per topic and per namespace?
> 
> Even if this behavior is kind if a side effect or bug, maybe some
> applications rely on it.
> Flags that change the behavior per cluster are hard to adopt in nig cluster
> with many tenants.
> 
> We should always take multi tenancy in mind when we design features or
> changes.
> 
> 
> Thanks
> Enrico
> 
> Il Mer 8 Nov 2023, 07:55 Cong Zhao <zh...@apache.org> ha scritto:
> 
> > Hello everyone,
> >
> > I open a PIP-318: https://github.com/apache/pulsar/pull/21541 for this
> > discussion.
> >
> > Any feedback and suggestions are welcome.
> >
> > Thanks,
> > Cong Zhao
> >
> > On 2023/11/07 02:55:52 Cong Zhao wrote:
> > > Hi, Pulsar community
> > >
> > > Currently, we retain all null-key messages during topic compaction,
> > which I
> > > don't think is necessary because when you use topic compaction,
> > > it means that you want to retain the value according to the key, so
> > > retaining null-key messages is meaningless.
> > >
> > > Additionally, retaining all null-key messages will double the storage
> > cost,
> > > and we'll never be able to clean them up since the compacted topic has
> > not
> > > supported the retention policy yet.
> > >
> > > In summary, I don't think we should retain null-key messages during topic
> > > compaction.
> > > Looking forward to your feedback!
> > >
> > > Thanks,
> > > Cong Zhao
> > >
> >
> 

Re: [DISSCUSS] Don't retain null-key messages during topic compaction

Posted by Enrico Olivelli <eo...@gmail.com>.
This is a good point.

Is it worth to make it configurable per topic and per namespace?

Even if this behavior is kind if a side effect or bug, maybe some
applications rely on it.
Flags that change the behavior per cluster are hard to adopt in nig cluster
with many tenants.

We should always take multi tenancy in mind when we design features or
changes.


Thanks
Enrico

Il Mer 8 Nov 2023, 07:55 Cong Zhao <zh...@apache.org> ha scritto:

> Hello everyone,
>
> I open a PIP-318: https://github.com/apache/pulsar/pull/21541 for this
> discussion.
>
> Any feedback and suggestions are welcome.
>
> Thanks,
> Cong Zhao
>
> On 2023/11/07 02:55:52 Cong Zhao wrote:
> > Hi, Pulsar community
> >
> > Currently, we retain all null-key messages during topic compaction,
> which I
> > don't think is necessary because when you use topic compaction,
> > it means that you want to retain the value according to the key, so
> > retaining null-key messages is meaningless.
> >
> > Additionally, retaining all null-key messages will double the storage
> cost,
> > and we'll never be able to clean them up since the compacted topic has
> not
> > supported the retention policy yet.
> >
> > In summary, I don't think we should retain null-key messages during topic
> > compaction.
> > Looking forward to your feedback!
> >
> > Thanks,
> > Cong Zhao
> >
>

Re: [DISSCUSS] Don't retain null-key messages during topic compaction

Posted by Cong Zhao <zh...@apache.org>.
Hello everyone,

I open a PIP-318: https://github.com/apache/pulsar/pull/21541 for this discussion.

Any feedback and suggestions are welcome.

Thanks,
Cong Zhao

On 2023/11/07 02:55:52 Cong Zhao wrote:
> Hi, Pulsar community
> 
> Currently, we retain all null-key messages during topic compaction, which I
> don't think is necessary because when you use topic compaction,
> it means that you want to retain the value according to the key, so
> retaining null-key messages is meaningless.
> 
> Additionally, retaining all null-key messages will double the storage cost,
> and we'll never be able to clean them up since the compacted topic has not
> supported the retention policy yet.
> 
> In summary, I don't think we should retain null-key messages during topic
> compaction.
> Looking forward to your feedback!
> 
> Thanks,
> Cong Zhao
>