You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pulsar.apache.org by Zike Yang <zk...@streamnative.io.INVALID> on 2022/01/30 10:32:35 UTC

[DISCUSS] The default value of maxPendingChunkedMessage

Hi, Pulsar community,

We found that there are inconsistencies between the code and the
documentation regarding the default value of maxPendingChunkedMessage.

In the java client code, we use 10 as the default value. [1] But in
the java doc, we use 100 as the default value. [2]
We need to fix this inconsistency. But what should we take as the
default value? From the code or the doc? I would like to hear your
discussions.

[1] https://github.com/apache/pulsar/blob/d11147616aa6cc7888420f6325bb71cd7f7ab065/pulsar-client/src/main/java/org/apache/pulsar/client/impl/conf/ConsumerConfigurationData.java#L112-L113
[2] https://github.com/apache/pulsar/blob/1e2ff8a3941b7cc6d583f528ceedc393b7e607fb/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/ConsumerBuilder.java#L690

Thanks,
Zike Yang

Re: [DISCUSS] The default value of maxPendingChunkedMessage

Posted by Zike Yang <zk...@streamnative.io.INVALID>.
Thanks for all your suggestions.

> For further improvement, I think we can deprecate `maxPendingChunkedMessage` by extending the scope of  `ClientBuilder#memoryLimit` to consumers.

+1

I have created a PR to fix this inconsistency:
https://github.com/apache/pulsar/pull/14144 PTAL.

Thanks,
Zike Yang

On Sat, Feb 5, 2022 at 6:41 PM Haiting Jiang <ji...@apache.org> wrote:
>
> > I agree with updating the Javadoc to align with the actual code. This
> > will lead to fewer surprises,
>
> +1.
>
> For further improvement, I think we can deprecate `maxPendingChunkedMessage` by extending the scope of  `ClientBuilder#memoryLimit` to consumers.
>
> Thanks,
> Haiting
>
> On 2022/02/01 05:23:06 Michael Marshall wrote:
> > > We found that there are inconsistencies between the code and the
> > > documentation regarding the default value of maxPendingChunkedMessage.
> >
> > Great find!
> >
> > > A chunked buffer to contain all chunks could use much memory, for example, if a
> > > message was split into N chunks, since each chunk is 5MB by default, then 100
> > > buffers will use N*500 MB. It could reach 1GB if N > 2.
> >
> > This is a very good point.
> >
> > I agree with updating the Javadoc to align with the actual code. This
> > will lead to fewer surprises, and as Yunze Xu pointed out, a 10x
> > increase in the default could have dramatic effects on client memory
> > usage.
> >
> > Thanks,
> > Michael
> >
> >
> > On Sun, Jan 30, 2022 at 8:58 PM Zike Yang
> > <zk...@streamnative.io.invalid> wrote:
> > >
> > > Hi, Yunze,
> > >
> > > Thanks for your opinion.
> > >
> > > > A chunked buffer to contain all chunks could use much memory, for example, if a
> > > > message was split into N chunks, since each chunk is 5MB by default, then 100
> > > > buffers will use N*500 MB. It could reach 1GB if N > 2.
> > > >
> > > > In addition, normally, only if at least 100 producers sent messages to a
> > > > partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> > > > IMO, it's hard to see so many producers on a partition in production.
> > >
> > > +1. I agree with you. And keeping the current default value in the
> > > code (10) will not change the default behavior of the current client.
> > > If there are no other objections, I would like to fix this
> > > inconsistency in the java client.
> > >
> > > Thanks,
> > > Zike
> > >
> > >
> > > On Sun, Jan 30, 2022 at 7:26 PM Yunze Xu <yz...@streamnative.io.invalid> wrote:
> > > >
> > > > After thinking for a while, I’d prefer 10 as the default value and I changed
> > > > the default value to 10 in C++ client, see
> > > > https://github.com/apache/pulsar/pull/14070.
> > > >
> > > > A chunked buffer to contain all chunks could use much memory, for example, if a
> > > > message was split into N chunks, since each chunk is 5MB by default, then 100
> > > > buffers will use N*500 MB. It could reach 1GB if N > 2.
> > > >
> > > > In addition, normally, only if at least 100 producers sent messages to a
> > > > partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> > > > IMO, it's hard to see so many producers on a partition in production.
> > > >
> > > > Thanks,
> > > > Yunze Xu
> > > >
> > > > > 2022年1月30日 下午6:32,Zike Yang <zk...@streamnative.io.INVALID> 写道:
> > > > >
> > > > > Hi, Pulsar community,
> > > > >
> > > > > We found that there are inconsistencies between the code and the
> > > > > documentation regarding the default value of maxPendingChunkedMessage.
> > > > >
> > > > > In the java client code, we use 10 as the default value. [1] But in
> > > > > the java doc, we use 100 as the default value. [2]
> > > > > We need to fix this inconsistency. But what should we take as the
> > > > > default value? From the code or the doc? I would like to hear your
> > > > > discussions.
> > > > >
> > > > > [1] https://github.com/apache/pulsar/blob/d11147616aa6cc7888420f6325bb71cd7f7ab065/pulsar-client/src/main/java/org/apache/pulsar/client/impl/conf/ConsumerConfigurationData.java#L112-L113
> > > > > [2] https://github.com/apache/pulsar/blob/1e2ff8a3941b7cc6d583f528ceedc393b7e607fb/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/ConsumerBuilder.java#L690
> > > > >
> > > > > Thanks,
> > > > > Zike Yang
> > > >
> >

Re: [DISCUSS] The default value of maxPendingChunkedMessage

Posted by Haiting Jiang <ji...@apache.org>.
> I agree with updating the Javadoc to align with the actual code. This
> will lead to fewer surprises,

+1. 

For further improvement, I think we can deprecate `maxPendingChunkedMessage` by extending the scope of  `ClientBuilder#memoryLimit` to consumers. 

Thanks,
Haiting

On 2022/02/01 05:23:06 Michael Marshall wrote:
> > We found that there are inconsistencies between the code and the
> > documentation regarding the default value of maxPendingChunkedMessage.
> 
> Great find!
> 
> > A chunked buffer to contain all chunks could use much memory, for example, if a
> > message was split into N chunks, since each chunk is 5MB by default, then 100
> > buffers will use N*500 MB. It could reach 1GB if N > 2.
> 
> This is a very good point.
> 
> I agree with updating the Javadoc to align with the actual code. This
> will lead to fewer surprises, and as Yunze Xu pointed out, a 10x
> increase in the default could have dramatic effects on client memory
> usage.
> 
> Thanks,
> Michael
> 
> 
> On Sun, Jan 30, 2022 at 8:58 PM Zike Yang
> <zk...@streamnative.io.invalid> wrote:
> >
> > Hi, Yunze,
> >
> > Thanks for your opinion.
> >
> > > A chunked buffer to contain all chunks could use much memory, for example, if a
> > > message was split into N chunks, since each chunk is 5MB by default, then 100
> > > buffers will use N*500 MB. It could reach 1GB if N > 2.
> > >
> > > In addition, normally, only if at least 100 producers sent messages to a
> > > partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> > > IMO, it's hard to see so many producers on a partition in production.
> >
> > +1. I agree with you. And keeping the current default value in the
> > code (10) will not change the default behavior of the current client.
> > If there are no other objections, I would like to fix this
> > inconsistency in the java client.
> >
> > Thanks,
> > Zike
> >
> >
> > On Sun, Jan 30, 2022 at 7:26 PM Yunze Xu <yz...@streamnative.io.invalid> wrote:
> > >
> > > After thinking for a while, I’d prefer 10 as the default value and I changed
> > > the default value to 10 in C++ client, see
> > > https://github.com/apache/pulsar/pull/14070.
> > >
> > > A chunked buffer to contain all chunks could use much memory, for example, if a
> > > message was split into N chunks, since each chunk is 5MB by default, then 100
> > > buffers will use N*500 MB. It could reach 1GB if N > 2.
> > >
> > > In addition, normally, only if at least 100 producers sent messages to a
> > > partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> > > IMO, it's hard to see so many producers on a partition in production.
> > >
> > > Thanks,
> > > Yunze Xu
> > >
> > > > 2022年1月30日 下午6:32,Zike Yang <zk...@streamnative.io.INVALID> 写道:
> > > >
> > > > Hi, Pulsar community,
> > > >
> > > > We found that there are inconsistencies between the code and the
> > > > documentation regarding the default value of maxPendingChunkedMessage.
> > > >
> > > > In the java client code, we use 10 as the default value. [1] But in
> > > > the java doc, we use 100 as the default value. [2]
> > > > We need to fix this inconsistency. But what should we take as the
> > > > default value? From the code or the doc? I would like to hear your
> > > > discussions.
> > > >
> > > > [1] https://github.com/apache/pulsar/blob/d11147616aa6cc7888420f6325bb71cd7f7ab065/pulsar-client/src/main/java/org/apache/pulsar/client/impl/conf/ConsumerConfigurationData.java#L112-L113
> > > > [2] https://github.com/apache/pulsar/blob/1e2ff8a3941b7cc6d583f528ceedc393b7e607fb/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/ConsumerBuilder.java#L690
> > > >
> > > > Thanks,
> > > > Zike Yang
> > >
> 

Re: [DISCUSS] The default value of maxPendingChunkedMessage

Posted by Michael Marshall <mm...@apache.org>.
> We found that there are inconsistencies between the code and the
> documentation regarding the default value of maxPendingChunkedMessage.

Great find!

> A chunked buffer to contain all chunks could use much memory, for example, if a
> message was split into N chunks, since each chunk is 5MB by default, then 100
> buffers will use N*500 MB. It could reach 1GB if N > 2.

This is a very good point.

I agree with updating the Javadoc to align with the actual code. This
will lead to fewer surprises, and as Yunze Xu pointed out, a 10x
increase in the default could have dramatic effects on client memory
usage.

Thanks,
Michael


On Sun, Jan 30, 2022 at 8:58 PM Zike Yang
<zk...@streamnative.io.invalid> wrote:
>
> Hi, Yunze,
>
> Thanks for your opinion.
>
> > A chunked buffer to contain all chunks could use much memory, for example, if a
> > message was split into N chunks, since each chunk is 5MB by default, then 100
> > buffers will use N*500 MB. It could reach 1GB if N > 2.
> >
> > In addition, normally, only if at least 100 producers sent messages to a
> > partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> > IMO, it's hard to see so many producers on a partition in production.
>
> +1. I agree with you. And keeping the current default value in the
> code (10) will not change the default behavior of the current client.
> If there are no other objections, I would like to fix this
> inconsistency in the java client.
>
> Thanks,
> Zike
>
>
> On Sun, Jan 30, 2022 at 7:26 PM Yunze Xu <yz...@streamnative.io.invalid> wrote:
> >
> > After thinking for a while, I’d prefer 10 as the default value and I changed
> > the default value to 10 in C++ client, see
> > https://github.com/apache/pulsar/pull/14070.
> >
> > A chunked buffer to contain all chunks could use much memory, for example, if a
> > message was split into N chunks, since each chunk is 5MB by default, then 100
> > buffers will use N*500 MB. It could reach 1GB if N > 2.
> >
> > In addition, normally, only if at least 100 producers sent messages to a
> > partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> > IMO, it's hard to see so many producers on a partition in production.
> >
> > Thanks,
> > Yunze Xu
> >
> > > 2022年1月30日 下午6:32,Zike Yang <zk...@streamnative.io.INVALID> 写道:
> > >
> > > Hi, Pulsar community,
> > >
> > > We found that there are inconsistencies between the code and the
> > > documentation regarding the default value of maxPendingChunkedMessage.
> > >
> > > In the java client code, we use 10 as the default value. [1] But in
> > > the java doc, we use 100 as the default value. [2]
> > > We need to fix this inconsistency. But what should we take as the
> > > default value? From the code or the doc? I would like to hear your
> > > discussions.
> > >
> > > [1] https://github.com/apache/pulsar/blob/d11147616aa6cc7888420f6325bb71cd7f7ab065/pulsar-client/src/main/java/org/apache/pulsar/client/impl/conf/ConsumerConfigurationData.java#L112-L113
> > > [2] https://github.com/apache/pulsar/blob/1e2ff8a3941b7cc6d583f528ceedc393b7e607fb/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/ConsumerBuilder.java#L690
> > >
> > > Thanks,
> > > Zike Yang
> >

Re: [DISCUSS] The default value of maxPendingChunkedMessage

Posted by Zike Yang <zk...@streamnative.io.INVALID>.
Hi, Yunze,

Thanks for your opinion.

> A chunked buffer to contain all chunks could use much memory, for example, if a
> message was split into N chunks, since each chunk is 5MB by default, then 100
> buffers will use N*500 MB. It could reach 1GB if N > 2.
>
> In addition, normally, only if at least 100 producers sent messages to a
> partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> IMO, it's hard to see so many producers on a partition in production.

+1. I agree with you. And keeping the current default value in the
code (10) will not change the default behavior of the current client.
If there are no other objections, I would like to fix this
inconsistency in the java client.

Thanks,
Zike


On Sun, Jan 30, 2022 at 7:26 PM Yunze Xu <yz...@streamnative.io.invalid> wrote:
>
> After thinking for a while, I’d prefer 10 as the default value and I changed
> the default value to 10 in C++ client, see
> https://github.com/apache/pulsar/pull/14070.
>
> A chunked buffer to contain all chunks could use much memory, for example, if a
> message was split into N chunks, since each chunk is 5MB by default, then 100
> buffers will use N*500 MB. It could reach 1GB if N > 2.
>
> In addition, normally, only if at least 100 producers sent messages to a
> partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> IMO, it's hard to see so many producers on a partition in production.
>
> Thanks,
> Yunze Xu
>
> > 2022年1月30日 下午6:32,Zike Yang <zk...@streamnative.io.INVALID> 写道:
> >
> > Hi, Pulsar community,
> >
> > We found that there are inconsistencies between the code and the
> > documentation regarding the default value of maxPendingChunkedMessage.
> >
> > In the java client code, we use 10 as the default value. [1] But in
> > the java doc, we use 100 as the default value. [2]
> > We need to fix this inconsistency. But what should we take as the
> > default value? From the code or the doc? I would like to hear your
> > discussions.
> >
> > [1] https://github.com/apache/pulsar/blob/d11147616aa6cc7888420f6325bb71cd7f7ab065/pulsar-client/src/main/java/org/apache/pulsar/client/impl/conf/ConsumerConfigurationData.java#L112-L113
> > [2] https://github.com/apache/pulsar/blob/1e2ff8a3941b7cc6d583f528ceedc393b7e607fb/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/ConsumerBuilder.java#L690
> >
> > Thanks,
> > Zike Yang
>

Re: [DISCUSS] The default value of maxPendingChunkedMessage

Posted by Yunze Xu <yz...@streamnative.io.INVALID>.
After thinking for a while, I’d prefer 10 as the default value and I changed
the default value to 10 in C++ client, see
https://github.com/apache/pulsar/pull/14070.

A chunked buffer to contain all chunks could use much memory, for example, if a
message was split into N chunks, since each chunk is 5MB by default, then 100
buffers will use N*500 MB. It could reach 1GB if N > 2.

In addition, normally, only if at least 100 producers sent messages to a
partition would it be meaningful to configure maxPendingChunkedMessages to 100.
IMO, it's hard to see so many producers on a partition in production.

Thanks,
Yunze Xu

> 2022年1月30日 下午6:32,Zike Yang <zk...@streamnative.io.INVALID> 写道:
> 
> Hi, Pulsar community,
> 
> We found that there are inconsistencies between the code and the
> documentation regarding the default value of maxPendingChunkedMessage.
> 
> In the java client code, we use 10 as the default value. [1] But in
> the java doc, we use 100 as the default value. [2]
> We need to fix this inconsistency. But what should we take as the
> default value? From the code or the doc? I would like to hear your
> discussions.
> 
> [1] https://github.com/apache/pulsar/blob/d11147616aa6cc7888420f6325bb71cd7f7ab065/pulsar-client/src/main/java/org/apache/pulsar/client/impl/conf/ConsumerConfigurationData.java#L112-L113
> [2] https://github.com/apache/pulsar/blob/1e2ff8a3941b7cc6d583f528ceedc393b7e607fb/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/ConsumerBuilder.java#L690
> 
> Thanks,
> Zike Yang