You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pulsar.apache.org by Michael Marshall <mm...@apache.org> on 2022/02/01 05:23:06 UTC

Re: [DISCUSS] The default value of maxPendingChunkedMessage

> We found that there are inconsistencies between the code and the
> documentation regarding the default value of maxPendingChunkedMessage.

Great find!

> A chunked buffer to contain all chunks could use much memory, for example, if a
> message was split into N chunks, since each chunk is 5MB by default, then 100
> buffers will use N*500 MB. It could reach 1GB if N > 2.

This is a very good point.

I agree with updating the Javadoc to align with the actual code. This
will lead to fewer surprises, and as Yunze Xu pointed out, a 10x
increase in the default could have dramatic effects on client memory
usage.

Thanks,
Michael


On Sun, Jan 30, 2022 at 8:58 PM Zike Yang
<zk...@streamnative.io.invalid> wrote:
>
> Hi, Yunze,
>
> Thanks for your opinion.
>
> > A chunked buffer to contain all chunks could use much memory, for example, if a
> > message was split into N chunks, since each chunk is 5MB by default, then 100
> > buffers will use N*500 MB. It could reach 1GB if N > 2.
> >
> > In addition, normally, only if at least 100 producers sent messages to a
> > partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> > IMO, it's hard to see so many producers on a partition in production.
>
> +1. I agree with you. And keeping the current default value in the
> code (10) will not change the default behavior of the current client.
> If there are no other objections, I would like to fix this
> inconsistency in the java client.
>
> Thanks,
> Zike
>
>
> On Sun, Jan 30, 2022 at 7:26 PM Yunze Xu <yz...@streamnative.io.invalid> wrote:
> >
> > After thinking for a while, I’d prefer 10 as the default value and I changed
> > the default value to 10 in C++ client, see
> > https://github.com/apache/pulsar/pull/14070.
> >
> > A chunked buffer to contain all chunks could use much memory, for example, if a
> > message was split into N chunks, since each chunk is 5MB by default, then 100
> > buffers will use N*500 MB. It could reach 1GB if N > 2.
> >
> > In addition, normally, only if at least 100 producers sent messages to a
> > partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> > IMO, it's hard to see so many producers on a partition in production.
> >
> > Thanks,
> > Yunze Xu
> >
> > > 2022年1月30日 下午6:32,Zike Yang <zk...@streamnative.io.INVALID> 写道:
> > >
> > > Hi, Pulsar community,
> > >
> > > We found that there are inconsistencies between the code and the
> > > documentation regarding the default value of maxPendingChunkedMessage.
> > >
> > > In the java client code, we use 10 as the default value. [1] But in
> > > the java doc, we use 100 as the default value. [2]
> > > We need to fix this inconsistency. But what should we take as the
> > > default value? From the code or the doc? I would like to hear your
> > > discussions.
> > >
> > > [1] https://github.com/apache/pulsar/blob/d11147616aa6cc7888420f6325bb71cd7f7ab065/pulsar-client/src/main/java/org/apache/pulsar/client/impl/conf/ConsumerConfigurationData.java#L112-L113
> > > [2] https://github.com/apache/pulsar/blob/1e2ff8a3941b7cc6d583f528ceedc393b7e607fb/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/ConsumerBuilder.java#L690
> > >
> > > Thanks,
> > > Zike Yang
> >

Re: [DISCUSS] The default value of maxPendingChunkedMessage

Posted by Zike Yang <zk...@streamnative.io.INVALID>.
Thanks for all your suggestions.

> For further improvement, I think we can deprecate `maxPendingChunkedMessage` by extending the scope of  `ClientBuilder#memoryLimit` to consumers.

+1

I have created a PR to fix this inconsistency:
https://github.com/apache/pulsar/pull/14144 PTAL.

Thanks,
Zike Yang

On Sat, Feb 5, 2022 at 6:41 PM Haiting Jiang <ji...@apache.org> wrote:
>
> > I agree with updating the Javadoc to align with the actual code. This
> > will lead to fewer surprises,
>
> +1.
>
> For further improvement, I think we can deprecate `maxPendingChunkedMessage` by extending the scope of  `ClientBuilder#memoryLimit` to consumers.
>
> Thanks,
> Haiting
>
> On 2022/02/01 05:23:06 Michael Marshall wrote:
> > > We found that there are inconsistencies between the code and the
> > > documentation regarding the default value of maxPendingChunkedMessage.
> >
> > Great find!
> >
> > > A chunked buffer to contain all chunks could use much memory, for example, if a
> > > message was split into N chunks, since each chunk is 5MB by default, then 100
> > > buffers will use N*500 MB. It could reach 1GB if N > 2.
> >
> > This is a very good point.
> >
> > I agree with updating the Javadoc to align with the actual code. This
> > will lead to fewer surprises, and as Yunze Xu pointed out, a 10x
> > increase in the default could have dramatic effects on client memory
> > usage.
> >
> > Thanks,
> > Michael
> >
> >
> > On Sun, Jan 30, 2022 at 8:58 PM Zike Yang
> > <zk...@streamnative.io.invalid> wrote:
> > >
> > > Hi, Yunze,
> > >
> > > Thanks for your opinion.
> > >
> > > > A chunked buffer to contain all chunks could use much memory, for example, if a
> > > > message was split into N chunks, since each chunk is 5MB by default, then 100
> > > > buffers will use N*500 MB. It could reach 1GB if N > 2.
> > > >
> > > > In addition, normally, only if at least 100 producers sent messages to a
> > > > partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> > > > IMO, it's hard to see so many producers on a partition in production.
> > >
> > > +1. I agree with you. And keeping the current default value in the
> > > code (10) will not change the default behavior of the current client.
> > > If there are no other objections, I would like to fix this
> > > inconsistency in the java client.
> > >
> > > Thanks,
> > > Zike
> > >
> > >
> > > On Sun, Jan 30, 2022 at 7:26 PM Yunze Xu <yz...@streamnative.io.invalid> wrote:
> > > >
> > > > After thinking for a while, I’d prefer 10 as the default value and I changed
> > > > the default value to 10 in C++ client, see
> > > > https://github.com/apache/pulsar/pull/14070.
> > > >
> > > > A chunked buffer to contain all chunks could use much memory, for example, if a
> > > > message was split into N chunks, since each chunk is 5MB by default, then 100
> > > > buffers will use N*500 MB. It could reach 1GB if N > 2.
> > > >
> > > > In addition, normally, only if at least 100 producers sent messages to a
> > > > partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> > > > IMO, it's hard to see so many producers on a partition in production.
> > > >
> > > > Thanks,
> > > > Yunze Xu
> > > >
> > > > > 2022年1月30日 下午6:32,Zike Yang <zk...@streamnative.io.INVALID> 写道:
> > > > >
> > > > > Hi, Pulsar community,
> > > > >
> > > > > We found that there are inconsistencies between the code and the
> > > > > documentation regarding the default value of maxPendingChunkedMessage.
> > > > >
> > > > > In the java client code, we use 10 as the default value. [1] But in
> > > > > the java doc, we use 100 as the default value. [2]
> > > > > We need to fix this inconsistency. But what should we take as the
> > > > > default value? From the code or the doc? I would like to hear your
> > > > > discussions.
> > > > >
> > > > > [1] https://github.com/apache/pulsar/blob/d11147616aa6cc7888420f6325bb71cd7f7ab065/pulsar-client/src/main/java/org/apache/pulsar/client/impl/conf/ConsumerConfigurationData.java#L112-L113
> > > > > [2] https://github.com/apache/pulsar/blob/1e2ff8a3941b7cc6d583f528ceedc393b7e607fb/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/ConsumerBuilder.java#L690
> > > > >
> > > > > Thanks,
> > > > > Zike Yang
> > > >
> >

Re: [DISCUSS] The default value of maxPendingChunkedMessage

Posted by Haiting Jiang <ji...@apache.org>.
> I agree with updating the Javadoc to align with the actual code. This
> will lead to fewer surprises,

+1. 

For further improvement, I think we can deprecate `maxPendingChunkedMessage` by extending the scope of  `ClientBuilder#memoryLimit` to consumers. 

Thanks,
Haiting

On 2022/02/01 05:23:06 Michael Marshall wrote:
> > We found that there are inconsistencies between the code and the
> > documentation regarding the default value of maxPendingChunkedMessage.
> 
> Great find!
> 
> > A chunked buffer to contain all chunks could use much memory, for example, if a
> > message was split into N chunks, since each chunk is 5MB by default, then 100
> > buffers will use N*500 MB. It could reach 1GB if N > 2.
> 
> This is a very good point.
> 
> I agree with updating the Javadoc to align with the actual code. This
> will lead to fewer surprises, and as Yunze Xu pointed out, a 10x
> increase in the default could have dramatic effects on client memory
> usage.
> 
> Thanks,
> Michael
> 
> 
> On Sun, Jan 30, 2022 at 8:58 PM Zike Yang
> <zk...@streamnative.io.invalid> wrote:
> >
> > Hi, Yunze,
> >
> > Thanks for your opinion.
> >
> > > A chunked buffer to contain all chunks could use much memory, for example, if a
> > > message was split into N chunks, since each chunk is 5MB by default, then 100
> > > buffers will use N*500 MB. It could reach 1GB if N > 2.
> > >
> > > In addition, normally, only if at least 100 producers sent messages to a
> > > partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> > > IMO, it's hard to see so many producers on a partition in production.
> >
> > +1. I agree with you. And keeping the current default value in the
> > code (10) will not change the default behavior of the current client.
> > If there are no other objections, I would like to fix this
> > inconsistency in the java client.
> >
> > Thanks,
> > Zike
> >
> >
> > On Sun, Jan 30, 2022 at 7:26 PM Yunze Xu <yz...@streamnative.io.invalid> wrote:
> > >
> > > After thinking for a while, I’d prefer 10 as the default value and I changed
> > > the default value to 10 in C++ client, see
> > > https://github.com/apache/pulsar/pull/14070.
> > >
> > > A chunked buffer to contain all chunks could use much memory, for example, if a
> > > message was split into N chunks, since each chunk is 5MB by default, then 100
> > > buffers will use N*500 MB. It could reach 1GB if N > 2.
> > >
> > > In addition, normally, only if at least 100 producers sent messages to a
> > > partition would it be meaningful to configure maxPendingChunkedMessages to 100.
> > > IMO, it's hard to see so many producers on a partition in production.
> > >
> > > Thanks,
> > > Yunze Xu
> > >
> > > > 2022年1月30日 下午6:32,Zike Yang <zk...@streamnative.io.INVALID> 写道:
> > > >
> > > > Hi, Pulsar community,
> > > >
> > > > We found that there are inconsistencies between the code and the
> > > > documentation regarding the default value of maxPendingChunkedMessage.
> > > >
> > > > In the java client code, we use 10 as the default value. [1] But in
> > > > the java doc, we use 100 as the default value. [2]
> > > > We need to fix this inconsistency. But what should we take as the
> > > > default value? From the code or the doc? I would like to hear your
> > > > discussions.
> > > >
> > > > [1] https://github.com/apache/pulsar/blob/d11147616aa6cc7888420f6325bb71cd7f7ab065/pulsar-client/src/main/java/org/apache/pulsar/client/impl/conf/ConsumerConfigurationData.java#L112-L113
> > > > [2] https://github.com/apache/pulsar/blob/1e2ff8a3941b7cc6d583f528ceedc393b7e607fb/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/ConsumerBuilder.java#L690
> > > >
> > > > Thanks,
> > > > Zike Yang
> > >
>