You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pulsar.apache.org by Haiting Jiang <ji...@apache.org> on 2021/12/31 12:05:54 UTC

[DISCUSSION] PIP-131: Include message header size when check maxMessageSize of non-batch message on the client side.

https://github.com/apache/pulsar/issues/13591

Pasted below for quoting convenience.

——

## Motivation

Currently, Pulsar client (Java) only checks payload size for max message size validation.

Client throws TimeoutException if we produce a message with too many properties, see [1].
But the root cause is that is trigged TooLongFrameException in broker server.

In this PIP, I propose to include message header size when check maxMessageSize of non-batch
messages, this brings the following benefits.
1. Clients can throw InvalidMessageException immediately if properties takes too much storage space.
2. This will make the behaviour consistent with topic level max message size check in broker.
3. Strictly limit the entry size less than maxMessageSize, avoid sending message to bookkeeper failed.

## Goal

Include message header size when check maxMessageSize for non-batch message on the client side.

## Implementation

```
// Add a size check in org.apache.pulsar.client.impl.ProducerImpl#processOpSendMsg
if (op.msg != null // for non-batch messages only
    && op.getMessageHeaderAndPayloadSize() > ClientCnx.getMaxMessageSize()) {
    // finish send op with InvalidMessageException
    releaseSemaphoreForSendOp(op);
    op.sendComplete(new PulsarClientException(new InvalidMessageException, op.sequenceId));
}


// org.apache.pulsar.client.impl.ProducerImpl.OpSendMsg#getMessageHeaderAndPayloadSize

public int getMessageHeaderAndPayloadSize() {
    ByteBuf cmdHeader = cmd.getFirst();
    cmdHeader.markReaderIndex();
    int totalSize = cmdHeader.readInt();
    int cmdSize = cmdHeader.readInt();
    int msgHeadersAndPayloadSize = totalSize - cmdSize - 4;
    cmdHeader.resetReaderIndex();
    return msgHeadersAndPayloadSize;
}
```

## Reject Alternatives
Add a new property like "maxPropertiesSize" or "maxHeaderSize" in broker.conf and pass it to 
client like maxMessageSize. But the implementation is much more complex, and don't have the 
benefit 2 and 3 mentioned in Motivation.

## Compatibility Issue
As a matter of fact, this PIP narrows down the sendable range. Previously, when maxMessageSize
is 1KB, it's ok to send message with 1KB properties and 1KB payload. But with this PIP, the 
sending will fail with InvalidMessageException.

One conservative way is to add a boolean config "includeHeaderInSizeCheck" to enable this 
feature. But I think it's OK to enable this directly as it's more reasonable, and I don't see good 
migration plan if we add a config for this.

The compatibility issue is worth discussing. And any suggestions are appreciated.

[1] https://github.com/apache/pulsar/issues/13560

Thanks,
Haiting Jiang

Re: [DISCUSSION] PIP-132: Include message header size when check maxMessageSize of non-batch message on the client side.

Posted by Haiting Jiang <ji...@apache.org>.
Hi Zike,

Thank you for pointing this out. Indeed, the metadata size is not exactly the same.

I think we don't need to calculate the exact final message size for chunking size, just need 
to make sure all chunked message can be produced successfully, as we don't check max 
message size if chunking is enabled, it's default behavior in client and PIP-131 disabled the 
check on broker side.

So in order to produce chunk message successfully, we can just use
`ClientCnx.getMaxMessageSize() - msgMetadata.getSerializedSize()` as chunk size.
as the varint field won't take too much space (several bytes each), and broker have 10KB 
buffer for maxMessageSize.

Thanks,
Haiting Jiang


On 2022/01/05 08:11:06 Zike Yang wrote:
> > as we have the same header size for each chunk message.
> 
> The message metadata of different chunks will not be the same. There
> are serval fields with different values such as chunk id, publish
> time. Because some type of that fields is varint, I think the size of
> the header is not always the same.
> 
> 
> On Tue, Jan 4, 2022 at 11:51 AM Haiting Jiang <ji...@apache.org> wrote:
> >
> > Good point,  I think we should shrink chunk size to "ClientCnx.getMaxMessageSize() - chunkMessageHeaderSize", as we have the same header size for each chunk message.
> >
> > Thanks,
> > Haiting Jiang
> >
> > On 2022/01/04 03:27:00 Zike Yang wrote:
> > > But how do we handle chunked messages? The chunked message is split
> > > based on the maxMessageSize(max payload size). This would seem to make
> > > `op.getMessageHeaderAndPayloadSize() > ClientCnx.getMaxMessageSize()`
> > > always true.
> > >
> > > Thanks,
> > > Zike
> > >
> > > On Fri, Dec 31, 2021 at 8:11 PM Haiting Jiang <ji...@apache.org> wrote:
> > > >
> > > > Sorry, it should be PIP-132.
> > > >
> > > > Thanks,
> > > > Haiting Jiang
> > > >
> > > > On 2021/12/31 12:05:54 Haiting Jiang wrote:
> > > > > https://github.com/apache/pulsar/issues/13591
> > > > >
> > > > > Pasted below for quoting convenience.
> > > > >
> > > > > ——
> > > > >
> > > > > ## Motivation
> > > > >
> > > > > Currently, Pulsar client (Java) only checks payload size for max message size validation.
> > > > >
> > > > > Client throws TimeoutException if we produce a message with too many properties, see [1].
> > > > > But the root cause is that is trigged TooLongFrameException in broker server.
> > > > >
> > > > > In this PIP, I propose to include message header size when check maxMessageSize of non-batch
> > > > > messages, this brings the following benefits.
> > > > > 1. Clients can throw InvalidMessageException immediately if properties takes too much storage space.
> > > > > 2. This will make the behaviour consistent with topic level max message size check in broker.
> > > > > 3. Strictly limit the entry size less than maxMessageSize, avoid sending message to bookkeeper failed.
> > > > >
> > > > > ## Goal
> > > > >
> > > > > Include message header size when check maxMessageSize for non-batch message on the client side.
> > > > >
> > > > > ## Implementation
> > > > >
> > > > > ```
> > > > > // Add a size check in org.apache.pulsar.client.impl.ProducerImpl#processOpSendMsg
> > > > > if (op.msg != null // for non-batch messages only
> > > > >     && op.getMessageHeaderAndPayloadSize() > ClientCnx.getMaxMessageSize()) {
> > > > >     // finish send op with InvalidMessageException
> > > > >     releaseSemaphoreForSendOp(op);
> > > > >     op.sendComplete(new PulsarClientException(new InvalidMessageException, op.sequenceId));
> > > > > }
> > > > >
> > > > >
> > > > > // org.apache.pulsar.client.impl.ProducerImpl.OpSendMsg#getMessageHeaderAndPayloadSize
> > > > >
> > > > > public int getMessageHeaderAndPayloadSize() {
> > > > >     ByteBuf cmdHeader = cmd.getFirst();
> > > > >     cmdHeader.markReaderIndex();
> > > > >     int totalSize = cmdHeader.readInt();
> > > > >     int cmdSize = cmdHeader.readInt();
> > > > >     int msgHeadersAndPayloadSize = totalSize - cmdSize - 4;
> > > > >     cmdHeader.resetReaderIndex();
> > > > >     return msgHeadersAndPayloadSize;
> > > > > }
> > > > > ```
> > > > >
> > > > > ## Reject Alternatives
> > > > > Add a new property like "maxPropertiesSize" or "maxHeaderSize" in broker.conf and pass it to
> > > > > client like maxMessageSize. But the implementation is much more complex, and don't have the
> > > > > benefit 2 and 3 mentioned in Motivation.
> > > > >
> > > > > ## Compatibility Issue
> > > > > As a matter of fact, this PIP narrows down the sendable range. Previously, when maxMessageSize
> > > > > is 1KB, it's ok to send message with 1KB properties and 1KB payload. But with this PIP, the
> > > > > sending will fail with InvalidMessageException.
> > > > >
> > > > > One conservative way is to add a boolean config "includeHeaderInSizeCheck" to enable this
> > > > > feature. But I think it's OK to enable this directly as it's more reasonable, and I don't see good
> > > > > migration plan if we add a config for this.
> > > > >
> > > > > The compatibility issue is worth discussing. And any suggestions are appreciated.
> > > > >
> > > > > [1] https://github.com/apache/pulsar/issues/13560
> > > > >
> > > > > Thanks,
> > > > > Haiting Jiang
> > > > >
> > >
> > >
> > >
> > > --
> > > Zike Yang
> > >
> 
> 
> 
> -- 
> Zike Yang
> 

Re: [DISCUSSION] PIP-132: Include message header size when check maxMessageSize of non-batch message on the client side.

Posted by Zike Yang <zk...@streamnative.io.INVALID>.
> as we have the same header size for each chunk message.

The message metadata of different chunks will not be the same. There
are serval fields with different values such as chunk id, publish
time. Because some type of that fields is varint, I think the size of
the header is not always the same.


On Tue, Jan 4, 2022 at 11:51 AM Haiting Jiang <ji...@apache.org> wrote:
>
> Good point,  I think we should shrink chunk size to "ClientCnx.getMaxMessageSize() - chunkMessageHeaderSize", as we have the same header size for each chunk message.
>
> Thanks,
> Haiting Jiang
>
> On 2022/01/04 03:27:00 Zike Yang wrote:
> > But how do we handle chunked messages? The chunked message is split
> > based on the maxMessageSize(max payload size). This would seem to make
> > `op.getMessageHeaderAndPayloadSize() > ClientCnx.getMaxMessageSize()`
> > always true.
> >
> > Thanks,
> > Zike
> >
> > On Fri, Dec 31, 2021 at 8:11 PM Haiting Jiang <ji...@apache.org> wrote:
> > >
> > > Sorry, it should be PIP-132.
> > >
> > > Thanks,
> > > Haiting Jiang
> > >
> > > On 2021/12/31 12:05:54 Haiting Jiang wrote:
> > > > https://github.com/apache/pulsar/issues/13591
> > > >
> > > > Pasted below for quoting convenience.
> > > >
> > > > ——
> > > >
> > > > ## Motivation
> > > >
> > > > Currently, Pulsar client (Java) only checks payload size for max message size validation.
> > > >
> > > > Client throws TimeoutException if we produce a message with too many properties, see [1].
> > > > But the root cause is that is trigged TooLongFrameException in broker server.
> > > >
> > > > In this PIP, I propose to include message header size when check maxMessageSize of non-batch
> > > > messages, this brings the following benefits.
> > > > 1. Clients can throw InvalidMessageException immediately if properties takes too much storage space.
> > > > 2. This will make the behaviour consistent with topic level max message size check in broker.
> > > > 3. Strictly limit the entry size less than maxMessageSize, avoid sending message to bookkeeper failed.
> > > >
> > > > ## Goal
> > > >
> > > > Include message header size when check maxMessageSize for non-batch message on the client side.
> > > >
> > > > ## Implementation
> > > >
> > > > ```
> > > > // Add a size check in org.apache.pulsar.client.impl.ProducerImpl#processOpSendMsg
> > > > if (op.msg != null // for non-batch messages only
> > > >     && op.getMessageHeaderAndPayloadSize() > ClientCnx.getMaxMessageSize()) {
> > > >     // finish send op with InvalidMessageException
> > > >     releaseSemaphoreForSendOp(op);
> > > >     op.sendComplete(new PulsarClientException(new InvalidMessageException, op.sequenceId));
> > > > }
> > > >
> > > >
> > > > // org.apache.pulsar.client.impl.ProducerImpl.OpSendMsg#getMessageHeaderAndPayloadSize
> > > >
> > > > public int getMessageHeaderAndPayloadSize() {
> > > >     ByteBuf cmdHeader = cmd.getFirst();
> > > >     cmdHeader.markReaderIndex();
> > > >     int totalSize = cmdHeader.readInt();
> > > >     int cmdSize = cmdHeader.readInt();
> > > >     int msgHeadersAndPayloadSize = totalSize - cmdSize - 4;
> > > >     cmdHeader.resetReaderIndex();
> > > >     return msgHeadersAndPayloadSize;
> > > > }
> > > > ```
> > > >
> > > > ## Reject Alternatives
> > > > Add a new property like "maxPropertiesSize" or "maxHeaderSize" in broker.conf and pass it to
> > > > client like maxMessageSize. But the implementation is much more complex, and don't have the
> > > > benefit 2 and 3 mentioned in Motivation.
> > > >
> > > > ## Compatibility Issue
> > > > As a matter of fact, this PIP narrows down the sendable range. Previously, when maxMessageSize
> > > > is 1KB, it's ok to send message with 1KB properties and 1KB payload. But with this PIP, the
> > > > sending will fail with InvalidMessageException.
> > > >
> > > > One conservative way is to add a boolean config "includeHeaderInSizeCheck" to enable this
> > > > feature. But I think it's OK to enable this directly as it's more reasonable, and I don't see good
> > > > migration plan if we add a config for this.
> > > >
> > > > The compatibility issue is worth discussing. And any suggestions are appreciated.
> > > >
> > > > [1] https://github.com/apache/pulsar/issues/13560
> > > >
> > > > Thanks,
> > > > Haiting Jiang
> > > >
> >
> >
> >
> > --
> > Zike Yang
> >



-- 
Zike Yang

Re: [DISCUSSION] PIP-132: Include message header size when check maxMessageSize of non-batch message on the client side.

Posted by Haiting Jiang <ji...@apache.org>.
Good point,  I think we should shrink chunk size to "ClientCnx.getMaxMessageSize() - chunkMessageHeaderSize", as we have the same header size for each chunk message. 

Thanks,
Haiting Jiang

On 2022/01/04 03:27:00 Zike Yang wrote:
> But how do we handle chunked messages? The chunked message is split
> based on the maxMessageSize(max payload size). This would seem to make
> `op.getMessageHeaderAndPayloadSize() > ClientCnx.getMaxMessageSize()`
> always true.
> 
> Thanks,
> Zike
> 
> On Fri, Dec 31, 2021 at 8:11 PM Haiting Jiang <ji...@apache.org> wrote:
> >
> > Sorry, it should be PIP-132.
> >
> > Thanks,
> > Haiting Jiang
> >
> > On 2021/12/31 12:05:54 Haiting Jiang wrote:
> > > https://github.com/apache/pulsar/issues/13591
> > >
> > > Pasted below for quoting convenience.
> > >
> > > ——
> > >
> > > ## Motivation
> > >
> > > Currently, Pulsar client (Java) only checks payload size for max message size validation.
> > >
> > > Client throws TimeoutException if we produce a message with too many properties, see [1].
> > > But the root cause is that is trigged TooLongFrameException in broker server.
> > >
> > > In this PIP, I propose to include message header size when check maxMessageSize of non-batch
> > > messages, this brings the following benefits.
> > > 1. Clients can throw InvalidMessageException immediately if properties takes too much storage space.
> > > 2. This will make the behaviour consistent with topic level max message size check in broker.
> > > 3. Strictly limit the entry size less than maxMessageSize, avoid sending message to bookkeeper failed.
> > >
> > > ## Goal
> > >
> > > Include message header size when check maxMessageSize for non-batch message on the client side.
> > >
> > > ## Implementation
> > >
> > > ```
> > > // Add a size check in org.apache.pulsar.client.impl.ProducerImpl#processOpSendMsg
> > > if (op.msg != null // for non-batch messages only
> > >     && op.getMessageHeaderAndPayloadSize() > ClientCnx.getMaxMessageSize()) {
> > >     // finish send op with InvalidMessageException
> > >     releaseSemaphoreForSendOp(op);
> > >     op.sendComplete(new PulsarClientException(new InvalidMessageException, op.sequenceId));
> > > }
> > >
> > >
> > > // org.apache.pulsar.client.impl.ProducerImpl.OpSendMsg#getMessageHeaderAndPayloadSize
> > >
> > > public int getMessageHeaderAndPayloadSize() {
> > >     ByteBuf cmdHeader = cmd.getFirst();
> > >     cmdHeader.markReaderIndex();
> > >     int totalSize = cmdHeader.readInt();
> > >     int cmdSize = cmdHeader.readInt();
> > >     int msgHeadersAndPayloadSize = totalSize - cmdSize - 4;
> > >     cmdHeader.resetReaderIndex();
> > >     return msgHeadersAndPayloadSize;
> > > }
> > > ```
> > >
> > > ## Reject Alternatives
> > > Add a new property like "maxPropertiesSize" or "maxHeaderSize" in broker.conf and pass it to
> > > client like maxMessageSize. But the implementation is much more complex, and don't have the
> > > benefit 2 and 3 mentioned in Motivation.
> > >
> > > ## Compatibility Issue
> > > As a matter of fact, this PIP narrows down the sendable range. Previously, when maxMessageSize
> > > is 1KB, it's ok to send message with 1KB properties and 1KB payload. But with this PIP, the
> > > sending will fail with InvalidMessageException.
> > >
> > > One conservative way is to add a boolean config "includeHeaderInSizeCheck" to enable this
> > > feature. But I think it's OK to enable this directly as it's more reasonable, and I don't see good
> > > migration plan if we add a config for this.
> > >
> > > The compatibility issue is worth discussing. And any suggestions are appreciated.
> > >
> > > [1] https://github.com/apache/pulsar/issues/13560
> > >
> > > Thanks,
> > > Haiting Jiang
> > >
> 
> 
> 
> -- 
> Zike Yang
> 

Re: [DISCUSSION] PIP-132: Include message header size when check maxMessageSize of non-batch message on the client side.

Posted by Zike Yang <zk...@streamnative.io.INVALID>.
But how do we handle chunked messages? The chunked message is split
based on the maxMessageSize(max payload size). This would seem to make
`op.getMessageHeaderAndPayloadSize() > ClientCnx.getMaxMessageSize()`
always true.

Thanks,
Zike

On Fri, Dec 31, 2021 at 8:11 PM Haiting Jiang <ji...@apache.org> wrote:
>
> Sorry, it should be PIP-132.
>
> Thanks,
> Haiting Jiang
>
> On 2021/12/31 12:05:54 Haiting Jiang wrote:
> > https://github.com/apache/pulsar/issues/13591
> >
> > Pasted below for quoting convenience.
> >
> > ——
> >
> > ## Motivation
> >
> > Currently, Pulsar client (Java) only checks payload size for max message size validation.
> >
> > Client throws TimeoutException if we produce a message with too many properties, see [1].
> > But the root cause is that is trigged TooLongFrameException in broker server.
> >
> > In this PIP, I propose to include message header size when check maxMessageSize of non-batch
> > messages, this brings the following benefits.
> > 1. Clients can throw InvalidMessageException immediately if properties takes too much storage space.
> > 2. This will make the behaviour consistent with topic level max message size check in broker.
> > 3. Strictly limit the entry size less than maxMessageSize, avoid sending message to bookkeeper failed.
> >
> > ## Goal
> >
> > Include message header size when check maxMessageSize for non-batch message on the client side.
> >
> > ## Implementation
> >
> > ```
> > // Add a size check in org.apache.pulsar.client.impl.ProducerImpl#processOpSendMsg
> > if (op.msg != null // for non-batch messages only
> >     && op.getMessageHeaderAndPayloadSize() > ClientCnx.getMaxMessageSize()) {
> >     // finish send op with InvalidMessageException
> >     releaseSemaphoreForSendOp(op);
> >     op.sendComplete(new PulsarClientException(new InvalidMessageException, op.sequenceId));
> > }
> >
> >
> > // org.apache.pulsar.client.impl.ProducerImpl.OpSendMsg#getMessageHeaderAndPayloadSize
> >
> > public int getMessageHeaderAndPayloadSize() {
> >     ByteBuf cmdHeader = cmd.getFirst();
> >     cmdHeader.markReaderIndex();
> >     int totalSize = cmdHeader.readInt();
> >     int cmdSize = cmdHeader.readInt();
> >     int msgHeadersAndPayloadSize = totalSize - cmdSize - 4;
> >     cmdHeader.resetReaderIndex();
> >     return msgHeadersAndPayloadSize;
> > }
> > ```
> >
> > ## Reject Alternatives
> > Add a new property like "maxPropertiesSize" or "maxHeaderSize" in broker.conf and pass it to
> > client like maxMessageSize. But the implementation is much more complex, and don't have the
> > benefit 2 and 3 mentioned in Motivation.
> >
> > ## Compatibility Issue
> > As a matter of fact, this PIP narrows down the sendable range. Previously, when maxMessageSize
> > is 1KB, it's ok to send message with 1KB properties and 1KB payload. But with this PIP, the
> > sending will fail with InvalidMessageException.
> >
> > One conservative way is to add a boolean config "includeHeaderInSizeCheck" to enable this
> > feature. But I think it's OK to enable this directly as it's more reasonable, and I don't see good
> > migration plan if we add a config for this.
> >
> > The compatibility issue is worth discussing. And any suggestions are appreciated.
> >
> > [1] https://github.com/apache/pulsar/issues/13560
> >
> > Thanks,
> > Haiting Jiang
> >



-- 
Zike Yang

Re: [DISCUSSION] PIP-132: Include message header size when check maxMessageSize of non-batch message on the client side.

Posted by Haiting Jiang <ji...@apache.org>.
Sorry, it should be PIP-132.

Thanks,
Haiting Jiang

On 2021/12/31 12:05:54 Haiting Jiang wrote:
> https://github.com/apache/pulsar/issues/13591
> 
> Pasted below for quoting convenience.
> 
> ——
> 
> ## Motivation
> 
> Currently, Pulsar client (Java) only checks payload size for max message size validation.
> 
> Client throws TimeoutException if we produce a message with too many properties, see [1].
> But the root cause is that is trigged TooLongFrameException in broker server.
> 
> In this PIP, I propose to include message header size when check maxMessageSize of non-batch
> messages, this brings the following benefits.
> 1. Clients can throw InvalidMessageException immediately if properties takes too much storage space.
> 2. This will make the behaviour consistent with topic level max message size check in broker.
> 3. Strictly limit the entry size less than maxMessageSize, avoid sending message to bookkeeper failed.
> 
> ## Goal
> 
> Include message header size when check maxMessageSize for non-batch message on the client side.
> 
> ## Implementation
> 
> ```
> // Add a size check in org.apache.pulsar.client.impl.ProducerImpl#processOpSendMsg
> if (op.msg != null // for non-batch messages only
>     && op.getMessageHeaderAndPayloadSize() > ClientCnx.getMaxMessageSize()) {
>     // finish send op with InvalidMessageException
>     releaseSemaphoreForSendOp(op);
>     op.sendComplete(new PulsarClientException(new InvalidMessageException, op.sequenceId));
> }
> 
> 
> // org.apache.pulsar.client.impl.ProducerImpl.OpSendMsg#getMessageHeaderAndPayloadSize
> 
> public int getMessageHeaderAndPayloadSize() {
>     ByteBuf cmdHeader = cmd.getFirst();
>     cmdHeader.markReaderIndex();
>     int totalSize = cmdHeader.readInt();
>     int cmdSize = cmdHeader.readInt();
>     int msgHeadersAndPayloadSize = totalSize - cmdSize - 4;
>     cmdHeader.resetReaderIndex();
>     return msgHeadersAndPayloadSize;
> }
> ```
> 
> ## Reject Alternatives
> Add a new property like "maxPropertiesSize" or "maxHeaderSize" in broker.conf and pass it to 
> client like maxMessageSize. But the implementation is much more complex, and don't have the 
> benefit 2 and 3 mentioned in Motivation.
> 
> ## Compatibility Issue
> As a matter of fact, this PIP narrows down the sendable range. Previously, when maxMessageSize
> is 1KB, it's ok to send message with 1KB properties and 1KB payload. But with this PIP, the 
> sending will fail with InvalidMessageException.
> 
> One conservative way is to add a boolean config "includeHeaderInSizeCheck" to enable this 
> feature. But I think it's OK to enable this directly as it's more reasonable, and I don't see good 
> migration plan if we add a config for this.
> 
> The compatibility issue is worth discussing. And any suggestions are appreciated.
> 
> [1] https://github.com/apache/pulsar/issues/13560
> 
> Thanks,
> Haiting Jiang
> 

Re: [DISCUSSION] PIP-132: Include message header size when check maxMessageSize of non-batch message on the client side.

Posted by Haiting Jiang <ji...@apache.org>.
Hi mattison,

Yes, this is an alternative way to solve the case of properties is too large. 

But I think this approach is more complex in coding and introduces new concept and 
configs, and I don't see the benefit of limiting the header size and payload size separately.

Thanks, 
Haiting Jiang

On 2022/01/04 04:14:33 mattison chao wrote:
> Hi, @Jason918 <https://github.com/Jason918>
> I think we can deprecate maxMessageSize to maxPayloadSize, and add maxHeaderSize to limit header size.
> after then, we can use maxPayloadSize + maxHeaderSize to get maxMessageSize at internal.
> 
> What do you think about it?
> 
> > On Dec 31, 2021, at 8:05 PM, Haiting Jiang <ji...@apache.org> wrote:
> > 
> > https://github.com/apache/pulsar/issues/13591
> > 
> > Pasted below for quoting convenience.
> > 
> > ——
> > 
> > ## Motivation
> > 
> > Currently, Pulsar client (Java) only checks payload size for max message size validation.
> > 
> > Client throws TimeoutException if we produce a message with too many properties, see [1].
> > But the root cause is that is trigged TooLongFrameException in broker server.
> > 
> > In this PIP, I propose to include message header size when check maxMessageSize of non-batch
> > messages, this brings the following benefits.
> > 1. Clients can throw InvalidMessageException immediately if properties takes too much storage space.
> > 2. This will make the behaviour consistent with topic level max message size check in broker.
> > 3. Strictly limit the entry size less than maxMessageSize, avoid sending message to bookkeeper failed.
> > 
> > ## Goal
> > 
> > Include message header size when check maxMessageSize for non-batch message on the client side.
> > 
> > ## Implementation
> > 
> > ```
> > // Add a size check in org.apache.pulsar.client.impl.ProducerImpl#processOpSendMsg
> > if (op.msg != null // for non-batch messages only
> >    && op.getMessageHeaderAndPayloadSize() > ClientCnx.getMaxMessageSize()) {
> >    // finish send op with InvalidMessageException
> >    releaseSemaphoreForSendOp(op);
> >    op.sendComplete(new PulsarClientException(new InvalidMessageException, op.sequenceId));
> > }
> > 
> > 
> > // org.apache.pulsar.client.impl.ProducerImpl.OpSendMsg#getMessageHeaderAndPayloadSize
> > 
> > public int getMessageHeaderAndPayloadSize() {
> >    ByteBuf cmdHeader = cmd.getFirst();
> >    cmdHeader.markReaderIndex();
> >    int totalSize = cmdHeader.readInt();
> >    int cmdSize = cmdHeader.readInt();
> >    int msgHeadersAndPayloadSize = totalSize - cmdSize - 4;
> >    cmdHeader.resetReaderIndex();
> >    return msgHeadersAndPayloadSize;
> > }
> > ```
> > 
> > ## Reject Alternatives
> > Add a new property like "maxPropertiesSize" or "maxHeaderSize" in broker.conf and pass it to 
> > client like maxMessageSize. But the implementation is much more complex, and don't have the 
> > benefit 2 and 3 mentioned in Motivation.
> > 
> > ## Compatibility Issue
> > As a matter of fact, this PIP narrows down the sendable range. Previously, when maxMessageSize
> > is 1KB, it's ok to send message with 1KB properties and 1KB payload. But with this PIP, the 
> > sending will fail with InvalidMessageException.
> > 
> > One conservative way is to add a boolean config "includeHeaderInSizeCheck" to enable this 
> > feature. But I think it's OK to enable this directly as it's more reasonable, and I don't see good 
> > migration plan if we add a config for this.
> > 
> > The compatibility issue is worth discussing. And any suggestions are appreciated.
> > 
> > [1] https://github.com/apache/pulsar/issues/13560
> > 
> > Thanks,
> > Haiting Jiang
> 
> 

Re: [DISCUSSION] PIP-131: Include message header size when check maxMessageSize of non-batch message on the client side.

Posted by mattison chao <ma...@gmail.com>.
Hi, @Jason918 <https://github.com/Jason918>
I think we can deprecate maxMessageSize to maxPayloadSize, and add maxHeaderSize to limit header size.
after then, we can use maxPayloadSize + maxHeaderSize to get maxMessageSize at internal.

What do you think about it?

> On Dec 31, 2021, at 8:05 PM, Haiting Jiang <ji...@apache.org> wrote:
> 
> https://github.com/apache/pulsar/issues/13591
> 
> Pasted below for quoting convenience.
> 
> ——
> 
> ## Motivation
> 
> Currently, Pulsar client (Java) only checks payload size for max message size validation.
> 
> Client throws TimeoutException if we produce a message with too many properties, see [1].
> But the root cause is that is trigged TooLongFrameException in broker server.
> 
> In this PIP, I propose to include message header size when check maxMessageSize of non-batch
> messages, this brings the following benefits.
> 1. Clients can throw InvalidMessageException immediately if properties takes too much storage space.
> 2. This will make the behaviour consistent with topic level max message size check in broker.
> 3. Strictly limit the entry size less than maxMessageSize, avoid sending message to bookkeeper failed.
> 
> ## Goal
> 
> Include message header size when check maxMessageSize for non-batch message on the client side.
> 
> ## Implementation
> 
> ```
> // Add a size check in org.apache.pulsar.client.impl.ProducerImpl#processOpSendMsg
> if (op.msg != null // for non-batch messages only
>    && op.getMessageHeaderAndPayloadSize() > ClientCnx.getMaxMessageSize()) {
>    // finish send op with InvalidMessageException
>    releaseSemaphoreForSendOp(op);
>    op.sendComplete(new PulsarClientException(new InvalidMessageException, op.sequenceId));
> }
> 
> 
> // org.apache.pulsar.client.impl.ProducerImpl.OpSendMsg#getMessageHeaderAndPayloadSize
> 
> public int getMessageHeaderAndPayloadSize() {
>    ByteBuf cmdHeader = cmd.getFirst();
>    cmdHeader.markReaderIndex();
>    int totalSize = cmdHeader.readInt();
>    int cmdSize = cmdHeader.readInt();
>    int msgHeadersAndPayloadSize = totalSize - cmdSize - 4;
>    cmdHeader.resetReaderIndex();
>    return msgHeadersAndPayloadSize;
> }
> ```
> 
> ## Reject Alternatives
> Add a new property like "maxPropertiesSize" or "maxHeaderSize" in broker.conf and pass it to 
> client like maxMessageSize. But the implementation is much more complex, and don't have the 
> benefit 2 and 3 mentioned in Motivation.
> 
> ## Compatibility Issue
> As a matter of fact, this PIP narrows down the sendable range. Previously, when maxMessageSize
> is 1KB, it's ok to send message with 1KB properties and 1KB payload. But with this PIP, the 
> sending will fail with InvalidMessageException.
> 
> One conservative way is to add a boolean config "includeHeaderInSizeCheck" to enable this 
> feature. But I think it's OK to enable this directly as it's more reasonable, and I don't see good 
> migration plan if we add a config for this.
> 
> The compatibility issue is worth discussing. And any suggestions are appreciated.
> 
> [1] https://github.com/apache/pulsar/issues/13560
> 
> Thanks,
> Haiting Jiang