Posted to dev@kafka.apache.org by Becket Qin <be...@gmail.com> on 2017/04/19 16:38:49 UTC

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

Thanks for the comment, Dong. I think the batch-split-ratio makes sense, but
it is largely redundant with batch-split-rate.

Also, the batch-split-ratio may be a little more involved to get right:
1. An all-time batch split ratio is easy to compute but not that useful.
2. A time-windowed batch-split-ratio is more complicated to make accurate.
It is a "stateful" metric that relies on both the number of batches sent in a
time window and the number of batches split in the same window, but the
sending and the splitting do not necessarily fall in the same window.

Besides, a rough estimate of the batch split ratio can be derived from the
existing metrics, and I think batch-split-rate is already a good indication
of whether batch splitting has caused a performance problem.

So I am not sure if it is worth having an explicit batch-split-ratio metric
in this case.

Thanks,

Jiangjie (Becket) Qin
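
For anyone wanting the rough estimate mentioned above, a minimal client-side
sketch is below. It assumes the batch-split-rate metric this KIP adds plus
the existing record-send-rate metric, both in the "producer-metrics" group,
and uses the newer Metric#metricValue accessor; the helper name
roughSplitRatio is illustrative, and because record-send-rate counts records
rather than batches, the result is only a coarse proxy for the true ratio.

    import java.util.Map;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.common.Metric;
    import org.apache.kafka.common.MetricName;

    public final class SplitRatioSketch {
        // Rough split ratio: batch splits per second divided by records
        // sent per second. Records are only a proxy for batches, so treat
        // this as a trend indicator, not an exact probability.
        public static double roughSplitRatio(KafkaProducer<?, ?> producer) {
            double splitRate = 0.0;
            double sendRate = 0.0;
            for (Map.Entry<MetricName, ? extends Metric> entry
                    : producer.metrics().entrySet()) {
                MetricName name = entry.getKey();
                if (!"producer-metrics".equals(name.group()))
                    continue;
                Object value = entry.getValue().metricValue();
                if (!(value instanceof Double))
                    continue;
                if ("batch-split-rate".equals(name.name()))
                    splitRate = (Double) value;
                else if ("record-send-rate".equals(name.name()))
                    sendRate = (Double) value;
            }
            return sendRate == 0.0 ? 0.0 : splitRate / sendRate;
        }
    }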

On Wed, Mar 22, 2017 at 10:54 AM, Dong Lin <li...@gmail.com> wrote:

> Never mind about my second comment. I misunderstood the semantics of the
> producer's batch.size.
>
> On Wed, Mar 22, 2017 at 10:20 AM, Dong Lin <li...@gmail.com> wrote:
>
> > Hey Becket,
> >
> > In addition to the batch-split-rate, should we also add a
> > batch-split-ratio sensor to gauge the probability that we have to split a
> > batch?
> >
> > Also, in the case that the batch size configured for the producer is
> > smaller than the max message size configured for the broker, why can't we
> > just split the batch if its size exceeds the configured batch size? The
> > benefit of this approach is that the semantics of the producer are
> > straightforward, because we enforce the batch size that the user has
> > configured. The implementation would also be simpler because we don't
> > have to rely on KIP-4 to fetch the max message size from the broker. I
> > guess you are worrying about the overhead of "unnecessary" splits if a
> > batch size is between the user-configured batch size and the broker's max
> > message size. But is the overhead really a concern? If the overhead is
> > too large because the user has configured a very low batch size for the
> > producer, shouldn't the user adjust the producer config?
> >
> > Thanks,
> > Dong
> >
> > On Wed, Mar 15, 2017 at 2:50 PM, Becket Qin <be...@gmail.com>
> wrote:
> >
> >> I see, then we are thinking about the same thing :)
> >>
> >> On Wed, Mar 15, 2017 at 2:26 PM, Ismael Juma <is...@juma.me.uk> wrote:
> >>
> >> > I meant finishing what's described in the following section and then
> >> > starting a discussion followed by a vote:
> >> >
> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-DescribeConfigsRequest
> >> >
> >> > We have only voted on KIP-4 Metadata, KIP-4 Create Topics, and KIP-4
> >> > Delete Topics so far.
> >> >
> >> > Ismael
> >> >
> >> > On Wed, Mar 15, 2017 at 8:58 PM, Becket Qin <be...@gmail.com>
> >> wrote:
> >> >
> >> > > Hi Ismael,
> >> > >
> >> > > KIP-4 is also the one that I was thinking about. We have introduced a
> >> > > DescribeConfigRequest there so the producer can easily get the
> >> > > configurations. By "another KIP" do you mean a new (or maybe extended)
> >> > > protocol, or using that protocol in the clients?
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Jiangjie (Becket) Qin
> >> > >
> >> > > On Wed, Mar 15, 2017 at 1:21 PM, Ismael Juma <is...@juma.me.uk>
> >> wrote:
> >> > >
> >> > > > Hi Becket,
> >> > > >
> >> > > > How were you thinking of retrieving the configuration items you
> >> > > > mentioned? I am asking because I was planning to post a KIP for
> >> > > > Describe Configs (one of the protocols in KIP-4), which would expose
> >> > > > such information. But maybe you are thinking of extending Metadata
> >> > > > request?
> >> > > >
> >> > > > Ismael
> >> > > >
> >> > > > On Wed, Mar 15, 2017 at 7:33 PM, Becket Qin <becket.qin@gmail.com
> >
> >> > > wrote:
> >> > > >
> >> > > > > Hi Jason,
> >> > > > >
> >> > > > > Good point. I was thinking about that, too. I was not sure if
> >> > > > > that is the right thing to do by default.
> >> > > > >
> >> > > > > If we assume people always set the batch size to the max message
> >> > > > > size, splitting the oversized batch makes a lot of sense. But it
> >> > > > > seems possible that users want to control the memory footprint,
> >> > > > > so they would set the batch size to smaller than the max message
> >> > > > > size so that the producer can hold batches for more partitions.
> >> > > > > In this case, splitting the batch might not be the desired
> >> > > > > behavior.
> >> > > > >
> >> > > > > I think the most intuitive approach to this is to allow the
> >> > > > > producer to get the max message size configuration (as well as
> >> > > > > some other configurations, such as the timestamp type) from the
> >> > > > > broker side and use that to decide whether a batch should be
> >> > > > > split or not. I probably should add this to the KIP wiki.
> >> > > > >
> >> > > > > Thanks,
> >> > > > >
> >> > > > > Jiangjie (Becket) Qin
> >> > > > >
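
For concreteness, once a DescribeConfigs-style API exists, the lookup Becket
describes could look like the sketch below. It uses the Java AdminClient,
which post-dates this thread (the thread anticipates the KIP-4 protocol);
the class name and the hardcoded broker id "0" are illustrative only.

    import java.util.Collections;
    import java.util.Properties;

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.Config;
    import org.apache.kafka.common.config.ConfigResource;

    public final class MaxMessageSizeLookup {
        // Fetches the broker-level message.max.bytes, the value a producer
        // would need before deciding whether to split a batch.
        public static int brokerMaxMessageBytes(String bootstrapServers)
                throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
                    bootstrapServers);
            try (AdminClient admin = AdminClient.create(props)) {
                ConfigResource broker =
                        new ConfigResource(ConfigResource.Type.BROKER, "0");
                Config config = admin
                        .describeConfigs(Collections.singleton(broker))
                        .all().get().get(broker);
                return Integer.parseInt(
                        config.get("message.max.bytes").value());
            }
        }
    }
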
> >> > > > > On Wed, Mar 15, 2017 at 9:47 AM, Jason Gustafson <
> >> jason@confluent.io
> >> > >
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Hey Becket,
> >> > > > > >
> >> > > > > > Thanks for the KIP! The approach seems reasonable. One
> >> > > > > > clarification: is the intent to do the splitting after the
> >> > > > > > broker rejects the request with MESSAGE_TOO_LARGE, or prior to
> >> > > > > > sending if the configured batch size is exceeded?
> >> > > > > >
> >> > > > > > -Jason
> >> > > > > >
> >> > > > > > On Mon, Mar 13, 2017 at 8:10 PM, Becket Qin <
> >> becket.qin@gmail.com>
> >> > > > > wrote:
> >> > > > > >
> >> > > > > > > Bumping up the thread for further comments. If there are no
> >> > > > > > > more comments on the KIP, I will start the voting thread on
> >> > > > > > > Wed.
> >> > > > > > >
> >> > > > > > > Thanks,
> >> > > > > > >
> >> > > > > > > Jiangjie (Becket) Qin
> >> > > > > > >
> >> > > > > > > On Tue, Mar 7, 2017 at 9:48 AM, Becket Qin <
> >> becket.qin@gmail.com
> >> > >
> >> > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Hi Dong,
> >> > > > > > > >
> >> > > > > > > > Thanks for the comments.
> >> > > > > > > >
> >> > > > > > > > The patch is mostly a proof of concept, in case there is
> >> > > > > > > > any concern about the implementation, which is indeed a
> >> > > > > > > > little tricky.
> >> > > > > > > >
> >> > > > > > > > The new metric has already been mentioned in the Public
> >> > > > > > > > Interface Change section.
> >> > > > > > > >
> >> > > > > > > > I added to the wiki the reasoning about how the compression
> >> > > > > > > > ratio improvement/deterioration steps are determined.
> >> > > > > > > >
> >> > > > > > > > Thanks,
> >> > > > > > > >
> >> > > > > > > > Jiangjie (Becket) Qin
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > On Mon, Mar 6, 2017 at 4:42 PM, Dong Lin <
> >> lindong28@gmail.com>
> >> > > > > wrote:
> >> > > > > > > >
> >> > > > > > > >> Hey Becket,
> >> > > > > > > >>
> >> > > > > > > >> I am wondering if we should first vote for the KIP before
> >> > > > > > > >> reviewing the patch. I have two comments below:
> >> > > > > > > >>
> >> > > > > > > >> - Should we specify the new sensors as part of the
> >> > > > > > > >> interface change in the KIP?
> >> > > > > > > >> - The KIP proposes to increase the estimated compression
> >> > > > > > > >> ratio by 0.05 for each underestimation and decrement the
> >> > > > > > > >> estimation by 0.005 for each overestimation. Why are these
> >> > > > > > > >> two values chosen? I think there is some tradeoff in
> >> > > > > > > >> selecting the values. Can the KIP be more explicit about
> >> > > > > > > >> the tradeoff and explain how these two values would impact
> >> > > > > > > >> the producer's performance?
> >> > > > > > > >>
> >> > > > > > > >> Thanks,
> >> > > > > > > >> Dong
> >> > > > > > > >>
> >> > > > > > > >>
> >> > > > > > > >> On Sat, Mar 4, 2017 at 11:42 AM, Becket Qin <
> >> > > becket.qin@gmail.com
> >> > > > >
> >> > > > > > > wrote:
> >> > > > > > > >>
> >> > > > > > > >> > I have updated the KIP based on the latest discussion.
> >> > > > > > > >> > Please check and let me know if there is any further
> >> > > > > > > >> > concern.
> >> > > > > > > >> >
> >> > > > > > > >> > Thanks,
> >> > > > > > > >> >
> >> > > > > > > >> > Jiangjie (Becket) Qin
> >> > > > > > > >> >
> >> > > > > > > >> > On Sat, Mar 4, 2017 at 10:56 AM, Becket Qin <
> >> > > > becket.qin@gmail.com
> >> > > > > >
> >> > > > > > > >> wrote:
> >> > > > > > > >> >
> >> > > > > > > >> > > Actually, on second thought, a rate might be better
> >> > > > > > > >> > > for two reasons:
> >> > > > > > > >> > > 1. Most of the metrics we already have in the producer
> >> > > > > > > >> > > use a rate instead of a count.
> >> > > > > > > >> > > 2. If a service is bounced, a count will be reset to
> >> > > > > > > >> > > 0, but that does not affect a rate.
> >> > > > > > > >> > >
> >> > > > > > > >> > > I'll make the change.
> >> > > > > > > >> > >
> >> > > > > > > >> > > Thanks,
> >> > > > > > > >> > >
> >> > > > > > > >> > > Jiangjie (Becket) Qin
> >> > > > > > > >> > >
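
The rate-plus-total pairing described above can be wired up with Kafka's
internal Metrics API roughly as sketched below. The names batch-split-rate
and batch-split-total match what the KIP proposes; the standalone Metrics
instance and the class are illustrative, and CumulativeCount is the newer
name for the older Count stat.

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.CumulativeCount;
    import org.apache.kafka.common.metrics.stats.Rate;

    public final class BatchSplitSensorSketch {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();
            Sensor batchSplit = metrics.sensor("batch-split");
            // A rate is robust to restarts; a raw count resets to 0 when
            // the service is bounced, as noted above.
            batchSplit.add(
                    metrics.metricName("batch-split-rate", "producer-metrics",
                            "Average number of batch splits per second"),
                    new Rate());
            batchSplit.add(
                    metrics.metricName("batch-split-total", "producer-metrics",
                            "Total number of batch splits"),
                    new CumulativeCount());
            batchSplit.record(); // record one occurrence per split
            metrics.close();
        }
    }
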
> >> > > > > > > >> > > On Sat, Mar 4, 2017 at 10:27 AM, Becket Qin <
> >> > > > > becket.qin@gmail.com
> >> > > > > > >
> >> > > > > > > >> > wrote:
> >> > > > > > > >> > >
> >> > > > > > > >> > >> Hi Dong,
> >> > > > > > > >> > >>
> >> > > > > > > >> > >> Yes, there is a sensor in the patch for the split
> >> > > > > > > >> > >> occurrence.
> >> > > > > > > >> > >>
> >> > > > > > > >> > >> Currently it is a count instead of a rate. In
> >> > > > > > > >> > >> practice, a count seems easier to use in this case.
> >> > > > > > > >> > >> But I am open to change.
> >> > > > > > > >> > >>
> >> > > > > > > >> > >> Thanks,
> >> > > > > > > >> > >>
> >> > > > > > > >> > >> Jiangjie (Becket) Qin
> >> > > > > > > >> > >>
> >> > > > > > > >> > >> On Fri, Mar 3, 2017 at 7:43 PM, Dong Lin <
> >> > > > lindong28@gmail.com>
> >> > > > > > > >> wrote:
> >> > > > > > > >> > >>
> >> > > > > > > >> > >>> Hey Becket,
> >> > > > > > > >> > >>>
> >> > > > > > > >> > >>> I haven't looked at the patch yet. But since we are
> >> > > > > > > >> > >>> going to try the split-on-oversize solution, should
> >> > > > > > > >> > >>> the KIP also add a sensor that shows the rate of
> >> > > > > > > >> > >>> splits per second and the probability of a split?
> >> > > > > > > >> > >>>
> >> > > > > > > >> > >>> Thanks,
> >> > > > > > > >> > >>> Dong
> >> > > > > > > >> > >>>
> >> > > > > > > >> > >>>
> >> > > > > > > >> > >>> On Fri, Mar 3, 2017 at 6:39 PM, Becket Qin <
> >> > > > > > becket.qin@gmail.com>
> >> > > > > > > >> > wrote:
> >> > > > > > > >> > >>>
> >> > > > > > > >> > >>> > Just to clarify, the implementation is basically
> >> > > > > > > >> > >>> > what I mentioned above (split/resend + the adjusted
> >> > > > > > > >> > >>> > estimation-evolving algorithm), plus changing the
> >> > > > > > > >> > >>> > compression ratio estimation to be per topic.
> >> > > > > > > >> > >>> >
> >> > > > > > > >> > >>> > Thanks,
> >> > > > > > > >> > >>> >
> >> > > > > > > >> > >>> > Jiangjie (Becket) Qin
> >> > > > > > > >> > >>> >
> >> > > > > > > >> > >>> > On Fri, Mar 3, 2017 at 6:36 PM, Becket Qin <
> >> > > > > > > becket.qin@gmail.com>
> >> > > > > > > >> > >>> wrote:
> >> > > > > > > >> > >>> >
> >> > > > > > > >> > >>> > > I went ahead and submitted a patch here:
> >> > > > > > > >> > >>> > > https://github.com/apache/kafka/pull/2638
> >> > > > > > > >> > >>> > >
> >> > > > > > > >> > >>> > > Per Joel's suggestion, I changed the compression
> >> > > > > > > >> > >>> > > ratio estimation to be per topic as well. It
> >> > > > > > > >> > >>> > > seems to work well. Since there is an important
> >> > > > > > > >> > >>> > > behavior change and a new sensor is added, I'll
> >> > > > > > > >> > >>> > > keep the KIP and update it accordingly.
> >> > > > > > > >> > >>> > >
> >> > > > > > > >> > >>> > > Thanks,
> >> > > > > > > >> > >>> > >
> >> > > > > > > >> > >>> > > Jiangjie (Becket) Qin
> >> > > > > > > >> > >>> > >
> >> > > > > > > >> > >>> > > On Mon, Feb 27, 2017 at 3:50 PM, Joel Koshy <
> >> > > > > > > >> jjkoshy.w@gmail.com>
> >> > > > > > > >> > >>> wrote:
> >> > > > > > > >> > >>> > >
> >> > > > > > > >> > >>> > >> >
> >> > > > > > > >> > >>> > >> > Let's say we sent the batch over the wire and
> >> > > > > > > >> > >>> > >> > received a RecordTooLargeException; how do we
> >> > > > > > > >> > >>> > >> > split it, since once we add the messages to
> >> > > > > > > >> > >>> > >> > the batch we lose the message-level
> >> > > > > > > >> > >>> > >> > granularity? We will have to decompress, do a
> >> > > > > > > >> > >>> > >> > deep iteration, split, and compress again,
> >> > > > > > > >> > >>> > >> > right? This looks like a performance
> >> > > > > > > >> > >>> > >> > bottleneck in the case of multi-topic
> >> > > > > > > >> > >>> > >> > producers like mirror maker.
> >> > > > > > > >> > >>> > >> >
> >> > > > > > > >> > >>> > >>
> >> > > > > > > >> > >>> > >> Yes, but these should be outliers if we do the
> >> > > > > > > >> > >>> > >> estimation on a per-topic basis and if we target
> >> > > > > > > >> > >>> > >> a conservative-enough compression ratio. The
> >> > > > > > > >> > >>> > >> producer should also avoid sending over the wire
> >> > > > > > > >> > >>> > >> if it can be made aware of the max-message-size
> >> > > > > > > >> > >>> > >> limit on the broker, and split if it determines
> >> > > > > > > >> > >>> > >> that a record exceeds the broker's config.
> >> > > > > > > >> > >>> > >> Ideally this should be part of topic metadata
> >> > > > > > > >> > >>> > >> but is not - so it could be based off a periodic
> >> > > > > > > >> > >>> > >> describe-configs
> >> > > > > > > >> > >>> > >> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-DescribeConfigsRequest>
> >> > > > > > > >> > >>> > >> (which isn't available yet). This doesn't remove
> >> > > > > > > >> > >>> > >> the need to split and recompress, though.
> >> > > > > > > >> > >>> > >>
> >> > > > > > > >> > >>> > >>
> >> > > > > > > >> > >>> > >> > On Mon, Feb 27, 2017 at 10:51 AM, Becket
> Qin <
> >> > > > > > > >> > >>> becket.qin@gmail.com>
> >> > > > > > > >> > >>> > >> wrote:
> >> > > > > > > >> > >>> > >> >
> >> > > > > > > >> > >>> > >> > > Hey Mayuresh,
> >> > > > > > > >> > >>> > >> > >
> >> > > > > > > >> > >>> > >> > > 1) The batch would be split when a
> >> > > > > > > >> > >>> > >> > > RecordTooLargeException is received.
> >> > > > > > > >> > >>> > >> > > 2) We do not lower the actual compression
> >> > > > > > > >> > >>> > >> > > ratio, but rather lower the estimated
> >> > > > > > > >> > >>> > >> > > compression ratio "according to" the Actual
> >> > > > > > > >> > >>> > >> > > Compression Ratio (ACR).
> >> > > > > > > >> > >>> > >> > >
> >> > > > > > > >> > >>> > >> > > As an example, let's start with an Estimated
> >> > > > > > > >> > >>> > >> > > Compression Ratio (ECR) of 1.0. Say the ACR
> >> > > > > > > >> > >>> > >> > > is ~0.8. Instead of letting the ECR drop to
> >> > > > > > > >> > >>> > >> > > 0.8 very quickly, we only drop it by 0.001
> >> > > > > > > >> > >>> > >> > > each time ACR < ECR. However, once we see an
> >> > > > > > > >> > >>> > >> > > ACR > ECR, we increment the ECR by 0.05. If
> >> > > > > > > >> > >>> > >> > > a RecordTooLargeException is received, we
> >> > > > > > > >> > >>> > >> > > reset the ECR back to 1.0 and split the
> >> > > > > > > >> > >>> > >> > > batch.
> >> > > > > > > >> > >>> > >> > >
> >> > > > > > > >> > >>> > >> > > Thanks,
> >> > > > > > > >> > >>> > >> > >
> >> > > > > > > >> > >>> > >> > > Jiangjie (Becket) Qin
> >> > > > > > > >> > >>> > >> > >
> >> > > > > > > >> > >>> > >> > >
> >> > > > > > > >> > >>> > >> > >
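
The ECR/ACR scheme above, restated as a toy class. All names are illustrative
rather than the producer's actual internals, and per the later discussion the
real implementation keeps one such estimate per topic.

    public final class CompressionRatioEstimatorSketch {
        private static final float STEP_DOWN = 0.001f; // when ACR < ECR
        private static final float STEP_UP = 0.05f;    // when ACR > ECR

        private float ecr = 1.0f; // start fully conservative

        // Feed in the Actual Compression Ratio (ACR) of each finished batch.
        public void observe(float acr) {
            if (acr < ecr)
                ecr = Math.max(acr, ecr - STEP_DOWN); // creep down slowly
            else if (acr > ecr)
                ecr = Math.min(1.0f, ecr + STEP_UP);  // back off quickly
        }

        // Called when the broker rejects a batch as too large; the caller
        // then splits and resends the batch.
        public void onRecordTooLarge() {
            ecr = 1.0f;
        }

        public float estimate() {
            return ecr;
        }
    }
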
> >> > > > > > > >> > >>> > >> > > On Mon, Feb 27, 2017 at 10:30 AM, Mayuresh
> >> > > Gharat <
> >> > > > > > > >> > >>> > >> > > gharatmayuresh15@gmail.com> wrote:
> >> > > > > > > >> > >>> > >> > >
> >> > > > > > > >> > >>> > >> > > > Hi Becket,
> >> > > > > > > >> > >>> > >> > > >
> >> > > > > > > >> > >>> > >> > > > Seems like an interesting idea.
> >> > > > > > > >> > >>> > >> > > > I had a couple of questions:
> >> > > > > > > >> > >>> > >> > > > 1) How do we decide when the batch should
> >> > > > > > > >> > >>> > >> > > > be split?
> >> > > > > > > >> > >>> > >> > > > 2) What do you mean by slowly lowering the
> >> > > > > > > >> > >>> > >> > > > "actual" compression ratio? An example
> >> > > > > > > >> > >>> > >> > > > would really help here.
> >> > > > > > > >> > >>> > >> > > >
> >> > > > > > > >> > >>> > >> > > > Thanks,
> >> > > > > > > >> > >>> > >> > > >
> >> > > > > > > >> > >>> > >> > > > Mayuresh
> >> > > > > > > >> > >>> > >> > > >
> >> > > > > > > >> > >>> > >> > > > On Fri, Feb 24, 2017 at 3:17 PM, Becket
> >> Qin <
> >> > > > > > > >> > >>> becket.qin@gmail.com
> >> > > > > > > >> > >>> > >
> >> > > > > > > >> > >>> > >> > > wrote:
> >> > > > > > > >> > >>> > >> > > >
> >> > > > > > > >> > >>> > >> > > > > Hi Jay,
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > > > Yeah, I got your point.
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > > > I think there might be a solution which
> >> > > > > > > >> > >>> > >> > > > > does not require adding a new
> >> > > > > > > >> > >>> > >> > > > > configuration. We can start from a very
> >> > > > > > > >> > >>> > >> > > > > conservative compression ratio, say 1.0,
> >> > > > > > > >> > >>> > >> > > > > and lower it very slowly according to
> >> > > > > > > >> > >>> > >> > > > > the actual compression ratio until we
> >> > > > > > > >> > >>> > >> > > > > hit a point where we have to split a
> >> > > > > > > >> > >>> > >> > > > > batch. At that point, we exponentially
> >> > > > > > > >> > >>> > >> > > > > back off on the compression ratio. The
> >> > > > > > > >> > >>> > >> > > > > idea is somewhat like TCP. This should
> >> > > > > > > >> > >>> > >> > > > > help avoid frequent splits.
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > > > The upper bound of the batch size is
> >> > > > > > > >> > >>> > >> > > > > also a little awkward today, because we
> >> > > > > > > >> > >>> > >> > > > > say the batch size is based on the
> >> > > > > > > >> > >>> > >> > > > > compressed size, but users cannot set it
> >> > > > > > > >> > >>> > >> > > > > to the max message size because that
> >> > > > > > > >> > >>> > >> > > > > would result in oversized messages. With
> >> > > > > > > >> > >>> > >> > > > > this change we will be able to allow
> >> > > > > > > >> > >>> > >> > > > > users to set the batch size close to the
> >> > > > > > > >> > >>> > >> > > > > max message size.
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > > > However, the downside is that there
> >> > > > > > > >> > >>> > >> > > > > could be latency spikes in the system
> >> > > > > > > >> > >>> > >> > > > > due to the splitting, especially when
> >> > > > > > > >> > >>> > >> > > > > there are many messages that need to be
> >> > > > > > > >> > >>> > >> > > > > split at the same time. That could
> >> > > > > > > >> > >>> > >> > > > > potentially be an issue for some users.
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > > > What do you think about this approach?
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > > > Thanks,
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > > > Jiangjie (Becket) Qin
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > > > On Thu, Feb 23, 2017 at 1:31 PM, Jay
> >> Kreps
> >> > <
> >> > > > > > > >> > >>> jay@confluent.io>
> >> > > > > > > >> > >>> > >> wrote:
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > > > > Hey Becket,
> >> > > > > > > >> > >>> > >> > > > > >
> >> > > > > > > >> > >>> > >> > > > > > Yeah that makes sense.
> >> > > > > > > >> > >>> > >> > > > > >
> >> > > > > > > >> > >>> > >> > > > > > I agree that you'd really have to
> >> > > > > > > >> > >>> > >> > > > > > both fix the estimation (i.e. make it
> >> > > > > > > >> > >>> > >> > > > > > per topic or make it better estimate
> >> > > > > > > >> > >>> > >> > > > > > the high percentiles) AND have the
> >> > > > > > > >> > >>> > >> > > > > > recovery mechanism. If you are
> >> > > > > > > >> > >>> > >> > > > > > underestimating often and then paying
> >> > > > > > > >> > >>> > >> > > > > > a high recovery price, that won't fly.
> >> > > > > > > >> > >>> > >> > > > > >
> >> > > > > > > >> > >>> > >> > > > > > I think you take my main point,
> >> > > > > > > >> > >>> > >> > > > > > though, which is just that I hate to
> >> > > > > > > >> > >>> > >> > > > > > expose these super-low-level options
> >> > > > > > > >> > >>> > >> > > > > > to users, because it is so hard to
> >> > > > > > > >> > >>> > >> > > > > > explain to people what they mean and
> >> > > > > > > >> > >>> > >> > > > > > how they should be set. So if it is
> >> > > > > > > >> > >>> > >> > > > > > possible to make some combination of
> >> > > > > > > >> > >>> > >> > > > > > better estimation and splitting, or
> >> > > > > > > >> > >>> > >> > > > > > better tolerance of overage, work,
> >> > > > > > > >> > >>> > >> > > > > > that would be preferable.
> >> > > > > > > >> > >>> > >> > > > > >
> >> > > > > > > >> > >>> > >> > > > > > -Jay
> >> > > > > > > >> > >>> > >> > > > > >
> >> > > > > > > >> > >>> > >> > > > > > On Thu, Feb 23, 2017 at 11:51 AM,
> >> Becket
> >> > > Qin
> >> > > > <
> >> > > > > > > >> > >>> > >> becket.qin@gmail.com
> >> > > > > > > >> > >>> > >> > >
> >> > > > > > > >> > >>> > >> > > > > wrote:
> >> > > > > > > >> > >>> > >> > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > @Dong,
> >> > > > > > > >> > >>> > >> > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > Thanks for the comments. The
> >> > > > > > > >> > >>> > >> > > > > > > default behavior of the producer
> >> > > > > > > >> > >>> > >> > > > > > > won't change. If users want to use
> >> > > > > > > >> > >>> > >> > > > > > > the uncompressed message size, they
> >> > > > > > > >> > >>> > >> > > > > > > will probably also bump up the batch
> >> > > > > > > >> > >>> > >> > > > > > > size to somewhere close to the max
> >> > > > > > > >> > >>> > >> > > > > > > message size. This would be in the
> >> > > > > > > >> > >>> > >> > > > > > > documentation. BTW, the default
> >> > > > > > > >> > >>> > >> > > > > > > batch size is 16K, which is pretty
> >> > > > > > > >> > >>> > >> > > > > > > small.
> >> > > > > > > >> > >>> > >> > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > @Jay,
> >> > > > > > > >> > >>> > >> > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > Yeah, we actually debated quite a
> >> > > > > > > >> > >>> > >> > > > > > > bit internally about the best
> >> > > > > > > >> > >>> > >> > > > > > > solution to this.
> >> > > > > > > >> > >>> > >> > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > I completely agree it is a bug. In
> >> > > > > > > >> > >>> > >> > > > > > > practice we usually leave some
> >> > > > > > > >> > >>> > >> > > > > > > headroom to allow the compressed
> >> > > > > > > >> > >>> > >> > > > > > > size to grow a little if the
> >> > > > > > > >> > >>> > >> > > > > > > original messages are not
> >> > > > > > > >> > >>> > >> > > > > > > compressible, for example, 1000 KB
> >> > > > > > > >> > >>> > >> > > > > > > instead of exactly 1 MB. That is
> >> > > > > > > >> > >>> > >> > > > > > > likely safe enough.
> >> > > > > > > >> > >>> > >> > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > The major concern for the rejected
> >> > > > > > > >> > >>> > >> > > > > > > alternative is performance. It
> >> > > > > > > >> > >>> > >> > > > > > > largely depends on how frequently we
> >> > > > > > > >> > >>> > >> > > > > > > need to split a batch, i.e. how
> >> > > > > > > >> > >>> > >> > > > > > > likely the estimation can go off. If
> >> > > > > > > >> > >>> > >> > > > > > > we only need to do the split work
> >> > > > > > > >> > >>> > >> > > > > > > occasionally, the cost would be
> >> > > > > > > >> > >>> > >> > > > > > > amortized, so we wouldn't need to
> >> > > > > > > >> > >>> > >> > > > > > > worry about it too much. However, it
> >> > > > > > > >> > >>> > >> > > > > > > looks like for a producer with
> >> > > > > > > >> > >>> > >> > > > > > > shared topics, the estimation is
> >> > > > > > > >> > >>> > >> > > > > > > always off. As an example, consider
> >> > > > > > > >> > >>> > >> > > > > > > two topics, one with compression
> >> > > > > > > >> > >>> > >> > > > > > > ratio 0.6 and the other 0.2.
> >> > > > > > > >> > >>> > >> > > > > > > Assuming exactly the same traffic,
> >> > > > > > > >> > >>> > >> > > > > > > the average compression ratio would
> >> > > > > > > >> > >>> > >> > > > > > > be roughly 0.4, which is not right
> >> > > > > > > >> > >>> > >> > > > > > > for either of the topics. So almost
> >> > > > > > > >> > >>> > >> > > > > > > half of the batches (those of the
> >> > > > > > > >> > >>> > >> > > > > > > topic with the 0.6 compression
> >> > > > > > > >> > >>> > >> > > > > > > ratio) will end up larger than the
> >> > > > > > > >> > >>> > >> > > > > > > configured batch size. When it comes
> >> > > > > > > >> > >>> > >> > > > > > > to more topics, such as with mirror
> >> > > > > > > >> > >>> > >> > > > > > > maker, this becomes more
> >> > > > > > > >> > >>> > >> > > > > > > unpredictable. To avoid frequent
> >> > > > > > > >> > >>> > >> > > > > > > rejection / splitting of batches, we
> >> > > > > > > >> > >>> > >> > > > > > > need to configure the batch size
> >> > > > > > > >> > >>> > >> > > > > > > pretty conservatively. This could
> >> > > > > > > >> > >>> > >> > > > > > > actually hurt performance, because
> >> > > > > > > >> > >>> > >> > > > > > > we are shoehorning the messages that
> >> > > > > > > >> > >>> > >> > > > > > > are highly compressible into a small
> >> > > > > > > >> > >>> > >> > > > > > > batch so that the other topics that
> >> > > > > > > >> > >>> > >> > > > > > > are not that compressible will not
> >> > > > > > > >> > >>> > >> > > > > > > become too large with the same batch
> >> > > > > > > >> > >>> > >> > > > > > > size. At LinkedIn, our batch size is
> >> > > > > > > >> > >>> > >> > > > > > > configured to 64 KB because of this.
> >> > > > > > > >> > >>> > >> > > > > > > I think we may actually have better
> >> > > > > > > >> > >>> > >> > > > > > > batching if we just use the
> >> > > > > > > >> > >>> > >> > > > > > > uncompressed message size and an
> >> > > > > > > >> > >>> > >> > > > > > > 800 KB batch size.
> >> > > > > > > >> > >>> > >> > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > We did not think about loosening the
> >> > > > > > > >> > >>> > >> > > > > > > message size restriction, but that
> >> > > > > > > >> > >>> > >> > > > > > > sounds like a viable solution given
> >> > > > > > > >> > >>> > >> > > > > > > that the consumer can now fetch
> >> > > > > > > >> > >>> > >> > > > > > > oversized messages. One concern
> >> > > > > > > >> > >>> > >> > > > > > > would be that on the broker side,
> >> > > > > > > >> > >>> > >> > > > > > > oversized messages will bring more
> >> > > > > > > >> > >>> > >> > > > > > > memory pressure. With KIP-92 we may
> >> > > > > > > >> > >>> > >> > > > > > > mitigate that, but the memory
> >> > > > > > > >> > >>> > >> > > > > > > allocation for large messages may
> >> > > > > > > >> > >>> > >> > > > > > > not be very GC friendly. I need to
> >> > > > > > > >> > >>> > >> > > > > > > think about this a little more.
> >> > > > > > > >> > >>> > >> > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > Thanks,
> >> > > > > > > >> > >>> > >> > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > Jiangjie (Becket) Qin
> >> > > > > > > >> > >>> > >> > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > >
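
The mixed-topic estimate above in numbers: a toy calculation, assuming the
producer closes a batch once uncompressed bytes times the blended ratio
reaches a hypothetical 16 KB batch.size.

    public final class BlendedRatioExample {
        public static void main(String[] args) {
            double blended = (0.6 + 0.2) / 2;      // one estimate, two topics
            int batchSize = 16 * 1024;             // target compressed bytes
            double admitted = batchSize / blended; // uncompressed bytes taken
            double actual = admitted * 0.6;        // 0.6-ratio topic's batch
            System.out.printf("target=%d actual=%.0f (%.1fx)%n",
                    batchSize, actual, actual / batchSize); // prints 1.5x
        }
    }
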
> >> > > > > > > >> > >>> > >> > > > > > > On Wed, Feb 22, 2017 at 8:57 PM,
> Jay
> >> > > Kreps
> >> > > > <
> >> > > > > > > >> > >>> > jay@confluent.io>
> >> > > > > > > >> > >>> > >> > > wrote:
> >> > > > > > > >> > >>> > >> > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > > Hey Becket,
> >> > > > > > > >> > >>> > >> > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > > I get the problem we want to
> >> > > > > > > >> > >>> > >> > > > > > > > solve with this, but I don't think
> >> > > > > > > >> > >>> > >> > > > > > > > this is something that makes sense
> >> > > > > > > >> > >>> > >> > > > > > > > as a user-controlled knob that
> >> > > > > > > >> > >>> > >> > > > > > > > everyone sending data to Kafka has
> >> > > > > > > >> > >>> > >> > > > > > > > to think about. It is basically a
> >> > > > > > > >> > >>> > >> > > > > > > > bug, right?
> >> > > > > > > >> > >>> > >> > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > > First, as a technical question: is
> >> > > > > > > >> > >>> > >> > > > > > > > it true that using the
> >> > > > > > > >> > >>> > >> > > > > > > > uncompressed size for batching
> >> > > > > > > >> > >>> > >> > > > > > > > actually guarantees that you
> >> > > > > > > >> > >>> > >> > > > > > > > observe the limit? I think that
> >> > > > > > > >> > >>> > >> > > > > > > > implies that compression always
> >> > > > > > > >> > >>> > >> > > > > > > > makes the messages smaller, which
> >> > > > > > > >> > >>> > >> > > > > > > > I think is usually true but is not
> >> > > > > > > >> > >>> > >> > > > > > > > guaranteed, right? E.g. if someone
> >> > > > > > > >> > >>> > >> > > > > > > > encrypts their data, which tends
> >> > > > > > > >> > >>> > >> > > > > > > > to randomize it, and then enables
> >> > > > > > > >> > >>> > >> > > > > > > > compression, it could get slightly
> >> > > > > > > >> > >>> > >> > > > > > > > bigger?
> >> > > > > > > >> > >>> > >> > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > > I also wonder if the rejected
> >> > > > > > > >> > >>> > >> > > > > > > > alternatives you describe couldn't
> >> > > > > > > >> > >>> > >> > > > > > > > be made to work: basically, try to
> >> > > > > > > >> > >>> > >> > > > > > > > be a bit better at estimation and
> >> > > > > > > >> > >>> > >> > > > > > > > recover when we guess wrong. I
> >> > > > > > > >> > >>> > >> > > > > > > > don't think the memory usage
> >> > > > > > > >> > >>> > >> > > > > > > > should be a problem: isn't it the
> >> > > > > > > >> > >>> > >> > > > > > > > same memory usage the consumer of
> >> > > > > > > >> > >>> > >> > > > > > > > that topic would need? And can't
> >> > > > > > > >> > >>> > >> > > > > > > > you do the splitting and
> >> > > > > > > >> > >>> > >> > > > > > > > recompression in a streaming
> >> > > > > > > >> > >>> > >> > > > > > > > fashion? If we can make the
> >> > > > > > > >> > >>> > >> > > > > > > > estimation miss rate low, and the
> >> > > > > > > >> > >>> > >> > > > > > > > recovery cost is just ~2x the
> >> > > > > > > >> > >>> > >> > > > > > > > normal cost for that batch, that
> >> > > > > > > >> > >>> > >> > > > > > > > should be totally fine, right?
> >> > > > > > > >> > >>> > >> > > > > > > > (It's technically true you might
> >> > > > > > > >> > >>> > >> > > > > > > > have to split more than once, but
> >> > > > > > > >> > >>> > >> > > > > > > > since you halve it each time, I
> >> > > > > > > >> > >>> > >> > > > > > > > think you should get a number of
> >> > > > > > > >> > >>> > >> > > > > > > > halvings that is logarithmic in
> >> > > > > > > >> > >>> > >> > > > > > > > the miss size, which, with better
> >> > > > > > > >> > >>> > >> > > > > > > > estimation, you'd hope would be
> >> > > > > > > >> > >>> > >> > > > > > > > super duper small.)
> >> > > > > > > >> > >>> > >> > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > > Alternatively, maybe we could work
> >> > > > > > > >> > >>> > >> > > > > > > > on the other side of the problem
> >> > > > > > > >> > >>> > >> > > > > > > > and try to make it so that a small
> >> > > > > > > >> > >>> > >> > > > > > > > miss on message size isn't a big
> >> > > > > > > >> > >>> > >> > > > > > > > problem. I think the original
> >> > > > > > > >> > >>> > >> > > > > > > > issue was that the max size and
> >> > > > > > > >> > >>> > >> > > > > > > > fetch size were tightly coupled,
> >> > > > > > > >> > >>> > >> > > > > > > > and the way memory in the consumer
> >> > > > > > > >> > >>> > >> > > > > > > > worked, you really wanted the
> >> > > > > > > >> > >>> > >> > > > > > > > fetch size to be as small as
> >> > > > > > > >> > >>> > >> > > > > > > > possible, because you'd use that
> >> > > > > > > >> > >>> > >> > > > > > > > much memory per fetched partition
> >> > > > > > > >> > >>> > >> > > > > > > > and the consumer would get stuck
> >> > > > > > > >> > >>> > >> > > > > > > > if its fetch size wasn't big
> >> > > > > > > >> > >>> > >> > > > > > > > enough. I think we made some
> >> > > > > > > >> > >>> > >> > > > > > > > progress on that issue, and maybe
> >> > > > > > > >> > >>> > >> > > > > > > > more could be done there so that a
> >> > > > > > > >> > >>> > >> > > > > > > > small bit of fuzziness around the
> >> > > > > > > >> > >>> > >> > > > > > > > size would not be an issue?
> >> > > > > > > >> > >>> > >> > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > > -Jay
> >> > > > > > > >> > >>> > >> > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > >
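
Jay's halving argument in miniature: a toy function (the real producer splits
on record boundaries, not raw bytes) showing that the number of halving
rounds grows only logarithmically with the size of the miss.

    public final class HalvingRoundsSketch {
        // Rounds of halving before an oversized batch fits, roughly
        // ceil(log2(batchBytes / maxBytes)); a small estimation miss
        // therefore costs very few rounds.
        public static int halvingRounds(long batchBytes, long maxBytes) {
            int rounds = 0;
            while (batchBytes > maxBytes) {
                batchBytes = (batchBytes + 1) / 2; // halve each round
                rounds++;
            }
            return rounds;
        }
    }
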
> >> > > > > > > >> > >>> > >> > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > > On Tue, Feb 21, 2017 at 12:30
> PM,
> >> > > Becket
> >> > > > > Qin
> >> > > > > > <
> >> > > > > > > >> > >>> > >> > > becket.qin@gmail.com
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > > > > > wrote:
> >> > > > > > > >> > >>> > >> > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > > > Hi folks,
> >> > > > > > > >> > >>> > >> > > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > > > I would like to start the
> >> > > > > > > >> > >>> > >> > > > > > > > > discussion thread on KIP-126.
> >> > > > > > > >> > >>> > >> > > > > > > > > The KIP proposes adding a new
> >> > > > > > > >> > >>> > >> > > > > > > > > configuration to KafkaProducer
> >> > > > > > > >> > >>> > >> > > > > > > > > to allow batching based on
> >> > > > > > > >> > >>> > >> > > > > > > > > uncompressed message size.
> >> > > > > > > >> > >>> > >> > > > > > > > > Comments are welcome.
> >> > > > > > > >> > >>> > >> > > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > > > The KIP wiki is following:
> >> > > > > > > >> > >>> > >> > > > > > > > > The KIP wiki is the following:
> >> > > > > > > >> > >>> > >> > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-126+-+Allow+KafkaProducer+to+batch+based+on+uncompressed+size
> >> > > > > > > >> > >>> > >> > > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > > > Thanks,
> >> > > > > > > >> > >>> > >> > > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > > > Jiangjie (Becket) Qin
> >> > > > > > > >> > >>> > >> > > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > > >
> >> > > > > > > >> > >>> > >> > > > > > >
> >> > > > > > > >> > >>> > >> > > > > >
> >> > > > > > > >> > >>> > >> > > > >
> >> > > > > > > >> > >>> > >> > > >
> >> > > > > > > >> > >>> > >> > > >
> >> > > > > > > >> > >>> > >> > > >
> >> > > > > > > >> > >>> > >> > > > --
> >> > > > > > > >> > >>> > >> > > > -Regards,
> >> > > > > > > >> > >>> > >> > > > Mayuresh R. Gharat
> >> > > > > > > >> > >>> > >> > > > (862) 250-7125
> >> > > > > > > >> > >>> > >> > > >
> >> > > > > > > >> > >>> > >> > >
> >> > > > > > > >> > >>> > >> >
> >> > > > > > > >> > >>> > >> >
> >> > > > > > > >> > >>> > >> >
> >> > > > > > > >> > >>> > >> > --
> >> > > > > > > >> > >>> > >> > -Regards,
> >> > > > > > > >> > >>> > >> > Mayuresh R. Gharat
> >> > > > > > > >> > >>> > >> > (862) 250-7125
> >> > > > > > > >> > >>> > >> >
> >> > > > > > > >> > >>> > >>
> >> > > > > > > >> > >>> > >
> >> > > > > > > >> > >>> > >
> >> > > > > > > >> > >>> >
> >> > > > > > > >> > >>>
> >> > > > > > > >> > >>
> >> > > > > > > >> > >>
> >> > > > > > > >> > >
> >> > > > > > > >> >
> >> > > > > > > >>
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

Posted by Dong Lin <li...@gmail.com>.
Thanks for the explanation. I agree that it is not easy to have a
well-defined and accurate measurement of the split ratio.

On Wed, Apr 19, 2017 at 9:38 AM, Becket Qin <be...@gmail.com> wrote:

> Thanks for the comment, Dong. I think the batch-split-ratio makes sense but
> is kind of redundant to batch-split-rate.
>
> Also the batch-split-ratio may be a little more involved to make right:
> 1. A all-time batch split ratio is easy to get but not that useful.
> 2. A time-windowed batch-split-ratio is more complicated to make accurate.
> This is because it is kind of a "stateful" metric relies on the number of
> batches sent in a time window and number of batches got split in the same
> time window. But the sending and the splitting time are not necessarily
> falling in the same window.
>
> Besides, a rough estimation of the batch split ratio can be derived from
> the existing metrics. And I think batch-split-rate is already a good
> indication on whether the batch split has caused performance problem or
> not.
>
> So I am not sure if it is worth having an explicit batch-split-ratio metric
> in this case.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Wed, Mar 22, 2017 at 10:54 AM, Dong Lin <li...@gmail.com> wrote:
>
> > Never mind about my second comment. I misunderstood the semantics of
> > producer's batch.size.
> >
> > On Wed, Mar 22, 2017 at 10:20 AM, Dong Lin <li...@gmail.com> wrote:
> >
> > > Hey Becket,
> > >
> > > In addition to the batch-split-rate, should we also add
> batch-split-ratio
> > > sensor to gauge the probability that we have to split batch?
> > >
> > > Also, in the case that the batch size configured for the producer is
> > > smaller than the max message size configured for the broker, why can't
> we
> > > just split the batch if its size exceeds the configured batch size? The
> > > benefit of this approach is that the semantics of producer is
> > > straightforward because we enforce the batch size that user has
> > configured.
> > > The implementation would also be simpler because we don't have to reply
> > on
> > > KIP-4 to fetch the max message size from broker. I guess you are
> worrying
> > > about the overhead of "unnecessary" split if a batch size is between
> > > user-configured batch size and broker's max message size. But is
> overhead
> > > really a concern? If overhead is too large because user has configured
> a
> > > very low batch size for producer, shouldn't user adjust produce config?
> > >
> > > Thanks,
> > > Dong
> > >
> > > On Wed, Mar 15, 2017 at 2:50 PM, Becket Qin <be...@gmail.com>
> > wrote:
> > >
> > >> I see, then we are thinking about the same thing :)
> > >>
> > >> On Wed, Mar 15, 2017 at 2:26 PM, Ismael Juma <is...@juma.me.uk>
> wrote:
> > >>
> > >> > I meant finishing what's described in the following section and then
> > >> > starting a discussion followed by a vote:
> > >> >
> > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >> > 4+-+Command+line+and+centralized+administrative+operations#KIP-4-
> > >> > Commandlineandcentralizedadministrativeoperations-DescribeCo
> > >> nfigsRequest
> > >> >
> > >> > We have only voted on KIP-4 Metadata, KIP-4 Create Topics, KIP-4
> > Delete
> > >> > Topics so far.
> > >> >
> > >> > Ismael
> > >> >
> > >> > On Wed, Mar 15, 2017 at 8:58 PM, Becket Qin <be...@gmail.com>
> > >> wrote:
> > >> >
> > >> > > Hi Ismael,
> > >> > >
> > >> > > KIP-4 is also the one that I was thinking about. We have
> introduced
> > a
> > >> > > DescribeConfigRequest there so the producer can easily get the
> > >> > > configurations. By "another KIP" do you mean a new (or maybe
> > extended)
> > >> > > protocol or using that protocol in clients?
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > Jiangjie (Becket) Qin
> > >> > >
> > >> > > On Wed, Mar 15, 2017 at 1:21 PM, Ismael Juma <is...@juma.me.uk>
> > >> wrote:
> > >> > >
> > >> > > > Hi Becket,
> > >> > > >
> > >> > > > How were you thinking of retrieving the configuration items you
> > >> > > mentioned?
> > >> > > > I am asking because I was planning to post a KIP for Describe
> > >> Configs
> > >> > > (one
> > >> > > > of the protocols in KIP-4), which would expose such information.
> > But
> > >> > > maybe
> > >> > > > you are thinking of extending Metadata request?
> > >> > > >
> > >> > > > Ismael
> > >> > > >
> > >> > > > On Wed, Mar 15, 2017 at 7:33 PM, Becket Qin <
> becket.qin@gmail.com
> > >
> > >> > > wrote:
> > >> > > >
> > >> > > > > Hi Jason,
> > >> > > > >
> > >> > > > > Good point. I was thinking about that, too. I was not sure if
> > >> that is
> > >> > > the
> > >> > > > > right thing to do by default.
> > >> > > > >
> > >> > > > > If we assume people always set the batch size to max message
> > size,
> > >> > > > > splitting the oversized batch makes a lot of sense. But it
> seems
> > >> > > possible
> > >> > > > > that users want to control the memory footprint so they would
> > set
> > >> the
> > >> > > > batch
> > >> > > > > size to smaller than the max message size so the producer can
> > have
> > >> > hold
> > >> > > > > batches for more partitions. In this case, splitting the batch
> > >> might
> > >> > > not
> > >> > > > be
> > >> > > > > the desired behavior.
> > >> > > > >
> > >> > > > > I think the most intuitive approach to this is allow the
> > producer
> > >> to
> > >> > > get
> > >> > > > > the max message size configuration (as well as some other
> > >> > > configurations
> > >> > > > > such as timestamp type)  from the broker side and use that to
> > >> decide
> > >> > > > > whether a batch should be split or not. I probably should add
> > >> this to
> > >> > > the
> > >> > > > > KIP wiki.
> > >> > > > >
> > >> > > > > Thanks,
> > >> > > > >
> > >> > > > > Jiangjie (Becket) Qin
> > >> > > > >
> > >> > > > > On Wed, Mar 15, 2017 at 9:47 AM, Jason Gustafson <
> > >> jason@confluent.io
> > >> > >
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Hey Becket,
> > >> > > > > >
> > >> > > > > > Thanks for the KIP! The approach seems reasonable. One
> > >> > clarification:
> > >> > > > is
> > >> > > > > > the intent to do the splitting after the broker rejects the
> > >> request
> > >> > > > with
> > >> > > > > > MESSAGE_TOO_LARGE, or prior to sending if the configured
> batch
> > >> size
> > >> > > is
> > >> > > > > > exceeded?
> > >> > > > > >
> > >> > > > > > -Jason
> > >> > > > > >
> > >> > > > > > On Mon, Mar 13, 2017 at 8:10 PM, Becket Qin <
> > >> becket.qin@gmail.com>
> > >> > > > > wrote:
> > >> > > > > >
> > >> > > > > > > Bump up the thread for further comments. If there is no
> more
> > >> > > comments
> > >> > > > > on
> > >> > > > > > > the KIP I will start the voting thread on Wed.
> > >> > > > > > >
> > >> > > > > > > Thanks,
> > >> > > > > > >
> > >> > > > > > > Jiangjie (Becket) Qin
> > >> > > > > > >
> > >> > > > > > > On Tue, Mar 7, 2017 at 9:48 AM, Becket Qin <
> > >> becket.qin@gmail.com
> > >> > >
> > >> > > > > wrote:
> > >> > > > > > >
> > >> > > > > > > > Hi Dong,
> > >> > > > > > > >
> > >> > > > > > > > Thanks for the comments.
> > >> > > > > > > >
> > >> > > > > > > > The patch is mostly for proof of concept in case there
> is
> > >> any
> > >> > > > concern
> > >> > > > > > > > about the implementation which is indeed a little
> tricky.
> > >> > > > > > > >
> > >> > > > > > > > The new metric has already been mentioned in the Public
> > >> > Interface
> > >> > > > > > Change
> > >> > > > > > > > section.
> > >> > > > > > > >
> > >> > > > > > > > I added the reasoning about how the compression ratio
> > >> > > > > > > > improving/deteriorate steps are determined in the wiki.
> > >> > > > > > > >
> > >> > > > > > > > Thanks,
> > >> > > > > > > >
> > >> > > > > > > > Jiangjie (Becket) Qin
> > >> > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > On Mon, Mar 6, 2017 at 4:42 PM, Dong Lin <
> > >> lindong28@gmail.com>
> > >> > > > > wrote:
> > >> > > > > > > >
> > >> > > > > > > >> Hey Becket,
> > >> > > > > > > >>
> > >> > > > > > > >> I am wondering if we should first vote for the KIP
> before
> > >> > > > reviewing
> > >> > > > > > the
> > >> > > > > > > >> patch. I have two comments below:
> > >> > > > > > > >>
> > >> > > > > > > >> - Should we specify the new sensors as part of
> interface
> > >> > change
> > >> > > in
> > >> > > > > the
> > >> > > > > > > >> KIP?
> > >> > > > > > > >> - The KIP proposes to increase estimated compression
> > ratio
> > >> by
> > >> > > 0.05
> > >> > > > > for
> > >> > > > > > > >> each
> > >> > > > > > > >> underestimation and decrement the estimation by 0.005
> for
> > >> each
> > >> > > > > > > >> overestimation. Why are these two values chosen? I
> think
> > >> there
> > >> > > is
> > >> > > > > some
> > >> > > > > > > >> tradeoff in selecting the value. Can the KIP be more
> > >> explicit
> > >> > > > about
> > >> > > > > > the
> > >> > > > > > > >> tradeoff and explain how these two values would impact
> > >> > > producer's
> > >> > > > > > > >> performance?
> > >> > > > > > > >>
> > >> > > > > > > >> Thanks,
> > >> > > > > > > >> Dong
> > >> > > > > > > >>
> > >> > > > > > > >>
> > >> > > > > > > >> On Sat, Mar 4, 2017 at 11:42 AM, Becket Qin <
> > >> > > becket.qin@gmail.com
> > >> > > > >
> > >> > > > > > > wrote:
> > >> > > > > > > >>
> > >> > > > > > > >> > I have updated the KIP based on the latest
> discussion.
> > >> > Please
> > >> > > > > check
> > >> > > > > > > and
> > >> > > > > > > >> let
> > >> > > > > > > >> > me know if there is any further concern.
> > >> > > > > > > >> >
> > >> > > > > > > >> > Thanks,
> > >> > > > > > > >> >
> > >> > > > > > > >> > Jiangjie (Becket) Qin
> > >> > > > > > > >> >
> > >> > > > > > > >> > On Sat, Mar 4, 2017 at 10:56 AM, Becket Qin <
> > >> > > > becket.qin@gmail.com
> > >> > > > > >
> > >> > > > > > > >> wrote:
> > >> > > > > > > >> >
> > >> > > > > > > >> > > Actually, on second thought, rate might be better for
> > >> > > > > > > >> > > two reasons:
> > >> > > > > > > >> > > 1. Most of the metrics we already have in the producer
> > >> > > > > > > >> > > use rate instead of count.
> > >> > > > > > > >> > > 2. If a service is bounced, the count is reset to 0,
> > >> > > > > > > >> > > but the rate is not affected.
> > >> > > > > > > >> > >
> > >> > > > > > > >> > > I'll make the change.
> > >> > > > > > > >> > >
> > >> > > > > > > >> > > Thanks,
> > >> > > > > > > >> > >
> > >> > > > > > > >> > > Jiangjie (Becket) Qin
> > >> > > > > > > >> > >
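
For reference, a minimal sketch of registering such a rate sensor with
the client-side metrics library is shown below. The sensor/metric names,
group, and description strings are illustrative assumptions, not
necessarily what the patch uses:

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Rate;

    public class BatchSplitMetricSketch {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();
            Sensor batchSplitSensor = metrics.sensor("batch-split-rate");
            // Rate is a windowed stat, so bouncing the producer simply
            // starts a new window instead of visibly resetting a count.
            batchSplitSensor.add(
                metrics.metricName("batch-split-rate", "producer-metrics",
                    "The average number of batch splits per second"),
                new Rate());
            batchSplitSensor.record();  // call once per batch split
            metrics.close();
        }
    }
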
> > >> > > > > > > >> > > On Sat, Mar 4, 2017 at 10:27 AM, Becket Qin <
> > >> > > > > becket.qin@gmail.com
> > >> > > > > > >
> > >> > > > > > > >> > wrote:
> > >> > > > > > > >> > >
> > >> > > > > > > >> > >> Hi Dong,
> > >> > > > > > > >> > >>
> > >> > > > > > > >> > >> Yes, there is a sensor in the patch for split
> > >> > > > > > > >> > >> occurrences.
> > >> > > > > > > >> > >>
> > >> > > > > > > >> > >> Currently it is a count instead of a rate. In
> > >> > > > > > > >> > >> practice, count seems easier to use in this case, but
> > >> > > > > > > >> > >> I am open to change.
> > >> > > > > > > >> > >>
> > >> > > > > > > >> > >> Thanks,
> > >> > > > > > > >> > >>
> > >> > > > > > > >> > >> Jiangjie (Becket) Qin
> > >> > > > > > > >> > >>
> > >> > > > > > > >> > >> On Fri, Mar 3, 2017 at 7:43 PM, Dong Lin <
> > >> > > > lindong28@gmail.com>
> > >> > > > > > > >> wrote:
> > >> > > > > > > >> > >>
> > >> > > > > > > >> > >>> Hey Becket,
> > >> > > > > > > >> > >>>
> > >> > > > > > > >> > >>> I haven't looked at the patch yet. But since we are
> > >> > > > > > > >> > >>> going to try the split-on-oversize solution, should
> > >> > > > > > > >> > >>> the KIP also add a sensor that shows the rate of
> > >> > > > > > > >> > >>> splits per second and the probability of a split?
> > >> > > > > > > >> > >>>
> > >> > > > > > > >> > >>> Thanks,
> > >> > > > > > > >> > >>> Dong
> > >> > > > > > > >> > >>>
> > >> > > > > > > >> > >>>
> > >> > > > > > > >> > >>> On Fri, Mar 3, 2017 at 6:39 PM, Becket Qin <
> > >> > > > > > becket.qin@gmail.com>
> > >> > > > > > > >> > wrote:
> > >> > > > > > > >> > >>>
> > >> > > > > > > >> > >>> > Just to clarify, the implementation is basically
> > >> > > > > > > >> > >>> > what I mentioned above (split/resend plus the
> > >> > > > > > > >> > >>> > adjusted estimation-evolving algorithm), with the
> > >> > > > > > > >> > >>> > compression ratio estimation changed to be
> > >> > > > > > > >> > >>> > per-topic.
> > >> > > > > > > >> > >>> >
> > >> > > > > > > >> > >>> > Thanks,
> > >> > > > > > > >> > >>> >
> > >> > > > > > > >> > >>> > Jiangjie (Becket) Qin
> > >> > > > > > > >> > >>> >
> > >> > > > > > > >> > >>> > On Fri, Mar 3, 2017 at 6:36 PM, Becket Qin <
> > >> > > > > > > becket.qin@gmail.com>
> > >> > > > > > > >> > >>> wrote:
> > >> > > > > > > >> > >>> >
> > >> > > > > > > >> > >>> > > I went ahead and submitted a patch here:
> > >> > > > > > > >> > >>> > > https://github.com/apache/kafka/pull/2638
> > >> > > > > > > >> > >>> > >
> > >> > > > > > > >> > >>> > > Per Joel's suggestion, I changed the compression
> > >> > > > > > > >> > >>> > > ratio to be per topic as well. It seems to work
> > >> > > > > > > >> > >>> > > well. Since there is an important behavior
> > >> > > > > > > >> > >>> > > change and a new sensor is added, I'll keep the
> > >> > > > > > > >> > >>> > > KIP and update it accordingly.
> > >> > > > > > > >> > >>> > >
> > >> > > > > > > >> > >>> > > Thanks,
> > >> > > > > > > >> > >>> > >
> > >> > > > > > > >> > >>> > > Jiangjie (Becket) Qin
> > >> > > > > > > >> > >>> > >
> > >> > > > > > > >> > >>> > > On Mon, Feb 27, 2017 at 3:50 PM, Joel Koshy <
> > >> > > > > > > >> jjkoshy.w@gmail.com>
> > >> > > > > > > >> > >>> wrote:
> > >> > > > > > > >> > >>> > >
> > >> > > > > > > >> > >>> > >> >
> > >> > > > > > > >> > >>> > >> > Let's say we sent the batch over the wire
> > >> > > > > > > >> > >>> > >> > and received a RecordTooLargeException. How
> > >> > > > > > > >> > >>> > >> > do we split it, since once we add a message
> > >> > > > > > > >> > >>> > >> > to the batch we lose the message-level
> > >> > > > > > > >> > >>> > >> > granularity? We will have to decompress, do a
> > >> > > > > > > >> > >>> > >> > deep iteration, split, and compress again,
> > >> > > > > > > >> > >>> > >> > right? This looks like a performance
> > >> > > > > > > >> > >>> > >> > bottleneck in the case of multi-topic
> > >> > > > > > > >> > >>> > >> > producers like mirror maker.
> > >> > > > > > > >> > >>> > >> >
> > >> > > > > > > >> > >>> > >>
> > >> > > > > > > >> > >>> > >> Yes, but these should be outliers if we do
> > >> > > > > > > >> > >>> > >> estimation on a per-topic basis and if we
> > >> > > > > > > >> > >>> > >> target a conservative-enough compression ratio.
> > >> > > > > > > >> > >>> > >> The producer should also avoid sending over the
> > >> > > > > > > >> > >>> > >> wire if it can be made aware of the max message
> > >> > > > > > > >> > >>> > >> size limit on the broker, and split if it
> > >> > > > > > > >> > >>> > >> determines that a record exceeds the broker's
> > >> > > > > > > >> > >>> > >> config. Ideally this should be part of topic
> > >> > > > > > > >> > >>> > >> metadata but is not - so it could be based on a
> > >> > > > > > > >> > >>> > >> periodic describe-configs
> > >> > > > > > > >> > >>> > >> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-DescribeConfigsRequest>
> > >> > > > > > > >> > >>> > >> (which isn't available yet). This doesn't
> > >> > > > > > > >> > >>> > >> remove the need to split and recompress though.
> > >> > > > > > > >> > >>> > >>
> > >> > > > > > > >> > >>> > >>
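
As an illustration of the split-and-recompress path discussed above, the
following standalone sketch recursively halves a batch until every piece
fits under a compressed-size limit. GZIP stands in for the producer's
codec and plain byte arrays stand in for records; the real producer
operates on its own record batches, so treat all names here as
assumptions rather than the actual patch:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.zip.GZIPOutputStream;

    public class BatchSplitSketch {
        // Halve the record list until each piece compresses under the
        // limit; the number of halvings is logarithmic in the overshoot.
        static List<List<byte[]>> split(List<byte[]> records, int maxCompressedBytes)
                throws IOException {
            List<List<byte[]>> result = new ArrayList<>();
            if (records.size() <= 1 || compressedSize(records) <= maxCompressedBytes) {
                result.add(records);
                return result;
            }
            int mid = records.size() / 2;
            result.addAll(split(records.subList(0, mid), maxCompressedBytes));
            result.addAll(split(records.subList(mid, records.size()), maxCompressedBytes));
            return result;
        }

        // Deep-iterate the (already decompressed) records and recompress
        // them, returning the resulting compressed size in bytes.
        static int compressedSize(List<byte[]> records) throws IOException {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
                for (byte[] record : records)
                    gz.write(record);
            }
            return buf.size();
        }
    }
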
> > >> > > > > > > >> > >>> > >> > On Mon, Feb 27, 2017 at 10:51 AM, Becket
> > Qin <
> > >> > > > > > > >> > >>> becket.qin@gmail.com>
> > >> > > > > > > >> > >>> > >> wrote:
> > >> > > > > > > >> > >>> > >> >
> > >> > > > > > > >> > >>> > >> > > Hey Mayuresh,
> > >> > > > > > > >> > >>> > >> > >
> > >> > > > > > > >> > >>> > >> > > 1) The batch would be split when a
> > >> > > > > > > >> > >>> > >> > > RecordTooLargeException is received.
> > >> > > > > > > >> > >>> > >> > > 2) We do not lower the actual compression
> > >> > > > > > > >> > >>> > >> > > ratio; we lower the estimated compression
> > >> > > > > > > >> > >>> > >> > > ratio according to the Actual Compression
> > >> > > > > > > >> > >>> > >> > > Ratio (ACR).
> > >> > > > > > > >> > >>> > >> > >
> > >> > > > > > > >> > >>> > >> > > As an example, let's start with Estimated
> > >> > > > > > > >> > >>> > >> > > Compression Ratio (ECR) = 1.0. Say the ACR
> > >> > > > > > > >> > >>> > >> > > is ~0.8. Instead of letting the ECR drop to
> > >> > > > > > > >> > >>> > >> > > 0.8 very quickly, we only drop it by 0.001
> > >> > > > > > > >> > >>> > >> > > each time ACR < ECR. However, once we see
> > >> > > > > > > >> > >>> > >> > > an ACR > ECR, we increment the ECR by 0.05.
> > >> > > > > > > >> > >>> > >> > > If a RecordTooLargeException is received,
> > >> > > > > > > >> > >>> > >> > > we reset the ECR back to 1.0 and split the
> > >> > > > > > > >> > >>> > >> > > batch.
> > >> > > > > > > >> > >>> > >> > >
> > >> > > > > > > >> > >>> > >> > > Thanks,
> > >> > > > > > > >> > >>> > >> > >
> > >> > > > > > > >> > >>> > >> > > Jiangjie (Becket) Qin
> > >> > > > > > > >> > >>> > >> > >
> > >> > > > > > > >> > >>> > >> > >
> > >> > > > > > > >> > >>> > >> > >
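
Kept per topic (as the Mar 3 patch update above describes), that
bookkeeping might look roughly like the sketch below. The improving step
follows the 0.001 in this message (the updated KIP discussed earlier in
the thread uses 0.005); the class and method names are illustrative
assumptions, not the actual patch:

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class CompressionRatioEstimatorSketch {
        private static final float MAX_RATIO = 1.0f;           // conservative start
        private static final float IMPROVING_STEP = 0.001f;    // ACR < ECR: creep down
        private static final float DETERIORATING_STEP = 0.05f; // ACR > ECR: back off fast
        private final ConcurrentMap<String, Float> ecrByTopic = new ConcurrentHashMap<>();

        // Estimated compression ratio (ECR) used when sizing new batches.
        public float estimation(String topic) {
            return ecrByTopic.getOrDefault(topic, MAX_RATIO);
        }

        // Called with the actual compression ratio (ACR) of each closed batch.
        public void update(String topic, float acr) {
            float ecr = estimation(topic);
            float updated = acr > ecr
                ? Math.min(MAX_RATIO, ecr + DETERIORATING_STEP)
                : Math.max(0.0f, ecr - IMPROVING_STEP);
            ecrByTopic.put(topic, updated);
        }

        // Called when the broker rejects a batch with a
        // RecordTooLargeException; the caller splits and resends.
        public void reset(String topic) {
            ecrByTopic.put(topic, MAX_RATIO);
        }
    }
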
> > >> > > > > > > >> > >>> > >> > > On Mon, Feb 27, 2017 at 10:30 AM,
> Mayuresh
> > >> > > Gharat <
> > >> > > > > > > >> > >>> > >> > > gharatmayuresh15@gmail.com> wrote:
> > >> > > > > > > >> > >>> > >> > >
> > >> > > > > > > >> > >>> > >> > > > Hi Becket,
> > >> > > > > > > >> > >>> > >> > > >
> > >> > > > > > > >> > >>> > >> > > > Seems like an interesting idea.
> > >> > > > > > > >> > >>> > >> > > > I had a couple of questions:
> > >> > > > > > > >> > >>> > >> > > > 1) How do we decide when the batch should
> > >> > > > > > > >> > >>> > >> > > > be split?
> > >> > > > > > > >> > >>> > >> > > > 2) What do you mean by slowly lowering
> > >> > > > > > > >> > >>> > >> > > > the "actual" compression ratio? An
> > >> > > > > > > >> > >>> > >> > > > example would really help here.
> > >> > > > > > > >> > >>> > >> > > >
> > >> > > > > > > >> > >>> > >> > > > Thanks,
> > >> > > > > > > >> > >>> > >> > > >
> > >> > > > > > > >> > >>> > >> > > > Mayuresh
> > >> > > > > > > >> > >>> > >> > > >
> > >> > > > > > > >> > >>> > >> > > > On Fri, Feb 24, 2017 at 3:17 PM,
> Becket
> > >> Qin <
> > >> > > > > > > >> > >>> becket.qin@gmail.com
> > >> > > > > > > >> > >>> > >
> > >> > > > > > > >> > >>> > >> > > wrote:
> > >> > > > > > > >> > >>> > >> > > >
> > >> > > > > > > >> > >>> > >> > > > > Hi Jay,
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > > > Yeah, I got your point.
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > > > I think there might be a solution that
> > >> > > > > > > >> > >>> > >> > > > > does not require adding a new
> > >> > > > > > > >> > >>> > >> > > > > configuration. We can start from a very
> > >> > > > > > > >> > >>> > >> > > > > conservative compression ratio, say
> > >> > > > > > > >> > >>> > >> > > > > 1.0, and lower it very slowly according
> > >> > > > > > > >> > >>> > >> > > > > to the actual compression ratio until
> > >> > > > > > > >> > >>> > >> > > > > we hit a point where we have to split a
> > >> > > > > > > >> > >>> > >> > > > > batch. At that point, we exponentially
> > >> > > > > > > >> > >>> > >> > > > > back off on the compression ratio. The
> > >> > > > > > > >> > >>> > >> > > > > idea is somewhat like TCP. This should
> > >> > > > > > > >> > >>> > >> > > > > help avoid frequent splits.
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > > > The upper bound of the batch size is
> > >> > > > > > > >> > >>> > >> > > > > also a little awkward today, because we
> > >> > > > > > > >> > >>> > >> > > > > say the batch size is based on
> > >> > > > > > > >> > >>> > >> > > > > compressed size, but users cannot set
> > >> > > > > > > >> > >>> > >> > > > > it to the max message size because that
> > >> > > > > > > >> > >>> > >> > > > > would result in oversized messages.
> > >> > > > > > > >> > >>> > >> > > > > With this change we will be able to
> > >> > > > > > > >> > >>> > >> > > > > allow users to set the batch size close
> > >> > > > > > > >> > >>> > >> > > > > to the max message size.
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > > > However, the downside is that there
> > >> > > > > > > >> > >>> > >> > > > > could be latency spikes in the system
> > >> > > > > > > >> > >>> > >> > > > > due to the splitting, especially when
> > >> > > > > > > >> > >>> > >> > > > > many messages need to be split at the
> > >> > > > > > > >> > >>> > >> > > > > same time. That could potentially be an
> > >> > > > > > > >> > >>> > >> > > > > issue for some users.
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > > > What do you think about this approach?
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > > > Thanks,
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > > > Jiangjie (Becket) Qin
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > > > On Thu, Feb 23, 2017 at 1:31 PM, Jay
> > >> Kreps
> > >> > <
> > >> > > > > > > >> > >>> jay@confluent.io>
> > >> > > > > > > >> > >>> > >> wrote:
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > > > > Hey Becket,
> > >> > > > > > > >> > >>> > >> > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > Yeah that makes sense.
> > >> > > > > > > >> > >>> > >> > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > I agree that you'd really have to
> > >> > > > > > > >> > >>> > >> > > > > > both fix the estimation (i.e. make it
> > >> > > > > > > >> > >>> > >> > > > > > per topic or make it better at
> > >> > > > > > > >> > >>> > >> > > > > > estimating the high percentiles) AND
> > >> > > > > > > >> > >>> > >> > > > > > have the recovery mechanism. If you
> > >> > > > > > > >> > >>> > >> > > > > > are underestimating often and then
> > >> > > > > > > >> > >>> > >> > > > > > paying a high recovery price, that
> > >> > > > > > > >> > >>> > >> > > > > > won't fly.
> > >> > > > > > > >> > >>> > >> > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > I think you take my main point
> > >> > > > > > > >> > >>> > >> > > > > > though, which is just that I hate to
> > >> > > > > > > >> > >>> > >> > > > > > expose these super-low-level options
> > >> > > > > > > >> > >>> > >> > > > > > to users, because it is so hard to
> > >> > > > > > > >> > >>> > >> > > > > > explain to people what they mean and
> > >> > > > > > > >> > >>> > >> > > > > > how they should be set. So if it is
> > >> > > > > > > >> > >>> > >> > > > > > possible to make some combination of
> > >> > > > > > > >> > >>> > >> > > > > > better estimation and splitting, or
> > >> > > > > > > >> > >>> > >> > > > > > better tolerance of overage, that
> > >> > > > > > > >> > >>> > >> > > > > > would be preferable.
> > >> > > > > > > >> > >>> > >> > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > -Jay
> > >> > > > > > > >> > >>> > >> > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > On Thu, Feb 23, 2017 at 11:51 AM,
> > >> Becket
> > >> > > Qin
> > >> > > > <
> > >> > > > > > > >> > >>> > >> becket.qin@gmail.com
> > >> > > > > > > >> > >>> > >> > >
> > >> > > > > > > >> > >>> > >> > > > > wrote:
> > >> > > > > > > >> > >>> > >> > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > @Dong,
> > >> > > > > > > >> > >>> > >> > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > Thanks for the comments. The
> > >> > > > > > > >> > >>> > >> > > > > > > default behavior of the producer
> > >> > > > > > > >> > >>> > >> > > > > > > won't change. If users want to use
> > >> > > > > > > >> > >>> > >> > > > > > > the uncompressed message size, they
> > >> > > > > > > >> > >>> > >> > > > > > > will probably also bump up the
> > >> > > > > > > >> > >>> > >> > > > > > > batch size to somewhere close to
> > >> > > > > > > >> > >>> > >> > > > > > > the max message size. This will be
> > >> > > > > > > >> > >>> > >> > > > > > > noted in the documentation. BTW the
> > >> > > > > > > >> > >>> > >> > > > > > > default batch size is 16K, which is
> > >> > > > > > > >> > >>> > >> > > > > > > pretty small.
> > >> > > > > > > >> > >>> > >> > > > > > >
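
For context, the setting in question is the producer's batch.size. A
user opting into uncompressed-size batching under this proposal would
presumably raise it, along these lines (the 800 KB value is
illustrative, echoing the sizing discussion later in this message):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;

    public class BatchSizeConfigSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            // The default batch.size is 16384 bytes (16 KB). If batching
            // were based on uncompressed size, this would move toward the
            // broker's max message size.
            props.put(ProducerConfig.BATCH_SIZE_CONFIG, 800 * 1024);
            System.out.println(props);
        }
    }
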
> > >> > > > > > > >> > >>> > >> > > > > > > @Jay,
> > >> > > > > > > >> > >>> > >> > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > Yeah, we actually debated quite a
> > >> > > > > > > >> > >>> > >> > > > > > > bit internally about the best
> > >> > > > > > > >> > >>> > >> > > > > > > solution to this.
> > >> > > > > > > >> > >>> > >> > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > I completely agree it is a bug. In
> > >> > > > > > > >> > >>> > >> > > > > > > practice we usually leave some
> > >> > > > > > > >> > >>> > >> > > > > > > headroom to allow the compressed
> > >> > > > > > > >> > >>> > >> > > > > > > size to grow a little if the
> > >> > > > > > > >> > >>> > >> > > > > > > original messages are not
> > >> > > > > > > >> > >>> > >> > > > > > > compressible, for example, 1000 KB
> > >> > > > > > > >> > >>> > >> > > > > > > instead of exactly 1 MB. It is
> > >> > > > > > > >> > >>> > >> > > > > > > likely safe enough.
> > >> > > > > > > >> > >>> > >> > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > The major concern for the rejected
> > >> > > > > > > >> > >>> > >> > > > > > > alternative is performance. It
> > >> > > > > > > >> > >>> > >> > > > > > > largely depends on how frequently
> > >> > > > > > > >> > >>> > >> > > > > > > we need to split a batch, i.e. how
> > >> > > > > > > >> > >>> > >> > > > > > > likely the estimation is to go off.
> > >> > > > > > > >> > >>> > >> > > > > > > If we only need to do the split
> > >> > > > > > > >> > >>> > >> > > > > > > work occasionally, the cost would
> > >> > > > > > > >> > >>> > >> > > > > > > be amortized, so we don't need to
> > >> > > > > > > >> > >>> > >> > > > > > > worry about it too much. However,
> > >> > > > > > > >> > >>> > >> > > > > > > it looks like for a producer with
> > >> > > > > > > >> > >>> > >> > > > > > > shared topics, the estimation is
> > >> > > > > > > >> > >>> > >> > > > > > > always off. As an example, consider
> > >> > > > > > > >> > >>> > >> > > > > > > two topics, one with compression
> > >> > > > > > > >> > >>> > >> > > > > > > ratio 0.6 and the other 0.2.
> > >> > > > > > > >> > >>> > >> > > > > > > Assuming exactly the same traffic,
> > >> > > > > > > >> > >>> > >> > > > > > > the average compression ratio would
> > >> > > > > > > >> > >>> > >> > > > > > > be roughly 0.4, which is not right
> > >> > > > > > > >> > >>> > >> > > > > > > for either of the topics. So almost
> > >> > > > > > > >> > >>> > >> > > > > > > half of the batches (those of the
> > >> > > > > > > >> > >>> > >> > > > > > > topic with the 0.6 compression
> > >> > > > > > > >> > >>> > >> > > > > > > ratio) will end up larger than the
> > >> > > > > > > >> > >>> > >> > > > > > > configured batch size. When it
> > >> > > > > > > >> > >>> > >> > > > > > > comes to more topics, such as in
> > >> > > > > > > >> > >>> > >> > > > > > > mirror maker, this becomes more
> > >> > > > > > > >> > >>> > >> > > > > > > unpredictable. To avoid frequent
> > >> > > > > > > >> > >>> > >> > > > > > > rejection / splitting of the
> > >> > > > > > > >> > >>> > >> > > > > > > batches, we need to configure the
> > >> > > > > > > >> > >>> > >> > > > > > > batch size pretty conservatively.
> > >> > > > > > > >> > >>> > >> > > > > > > This could actually hurt the
> > >> > > > > > > >> > >>> > >> > > > > > > performance, because we are
> > >> > > > > > > >> > >>> > >> > > > > > > shoehorning the messages that are
> > >> > > > > > > >> > >>> > >> > > > > > > highly compressible into a small
> > >> > > > > > > >> > >>> > >> > > > > > > batch so that the other topics that
> > >> > > > > > > >> > >>> > >> > > > > > > are not that compressible will not
> > >> > > > > > > >> > >>> > >> > > > > > > become too large with the same
> > >> > > > > > > >> > >>> > >> > > > > > > batch size. At LinkedIn, our batch
> > >> > > > > > > >> > >>> > >> > > > > > > size is configured to 64 KB because
> > >> > > > > > > >> > >>> > >> > > > > > > of this. I think we may actually
> > >> > > > > > > >> > >>> > >> > > > > > > have better batching if we just use
> > >> > > > > > > >> > >>> > >> > > > > > > the uncompressed message size and
> > >> > > > > > > >> > >>> > >> > > > > > > an 800 KB batch size.
> > >> > > > > > > >> > >>> > >> > > > > > >
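
Quick arithmetic on the two-topic example above, under the simplifying
assumption that batches are closed once (uncompressed bytes x blended
estimate) reaches the configured 64 KB:

    public class EstimationDriftSketch {
        public static void main(String[] args) {
            double batchSize = 64 * 1024;  // configured target, compressed bytes
            double blended = 0.4;          // average of the 0.6 and 0.2 topics
            for (double actualRatio : new double[] {0.6, 0.2}) {
                double uncompressed = batchSize / blended;       // bytes accumulated
                double compressed = uncompressed * actualRatio;  // what really ships
                System.out.printf("actual ratio %.1f -> ~%.0f KB batch (target 64 KB)%n",
                        actualRatio, compressed / 1024);
            }
        }
    }

The 0.6 topic lands around 96 KB (50% over target), while the 0.2 topic
under-fills at roughly 32 KB, matching the over/under split described
above.
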
> > >> > > > > > > >> > >>> > >> > > > > > > We did not think about loosening
> > >> > > > > > > >> > >>> > >> > > > > > > the message size restriction, but
> > >> > > > > > > >> > >>> > >> > > > > > > that sounds like a viable solution
> > >> > > > > > > >> > >>> > >> > > > > > > given that the consumer can now
> > >> > > > > > > >> > >>> > >> > > > > > > fetch oversized messages. One
> > >> > > > > > > >> > >>> > >> > > > > > > concern would be that on the broker
> > >> > > > > > > >> > >>> > >> > > > > > > side, oversized messages will bring
> > >> > > > > > > >> > >>> > >> > > > > > > more memory pressure. With KIP-92
> > >> > > > > > > >> > >>> > >> > > > > > > we may mitigate that, but the
> > >> > > > > > > >> > >>> > >> > > > > > > memory allocation for large
> > >> > > > > > > >> > >>> > >> > > > > > > messages may not be very
> > >> > > > > > > >> > >>> > >> > > > > > > GC-friendly. I need to think about
> > >> > > > > > > >> > >>> > >> > > > > > > this a little more.
> > >> > > > > > > >> > >>> > >> > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > Thanks,
> > >> > > > > > > >> > >>> > >> > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > Jiangjie (Becket) Qin
> > >> > > > > > > >> > >>> > >> > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > On Wed, Feb 22, 2017 at 8:57 PM,
> > Jay
> > >> > > Kreps
> > >> > > > <
> > >> > > > > > > >> > >>> > jay@confluent.io>
> > >> > > > > > > >> > >>> > >> > > wrote:
> > >> > > > > > > >> > >>> > >> > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > > Hey Becket,
> > >> > > > > > > >> > >>> > >> > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > > I get the problem we want to
> > >> > > > > > > >> > >>> > >> > > > > > > > solve with this, but I don't
> > >> > > > > > > >> > >>> > >> > > > > > > > think this is something that
> > >> > > > > > > >> > >>> > >> > > > > > > > makes sense as a user-controlled
> > >> > > > > > > >> > >>> > >> > > > > > > > knob that everyone sending data
> > >> > > > > > > >> > >>> > >> > > > > > > > to Kafka has to think about. It
> > >> > > > > > > >> > >>> > >> > > > > > > > is basically a bug, right?
> > >> > > > > > > >> > >>> > >> > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > > First, as a technical question:
> > >> > > > > > > >> > >>> > >> > > > > > > > is it true that using the
> > >> > > > > > > >> > >>> > >> > > > > > > > uncompressed size for batching
> > >> > > > > > > >> > >>> > >> > > > > > > > actually guarantees that you
> > >> > > > > > > >> > >>> > >> > > > > > > > observe the limit? I think that
> > >> > > > > > > >> > >>> > >> > > > > > > > implies that compression always
> > >> > > > > > > >> > >>> > >> > > > > > > > makes the messages smaller, which
> > >> > > > > > > >> > >>> > >> > > > > > > > I think is usually true but is
> > >> > > > > > > >> > >>> > >> > > > > > > > not guaranteed, right? E.g. if
> > >> > > > > > > >> > >>> > >> > > > > > > > someone encrypts their data,
> > >> > > > > > > >> > >>> > >> > > > > > > > which tends to randomize it, and
> > >> > > > > > > >> > >>> > >> > > > > > > > then enables compression, it
> > >> > > > > > > >> > >>> > >> > > > > > > > could get slightly bigger?
> > >> > > > > > > >> > >>> > >> > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > > I also wonder if the rejected
> > >> > > > > > > >> > >>> > >> > > > > > > > alternatives you describe
> > >> > > > > > > >> > >>> > >> > > > > > > > couldn't be made to work:
> > >> > > > > > > >> > >>> > >> > > > > > > > basically try to be a bit better
> > >> > > > > > > >> > >>> > >> > > > > > > > at estimation and recover when we
> > >> > > > > > > >> > >>> > >> > > > > > > > guess wrong. I don't think the
> > >> > > > > > > >> > >>> > >> > > > > > > > memory usage should be a problem:
> > >> > > > > > > >> > >>> > >> > > > > > > > isn't it the same memory usage
> > >> > > > > > > >> > >>> > >> > > > > > > > the consumer of that topic would
> > >> > > > > > > >> > >>> > >> > > > > > > > need? And can't you do the
> > >> > > > > > > >> > >>> > >> > > > > > > > splitting and recompression in a
> > >> > > > > > > >> > >>> > >> > > > > > > > streaming fashion? If we can make
> > >> > > > > > > >> > >>> > >> > > > > > > > the estimation miss rate low and
> > >> > > > > > > >> > >>> > >> > > > > > > > the recovery cost is just ~2x the
> > >> > > > > > > >> > >>> > >> > > > > > > > normal cost for that batch, that
> > >> > > > > > > >> > >>> > >> > > > > > > > should be totally fine, right?
> > >> > > > > > > >> > >>> > >> > > > > > > > (It's technically true you might
> > >> > > > > > > >> > >>> > >> > > > > > > > have to split more than once, but
> > >> > > > > > > >> > >>> > >> > > > > > > > since you halve it each time, I
> > >> > > > > > > >> > >>> > >> > > > > > > > think you should get a number of
> > >> > > > > > > >> > >>> > >> > > > > > > > halvings that is logarithmic in
> > >> > > > > > > >> > >>> > >> > > > > > > > the miss size, which, with better
> > >> > > > > > > >> > >>> > >> > > > > > > > estimation, you'd hope would be
> > >> > > > > > > >> > >>> > >> > > > > > > > super duper small.)
> > >> > > > > > > >> > >>> > >> > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > > Alternatively, maybe we could
> > >> > > > > > > >> > >>> > >> > > > > > > > work on the other side of the
> > >> > > > > > > >> > >>> > >> > > > > > > > problem and try to make it so
> > >> > > > > > > >> > >>> > >> > > > > > > > that a small miss on message size
> > >> > > > > > > >> > >>> > >> > > > > > > > isn't a big problem. I think the
> > >> > > > > > > >> > >>> > >> > > > > > > > original issue was that max size
> > >> > > > > > > >> > >>> > >> > > > > > > > and fetch size were tightly
> > >> > > > > > > >> > >>> > >> > > > > > > > coupled, and the way memory in
> > >> > > > > > > >> > >>> > >> > > > > > > > the consumer worked, you really
> > >> > > > > > > >> > >>> > >> > > > > > > > wanted the fetch size to be as
> > >> > > > > > > >> > >>> > >> > > > > > > > small as possible, because you'd
> > >> > > > > > > >> > >>> > >> > > > > > > > use that much memory per fetched
> > >> > > > > > > >> > >>> > >> > > > > > > > partition and the consumer would
> > >> > > > > > > >> > >>> > >> > > > > > > > get stuck if its fetch size
> > >> > > > > > > >> > >>> > >> > > > > > > > wasn't big enough. I think we
> > >> > > > > > > >> > >>> > >> > > > > > > > made some progress on that issue,
> > >> > > > > > > >> > >>> > >> > > > > > > > and maybe more could be done
> > >> > > > > > > >> > >>> > >> > > > > > > > there so that a small bit of
> > >> > > > > > > >> > >>> > >> > > > > > > > fuzziness around the size would
> > >> > > > > > > >> > >>> > >> > > > > > > > not be an issue?
> > >> > > > > > > >> > >>> > >> > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > > -Jay
> > >> > > > > > > >> > >>> > >> > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > >
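
Jay's parenthetical about halvings checks out numerically: a batch that
overshoots the limit by a factor f needs about ceil(log2(f)) halvings.
A quick illustration (the byte sizes are arbitrary):

    public class HalvingsSketch {
        public static void main(String[] args) {
            int limit = 1_000_000;  // bytes
            for (int overshoot : new int[] {2, 4, 10}) {
                int size = limit * overshoot;
                int halvings = 0;
                while (size > limit) {
                    size /= 2;
                    halvings++;
                }
                System.out.printf("%2dx overshoot -> %d halvings%n", overshoot, halvings);
            }
        }
    }
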
> > >> > > > > > > >> > >>> > >> > > > > > > > On Tue, Feb 21, 2017 at 12:30
> > PM,
> > >> > > Becket
> > >> > > > > Qin
> > >> > > > > > <
> > >> > > > > > > >> > >>> > >> > > becket.qin@gmail.com
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > wrote:
> > >> > > > > > > >> > >>> > >> > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > > > Hi folks,
> > >> > > > > > > >> > >>> > >> > > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > > > I would like to start the
> > >> > > > > > > >> > >>> > >> > > > > > > > > discussion thread on KIP-126.
> > >> > > > > > > >> > >>> > >> > > > > > > > > The KIP proposes adding a new
> > >> > > > > > > >> > >>> > >> > > > > > > > > configuration to KafkaProducer
> > >> > > > > > > >> > >>> > >> > > > > > > > > to allow batching based on
> > >> > > > > > > >> > >>> > >> > > > > > > > > uncompressed message size.
> > >> > > > > > > >> > >>> > >> > > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > > > Comments are welcome.
> > >> > > > > > > >> > >>> > >> > > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > > > The KIP wiki is the following:
> > >> > > > > > > >> > >>> > >> > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-126+-+Allow+KafkaProducer+to+batch+based+on+uncompressed+size
> > >> > > > > > > >> > >>> > >> > > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > > > Thanks,
> > >> > > > > > > >> > >>> > >> > > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > > > Jiangjie (Becket) Qin
> > >> > > > > > > >> > >>> > >> > > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > > >
> > >> > > > > > > >> > >>> > >> > > > > >
> > >> > > > > > > >> > >>> > >> > > > >
> > >> > > > > > > >> > >>> > >> > > >
> > >> > > > > > > >> > >>> > >> > > >
> > >> > > > > > > >> > >>> > >> > > >
> > >> > > > > > > >> > >>> > >> > > > --
> > >> > > > > > > >> > >>> > >> > > > -Regards,
> > >> > > > > > > >> > >>> > >> > > > Mayuresh R. Gharat
> > >> > > > > > > >> > >>> > >> > > > (862) 250-7125
> > >> > > > > > > >> > >>> > >> > > >
> > >> > > > > > > >> > >>> > >> > >
> > >> > > > > > > >> > >>> > >> >
> > >> > > > > > > >> > >>> > >> >
> > >> > > > > > > >> > >>> > >> >
> > >> > > > > > > >> > >>> > >> > --
> > >> > > > > > > >> > >>> > >> > -Regards,
> > >> > > > > > > >> > >>> > >> > Mayuresh R. Gharat
> > >> > > > > > > >> > >>> > >> > (862) 250-7125
> > >> > > > > > > >> > >>> > >> >