You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Nakamura <nn...@gmail.com> on 2021/06/01 14:00:10 UTC

Re: [DISCUSS] KIP-739: Block Less on KafkaProducer#send

Hi Colin,

Sorry, I still don't follow.

Right now `KafkaProducer#send` seems to trigger a metadata fetch.  Today,
we block on that before returning.  Is your proposal that we move the
metadata fetch out of `KafkaProducer#send` entirely?

Even if the metadata fetch moves to be non-blocking, I think we still need
to deal with the problems we've discussed before if the fetch happens in
the `KafkaProducer#send` method.  How do we maintain the ordering semantics
of `KafkaProducer#send`?  How do we prevent our buffer from filling up?
Which thread is responsible for checking poll()?

The only approach I can see that would avoid this would be moving the
metadata fetch to happen at a different time.  But it's not clear to me
when would be a more appropriate time to do the metadata fetch than
`KafkaProducer#send`.

I think there's something I'm missing here.  Would you mind helping me
figure out what it is?

Best,
Moses

On Sun, May 30, 2021 at 5:35 PM Colin McCabe <cm...@apache.org> wrote:

> On Tue, May 25, 2021, at 11:26, Nakamura wrote:
> > Hey Colin,
> >
> > For the metadata case, what would fixing the bug look like?  I agree that
> > we should fix it, but I don't have a clear picture in my mind of what
> > fixing it should look like.  Can you elaborate?
> >
>
> If the blocking metadata fetch bug were fixed, neither the producer nor
> the consumer would block while fetching metadata. A poll() call would
> initiate a metadata fetch if needed, and a subsequent call to poll() would
> handle the results if needed. Basically the same paradigm we use for other
> network communication in the producer and consumer.
>
> best,
> Colin
>
> > Best,
> > Moses
> >
> > On Mon, May 24, 2021 at 1:54 PM Colin McCabe <cm...@apache.org> wrote:
> >
> > > Hi all,
> > >
> > > I agree that we should give users the option of having a fully async
> API,
> > > but I don't think external thread pools or queues are the right
> direction
> > > to go here. They add performance overheads and don't address the root
> > > causes of the problem.
> > >
> > > There are basically two scenarios where we block, currently. One is
> when
> > > we are doing a metadata fetch. I think this is clearly a bug, or at
> least
> > > an implementation limitation. From the user's point of view, the fact
> that
> > > we are doing a metadata fetch is an implementation detail that really
> > > shouldn't be exposed like this. We have talked about fixing this in the
> > > past. I think we just should spend the time to do it.
> > >
> > > The second scenario is where the client has produced too much data in
> too
> > > little time. This could happen if there is a network glitch, or the
> server
> > > is slower than expected. In this case, the behavior is intentional and
> not
> > > a bug. To understand this, think about what would happen if we didn't
> > > block. We would start buffering more and more data in memory, until
> finally
> > > the application died with an out of memory error. That would be
> frustrating
> > > for users and wouldn't add to the usability of Kafka.
> > >
> > > We could potentially have an option to handle the out-of-memory
> scenario
> > > differently by returning an error code immediately rather than
> blocking.
> > > Applications would have to be rewritten to handle this properly, but
> it is
> > > a possibility. I suspect that most of them wouldn't use this, but we
> could
> > > offer it as a possibility for async purists (which might include
> certain
> > > frameworks). The big problem the users would have to solve is what to
> do
> > > with the record that they were unable to produce due to the buffer full
> > > issue.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Thu, May 20, 2021, at 10:35, Nakamura wrote:
> > > > >
> > > > > My suggestion was just do this in multiple steps/phases, firstly
> let's
> > > fix
> > > > > the issue of send being misleadingly asynchronous (i.e. internally
> its
> > > > > blocking) and then later one we can make the various
> > > > > threadpools configurable with a sane default.
> > > >
> > > > I like that approach. I updated the "Which thread should be
> responsible
> > > for
> > > > waiting" part of KIP-739 to add your suggestion as my recommended
> > > approach,
> > > > thank you!  If no one else has major concerns about that approach,
> I'll
> > > > move the alternatives to "rejected alternatives".
> > > >
> > > > On Thu, May 20, 2021 at 7:26 AM Matthew de Detrich
> > > > <ma...@aiven.io.invalid> wrote:
> > > >
> > > > > @
> > > > >
> > > > > Nakamura
> > > > > On Wed, May 19, 2021 at 7:35 PM Nakamura <nn...@gmail.com> wrote:
> > > > >
> > > > > > @Ryanne:
> > > > > > In my mind's eye I slightly prefer the throwing the "cannot
> enqueue"
> > > > > > exception to satisfying the future immediately with the "cannot
> > > enqueue"
> > > > > > exception?  But I agree, it would be worth doing more research.
> > > > > >
> > > > > > @Matthew:
> > > > > >
> > > > > > > 3. Using multiple thread pools is definitely recommended for
> > > different
> > > > > > > types of tasks, for serialization which is CPU bound you
> definitely
> > > > > would
> > > > > > > want to use a bounded thread pool that is fixed by the number
> of
> > > CPU's
> > > > > > (or
> > > > > > > something along those lines).
> > > > > > >
> https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > is
> > > > > a
> > > > > > > very good guide on this topic
> > > > > > I think this guide is good in general, but I would be hesitant to
> > > follow
> > > > > > its guidance re: offloading serialization without benchmarking
> it.
> > > My
> > > > > > understanding is that context-switches have gotten much cheaper,
> and
> > > that
> > > > > > gains from cache locality are small, but they're not nothing.
> > > Especially
> > > > > > if the workload has a very small serialization cost, I wouldn't
> be
> > > > > shocked
> > > > > > if it made it slower.  I feel pretty strongly that we should do
> more
> > > > > > research here before unconditionally encouraging serialization
> in a
> > > > > > threadpool.  If people think it's important to do it here (eg if
> we
> > > think
> > > > > > it would mean another big API change) then we should start
> thinking
> > > about
> > > > > > what benchmarking we can do to gain higher confidence in this
> kind of
> > > > > > change.  However, I don't think it would change semantics as
> > > > > substantially
> > > > > > as we're proposing here, so I would vote for pushing this to a
> > > subsequent
> > > > > > KIP.
> > > > > >
> > > > > Of course, its all down to benchmarking, benchmarking and
> benchmarking.
> > > > > Ideally speaking you want to use all of the resources available to
> > > you, so
> > > > > if you have a bottleneck in serialization and you have many cores
> free
> > > then
> > > > > using multiple cores may be more appropriate than a single core.
> > > Typically
> > > > > I would expect that using a single thread to do serialization is
> > > likely to
> > > > > be the most situation, I was just responding to an earlier point
> that
> > > was
> > > > > made in regards to using ThreadPools for serialization (note that
> you
> > > can
> > > > > also just use a ThreadPool that is pinned to a single thread)
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > > 4. Regarding providing the ability for users to supply their
> own
> > > custom
> > > > > > > ThreadPool this is more of an ergonomics question for the API.
> > > > > Especially
> > > > > > > when it gets to monitoring/tracing, giving the ability for
> users to
> > > > > > provide
> > > > > > > their own custom IO/CPU ThreadPools is ideal however as stated
> > > doing so
> > > > > > > means a lot of boilerplatery changes to the API. Typically
> > > speaking a
> > > > > lot
> > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > ExecutionContext/ThreadPools
> > > > > > > (at least on a more rudimentary level) and hence allowing
> users to
> > > > > supply
> > > > > > a
> > > > > > > global singleton ThreadPool for IO tasks and another for CPU
> tasks
> > > > > makes
> > > > > > > their lives a lot easier. However due to the large amount of
> > > changes to
> > > > > > the
> > > > > > > API, it may be more appropriate to just use internal thread
> pools
> > > (for
> > > > > > now)
> > > > > > > since at least it's not any worse than what exists currently
> and
> > > this
> > > > > can
> > > > > > > be an improvement that is done later?
> > > > > > Is there an existing threadpool that you suggest we reuse?  Or
> are
> > > you
> > > > > > imagining that we make our own internal threadpool, and then
> maybe
> > > expose
> > > > > > configuration flags to manipulate it?  For what it's worth, I
> like
> > > having
> > > > > > an internal threadpool (perhaps just FJP.commonpool) and then
> > > providing
> > > > > an
> > > > > > alternative to pass your own threadpool.  That way people who
> want
> > > finer
> > > > > > control can get it, and everyone else can do OK with the default.
> > > > > >
> > > > > Indeed that is what I am saying. The most ideal situation is that
> > > there is
> > > > > a default internal threadpool that Kafka uses, however users of
> Kafka
> > > can
> > > > > configure there own threadpool. Having a singleton ThreadPool for
> > > blocking
> > > > > IO, non blocking IO and CPU bound tasks which can be plugged in
> all of
> > > your
> > > > > libraries (including Kafka) makes resource management much easier
> to
> > > do and
> > > > > also gives control of users to override specific threadpools for
> > > > > exceptional cases (i.e. providing a Threadpool that is pinned to a
> > > single
> > > > > core which tends to give the best latency results if this is
> something
> > > that
> > > > > is critical for you).
> > > > >
> > > > > My suggestion was just do this in multiple steps/phases, firstly
> let's
> > > fix
> > > > > the issue of send being misleadingly asynchronous (i.e. internally
> its
> > > > > blocking) and then later one we can make the various
> > > > > threadpools configurable with a sane default.
> > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, May 19, 2021 at 6:01 AM Matthew de Detrich
> > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > >
> > > > > > > Here are my two cents here (note that I am only seeing this on
> a
> > > > > surface
> > > > > > > level)
> > > > > > >
> > > > > > > 1. If we are going this road it makes sense to do this
> "properly"
> > > (i.e.
> > > > > > > using queues  as Ryan suggested). The reason I am saying this
> is
> > > that
> > > > > it
> > > > > > > seems that the original goal of the KIP is for it to be used in
> > > other
> > > > > > > asynchronous systems and from my personal experience, you
> really do
> > > > > need
> > > > > > to
> > > > > > > make the implementation properly asynchronous otherwise it's
> > > really not
> > > > > > > that useful.
> > > > > > > 2. Due to the previous point and what was said by others, this
> is
> > > > > likely
> > > > > > > going to break some existing semantics (i.e. people are
> currently
> > > > > relying
> > > > > > > on blocking semantics) so adding another method's/interface
> plus
> > > > > > > deprecating the older one is more annoying but ideal.
> > > > > > > 3. Using multiple thread pools is definitely recommended for
> > > different
> > > > > > > types of tasks, for serialization which is CPU bound you
> definitely
> > > > > would
> > > > > > > want to use a bounded thread pool that is fixed by the number
> of
> > > CPU's
> > > > > > (or
> > > > > > > something along those lines).
> > > > > > >
> https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > is
> > > > > a
> > > > > > > very good guide on this topic
> > > > > > > 4. Regarding providing the ability for users to supply their
> own
> > > custom
> > > > > > > ThreadPool this is more of an ergonomics question for the API.
> > > > > Especially
> > > > > > > when it gets to monitoring/tracing, giving the ability for
> users to
> > > > > > provide
> > > > > > > their own custom IO/CPU ThreadPools is ideal however as stated
> > > doing so
> > > > > > > means a lot of boilerplatery changes to the API. Typically
> > > speaking a
> > > > > lot
> > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > ExecutionContext/ThreadPools
> > > > > > > (at least on a more rudimentary level) and hence allowing
> users to
> > > > > > supply a
> > > > > > > global singleton ThreadPool for IO tasks and another for CPU
> tasks
> > > > > makes
> > > > > > > their lives a lot easier. However due to the large amount of
> > > changes to
> > > > > > the
> > > > > > > API, it may be more appropriate to just use internal thread
> pools
> > > (for
> > > > > > now)
> > > > > > > since at least it's not any worse than what exists currently
> and
> > > this
> > > > > can
> > > > > > > be an improvement that is done later?
> > > > > > >
> > > > > > > On Wed, May 19, 2021 at 2:56 AM Ryanne Dolan <
> > > ryannedolan@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I was thinking the sender would typically wrap send() in a
> > > > > > backoff/retry
> > > > > > > > loop, or else ignore any failures and drop sends on the floor
> > > > > > > > (fire-and-forget), and in both cases I think failing
> immediately
> > > is
> > > > > > > better
> > > > > > > > than blocking for a new spot in the queue or asynchronously
> > > failing
> > > > > > > > somehow.
> > > > > > > >
> > > > > > > > I think a failed future is adequate for the "explicit
> > > backpressure
> > > > > > > signal"
> > > > > > > > while avoiding any blocking anywhere. I think if we try to
> > > > > > asynchronously
> > > > > > > > signal the caller of failure (either by asynchronously
> failing
> > > the
> > > > > > future
> > > > > > > > or invoking a callback off-thread or something) we'd force
> the
> > > caller
> > > > > > to
> > > > > > > > either block or poll waiting for that signal, which somewhat
> > > defeats
> > > > > > the
> > > > > > > > purpose we're after. And of course blocking for a spot in the
> > > queue
> > > > > > > > definitely defeats the purpose (tho perhaps ameliorates the
> > > problem
> > > > > > > some).
> > > > > > > >
> > > > > > > > Throwing an exception to the caller directly (not via the
> > > future) is
> > > > > > > > another option with precedent in Kafka clients, tho it
> doesn't
> > > seem
> > > > > as
> > > > > > > > ergonomic to me.
> > > > > > > >
> > > > > > > > It would be interesting to analyze some existing usage and
> > > determine
> > > > > > how
> > > > > > > > difficult it would be to convert it to the various proposed
> APIs.
> > > > > > > >
> > > > > > > > Ryanne
> > > > > > > >
> > > > > > > > On Tue, May 18, 2021, 3:27 PM Nakamura <nn...@gmail.com>
> wrote:
> > > > > > > >
> > > > > > > > > Hi Ryanne,
> > > > > > > > >
> > > > > > > > > Hmm, that's an interesting idea.  Basically it would mean
> that
> > > > > after
> > > > > > > > > calling send, you would also have to check whether the
> returned
> > > > > > future
> > > > > > > > had
> > > > > > > > > failed with a specific exception.  I would be open to it,
> > > although
> > > > > I
> > > > > > > > think
> > > > > > > > > it might be slightly more surprising, since right now the
> > > paradigm
> > > > > is
> > > > > > > > > "enqueue synchronously, the future represents whether we
> > > succeeded
> > > > > in
> > > > > > > > > sending or not" and the new one would be "enqueue
> > > synchronously,
> > > > > the
> > > > > > > > future
> > > > > > > > > either represents whether we succeeded in enqueueing or
> not (in
> > > > > which
> > > > > > > > case
> > > > > > > > > it will be failed immediately if it failed to enqueue) or
> > > whether
> > > > > we
> > > > > > > > > succeeded in sending or not".
> > > > > > > > >
> > > > > > > > > But you're right, it should be on the table, thank you for
> > > > > suggesting
> > > > > > > it!
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Moses
> > > > > > > > >
> > > > > > > > > On Tue, May 18, 2021 at 12:23 PM Ryanne Dolan <
> > > > > ryannedolan@gmail.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Moses, in the case of a full queue, could we just return
> a
> > > failed
> > > > > > > > future
> > > > > > > > > > immediately?
> > > > > > > > > >
> > > > > > > > > > Ryanne
> > > > > > > > > >
> > > > > > > > > > On Tue, May 18, 2021, 10:39 AM Nakamura <
> nnythm@gmail.com>
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Alexandre,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for bringing this up, I think I could use some
> > > feedback
> > > > > in
> > > > > > > > this
> > > > > > > > > > > area.  There are two mechanisms here, one for slowing
> down
> > > when
> > > > > > we
> > > > > > > > > don't
> > > > > > > > > > > have the relevant metadata, and the other for slowing
> down
> > > > > when a
> > > > > > > > queue
> > > > > > > > > > has
> > > > > > > > > > > filled up.  Although the first one applies backpressure
> > > > > somewhat
> > > > > > > > > > > inadvertently, we could still get in trouble if we're
> not
> > > > > > providing
> > > > > > > > > > > information to the mechanism that monitors whether
> we're
> > > > > queueing
> > > > > > > too
> > > > > > > > > > > much.  As for the second one, that is a classic
> > > backpressure
> > > > > use
> > > > > > > > case,
> > > > > > > > > so
> > > > > > > > > > > it's definitely important that we don't drop that
> ability.
> > > > > > > > > > >
> > > > > > > > > > > Right now backpressure is applied by blocking, which
> is a
> > > > > natural
> > > > > > > way
> > > > > > > > > to
> > > > > > > > > > > apply backpressure in synchronous systems, but can
> lead to
> > > > > > > > unnecessary
> > > > > > > > > > > slowdowns in asynchronous systems.  In my opinion, the
> > > safest
> > > > > way
> > > > > > > to
> > > > > > > > > > apply
> > > > > > > > > > > backpressure in an asynchronous model is to have an
> > > explicit
> > > > > > > > > backpressure
> > > > > > > > > > > signal.  A good example would be returning an
> exception,
> > > and
> > > > > > > > providing
> > > > > > > > > an
> > > > > > > > > > > optional hook to add a callback onto so that you can be
> > > > > notified
> > > > > > > when
> > > > > > > > > > it's
> > > > > > > > > > > ready to accept more messages.
> > > > > > > > > > >
> > > > > > > > > > > However, this would be a really big change to how
> users use
> > > > > > > > > > > KafkaProducer#send, so I don't know how much appetite
> we
> > > have
> > > > > for
> > > > > > > > > making
> > > > > > > > > > > that kind of change.  Maybe it would be simpler to
> remove
> > > the
> > > > > > > "don't
> > > > > > > > > > block
> > > > > > > > > > > when the per-topic queue is full" from the scope of
> this
> > > KIP,
> > > > > and
> > > > > > > > only
> > > > > > > > > > > focus on when metadata is available?  The downside is
> that
> > > we
> > > > > > will
> > > > > > > > > > probably
> > > > > > > > > > > want to change the API again later to fix this, so it
> > > might be
> > > > > > > better
> > > > > > > > > to
> > > > > > > > > > > just rip the bandaid off now.
> > > > > > > > > > >
> > > > > > > > > > > One slightly nasty thing here is that because queueing
> > > order is
> > > > > > > > > > important,
> > > > > > > > > > > if we want to use exceptions, we will want to be able
> to
> > > signal
> > > > > > the
> > > > > > > > > > failure
> > > > > > > > > > > to enqueue to the caller in such a way that they can
> still
> > > > > > enforce
> > > > > > > > > > message
> > > > > > > > > > > order if they want.  So we can't embed the failure
> > > directly in
> > > > > > the
> > > > > > > > > > returned
> > > > > > > > > > > future, we should either return two futures (nested,
> or as
> > > a
> > > > > > tuple)
> > > > > > > > or
> > > > > > > > > > else
> > > > > > > > > > > throw an exception to explain a backpressure.
> > > > > > > > > > >
> > > > > > > > > > > So there are a few things we should work out here:
> > > > > > > > > > >
> > > > > > > > > > > 1. Should we keep the "too many bytes enqueued" part of
> > > this in
> > > > > > > > scope?
> > > > > > > > > > (I
> > > > > > > > > > > would say yes, so that we can minimize churn in this
> API)
> > > > > > > > > > > 2. How should we signal backpressure so that it's
> > > appropriate
> > > > > for
> > > > > > > > > > > asynchronous systems?  (I would say that we should
> throw an
> > > > > > > > exception.
> > > > > > > > > > If
> > > > > > > > > > > we choose this and we want to pursue the queueing
> path, we
> > > > > would
> > > > > > > > *not*
> > > > > > > > > > want
> > > > > > > > > > > to enqueue messages that would push us over the limit,
> and
> > > > > would
> > > > > > > only
> > > > > > > > > > want
> > > > > > > > > > > to enqueue messages when we're waiting for metadata,
> and we
> > > > > would
> > > > > > > > want
> > > > > > > > > to
> > > > > > > > > > > keep track of the total number of bytes for those
> > > messages).
> > > > > > > > > > >
> > > > > > > > > > > What do you think?
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Moses
> > > > > > > > > > >
> > > > > > > > > > > On Sun, May 16, 2021 at 6:16 AM Alexandre Dupriez <
> > > > > > > > > > > alexandre.dupriez@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hello Nakamura,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for proposing this change. I can see how the
> > > blocking
> > > > > > > > > behaviour
> > > > > > > > > > > > can be a problem when integrating with reactive
> > > frameworks
> > > > > such
> > > > > > > as
> > > > > > > > > > > > Akka. One of the questions I would have is how you
> would
> > > > > handle
> > > > > > > > back
> > > > > > > > > > > > pressure and avoid memory exhaustion when the
> producer's
> > > > > buffer
> > > > > > > is
> > > > > > > > > > > > full and tasks would start to accumulate in the
> > > out-of-band
> > > > > > queue
> > > > > > > > or
> > > > > > > > > > > > thread pool introduced with this KIP.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Alexandre
> > > > > > > > > > > >
> > > > > > > > > > > > Le ven. 14 mai 2021 à 15:55, Ryanne Dolan <
> > > > > > ryannedolan@gmail.com
> > > > > > > >
> > > > > > > > a
> > > > > > > > > > > écrit
> > > > > > > > > > > > :
> > > > > > > > > > > > >
> > > > > > > > > > > > > Makes sense!
> > > > > > > > > > > > >
> > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, May 14, 2021, 9:39 AM Nakamura <
> > > nnythm@gmail.com>
> > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I see what you're saying about serde blocking,
> but I
> > > > > think
> > > > > > we
> > > > > > > > > > should
> > > > > > > > > > > > > > consider it out of scope for this patch.  Right
> now
> > > we've
> > > > > > > > nailed
> > > > > > > > > > > down a
> > > > > > > > > > > > > > couple of use cases where we can unambiguously
> say,
> > > "I
> > > > > can
> > > > > > > make
> > > > > > > > > > > > progress
> > > > > > > > > > > > > > now" or "I cannot make progress now", which
> makes it
> > > > > > possible
> > > > > > > > to
> > > > > > > > > > > > offload to
> > > > > > > > > > > > > > a different thread only if we are unable to make
> > > > > progress.
> > > > > > > > > > Extending
> > > > > > > > > > > > this
> > > > > > > > > > > > > > to CPU work like serde would mean always
> offloading,
> > > > > which
> > > > > > > > would
> > > > > > > > > > be a
> > > > > > > > > > > > > > really big performance change.  It might be worth
> > > > > exploring
> > > > > > > > > anyway,
> > > > > > > > > > > > but I'd
> > > > > > > > > > > > > > rather keep this patch focused on improving
> > > ergonomics,
> > > > > > > rather
> > > > > > > > > than
> > > > > > > > > > > > > > muddying the waters with evaluating performance
> very
> > > > > > deeply.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I think if we really do want to support serde or
> > > > > > interceptors
> > > > > > > > > that
> > > > > > > > > > do
> > > > > > > > > > > > IO on
> > > > > > > > > > > > > > the send path (which seems like an anti-pattern
> to
> > > me),
> > > > > we
> > > > > > > > should
> > > > > > > > > > > > consider
> > > > > > > > > > > > > > making that a separate SIP, and probably also
> > > consider
> > > > > > > changing
> > > > > > > > > the
> > > > > > > > > > > > API to
> > > > > > > > > > > > > > use Futures (or CompletionStages).  But I would
> > > rather
> > > > > > avoid
> > > > > > > > > scope
> > > > > > > > > > > > creep,
> > > > > > > > > > > > > > so that we have a better chance of fixing this
> part
> > > of
> > > > > the
> > > > > > > > > problem.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Yes, I think some exceptions will move to being
> async
> > > > > > instead
> > > > > > > > of
> > > > > > > > > > > sync.
> > > > > > > > > > > > > > They'll still be surfaced in the Future, so I'm
> not
> > > so
> > > > > > > > confident
> > > > > > > > > > that
> > > > > > > > > > > > it
> > > > > > > > > > > > > > would be that big a shock to users though.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, May 13, 2021 at 7:44 PM Ryanne Dolan <
> > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > re serialization, my concern is that
> serialization
> > > > > often
> > > > > > > > > accounts
> > > > > > > > > > > > for a
> > > > > > > > > > > > > > lot
> > > > > > > > > > > > > > > of the cycles spent before returning the
> future.
> > > It's
> > > > > not
> > > > > > > > > > blocking
> > > > > > > > > > > > per
> > > > > > > > > > > > > > se,
> > > > > > > > > > > > > > > but it's the same effect from the caller's
> > > perspective.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Moreover, serde impls often block themselves,
> e.g.
> > > when
> > > > > > > > > fetching
> > > > > > > > > > > > schemas
> > > > > > > > > > > > > > > from a registry. I suppose it's also possible
> to
> > > block
> > > > > in
> > > > > > > > > > > > Interceptors
> > > > > > > > > > > > > > > (e.g. writing audit events or metrics), which
> > > happens
> > > > > > > before
> > > > > > > > > > serdes
> > > > > > > > > > > > iiuc.
> > > > > > > > > > > > > > > So any blocking in either of those plugins
> would
> > > block
> > > > > > the
> > > > > > > > send
> > > > > > > > > > > > unless we
> > > > > > > > > > > > > > > queue first.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > So I think we want to queue first and do
> everything
> > > > > > > > off-thread
> > > > > > > > > > when
> > > > > > > > > > > > using
> > > > > > > > > > > > > > > the new API, whatever that looks like. I just
> want
> > > to
> > > > > > make
> > > > > > > > sure
> > > > > > > > > > we
> > > > > > > > > > > > don't
> > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > that for clients that wouldn't expect it.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Another consideration is exception handling.
> If we
> > > > > queue
> > > > > > > > right
> > > > > > > > > > > away,
> > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > defer some exceptions that currently are
> thrown to
> > > the
> > > > > > > caller
> > > > > > > > > > > > (before the
> > > > > > > > > > > > > > > future is returned). In the new API, the send()
> > > > > wouldn't
> > > > > > > > throw
> > > > > > > > > > any
> > > > > > > > > > > > > > > exceptions, and instead the future would fail.
> I
> > > think
> > > > > > that
> > > > > > > > > might
> > > > > > > > > > > > mean
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > a new method signature is required.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Thu, May 13, 2021, 2:57 PM Nakamura <
> > > > > > > > > nakamura.moses@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I agree we should add an additional
> constructor
> > > (or
> > > > > > else
> > > > > > > an
> > > > > > > > > > > > additional
> > > > > > > > > > > > > > > > overload in KafkaProducer#send, but the new
> > > > > constructor
> > > > > > > > would
> > > > > > > > > > be
> > > > > > > > > > > > easier
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > understand) if we're targeting the "user
> > > provides the
> > > > > > > > thread"
> > > > > > > > > > > > approach.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > From looking at the code, I think we can keep
> > > record
> > > > > > > > > > > serialization
> > > > > > > > > > > > on
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > user thread, if we consider that an important
> > > part of
> > > > > > the
> > > > > > > > > > > > semantics of
> > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > method.  It doesn't seem like serialization
> > > depends
> > > > > on
> > > > > > > > > knowing
> > > > > > > > > > > the
> > > > > > > > > > > > > > > cluster,
> > > > > > > > > > > > > > > > I think it's incidental that it comes after
> the
> > > first
> > > > > > > > > > "blocking"
> > > > > > > > > > > > > > segment
> > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > the method.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, May 13, 2021 at 2:38 PM Ryanne Dolan
> <
> > > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hey Moses, I like the direction here. My
> > > thinking
> > > > > is
> > > > > > > > that a
> > > > > > > > > > > > single
> > > > > > > > > > > > > > > > > additional work queue, s.t. send() can
> enqueue
> > > and
> > > > > > > > return,
> > > > > > > > > > > seems
> > > > > > > > > > > > like
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > lightest touch. However, I don't think we
> can
> > > > > > trivially
> > > > > > > > > > process
> > > > > > > > > > > > that
> > > > > > > > > > > > > > > > queue
> > > > > > > > > > > > > > > > > in an internal thread pool without subtly
> > > changing
> > > > > > > > behavior
> > > > > > > > > > for
> > > > > > > > > > > > some
> > > > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > > > For example, users will often run send() in
> > > > > multiple
> > > > > > > > > threads
> > > > > > > > > > in
> > > > > > > > > > > > order
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > serialize faster, but that wouldn't work
> quite
> > > the
> > > > > > same
> > > > > > > > if
> > > > > > > > > > > there
> > > > > > > > > > > > were
> > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > internal thread pool.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > For this reason I'm thinking we need to
> make
> > > sure
> > > > > any
> > > > > > > > such
> > > > > > > > > > > > changes
> > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > opt-in. Maybe a new constructor with an
> > > additional
> > > > > > > > > > > ThreadFactory
> > > > > > > > > > > > > > > > parameter.
> > > > > > > > > > > > > > > > > That would at least clearly indicate that
> work
> > > will
> > > > > > > > happen
> > > > > > > > > > > > > > off-thread,
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > would require opt-in for the new behavior.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Under the hood, this ThreadFactory could be
> > > used to
> > > > > > > > create
> > > > > > > > > > the
> > > > > > > > > > > > worker
> > > > > > > > > > > > > > > > > thread that process queued sends, which
> could
> > > > > fan-out
> > > > > > > to
> > > > > > > > > > > > > > per-partition
> > > > > > > > > > > > > > > > > threads from there.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > So then you'd have two ways to send: the
> > > existing
> > > > > > way,
> > > > > > > > > where
> > > > > > > > > > > > serde
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > interceptors and whatnot are executed on
> the
> > > > > calling
> > > > > > > > > thread,
> > > > > > > > > > > and
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > way, which returns right away and uses an
> > > internal
> > > > > > > > > Executor.
> > > > > > > > > > As
> > > > > > > > > > > > you
> > > > > > > > > > > > > > > point
> > > > > > > > > > > > > > > > > out, the semantics would be identical in
> either
> > > > > case,
> > > > > > > and
> > > > > > > > > it
> > > > > > > > > > > > would be
> > > > > > > > > > > > > > > > very
> > > > > > > > > > > > > > > > > easy for clients to switch.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Thu, May 13, 2021, 9:00 AM Nakamura <
> > > > > > > nnythm@gmail.com
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hey Folks,
> > > > > > > > > > > > > > > > > > I just posted a new proposal
> > > > > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181306446
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > in the wiki.  I think we have an
> opportunity
> > > to
> > > > > > > improve
> > > > > > > > > the
> > > > > > > > > > > > > > > > > > KafkaProducer#send user experience.  It
> would
> > > > > > > certainly
> > > > > > > > > > make
> > > > > > > > > > > > our
> > > > > > > > > > > > > > > lives
> > > > > > > > > > > > > > > > > > easier.  Please take a look!  There are
> two
> > > > > > > subproblems
> > > > > > > > > > that
> > > > > > > > > > > I
> > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > guidance on, so I would appreciate
> feedback
> > > on
> > > > > both
> > > > > > > of
> > > > > > > > > > them.
> > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Matthew de Detrich
> > > > > > >
> > > > > > > *Aiven Deutschland GmbH*
> > > > > > >
> > > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > > >
> > > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > > >
> > > > > > > *m:* +491603708037
> > > > > > >
> > > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Matthew de Detrich
> > > > >
> > > > > *Aiven Deutschland GmbH*
> > > > >
> > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > >
> > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > >
> > > > > *m:* +491603708037
> > > > >
> > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-739: Block Less on KafkaProducer#send

Posted by Nakamura <nn...@gmail.com>.
Hi Colin,

> Sure, we organize buffers by broker currently. However, we could set some
maximum buffer size for records that haven't been assigned to a broker yet.
OK, I think we're probably aligned then.  I think we were using slightly
different terminology (queue vs buffer) but we were actually violently
agreeing.

> In general the Kafka producer is supposed to be used from a client
thread. That thread is responsible for calling poll periodically to get the
results of any send() operations it performed. (It's possible to use the
producer from multiple threads as well.)
>
> The main point I was making is that metadata fetches can and should be
done in the same way as any other network I/O in the producer.
Thanks!  I think I don't quite understand, but I'll do some research myself
and try to understand it better.

Best,
Moses


On Wed, Jun 2, 2021 at 2:21 PM Colin McCabe <cm...@apache.org> wrote:

> On Tue, Jun 1, 2021, at 12:22, Nakamura wrote:
> > I think we're talking past each other a bit.  I know about non-blocking
> > I/O.  The problem I'm facing is how to preserve the existing semantics
> > without blocking.  Right now callers assume their work is enqueued
> in-order
> > after `KafkaProducer#send` returns.  We can't simply return a future that
> > represents the metadata fetch, because of that assumption.  We need to
> > maintain order somehow.  That is what all of the different queues we're
> > proposing are intended to do.
>
> Hi Nakamura,
>
> I guess the point I was making was that there is no connection between
> first-in, first-out semantics and blocking. Nothing about FIFO semantics
> requires blocking.
>
> > > How are the ordering semantics of `KafkaProducer#send` related to the
> > > metadata fetch?
> > KafkaProducer#send currently enqueues after it has the metadata, and it
> > passes the TopicPartition struct as part of the data when enqueueing.  We
> > can either update that data structure to be able to work with partial
> > metadata, or we can add a new queue on top.  I outline both potential
> > approaches in the current KIP.
> >
> > > That is not related to the metadata fetch. Also, I already proposed a
> > > solution (returning an error) if this is a concern.
> > Unfortunately it is, because `KafkaProducer#send` conflates the two of
> > them.  That seems to be the central difficulty of preserving the
> semantics
> > here.
>
> Sure, we organize buffers by broker currently. However, we could set some
> maximum buffer size for records that haven't been assigned to a broker yet.
>
> >
> > > The same client thread that always has been responsible for checking
> poll.
> > Please pretend I've never contributed to Kafka before :). Which thread is
> > that?
>
> In general the Kafka producer is supposed to be used from a client thread.
> That thread is responsible for calling poll periodically to get the results
> of any send() operations it performed. (It's possible to use the producer
> from multiple threads as well.)
>
> The main point I was making is that metadata fetches can and should be
> done in the same way as any other network I/O in the producer.
>
> best,
> Colin
>
> >
> > Best,
> > Moses
> >
> > On Tue, Jun 1, 2021 at 3:12 PM Ryanne Dolan <ry...@gmail.com>
> wrote:
> >
> > > Colin, the issue for me isn't so much whether non-blocking I/O is used
> or
> > > not, but the fact that the caller observes a long time between calling
> > > send() and receiving the returned future. This behavior can be
> considered
> > > "blocking" whether or not I/O is involved.
> > >
> > > > How are the ordering semantics of `KafkaProducer#send` related to the
> > > metadata fetch?
> > > > I already proposed a solution (returning an error)
> > >
> > > There is a subtle difference between failing immediately vs blocking
> for
> > > metadata, related to ordering in the face of retries. Say we set the
> send
> > > timeout to max-long (or something high enough that we rarely encounter
> > > timeouts in practice), and set max inflight requests to 1. Today, we
> can
> > > reasonably assume that calling send() in sequence to a specific
> partition
> > > will result in the corresponding sequence landing on that partition,
> > > regardless of how the caller handles retries. The caller might not
> handle
> > > retries at all. But if we can fail immediately (e.g. when the metadata
> > > isn't yet ready), then the caller must handle retries carefully.
> > > Specifically, the caller must retry each send() before proceeding to
> the
> > > next. This basically means that the caller must block on each send() in
> > > order to maintain the proper sequence -- how else would the caller know
> > > whether it will need to retry or not?
> > >
> > > In other words, failing immediately punts the problem to the caller to
> > > handle, while the caller is less-equipped to deal with it. I don't
> think we
> > > should do that, at least not in the default case.
> > >
> > > I actually don't have any objections to this approach so long as it's
> > > opt-in. It sounds like you are suggesting to fix the bug for everyone,
> but
> > > I don't think we can do that without subtly breaking things.
> > >
> > > Ryanne
> > >
> > > On Tue, Jun 1, 2021 at 12:31 PM Colin McCabe <cm...@apache.org>
> wrote:
> > >
> > > > On Tue, Jun 1, 2021, at 07:00, Nakamura wrote:
> > > > > Hi Colin,
> > > > >
> > > > > Sorry, I still don't follow.
> > > > >
> > > > > Right now `KafkaProducer#send` seems to trigger a metadata fetch.
> > > Today,
> > > > > we block on that before returning.  Is your proposal that we move
> the
> > > > > metadata fetch out of `KafkaProducer#send` entirely?
> > > > >
> > > >
> > > > KafkaProducer#send is supposed to initiate non-blocking I/O, but not
> wait
> > > > for it to complete.
> > > >
> > > > There's more information about non-blocking I/O in Java here:
> > > > https://en.wikipedia.org/wiki/Non-blocking_I/O_%28Java%29
> > > >
> > > > >
> > > > > Even if the metadata fetch moves to be non-blocking, I think we
> still
> > > > need
> > > > > to deal with the problems we've discussed before if the fetch
> happens
> > > in
> > > > > the `KafkaProducer#send` method.  How do we maintain the ordering
> > > > semantics
> > > > > of `KafkaProducer#send`?
> > > >
> > > > How are the ordering semantics of `KafkaProducer#send` related to the
> > > > metadata fetch?
> > > >
> > > > >  How do we prevent our buffer from filling up?
> > > >
> > > > That is not related to the metadata fetch. Also, I already proposed a
> > > > solution (returning an error) if this is a concern.
> > > >
> > > > > Which thread is responsible for checking poll()?
> > > >
> > > > The same client thread that always has been responsible for checking
> > > poll.
> > > >
> > > > >
> > > > > The only approach I can see that would avoid this would be moving
> the
> > > > > metadata fetch to happen at a different time.  But it's not clear
> to me
> > > > > when would be a more appropriate time to do the metadata fetch than
> > > > > `KafkaProducer#send`.
> > > > >
> > > >
> > > > It's not about moving the metadata fetch to happen at a different
> time.
> > > > It's about using non-blocking I/O, like we do for other network I/O.
> (And
> > > > actually, if you want to get really technical, we do this for the
> > > metadata
> > > > fetch too, it's just that we have a hack that loops to transform it
> back
> > > > into blocking I/O.)
> > > >
> > > > best,
> > > > Colin
> > > >
> > > > > I think there's something I'm missing here.  Would you mind
> helping me
> > > > > figure out what it is?
> > > > >
> > > > > Best,
> > > > > Moses
> > > > >
> > > > > On Sun, May 30, 2021 at 5:35 PM Colin McCabe <cm...@apache.org>
> > > wrote:
> > > > >
> > > > > > On Tue, May 25, 2021, at 11:26, Nakamura wrote:
> > > > > > > Hey Colin,
> > > > > > >
> > > > > > > For the metadata case, what would fixing the bug look like?  I
> > > agree
> > > > that
> > > > > > > we should fix it, but I don't have a clear picture in my mind
> of
> > > what
> > > > > > > fixing it should look like.  Can you elaborate?
> > > > > > >
> > > > > >
> > > > > > If the blocking metadata fetch bug were fixed, neither the
> producer
> > > nor
> > > > > > the consumer would block while fetching metadata. A poll() call
> would
> > > > > > initiate a metadata fetch if needed, and a subsequent call to
> poll()
> > > > would
> > > > > > handle the results if needed. Basically the same paradigm we use
> for
> > > > other
> > > > > > network communication in the producer and consumer.
> > > > > >
> > > > > > best,
> > > > > > Colin
> > > > > >
> > > > > > > Best,
> > > > > > > Moses
> > > > > > >
> > > > > > > On Mon, May 24, 2021 at 1:54 PM Colin McCabe <
> cmccabe@apache.org>
> > > > wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > I agree that we should give users the option of having a
> fully
> > > > async
> > > > > > API,
> > > > > > > > but I don't think external thread pools or queues are the
> right
> > > > > > direction
> > > > > > > > to go here. They add performance overheads and don't address
> the
> > > > root
> > > > > > > > causes of the problem.
> > > > > > > >
> > > > > > > > There are basically two scenarios where we block, currently.
> One
> > > is
> > > > > > when
> > > > > > > > we are doing a metadata fetch. I think this is clearly a
> bug, or
> > > at
> > > > > > least
> > > > > > > > an implementation limitation. From the user's point of view,
> the
> > > > fact
> > > > > > that
> > > > > > > > we are doing a metadata fetch is an implementation detail
> that
> > > > really
> > > > > > > > shouldn't be exposed like this. We have talked about fixing
> this
> > > > in the
> > > > > > > > past. I think we just should spend the time to do it.
> > > > > > > >
> > > > > > > > The second scenario is where the client has produced too much
> > > data
> > > > in
> > > > > > too
> > > > > > > > little time. This could happen if there is a network glitch,
> or
> > > the
> > > > > > server
> > > > > > > > is slower than expected. In this case, the behavior is
> > > intentional
> > > > and
> > > > > > not
> > > > > > > > a bug. To understand this, think about what would happen if
> we
> > > > didn't
> > > > > > > > block. We would start buffering more and more data in memory,
> > > until
> > > > > > finally
> > > > > > > > the application died with an out of memory error. That would
> be
> > > > > > frustrating
> > > > > > > > for users and wouldn't add to the usability of Kafka.
> > > > > > > >
> > > > > > > > We could potentially have an option to handle the
> out-of-memory
> > > > > > scenario
> > > > > > > > differently by returning an error code immediately rather
> than
> > > > > > blocking.
> > > > > > > > Applications would have to be rewritten to handle this
> properly,
> > > > but
> > > > > > it is
> > > > > > > > a possibility. I suspect that most of them wouldn't use
> this, but
> > > > we
> > > > > > could
> > > > > > > > offer it as a possibility for async purists (which might
> include
> > > > > > certain
> > > > > > > > frameworks). The big problem the users would have to solve is
> > > what
> > > > to
> > > > > > do
> > > > > > > > with the record that they were unable to produce due to the
> > > buffer
> > > > full
> > > > > > > > issue.
> > > > > > > >
> > > > > > > > best,
> > > > > > > > Colin
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, May 20, 2021, at 10:35, Nakamura wrote:
> > > > > > > > > >
> > > > > > > > > > My suggestion was just do this in multiple steps/phases,
> > > > firstly
> > > > > > let's
> > > > > > > > fix
> > > > > > > > > > the issue of send being misleadingly asynchronous (i.e.
> > > > internally
> > > > > > its
> > > > > > > > > > blocking) and then later one we can make the various
> > > > > > > > > > threadpools configurable with a sane default.
> > > > > > > > >
> > > > > > > > > I like that approach. I updated the "Which thread should be
> > > > > > responsible
> > > > > > > > for
> > > > > > > > > waiting" part of KIP-739 to add your suggestion as my
> > > recommended
> > > > > > > > approach,
> > > > > > > > > thank you!  If no one else has major concerns about that
> > > > approach,
> > > > > > I'll
> > > > > > > > > move the alternatives to "rejected alternatives".
> > > > > > > > >
> > > > > > > > > On Thu, May 20, 2021 at 7:26 AM Matthew de Detrich
> > > > > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > > > > >
> > > > > > > > > > @
> > > > > > > > > >
> > > > > > > > > > Nakamura
> > > > > > > > > > On Wed, May 19, 2021 at 7:35 PM Nakamura <
> nnythm@gmail.com>
> > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > @Ryanne:
> > > > > > > > > > > In my mind's eye I slightly prefer the throwing the
> "cannot
> > > > > > enqueue"
> > > > > > > > > > > exception to satisfying the future immediately with the
> > > > "cannot
> > > > > > > > enqueue"
> > > > > > > > > > > exception?  But I agree, it would be worth doing more
> > > > research.
> > > > > > > > > > >
> > > > > > > > > > > @Matthew:
> > > > > > > > > > >
> > > > > > > > > > > > 3. Using multiple thread pools is definitely
> recommended
> > > > for
> > > > > > > > different
> > > > > > > > > > > > types of tasks, for serialization which is CPU bound
> you
> > > > > > definitely
> > > > > > > > > > would
> > > > > > > > > > > > want to use a bounded thread pool that is fixed by
> the
> > > > number
> > > > > > of
> > > > > > > > CPU's
> > > > > > > > > > > (or
> > > > > > > > > > > > something along those lines).
> > > > > > > > > > > >
> > > > > >
> https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > > > > > > is
> > > > > > > > > > a
> > > > > > > > > > > > very good guide on this topic
> > > > > > > > > > > I think this guide is good in general, but I would be
> > > > hesitant to
> > > > > > > > follow
> > > > > > > > > > > its guidance re: offloading serialization without
> > > > benchmarking
> > > > > > it.
> > > > > > > > My
> > > > > > > > > > > understanding is that context-switches have gotten much
> > > > cheaper,
> > > > > > and
> > > > > > > > that
> > > > > > > > > > > gains from cache locality are small, but they're not
> > > nothing.
> > > > > > > > Especially
> > > > > > > > > > > if the workload has a very small serialization cost, I
> > > > wouldn't
> > > > > > be
> > > > > > > > > > shocked
> > > > > > > > > > > if it made it slower.  I feel pretty strongly that we
> > > should
> > > > do
> > > > > > more
> > > > > > > > > > > research here before unconditionally encouraging
> > > > serialization
> > > > > > in a
> > > > > > > > > > > threadpool.  If people think it's important to do it
> here
> > > > (eg if
> > > > > > we
> > > > > > > > think
> > > > > > > > > > > it would mean another big API change) then we should
> start
> > > > > > thinking
> > > > > > > > about
> > > > > > > > > > > what benchmarking we can do to gain higher confidence
> in
> > > this
> > > > > > kind of
> > > > > > > > > > > change.  However, I don't think it would change
> semantics
> > > as
> > > > > > > > > > substantially
> > > > > > > > > > > as we're proposing here, so I would vote for pushing
> this
> > > to
> > > > a
> > > > > > > > subsequent
> > > > > > > > > > > KIP.
> > > > > > > > > > >
> > > > > > > > > > Of course, its all down to benchmarking, benchmarking and
> > > > > > benchmarking.
> > > > > > > > > > Ideally speaking you want to use all of the resources
> > > > available to
> > > > > > > > you, so
> > > > > > > > > > if you have a bottleneck in serialization and you have
> many
> > > > cores
> > > > > > free
> > > > > > > > then
> > > > > > > > > > using multiple cores may be more appropriate than a
> single
> > > > core.
> > > > > > > > Typically
> > > > > > > > > > I would expect that using a single thread to do
> serialization
> > > > is
> > > > > > > > likely to
> > > > > > > > > > be the most situation, I was just responding to an
> earlier
> > > > point
> > > > > > that
> > > > > > > > was
> > > > > > > > > > made in regards to using ThreadPools for serialization
> (note
> > > > that
> > > > > > you
> > > > > > > > can
> > > > > > > > > > also just use a ThreadPool that is pinned to a single
> thread)
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > 4. Regarding providing the ability for users to
> supply
> > > > their
> > > > > > own
> > > > > > > > custom
> > > > > > > > > > > > ThreadPool this is more of an ergonomics question
> for the
> > > > API.
> > > > > > > > > > Especially
> > > > > > > > > > > > when it gets to monitoring/tracing, giving the
> ability
> > > for
> > > > > > users to
> > > > > > > > > > > provide
> > > > > > > > > > > > their own custom IO/CPU ThreadPools is ideal however
> as
> > > > stated
> > > > > > > > doing so
> > > > > > > > > > > > means a lot of boilerplatery changes to the API.
> > > Typically
> > > > > > > > speaking a
> > > > > > > > > > lot
> > > > > > > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > > > > > > ExecutionContext/ThreadPools
> > > > > > > > > > > > (at least on a more rudimentary level) and hence
> allowing
> > > > > > users to
> > > > > > > > > > supply
> > > > > > > > > > > a
> > > > > > > > > > > > global singleton ThreadPool for IO tasks and another
> for
> > > > CPU
> > > > > > tasks
> > > > > > > > > > makes
> > > > > > > > > > > > their lives a lot easier. However due to the large
> amount
> > > > of
> > > > > > > > changes to
> > > > > > > > > > > the
> > > > > > > > > > > > API, it may be more appropriate to just use internal
> > > thread
> > > > > > pools
> > > > > > > > (for
> > > > > > > > > > > now)
> > > > > > > > > > > > since at least it's not any worse than what exists
> > > > currently
> > > > > > and
> > > > > > > > this
> > > > > > > > > > can
> > > > > > > > > > > > be an improvement that is done later?
> > > > > > > > > > > Is there an existing threadpool that you suggest we
> reuse?
> > > > Or
> > > > > > are
> > > > > > > > you
> > > > > > > > > > > imagining that we make our own internal threadpool, and
> > > then
> > > > > > maybe
> > > > > > > > expose
> > > > > > > > > > > configuration flags to manipulate it?  For what it's
> > > worth, I
> > > > > > like
> > > > > > > > having
> > > > > > > > > > > an internal threadpool (perhaps just FJP.commonpool)
> and
> > > then
> > > > > > > > providing
> > > > > > > > > > an
> > > > > > > > > > > alternative to pass your own threadpool.  That way
> people
> > > who
> > > > > > want
> > > > > > > > finer
> > > > > > > > > > > control can get it, and everyone else can do OK with
> the
> > > > default.
> > > > > > > > > > >
> > > > > > > > > > Indeed that is what I am saying. The most ideal
> situation is
> > > > that
> > > > > > > > there is
> > > > > > > > > > a default internal threadpool that Kafka uses, however
> users
> > > of
> > > > > > Kafka
> > > > > > > > can
> > > > > > > > > > configure there own threadpool. Having a singleton
> ThreadPool
> > > > for
> > > > > > > > blocking
> > > > > > > > > > IO, non blocking IO and CPU bound tasks which can be
> plugged
> > > in
> > > > > > all of
> > > > > > > > your
> > > > > > > > > > libraries (including Kafka) makes resource management
> much
> > > > easier
> > > > > > to
> > > > > > > > do and
> > > > > > > > > > also gives control of users to override specific
> threadpools
> > > > for
> > > > > > > > > > exceptional cases (i.e. providing a Threadpool that is
> pinned
> > > > to a
> > > > > > > > single
> > > > > > > > > > core which tends to give the best latency results if
> this is
> > > > > > something
> > > > > > > > that
> > > > > > > > > > is critical for you).
> > > > > > > > > >
> > > > > > > > > > My suggestion was just do this in multiple steps/phases,
> > > > firstly
> > > > > > let's
> > > > > > > > fix
> > > > > > > > > > the issue of send being misleadingly asynchronous (i.e.
> > > > internally
> > > > > > its
> > > > > > > > > > blocking) and then later one we can make the various
> > > > > > > > > > threadpools configurable with a sane default.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, May 19, 2021 at 6:01 AM Matthew de Detrich
> > > > > > > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Here are my two cents here (note that I am only
> seeing
> > > > this on
> > > > > > a
> > > > > > > > > > surface
> > > > > > > > > > > > level)
> > > > > > > > > > > >
> > > > > > > > > > > > 1. If we are going this road it makes sense to do
> this
> > > > > > "properly"
> > > > > > > > (i.e.
> > > > > > > > > > > > using queues  as Ryan suggested). The reason I am
> saying
> > > > this
> > > > > > is
> > > > > > > > that
> > > > > > > > > > it
> > > > > > > > > > > > seems that the original goal of the KIP is for it to
> be
> > > > used in
> > > > > > > > other
> > > > > > > > > > > > asynchronous systems and from my personal
> experience, you
> > > > > > really do
> > > > > > > > > > need
> > > > > > > > > > > to
> > > > > > > > > > > > make the implementation properly asynchronous
> otherwise
> > > > it's
> > > > > > > > really not
> > > > > > > > > > > > that useful.
> > > > > > > > > > > > 2. Due to the previous point and what was said by
> others,
> > > > this
> > > > > > is
> > > > > > > > > > likely
> > > > > > > > > > > > going to break some existing semantics (i.e. people
> are
> > > > > > currently
> > > > > > > > > > relying
> > > > > > > > > > > > on blocking semantics) so adding another
> > > method's/interface
> > > > > > plus
> > > > > > > > > > > > deprecating the older one is more annoying but ideal.
> > > > > > > > > > > > 3. Using multiple thread pools is definitely
> recommended
> > > > for
> > > > > > > > different
> > > > > > > > > > > > types of tasks, for serialization which is CPU bound
> you
> > > > > > definitely
> > > > > > > > > > would
> > > > > > > > > > > > want to use a bounded thread pool that is fixed by
> the
> > > > number
> > > > > > of
> > > > > > > > CPU's
> > > > > > > > > > > (or
> > > > > > > > > > > > something along those lines).
> > > > > > > > > > > >
> > > > > >
> https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > > > > > > is
> > > > > > > > > > a
> > > > > > > > > > > > very good guide on this topic
> > > > > > > > > > > > 4. Regarding providing the ability for users to
> supply
> > > > their
> > > > > > own
> > > > > > > > custom
> > > > > > > > > > > > ThreadPool this is more of an ergonomics question
> for the
> > > > API.
> > > > > > > > > > Especially
> > > > > > > > > > > > when it gets to monitoring/tracing, giving the
> ability
> > > for
> > > > > > users to
> > > > > > > > > > > provide
> > > > > > > > > > > > their own custom IO/CPU ThreadPools is ideal however
> as
> > > > stated
> > > > > > > > doing so
> > > > > > > > > > > > means a lot of boilerplatery changes to the API.
> > > Typically
> > > > > > > > speaking a
> > > > > > > > > > lot
> > > > > > > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > > > > > > ExecutionContext/ThreadPools
> > > > > > > > > > > > (at least on a more rudimentary level) and hence
> allowing
> > > > > > users to
> > > > > > > > > > > supply a
> > > > > > > > > > > > global singleton ThreadPool for IO tasks and another
> for
> > > > CPU
> > > > > > tasks
> > > > > > > > > > makes
> > > > > > > > > > > > their lives a lot easier. However due to the large
> amount
> > > > of
> > > > > > > > changes to
> > > > > > > > > > > the
> > > > > > > > > > > > API, it may be more appropriate to just use internal
> > > thread
> > > > > > pools
> > > > > > > > (for
> > > > > > > > > > > now)
> > > > > > > > > > > > since at least it's not any worse than what exists
> > > > currently
> > > > > > and
> > > > > > > > this
> > > > > > > > > > can
> > > > > > > > > > > > be an improvement that is done later?
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, May 19, 2021 at 2:56 AM Ryanne Dolan <
> > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > I was thinking the sender would typically wrap
> send()
> > > in
> > > > a
> > > > > > > > > > > backoff/retry
> > > > > > > > > > > > > loop, or else ignore any failures and drop sends
> on the
> > > > floor
> > > > > > > > > > > > > (fire-and-forget), and in both cases I think
> failing
> > > > > > immediately
> > > > > > > > is
> > > > > > > > > > > > better
> > > > > > > > > > > > > than blocking for a new spot in the queue or
> > > > asynchronously
> > > > > > > > failing
> > > > > > > > > > > > > somehow.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think a failed future is adequate for the
> "explicit
> > > > > > > > backpressure
> > > > > > > > > > > > signal"
> > > > > > > > > > > > > while avoiding any blocking anywhere. I think if
> we try
> > > > to
> > > > > > > > > > > asynchronously
> > > > > > > > > > > > > signal the caller of failure (either by
> asynchronously
> > > > > > failing
> > > > > > > > the
> > > > > > > > > > > future
> > > > > > > > > > > > > or invoking a callback off-thread or something)
> we'd
> > > > force
> > > > > > the
> > > > > > > > caller
> > > > > > > > > > > to
> > > > > > > > > > > > > either block or poll waiting for that signal, which
> > > > somewhat
> > > > > > > > defeats
> > > > > > > > > > > the
> > > > > > > > > > > > > purpose we're after. And of course blocking for a
> spot
> > > > in the
> > > > > > > > queue
> > > > > > > > > > > > > definitely defeats the purpose (tho perhaps
> ameliorates
> > > > the
> > > > > > > > problem
> > > > > > > > > > > > some).
> > > > > > > > > > > > >
> > > > > > > > > > > > > Throwing an exception to the caller directly (not
> via
> > > the
> > > > > > > > future) is
> > > > > > > > > > > > > another option with precedent in Kafka clients,
> tho it
> > > > > > doesn't
> > > > > > > > seem
> > > > > > > > > > as
> > > > > > > > > > > > > ergonomic to me.
> > > > > > > > > > > > >
> > > > > > > > > > > > > It would be interesting to analyze some existing
> usage
> > > > and
> > > > > > > > determine
> > > > > > > > > > > how
> > > > > > > > > > > > > difficult it would be to convert it to the various
> > > > proposed
> > > > > > APIs.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, May 18, 2021, 3:27 PM Nakamura <
> > > nnythm@gmail.com
> > > > >
> > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Ryanne,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hmm, that's an interesting idea.  Basically it
> would
> > > > mean
> > > > > > that
> > > > > > > > > > after
> > > > > > > > > > > > > > calling send, you would also have to check
> whether
> > > the
> > > > > > returned
> > > > > > > > > > > future
> > > > > > > > > > > > > had
> > > > > > > > > > > > > > failed with a specific exception.  I would be
> open to
> > > > it,
> > > > > > > > although
> > > > > > > > > > I
> > > > > > > > > > > > > think
> > > > > > > > > > > > > > it might be slightly more surprising, since
> right now
> > > > the
> > > > > > > > paradigm
> > > > > > > > > > is
> > > > > > > > > > > > > > "enqueue synchronously, the future represents
> whether
> > > > we
> > > > > > > > succeeded
> > > > > > > > > > in
> > > > > > > > > > > > > > sending or not" and the new one would be "enqueue
> > > > > > > > synchronously,
> > > > > > > > > > the
> > > > > > > > > > > > > future
> > > > > > > > > > > > > > either represents whether we succeeded in
> enqueueing
> > > or
> > > > > > not (in
> > > > > > > > > > which
> > > > > > > > > > > > > case
> > > > > > > > > > > > > > it will be failed immediately if it failed to
> > > enqueue)
> > > > or
> > > > > > > > whether
> > > > > > > > > > we
> > > > > > > > > > > > > > succeeded in sending or not".
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > But you're right, it should be on the table,
> thank
> > > you
> > > > for
> > > > > > > > > > suggesting
> > > > > > > > > > > > it!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, May 18, 2021 at 12:23 PM Ryanne Dolan <
> > > > > > > > > > ryannedolan@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Moses, in the case of a full queue, could we
> just
> > > > return
> > > > > > a
> > > > > > > > failed
> > > > > > > > > > > > > future
> > > > > > > > > > > > > > > immediately?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Tue, May 18, 2021, 10:39 AM Nakamura <
> > > > > > nnythm@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Alexandre,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for bringing this up, I think I could
> use
> > > > some
> > > > > > > > feedback
> > > > > > > > > > in
> > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > area.  There are two mechanisms here, one for
> > > > slowing
> > > > > > down
> > > > > > > > when
> > > > > > > > > > > we
> > > > > > > > > > > > > > don't
> > > > > > > > > > > > > > > > have the relevant metadata, and the other for
> > > > slowing
> > > > > > down
> > > > > > > > > > when a
> > > > > > > > > > > > > queue
> > > > > > > > > > > > > > > has
> > > > > > > > > > > > > > > > filled up.  Although the first one applies
> > > > backpressure
> > > > > > > > > > somewhat
> > > > > > > > > > > > > > > > inadvertently, we could still get in trouble
> if
> > > > we're
> > > > > > not
> > > > > > > > > > > providing
> > > > > > > > > > > > > > > > information to the mechanism that monitors
> > > whether
> > > > > > we're
> > > > > > > > > > queueing
> > > > > > > > > > > > too
> > > > > > > > > > > > > > > > much.  As for the second one, that is a
> classic
> > > > > > > > backpressure
> > > > > > > > > > use
> > > > > > > > > > > > > case,
> > > > > > > > > > > > > > so
> > > > > > > > > > > > > > > > it's definitely important that we don't drop
> that
> > > > > > ability.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Right now backpressure is applied by
> blocking,
> > > > which
> > > > > > is a
> > > > > > > > > > natural
> > > > > > > > > > > > way
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > apply backpressure in synchronous systems,
> but
> > > can
> > > > > > lead to
> > > > > > > > > > > > > unnecessary
> > > > > > > > > > > > > > > > slowdowns in asynchronous systems.  In my
> > > opinion,
> > > > the
> > > > > > > > safest
> > > > > > > > > > way
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > apply
> > > > > > > > > > > > > > > > backpressure in an asynchronous model is to
> have
> > > an
> > > > > > > > explicit
> > > > > > > > > > > > > > backpressure
> > > > > > > > > > > > > > > > signal.  A good example would be returning an
> > > > > > exception,
> > > > > > > > and
> > > > > > > > > > > > > providing
> > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > optional hook to add a callback onto so that
> you
> > > > can be
> > > > > > > > > > notified
> > > > > > > > > > > > when
> > > > > > > > > > > > > > > it's
> > > > > > > > > > > > > > > > ready to accept more messages.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > However, this would be a really big change
> to how
> > > > > > users use
> > > > > > > > > > > > > > > > KafkaProducer#send, so I don't know how much
> > > > appetite
> > > > > > we
> > > > > > > > have
> > > > > > > > > > for
> > > > > > > > > > > > > > making
> > > > > > > > > > > > > > > > that kind of change.  Maybe it would be
> simpler
> > > to
> > > > > > remove
> > > > > > > > the
> > > > > > > > > > > > "don't
> > > > > > > > > > > > > > > block
> > > > > > > > > > > > > > > > when the per-topic queue is full" from the
> scope
> > > of
> > > > > > this
> > > > > > > > KIP,
> > > > > > > > > > and
> > > > > > > > > > > > > only
> > > > > > > > > > > > > > > > focus on when metadata is available?  The
> > > downside
> > > > is
> > > > > > that
> > > > > > > > we
> > > > > > > > > > > will
> > > > > > > > > > > > > > > probably
> > > > > > > > > > > > > > > > want to change the API again later to fix
> this,
> > > so
> > > > it
> > > > > > > > might be
> > > > > > > > > > > > better
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > just rip the bandaid off now.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > One slightly nasty thing here is that because
> > > > queueing
> > > > > > > > order is
> > > > > > > > > > > > > > > important,
> > > > > > > > > > > > > > > > if we want to use exceptions, we will want
> to be
> > > > able
> > > > > > to
> > > > > > > > signal
> > > > > > > > > > > the
> > > > > > > > > > > > > > > failure
> > > > > > > > > > > > > > > > to enqueue to the caller in such a way that
> they
> > > > can
> > > > > > still
> > > > > > > > > > > enforce
> > > > > > > > > > > > > > > message
> > > > > > > > > > > > > > > > order if they want.  So we can't embed the
> > > failure
> > > > > > > > directly in
> > > > > > > > > > > the
> > > > > > > > > > > > > > > returned
> > > > > > > > > > > > > > > > future, we should either return two futures
> > > > (nested,
> > > > > > or as
> > > > > > > > a
> > > > > > > > > > > tuple)
> > > > > > > > > > > > > or
> > > > > > > > > > > > > > > else
> > > > > > > > > > > > > > > > throw an exception to explain a backpressure.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > So there are a few things we should work out
> > > here:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 1. Should we keep the "too many bytes
> enqueued"
> > > > part of
> > > > > > > > this in
> > > > > > > > > > > > > scope?
> > > > > > > > > > > > > > > (I
> > > > > > > > > > > > > > > > would say yes, so that we can minimize churn
> in
> > > > this
> > > > > > API)
> > > > > > > > > > > > > > > > 2. How should we signal backpressure so that
> it's
> > > > > > > > appropriate
> > > > > > > > > > for
> > > > > > > > > > > > > > > > asynchronous systems?  (I would say that we
> > > should
> > > > > > throw an
> > > > > > > > > > > > > exception.
> > > > > > > > > > > > > > > If
> > > > > > > > > > > > > > > > we choose this and we want to pursue the
> queueing
> > > > > > path, we
> > > > > > > > > > would
> > > > > > > > > > > > > *not*
> > > > > > > > > > > > > > > want
> > > > > > > > > > > > > > > > to enqueue messages that would push us over
> the
> > > > limit,
> > > > > > and
> > > > > > > > > > would
> > > > > > > > > > > > only
> > > > > > > > > > > > > > > want
> > > > > > > > > > > > > > > > to enqueue messages when we're waiting for
> > > > metadata,
> > > > > > and we
> > > > > > > > > > would
> > > > > > > > > > > > > want
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > keep track of the total number of bytes for
> those
> > > > > > > > messages).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Sun, May 16, 2021 at 6:16 AM Alexandre
> > > Dupriez <
> > > > > > > > > > > > > > > > alexandre.dupriez@gmail.com> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hello Nakamura,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks for proposing this change. I can
> see how
> > > > the
> > > > > > > > blocking
> > > > > > > > > > > > > > behaviour
> > > > > > > > > > > > > > > > > can be a problem when integrating with
> reactive
> > > > > > > > frameworks
> > > > > > > > > > such
> > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > Akka. One of the questions I would have is
> how
> > > > you
> > > > > > would
> > > > > > > > > > handle
> > > > > > > > > > > > > back
> > > > > > > > > > > > > > > > > pressure and avoid memory exhaustion when
> the
> > > > > > producer's
> > > > > > > > > > buffer
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > full and tasks would start to accumulate
> in the
> > > > > > > > out-of-band
> > > > > > > > > > > queue
> > > > > > > > > > > > > or
> > > > > > > > > > > > > > > > > thread pool introduced with this KIP.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > Alexandre
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Le ven. 14 mai 2021 à 15:55, Ryanne Dolan <
> > > > > > > > > > > ryannedolan@gmail.com
> > > > > > > > > > > > >
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > écrit
> > > > > > > > > > > > > > > > > :
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Makes sense!
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Fri, May 14, 2021, 9:39 AM Nakamura <
> > > > > > > > nnythm@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I see what you're saying about serde
> > > > blocking,
> > > > > > but I
> > > > > > > > > > think
> > > > > > > > > > > we
> > > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > > > consider it out of scope for this
> patch.
> > > > Right
> > > > > > now
> > > > > > > > we've
> > > > > > > > > > > > > nailed
> > > > > > > > > > > > > > > > down a
> > > > > > > > > > > > > > > > > > > couple of use cases where we can
> > > > unambiguously
> > > > > > say,
> > > > > > > > "I
> > > > > > > > > > can
> > > > > > > > > > > > make
> > > > > > > > > > > > > > > > > progress
> > > > > > > > > > > > > > > > > > > now" or "I cannot make progress now",
> which
> > > > > > makes it
> > > > > > > > > > > possible
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > offload to
> > > > > > > > > > > > > > > > > > > a different thread only if we are
> unable to
> > > > make
> > > > > > > > > > progress.
> > > > > > > > > > > > > > > Extending
> > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > to CPU work like serde would mean
> always
> > > > > > offloading,
> > > > > > > > > > which
> > > > > > > > > > > > > would
> > > > > > > > > > > > > > > be a
> > > > > > > > > > > > > > > > > > > really big performance change.  It
> might be
> > > > worth
> > > > > > > > > > exploring
> > > > > > > > > > > > > > anyway,
> > > > > > > > > > > > > > > > > but I'd
> > > > > > > > > > > > > > > > > > > rather keep this patch focused on
> improving
> > > > > > > > ergonomics,
> > > > > > > > > > > > rather
> > > > > > > > > > > > > > than
> > > > > > > > > > > > > > > > > > > muddying the waters with evaluating
> > > > performance
> > > > > > very
> > > > > > > > > > > deeply.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I think if we really do want to support
> > > > serde or
> > > > > > > > > > > interceptors
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > IO on
> > > > > > > > > > > > > > > > > > > the send path (which seems like an
> > > > anti-pattern
> > > > > > to
> > > > > > > > me),
> > > > > > > > > > we
> > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > consider
> > > > > > > > > > > > > > > > > > > making that a separate SIP, and
> probably
> > > also
> > > > > > > > consider
> > > > > > > > > > > > changing
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > API to
> > > > > > > > > > > > > > > > > > > use Futures (or CompletionStages).
> But I
> > > > would
> > > > > > > > rather
> > > > > > > > > > > avoid
> > > > > > > > > > > > > > scope
> > > > > > > > > > > > > > > > > creep,
> > > > > > > > > > > > > > > > > > > so that we have a better chance of
> fixing
> > > > this
> > > > > > part
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > > > > > problem.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Yes, I think some exceptions will move
> to
> > > > being
> > > > > > async
> > > > > > > > > > > instead
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > sync.
> > > > > > > > > > > > > > > > > > > They'll still be surfaced in the
> Future, so
> > > > I'm
> > > > > > not
> > > > > > > > so
> > > > > > > > > > > > > confident
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > would be that big a shock to users
> though.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Thu, May 13, 2021 at 7:44 PM Ryanne
> > > Dolan
> > > > <
> > > > > > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > re serialization, my concern is that
> > > > > > serialization
> > > > > > > > > > often
> > > > > > > > > > > > > > accounts
> > > > > > > > > > > > > > > > > for a
> > > > > > > > > > > > > > > > > > > lot
> > > > > > > > > > > > > > > > > > > > of the cycles spent before returning
> the
> > > > > > future.
> > > > > > > > It's
> > > > > > > > > > not
> > > > > > > > > > > > > > > blocking
> > > > > > > > > > > > > > > > > per
> > > > > > > > > > > > > > > > > > > se,
> > > > > > > > > > > > > > > > > > > > but it's the same effect from the
> > > caller's
> > > > > > > > perspective.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Moreover, serde impls often block
> > > > themselves,
> > > > > > e.g.
> > > > > > > > when
> > > > > > > > > > > > > > fetching
> > > > > > > > > > > > > > > > > schemas
> > > > > > > > > > > > > > > > > > > > from a registry. I suppose it's also
> > > > possible
> > > > > > to
> > > > > > > > block
> > > > > > > > > > in
> > > > > > > > > > > > > > > > > Interceptors
> > > > > > > > > > > > > > > > > > > > (e.g. writing audit events or
> metrics),
> > > > which
> > > > > > > > happens
> > > > > > > > > > > > before
> > > > > > > > > > > > > > > serdes
> > > > > > > > > > > > > > > > > iiuc.
> > > > > > > > > > > > > > > > > > > > So any blocking in either of those
> > > plugins
> > > > > > would
> > > > > > > > block
> > > > > > > > > > > the
> > > > > > > > > > > > > send
> > > > > > > > > > > > > > > > > unless we
> > > > > > > > > > > > > > > > > > > > queue first.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > So I think we want to queue first
> and do
> > > > > > everything
> > > > > > > > > > > > > off-thread
> > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > using
> > > > > > > > > > > > > > > > > > > > the new API, whatever that looks
> like. I
> > > > just
> > > > > > want
> > > > > > > > to
> > > > > > > > > > > make
> > > > > > > > > > > > > sure
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > don't
> > > > > > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > > > > that for clients that wouldn't
> expect it.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Another consideration is exception
> > > > handling.
> > > > > > If we
> > > > > > > > > > queue
> > > > > > > > > > > > > right
> > > > > > > > > > > > > > > > away,
> > > > > > > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > > > > > > defer some exceptions that currently
> are
> > > > > > thrown to
> > > > > > > > the
> > > > > > > > > > > > caller
> > > > > > > > > > > > > > > > > (before the
> > > > > > > > > > > > > > > > > > > > future is returned). In the new API,
> the
> > > > send()
> > > > > > > > > > wouldn't
> > > > > > > > > > > > > throw
> > > > > > > > > > > > > > > any
> > > > > > > > > > > > > > > > > > > > exceptions, and instead the future
> would
> > > > fail.
> > > > > > I
> > > > > > > > think
> > > > > > > > > > > that
> > > > > > > > > > > > > > might
> > > > > > > > > > > > > > > > > mean
> > > > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > a new method signature is required.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Thu, May 13, 2021, 2:57 PM
> Nakamura <
> > > > > > > > > > > > > > nakamura.moses@gmail.com
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I agree we should add an additional
> > > > > > constructor
> > > > > > > > (or
> > > > > > > > > > > else
> > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > additional
> > > > > > > > > > > > > > > > > > > > > overload in KafkaProducer#send,
> but the
> > > > new
> > > > > > > > > > constructor
> > > > > > > > > > > > > would
> > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > easier
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > understand) if we're targeting the
> > > "user
> > > > > > > > provides the
> > > > > > > > > > > > > thread"
> > > > > > > > > > > > > > > > > approach.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > From looking at the code, I think
> we
> > > can
> > > > keep
> > > > > > > > record
> > > > > > > > > > > > > > > > serialization
> > > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > user thread, if we consider that an
> > > > important
> > > > > > > > part of
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > semantics of
> > > > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > > > method.  It doesn't seem like
> > > > serialization
> > > > > > > > depends
> > > > > > > > > > on
> > > > > > > > > > > > > > knowing
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > cluster,
> > > > > > > > > > > > > > > > > > > > > I think it's incidental that it
> comes
> > > > after
> > > > > > the
> > > > > > > > first
> > > > > > > > > > > > > > > "blocking"
> > > > > > > > > > > > > > > > > > > segment
> > > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > > the method.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Thu, May 13, 2021 at 2:38 PM
> Ryanne
> > > > Dolan
> > > > > > <
> > > > > > > > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Hey Moses, I like the direction
> here.
> > > > My
> > > > > > > > thinking
> > > > > > > > > > is
> > > > > > > > > > > > > that a
> > > > > > > > > > > > > > > > > single
> > > > > > > > > > > > > > > > > > > > > > additional work queue, s.t.
> send()
> > > can
> > > > > > enqueue
> > > > > > > > and
> > > > > > > > > > > > > return,
> > > > > > > > > > > > > > > > seems
> > > > > > > > > > > > > > > > > like
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > lightest touch. However, I don't
> > > think
> > > > we
> > > > > > can
> > > > > > > > > > > trivially
> > > > > > > > > > > > > > > process
> > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > > queue
> > > > > > > > > > > > > > > > > > > > > > in an internal thread pool
> without
> > > > subtly
> > > > > > > > changing
> > > > > > > > > > > > > behavior
> > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > > > > > > > > For example, users will often run
> > > > send() in
> > > > > > > > > > multiple
> > > > > > > > > > > > > > threads
> > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > order
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > serialize faster, but that
> wouldn't
> > > > work
> > > > > > quite
> > > > > > > > the
> > > > > > > > > > > same
> > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > there
> > > > > > > > > > > > > > > > > were
> > > > > > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > > > > internal thread pool.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > For this reason I'm thinking we
> need
> > > to
> > > > > > make
> > > > > > > > sure
> > > > > > > > > > any
> > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > changes
> > > > > > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > > > > > opt-in. Maybe a new constructor
> with
> > > an
> > > > > > > > additional
> > > > > > > > > > > > > > > > ThreadFactory
> > > > > > > > > > > > > > > > > > > > > parameter.
> > > > > > > > > > > > > > > > > > > > > > That would at least clearly
> indicate
> > > > that
> > > > > > work
> > > > > > > > will
> > > > > > > > > > > > > happen
> > > > > > > > > > > > > > > > > > > off-thread,
> > > > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > > would require opt-in for the new
> > > > behavior.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Under the hood, this
> ThreadFactory
> > > > could be
> > > > > > > > used to
> > > > > > > > > > > > > create
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > worker
> > > > > > > > > > > > > > > > > > > > > > thread that process queued sends,
> > > which
> > > > > > could
> > > > > > > > > > fan-out
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > per-partition
> > > > > > > > > > > > > > > > > > > > > > threads from there.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > So then you'd have two ways to
> send:
> > > > the
> > > > > > > > existing
> > > > > > > > > > > way,
> > > > > > > > > > > > > > where
> > > > > > > > > > > > > > > > > serde
> > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > > interceptors and whatnot are
> executed
> > > > on
> > > > > > the
> > > > > > > > > > calling
> > > > > > > > > > > > > > thread,
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > > > > > > way, which returns right away and
> > > uses
> > > > an
> > > > > > > > internal
> > > > > > > > > > > > > > Executor.
> > > > > > > > > > > > > > > As
> > > > > > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > > > > > point
> > > > > > > > > > > > > > > > > > > > > > out, the semantics would be
> identical
> > > > in
> > > > > > either
> > > > > > > > > > case,
> > > > > > > > > > > > and
> > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > would be
> > > > > > > > > > > > > > > > > > > > > very
> > > > > > > > > > > > > > > > > > > > > > easy for clients to switch.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Thu, May 13, 2021, 9:00 AM
> > > Nakamura
> > > > <
> > > > > > > > > > > > nnythm@gmail.com
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Hey Folks,
> > > > > > > > > > > > > > > > > > > > > > > I just posted a new proposal
> > > > > > > > > > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181306446
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > in the wiki.  I think we have
> an
> > > > > > opportunity
> > > > > > > > to
> > > > > > > > > > > > improve
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > > KafkaProducer#send user
> experience.
> > > > It
> > > > > > would
> > > > > > > > > > > > certainly
> > > > > > > > > > > > > > > make
> > > > > > > > > > > > > > > > > our
> > > > > > > > > > > > > > > > > > > > lives
> > > > > > > > > > > > > > > > > > > > > > > easier.  Please take a look!
> There
> > > > are
> > > > > > two
> > > > > > > > > > > > subproblems
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > > > > > guidance on, so I would
> appreciate
> > > > > > feedback
> > > > > > > > on
> > > > > > > > > > both
> > > > > > > > > > > > of
> > > > > > > > > > > > > > > them.
> > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > >
> > > > > > > > > > > > Matthew de Detrich
> > > > > > > > > > > >
> > > > > > > > > > > > *Aiven Deutschland GmbH*
> > > > > > > > > > > >
> > > > > > > > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > > > > > > > >
> > > > > > > > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > > > > > > > >
> > > > > > > > > > > > *m:* +491603708037
> > > > > > > > > > > >
> > > > > > > > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > >
> > > > > > > > > > Matthew de Detrich
> > > > > > > > > >
> > > > > > > > > > *Aiven Deutschland GmbH*
> > > > > > > > > >
> > > > > > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > > > > > >
> > > > > > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > > > > > >
> > > > > > > > > > *m:* +491603708037
> > > > > > > > > >
> > > > > > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-739: Block Less on KafkaProducer#send

Posted by Colin McCabe <cm...@apache.org>.
On Tue, Jun 1, 2021, at 12:22, Nakamura wrote:
> I think we're talking past each other a bit.  I know about non-blocking
> I/O.  The problem I'm facing is how to preserve the existing semantics
> without blocking.  Right now callers assume their work is enqueued in-order
> after `KafkaProducer#send` returns.  We can't simply return a future that
> represents the metadata fetch, because of that assumption.  We need to
> maintain order somehow.  That is what all of the different queues we're
> proposing are intended to do.

Hi Nakamura,

I guess the point I was making was that there is no connection between first-in, first-out semantics and blocking. Nothing about FIFO semantics requires blocking.

> > How are the ordering semantics of `KafkaProducer#send` related to the
> > metadata fetch?
> KafkaProducer#send currently enqueues after it has the metadata, and it
> passes the TopicPartition struct as part of the data when enqueueing.  We
> can either update that data structure to be able to work with partial
> metadata, or we can add a new queue on top.  I outline both potential
> approaches in the current KIP.
> 
> > That is not related to the metadata fetch. Also, I already proposed a
> > solution (returning an error) if this is a concern.
> Unfortunately it is, because `KafkaProducer#send` conflates the two of
> them.  That seems to be the central difficulty of preserving the semantics
> here.

Sure, we organize buffers by broker currently. However, we could set some maximum buffer size for records that haven't been assigned to a broker yet.

> 
> > The same client thread that always has been responsible for checking poll.
> Please pretend I've never contributed to Kafka before :). Which thread is
> that?

In general the Kafka producer is supposed to be used from a client thread. That thread is responsible for calling poll periodically to get the results of any send() operations it performed. (It's possible to use the producer from multiple threads as well.)

The main point I was making is that metadata fetches can and should be done in the same way as any other network I/O in the producer.

best,
Colin

> 
> Best,
> Moses
> 
> On Tue, Jun 1, 2021 at 3:12 PM Ryanne Dolan <ry...@gmail.com> wrote:
> 
> > Colin, the issue for me isn't so much whether non-blocking I/O is used or
> > not, but the fact that the caller observes a long time between calling
> > send() and receiving the returned future. This behavior can be considered
> > "blocking" whether or not I/O is involved.
> >
> > > How are the ordering semantics of `KafkaProducer#send` related to the
> > metadata fetch?
> > > I already proposed a solution (returning an error)
> >
> > There is a subtle difference between failing immediately vs blocking for
> > metadata, related to ordering in the face of retries. Say we set the send
> > timeout to max-long (or something high enough that we rarely encounter
> > timeouts in practice), and set max inflight requests to 1. Today, we can
> > reasonably assume that calling send() in sequence to a specific partition
> > will result in the corresponding sequence landing on that partition,
> > regardless of how the caller handles retries. The caller might not handle
> > retries at all. But if we can fail immediately (e.g. when the metadata
> > isn't yet ready), then the caller must handle retries carefully.
> > Specifically, the caller must retry each send() before proceeding to the
> > next. This basically means that the caller must block on each send() in
> > order to maintain the proper sequence -- how else would the caller know
> > whether it will need to retry or not?
> >
> > In other words, failing immediately punts the problem to the caller to
> > handle, while the caller is less-equipped to deal with it. I don't think we
> > should do that, at least not in the default case.
> >
> > I actually don't have any objections to this approach so long as it's
> > opt-in. It sounds like you are suggesting to fix the bug for everyone, but
> > I don't think we can do that without subtly breaking things.
> >
> > Ryanne
> >
> > On Tue, Jun 1, 2021 at 12:31 PM Colin McCabe <cm...@apache.org> wrote:
> >
> > > On Tue, Jun 1, 2021, at 07:00, Nakamura wrote:
> > > > Hi Colin,
> > > >
> > > > Sorry, I still don't follow.
> > > >
> > > > Right now `KafkaProducer#send` seems to trigger a metadata fetch.
> > Today,
> > > > we block on that before returning.  Is your proposal that we move the
> > > > metadata fetch out of `KafkaProducer#send` entirely?
> > > >
> > >
> > > KafkaProducer#send is supposed to initiate non-blocking I/O, but not wait
> > > for it to complete.
> > >
> > > There's more information about non-blocking I/O in Java here:
> > > https://en.wikipedia.org/wiki/Non-blocking_I/O_%28Java%29
> > >
> > > >
> > > > Even if the metadata fetch moves to be non-blocking, I think we still
> > > need
> > > > to deal with the problems we've discussed before if the fetch happens
> > in
> > > > the `KafkaProducer#send` method.  How do we maintain the ordering
> > > semantics
> > > > of `KafkaProducer#send`?
> > >
> > > How are the ordering semantics of `KafkaProducer#send` related to the
> > > metadata fetch?
> > >
> > > >  How do we prevent our buffer from filling up?
> > >
> > > That is not related to the metadata fetch. Also, I already proposed a
> > > solution (returning an error) if this is a concern.
> > >
> > > > Which thread is responsible for checking poll()?
> > >
> > > The same client thread that always has been responsible for checking
> > poll.
> > >
> > > >
> > > > The only approach I can see that would avoid this would be moving the
> > > > metadata fetch to happen at a different time.  But it's not clear to me
> > > > when would be a more appropriate time to do the metadata fetch than
> > > > `KafkaProducer#send`.
> > > >
> > >
> > > It's not about moving the metadata fetch to happen at a different time.
> > > It's about using non-blocking I/O, like we do for other network I/O. (And
> > > actually, if you want to get really technical, we do this for the
> > metadata
> > > fetch too, it's just that we have a hack that loops to transform it back
> > > into blocking I/O.)
> > >
> > > best,
> > > Colin
> > >
> > > > I think there's something I'm missing here.  Would you mind helping me
> > > > figure out what it is?
> > > >
> > > > Best,
> > > > Moses
> > > >
> > > > On Sun, May 30, 2021 at 5:35 PM Colin McCabe <cm...@apache.org>
> > wrote:
> > > >
> > > > > On Tue, May 25, 2021, at 11:26, Nakamura wrote:
> > > > > > Hey Colin,
> > > > > >
> > > > > > For the metadata case, what would fixing the bug look like?  I
> > agree
> > > that
> > > > > > we should fix it, but I don't have a clear picture in my mind of
> > what
> > > > > > fixing it should look like.  Can you elaborate?
> > > > > >
> > > > >
> > > > > If the blocking metadata fetch bug were fixed, neither the producer
> > nor
> > > > > the consumer would block while fetching metadata. A poll() call would
> > > > > initiate a metadata fetch if needed, and a subsequent call to poll()
> > > would
> > > > > handle the results if needed. Basically the same paradigm we use for
> > > other
> > > > > network communication in the producer and consumer.
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > > > Best,
> > > > > > Moses
> > > > > >
> > > > > > On Mon, May 24, 2021 at 1:54 PM Colin McCabe <cm...@apache.org>
> > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I agree that we should give users the option of having a fully
> > > async
> > > > > API,
> > > > > > > but I don't think external thread pools or queues are the right
> > > > > direction
> > > > > > > to go here. They add performance overheads and don't address the
> > > root
> > > > > > > causes of the problem.
> > > > > > >
> > > > > > > There are basically two scenarios where we block, currently. One
> > is
> > > > > when
> > > > > > > we are doing a metadata fetch. I think this is clearly a bug, or
> > at
> > > > > least
> > > > > > > an implementation limitation. From the user's point of view, the
> > > fact
> > > > > that
> > > > > > > we are doing a metadata fetch is an implementation detail that
> > > really
> > > > > > > shouldn't be exposed like this. We have talked about fixing this
> > > in the
> > > > > > > past. I think we just should spend the time to do it.
> > > > > > >
> > > > > > > The second scenario is where the client has produced too much
> > data
> > > in
> > > > > too
> > > > > > > little time. This could happen if there is a network glitch, or
> > the
> > > > > server
> > > > > > > is slower than expected. In this case, the behavior is
> > intentional
> > > and
> > > > > not
> > > > > > > a bug. To understand this, think about what would happen if we
> > > didn't
> > > > > > > block. We would start buffering more and more data in memory,
> > until
> > > > > finally
> > > > > > > the application died with an out of memory error. That would be
> > > > > frustrating
> > > > > > > for users and wouldn't add to the usability of Kafka.
> > > > > > >
> > > > > > > We could potentially have an option to handle the out-of-memory
> > > > > scenario
> > > > > > > differently by returning an error code immediately rather than
> > > > > blocking.
> > > > > > > Applications would have to be rewritten to handle this properly,
> > > but
> > > > > it is
> > > > > > > a possibility. I suspect that most of them wouldn't use this, but
> > > we
> > > > > could
> > > > > > > offer it as a possibility for async purists (which might include
> > > > > certain
> > > > > > > frameworks). The big problem the users would have to solve is
> > what
> > > to
> > > > > do
> > > > > > > with the record that they were unable to produce due to the
> > buffer
> > > full
> > > > > > > issue.
> > > > > > >
> > > > > > > best,
> > > > > > > Colin
> > > > > > >
> > > > > > >
> > > > > > > On Thu, May 20, 2021, at 10:35, Nakamura wrote:
> > > > > > > > >
> > > > > > > > > My suggestion was just do this in multiple steps/phases,
> > > firstly
> > > > > let's
> > > > > > > fix
> > > > > > > > > the issue of send being misleadingly asynchronous (i.e.
> > > internally
> > > > > its
> > > > > > > > > blocking) and then later one we can make the various
> > > > > > > > > threadpools configurable with a sane default.
> > > > > > > >
> > > > > > > > I like that approach. I updated the "Which thread should be
> > > > > responsible
> > > > > > > for
> > > > > > > > waiting" part of KIP-739 to add your suggestion as my
> > recommended
> > > > > > > approach,
> > > > > > > > thank you!  If no one else has major concerns about that
> > > approach,
> > > > > I'll
> > > > > > > > move the alternatives to "rejected alternatives".
> > > > > > > >
> > > > > > > > On Thu, May 20, 2021 at 7:26 AM Matthew de Detrich
> > > > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > > > >
> > > > > > > > > @
> > > > > > > > >
> > > > > > > > > Nakamura
> > > > > > > > > On Wed, May 19, 2021 at 7:35 PM Nakamura <nn...@gmail.com>
> > > wrote:
> > > > > > > > >
> > > > > > > > > > @Ryanne:
> > > > > > > > > > In my mind's eye I slightly prefer the throwing the "cannot
> > > > > enqueue"
> > > > > > > > > > exception to satisfying the future immediately with the
> > > "cannot
> > > > > > > enqueue"
> > > > > > > > > > exception?  But I agree, it would be worth doing more
> > > research.
> > > > > > > > > >
> > > > > > > > > > @Matthew:
> > > > > > > > > >
> > > > > > > > > > > 3. Using multiple thread pools is definitely recommended
> > > for
> > > > > > > different
> > > > > > > > > > > types of tasks, for serialization which is CPU bound you
> > > > > definitely
> > > > > > > > > would
> > > > > > > > > > > want to use a bounded thread pool that is fixed by the
> > > number
> > > > > of
> > > > > > > CPU's
> > > > > > > > > > (or
> > > > > > > > > > > something along those lines).
> > > > > > > > > > >
> > > > > https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > > > > > is
> > > > > > > > > a
> > > > > > > > > > > very good guide on this topic
> > > > > > > > > > I think this guide is good in general, but I would be
> > > hesitant to
> > > > > > > follow
> > > > > > > > > > its guidance re: offloading serialization without
> > > benchmarking
> > > > > it.
> > > > > > > My
> > > > > > > > > > understanding is that context-switches have gotten much
> > > cheaper,
> > > > > and
> > > > > > > that
> > > > > > > > > > gains from cache locality are small, but they're not
> > nothing.
> > > > > > > Especially
> > > > > > > > > > if the workload has a very small serialization cost, I
> > > wouldn't
> > > > > be
> > > > > > > > > shocked
> > > > > > > > > > if it made it slower.  I feel pretty strongly that we
> > should
> > > do
> > > > > more
> > > > > > > > > > research here before unconditionally encouraging
> > > serialization
> > > > > in a
> > > > > > > > > > threadpool.  If people think it's important to do it here
> > > (eg if
> > > > > we
> > > > > > > think
> > > > > > > > > > it would mean another big API change) then we should start
> > > > > thinking
> > > > > > > about
> > > > > > > > > > what benchmarking we can do to gain higher confidence in
> > this
> > > > > kind of
> > > > > > > > > > change.  However, I don't think it would change semantics
> > as
> > > > > > > > > substantially
> > > > > > > > > > as we're proposing here, so I would vote for pushing this
> > to
> > > a
> > > > > > > subsequent
> > > > > > > > > > KIP.
> > > > > > > > > >
> > > > > > > > > Of course, its all down to benchmarking, benchmarking and
> > > > > benchmarking.
> > > > > > > > > Ideally speaking you want to use all of the resources
> > > available to
> > > > > > > you, so
> > > > > > > > > if you have a bottleneck in serialization and you have many
> > > cores
> > > > > free
> > > > > > > then
> > > > > > > > > using multiple cores may be more appropriate than a single
> > > core.
> > > > > > > Typically
> > > > > > > > > I would expect that using a single thread to do serialization
> > > is
> > > > > > > likely to
> > > > > > > > > be the most situation, I was just responding to an earlier
> > > point
> > > > > that
> > > > > > > was
> > > > > > > > > made in regards to using ThreadPools for serialization (note
> > > that
> > > > > you
> > > > > > > can
> > > > > > > > > also just use a ThreadPool that is pinned to a single thread)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > 4. Regarding providing the ability for users to supply
> > > their
> > > > > own
> > > > > > > custom
> > > > > > > > > > > ThreadPool this is more of an ergonomics question for the
> > > API.
> > > > > > > > > Especially
> > > > > > > > > > > when it gets to monitoring/tracing, giving the ability
> > for
> > > > > users to
> > > > > > > > > > provide
> > > > > > > > > > > their own custom IO/CPU ThreadPools is ideal however as
> > > stated
> > > > > > > doing so
> > > > > > > > > > > means a lot of boilerplatery changes to the API.
> > Typically
> > > > > > > speaking a
> > > > > > > > > lot
> > > > > > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > > > > > ExecutionContext/ThreadPools
> > > > > > > > > > > (at least on a more rudimentary level) and hence allowing
> > > > > users to
> > > > > > > > > supply
> > > > > > > > > > a
> > > > > > > > > > > global singleton ThreadPool for IO tasks and another for
> > > CPU
> > > > > tasks
> > > > > > > > > makes
> > > > > > > > > > > their lives a lot easier. However due to the large amount
> > > of
> > > > > > > changes to
> > > > > > > > > > the
> > > > > > > > > > > API, it may be more appropriate to just use internal
> > thread
> > > > > pools
> > > > > > > (for
> > > > > > > > > > now)
> > > > > > > > > > > since at least it's not any worse than what exists
> > > currently
> > > > > and
> > > > > > > this
> > > > > > > > > can
> > > > > > > > > > > be an improvement that is done later?
> > > > > > > > > > Is there an existing threadpool that you suggest we reuse?
> > > Or
> > > > > are
> > > > > > > you
> > > > > > > > > > imagining that we make our own internal threadpool, and
> > then
> > > > > maybe
> > > > > > > expose
> > > > > > > > > > configuration flags to manipulate it?  For what it's
> > worth, I
> > > > > like
> > > > > > > having
> > > > > > > > > > an internal threadpool (perhaps just FJP.commonpool) and
> > then
> > > > > > > providing
> > > > > > > > > an
> > > > > > > > > > alternative to pass your own threadpool.  That way people
> > who
> > > > > want
> > > > > > > finer
> > > > > > > > > > control can get it, and everyone else can do OK with the
> > > default.
> > > > > > > > > >
> > > > > > > > > Indeed that is what I am saying. The most ideal situation is
> > > that
> > > > > > > there is
> > > > > > > > > a default internal threadpool that Kafka uses, however users
> > of
> > > > > Kafka
> > > > > > > can
> > > > > > > > > configure there own threadpool. Having a singleton ThreadPool
> > > for
> > > > > > > blocking
> > > > > > > > > IO, non blocking IO and CPU bound tasks which can be plugged
> > in
> > > > > all of
> > > > > > > your
> > > > > > > > > libraries (including Kafka) makes resource management much
> > > easier
> > > > > to
> > > > > > > do and
> > > > > > > > > also gives control of users to override specific threadpools
> > > for
> > > > > > > > > exceptional cases (i.e. providing a Threadpool that is pinned
> > > to a
> > > > > > > single
> > > > > > > > > core which tends to give the best latency results if this is
> > > > > something
> > > > > > > that
> > > > > > > > > is critical for you).
> > > > > > > > >
> > > > > > > > > My suggestion was just do this in multiple steps/phases,
> > > firstly
> > > > > let's
> > > > > > > fix
> > > > > > > > > the issue of send being misleadingly asynchronous (i.e.
> > > internally
> > > > > its
> > > > > > > > > blocking) and then later one we can make the various
> > > > > > > > > threadpools configurable with a sane default.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, May 19, 2021 at 6:01 AM Matthew de Detrich
> > > > > > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > > > > > >
> > > > > > > > > > > Here are my two cents here (note that I am only seeing
> > > this on
> > > > > a
> > > > > > > > > surface
> > > > > > > > > > > level)
> > > > > > > > > > >
> > > > > > > > > > > 1. If we are going this road it makes sense to do this
> > > > > "properly"
> > > > > > > (i.e.
> > > > > > > > > > > using queues  as Ryan suggested). The reason I am saying
> > > this
> > > > > is
> > > > > > > that
> > > > > > > > > it
> > > > > > > > > > > seems that the original goal of the KIP is for it to be
> > > used in
> > > > > > > other
> > > > > > > > > > > asynchronous systems and from my personal experience, you
> > > > > really do
> > > > > > > > > need
> > > > > > > > > > to
> > > > > > > > > > > make the implementation properly asynchronous otherwise
> > > it's
> > > > > > > really not
> > > > > > > > > > > that useful.
> > > > > > > > > > > 2. Due to the previous point and what was said by others,
> > > this
> > > > > is
> > > > > > > > > likely
> > > > > > > > > > > going to break some existing semantics (i.e. people are
> > > > > currently
> > > > > > > > > relying
> > > > > > > > > > > on blocking semantics) so adding another
> > method's/interface
> > > > > plus
> > > > > > > > > > > deprecating the older one is more annoying but ideal.
> > > > > > > > > > > 3. Using multiple thread pools is definitely recommended
> > > for
> > > > > > > different
> > > > > > > > > > > types of tasks, for serialization which is CPU bound you
> > > > > definitely
> > > > > > > > > would
> > > > > > > > > > > want to use a bounded thread pool that is fixed by the
> > > number
> > > > > of
> > > > > > > CPU's
> > > > > > > > > > (or
> > > > > > > > > > > something along those lines).
> > > > > > > > > > >
> > > > > https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > > > > > is
> > > > > > > > > a
> > > > > > > > > > > very good guide on this topic
> > > > > > > > > > > 4. Regarding providing the ability for users to supply
> > > their
> > > > > own
> > > > > > > custom
> > > > > > > > > > > ThreadPool this is more of an ergonomics question for the
> > > API.
> > > > > > > > > Especially
> > > > > > > > > > > when it gets to monitoring/tracing, giving the ability
> > for
> > > > > users to
> > > > > > > > > > provide
> > > > > > > > > > > their own custom IO/CPU ThreadPools is ideal however as
> > > stated
> > > > > > > doing so
> > > > > > > > > > > means a lot of boilerplatery changes to the API.
> > Typically
> > > > > > > speaking a
> > > > > > > > > lot
> > > > > > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > > > > > ExecutionContext/ThreadPools
> > > > > > > > > > > (at least on a more rudimentary level) and hence allowing
> > > > > users to
> > > > > > > > > > supply a
> > > > > > > > > > > global singleton ThreadPool for IO tasks and another for
> > > CPU
> > > > > tasks
> > > > > > > > > makes
> > > > > > > > > > > their lives a lot easier. However due to the large amount
> > > of
> > > > > > > changes to
> > > > > > > > > > the
> > > > > > > > > > > API, it may be more appropriate to just use internal
> > thread
> > > > > pools
> > > > > > > (for
> > > > > > > > > > now)
> > > > > > > > > > > since at least it's not any worse than what exists
> > > currently
> > > > > and
> > > > > > > this
> > > > > > > > > can
> > > > > > > > > > > be an improvement that is done later?
> > > > > > > > > > >
> > > > > > > > > > > On Wed, May 19, 2021 at 2:56 AM Ryanne Dolan <
> > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > I was thinking the sender would typically wrap send()
> > in
> > > a
> > > > > > > > > > backoff/retry
> > > > > > > > > > > > loop, or else ignore any failures and drop sends on the
> > > floor
> > > > > > > > > > > > (fire-and-forget), and in both cases I think failing
> > > > > immediately
> > > > > > > is
> > > > > > > > > > > better
> > > > > > > > > > > > than blocking for a new spot in the queue or
> > > asynchronously
> > > > > > > failing
> > > > > > > > > > > > somehow.
> > > > > > > > > > > >
> > > > > > > > > > > > I think a failed future is adequate for the "explicit
> > > > > > > backpressure
> > > > > > > > > > > signal"
> > > > > > > > > > > > while avoiding any blocking anywhere. I think if we try
> > > to
> > > > > > > > > > asynchronously
> > > > > > > > > > > > signal the caller of failure (either by asynchronously
> > > > > failing
> > > > > > > the
> > > > > > > > > > future
> > > > > > > > > > > > or invoking a callback off-thread or something) we'd
> > > force
> > > > > the
> > > > > > > caller
> > > > > > > > > > to
> > > > > > > > > > > > either block or poll waiting for that signal, which
> > > somewhat
> > > > > > > defeats
> > > > > > > > > > the
> > > > > > > > > > > > purpose we're after. And of course blocking for a spot
> > > in the
> > > > > > > queue
> > > > > > > > > > > > definitely defeats the purpose (tho perhaps ameliorates
> > > the
> > > > > > > problem
> > > > > > > > > > > some).
> > > > > > > > > > > >
> > > > > > > > > > > > Throwing an exception to the caller directly (not via
> > the
> > > > > > > future) is
> > > > > > > > > > > > another option with precedent in Kafka clients, tho it
> > > > > doesn't
> > > > > > > seem
> > > > > > > > > as
> > > > > > > > > > > > ergonomic to me.
> > > > > > > > > > > >
> > > > > > > > > > > > It would be interesting to analyze some existing usage
> > > and
> > > > > > > determine
> > > > > > > > > > how
> > > > > > > > > > > > difficult it would be to convert it to the various
> > > proposed
> > > > > APIs.
> > > > > > > > > > > >
> > > > > > > > > > > > Ryanne
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, May 18, 2021, 3:27 PM Nakamura <
> > nnythm@gmail.com
> > > >
> > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Ryanne,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hmm, that's an interesting idea.  Basically it would
> > > mean
> > > > > that
> > > > > > > > > after
> > > > > > > > > > > > > calling send, you would also have to check whether
> > the
> > > > > returned
> > > > > > > > > > future
> > > > > > > > > > > > had
> > > > > > > > > > > > > failed with a specific exception.  I would be open to
> > > it,
> > > > > > > although
> > > > > > > > > I
> > > > > > > > > > > > think
> > > > > > > > > > > > > it might be slightly more surprising, since right now
> > > the
> > > > > > > paradigm
> > > > > > > > > is
> > > > > > > > > > > > > "enqueue synchronously, the future represents whether
> > > we
> > > > > > > succeeded
> > > > > > > > > in
> > > > > > > > > > > > > sending or not" and the new one would be "enqueue
> > > > > > > synchronously,
> > > > > > > > > the
> > > > > > > > > > > > future
> > > > > > > > > > > > > either represents whether we succeeded in enqueueing
> > or
> > > > > not (in
> > > > > > > > > which
> > > > > > > > > > > > case
> > > > > > > > > > > > > it will be failed immediately if it failed to
> > enqueue)
> > > or
> > > > > > > whether
> > > > > > > > > we
> > > > > > > > > > > > > succeeded in sending or not".
> > > > > > > > > > > > >
> > > > > > > > > > > > > But you're right, it should be on the table, thank
> > you
> > > for
> > > > > > > > > suggesting
> > > > > > > > > > > it!
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Moses
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, May 18, 2021 at 12:23 PM Ryanne Dolan <
> > > > > > > > > ryannedolan@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Moses, in the case of a full queue, could we just
> > > return
> > > > > a
> > > > > > > failed
> > > > > > > > > > > > future
> > > > > > > > > > > > > > immediately?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, May 18, 2021, 10:39 AM Nakamura <
> > > > > nnythm@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Alexandre,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for bringing this up, I think I could use
> > > some
> > > > > > > feedback
> > > > > > > > > in
> > > > > > > > > > > > this
> > > > > > > > > > > > > > > area.  There are two mechanisms here, one for
> > > slowing
> > > > > down
> > > > > > > when
> > > > > > > > > > we
> > > > > > > > > > > > > don't
> > > > > > > > > > > > > > > have the relevant metadata, and the other for
> > > slowing
> > > > > down
> > > > > > > > > when a
> > > > > > > > > > > > queue
> > > > > > > > > > > > > > has
> > > > > > > > > > > > > > > filled up.  Although the first one applies
> > > backpressure
> > > > > > > > > somewhat
> > > > > > > > > > > > > > > inadvertently, we could still get in trouble if
> > > we're
> > > > > not
> > > > > > > > > > providing
> > > > > > > > > > > > > > > information to the mechanism that monitors
> > whether
> > > > > we're
> > > > > > > > > queueing
> > > > > > > > > > > too
> > > > > > > > > > > > > > > much.  As for the second one, that is a classic
> > > > > > > backpressure
> > > > > > > > > use
> > > > > > > > > > > > case,
> > > > > > > > > > > > > so
> > > > > > > > > > > > > > > it's definitely important that we don't drop that
> > > > > ability.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Right now backpressure is applied by blocking,
> > > which
> > > > > is a
> > > > > > > > > natural
> > > > > > > > > > > way
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > apply backpressure in synchronous systems, but
> > can
> > > > > lead to
> > > > > > > > > > > > unnecessary
> > > > > > > > > > > > > > > slowdowns in asynchronous systems.  In my
> > opinion,
> > > the
> > > > > > > safest
> > > > > > > > > way
> > > > > > > > > > > to
> > > > > > > > > > > > > > apply
> > > > > > > > > > > > > > > backpressure in an asynchronous model is to have
> > an
> > > > > > > explicit
> > > > > > > > > > > > > backpressure
> > > > > > > > > > > > > > > signal.  A good example would be returning an
> > > > > exception,
> > > > > > > and
> > > > > > > > > > > > providing
> > > > > > > > > > > > > an
> > > > > > > > > > > > > > > optional hook to add a callback onto so that you
> > > can be
> > > > > > > > > notified
> > > > > > > > > > > when
> > > > > > > > > > > > > > it's
> > > > > > > > > > > > > > > ready to accept more messages.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > However, this would be a really big change to how
> > > > > users use
> > > > > > > > > > > > > > > KafkaProducer#send, so I don't know how much
> > > appetite
> > > > > we
> > > > > > > have
> > > > > > > > > for
> > > > > > > > > > > > > making
> > > > > > > > > > > > > > > that kind of change.  Maybe it would be simpler
> > to
> > > > > remove
> > > > > > > the
> > > > > > > > > > > "don't
> > > > > > > > > > > > > > block
> > > > > > > > > > > > > > > when the per-topic queue is full" from the scope
> > of
> > > > > this
> > > > > > > KIP,
> > > > > > > > > and
> > > > > > > > > > > > only
> > > > > > > > > > > > > > > focus on when metadata is available?  The
> > downside
> > > is
> > > > > that
> > > > > > > we
> > > > > > > > > > will
> > > > > > > > > > > > > > probably
> > > > > > > > > > > > > > > want to change the API again later to fix this,
> > so
> > > it
> > > > > > > might be
> > > > > > > > > > > better
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > just rip the bandaid off now.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > One slightly nasty thing here is that because
> > > queueing
> > > > > > > order is
> > > > > > > > > > > > > > important,
> > > > > > > > > > > > > > > if we want to use exceptions, we will want to be
> > > able
> > > > > to
> > > > > > > signal
> > > > > > > > > > the
> > > > > > > > > > > > > > failure
> > > > > > > > > > > > > > > to enqueue to the caller in such a way that they
> > > can
> > > > > still
> > > > > > > > > > enforce
> > > > > > > > > > > > > > message
> > > > > > > > > > > > > > > order if they want.  So we can't embed the
> > failure
> > > > > > > directly in
> > > > > > > > > > the
> > > > > > > > > > > > > > returned
> > > > > > > > > > > > > > > future, we should either return two futures
> > > (nested,
> > > > > or as
> > > > > > > a
> > > > > > > > > > tuple)
> > > > > > > > > > > > or
> > > > > > > > > > > > > > else
> > > > > > > > > > > > > > > throw an exception to explain a backpressure.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > So there are a few things we should work out
> > here:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1. Should we keep the "too many bytes enqueued"
> > > part of
> > > > > > > this in
> > > > > > > > > > > > scope?
> > > > > > > > > > > > > > (I
> > > > > > > > > > > > > > > would say yes, so that we can minimize churn in
> > > this
> > > > > API)
> > > > > > > > > > > > > > > 2. How should we signal backpressure so that it's
> > > > > > > appropriate
> > > > > > > > > for
> > > > > > > > > > > > > > > asynchronous systems?  (I would say that we
> > should
> > > > > throw an
> > > > > > > > > > > > exception.
> > > > > > > > > > > > > > If
> > > > > > > > > > > > > > > we choose this and we want to pursue the queueing
> > > > > path, we
> > > > > > > > > would
> > > > > > > > > > > > *not*
> > > > > > > > > > > > > > want
> > > > > > > > > > > > > > > to enqueue messages that would push us over the
> > > limit,
> > > > > and
> > > > > > > > > would
> > > > > > > > > > > only
> > > > > > > > > > > > > > want
> > > > > > > > > > > > > > > to enqueue messages when we're waiting for
> > > metadata,
> > > > > and we
> > > > > > > > > would
> > > > > > > > > > > > want
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > keep track of the total number of bytes for those
> > > > > > > messages).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Sun, May 16, 2021 at 6:16 AM Alexandre
> > Dupriez <
> > > > > > > > > > > > > > > alexandre.dupriez@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hello Nakamura,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for proposing this change. I can see how
> > > the
> > > > > > > blocking
> > > > > > > > > > > > > behaviour
> > > > > > > > > > > > > > > > can be a problem when integrating with reactive
> > > > > > > frameworks
> > > > > > > > > such
> > > > > > > > > > > as
> > > > > > > > > > > > > > > > Akka. One of the questions I would have is how
> > > you
> > > > > would
> > > > > > > > > handle
> > > > > > > > > > > > back
> > > > > > > > > > > > > > > > pressure and avoid memory exhaustion when the
> > > > > producer's
> > > > > > > > > buffer
> > > > > > > > > > > is
> > > > > > > > > > > > > > > > full and tasks would start to accumulate in the
> > > > > > > out-of-band
> > > > > > > > > > queue
> > > > > > > > > > > > or
> > > > > > > > > > > > > > > > thread pool introduced with this KIP.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > Alexandre
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Le ven. 14 mai 2021 à 15:55, Ryanne Dolan <
> > > > > > > > > > ryannedolan@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > > a
> > > > > > > > > > > > > > > écrit
> > > > > > > > > > > > > > > > :
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Makes sense!
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Fri, May 14, 2021, 9:39 AM Nakamura <
> > > > > > > nnythm@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I see what you're saying about serde
> > > blocking,
> > > > > but I
> > > > > > > > > think
> > > > > > > > > > we
> > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > > consider it out of scope for this patch.
> > > Right
> > > > > now
> > > > > > > we've
> > > > > > > > > > > > nailed
> > > > > > > > > > > > > > > down a
> > > > > > > > > > > > > > > > > > couple of use cases where we can
> > > unambiguously
> > > > > say,
> > > > > > > "I
> > > > > > > > > can
> > > > > > > > > > > make
> > > > > > > > > > > > > > > > progress
> > > > > > > > > > > > > > > > > > now" or "I cannot make progress now", which
> > > > > makes it
> > > > > > > > > > possible
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > > offload to
> > > > > > > > > > > > > > > > > > a different thread only if we are unable to
> > > make
> > > > > > > > > progress.
> > > > > > > > > > > > > > Extending
> > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > to CPU work like serde would mean always
> > > > > offloading,
> > > > > > > > > which
> > > > > > > > > > > > would
> > > > > > > > > > > > > > be a
> > > > > > > > > > > > > > > > > > really big performance change.  It might be
> > > worth
> > > > > > > > > exploring
> > > > > > > > > > > > > anyway,
> > > > > > > > > > > > > > > > but I'd
> > > > > > > > > > > > > > > > > > rather keep this patch focused on improving
> > > > > > > ergonomics,
> > > > > > > > > > > rather
> > > > > > > > > > > > > than
> > > > > > > > > > > > > > > > > > muddying the waters with evaluating
> > > performance
> > > > > very
> > > > > > > > > > deeply.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I think if we really do want to support
> > > serde or
> > > > > > > > > > interceptors
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > IO on
> > > > > > > > > > > > > > > > > > the send path (which seems like an
> > > anti-pattern
> > > > > to
> > > > > > > me),
> > > > > > > > > we
> > > > > > > > > > > > should
> > > > > > > > > > > > > > > > consider
> > > > > > > > > > > > > > > > > > making that a separate SIP, and probably
> > also
> > > > > > > consider
> > > > > > > > > > > changing
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > API to
> > > > > > > > > > > > > > > > > > use Futures (or CompletionStages).  But I
> > > would
> > > > > > > rather
> > > > > > > > > > avoid
> > > > > > > > > > > > > scope
> > > > > > > > > > > > > > > > creep,
> > > > > > > > > > > > > > > > > > so that we have a better chance of fixing
> > > this
> > > > > part
> > > > > > > of
> > > > > > > > > the
> > > > > > > > > > > > > problem.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Yes, I think some exceptions will move to
> > > being
> > > > > async
> > > > > > > > > > instead
> > > > > > > > > > > > of
> > > > > > > > > > > > > > > sync.
> > > > > > > > > > > > > > > > > > They'll still be surfaced in the Future, so
> > > I'm
> > > > > not
> > > > > > > so
> > > > > > > > > > > > confident
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > would be that big a shock to users though.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Thu, May 13, 2021 at 7:44 PM Ryanne
> > Dolan
> > > <
> > > > > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > re serialization, my concern is that
> > > > > serialization
> > > > > > > > > often
> > > > > > > > > > > > > accounts
> > > > > > > > > > > > > > > > for a
> > > > > > > > > > > > > > > > > > lot
> > > > > > > > > > > > > > > > > > > of the cycles spent before returning the
> > > > > future.
> > > > > > > It's
> > > > > > > > > not
> > > > > > > > > > > > > > blocking
> > > > > > > > > > > > > > > > per
> > > > > > > > > > > > > > > > > > se,
> > > > > > > > > > > > > > > > > > > but it's the same effect from the
> > caller's
> > > > > > > perspective.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Moreover, serde impls often block
> > > themselves,
> > > > > e.g.
> > > > > > > when
> > > > > > > > > > > > > fetching
> > > > > > > > > > > > > > > > schemas
> > > > > > > > > > > > > > > > > > > from a registry. I suppose it's also
> > > possible
> > > > > to
> > > > > > > block
> > > > > > > > > in
> > > > > > > > > > > > > > > > Interceptors
> > > > > > > > > > > > > > > > > > > (e.g. writing audit events or metrics),
> > > which
> > > > > > > happens
> > > > > > > > > > > before
> > > > > > > > > > > > > > serdes
> > > > > > > > > > > > > > > > iiuc.
> > > > > > > > > > > > > > > > > > > So any blocking in either of those
> > plugins
> > > > > would
> > > > > > > block
> > > > > > > > > > the
> > > > > > > > > > > > send
> > > > > > > > > > > > > > > > unless we
> > > > > > > > > > > > > > > > > > > queue first.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > So I think we want to queue first and do
> > > > > everything
> > > > > > > > > > > > off-thread
> > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > using
> > > > > > > > > > > > > > > > > > > the new API, whatever that looks like. I
> > > just
> > > > > want
> > > > > > > to
> > > > > > > > > > make
> > > > > > > > > > > > sure
> > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > don't
> > > > > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > > > that for clients that wouldn't expect it.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Another consideration is exception
> > > handling.
> > > > > If we
> > > > > > > > > queue
> > > > > > > > > > > > right
> > > > > > > > > > > > > > > away,
> > > > > > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > > > > > defer some exceptions that currently are
> > > > > thrown to
> > > > > > > the
> > > > > > > > > > > caller
> > > > > > > > > > > > > > > > (before the
> > > > > > > > > > > > > > > > > > > future is returned). In the new API, the
> > > send()
> > > > > > > > > wouldn't
> > > > > > > > > > > > throw
> > > > > > > > > > > > > > any
> > > > > > > > > > > > > > > > > > > exceptions, and instead the future would
> > > fail.
> > > > > I
> > > > > > > think
> > > > > > > > > > that
> > > > > > > > > > > > > might
> > > > > > > > > > > > > > > > mean
> > > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > a new method signature is required.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Thu, May 13, 2021, 2:57 PM Nakamura <
> > > > > > > > > > > > > nakamura.moses@gmail.com
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I agree we should add an additional
> > > > > constructor
> > > > > > > (or
> > > > > > > > > > else
> > > > > > > > > > > an
> > > > > > > > > > > > > > > > additional
> > > > > > > > > > > > > > > > > > > > overload in KafkaProducer#send, but the
> > > new
> > > > > > > > > constructor
> > > > > > > > > > > > would
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > easier
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > understand) if we're targeting the
> > "user
> > > > > > > provides the
> > > > > > > > > > > > thread"
> > > > > > > > > > > > > > > > approach.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > From looking at the code, I think we
> > can
> > > keep
> > > > > > > record
> > > > > > > > > > > > > > > serialization
> > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > user thread, if we consider that an
> > > important
> > > > > > > part of
> > > > > > > > > > the
> > > > > > > > > > > > > > > > semantics of
> > > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > > method.  It doesn't seem like
> > > serialization
> > > > > > > depends
> > > > > > > > > on
> > > > > > > > > > > > > knowing
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > cluster,
> > > > > > > > > > > > > > > > > > > > I think it's incidental that it comes
> > > after
> > > > > the
> > > > > > > first
> > > > > > > > > > > > > > "blocking"
> > > > > > > > > > > > > > > > > > segment
> > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > the method.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Thu, May 13, 2021 at 2:38 PM Ryanne
> > > Dolan
> > > > > <
> > > > > > > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Hey Moses, I like the direction here.
> > > My
> > > > > > > thinking
> > > > > > > > > is
> > > > > > > > > > > > that a
> > > > > > > > > > > > > > > > single
> > > > > > > > > > > > > > > > > > > > > additional work queue, s.t. send()
> > can
> > > > > enqueue
> > > > > > > and
> > > > > > > > > > > > return,
> > > > > > > > > > > > > > > seems
> > > > > > > > > > > > > > > > like
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > lightest touch. However, I don't
> > think
> > > we
> > > > > can
> > > > > > > > > > trivially
> > > > > > > > > > > > > > process
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > queue
> > > > > > > > > > > > > > > > > > > > > in an internal thread pool without
> > > subtly
> > > > > > > changing
> > > > > > > > > > > > behavior
> > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > > > > > > > For example, users will often run
> > > send() in
> > > > > > > > > multiple
> > > > > > > > > > > > > threads
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > order
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > serialize faster, but that wouldn't
> > > work
> > > > > quite
> > > > > > > the
> > > > > > > > > > same
> > > > > > > > > > > > if
> > > > > > > > > > > > > > > there
> > > > > > > > > > > > > > > > were
> > > > > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > > > internal thread pool.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > For this reason I'm thinking we need
> > to
> > > > > make
> > > > > > > sure
> > > > > > > > > any
> > > > > > > > > > > > such
> > > > > > > > > > > > > > > > changes
> > > > > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > > > > opt-in. Maybe a new constructor with
> > an
> > > > > > > additional
> > > > > > > > > > > > > > > ThreadFactory
> > > > > > > > > > > > > > > > > > > > parameter.
> > > > > > > > > > > > > > > > > > > > > That would at least clearly indicate
> > > that
> > > > > work
> > > > > > > will
> > > > > > > > > > > > happen
> > > > > > > > > > > > > > > > > > off-thread,
> > > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > would require opt-in for the new
> > > behavior.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Under the hood, this ThreadFactory
> > > could be
> > > > > > > used to
> > > > > > > > > > > > create
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > worker
> > > > > > > > > > > > > > > > > > > > > thread that process queued sends,
> > which
> > > > > could
> > > > > > > > > fan-out
> > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > per-partition
> > > > > > > > > > > > > > > > > > > > > threads from there.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > So then you'd have two ways to send:
> > > the
> > > > > > > existing
> > > > > > > > > > way,
> > > > > > > > > > > > > where
> > > > > > > > > > > > > > > > serde
> > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > interceptors and whatnot are executed
> > > on
> > > > > the
> > > > > > > > > calling
> > > > > > > > > > > > > thread,
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > > > > > way, which returns right away and
> > uses
> > > an
> > > > > > > internal
> > > > > > > > > > > > > Executor.
> > > > > > > > > > > > > > As
> > > > > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > > > > point
> > > > > > > > > > > > > > > > > > > > > out, the semantics would be identical
> > > in
> > > > > either
> > > > > > > > > case,
> > > > > > > > > > > and
> > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > would be
> > > > > > > > > > > > > > > > > > > > very
> > > > > > > > > > > > > > > > > > > > > easy for clients to switch.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Thu, May 13, 2021, 9:00 AM
> > Nakamura
> > > <
> > > > > > > > > > > nnythm@gmail.com
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Hey Folks,
> > > > > > > > > > > > > > > > > > > > > > I just posted a new proposal
> > > > > > > > > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181306446
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > in the wiki.  I think we have an
> > > > > opportunity
> > > > > > > to
> > > > > > > > > > > improve
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > KafkaProducer#send user experience.
> > > It
> > > > > would
> > > > > > > > > > > certainly
> > > > > > > > > > > > > > make
> > > > > > > > > > > > > > > > our
> > > > > > > > > > > > > > > > > > > lives
> > > > > > > > > > > > > > > > > > > > > > easier.  Please take a look!  There
> > > are
> > > > > two
> > > > > > > > > > > subproblems
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > > > > guidance on, so I would appreciate
> > > > > feedback
> > > > > > > on
> > > > > > > > > both
> > > > > > > > > > > of
> > > > > > > > > > > > > > them.
> > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > >
> > > > > > > > > > > Matthew de Detrich
> > > > > > > > > > >
> > > > > > > > > > > *Aiven Deutschland GmbH*
> > > > > > > > > > >
> > > > > > > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > > > > > > >
> > > > > > > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > > > > > > >
> > > > > > > > > > > *m:* +491603708037
> > > > > > > > > > >
> > > > > > > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > >
> > > > > > > > > Matthew de Detrich
> > > > > > > > >
> > > > > > > > > *Aiven Deutschland GmbH*
> > > > > > > > >
> > > > > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > > > > >
> > > > > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > > > > >
> > > > > > > > > *m:* +491603708037
> > > > > > > > >
> > > > > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 

Re: [DISCUSS] KIP-739: Block Less on KafkaProducer#send

Posted by Nakamura <nn...@gmail.com>.
Hi Colin,

> KafkaProducer#send is supposed to initiate non-blocking I/O, but not wait
for it to complete.
>
> There's more information about non-blocking I/O in Java here:
> https://en.wikipedia.org/wiki/Non-blocking_I/O_%28Java%29
I think we're talking past each other a bit.  I know about non-blocking
I/O.  The problem I'm facing is how to preserve the existing semantics
without blocking.  Right now callers assume their work is enqueued in-order
after `KafkaProducer#send` returns.  We can't simply return a future that
represents the metadata fetch, because of that assumption.  We need to
maintain order somehow.  That is what all of the different queues we're
proposing are intended to do.

> How are the ordering semantics of `KafkaProducer#send` related to the
metadata fetch?
KafkaProducer#send currently enqueues after it has the metadata, and it
passes the TopicPartition struct as part of the data when enqueueing.  We
can either update that data structure to be able to work with partial
metadata, or we can add a new queue on top.  I outline both potential
approaches in the current KIP.

> That is not related to the metadata fetch. Also, I already proposed a
solution (returning an error) if this is a concern.
Unfortunately it is, because `KafkaProducer#send` conflates the two of
them.  That seems to be the central difficulty of preserving the semantics
here.

> The same client thread that always has been responsible for checking poll.
Please pretend I've never contributed to Kafka before :). Which thread is
that?

Best,
Moses

On Tue, Jun 1, 2021 at 3:12 PM Ryanne Dolan <ry...@gmail.com> wrote:

> Colin, the issue for me isn't so much whether non-blocking I/O is used or
> not, but the fact that the caller observes a long time between calling
> send() and receiving the returned future. This behavior can be considered
> "blocking" whether or not I/O is involved.
>
> > How are the ordering semantics of `KafkaProducer#send` related to the
> metadata fetch?
> > I already proposed a solution (returning an error)
>
> There is a subtle difference between failing immediately vs blocking for
> metadata, related to ordering in the face of retries. Say we set the send
> timeout to max-long (or something high enough that we rarely encounter
> timeouts in practice), and set max inflight requests to 1. Today, we can
> reasonably assume that calling send() in sequence to a specific partition
> will result in the corresponding sequence landing on that partition,
> regardless of how the caller handles retries. The caller might not handle
> retries at all. But if we can fail immediately (e.g. when the metadata
> isn't yet ready), then the caller must handle retries carefully.
> Specifically, the caller must retry each send() before proceeding to the
> next. This basically means that the caller must block on each send() in
> order to maintain the proper sequence -- how else would the caller know
> whether it will need to retry or not?
>
> In other words, failing immediately punts the problem to the caller to
> handle, while the caller is less-equipped to deal with it. I don't think we
> should do that, at least not in the default case.
>
> I actually don't have any objections to this approach so long as it's
> opt-in. It sounds like you are suggesting to fix the bug for everyone, but
> I don't think we can do that without subtly breaking things.
>
> Ryanne
>
> On Tue, Jun 1, 2021 at 12:31 PM Colin McCabe <cm...@apache.org> wrote:
>
> > On Tue, Jun 1, 2021, at 07:00, Nakamura wrote:
> > > Hi Colin,
> > >
> > > Sorry, I still don't follow.
> > >
> > > Right now `KafkaProducer#send` seems to trigger a metadata fetch.
> Today,
> > > we block on that before returning.  Is your proposal that we move the
> > > metadata fetch out of `KafkaProducer#send` entirely?
> > >
> >
> > KafkaProducer#send is supposed to initiate non-blocking I/O, but not wait
> > for it to complete.
> >
> > There's more information about non-blocking I/O in Java here:
> > https://en.wikipedia.org/wiki/Non-blocking_I/O_%28Java%29
> >
> > >
> > > Even if the metadata fetch moves to be non-blocking, I think we still
> > need
> > > to deal with the problems we've discussed before if the fetch happens
> in
> > > the `KafkaProducer#send` method.  How do we maintain the ordering
> > semantics
> > > of `KafkaProducer#send`?
> >
> > How are the ordering semantics of `KafkaProducer#send` related to the
> > metadata fetch?
> >
> > >  How do we prevent our buffer from filling up?
> >
> > That is not related to the metadata fetch. Also, I already proposed a
> > solution (returning an error) if this is a concern.
> >
> > > Which thread is responsible for checking poll()?
> >
> > The same client thread that always has been responsible for checking
> poll.
> >
> > >
> > > The only approach I can see that would avoid this would be moving the
> > > metadata fetch to happen at a different time.  But it's not clear to me
> > > when would be a more appropriate time to do the metadata fetch than
> > > `KafkaProducer#send`.
> > >
> >
> > It's not about moving the metadata fetch to happen at a different time.
> > It's about using non-blocking I/O, like we do for other network I/O. (And
> > actually, if you want to get really technical, we do this for the
> metadata
> > fetch too, it's just that we have a hack that loops to transform it back
> > into blocking I/O.)
> >
> > best,
> > Colin
> >
> > > I think there's something I'm missing here.  Would you mind helping me
> > > figure out what it is?
> > >
> > > Best,
> > > Moses
> > >
> > > On Sun, May 30, 2021 at 5:35 PM Colin McCabe <cm...@apache.org>
> wrote:
> > >
> > > > On Tue, May 25, 2021, at 11:26, Nakamura wrote:
> > > > > Hey Colin,
> > > > >
> > > > > For the metadata case, what would fixing the bug look like?  I
> agree
> > that
> > > > > we should fix it, but I don't have a clear picture in my mind of
> what
> > > > > fixing it should look like.  Can you elaborate?
> > > > >
> > > >
> > > > If the blocking metadata fetch bug were fixed, neither the producer
> nor
> > > > the consumer would block while fetching metadata. A poll() call would
> > > > initiate a metadata fetch if needed, and a subsequent call to poll()
> > would
> > > > handle the results if needed. Basically the same paradigm we use for
> > other
> > > > network communication in the producer and consumer.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > > > Best,
> > > > > Moses
> > > > >
> > > > > On Mon, May 24, 2021 at 1:54 PM Colin McCabe <cm...@apache.org>
> > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I agree that we should give users the option of having a fully
> > async
> > > > API,
> > > > > > but I don't think external thread pools or queues are the right
> > > > direction
> > > > > > to go here. They add performance overheads and don't address the
> > root
> > > > > > causes of the problem.
> > > > > >
> > > > > > There are basically two scenarios where we block, currently. One
> is
> > > > when
> > > > > > we are doing a metadata fetch. I think this is clearly a bug, or
> at
> > > > least
> > > > > > an implementation limitation. From the user's point of view, the
> > fact
> > > > that
> > > > > > we are doing a metadata fetch is an implementation detail that
> > really
> > > > > > shouldn't be exposed like this. We have talked about fixing this
> > in the
> > > > > > past. I think we just should spend the time to do it.
> > > > > >
> > > > > > The second scenario is where the client has produced too much
> data
> > in
> > > > too
> > > > > > little time. This could happen if there is a network glitch, or
> the
> > > > server
> > > > > > is slower than expected. In this case, the behavior is
> intentional
> > and
> > > > not
> > > > > > a bug. To understand this, think about what would happen if we
> > didn't
> > > > > > block. We would start buffering more and more data in memory,
> until
> > > > finally
> > > > > > the application died with an out of memory error. That would be
> > > > frustrating
> > > > > > for users and wouldn't add to the usability of Kafka.
> > > > > >
> > > > > > We could potentially have an option to handle the out-of-memory
> > > > scenario
> > > > > > differently by returning an error code immediately rather than
> > > > blocking.
> > > > > > Applications would have to be rewritten to handle this properly,
> > but
> > > > it is
> > > > > > a possibility. I suspect that most of them wouldn't use this, but
> > we
> > > > could
> > > > > > offer it as a possibility for async purists (which might include
> > > > certain
> > > > > > frameworks). The big problem the users would have to solve is
> what
> > to
> > > > do
> > > > > > with the record that they were unable to produce due to the
> buffer
> > full
> > > > > > issue.
> > > > > >
> > > > > > best,
> > > > > > Colin
> > > > > >
> > > > > >
> > > > > > On Thu, May 20, 2021, at 10:35, Nakamura wrote:
> > > > > > > >
> > > > > > > > My suggestion was just do this in multiple steps/phases,
> > firstly
> > > > let's
> > > > > > fix
> > > > > > > > the issue of send being misleadingly asynchronous (i.e.
> > internally
> > > > its
> > > > > > > > blocking) and then later one we can make the various
> > > > > > > > threadpools configurable with a sane default.
> > > > > > >
> > > > > > > I like that approach. I updated the "Which thread should be
> > > > responsible
> > > > > > for
> > > > > > > waiting" part of KIP-739 to add your suggestion as my
> recommended
> > > > > > approach,
> > > > > > > thank you!  If no one else has major concerns about that
> > approach,
> > > > I'll
> > > > > > > move the alternatives to "rejected alternatives".
> > > > > > >
> > > > > > > On Thu, May 20, 2021 at 7:26 AM Matthew de Detrich
> > > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > > >
> > > > > > > > @
> > > > > > > >
> > > > > > > > Nakamura
> > > > > > > > On Wed, May 19, 2021 at 7:35 PM Nakamura <nn...@gmail.com>
> > wrote:
> > > > > > > >
> > > > > > > > > @Ryanne:
> > > > > > > > > In my mind's eye I slightly prefer the throwing the "cannot
> > > > enqueue"
> > > > > > > > > exception to satisfying the future immediately with the
> > "cannot
> > > > > > enqueue"
> > > > > > > > > exception?  But I agree, it would be worth doing more
> > research.
> > > > > > > > >
> > > > > > > > > @Matthew:
> > > > > > > > >
> > > > > > > > > > 3. Using multiple thread pools is definitely recommended
> > for
> > > > > > different
> > > > > > > > > > types of tasks, for serialization which is CPU bound you
> > > > definitely
> > > > > > > > would
> > > > > > > > > > want to use a bounded thread pool that is fixed by the
> > number
> > > > of
> > > > > > CPU's
> > > > > > > > > (or
> > > > > > > > > > something along those lines).
> > > > > > > > > >
> > > > https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > > > > is
> > > > > > > > a
> > > > > > > > > > very good guide on this topic
> > > > > > > > > I think this guide is good in general, but I would be
> > hesitant to
> > > > > > follow
> > > > > > > > > its guidance re: offloading serialization without
> > benchmarking
> > > > it.
> > > > > > My
> > > > > > > > > understanding is that context-switches have gotten much
> > cheaper,
> > > > and
> > > > > > that
> > > > > > > > > gains from cache locality are small, but they're not
> nothing.
> > > > > > Especially
> > > > > > > > > if the workload has a very small serialization cost, I
> > wouldn't
> > > > be
> > > > > > > > shocked
> > > > > > > > > if it made it slower.  I feel pretty strongly that we
> should
> > do
> > > > more
> > > > > > > > > research here before unconditionally encouraging
> > serialization
> > > > in a
> > > > > > > > > threadpool.  If people think it's important to do it here
> > (eg if
> > > > we
> > > > > > think
> > > > > > > > > it would mean another big API change) then we should start
> > > > thinking
> > > > > > about
> > > > > > > > > what benchmarking we can do to gain higher confidence in
> this
> > > > kind of
> > > > > > > > > change.  However, I don't think it would change semantics
> as
> > > > > > > > substantially
> > > > > > > > > as we're proposing here, so I would vote for pushing this
> to
> > a
> > > > > > subsequent
> > > > > > > > > KIP.
> > > > > > > > >
> > > > > > > > Of course, its all down to benchmarking, benchmarking and
> > > > benchmarking.
> > > > > > > > Ideally speaking you want to use all of the resources
> > available to
> > > > > > you, so
> > > > > > > > if you have a bottleneck in serialization and you have many
> > cores
> > > > free
> > > > > > then
> > > > > > > > using multiple cores may be more appropriate than a single
> > core.
> > > > > > Typically
> > > > > > > > I would expect that using a single thread to do serialization
> > is
> > > > > > likely to
> > > > > > > > be the most situation, I was just responding to an earlier
> > point
> > > > that
> > > > > > was
> > > > > > > > made in regards to using ThreadPools for serialization (note
> > that
> > > > you
> > > > > > can
> > > > > > > > also just use a ThreadPool that is pinned to a single thread)
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > 4. Regarding providing the ability for users to supply
> > their
> > > > own
> > > > > > custom
> > > > > > > > > > ThreadPool this is more of an ergonomics question for the
> > API.
> > > > > > > > Especially
> > > > > > > > > > when it gets to monitoring/tracing, giving the ability
> for
> > > > users to
> > > > > > > > > provide
> > > > > > > > > > their own custom IO/CPU ThreadPools is ideal however as
> > stated
> > > > > > doing so
> > > > > > > > > > means a lot of boilerplatery changes to the API.
> Typically
> > > > > > speaking a
> > > > > > > > lot
> > > > > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > > > > ExecutionContext/ThreadPools
> > > > > > > > > > (at least on a more rudimentary level) and hence allowing
> > > > users to
> > > > > > > > supply
> > > > > > > > > a
> > > > > > > > > > global singleton ThreadPool for IO tasks and another for
> > CPU
> > > > tasks
> > > > > > > > makes
> > > > > > > > > > their lives a lot easier. However due to the large amount
> > of
> > > > > > changes to
> > > > > > > > > the
> > > > > > > > > > API, it may be more appropriate to just use internal
> thread
> > > > pools
> > > > > > (for
> > > > > > > > > now)
> > > > > > > > > > since at least it's not any worse than what exists
> > currently
> > > > and
> > > > > > this
> > > > > > > > can
> > > > > > > > > > be an improvement that is done later?
> > > > > > > > > Is there an existing threadpool that you suggest we reuse?
> > Or
> > > > are
> > > > > > you
> > > > > > > > > imagining that we make our own internal threadpool, and
> then
> > > > maybe
> > > > > > expose
> > > > > > > > > configuration flags to manipulate it?  For what it's
> worth, I
> > > > like
> > > > > > having
> > > > > > > > > an internal threadpool (perhaps just FJP.commonpool) and
> then
> > > > > > providing
> > > > > > > > an
> > > > > > > > > alternative to pass your own threadpool.  That way people
> who
> > > > want
> > > > > > finer
> > > > > > > > > control can get it, and everyone else can do OK with the
> > default.
> > > > > > > > >
> > > > > > > > Indeed that is what I am saying. The most ideal situation is
> > that
> > > > > > there is
> > > > > > > > a default internal threadpool that Kafka uses, however users
> of
> > > > Kafka
> > > > > > can
> > > > > > > > configure there own threadpool. Having a singleton ThreadPool
> > for
> > > > > > blocking
> > > > > > > > IO, non blocking IO and CPU bound tasks which can be plugged
> in
> > > > all of
> > > > > > your
> > > > > > > > libraries (including Kafka) makes resource management much
> > easier
> > > > to
> > > > > > do and
> > > > > > > > also gives control of users to override specific threadpools
> > for
> > > > > > > > exceptional cases (i.e. providing a Threadpool that is pinned
> > to a
> > > > > > single
> > > > > > > > core which tends to give the best latency results if this is
> > > > something
> > > > > > that
> > > > > > > > is critical for you).
> > > > > > > >
> > > > > > > > My suggestion was just do this in multiple steps/phases,
> > firstly
> > > > let's
> > > > > > fix
> > > > > > > > the issue of send being misleadingly asynchronous (i.e.
> > internally
> > > > its
> > > > > > > > blocking) and then later one we can make the various
> > > > > > > > threadpools configurable with a sane default.
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, May 19, 2021 at 6:01 AM Matthew de Detrich
> > > > > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > > > > >
> > > > > > > > > > Here are my two cents here (note that I am only seeing
> > this on
> > > > a
> > > > > > > > surface
> > > > > > > > > > level)
> > > > > > > > > >
> > > > > > > > > > 1. If we are going this road it makes sense to do this
> > > > "properly"
> > > > > > (i.e.
> > > > > > > > > > using queues  as Ryan suggested). The reason I am saying
> > this
> > > > is
> > > > > > that
> > > > > > > > it
> > > > > > > > > > seems that the original goal of the KIP is for it to be
> > used in
> > > > > > other
> > > > > > > > > > asynchronous systems and from my personal experience, you
> > > > really do
> > > > > > > > need
> > > > > > > > > to
> > > > > > > > > > make the implementation properly asynchronous otherwise
> > it's
> > > > > > really not
> > > > > > > > > > that useful.
> > > > > > > > > > 2. Due to the previous point and what was said by others,
> > this
> > > > is
> > > > > > > > likely
> > > > > > > > > > going to break some existing semantics (i.e. people are
> > > > currently
> > > > > > > > relying
> > > > > > > > > > on blocking semantics) so adding another
> method's/interface
> > > > plus
> > > > > > > > > > deprecating the older one is more annoying but ideal.
> > > > > > > > > > 3. Using multiple thread pools is definitely recommended
> > for
> > > > > > different
> > > > > > > > > > types of tasks, for serialization which is CPU bound you
> > > > definitely
> > > > > > > > would
> > > > > > > > > > want to use a bounded thread pool that is fixed by the
> > number
> > > > of
> > > > > > CPU's
> > > > > > > > > (or
> > > > > > > > > > something along those lines).
> > > > > > > > > >
> > > > https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > > > > is
> > > > > > > > a
> > > > > > > > > > very good guide on this topic
> > > > > > > > > > 4. Regarding providing the ability for users to supply
> > their
> > > > own
> > > > > > custom
> > > > > > > > > > ThreadPool this is more of an ergonomics question for the
> > API.
> > > > > > > > Especially
> > > > > > > > > > when it gets to monitoring/tracing, giving the ability
> for
> > > > users to
> > > > > > > > > provide
> > > > > > > > > > their own custom IO/CPU ThreadPools is ideal however as
> > stated
> > > > > > doing so
> > > > > > > > > > means a lot of boilerplatery changes to the API.
> Typically
> > > > > > speaking a
> > > > > > > > lot
> > > > > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > > > > ExecutionContext/ThreadPools
> > > > > > > > > > (at least on a more rudimentary level) and hence allowing
> > > > users to
> > > > > > > > > supply a
> > > > > > > > > > global singleton ThreadPool for IO tasks and another for
> > CPU
> > > > tasks
> > > > > > > > makes
> > > > > > > > > > their lives a lot easier. However due to the large amount
> > of
> > > > > > changes to
> > > > > > > > > the
> > > > > > > > > > API, it may be more appropriate to just use internal
> thread
> > > > pools
> > > > > > (for
> > > > > > > > > now)
> > > > > > > > > > since at least it's not any worse than what exists
> > currently
> > > > and
> > > > > > this
> > > > > > > > can
> > > > > > > > > > be an improvement that is done later?
> > > > > > > > > >
> > > > > > > > > > On Wed, May 19, 2021 at 2:56 AM Ryanne Dolan <
> > > > > > ryannedolan@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > I was thinking the sender would typically wrap send()
> in
> > a
> > > > > > > > > backoff/retry
> > > > > > > > > > > loop, or else ignore any failures and drop sends on the
> > floor
> > > > > > > > > > > (fire-and-forget), and in both cases I think failing
> > > > immediately
> > > > > > is
> > > > > > > > > > better
> > > > > > > > > > > than blocking for a new spot in the queue or
> > asynchronously
> > > > > > failing
> > > > > > > > > > > somehow.
> > > > > > > > > > >
> > > > > > > > > > > I think a failed future is adequate for the "explicit
> > > > > > backpressure
> > > > > > > > > > signal"
> > > > > > > > > > > while avoiding any blocking anywhere. I think if we try
> > to
> > > > > > > > > asynchronously
> > > > > > > > > > > signal the caller of failure (either by asynchronously
> > > > failing
> > > > > > the
> > > > > > > > > future
> > > > > > > > > > > or invoking a callback off-thread or something) we'd
> > force
> > > > the
> > > > > > caller
> > > > > > > > > to
> > > > > > > > > > > either block or poll waiting for that signal, which
> > somewhat
> > > > > > defeats
> > > > > > > > > the
> > > > > > > > > > > purpose we're after. And of course blocking for a spot
> > in the
> > > > > > queue
> > > > > > > > > > > definitely defeats the purpose (tho perhaps ameliorates
> > the
> > > > > > problem
> > > > > > > > > > some).
> > > > > > > > > > >
> > > > > > > > > > > Throwing an exception to the caller directly (not via
> the
> > > > > > future) is
> > > > > > > > > > > another option with precedent in Kafka clients, tho it
> > > > doesn't
> > > > > > seem
> > > > > > > > as
> > > > > > > > > > > ergonomic to me.
> > > > > > > > > > >
> > > > > > > > > > > It would be interesting to analyze some existing usage
> > and
> > > > > > determine
> > > > > > > > > how
> > > > > > > > > > > difficult it would be to convert it to the various
> > proposed
> > > > APIs.
> > > > > > > > > > >
> > > > > > > > > > > Ryanne
> > > > > > > > > > >
> > > > > > > > > > > On Tue, May 18, 2021, 3:27 PM Nakamura <
> nnythm@gmail.com
> > >
> > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Ryanne,
> > > > > > > > > > > >
> > > > > > > > > > > > Hmm, that's an interesting idea.  Basically it would
> > mean
> > > > that
> > > > > > > > after
> > > > > > > > > > > > calling send, you would also have to check whether
> the
> > > > returned
> > > > > > > > > future
> > > > > > > > > > > had
> > > > > > > > > > > > failed with a specific exception.  I would be open to
> > it,
> > > > > > although
> > > > > > > > I
> > > > > > > > > > > think
> > > > > > > > > > > > it might be slightly more surprising, since right now
> > the
> > > > > > paradigm
> > > > > > > > is
> > > > > > > > > > > > "enqueue synchronously, the future represents whether
> > we
> > > > > > succeeded
> > > > > > > > in
> > > > > > > > > > > > sending or not" and the new one would be "enqueue
> > > > > > synchronously,
> > > > > > > > the
> > > > > > > > > > > future
> > > > > > > > > > > > either represents whether we succeeded in enqueueing
> or
> > > > not (in
> > > > > > > > which
> > > > > > > > > > > case
> > > > > > > > > > > > it will be failed immediately if it failed to
> enqueue)
> > or
> > > > > > whether
> > > > > > > > we
> > > > > > > > > > > > succeeded in sending or not".
> > > > > > > > > > > >
> > > > > > > > > > > > But you're right, it should be on the table, thank
> you
> > for
> > > > > > > > suggesting
> > > > > > > > > > it!
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Moses
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, May 18, 2021 at 12:23 PM Ryanne Dolan <
> > > > > > > > ryannedolan@gmail.com
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Moses, in the case of a full queue, could we just
> > return
> > > > a
> > > > > > failed
> > > > > > > > > > > future
> > > > > > > > > > > > > immediately?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, May 18, 2021, 10:39 AM Nakamura <
> > > > nnythm@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Alexandre,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for bringing this up, I think I could use
> > some
> > > > > > feedback
> > > > > > > > in
> > > > > > > > > > > this
> > > > > > > > > > > > > > area.  There are two mechanisms here, one for
> > slowing
> > > > down
> > > > > > when
> > > > > > > > > we
> > > > > > > > > > > > don't
> > > > > > > > > > > > > > have the relevant metadata, and the other for
> > slowing
> > > > down
> > > > > > > > when a
> > > > > > > > > > > queue
> > > > > > > > > > > > > has
> > > > > > > > > > > > > > filled up.  Although the first one applies
> > backpressure
> > > > > > > > somewhat
> > > > > > > > > > > > > > inadvertently, we could still get in trouble if
> > we're
> > > > not
> > > > > > > > > providing
> > > > > > > > > > > > > > information to the mechanism that monitors
> whether
> > > > we're
> > > > > > > > queueing
> > > > > > > > > > too
> > > > > > > > > > > > > > much.  As for the second one, that is a classic
> > > > > > backpressure
> > > > > > > > use
> > > > > > > > > > > case,
> > > > > > > > > > > > so
> > > > > > > > > > > > > > it's definitely important that we don't drop that
> > > > ability.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Right now backpressure is applied by blocking,
> > which
> > > > is a
> > > > > > > > natural
> > > > > > > > > > way
> > > > > > > > > > > > to
> > > > > > > > > > > > > > apply backpressure in synchronous systems, but
> can
> > > > lead to
> > > > > > > > > > > unnecessary
> > > > > > > > > > > > > > slowdowns in asynchronous systems.  In my
> opinion,
> > the
> > > > > > safest
> > > > > > > > way
> > > > > > > > > > to
> > > > > > > > > > > > > apply
> > > > > > > > > > > > > > backpressure in an asynchronous model is to have
> an
> > > > > > explicit
> > > > > > > > > > > > backpressure
> > > > > > > > > > > > > > signal.  A good example would be returning an
> > > > exception,
> > > > > > and
> > > > > > > > > > > providing
> > > > > > > > > > > > an
> > > > > > > > > > > > > > optional hook to add a callback onto so that you
> > can be
> > > > > > > > notified
> > > > > > > > > > when
> > > > > > > > > > > > > it's
> > > > > > > > > > > > > > ready to accept more messages.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > However, this would be a really big change to how
> > > > users use
> > > > > > > > > > > > > > KafkaProducer#send, so I don't know how much
> > appetite
> > > > we
> > > > > > have
> > > > > > > > for
> > > > > > > > > > > > making
> > > > > > > > > > > > > > that kind of change.  Maybe it would be simpler
> to
> > > > remove
> > > > > > the
> > > > > > > > > > "don't
> > > > > > > > > > > > > block
> > > > > > > > > > > > > > when the per-topic queue is full" from the scope
> of
> > > > this
> > > > > > KIP,
> > > > > > > > and
> > > > > > > > > > > only
> > > > > > > > > > > > > > focus on when metadata is available?  The
> downside
> > is
> > > > that
> > > > > > we
> > > > > > > > > will
> > > > > > > > > > > > > probably
> > > > > > > > > > > > > > want to change the API again later to fix this,
> so
> > it
> > > > > > might be
> > > > > > > > > > better
> > > > > > > > > > > > to
> > > > > > > > > > > > > > just rip the bandaid off now.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > One slightly nasty thing here is that because
> > queueing
> > > > > > order is
> > > > > > > > > > > > > important,
> > > > > > > > > > > > > > if we want to use exceptions, we will want to be
> > able
> > > > to
> > > > > > signal
> > > > > > > > > the
> > > > > > > > > > > > > failure
> > > > > > > > > > > > > > to enqueue to the caller in such a way that they
> > can
> > > > still
> > > > > > > > > enforce
> > > > > > > > > > > > > message
> > > > > > > > > > > > > > order if they want.  So we can't embed the
> failure
> > > > > > directly in
> > > > > > > > > the
> > > > > > > > > > > > > returned
> > > > > > > > > > > > > > future, we should either return two futures
> > (nested,
> > > > or as
> > > > > > a
> > > > > > > > > tuple)
> > > > > > > > > > > or
> > > > > > > > > > > > > else
> > > > > > > > > > > > > > throw an exception to explain a backpressure.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So there are a few things we should work out
> here:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1. Should we keep the "too many bytes enqueued"
> > part of
> > > > > > this in
> > > > > > > > > > > scope?
> > > > > > > > > > > > > (I
> > > > > > > > > > > > > > would say yes, so that we can minimize churn in
> > this
> > > > API)
> > > > > > > > > > > > > > 2. How should we signal backpressure so that it's
> > > > > > appropriate
> > > > > > > > for
> > > > > > > > > > > > > > asynchronous systems?  (I would say that we
> should
> > > > throw an
> > > > > > > > > > > exception.
> > > > > > > > > > > > > If
> > > > > > > > > > > > > > we choose this and we want to pursue the queueing
> > > > path, we
> > > > > > > > would
> > > > > > > > > > > *not*
> > > > > > > > > > > > > want
> > > > > > > > > > > > > > to enqueue messages that would push us over the
> > limit,
> > > > and
> > > > > > > > would
> > > > > > > > > > only
> > > > > > > > > > > > > want
> > > > > > > > > > > > > > to enqueue messages when we're waiting for
> > metadata,
> > > > and we
> > > > > > > > would
> > > > > > > > > > > want
> > > > > > > > > > > > to
> > > > > > > > > > > > > > keep track of the total number of bytes for those
> > > > > > messages).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Sun, May 16, 2021 at 6:16 AM Alexandre
> Dupriez <
> > > > > > > > > > > > > > alexandre.dupriez@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hello Nakamura,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for proposing this change. I can see how
> > the
> > > > > > blocking
> > > > > > > > > > > > behaviour
> > > > > > > > > > > > > > > can be a problem when integrating with reactive
> > > > > > frameworks
> > > > > > > > such
> > > > > > > > > > as
> > > > > > > > > > > > > > > Akka. One of the questions I would have is how
> > you
> > > > would
> > > > > > > > handle
> > > > > > > > > > > back
> > > > > > > > > > > > > > > pressure and avoid memory exhaustion when the
> > > > producer's
> > > > > > > > buffer
> > > > > > > > > > is
> > > > > > > > > > > > > > > full and tasks would start to accumulate in the
> > > > > > out-of-band
> > > > > > > > > queue
> > > > > > > > > > > or
> > > > > > > > > > > > > > > thread pool introduced with this KIP.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > Alexandre
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Le ven. 14 mai 2021 à 15:55, Ryanne Dolan <
> > > > > > > > > ryannedolan@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > a
> > > > > > > > > > > > > > écrit
> > > > > > > > > > > > > > > :
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Makes sense!
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, May 14, 2021, 9:39 AM Nakamura <
> > > > > > nnythm@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I see what you're saying about serde
> > blocking,
> > > > but I
> > > > > > > > think
> > > > > > > > > we
> > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > consider it out of scope for this patch.
> > Right
> > > > now
> > > > > > we've
> > > > > > > > > > > nailed
> > > > > > > > > > > > > > down a
> > > > > > > > > > > > > > > > > couple of use cases where we can
> > unambiguously
> > > > say,
> > > > > > "I
> > > > > > > > can
> > > > > > > > > > make
> > > > > > > > > > > > > > > progress
> > > > > > > > > > > > > > > > > now" or "I cannot make progress now", which
> > > > makes it
> > > > > > > > > possible
> > > > > > > > > > > to
> > > > > > > > > > > > > > > offload to
> > > > > > > > > > > > > > > > > a different thread only if we are unable to
> > make
> > > > > > > > progress.
> > > > > > > > > > > > > Extending
> > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > to CPU work like serde would mean always
> > > > offloading,
> > > > > > > > which
> > > > > > > > > > > would
> > > > > > > > > > > > > be a
> > > > > > > > > > > > > > > > > really big performance change.  It might be
> > worth
> > > > > > > > exploring
> > > > > > > > > > > > anyway,
> > > > > > > > > > > > > > > but I'd
> > > > > > > > > > > > > > > > > rather keep this patch focused on improving
> > > > > > ergonomics,
> > > > > > > > > > rather
> > > > > > > > > > > > than
> > > > > > > > > > > > > > > > > muddying the waters with evaluating
> > performance
> > > > very
> > > > > > > > > deeply.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I think if we really do want to support
> > serde or
> > > > > > > > > interceptors
> > > > > > > > > > > > that
> > > > > > > > > > > > > do
> > > > > > > > > > > > > > > IO on
> > > > > > > > > > > > > > > > > the send path (which seems like an
> > anti-pattern
> > > > to
> > > > > > me),
> > > > > > > > we
> > > > > > > > > > > should
> > > > > > > > > > > > > > > consider
> > > > > > > > > > > > > > > > > making that a separate SIP, and probably
> also
> > > > > > consider
> > > > > > > > > > changing
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > API to
> > > > > > > > > > > > > > > > > use Futures (or CompletionStages).  But I
> > would
> > > > > > rather
> > > > > > > > > avoid
> > > > > > > > > > > > scope
> > > > > > > > > > > > > > > creep,
> > > > > > > > > > > > > > > > > so that we have a better chance of fixing
> > this
> > > > part
> > > > > > of
> > > > > > > > the
> > > > > > > > > > > > problem.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Yes, I think some exceptions will move to
> > being
> > > > async
> > > > > > > > > instead
> > > > > > > > > > > of
> > > > > > > > > > > > > > sync.
> > > > > > > > > > > > > > > > > They'll still be surfaced in the Future, so
> > I'm
> > > > not
> > > > > > so
> > > > > > > > > > > confident
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > would be that big a shock to users though.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Thu, May 13, 2021 at 7:44 PM Ryanne
> Dolan
> > <
> > > > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > re serialization, my concern is that
> > > > serialization
> > > > > > > > often
> > > > > > > > > > > > accounts
> > > > > > > > > > > > > > > for a
> > > > > > > > > > > > > > > > > lot
> > > > > > > > > > > > > > > > > > of the cycles spent before returning the
> > > > future.
> > > > > > It's
> > > > > > > > not
> > > > > > > > > > > > > blocking
> > > > > > > > > > > > > > > per
> > > > > > > > > > > > > > > > > se,
> > > > > > > > > > > > > > > > > > but it's the same effect from the
> caller's
> > > > > > perspective.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Moreover, serde impls often block
> > themselves,
> > > > e.g.
> > > > > > when
> > > > > > > > > > > > fetching
> > > > > > > > > > > > > > > schemas
> > > > > > > > > > > > > > > > > > from a registry. I suppose it's also
> > possible
> > > > to
> > > > > > block
> > > > > > > > in
> > > > > > > > > > > > > > > Interceptors
> > > > > > > > > > > > > > > > > > (e.g. writing audit events or metrics),
> > which
> > > > > > happens
> > > > > > > > > > before
> > > > > > > > > > > > > serdes
> > > > > > > > > > > > > > > iiuc.
> > > > > > > > > > > > > > > > > > So any blocking in either of those
> plugins
> > > > would
> > > > > > block
> > > > > > > > > the
> > > > > > > > > > > send
> > > > > > > > > > > > > > > unless we
> > > > > > > > > > > > > > > > > > queue first.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > So I think we want to queue first and do
> > > > everything
> > > > > > > > > > > off-thread
> > > > > > > > > > > > > when
> > > > > > > > > > > > > > > using
> > > > > > > > > > > > > > > > > > the new API, whatever that looks like. I
> > just
> > > > want
> > > > > > to
> > > > > > > > > make
> > > > > > > > > > > sure
> > > > > > > > > > > > > we
> > > > > > > > > > > > > > > don't
> > > > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > > that for clients that wouldn't expect it.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Another consideration is exception
> > handling.
> > > > If we
> > > > > > > > queue
> > > > > > > > > > > right
> > > > > > > > > > > > > > away,
> > > > > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > > > > defer some exceptions that currently are
> > > > thrown to
> > > > > > the
> > > > > > > > > > caller
> > > > > > > > > > > > > > > (before the
> > > > > > > > > > > > > > > > > > future is returned). In the new API, the
> > send()
> > > > > > > > wouldn't
> > > > > > > > > > > throw
> > > > > > > > > > > > > any
> > > > > > > > > > > > > > > > > > exceptions, and instead the future would
> > fail.
> > > > I
> > > > > > think
> > > > > > > > > that
> > > > > > > > > > > > might
> > > > > > > > > > > > > > > mean
> > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > a new method signature is required.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Thu, May 13, 2021, 2:57 PM Nakamura <
> > > > > > > > > > > > nakamura.moses@gmail.com
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I agree we should add an additional
> > > > constructor
> > > > > > (or
> > > > > > > > > else
> > > > > > > > > > an
> > > > > > > > > > > > > > > additional
> > > > > > > > > > > > > > > > > > > overload in KafkaProducer#send, but the
> > new
> > > > > > > > constructor
> > > > > > > > > > > would
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > easier
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > understand) if we're targeting the
> "user
> > > > > > provides the
> > > > > > > > > > > thread"
> > > > > > > > > > > > > > > approach.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > From looking at the code, I think we
> can
> > keep
> > > > > > record
> > > > > > > > > > > > > > serialization
> > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > user thread, if we consider that an
> > important
> > > > > > part of
> > > > > > > > > the
> > > > > > > > > > > > > > > semantics of
> > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > method.  It doesn't seem like
> > serialization
> > > > > > depends
> > > > > > > > on
> > > > > > > > > > > > knowing
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > cluster,
> > > > > > > > > > > > > > > > > > > I think it's incidental that it comes
> > after
> > > > the
> > > > > > first
> > > > > > > > > > > > > "blocking"
> > > > > > > > > > > > > > > > > segment
> > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > the method.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Thu, May 13, 2021 at 2:38 PM Ryanne
> > Dolan
> > > > <
> > > > > > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Hey Moses, I like the direction here.
> > My
> > > > > > thinking
> > > > > > > > is
> > > > > > > > > > > that a
> > > > > > > > > > > > > > > single
> > > > > > > > > > > > > > > > > > > > additional work queue, s.t. send()
> can
> > > > enqueue
> > > > > > and
> > > > > > > > > > > return,
> > > > > > > > > > > > > > seems
> > > > > > > > > > > > > > > like
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > lightest touch. However, I don't
> think
> > we
> > > > can
> > > > > > > > > trivially
> > > > > > > > > > > > > process
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > queue
> > > > > > > > > > > > > > > > > > > > in an internal thread pool without
> > subtly
> > > > > > changing
> > > > > > > > > > > behavior
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > > > > > > For example, users will often run
> > send() in
> > > > > > > > multiple
> > > > > > > > > > > > threads
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > order
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > serialize faster, but that wouldn't
> > work
> > > > quite
> > > > > > the
> > > > > > > > > same
> > > > > > > > > > > if
> > > > > > > > > > > > > > there
> > > > > > > > > > > > > > > were
> > > > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > > internal thread pool.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > For this reason I'm thinking we need
> to
> > > > make
> > > > > > sure
> > > > > > > > any
> > > > > > > > > > > such
> > > > > > > > > > > > > > > changes
> > > > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > > > opt-in. Maybe a new constructor with
> an
> > > > > > additional
> > > > > > > > > > > > > > ThreadFactory
> > > > > > > > > > > > > > > > > > > parameter.
> > > > > > > > > > > > > > > > > > > > That would at least clearly indicate
> > that
> > > > work
> > > > > > will
> > > > > > > > > > > happen
> > > > > > > > > > > > > > > > > off-thread,
> > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > would require opt-in for the new
> > behavior.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Under the hood, this ThreadFactory
> > could be
> > > > > > used to
> > > > > > > > > > > create
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > worker
> > > > > > > > > > > > > > > > > > > > thread that process queued sends,
> which
> > > > could
> > > > > > > > fan-out
> > > > > > > > > > to
> > > > > > > > > > > > > > > > > per-partition
> > > > > > > > > > > > > > > > > > > > threads from there.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > So then you'd have two ways to send:
> > the
> > > > > > existing
> > > > > > > > > way,
> > > > > > > > > > > > where
> > > > > > > > > > > > > > > serde
> > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > interceptors and whatnot are executed
> > on
> > > > the
> > > > > > > > calling
> > > > > > > > > > > > thread,
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > > > > way, which returns right away and
> uses
> > an
> > > > > > internal
> > > > > > > > > > > > Executor.
> > > > > > > > > > > > > As
> > > > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > > > point
> > > > > > > > > > > > > > > > > > > > out, the semantics would be identical
> > in
> > > > either
> > > > > > > > case,
> > > > > > > > > > and
> > > > > > > > > > > > it
> > > > > > > > > > > > > > > would be
> > > > > > > > > > > > > > > > > > > very
> > > > > > > > > > > > > > > > > > > > easy for clients to switch.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Thu, May 13, 2021, 9:00 AM
> Nakamura
> > <
> > > > > > > > > > nnythm@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Hey Folks,
> > > > > > > > > > > > > > > > > > > > > I just posted a new proposal
> > > > > > > > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181306446
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > in the wiki.  I think we have an
> > > > opportunity
> > > > > > to
> > > > > > > > > > improve
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > KafkaProducer#send user experience.
> > It
> > > > would
> > > > > > > > > > certainly
> > > > > > > > > > > > > make
> > > > > > > > > > > > > > > our
> > > > > > > > > > > > > > > > > > lives
> > > > > > > > > > > > > > > > > > > > > easier.  Please take a look!  There
> > are
> > > > two
> > > > > > > > > > subproblems
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > > > guidance on, so I would appreciate
> > > > feedback
> > > > > > on
> > > > > > > > both
> > > > > > > > > > of
> > > > > > > > > > > > > them.
> > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > >
> > > > > > > > > > Matthew de Detrich
> > > > > > > > > >
> > > > > > > > > > *Aiven Deutschland GmbH*
> > > > > > > > > >
> > > > > > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > > > > > >
> > > > > > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > > > > > >
> > > > > > > > > > *m:* +491603708037
> > > > > > > > > >
> > > > > > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Matthew de Detrich
> > > > > > > >
> > > > > > > > *Aiven Deutschland GmbH*
> > > > > > > >
> > > > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > > > >
> > > > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > > > >
> > > > > > > > *m:* +491603708037
> > > > > > > >
> > > > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-739: Block Less on KafkaProducer#send

Posted by Colin McCabe <co...@cmccabe.xyz>.
On Tue, Jun 1, 2021, at 12:12, Ryanne Dolan wrote:
> Colin, the issue for me isn't so much whether non-blocking I/O is used or
> not, but the fact that the caller observes a long time between calling
> send() and receiving the returned future. This behavior can be considered
> "blocking" whether or not I/O is involved.
> 

Yes, I agree. That's why I said:

> ... if you want to get really technical, we do this [non-blocking I/O] for 
> the metadata fetch too, it's just that we have a hack that loops to
> transform it back into blocking I/O.)

Since we loop until the I/O is complete, you can consider this behavior "blocking on I/O".

> > How are the ordering semantics of `KafkaProducer#send` related to the
> > metadata fetch?
> > I already proposed a solution (returning an error)
> 
> There is a subtle difference between failing immediately vs blocking for
> metadata, related to ordering in the face of retries. Say we set the send
> timeout to max-long (or something high enough that we rarely encounter
> timeouts in practice), and set max inflight requests to 1. Today, we can
> reasonably assume that calling send() in sequence to a specific partition
> will result in the corresponding sequence landing on that partition,
> regardless of how the caller handles retries. The caller might not handle
> retries at all. But if we can fail immediately (e.g. when the metadata
> isn't yet ready), then the caller must handle retries carefully.
> Specifically, the caller must retry each send() before proceeding to the
> next. This basically means that the caller must block on each send() in
> order to maintain the proper sequence -- how else would the caller know
> whether it will need to retry or not?

To be clear, I was not proposing failing immediately when metadata was not available. I proposed failing immediately when there was no available buffer space.

> In other words, failing immediately punts the problem to the caller to
> handle, while the caller is less-equipped to deal with it. I don't think we
> should do that, at least not in the default case.
> 
> I actually don't have any objections to this approach so long as it's
> opt-in. It sounds like you are suggesting to fix the bug for everyone, but
> I don't think we can do that without subtly breaking things.

There are two different issues we're discussing

1. the metadata fetch is done synchronously, not asynchronously.

2. the producer blocks when there is no buffer space available.

The solution to #1 is just to fix the code to be async as expected.

For #2 we could have an opt-in way of returning an error, if people feel strongly. But honestly #2 is much less of an issue in practice.

best,
Colin

> 
> Ryanne
> 
> On Tue, Jun 1, 2021 at 12:31 PM Colin McCabe <cm...@apache.org> wrote:
> 
> > On Tue, Jun 1, 2021, at 07:00, Nakamura wrote:
> > > Hi Colin,
> > >
> > > Sorry, I still don't follow.
> > >
> > > Right now `KafkaProducer#send` seems to trigger a metadata fetch.  Today,
> > > we block on that before returning.  Is your proposal that we move the
> > > metadata fetch out of `KafkaProducer#send` entirely?
> > >
> >
> > KafkaProducer#send is supposed to initiate non-blocking I/O, but not wait
> > for it to complete.
> >
> > There's more information about non-blocking I/O in Java here:
> > https://en.wikipedia.org/wiki/Non-blocking_I/O_%28Java%29
> >
> > >
> > > Even if the metadata fetch moves to be non-blocking, I think we still
> > need
> > > to deal with the problems we've discussed before if the fetch happens in
> > > the `KafkaProducer#send` method.  How do we maintain the ordering
> > semantics
> > > of `KafkaProducer#send`?
> >
> > How are the ordering semantics of `KafkaProducer#send` related to the
> > metadata fetch?
> >
> > >  How do we prevent our buffer from filling up?
> >
> > That is not related to the metadata fetch. Also, I already proposed a
> > solution (returning an error) if this is a concern.
> >
> > > Which thread is responsible for checking poll()?
> >
> > The same client thread that always has been responsible for checking poll.
> >
> > >
> > > The only approach I can see that would avoid this would be moving the
> > > metadata fetch to happen at a different time.  But it's not clear to me
> > > when would be a more appropriate time to do the metadata fetch than
> > > `KafkaProducer#send`.
> > >
> >
> > It's not about moving the metadata fetch to happen at a different time.
> > It's about using non-blocking I/O, like we do for other network I/O. (And
> > actually, if you want to get really technical, we do this for the metadata
> > fetch too, it's just that we have a hack that loops to transform it back
> > into blocking I/O.)
> >
> > best,
> > Colin
> >
> > > I think there's something I'm missing here.  Would you mind helping me
> > > figure out what it is?
> > >
> > > Best,
> > > Moses
> > >
> > > On Sun, May 30, 2021 at 5:35 PM Colin McCabe <cm...@apache.org> wrote:
> > >
> > > > On Tue, May 25, 2021, at 11:26, Nakamura wrote:
> > > > > Hey Colin,
> > > > >
> > > > > For the metadata case, what would fixing the bug look like?  I agree
> > that
> > > > > we should fix it, but I don't have a clear picture in my mind of what
> > > > > fixing it should look like.  Can you elaborate?
> > > > >
> > > >
> > > > If the blocking metadata fetch bug were fixed, neither the producer nor
> > > > the consumer would block while fetching metadata. A poll() call would
> > > > initiate a metadata fetch if needed, and a subsequent call to poll()
> > would
> > > > handle the results if needed. Basically the same paradigm we use for
> > other
> > > > network communication in the producer and consumer.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > > > Best,
> > > > > Moses
> > > > >
> > > > > On Mon, May 24, 2021 at 1:54 PM Colin McCabe <cm...@apache.org>
> > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I agree that we should give users the option of having a fully
> > async
> > > > API,
> > > > > > but I don't think external thread pools or queues are the right
> > > > direction
> > > > > > to go here. They add performance overheads and don't address the
> > root
> > > > > > causes of the problem.
> > > > > >
> > > > > > There are basically two scenarios where we block, currently. One is
> > > > when
> > > > > > we are doing a metadata fetch. I think this is clearly a bug, or at
> > > > least
> > > > > > an implementation limitation. From the user's point of view, the
> > fact
> > > > that
> > > > > > we are doing a metadata fetch is an implementation detail that
> > really
> > > > > > shouldn't be exposed like this. We have talked about fixing this
> > in the
> > > > > > past. I think we just should spend the time to do it.
> > > > > >
> > > > > > The second scenario is where the client has produced too much data
> > in
> > > > too
> > > > > > little time. This could happen if there is a network glitch, or the
> > > > server
> > > > > > is slower than expected. In this case, the behavior is intentional
> > and
> > > > not
> > > > > > a bug. To understand this, think about what would happen if we
> > didn't
> > > > > > block. We would start buffering more and more data in memory, until
> > > > finally
> > > > > > the application died with an out of memory error. That would be
> > > > frustrating
> > > > > > for users and wouldn't add to the usability of Kafka.
> > > > > >
> > > > > > We could potentially have an option to handle the out-of-memory
> > > > scenario
> > > > > > differently by returning an error code immediately rather than
> > > > blocking.
> > > > > > Applications would have to be rewritten to handle this properly,
> > but
> > > > it is
> > > > > > a possibility. I suspect that most of them wouldn't use this, but
> > we
> > > > could
> > > > > > offer it as a possibility for async purists (which might include
> > > > certain
> > > > > > frameworks). The big problem the users would have to solve is what
> > to
> > > > do
> > > > > > with the record that they were unable to produce due to the buffer
> > full
> > > > > > issue.
> > > > > >
> > > > > > best,
> > > > > > Colin
> > > > > >
> > > > > >
> > > > > > On Thu, May 20, 2021, at 10:35, Nakamura wrote:
> > > > > > > >
> > > > > > > > My suggestion was just do this in multiple steps/phases,
> > firstly
> > > > let's
> > > > > > fix
> > > > > > > > the issue of send being misleadingly asynchronous (i.e.
> > internally
> > > > its
> > > > > > > > blocking) and then later one we can make the various
> > > > > > > > threadpools configurable with a sane default.
> > > > > > >
> > > > > > > I like that approach. I updated the "Which thread should be
> > > > responsible
> > > > > > for
> > > > > > > waiting" part of KIP-739 to add your suggestion as my recommended
> > > > > > approach,
> > > > > > > thank you!  If no one else has major concerns about that
> > approach,
> > > > I'll
> > > > > > > move the alternatives to "rejected alternatives".
> > > > > > >
> > > > > > > On Thu, May 20, 2021 at 7:26 AM Matthew de Detrich
> > > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > > >
> > > > > > > > @
> > > > > > > >
> > > > > > > > Nakamura
> > > > > > > > On Wed, May 19, 2021 at 7:35 PM Nakamura <nn...@gmail.com>
> > wrote:
> > > > > > > >
> > > > > > > > > @Ryanne:
> > > > > > > > > In my mind's eye I slightly prefer the throwing the "cannot
> > > > enqueue"
> > > > > > > > > exception to satisfying the future immediately with the
> > "cannot
> > > > > > enqueue"
> > > > > > > > > exception?  But I agree, it would be worth doing more
> > research.
> > > > > > > > >
> > > > > > > > > @Matthew:
> > > > > > > > >
> > > > > > > > > > 3. Using multiple thread pools is definitely recommended
> > for
> > > > > > different
> > > > > > > > > > types of tasks, for serialization which is CPU bound you
> > > > definitely
> > > > > > > > would
> > > > > > > > > > want to use a bounded thread pool that is fixed by the
> > number
> > > > of
> > > > > > CPU's
> > > > > > > > > (or
> > > > > > > > > > something along those lines).
> > > > > > > > > >
> > > > https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > > > > is
> > > > > > > > a
> > > > > > > > > > very good guide on this topic
> > > > > > > > > I think this guide is good in general, but I would be
> > hesitant to
> > > > > > follow
> > > > > > > > > its guidance re: offloading serialization without
> > benchmarking
> > > > it.
> > > > > > My
> > > > > > > > > understanding is that context-switches have gotten much
> > cheaper,
> > > > and
> > > > > > that
> > > > > > > > > gains from cache locality are small, but they're not nothing.
> > > > > > Especially
> > > > > > > > > if the workload has a very small serialization cost, I
> > wouldn't
> > > > be
> > > > > > > > shocked
> > > > > > > > > if it made it slower.  I feel pretty strongly that we should
> > do
> > > > more
> > > > > > > > > research here before unconditionally encouraging
> > serialization
> > > > in a
> > > > > > > > > threadpool.  If people think it's important to do it here
> > (eg if
> > > > we
> > > > > > think
> > > > > > > > > it would mean another big API change) then we should start
> > > > thinking
> > > > > > about
> > > > > > > > > what benchmarking we can do to gain higher confidence in this
> > > > kind of
> > > > > > > > > change.  However, I don't think it would change semantics as
> > > > > > > > substantially
> > > > > > > > > as we're proposing here, so I would vote for pushing this to
> > a
> > > > > > subsequent
> > > > > > > > > KIP.
> > > > > > > > >
> > > > > > > > Of course, its all down to benchmarking, benchmarking and
> > > > benchmarking.
> > > > > > > > Ideally speaking you want to use all of the resources
> > available to
> > > > > > you, so
> > > > > > > > if you have a bottleneck in serialization and you have many
> > cores
> > > > free
> > > > > > then
> > > > > > > > using multiple cores may be more appropriate than a single
> > core.
> > > > > > Typically
> > > > > > > > I would expect that using a single thread to do serialization
> > is
> > > > > > likely to
> > > > > > > > be the most situation, I was just responding to an earlier
> > point
> > > > that
> > > > > > was
> > > > > > > > made in regards to using ThreadPools for serialization (note
> > that
> > > > you
> > > > > > can
> > > > > > > > also just use a ThreadPool that is pinned to a single thread)
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > 4. Regarding providing the ability for users to supply
> > their
> > > > own
> > > > > > custom
> > > > > > > > > > ThreadPool this is more of an ergonomics question for the
> > API.
> > > > > > > > Especially
> > > > > > > > > > when it gets to monitoring/tracing, giving the ability for
> > > > users to
> > > > > > > > > provide
> > > > > > > > > > their own custom IO/CPU ThreadPools is ideal however as
> > stated
> > > > > > doing so
> > > > > > > > > > means a lot of boilerplatery changes to the API. Typically
> > > > > > speaking a
> > > > > > > > lot
> > > > > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > > > > ExecutionContext/ThreadPools
> > > > > > > > > > (at least on a more rudimentary level) and hence allowing
> > > > users to
> > > > > > > > supply
> > > > > > > > > a
> > > > > > > > > > global singleton ThreadPool for IO tasks and another for
> > CPU
> > > > tasks
> > > > > > > > makes
> > > > > > > > > > their lives a lot easier. However due to the large amount
> > of
> > > > > > changes to
> > > > > > > > > the
> > > > > > > > > > API, it may be more appropriate to just use internal thread
> > > > pools
> > > > > > (for
> > > > > > > > > now)
> > > > > > > > > > since at least it's not any worse than what exists
> > currently
> > > > and
> > > > > > this
> > > > > > > > can
> > > > > > > > > > be an improvement that is done later?
> > > > > > > > > Is there an existing threadpool that you suggest we reuse?
> > Or
> > > > are
> > > > > > you
> > > > > > > > > imagining that we make our own internal threadpool, and then
> > > > maybe
> > > > > > expose
> > > > > > > > > configuration flags to manipulate it?  For what it's worth, I
> > > > like
> > > > > > having
> > > > > > > > > an internal threadpool (perhaps just FJP.commonpool) and then
> > > > > > providing
> > > > > > > > an
> > > > > > > > > alternative to pass your own threadpool.  That way people who
> > > > want
> > > > > > finer
> > > > > > > > > control can get it, and everyone else can do OK with the
> > default.
> > > > > > > > >
> > > > > > > > Indeed that is what I am saying. The most ideal situation is
> > that
> > > > > > there is
> > > > > > > > a default internal threadpool that Kafka uses, however users of
> > > > Kafka
> > > > > > can
> > > > > > > > configure there own threadpool. Having a singleton ThreadPool
> > for
> > > > > > blocking
> > > > > > > > IO, non blocking IO and CPU bound tasks which can be plugged in
> > > > all of
> > > > > > your
> > > > > > > > libraries (including Kafka) makes resource management much
> > easier
> > > > to
> > > > > > do and
> > > > > > > > also gives control of users to override specific threadpools
> > for
> > > > > > > > exceptional cases (i.e. providing a Threadpool that is pinned
> > to a
> > > > > > single
> > > > > > > > core which tends to give the best latency results if this is
> > > > something
> > > > > > that
> > > > > > > > is critical for you).
> > > > > > > >
> > > > > > > > My suggestion was just do this in multiple steps/phases,
> > firstly
> > > > let's
> > > > > > fix
> > > > > > > > the issue of send being misleadingly asynchronous (i.e.
> > internally
> > > > its
> > > > > > > > blocking) and then later one we can make the various
> > > > > > > > threadpools configurable with a sane default.
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, May 19, 2021 at 6:01 AM Matthew de Detrich
> > > > > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > > > > >
> > > > > > > > > > Here are my two cents here (note that I am only seeing
> > this on
> > > > a
> > > > > > > > surface
> > > > > > > > > > level)
> > > > > > > > > >
> > > > > > > > > > 1. If we are going this road it makes sense to do this
> > > > "properly"
> > > > > > (i.e.
> > > > > > > > > > using queues  as Ryan suggested). The reason I am saying
> > this
> > > > is
> > > > > > that
> > > > > > > > it
> > > > > > > > > > seems that the original goal of the KIP is for it to be
> > used in
> > > > > > other
> > > > > > > > > > asynchronous systems and from my personal experience, you
> > > > really do
> > > > > > > > need
> > > > > > > > > to
> > > > > > > > > > make the implementation properly asynchronous otherwise
> > it's
> > > > > > really not
> > > > > > > > > > that useful.
> > > > > > > > > > 2. Due to the previous point and what was said by others,
> > this
> > > > is
> > > > > > > > likely
> > > > > > > > > > going to break some existing semantics (i.e. people are
> > > > currently
> > > > > > > > relying
> > > > > > > > > > on blocking semantics) so adding another method's/interface
> > > > plus
> > > > > > > > > > deprecating the older one is more annoying but ideal.
> > > > > > > > > > 3. Using multiple thread pools is definitely recommended
> > for
> > > > > > different
> > > > > > > > > > types of tasks, for serialization which is CPU bound you
> > > > definitely
> > > > > > > > would
> > > > > > > > > > want to use a bounded thread pool that is fixed by the
> > number
> > > > of
> > > > > > CPU's
> > > > > > > > > (or
> > > > > > > > > > something along those lines).
> > > > > > > > > >
> > > > https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > > > > is
> > > > > > > > a
> > > > > > > > > > very good guide on this topic
> > > > > > > > > > 4. Regarding providing the ability for users to supply
> > their
> > > > own
> > > > > > custom
> > > > > > > > > > ThreadPool this is more of an ergonomics question for the
> > API.
> > > > > > > > Especially
> > > > > > > > > > when it gets to monitoring/tracing, giving the ability for
> > > > users to
> > > > > > > > > provide
> > > > > > > > > > their own custom IO/CPU ThreadPools is ideal however as
> > stated
> > > > > > doing so
> > > > > > > > > > means a lot of boilerplatery changes to the API. Typically
> > > > > > speaking a
> > > > > > > > lot
> > > > > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > > > > ExecutionContext/ThreadPools
> > > > > > > > > > (at least on a more rudimentary level) and hence allowing
> > > > users to
> > > > > > > > > supply a
> > > > > > > > > > global singleton ThreadPool for IO tasks and another for
> > CPU
> > > > tasks
> > > > > > > > makes
> > > > > > > > > > their lives a lot easier. However due to the large amount
> > of
> > > > > > changes to
> > > > > > > > > the
> > > > > > > > > > API, it may be more appropriate to just use internal thread
> > > > pools
> > > > > > (for
> > > > > > > > > now)
> > > > > > > > > > since at least it's not any worse than what exists
> > currently
> > > > and
> > > > > > this
> > > > > > > > can
> > > > > > > > > > be an improvement that is done later?
> > > > > > > > > >
> > > > > > > > > > On Wed, May 19, 2021 at 2:56 AM Ryanne Dolan <
> > > > > > ryannedolan@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > I was thinking the sender would typically wrap send() in
> > a
> > > > > > > > > backoff/retry
> > > > > > > > > > > loop, or else ignore any failures and drop sends on the
> > floor
> > > > > > > > > > > (fire-and-forget), and in both cases I think failing
> > > > immediately
> > > > > > is
> > > > > > > > > > better
> > > > > > > > > > > than blocking for a new spot in the queue or
> > asynchronously
> > > > > > failing
> > > > > > > > > > > somehow.
> > > > > > > > > > >
> > > > > > > > > > > I think a failed future is adequate for the "explicit
> > > > > > backpressure
> > > > > > > > > > signal"
> > > > > > > > > > > while avoiding any blocking anywhere. I think if we try
> > to
> > > > > > > > > asynchronously
> > > > > > > > > > > signal the caller of failure (either by asynchronously
> > > > failing
> > > > > > the
> > > > > > > > > future
> > > > > > > > > > > or invoking a callback off-thread or something) we'd
> > force
> > > > the
> > > > > > caller
> > > > > > > > > to
> > > > > > > > > > > either block or poll waiting for that signal, which
> > somewhat
> > > > > > defeats
> > > > > > > > > the
> > > > > > > > > > > purpose we're after. And of course blocking for a spot
> > in the
> > > > > > queue
> > > > > > > > > > > definitely defeats the purpose (tho perhaps ameliorates
> > the
> > > > > > problem
> > > > > > > > > > some).
> > > > > > > > > > >
> > > > > > > > > > > Throwing an exception to the caller directly (not via the
> > > > > > future) is
> > > > > > > > > > > another option with precedent in Kafka clients, tho it
> > > > doesn't
> > > > > > seem
> > > > > > > > as
> > > > > > > > > > > ergonomic to me.
> > > > > > > > > > >
> > > > > > > > > > > It would be interesting to analyze some existing usage
> > and
> > > > > > determine
> > > > > > > > > how
> > > > > > > > > > > difficult it would be to convert it to the various
> > proposed
> > > > APIs.
> > > > > > > > > > >
> > > > > > > > > > > Ryanne
> > > > > > > > > > >
> > > > > > > > > > > On Tue, May 18, 2021, 3:27 PM Nakamura <nnythm@gmail.com
> > >
> > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Ryanne,
> > > > > > > > > > > >
> > > > > > > > > > > > Hmm, that's an interesting idea.  Basically it would
> > mean
> > > > that
> > > > > > > > after
> > > > > > > > > > > > calling send, you would also have to check whether the
> > > > returned
> > > > > > > > > future
> > > > > > > > > > > had
> > > > > > > > > > > > failed with a specific exception.  I would be open to
> > it,
> > > > > > although
> > > > > > > > I
> > > > > > > > > > > think
> > > > > > > > > > > > it might be slightly more surprising, since right now
> > the
> > > > > > paradigm
> > > > > > > > is
> > > > > > > > > > > > "enqueue synchronously, the future represents whether
> > we
> > > > > > succeeded
> > > > > > > > in
> > > > > > > > > > > > sending or not" and the new one would be "enqueue
> > > > > > synchronously,
> > > > > > > > the
> > > > > > > > > > > future
> > > > > > > > > > > > either represents whether we succeeded in enqueueing or
> > > > not (in
> > > > > > > > which
> > > > > > > > > > > case
> > > > > > > > > > > > it will be failed immediately if it failed to enqueue)
> > or
> > > > > > whether
> > > > > > > > we
> > > > > > > > > > > > succeeded in sending or not".
> > > > > > > > > > > >
> > > > > > > > > > > > But you're right, it should be on the table, thank you
> > for
> > > > > > > > suggesting
> > > > > > > > > > it!
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Moses
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, May 18, 2021 at 12:23 PM Ryanne Dolan <
> > > > > > > > ryannedolan@gmail.com
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Moses, in the case of a full queue, could we just
> > return
> > > > a
> > > > > > failed
> > > > > > > > > > > future
> > > > > > > > > > > > > immediately?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, May 18, 2021, 10:39 AM Nakamura <
> > > > nnythm@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Alexandre,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for bringing this up, I think I could use
> > some
> > > > > > feedback
> > > > > > > > in
> > > > > > > > > > > this
> > > > > > > > > > > > > > area.  There are two mechanisms here, one for
> > slowing
> > > > down
> > > > > > when
> > > > > > > > > we
> > > > > > > > > > > > don't
> > > > > > > > > > > > > > have the relevant metadata, and the other for
> > slowing
> > > > down
> > > > > > > > when a
> > > > > > > > > > > queue
> > > > > > > > > > > > > has
> > > > > > > > > > > > > > filled up.  Although the first one applies
> > backpressure
> > > > > > > > somewhat
> > > > > > > > > > > > > > inadvertently, we could still get in trouble if
> > we're
> > > > not
> > > > > > > > > providing
> > > > > > > > > > > > > > information to the mechanism that monitors whether
> > > > we're
> > > > > > > > queueing
> > > > > > > > > > too
> > > > > > > > > > > > > > much.  As for the second one, that is a classic
> > > > > > backpressure
> > > > > > > > use
> > > > > > > > > > > case,
> > > > > > > > > > > > so
> > > > > > > > > > > > > > it's definitely important that we don't drop that
> > > > ability.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Right now backpressure is applied by blocking,
> > which
> > > > is a
> > > > > > > > natural
> > > > > > > > > > way
> > > > > > > > > > > > to
> > > > > > > > > > > > > > apply backpressure in synchronous systems, but can
> > > > lead to
> > > > > > > > > > > unnecessary
> > > > > > > > > > > > > > slowdowns in asynchronous systems.  In my opinion,
> > the
> > > > > > safest
> > > > > > > > way
> > > > > > > > > > to
> > > > > > > > > > > > > apply
> > > > > > > > > > > > > > backpressure in an asynchronous model is to have an
> > > > > > explicit
> > > > > > > > > > > > backpressure
> > > > > > > > > > > > > > signal.  A good example would be returning an
> > > > exception,
> > > > > > and
> > > > > > > > > > > providing
> > > > > > > > > > > > an
> > > > > > > > > > > > > > optional hook to add a callback onto so that you
> > can be
> > > > > > > > notified
> > > > > > > > > > when
> > > > > > > > > > > > > it's
> > > > > > > > > > > > > > ready to accept more messages.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > However, this would be a really big change to how
> > > > users use
> > > > > > > > > > > > > > KafkaProducer#send, so I don't know how much
> > appetite
> > > > we
> > > > > > have
> > > > > > > > for
> > > > > > > > > > > > making
> > > > > > > > > > > > > > that kind of change.  Maybe it would be simpler to
> > > > remove
> > > > > > the
> > > > > > > > > > "don't
> > > > > > > > > > > > > block
> > > > > > > > > > > > > > when the per-topic queue is full" from the scope of
> > > > this
> > > > > > KIP,
> > > > > > > > and
> > > > > > > > > > > only
> > > > > > > > > > > > > > focus on when metadata is available?  The downside
> > is
> > > > that
> > > > > > we
> > > > > > > > > will
> > > > > > > > > > > > > probably
> > > > > > > > > > > > > > want to change the API again later to fix this, so
> > it
> > > > > > might be
> > > > > > > > > > better
> > > > > > > > > > > > to
> > > > > > > > > > > > > > just rip the bandaid off now.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > One slightly nasty thing here is that because
> > queueing
> > > > > > order is
> > > > > > > > > > > > > important,
> > > > > > > > > > > > > > if we want to use exceptions, we will want to be
> > able
> > > > to
> > > > > > signal
> > > > > > > > > the
> > > > > > > > > > > > > failure
> > > > > > > > > > > > > > to enqueue to the caller in such a way that they
> > can
> > > > still
> > > > > > > > > enforce
> > > > > > > > > > > > > message
> > > > > > > > > > > > > > order if they want.  So we can't embed the failure
> > > > > > directly in
> > > > > > > > > the
> > > > > > > > > > > > > returned
> > > > > > > > > > > > > > future, we should either return two futures
> > (nested,
> > > > or as
> > > > > > a
> > > > > > > > > tuple)
> > > > > > > > > > > or
> > > > > > > > > > > > > else
> > > > > > > > > > > > > > throw an exception to explain a backpressure.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So there are a few things we should work out here:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1. Should we keep the "too many bytes enqueued"
> > part of
> > > > > > this in
> > > > > > > > > > > scope?
> > > > > > > > > > > > > (I
> > > > > > > > > > > > > > would say yes, so that we can minimize churn in
> > this
> > > > API)
> > > > > > > > > > > > > > 2. How should we signal backpressure so that it's
> > > > > > appropriate
> > > > > > > > for
> > > > > > > > > > > > > > asynchronous systems?  (I would say that we should
> > > > throw an
> > > > > > > > > > > exception.
> > > > > > > > > > > > > If
> > > > > > > > > > > > > > we choose this and we want to pursue the queueing
> > > > path, we
> > > > > > > > would
> > > > > > > > > > > *not*
> > > > > > > > > > > > > want
> > > > > > > > > > > > > > to enqueue messages that would push us over the
> > limit,
> > > > and
> > > > > > > > would
> > > > > > > > > > only
> > > > > > > > > > > > > want
> > > > > > > > > > > > > > to enqueue messages when we're waiting for
> > metadata,
> > > > and we
> > > > > > > > would
> > > > > > > > > > > want
> > > > > > > > > > > > to
> > > > > > > > > > > > > > keep track of the total number of bytes for those
> > > > > > messages).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Sun, May 16, 2021 at 6:16 AM Alexandre Dupriez <
> > > > > > > > > > > > > > alexandre.dupriez@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hello Nakamura,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for proposing this change. I can see how
> > the
> > > > > > blocking
> > > > > > > > > > > > behaviour
> > > > > > > > > > > > > > > can be a problem when integrating with reactive
> > > > > > frameworks
> > > > > > > > such
> > > > > > > > > > as
> > > > > > > > > > > > > > > Akka. One of the questions I would have is how
> > you
> > > > would
> > > > > > > > handle
> > > > > > > > > > > back
> > > > > > > > > > > > > > > pressure and avoid memory exhaustion when the
> > > > producer's
> > > > > > > > buffer
> > > > > > > > > > is
> > > > > > > > > > > > > > > full and tasks would start to accumulate in the
> > > > > > out-of-band
> > > > > > > > > queue
> > > > > > > > > > > or
> > > > > > > > > > > > > > > thread pool introduced with this KIP.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > Alexandre
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Le ven. 14 mai 2021 à 15:55, Ryanne Dolan <
> > > > > > > > > ryannedolan@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > a
> > > > > > > > > > > > > > écrit
> > > > > > > > > > > > > > > :
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Makes sense!
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, May 14, 2021, 9:39 AM Nakamura <
> > > > > > nnythm@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I see what you're saying about serde
> > blocking,
> > > > but I
> > > > > > > > think
> > > > > > > > > we
> > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > consider it out of scope for this patch.
> > Right
> > > > now
> > > > > > we've
> > > > > > > > > > > nailed
> > > > > > > > > > > > > > down a
> > > > > > > > > > > > > > > > > couple of use cases where we can
> > unambiguously
> > > > say,
> > > > > > "I
> > > > > > > > can
> > > > > > > > > > make
> > > > > > > > > > > > > > > progress
> > > > > > > > > > > > > > > > > now" or "I cannot make progress now", which
> > > > makes it
> > > > > > > > > possible
> > > > > > > > > > > to
> > > > > > > > > > > > > > > offload to
> > > > > > > > > > > > > > > > > a different thread only if we are unable to
> > make
> > > > > > > > progress.
> > > > > > > > > > > > > Extending
> > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > to CPU work like serde would mean always
> > > > offloading,
> > > > > > > > which
> > > > > > > > > > > would
> > > > > > > > > > > > > be a
> > > > > > > > > > > > > > > > > really big performance change.  It might be
> > worth
> > > > > > > > exploring
> > > > > > > > > > > > anyway,
> > > > > > > > > > > > > > > but I'd
> > > > > > > > > > > > > > > > > rather keep this patch focused on improving
> > > > > > ergonomics,
> > > > > > > > > > rather
> > > > > > > > > > > > than
> > > > > > > > > > > > > > > > > muddying the waters with evaluating
> > performance
> > > > very
> > > > > > > > > deeply.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I think if we really do want to support
> > serde or
> > > > > > > > > interceptors
> > > > > > > > > > > > that
> > > > > > > > > > > > > do
> > > > > > > > > > > > > > > IO on
> > > > > > > > > > > > > > > > > the send path (which seems like an
> > anti-pattern
> > > > to
> > > > > > me),
> > > > > > > > we
> > > > > > > > > > > should
> > > > > > > > > > > > > > > consider
> > > > > > > > > > > > > > > > > making that a separate SIP, and probably also
> > > > > > consider
> > > > > > > > > > changing
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > API to
> > > > > > > > > > > > > > > > > use Futures (or CompletionStages).  But I
> > would
> > > > > > rather
> > > > > > > > > avoid
> > > > > > > > > > > > scope
> > > > > > > > > > > > > > > creep,
> > > > > > > > > > > > > > > > > so that we have a better chance of fixing
> > this
> > > > part
> > > > > > of
> > > > > > > > the
> > > > > > > > > > > > problem.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Yes, I think some exceptions will move to
> > being
> > > > async
> > > > > > > > > instead
> > > > > > > > > > > of
> > > > > > > > > > > > > > sync.
> > > > > > > > > > > > > > > > > They'll still be surfaced in the Future, so
> > I'm
> > > > not
> > > > > > so
> > > > > > > > > > > confident
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > would be that big a shock to users though.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Thu, May 13, 2021 at 7:44 PM Ryanne Dolan
> > <
> > > > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > re serialization, my concern is that
> > > > serialization
> > > > > > > > often
> > > > > > > > > > > > accounts
> > > > > > > > > > > > > > > for a
> > > > > > > > > > > > > > > > > lot
> > > > > > > > > > > > > > > > > > of the cycles spent before returning the
> > > > future.
> > > > > > It's
> > > > > > > > not
> > > > > > > > > > > > > blocking
> > > > > > > > > > > > > > > per
> > > > > > > > > > > > > > > > > se,
> > > > > > > > > > > > > > > > > > but it's the same effect from the caller's
> > > > > > perspective.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Moreover, serde impls often block
> > themselves,
> > > > e.g.
> > > > > > when
> > > > > > > > > > > > fetching
> > > > > > > > > > > > > > > schemas
> > > > > > > > > > > > > > > > > > from a registry. I suppose it's also
> > possible
> > > > to
> > > > > > block
> > > > > > > > in
> > > > > > > > > > > > > > > Interceptors
> > > > > > > > > > > > > > > > > > (e.g. writing audit events or metrics),
> > which
> > > > > > happens
> > > > > > > > > > before
> > > > > > > > > > > > > serdes
> > > > > > > > > > > > > > > iiuc.
> > > > > > > > > > > > > > > > > > So any blocking in either of those plugins
> > > > would
> > > > > > block
> > > > > > > > > the
> > > > > > > > > > > send
> > > > > > > > > > > > > > > unless we
> > > > > > > > > > > > > > > > > > queue first.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > So I think we want to queue first and do
> > > > everything
> > > > > > > > > > > off-thread
> > > > > > > > > > > > > when
> > > > > > > > > > > > > > > using
> > > > > > > > > > > > > > > > > > the new API, whatever that looks like. I
> > just
> > > > want
> > > > > > to
> > > > > > > > > make
> > > > > > > > > > > sure
> > > > > > > > > > > > > we
> > > > > > > > > > > > > > > don't
> > > > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > > that for clients that wouldn't expect it.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Another consideration is exception
> > handling.
> > > > If we
> > > > > > > > queue
> > > > > > > > > > > right
> > > > > > > > > > > > > > away,
> > > > > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > > > > defer some exceptions that currently are
> > > > thrown to
> > > > > > the
> > > > > > > > > > caller
> > > > > > > > > > > > > > > (before the
> > > > > > > > > > > > > > > > > > future is returned). In the new API, the
> > send()
> > > > > > > > wouldn't
> > > > > > > > > > > throw
> > > > > > > > > > > > > any
> > > > > > > > > > > > > > > > > > exceptions, and instead the future would
> > fail.
> > > > I
> > > > > > think
> > > > > > > > > that
> > > > > > > > > > > > might
> > > > > > > > > > > > > > > mean
> > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > a new method signature is required.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Thu, May 13, 2021, 2:57 PM Nakamura <
> > > > > > > > > > > > nakamura.moses@gmail.com
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I agree we should add an additional
> > > > constructor
> > > > > > (or
> > > > > > > > > else
> > > > > > > > > > an
> > > > > > > > > > > > > > > additional
> > > > > > > > > > > > > > > > > > > overload in KafkaProducer#send, but the
> > new
> > > > > > > > constructor
> > > > > > > > > > > would
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > easier
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > understand) if we're targeting the "user
> > > > > > provides the
> > > > > > > > > > > thread"
> > > > > > > > > > > > > > > approach.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > From looking at the code, I think we can
> > keep
> > > > > > record
> > > > > > > > > > > > > > serialization
> > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > user thread, if we consider that an
> > important
> > > > > > part of
> > > > > > > > > the
> > > > > > > > > > > > > > > semantics of
> > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > method.  It doesn't seem like
> > serialization
> > > > > > depends
> > > > > > > > on
> > > > > > > > > > > > knowing
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > cluster,
> > > > > > > > > > > > > > > > > > > I think it's incidental that it comes
> > after
> > > > the
> > > > > > first
> > > > > > > > > > > > > "blocking"
> > > > > > > > > > > > > > > > > segment
> > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > the method.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Thu, May 13, 2021 at 2:38 PM Ryanne
> > Dolan
> > > > <
> > > > > > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Hey Moses, I like the direction here.
> > My
> > > > > > thinking
> > > > > > > > is
> > > > > > > > > > > that a
> > > > > > > > > > > > > > > single
> > > > > > > > > > > > > > > > > > > > additional work queue, s.t. send() can
> > > > enqueue
> > > > > > and
> > > > > > > > > > > return,
> > > > > > > > > > > > > > seems
> > > > > > > > > > > > > > > like
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > lightest touch. However, I don't think
> > we
> > > > can
> > > > > > > > > trivially
> > > > > > > > > > > > > process
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > queue
> > > > > > > > > > > > > > > > > > > > in an internal thread pool without
> > subtly
> > > > > > changing
> > > > > > > > > > > behavior
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > > > > > > For example, users will often run
> > send() in
> > > > > > > > multiple
> > > > > > > > > > > > threads
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > order
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > serialize faster, but that wouldn't
> > work
> > > > quite
> > > > > > the
> > > > > > > > > same
> > > > > > > > > > > if
> > > > > > > > > > > > > > there
> > > > > > > > > > > > > > > were
> > > > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > > internal thread pool.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > For this reason I'm thinking we need to
> > > > make
> > > > > > sure
> > > > > > > > any
> > > > > > > > > > > such
> > > > > > > > > > > > > > > changes
> > > > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > > > opt-in. Maybe a new constructor with an
> > > > > > additional
> > > > > > > > > > > > > > ThreadFactory
> > > > > > > > > > > > > > > > > > > parameter.
> > > > > > > > > > > > > > > > > > > > That would at least clearly indicate
> > that
> > > > work
> > > > > > will
> > > > > > > > > > > happen
> > > > > > > > > > > > > > > > > off-thread,
> > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > would require opt-in for the new
> > behavior.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Under the hood, this ThreadFactory
> > could be
> > > > > > used to
> > > > > > > > > > > create
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > worker
> > > > > > > > > > > > > > > > > > > > thread that process queued sends, which
> > > > could
> > > > > > > > fan-out
> > > > > > > > > > to
> > > > > > > > > > > > > > > > > per-partition
> > > > > > > > > > > > > > > > > > > > threads from there.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > So then you'd have two ways to send:
> > the
> > > > > > existing
> > > > > > > > > way,
> > > > > > > > > > > > where
> > > > > > > > > > > > > > > serde
> > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > interceptors and whatnot are executed
> > on
> > > > the
> > > > > > > > calling
> > > > > > > > > > > > thread,
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > > > > way, which returns right away and uses
> > an
> > > > > > internal
> > > > > > > > > > > > Executor.
> > > > > > > > > > > > > As
> > > > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > > > point
> > > > > > > > > > > > > > > > > > > > out, the semantics would be identical
> > in
> > > > either
> > > > > > > > case,
> > > > > > > > > > and
> > > > > > > > > > > > it
> > > > > > > > > > > > > > > would be
> > > > > > > > > > > > > > > > > > > very
> > > > > > > > > > > > > > > > > > > > easy for clients to switch.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Thu, May 13, 2021, 9:00 AM Nakamura
> > <
> > > > > > > > > > nnythm@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Hey Folks,
> > > > > > > > > > > > > > > > > > > > > I just posted a new proposal
> > > > > > > > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181306446
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > in the wiki.  I think we have an
> > > > opportunity
> > > > > > to
> > > > > > > > > > improve
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > KafkaProducer#send user experience.
> > It
> > > > would
> > > > > > > > > > certainly
> > > > > > > > > > > > > make
> > > > > > > > > > > > > > > our
> > > > > > > > > > > > > > > > > > lives
> > > > > > > > > > > > > > > > > > > > > easier.  Please take a look!  There
> > are
> > > > two
> > > > > > > > > > subproblems
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > > > guidance on, so I would appreciate
> > > > feedback
> > > > > > on
> > > > > > > > both
> > > > > > > > > > of
> > > > > > > > > > > > > them.
> > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > >
> > > > > > > > > > Matthew de Detrich
> > > > > > > > > >
> > > > > > > > > > *Aiven Deutschland GmbH*
> > > > > > > > > >
> > > > > > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > > > > > >
> > > > > > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > > > > > >
> > > > > > > > > > *m:* +491603708037
> > > > > > > > > >
> > > > > > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Matthew de Detrich
> > > > > > > >
> > > > > > > > *Aiven Deutschland GmbH*
> > > > > > > >
> > > > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > > > >
> > > > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > > > >
> > > > > > > > *m:* +491603708037
> > > > > > > >
> > > > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 

Re: [DISCUSS] KIP-739: Block Less on KafkaProducer#send

Posted by Ryanne Dolan <ry...@gmail.com>.
Colin, the issue for me isn't so much whether non-blocking I/O is used or
not, but the fact that the caller observes a long time between calling
send() and receiving the returned future. This behavior can be considered
"blocking" whether or not I/O is involved.

> How are the ordering semantics of `KafkaProducer#send` related to the
metadata fetch?
> I already proposed a solution (returning an error)

There is a subtle difference between failing immediately vs blocking for
metadata, related to ordering in the face of retries. Say we set the send
timeout to max-long (or something high enough that we rarely encounter
timeouts in practice), and set max inflight requests to 1. Today, we can
reasonably assume that calling send() in sequence to a specific partition
will result in the corresponding sequence landing on that partition,
regardless of how the caller handles retries. The caller might not handle
retries at all. But if we can fail immediately (e.g. when the metadata
isn't yet ready), then the caller must handle retries carefully.
Specifically, the caller must retry each send() before proceeding to the
next. This basically means that the caller must block on each send() in
order to maintain the proper sequence -- how else would the caller know
whether it will need to retry or not?

In other words, failing immediately punts the problem to the caller to
handle, while the caller is less-equipped to deal with it. I don't think we
should do that, at least not in the default case.

I actually don't have any objections to this approach so long as it's
opt-in. It sounds like you are suggesting to fix the bug for everyone, but
I don't think we can do that without subtly breaking things.

Ryanne

On Tue, Jun 1, 2021 at 12:31 PM Colin McCabe <cm...@apache.org> wrote:

> On Tue, Jun 1, 2021, at 07:00, Nakamura wrote:
> > Hi Colin,
> >
> > Sorry, I still don't follow.
> >
> > Right now `KafkaProducer#send` seems to trigger a metadata fetch.  Today,
> > we block on that before returning.  Is your proposal that we move the
> > metadata fetch out of `KafkaProducer#send` entirely?
> >
>
> KafkaProducer#send is supposed to initiate non-blocking I/O, but not wait
> for it to complete.
>
> There's more information about non-blocking I/O in Java here:
> https://en.wikipedia.org/wiki/Non-blocking_I/O_%28Java%29
>
> >
> > Even if the metadata fetch moves to be non-blocking, I think we still
> need
> > to deal with the problems we've discussed before if the fetch happens in
> > the `KafkaProducer#send` method.  How do we maintain the ordering
> semantics
> > of `KafkaProducer#send`?
>
> How are the ordering semantics of `KafkaProducer#send` related to the
> metadata fetch?
>
> >  How do we prevent our buffer from filling up?
>
> That is not related to the metadata fetch. Also, I already proposed a
> solution (returning an error) if this is a concern.
>
> > Which thread is responsible for checking poll()?
>
> The same client thread that always has been responsible for checking poll.
>
> >
> > The only approach I can see that would avoid this would be moving the
> > metadata fetch to happen at a different time.  But it's not clear to me
> > when would be a more appropriate time to do the metadata fetch than
> > `KafkaProducer#send`.
> >
>
> It's not about moving the metadata fetch to happen at a different time.
> It's about using non-blocking I/O, like we do for other network I/O. (And
> actually, if you want to get really technical, we do this for the metadata
> fetch too, it's just that we have a hack that loops to transform it back
> into blocking I/O.)
>
> best,
> Colin
>
> > I think there's something I'm missing here.  Would you mind helping me
> > figure out what it is?
> >
> > Best,
> > Moses
> >
> > On Sun, May 30, 2021 at 5:35 PM Colin McCabe <cm...@apache.org> wrote:
> >
> > > On Tue, May 25, 2021, at 11:26, Nakamura wrote:
> > > > Hey Colin,
> > > >
> > > > For the metadata case, what would fixing the bug look like?  I agree
> that
> > > > we should fix it, but I don't have a clear picture in my mind of what
> > > > fixing it should look like.  Can you elaborate?
> > > >
> > >
> > > If the blocking metadata fetch bug were fixed, neither the producer nor
> > > the consumer would block while fetching metadata. A poll() call would
> > > initiate a metadata fetch if needed, and a subsequent call to poll()
> would
> > > handle the results if needed. Basically the same paradigm we use for
> other
> > > network communication in the producer and consumer.
> > >
> > > best,
> > > Colin
> > >
> > > > Best,
> > > > Moses
> > > >
> > > > On Mon, May 24, 2021 at 1:54 PM Colin McCabe <cm...@apache.org>
> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I agree that we should give users the option of having a fully
> async
> > > API,
> > > > > but I don't think external thread pools or queues are the right
> > > direction
> > > > > to go here. They add performance overheads and don't address the
> root
> > > > > causes of the problem.
> > > > >
> > > > > There are basically two scenarios where we block, currently. One is
> > > when
> > > > > we are doing a metadata fetch. I think this is clearly a bug, or at
> > > least
> > > > > an implementation limitation. From the user's point of view, the
> fact
> > > that
> > > > > we are doing a metadata fetch is an implementation detail that
> really
> > > > > shouldn't be exposed like this. We have talked about fixing this
> in the
> > > > > past. I think we just should spend the time to do it.
> > > > >
> > > > > The second scenario is where the client has produced too much data
> in
> > > too
> > > > > little time. This could happen if there is a network glitch, or the
> > > server
> > > > > is slower than expected. In this case, the behavior is intentional
> and
> > > not
> > > > > a bug. To understand this, think about what would happen if we
> didn't
> > > > > block. We would start buffering more and more data in memory, until
> > > finally
> > > > > the application died with an out of memory error. That would be
> > > frustrating
> > > > > for users and wouldn't add to the usability of Kafka.
> > > > >
> > > > > We could potentially have an option to handle the out-of-memory
> > > scenario
> > > > > differently by returning an error code immediately rather than
> > > blocking.
> > > > > Applications would have to be rewritten to handle this properly,
> but
> > > it is
> > > > > a possibility. I suspect that most of them wouldn't use this, but
> we
> > > could
> > > > > offer it as a possibility for async purists (which might include
> > > certain
> > > > > frameworks). The big problem the users would have to solve is what
> to
> > > do
> > > > > with the record that they were unable to produce due to the buffer
> full
> > > > > issue.
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > >
> > > > > On Thu, May 20, 2021, at 10:35, Nakamura wrote:
> > > > > > >
> > > > > > > My suggestion was just do this in multiple steps/phases,
> firstly
> > > let's
> > > > > fix
> > > > > > > the issue of send being misleadingly asynchronous (i.e.
> internally
> > > its
> > > > > > > blocking) and then later one we can make the various
> > > > > > > threadpools configurable with a sane default.
> > > > > >
> > > > > > I like that approach. I updated the "Which thread should be
> > > responsible
> > > > > for
> > > > > > waiting" part of KIP-739 to add your suggestion as my recommended
> > > > > approach,
> > > > > > thank you!  If no one else has major concerns about that
> approach,
> > > I'll
> > > > > > move the alternatives to "rejected alternatives".
> > > > > >
> > > > > > On Thu, May 20, 2021 at 7:26 AM Matthew de Detrich
> > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > >
> > > > > > > @
> > > > > > >
> > > > > > > Nakamura
> > > > > > > On Wed, May 19, 2021 at 7:35 PM Nakamura <nn...@gmail.com>
> wrote:
> > > > > > >
> > > > > > > > @Ryanne:
> > > > > > > > In my mind's eye I slightly prefer the throwing the "cannot
> > > enqueue"
> > > > > > > > exception to satisfying the future immediately with the
> "cannot
> > > > > enqueue"
> > > > > > > > exception?  But I agree, it would be worth doing more
> research.
> > > > > > > >
> > > > > > > > @Matthew:
> > > > > > > >
> > > > > > > > > 3. Using multiple thread pools is definitely recommended
> for
> > > > > different
> > > > > > > > > types of tasks, for serialization which is CPU bound you
> > > definitely
> > > > > > > would
> > > > > > > > > want to use a bounded thread pool that is fixed by the
> number
> > > of
> > > > > CPU's
> > > > > > > > (or
> > > > > > > > > something along those lines).
> > > > > > > > >
> > > https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > > > is
> > > > > > > a
> > > > > > > > > very good guide on this topic
> > > > > > > > I think this guide is good in general, but I would be
> hesitant to
> > > > > follow
> > > > > > > > its guidance re: offloading serialization without
> benchmarking
> > > it.
> > > > > My
> > > > > > > > understanding is that context-switches have gotten much
> cheaper,
> > > and
> > > > > that
> > > > > > > > gains from cache locality are small, but they're not nothing.
> > > > > Especially
> > > > > > > > if the workload has a very small serialization cost, I
> wouldn't
> > > be
> > > > > > > shocked
> > > > > > > > if it made it slower.  I feel pretty strongly that we should
> do
> > > more
> > > > > > > > research here before unconditionally encouraging
> serialization
> > > in a
> > > > > > > > threadpool.  If people think it's important to do it here
> (eg if
> > > we
> > > > > think
> > > > > > > > it would mean another big API change) then we should start
> > > thinking
> > > > > about
> > > > > > > > what benchmarking we can do to gain higher confidence in this
> > > kind of
> > > > > > > > change.  However, I don't think it would change semantics as
> > > > > > > substantially
> > > > > > > > as we're proposing here, so I would vote for pushing this to
> a
> > > > > subsequent
> > > > > > > > KIP.
> > > > > > > >
> > > > > > > Of course, its all down to benchmarking, benchmarking and
> > > benchmarking.
> > > > > > > Ideally speaking you want to use all of the resources
> available to
> > > > > you, so
> > > > > > > if you have a bottleneck in serialization and you have many
> cores
> > > free
> > > > > then
> > > > > > > using multiple cores may be more appropriate than a single
> core.
> > > > > Typically
> > > > > > > I would expect that using a single thread to do serialization
> is
> > > > > likely to
> > > > > > > be the most situation, I was just responding to an earlier
> point
> > > that
> > > > > was
> > > > > > > made in regards to using ThreadPools for serialization (note
> that
> > > you
> > > > > can
> > > > > > > also just use a ThreadPool that is pinned to a single thread)
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > 4. Regarding providing the ability for users to supply
> their
> > > own
> > > > > custom
> > > > > > > > > ThreadPool this is more of an ergonomics question for the
> API.
> > > > > > > Especially
> > > > > > > > > when it gets to monitoring/tracing, giving the ability for
> > > users to
> > > > > > > > provide
> > > > > > > > > their own custom IO/CPU ThreadPools is ideal however as
> stated
> > > > > doing so
> > > > > > > > > means a lot of boilerplatery changes to the API. Typically
> > > > > speaking a
> > > > > > > lot
> > > > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > > > ExecutionContext/ThreadPools
> > > > > > > > > (at least on a more rudimentary level) and hence allowing
> > > users to
> > > > > > > supply
> > > > > > > > a
> > > > > > > > > global singleton ThreadPool for IO tasks and another for
> CPU
> > > tasks
> > > > > > > makes
> > > > > > > > > their lives a lot easier. However due to the large amount
> of
> > > > > changes to
> > > > > > > > the
> > > > > > > > > API, it may be more appropriate to just use internal thread
> > > pools
> > > > > (for
> > > > > > > > now)
> > > > > > > > > since at least it's not any worse than what exists
> currently
> > > and
> > > > > this
> > > > > > > can
> > > > > > > > > be an improvement that is done later?
> > > > > > > > Is there an existing threadpool that you suggest we reuse?
> Or
> > > are
> > > > > you
> > > > > > > > imagining that we make our own internal threadpool, and then
> > > maybe
> > > > > expose
> > > > > > > > configuration flags to manipulate it?  For what it's worth, I
> > > like
> > > > > having
> > > > > > > > an internal threadpool (perhaps just FJP.commonpool) and then
> > > > > providing
> > > > > > > an
> > > > > > > > alternative to pass your own threadpool.  That way people who
> > > want
> > > > > finer
> > > > > > > > control can get it, and everyone else can do OK with the
> default.
> > > > > > > >
> > > > > > > Indeed that is what I am saying. The most ideal situation is
> that
> > > > > there is
> > > > > > > a default internal threadpool that Kafka uses, however users of
> > > Kafka
> > > > > can
> > > > > > > configure there own threadpool. Having a singleton ThreadPool
> for
> > > > > blocking
> > > > > > > IO, non blocking IO and CPU bound tasks which can be plugged in
> > > all of
> > > > > your
> > > > > > > libraries (including Kafka) makes resource management much
> easier
> > > to
> > > > > do and
> > > > > > > also gives control of users to override specific threadpools
> for
> > > > > > > exceptional cases (i.e. providing a Threadpool that is pinned
> to a
> > > > > single
> > > > > > > core which tends to give the best latency results if this is
> > > something
> > > > > that
> > > > > > > is critical for you).
> > > > > > >
> > > > > > > My suggestion was just do this in multiple steps/phases,
> firstly
> > > let's
> > > > > fix
> > > > > > > the issue of send being misleadingly asynchronous (i.e.
> internally
> > > its
> > > > > > > blocking) and then later one we can make the various
> > > > > > > threadpools configurable with a sane default.
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, May 19, 2021 at 6:01 AM Matthew de Detrich
> > > > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > > > >
> > > > > > > > > Here are my two cents here (note that I am only seeing
> this on
> > > a
> > > > > > > surface
> > > > > > > > > level)
> > > > > > > > >
> > > > > > > > > 1. If we are going this road it makes sense to do this
> > > "properly"
> > > > > (i.e.
> > > > > > > > > using queues  as Ryan suggested). The reason I am saying
> this
> > > is
> > > > > that
> > > > > > > it
> > > > > > > > > seems that the original goal of the KIP is for it to be
> used in
> > > > > other
> > > > > > > > > asynchronous systems and from my personal experience, you
> > > really do
> > > > > > > need
> > > > > > > > to
> > > > > > > > > make the implementation properly asynchronous otherwise
> it's
> > > > > really not
> > > > > > > > > that useful.
> > > > > > > > > 2. Due to the previous point and what was said by others,
> this
> > > is
> > > > > > > likely
> > > > > > > > > going to break some existing semantics (i.e. people are
> > > currently
> > > > > > > relying
> > > > > > > > > on blocking semantics) so adding another method's/interface
> > > plus
> > > > > > > > > deprecating the older one is more annoying but ideal.
> > > > > > > > > 3. Using multiple thread pools is definitely recommended
> for
> > > > > different
> > > > > > > > > types of tasks, for serialization which is CPU bound you
> > > definitely
> > > > > > > would
> > > > > > > > > want to use a bounded thread pool that is fixed by the
> number
> > > of
> > > > > CPU's
> > > > > > > > (or
> > > > > > > > > something along those lines).
> > > > > > > > >
> > > https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > > > is
> > > > > > > a
> > > > > > > > > very good guide on this topic
> > > > > > > > > 4. Regarding providing the ability for users to supply
> their
> > > own
> > > > > custom
> > > > > > > > > ThreadPool this is more of an ergonomics question for the
> API.
> > > > > > > Especially
> > > > > > > > > when it gets to monitoring/tracing, giving the ability for
> > > users to
> > > > > > > > provide
> > > > > > > > > their own custom IO/CPU ThreadPools is ideal however as
> stated
> > > > > doing so
> > > > > > > > > means a lot of boilerplatery changes to the API. Typically
> > > > > speaking a
> > > > > > > lot
> > > > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > > > ExecutionContext/ThreadPools
> > > > > > > > > (at least on a more rudimentary level) and hence allowing
> > > users to
> > > > > > > > supply a
> > > > > > > > > global singleton ThreadPool for IO tasks and another for
> CPU
> > > tasks
> > > > > > > makes
> > > > > > > > > their lives a lot easier. However due to the large amount
> of
> > > > > changes to
> > > > > > > > the
> > > > > > > > > API, it may be more appropriate to just use internal thread
> > > pools
> > > > > (for
> > > > > > > > now)
> > > > > > > > > since at least it's not any worse than what exists
> currently
> > > and
> > > > > this
> > > > > > > can
> > > > > > > > > be an improvement that is done later?
> > > > > > > > >
> > > > > > > > > On Wed, May 19, 2021 at 2:56 AM Ryanne Dolan <
> > > > > ryannedolan@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I was thinking the sender would typically wrap send() in
> a
> > > > > > > > backoff/retry
> > > > > > > > > > loop, or else ignore any failures and drop sends on the
> floor
> > > > > > > > > > (fire-and-forget), and in both cases I think failing
> > > immediately
> > > > > is
> > > > > > > > > better
> > > > > > > > > > than blocking for a new spot in the queue or
> asynchronously
> > > > > failing
> > > > > > > > > > somehow.
> > > > > > > > > >
> > > > > > > > > > I think a failed future is adequate for the "explicit
> > > > > backpressure
> > > > > > > > > signal"
> > > > > > > > > > while avoiding any blocking anywhere. I think if we try
> to
> > > > > > > > asynchronously
> > > > > > > > > > signal the caller of failure (either by asynchronously
> > > failing
> > > > > the
> > > > > > > > future
> > > > > > > > > > or invoking a callback off-thread or something) we'd
> force
> > > the
> > > > > caller
> > > > > > > > to
> > > > > > > > > > either block or poll waiting for that signal, which
> somewhat
> > > > > defeats
> > > > > > > > the
> > > > > > > > > > purpose we're after. And of course blocking for a spot
> in the
> > > > > queue
> > > > > > > > > > definitely defeats the purpose (tho perhaps ameliorates
> the
> > > > > problem
> > > > > > > > > some).
> > > > > > > > > >
> > > > > > > > > > Throwing an exception to the caller directly (not via the
> > > > > future) is
> > > > > > > > > > another option with precedent in Kafka clients, tho it
> > > doesn't
> > > > > seem
> > > > > > > as
> > > > > > > > > > ergonomic to me.
> > > > > > > > > >
> > > > > > > > > > It would be interesting to analyze some existing usage
> and
> > > > > determine
> > > > > > > > how
> > > > > > > > > > difficult it would be to convert it to the various
> proposed
> > > APIs.
> > > > > > > > > >
> > > > > > > > > > Ryanne
> > > > > > > > > >
> > > > > > > > > > On Tue, May 18, 2021, 3:27 PM Nakamura <nnythm@gmail.com
> >
> > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Ryanne,
> > > > > > > > > > >
> > > > > > > > > > > Hmm, that's an interesting idea.  Basically it would
> mean
> > > that
> > > > > > > after
> > > > > > > > > > > calling send, you would also have to check whether the
> > > returned
> > > > > > > > future
> > > > > > > > > > had
> > > > > > > > > > > failed with a specific exception.  I would be open to
> it,
> > > > > although
> > > > > > > I
> > > > > > > > > > think
> > > > > > > > > > > it might be slightly more surprising, since right now
> the
> > > > > paradigm
> > > > > > > is
> > > > > > > > > > > "enqueue synchronously, the future represents whether
> we
> > > > > succeeded
> > > > > > > in
> > > > > > > > > > > sending or not" and the new one would be "enqueue
> > > > > synchronously,
> > > > > > > the
> > > > > > > > > > future
> > > > > > > > > > > either represents whether we succeeded in enqueueing or
> > > not (in
> > > > > > > which
> > > > > > > > > > case
> > > > > > > > > > > it will be failed immediately if it failed to enqueue)
> or
> > > > > whether
> > > > > > > we
> > > > > > > > > > > succeeded in sending or not".
> > > > > > > > > > >
> > > > > > > > > > > But you're right, it should be on the table, thank you
> for
> > > > > > > suggesting
> > > > > > > > > it!
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Moses
> > > > > > > > > > >
> > > > > > > > > > > On Tue, May 18, 2021 at 12:23 PM Ryanne Dolan <
> > > > > > > ryannedolan@gmail.com
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Moses, in the case of a full queue, could we just
> return
> > > a
> > > > > failed
> > > > > > > > > > future
> > > > > > > > > > > > immediately?
> > > > > > > > > > > >
> > > > > > > > > > > > Ryanne
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, May 18, 2021, 10:39 AM Nakamura <
> > > nnythm@gmail.com>
> > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Alexandre,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for bringing this up, I think I could use
> some
> > > > > feedback
> > > > > > > in
> > > > > > > > > > this
> > > > > > > > > > > > > area.  There are two mechanisms here, one for
> slowing
> > > down
> > > > > when
> > > > > > > > we
> > > > > > > > > > > don't
> > > > > > > > > > > > > have the relevant metadata, and the other for
> slowing
> > > down
> > > > > > > when a
> > > > > > > > > > queue
> > > > > > > > > > > > has
> > > > > > > > > > > > > filled up.  Although the first one applies
> backpressure
> > > > > > > somewhat
> > > > > > > > > > > > > inadvertently, we could still get in trouble if
> we're
> > > not
> > > > > > > > providing
> > > > > > > > > > > > > information to the mechanism that monitors whether
> > > we're
> > > > > > > queueing
> > > > > > > > > too
> > > > > > > > > > > > > much.  As for the second one, that is a classic
> > > > > backpressure
> > > > > > > use
> > > > > > > > > > case,
> > > > > > > > > > > so
> > > > > > > > > > > > > it's definitely important that we don't drop that
> > > ability.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Right now backpressure is applied by blocking,
> which
> > > is a
> > > > > > > natural
> > > > > > > > > way
> > > > > > > > > > > to
> > > > > > > > > > > > > apply backpressure in synchronous systems, but can
> > > lead to
> > > > > > > > > > unnecessary
> > > > > > > > > > > > > slowdowns in asynchronous systems.  In my opinion,
> the
> > > > > safest
> > > > > > > way
> > > > > > > > > to
> > > > > > > > > > > > apply
> > > > > > > > > > > > > backpressure in an asynchronous model is to have an
> > > > > explicit
> > > > > > > > > > > backpressure
> > > > > > > > > > > > > signal.  A good example would be returning an
> > > exception,
> > > > > and
> > > > > > > > > > providing
> > > > > > > > > > > an
> > > > > > > > > > > > > optional hook to add a callback onto so that you
> can be
> > > > > > > notified
> > > > > > > > > when
> > > > > > > > > > > > it's
> > > > > > > > > > > > > ready to accept more messages.
> > > > > > > > > > > > >
> > > > > > > > > > > > > However, this would be a really big change to how
> > > users use
> > > > > > > > > > > > > KafkaProducer#send, so I don't know how much
> appetite
> > > we
> > > > > have
> > > > > > > for
> > > > > > > > > > > making
> > > > > > > > > > > > > that kind of change.  Maybe it would be simpler to
> > > remove
> > > > > the
> > > > > > > > > "don't
> > > > > > > > > > > > block
> > > > > > > > > > > > > when the per-topic queue is full" from the scope of
> > > this
> > > > > KIP,
> > > > > > > and
> > > > > > > > > > only
> > > > > > > > > > > > > focus on when metadata is available?  The downside
> is
> > > that
> > > > > we
> > > > > > > > will
> > > > > > > > > > > > probably
> > > > > > > > > > > > > want to change the API again later to fix this, so
> it
> > > > > might be
> > > > > > > > > better
> > > > > > > > > > > to
> > > > > > > > > > > > > just rip the bandaid off now.
> > > > > > > > > > > > >
> > > > > > > > > > > > > One slightly nasty thing here is that because
> queueing
> > > > > order is
> > > > > > > > > > > > important,
> > > > > > > > > > > > > if we want to use exceptions, we will want to be
> able
> > > to
> > > > > signal
> > > > > > > > the
> > > > > > > > > > > > failure
> > > > > > > > > > > > > to enqueue to the caller in such a way that they
> can
> > > still
> > > > > > > > enforce
> > > > > > > > > > > > message
> > > > > > > > > > > > > order if they want.  So we can't embed the failure
> > > > > directly in
> > > > > > > > the
> > > > > > > > > > > > returned
> > > > > > > > > > > > > future, we should either return two futures
> (nested,
> > > or as
> > > > > a
> > > > > > > > tuple)
> > > > > > > > > > or
> > > > > > > > > > > > else
> > > > > > > > > > > > > throw an exception to explain a backpressure.
> > > > > > > > > > > > >
> > > > > > > > > > > > > So there are a few things we should work out here:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. Should we keep the "too many bytes enqueued"
> part of
> > > > > this in
> > > > > > > > > > scope?
> > > > > > > > > > > > (I
> > > > > > > > > > > > > would say yes, so that we can minimize churn in
> this
> > > API)
> > > > > > > > > > > > > 2. How should we signal backpressure so that it's
> > > > > appropriate
> > > > > > > for
> > > > > > > > > > > > > asynchronous systems?  (I would say that we should
> > > throw an
> > > > > > > > > > exception.
> > > > > > > > > > > > If
> > > > > > > > > > > > > we choose this and we want to pursue the queueing
> > > path, we
> > > > > > > would
> > > > > > > > > > *not*
> > > > > > > > > > > > want
> > > > > > > > > > > > > to enqueue messages that would push us over the
> limit,
> > > and
> > > > > > > would
> > > > > > > > > only
> > > > > > > > > > > > want
> > > > > > > > > > > > > to enqueue messages when we're waiting for
> metadata,
> > > and we
> > > > > > > would
> > > > > > > > > > want
> > > > > > > > > > > to
> > > > > > > > > > > > > keep track of the total number of bytes for those
> > > > > messages).
> > > > > > > > > > > > >
> > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Moses
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sun, May 16, 2021 at 6:16 AM Alexandre Dupriez <
> > > > > > > > > > > > > alexandre.dupriez@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hello Nakamura,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for proposing this change. I can see how
> the
> > > > > blocking
> > > > > > > > > > > behaviour
> > > > > > > > > > > > > > can be a problem when integrating with reactive
> > > > > frameworks
> > > > > > > such
> > > > > > > > > as
> > > > > > > > > > > > > > Akka. One of the questions I would have is how
> you
> > > would
> > > > > > > handle
> > > > > > > > > > back
> > > > > > > > > > > > > > pressure and avoid memory exhaustion when the
> > > producer's
> > > > > > > buffer
> > > > > > > > > is
> > > > > > > > > > > > > > full and tasks would start to accumulate in the
> > > > > out-of-band
> > > > > > > > queue
> > > > > > > > > > or
> > > > > > > > > > > > > > thread pool introduced with this KIP.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > Alexandre
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Le ven. 14 mai 2021 à 15:55, Ryanne Dolan <
> > > > > > > > ryannedolan@gmail.com
> > > > > > > > > >
> > > > > > > > > > a
> > > > > > > > > > > > > écrit
> > > > > > > > > > > > > > :
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Makes sense!
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, May 14, 2021, 9:39 AM Nakamura <
> > > > > nnythm@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I see what you're saying about serde
> blocking,
> > > but I
> > > > > > > think
> > > > > > > > we
> > > > > > > > > > > > should
> > > > > > > > > > > > > > > > consider it out of scope for this patch.
> Right
> > > now
> > > > > we've
> > > > > > > > > > nailed
> > > > > > > > > > > > > down a
> > > > > > > > > > > > > > > > couple of use cases where we can
> unambiguously
> > > say,
> > > > > "I
> > > > > > > can
> > > > > > > > > make
> > > > > > > > > > > > > > progress
> > > > > > > > > > > > > > > > now" or "I cannot make progress now", which
> > > makes it
> > > > > > > > possible
> > > > > > > > > > to
> > > > > > > > > > > > > > offload to
> > > > > > > > > > > > > > > > a different thread only if we are unable to
> make
> > > > > > > progress.
> > > > > > > > > > > > Extending
> > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > to CPU work like serde would mean always
> > > offloading,
> > > > > > > which
> > > > > > > > > > would
> > > > > > > > > > > > be a
> > > > > > > > > > > > > > > > really big performance change.  It might be
> worth
> > > > > > > exploring
> > > > > > > > > > > anyway,
> > > > > > > > > > > > > > but I'd
> > > > > > > > > > > > > > > > rather keep this patch focused on improving
> > > > > ergonomics,
> > > > > > > > > rather
> > > > > > > > > > > than
> > > > > > > > > > > > > > > > muddying the waters with evaluating
> performance
> > > very
> > > > > > > > deeply.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I think if we really do want to support
> serde or
> > > > > > > > interceptors
> > > > > > > > > > > that
> > > > > > > > > > > > do
> > > > > > > > > > > > > > IO on
> > > > > > > > > > > > > > > > the send path (which seems like an
> anti-pattern
> > > to
> > > > > me),
> > > > > > > we
> > > > > > > > > > should
> > > > > > > > > > > > > > consider
> > > > > > > > > > > > > > > > making that a separate SIP, and probably also
> > > > > consider
> > > > > > > > > changing
> > > > > > > > > > > the
> > > > > > > > > > > > > > API to
> > > > > > > > > > > > > > > > use Futures (or CompletionStages).  But I
> would
> > > > > rather
> > > > > > > > avoid
> > > > > > > > > > > scope
> > > > > > > > > > > > > > creep,
> > > > > > > > > > > > > > > > so that we have a better chance of fixing
> this
> > > part
> > > > > of
> > > > > > > the
> > > > > > > > > > > problem.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Yes, I think some exceptions will move to
> being
> > > async
> > > > > > > > instead
> > > > > > > > > > of
> > > > > > > > > > > > > sync.
> > > > > > > > > > > > > > > > They'll still be surfaced in the Future, so
> I'm
> > > not
> > > > > so
> > > > > > > > > > confident
> > > > > > > > > > > > that
> > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > would be that big a shock to users though.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, May 13, 2021 at 7:44 PM Ryanne Dolan
> <
> > > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > re serialization, my concern is that
> > > serialization
> > > > > > > often
> > > > > > > > > > > accounts
> > > > > > > > > > > > > > for a
> > > > > > > > > > > > > > > > lot
> > > > > > > > > > > > > > > > > of the cycles spent before returning the
> > > future.
> > > > > It's
> > > > > > > not
> > > > > > > > > > > > blocking
> > > > > > > > > > > > > > per
> > > > > > > > > > > > > > > > se,
> > > > > > > > > > > > > > > > > but it's the same effect from the caller's
> > > > > perspective.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Moreover, serde impls often block
> themselves,
> > > e.g.
> > > > > when
> > > > > > > > > > > fetching
> > > > > > > > > > > > > > schemas
> > > > > > > > > > > > > > > > > from a registry. I suppose it's also
> possible
> > > to
> > > > > block
> > > > > > > in
> > > > > > > > > > > > > > Interceptors
> > > > > > > > > > > > > > > > > (e.g. writing audit events or metrics),
> which
> > > > > happens
> > > > > > > > > before
> > > > > > > > > > > > serdes
> > > > > > > > > > > > > > iiuc.
> > > > > > > > > > > > > > > > > So any blocking in either of those plugins
> > > would
> > > > > block
> > > > > > > > the
> > > > > > > > > > send
> > > > > > > > > > > > > > unless we
> > > > > > > > > > > > > > > > > queue first.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > So I think we want to queue first and do
> > > everything
> > > > > > > > > > off-thread
> > > > > > > > > > > > when
> > > > > > > > > > > > > > using
> > > > > > > > > > > > > > > > > the new API, whatever that looks like. I
> just
> > > want
> > > > > to
> > > > > > > > make
> > > > > > > > > > sure
> > > > > > > > > > > > we
> > > > > > > > > > > > > > don't
> > > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > that for clients that wouldn't expect it.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Another consideration is exception
> handling.
> > > If we
> > > > > > > queue
> > > > > > > > > > right
> > > > > > > > > > > > > away,
> > > > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > > > defer some exceptions that currently are
> > > thrown to
> > > > > the
> > > > > > > > > caller
> > > > > > > > > > > > > > (before the
> > > > > > > > > > > > > > > > > future is returned). In the new API, the
> send()
> > > > > > > wouldn't
> > > > > > > > > > throw
> > > > > > > > > > > > any
> > > > > > > > > > > > > > > > > exceptions, and instead the future would
> fail.
> > > I
> > > > > think
> > > > > > > > that
> > > > > > > > > > > might
> > > > > > > > > > > > > > mean
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > a new method signature is required.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Thu, May 13, 2021, 2:57 PM Nakamura <
> > > > > > > > > > > nakamura.moses@gmail.com
> > > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I agree we should add an additional
> > > constructor
> > > > > (or
> > > > > > > > else
> > > > > > > > > an
> > > > > > > > > > > > > > additional
> > > > > > > > > > > > > > > > > > overload in KafkaProducer#send, but the
> new
> > > > > > > constructor
> > > > > > > > > > would
> > > > > > > > > > > > be
> > > > > > > > > > > > > > easier
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > understand) if we're targeting the "user
> > > > > provides the
> > > > > > > > > > thread"
> > > > > > > > > > > > > > approach.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > From looking at the code, I think we can
> keep
> > > > > record
> > > > > > > > > > > > > serialization
> > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > user thread, if we consider that an
> important
> > > > > part of
> > > > > > > > the
> > > > > > > > > > > > > > semantics of
> > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > method.  It doesn't seem like
> serialization
> > > > > depends
> > > > > > > on
> > > > > > > > > > > knowing
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > cluster,
> > > > > > > > > > > > > > > > > > I think it's incidental that it comes
> after
> > > the
> > > > > first
> > > > > > > > > > > > "blocking"
> > > > > > > > > > > > > > > > segment
> > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > the method.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Thu, May 13, 2021 at 2:38 PM Ryanne
> Dolan
> > > <
> > > > > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hey Moses, I like the direction here.
> My
> > > > > thinking
> > > > > > > is
> > > > > > > > > > that a
> > > > > > > > > > > > > > single
> > > > > > > > > > > > > > > > > > > additional work queue, s.t. send() can
> > > enqueue
> > > > > and
> > > > > > > > > > return,
> > > > > > > > > > > > > seems
> > > > > > > > > > > > > > like
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > lightest touch. However, I don't think
> we
> > > can
> > > > > > > > trivially
> > > > > > > > > > > > process
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > queue
> > > > > > > > > > > > > > > > > > > in an internal thread pool without
> subtly
> > > > > changing
> > > > > > > > > > behavior
> > > > > > > > > > > > for
> > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > > > > > For example, users will often run
> send() in
> > > > > > > multiple
> > > > > > > > > > > threads
> > > > > > > > > > > > in
> > > > > > > > > > > > > > order
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > serialize faster, but that wouldn't
> work
> > > quite
> > > > > the
> > > > > > > > same
> > > > > > > > > > if
> > > > > > > > > > > > > there
> > > > > > > > > > > > > > were
> > > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > internal thread pool.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > For this reason I'm thinking we need to
> > > make
> > > > > sure
> > > > > > > any
> > > > > > > > > > such
> > > > > > > > > > > > > > changes
> > > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > > opt-in. Maybe a new constructor with an
> > > > > additional
> > > > > > > > > > > > > ThreadFactory
> > > > > > > > > > > > > > > > > > parameter.
> > > > > > > > > > > > > > > > > > > That would at least clearly indicate
> that
> > > work
> > > > > will
> > > > > > > > > > happen
> > > > > > > > > > > > > > > > off-thread,
> > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > would require opt-in for the new
> behavior.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Under the hood, this ThreadFactory
> could be
> > > > > used to
> > > > > > > > > > create
> > > > > > > > > > > > the
> > > > > > > > > > > > > > worker
> > > > > > > > > > > > > > > > > > > thread that process queued sends, which
> > > could
> > > > > > > fan-out
> > > > > > > > > to
> > > > > > > > > > > > > > > > per-partition
> > > > > > > > > > > > > > > > > > > threads from there.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > So then you'd have two ways to send:
> the
> > > > > existing
> > > > > > > > way,
> > > > > > > > > > > where
> > > > > > > > > > > > > > serde
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > interceptors and whatnot are executed
> on
> > > the
> > > > > > > calling
> > > > > > > > > > > thread,
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > > > way, which returns right away and uses
> an
> > > > > internal
> > > > > > > > > > > Executor.
> > > > > > > > > > > > As
> > > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > > point
> > > > > > > > > > > > > > > > > > > out, the semantics would be identical
> in
> > > either
> > > > > > > case,
> > > > > > > > > and
> > > > > > > > > > > it
> > > > > > > > > > > > > > would be
> > > > > > > > > > > > > > > > > > very
> > > > > > > > > > > > > > > > > > > easy for clients to switch.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Thu, May 13, 2021, 9:00 AM Nakamura
> <
> > > > > > > > > nnythm@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Hey Folks,
> > > > > > > > > > > > > > > > > > > > I just posted a new proposal
> > > > > > > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181306446
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > in the wiki.  I think we have an
> > > opportunity
> > > > > to
> > > > > > > > > improve
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > KafkaProducer#send user experience.
> It
> > > would
> > > > > > > > > certainly
> > > > > > > > > > > > make
> > > > > > > > > > > > > > our
> > > > > > > > > > > > > > > > > lives
> > > > > > > > > > > > > > > > > > > > easier.  Please take a look!  There
> are
> > > two
> > > > > > > > > subproblems
> > > > > > > > > > > > that
> > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > > guidance on, so I would appreciate
> > > feedback
> > > > > on
> > > > > > > both
> > > > > > > > > of
> > > > > > > > > > > > them.
> > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > >
> > > > > > > > > Matthew de Detrich
> > > > > > > > >
> > > > > > > > > *Aiven Deutschland GmbH*
> > > > > > > > >
> > > > > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > > > > >
> > > > > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > > > > >
> > > > > > > > > *m:* +491603708037
> > > > > > > > >
> > > > > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Matthew de Detrich
> > > > > > >
> > > > > > > *Aiven Deutschland GmbH*
> > > > > > >
> > > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > > >
> > > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > > >
> > > > > > > *m:* +491603708037
> > > > > > >
> > > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-739: Block Less on KafkaProducer#send

Posted by Colin McCabe <cm...@apache.org>.
On Tue, Jun 1, 2021, at 07:00, Nakamura wrote:
> Hi Colin,
> 
> Sorry, I still don't follow.
> 
> Right now `KafkaProducer#send` seems to trigger a metadata fetch.  Today,
> we block on that before returning.  Is your proposal that we move the
> metadata fetch out of `KafkaProducer#send` entirely?
> 

KafkaProducer#send is supposed to initiate non-blocking I/O, but not wait for it to complete.

There's more information about non-blocking I/O in Java here: 
https://en.wikipedia.org/wiki/Non-blocking_I/O_%28Java%29

>
> Even if the metadata fetch moves to be non-blocking, I think we still need
> to deal with the problems we've discussed before if the fetch happens in
> the `KafkaProducer#send` method.  How do we maintain the ordering semantics
> of `KafkaProducer#send`?

How are the ordering semantics of `KafkaProducer#send` related to the metadata fetch?

>  How do we prevent our buffer from filling up?

That is not related to the metadata fetch. Also, I already proposed a solution (returning an error) if this is a concern.

> Which thread is responsible for checking poll()?

The same client thread that always has been responsible for checking poll.

> 
> The only approach I can see that would avoid this would be moving the
> metadata fetch to happen at a different time.  But it's not clear to me
> when would be a more appropriate time to do the metadata fetch than
> `KafkaProducer#send`.
> 

It's not about moving the metadata fetch to happen at a different time. It's about using non-blocking I/O, like we do for other network I/O. (And actually, if you want to get really technical, we do this for the metadata fetch too, it's just that we have a hack that loops to transform it back into blocking I/O.)

best,
Colin

> I think there's something I'm missing here.  Would you mind helping me
> figure out what it is?
> 
> Best,
> Moses
> 
> On Sun, May 30, 2021 at 5:35 PM Colin McCabe <cm...@apache.org> wrote:
> 
> > On Tue, May 25, 2021, at 11:26, Nakamura wrote:
> > > Hey Colin,
> > >
> > > For the metadata case, what would fixing the bug look like?  I agree that
> > > we should fix it, but I don't have a clear picture in my mind of what
> > > fixing it should look like.  Can you elaborate?
> > >
> >
> > If the blocking metadata fetch bug were fixed, neither the producer nor
> > the consumer would block while fetching metadata. A poll() call would
> > initiate a metadata fetch if needed, and a subsequent call to poll() would
> > handle the results if needed. Basically the same paradigm we use for other
> > network communication in the producer and consumer.
> >
> > best,
> > Colin
> >
> > > Best,
> > > Moses
> > >
> > > On Mon, May 24, 2021 at 1:54 PM Colin McCabe <cm...@apache.org> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I agree that we should give users the option of having a fully async
> > API,
> > > > but I don't think external thread pools or queues are the right
> > direction
> > > > to go here. They add performance overheads and don't address the root
> > > > causes of the problem.
> > > >
> > > > There are basically two scenarios where we block, currently. One is
> > when
> > > > we are doing a metadata fetch. I think this is clearly a bug, or at
> > least
> > > > an implementation limitation. From the user's point of view, the fact
> > that
> > > > we are doing a metadata fetch is an implementation detail that really
> > > > shouldn't be exposed like this. We have talked about fixing this in the
> > > > past. I think we just should spend the time to do it.
> > > >
> > > > The second scenario is where the client has produced too much data in
> > too
> > > > little time. This could happen if there is a network glitch, or the
> > server
> > > > is slower than expected. In this case, the behavior is intentional and
> > not
> > > > a bug. To understand this, think about what would happen if we didn't
> > > > block. We would start buffering more and more data in memory, until
> > finally
> > > > the application died with an out of memory error. That would be
> > frustrating
> > > > for users and wouldn't add to the usability of Kafka.
> > > >
> > > > We could potentially have an option to handle the out-of-memory
> > scenario
> > > > differently by returning an error code immediately rather than
> > blocking.
> > > > Applications would have to be rewritten to handle this properly, but
> > it is
> > > > a possibility. I suspect that most of them wouldn't use this, but we
> > could
> > > > offer it as a possibility for async purists (which might include
> > certain
> > > > frameworks). The big problem the users would have to solve is what to
> > do
> > > > with the record that they were unable to produce due to the buffer full
> > > > issue.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Thu, May 20, 2021, at 10:35, Nakamura wrote:
> > > > > >
> > > > > > My suggestion was just do this in multiple steps/phases, firstly
> > let's
> > > > fix
> > > > > > the issue of send being misleadingly asynchronous (i.e. internally
> > its
> > > > > > blocking) and then later one we can make the various
> > > > > > threadpools configurable with a sane default.
> > > > >
> > > > > I like that approach. I updated the "Which thread should be
> > responsible
> > > > for
> > > > > waiting" part of KIP-739 to add your suggestion as my recommended
> > > > approach,
> > > > > thank you!  If no one else has major concerns about that approach,
> > I'll
> > > > > move the alternatives to "rejected alternatives".
> > > > >
> > > > > On Thu, May 20, 2021 at 7:26 AM Matthew de Detrich
> > > > > <ma...@aiven.io.invalid> wrote:
> > > > >
> > > > > > @
> > > > > >
> > > > > > Nakamura
> > > > > > On Wed, May 19, 2021 at 7:35 PM Nakamura <nn...@gmail.com> wrote:
> > > > > >
> > > > > > > @Ryanne:
> > > > > > > In my mind's eye I slightly prefer the throwing the "cannot
> > enqueue"
> > > > > > > exception to satisfying the future immediately with the "cannot
> > > > enqueue"
> > > > > > > exception?  But I agree, it would be worth doing more research.
> > > > > > >
> > > > > > > @Matthew:
> > > > > > >
> > > > > > > > 3. Using multiple thread pools is definitely recommended for
> > > > different
> > > > > > > > types of tasks, for serialization which is CPU bound you
> > definitely
> > > > > > would
> > > > > > > > want to use a bounded thread pool that is fixed by the number
> > of
> > > > CPU's
> > > > > > > (or
> > > > > > > > something along those lines).
> > > > > > > >
> > https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > > is
> > > > > > a
> > > > > > > > very good guide on this topic
> > > > > > > I think this guide is good in general, but I would be hesitant to
> > > > follow
> > > > > > > its guidance re: offloading serialization without benchmarking
> > it.
> > > > My
> > > > > > > understanding is that context-switches have gotten much cheaper,
> > and
> > > > that
> > > > > > > gains from cache locality are small, but they're not nothing.
> > > > Especially
> > > > > > > if the workload has a very small serialization cost, I wouldn't
> > be
> > > > > > shocked
> > > > > > > if it made it slower.  I feel pretty strongly that we should do
> > more
> > > > > > > research here before unconditionally encouraging serialization
> > in a
> > > > > > > threadpool.  If people think it's important to do it here (eg if
> > we
> > > > think
> > > > > > > it would mean another big API change) then we should start
> > thinking
> > > > about
> > > > > > > what benchmarking we can do to gain higher confidence in this
> > kind of
> > > > > > > change.  However, I don't think it would change semantics as
> > > > > > substantially
> > > > > > > as we're proposing here, so I would vote for pushing this to a
> > > > subsequent
> > > > > > > KIP.
> > > > > > >
> > > > > > Of course, its all down to benchmarking, benchmarking and
> > benchmarking.
> > > > > > Ideally speaking you want to use all of the resources available to
> > > > you, so
> > > > > > if you have a bottleneck in serialization and you have many cores
> > free
> > > > then
> > > > > > using multiple cores may be more appropriate than a single core.
> > > > Typically
> > > > > > I would expect that using a single thread to do serialization is
> > > > likely to
> > > > > > be the most situation, I was just responding to an earlier point
> > that
> > > > was
> > > > > > made in regards to using ThreadPools for serialization (note that
> > you
> > > > can
> > > > > > also just use a ThreadPool that is pinned to a single thread)
> > > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > > 4. Regarding providing the ability for users to supply their
> > own
> > > > custom
> > > > > > > > ThreadPool this is more of an ergonomics question for the API.
> > > > > > Especially
> > > > > > > > when it gets to monitoring/tracing, giving the ability for
> > users to
> > > > > > > provide
> > > > > > > > their own custom IO/CPU ThreadPools is ideal however as stated
> > > > doing so
> > > > > > > > means a lot of boilerplatery changes to the API. Typically
> > > > speaking a
> > > > > > lot
> > > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > > ExecutionContext/ThreadPools
> > > > > > > > (at least on a more rudimentary level) and hence allowing
> > users to
> > > > > > supply
> > > > > > > a
> > > > > > > > global singleton ThreadPool for IO tasks and another for CPU
> > tasks
> > > > > > makes
> > > > > > > > their lives a lot easier. However due to the large amount of
> > > > changes to
> > > > > > > the
> > > > > > > > API, it may be more appropriate to just use internal thread
> > pools
> > > > (for
> > > > > > > now)
> > > > > > > > since at least it's not any worse than what exists currently
> > and
> > > > this
> > > > > > can
> > > > > > > > be an improvement that is done later?
> > > > > > > Is there an existing threadpool that you suggest we reuse?  Or
> > are
> > > > you
> > > > > > > imagining that we make our own internal threadpool, and then
> > maybe
> > > > expose
> > > > > > > configuration flags to manipulate it?  For what it's worth, I
> > like
> > > > having
> > > > > > > an internal threadpool (perhaps just FJP.commonpool) and then
> > > > providing
> > > > > > an
> > > > > > > alternative to pass your own threadpool.  That way people who
> > want
> > > > finer
> > > > > > > control can get it, and everyone else can do OK with the default.
> > > > > > >
> > > > > > Indeed that is what I am saying. The most ideal situation is that
> > > > there is
> > > > > > a default internal threadpool that Kafka uses, however users of
> > Kafka
> > > > can
> > > > > > configure there own threadpool. Having a singleton ThreadPool for
> > > > blocking
> > > > > > IO, non blocking IO and CPU bound tasks which can be plugged in
> > all of
> > > > your
> > > > > > libraries (including Kafka) makes resource management much easier
> > to
> > > > do and
> > > > > > also gives control of users to override specific threadpools for
> > > > > > exceptional cases (i.e. providing a Threadpool that is pinned to a
> > > > single
> > > > > > core which tends to give the best latency results if this is
> > something
> > > > that
> > > > > > is critical for you).
> > > > > >
> > > > > > My suggestion was just do this in multiple steps/phases, firstly
> > let's
> > > > fix
> > > > > > the issue of send being misleadingly asynchronous (i.e. internally
> > its
> > > > > > blocking) and then later one we can make the various
> > > > > > threadpools configurable with a sane default.
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, May 19, 2021 at 6:01 AM Matthew de Detrich
> > > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > > >
> > > > > > > > Here are my two cents here (note that I am only seeing this on
> > a
> > > > > > surface
> > > > > > > > level)
> > > > > > > >
> > > > > > > > 1. If we are going this road it makes sense to do this
> > "properly"
> > > > (i.e.
> > > > > > > > using queues  as Ryan suggested). The reason I am saying this
> > is
> > > > that
> > > > > > it
> > > > > > > > seems that the original goal of the KIP is for it to be used in
> > > > other
> > > > > > > > asynchronous systems and from my personal experience, you
> > really do
> > > > > > need
> > > > > > > to
> > > > > > > > make the implementation properly asynchronous otherwise it's
> > > > really not
> > > > > > > > that useful.
> > > > > > > > 2. Due to the previous point and what was said by others, this
> > is
> > > > > > likely
> > > > > > > > going to break some existing semantics (i.e. people are
> > currently
> > > > > > relying
> > > > > > > > on blocking semantics) so adding another method's/interface
> > plus
> > > > > > > > deprecating the older one is more annoying but ideal.
> > > > > > > > 3. Using multiple thread pools is definitely recommended for
> > > > different
> > > > > > > > types of tasks, for serialization which is CPU bound you
> > definitely
> > > > > > would
> > > > > > > > want to use a bounded thread pool that is fixed by the number
> > of
> > > > CPU's
> > > > > > > (or
> > > > > > > > something along those lines).
> > > > > > > >
> > https://gist.github.com/djspiewak/46b543800958cf61af6efa8e072bfd5c
> > > > is
> > > > > > a
> > > > > > > > very good guide on this topic
> > > > > > > > 4. Regarding providing the ability for users to supply their
> > own
> > > > custom
> > > > > > > > ThreadPool this is more of an ergonomics question for the API.
> > > > > > Especially
> > > > > > > > when it gets to monitoring/tracing, giving the ability for
> > users to
> > > > > > > provide
> > > > > > > > their own custom IO/CPU ThreadPools is ideal however as stated
> > > > doing so
> > > > > > > > means a lot of boilerplatery changes to the API. Typically
> > > > speaking a
> > > > > > lot
> > > > > > > > of monitoring/tracing/diagnosing is done on
> > > > > > ExecutionContext/ThreadPools
> > > > > > > > (at least on a more rudimentary level) and hence allowing
> > users to
> > > > > > > supply a
> > > > > > > > global singleton ThreadPool for IO tasks and another for CPU
> > tasks
> > > > > > makes
> > > > > > > > their lives a lot easier. However due to the large amount of
> > > > changes to
> > > > > > > the
> > > > > > > > API, it may be more appropriate to just use internal thread
> > pools
> > > > (for
> > > > > > > now)
> > > > > > > > since at least it's not any worse than what exists currently
> > and
> > > > this
> > > > > > can
> > > > > > > > be an improvement that is done later?
> > > > > > > >
> > > > > > > > On Wed, May 19, 2021 at 2:56 AM Ryanne Dolan <
> > > > ryannedolan@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I was thinking the sender would typically wrap send() in a
> > > > > > > backoff/retry
> > > > > > > > > loop, or else ignore any failures and drop sends on the floor
> > > > > > > > > (fire-and-forget), and in both cases I think failing
> > immediately
> > > > is
> > > > > > > > better
> > > > > > > > > than blocking for a new spot in the queue or asynchronously
> > > > failing
> > > > > > > > > somehow.
> > > > > > > > >
> > > > > > > > > I think a failed future is adequate for the "explicit
> > > > backpressure
> > > > > > > > signal"
> > > > > > > > > while avoiding any blocking anywhere. I think if we try to
> > > > > > > asynchronously
> > > > > > > > > signal the caller of failure (either by asynchronously
> > failing
> > > > the
> > > > > > > future
> > > > > > > > > or invoking a callback off-thread or something) we'd force
> > the
> > > > caller
> > > > > > > to
> > > > > > > > > either block or poll waiting for that signal, which somewhat
> > > > defeats
> > > > > > > the
> > > > > > > > > purpose we're after. And of course blocking for a spot in the
> > > > queue
> > > > > > > > > definitely defeats the purpose (tho perhaps ameliorates the
> > > > problem
> > > > > > > > some).
> > > > > > > > >
> > > > > > > > > Throwing an exception to the caller directly (not via the
> > > > future) is
> > > > > > > > > another option with precedent in Kafka clients, tho it
> > doesn't
> > > > seem
> > > > > > as
> > > > > > > > > ergonomic to me.
> > > > > > > > >
> > > > > > > > > It would be interesting to analyze some existing usage and
> > > > determine
> > > > > > > how
> > > > > > > > > difficult it would be to convert it to the various proposed
> > APIs.
> > > > > > > > >
> > > > > > > > > Ryanne
> > > > > > > > >
> > > > > > > > > On Tue, May 18, 2021, 3:27 PM Nakamura <nn...@gmail.com>
> > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Ryanne,
> > > > > > > > > >
> > > > > > > > > > Hmm, that's an interesting idea.  Basically it would mean
> > that
> > > > > > after
> > > > > > > > > > calling send, you would also have to check whether the
> > returned
> > > > > > > future
> > > > > > > > > had
> > > > > > > > > > failed with a specific exception.  I would be open to it,
> > > > although
> > > > > > I
> > > > > > > > > think
> > > > > > > > > > it might be slightly more surprising, since right now the
> > > > paradigm
> > > > > > is
> > > > > > > > > > "enqueue synchronously, the future represents whether we
> > > > succeeded
> > > > > > in
> > > > > > > > > > sending or not" and the new one would be "enqueue
> > > > synchronously,
> > > > > > the
> > > > > > > > > future
> > > > > > > > > > either represents whether we succeeded in enqueueing or
> > not (in
> > > > > > which
> > > > > > > > > case
> > > > > > > > > > it will be failed immediately if it failed to enqueue) or
> > > > whether
> > > > > > we
> > > > > > > > > > succeeded in sending or not".
> > > > > > > > > >
> > > > > > > > > > But you're right, it should be on the table, thank you for
> > > > > > suggesting
> > > > > > > > it!
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Moses
> > > > > > > > > >
> > > > > > > > > > On Tue, May 18, 2021 at 12:23 PM Ryanne Dolan <
> > > > > > ryannedolan@gmail.com
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Moses, in the case of a full queue, could we just return
> > a
> > > > failed
> > > > > > > > > future
> > > > > > > > > > > immediately?
> > > > > > > > > > >
> > > > > > > > > > > Ryanne
> > > > > > > > > > >
> > > > > > > > > > > On Tue, May 18, 2021, 10:39 AM Nakamura <
> > nnythm@gmail.com>
> > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Alexandre,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for bringing this up, I think I could use some
> > > > feedback
> > > > > > in
> > > > > > > > > this
> > > > > > > > > > > > area.  There are two mechanisms here, one for slowing
> > down
> > > > when
> > > > > > > we
> > > > > > > > > > don't
> > > > > > > > > > > > have the relevant metadata, and the other for slowing
> > down
> > > > > > when a
> > > > > > > > > queue
> > > > > > > > > > > has
> > > > > > > > > > > > filled up.  Although the first one applies backpressure
> > > > > > somewhat
> > > > > > > > > > > > inadvertently, we could still get in trouble if we're
> > not
> > > > > > > providing
> > > > > > > > > > > > information to the mechanism that monitors whether
> > we're
> > > > > > queueing
> > > > > > > > too
> > > > > > > > > > > > much.  As for the second one, that is a classic
> > > > backpressure
> > > > > > use
> > > > > > > > > case,
> > > > > > > > > > so
> > > > > > > > > > > > it's definitely important that we don't drop that
> > ability.
> > > > > > > > > > > >
> > > > > > > > > > > > Right now backpressure is applied by blocking, which
> > is a
> > > > > > natural
> > > > > > > > way
> > > > > > > > > > to
> > > > > > > > > > > > apply backpressure in synchronous systems, but can
> > lead to
> > > > > > > > > unnecessary
> > > > > > > > > > > > slowdowns in asynchronous systems.  In my opinion, the
> > > > safest
> > > > > > way
> > > > > > > > to
> > > > > > > > > > > apply
> > > > > > > > > > > > backpressure in an asynchronous model is to have an
> > > > explicit
> > > > > > > > > > backpressure
> > > > > > > > > > > > signal.  A good example would be returning an
> > exception,
> > > > and
> > > > > > > > > providing
> > > > > > > > > > an
> > > > > > > > > > > > optional hook to add a callback onto so that you can be
> > > > > > notified
> > > > > > > > when
> > > > > > > > > > > it's
> > > > > > > > > > > > ready to accept more messages.
> > > > > > > > > > > >
> > > > > > > > > > > > However, this would be a really big change to how
> > users use
> > > > > > > > > > > > KafkaProducer#send, so I don't know how much appetite
> > we
> > > > have
> > > > > > for
> > > > > > > > > > making
> > > > > > > > > > > > that kind of change.  Maybe it would be simpler to
> > remove
> > > > the
> > > > > > > > "don't
> > > > > > > > > > > block
> > > > > > > > > > > > when the per-topic queue is full" from the scope of
> > this
> > > > KIP,
> > > > > > and
> > > > > > > > > only
> > > > > > > > > > > > focus on when metadata is available?  The downside is
> > that
> > > > we
> > > > > > > will
> > > > > > > > > > > probably
> > > > > > > > > > > > want to change the API again later to fix this, so it
> > > > might be
> > > > > > > > better
> > > > > > > > > > to
> > > > > > > > > > > > just rip the bandaid off now.
> > > > > > > > > > > >
> > > > > > > > > > > > One slightly nasty thing here is that because queueing
> > > > order is
> > > > > > > > > > > important,
> > > > > > > > > > > > if we want to use exceptions, we will want to be able
> > to
> > > > signal
> > > > > > > the
> > > > > > > > > > > failure
> > > > > > > > > > > > to enqueue to the caller in such a way that they can
> > still
> > > > > > > enforce
> > > > > > > > > > > message
> > > > > > > > > > > > order if they want.  So we can't embed the failure
> > > > directly in
> > > > > > > the
> > > > > > > > > > > returned
> > > > > > > > > > > > future, we should either return two futures (nested,
> > or as
> > > > a
> > > > > > > tuple)
> > > > > > > > > or
> > > > > > > > > > > else
> > > > > > > > > > > > throw an exception to explain a backpressure.
> > > > > > > > > > > >
> > > > > > > > > > > > So there are a few things we should work out here:
> > > > > > > > > > > >
> > > > > > > > > > > > 1. Should we keep the "too many bytes enqueued" part of
> > > > this in
> > > > > > > > > scope?
> > > > > > > > > > > (I
> > > > > > > > > > > > would say yes, so that we can minimize churn in this
> > API)
> > > > > > > > > > > > 2. How should we signal backpressure so that it's
> > > > appropriate
> > > > > > for
> > > > > > > > > > > > asynchronous systems?  (I would say that we should
> > throw an
> > > > > > > > > exception.
> > > > > > > > > > > If
> > > > > > > > > > > > we choose this and we want to pursue the queueing
> > path, we
> > > > > > would
> > > > > > > > > *not*
> > > > > > > > > > > want
> > > > > > > > > > > > to enqueue messages that would push us over the limit,
> > and
> > > > > > would
> > > > > > > > only
> > > > > > > > > > > want
> > > > > > > > > > > > to enqueue messages when we're waiting for metadata,
> > and we
> > > > > > would
> > > > > > > > > want
> > > > > > > > > > to
> > > > > > > > > > > > keep track of the total number of bytes for those
> > > > messages).
> > > > > > > > > > > >
> > > > > > > > > > > > What do you think?
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Moses
> > > > > > > > > > > >
> > > > > > > > > > > > On Sun, May 16, 2021 at 6:16 AM Alexandre Dupriez <
> > > > > > > > > > > > alexandre.dupriez@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hello Nakamura,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for proposing this change. I can see how the
> > > > blocking
> > > > > > > > > > behaviour
> > > > > > > > > > > > > can be a problem when integrating with reactive
> > > > frameworks
> > > > > > such
> > > > > > > > as
> > > > > > > > > > > > > Akka. One of the questions I would have is how you
> > would
> > > > > > handle
> > > > > > > > > back
> > > > > > > > > > > > > pressure and avoid memory exhaustion when the
> > producer's
> > > > > > buffer
> > > > > > > > is
> > > > > > > > > > > > > full and tasks would start to accumulate in the
> > > > out-of-band
> > > > > > > queue
> > > > > > > > > or
> > > > > > > > > > > > > thread pool introduced with this KIP.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Alexandre
> > > > > > > > > > > > >
> > > > > > > > > > > > > Le ven. 14 mai 2021 à 15:55, Ryanne Dolan <
> > > > > > > ryannedolan@gmail.com
> > > > > > > > >
> > > > > > > > > a
> > > > > > > > > > > > écrit
> > > > > > > > > > > > > :
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Makes sense!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, May 14, 2021, 9:39 AM Nakamura <
> > > > nnythm@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I see what you're saying about serde blocking,
> > but I
> > > > > > think
> > > > > > > we
> > > > > > > > > > > should
> > > > > > > > > > > > > > > consider it out of scope for this patch.  Right
> > now
> > > > we've
> > > > > > > > > nailed
> > > > > > > > > > > > down a
> > > > > > > > > > > > > > > couple of use cases where we can unambiguously
> > say,
> > > > "I
> > > > > > can
> > > > > > > > make
> > > > > > > > > > > > > progress
> > > > > > > > > > > > > > > now" or "I cannot make progress now", which
> > makes it
> > > > > > > possible
> > > > > > > > > to
> > > > > > > > > > > > > offload to
> > > > > > > > > > > > > > > a different thread only if we are unable to make
> > > > > > progress.
> > > > > > > > > > > Extending
> > > > > > > > > > > > > this
> > > > > > > > > > > > > > > to CPU work like serde would mean always
> > offloading,
> > > > > > which
> > > > > > > > > would
> > > > > > > > > > > be a
> > > > > > > > > > > > > > > really big performance change.  It might be worth
> > > > > > exploring
> > > > > > > > > > anyway,
> > > > > > > > > > > > > but I'd
> > > > > > > > > > > > > > > rather keep this patch focused on improving
> > > > ergonomics,
> > > > > > > > rather
> > > > > > > > > > than
> > > > > > > > > > > > > > > muddying the waters with evaluating performance
> > very
> > > > > > > deeply.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I think if we really do want to support serde or
> > > > > > > interceptors
> > > > > > > > > > that
> > > > > > > > > > > do
> > > > > > > > > > > > > IO on
> > > > > > > > > > > > > > > the send path (which seems like an anti-pattern
> > to
> > > > me),
> > > > > > we
> > > > > > > > > should
> > > > > > > > > > > > > consider
> > > > > > > > > > > > > > > making that a separate SIP, and probably also
> > > > consider
> > > > > > > > changing
> > > > > > > > > > the
> > > > > > > > > > > > > API to
> > > > > > > > > > > > > > > use Futures (or CompletionStages).  But I would
> > > > rather
> > > > > > > avoid
> > > > > > > > > > scope
> > > > > > > > > > > > > creep,
> > > > > > > > > > > > > > > so that we have a better chance of fixing this
> > part
> > > > of
> > > > > > the
> > > > > > > > > > problem.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Yes, I think some exceptions will move to being
> > async
> > > > > > > instead
> > > > > > > > > of
> > > > > > > > > > > > sync.
> > > > > > > > > > > > > > > They'll still be surfaced in the Future, so I'm
> > not
> > > > so
> > > > > > > > > confident
> > > > > > > > > > > that
> > > > > > > > > > > > > it
> > > > > > > > > > > > > > > would be that big a shock to users though.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Thu, May 13, 2021 at 7:44 PM Ryanne Dolan <
> > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > re serialization, my concern is that
> > serialization
> > > > > > often
> > > > > > > > > > accounts
> > > > > > > > > > > > > for a
> > > > > > > > > > > > > > > lot
> > > > > > > > > > > > > > > > of the cycles spent before returning the
> > future.
> > > > It's
> > > > > > not
> > > > > > > > > > > blocking
> > > > > > > > > > > > > per
> > > > > > > > > > > > > > > se,
> > > > > > > > > > > > > > > > but it's the same effect from the caller's
> > > > perspective.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Moreover, serde impls often block themselves,
> > e.g.
> > > > when
> > > > > > > > > > fetching
> > > > > > > > > > > > > schemas
> > > > > > > > > > > > > > > > from a registry. I suppose it's also possible
> > to
> > > > block
> > > > > > in
> > > > > > > > > > > > > Interceptors
> > > > > > > > > > > > > > > > (e.g. writing audit events or metrics), which
> > > > happens
> > > > > > > > before
> > > > > > > > > > > serdes
> > > > > > > > > > > > > iiuc.
> > > > > > > > > > > > > > > > So any blocking in either of those plugins
> > would
> > > > block
> > > > > > > the
> > > > > > > > > send
> > > > > > > > > > > > > unless we
> > > > > > > > > > > > > > > > queue first.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > So I think we want to queue first and do
> > everything
> > > > > > > > > off-thread
> > > > > > > > > > > when
> > > > > > > > > > > > > using
> > > > > > > > > > > > > > > > the new API, whatever that looks like. I just
> > want
> > > > to
> > > > > > > make
> > > > > > > > > sure
> > > > > > > > > > > we
> > > > > > > > > > > > > don't
> > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > that for clients that wouldn't expect it.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Another consideration is exception handling.
> > If we
> > > > > > queue
> > > > > > > > > right
> > > > > > > > > > > > away,
> > > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > > defer some exceptions that currently are
> > thrown to
> > > > the
> > > > > > > > caller
> > > > > > > > > > > > > (before the
> > > > > > > > > > > > > > > > future is returned). In the new API, the send()
> > > > > > wouldn't
> > > > > > > > > throw
> > > > > > > > > > > any
> > > > > > > > > > > > > > > > exceptions, and instead the future would fail.
> > I
> > > > think
> > > > > > > that
> > > > > > > > > > might
> > > > > > > > > > > > > mean
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > a new method signature is required.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, May 13, 2021, 2:57 PM Nakamura <
> > > > > > > > > > nakamura.moses@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hey Ryanne,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I agree we should add an additional
> > constructor
> > > > (or
> > > > > > > else
> > > > > > > > an
> > > > > > > > > > > > > additional
> > > > > > > > > > > > > > > > > overload in KafkaProducer#send, but the new
> > > > > > constructor
> > > > > > > > > would
> > > > > > > > > > > be
> > > > > > > > > > > > > easier
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > understand) if we're targeting the "user
> > > > provides the
> > > > > > > > > thread"
> > > > > > > > > > > > > approach.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > From looking at the code, I think we can keep
> > > > record
> > > > > > > > > > > > serialization
> > > > > > > > > > > > > on
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > user thread, if we consider that an important
> > > > part of
> > > > > > > the
> > > > > > > > > > > > > semantics of
> > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > method.  It doesn't seem like serialization
> > > > depends
> > > > > > on
> > > > > > > > > > knowing
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > cluster,
> > > > > > > > > > > > > > > > > I think it's incidental that it comes after
> > the
> > > > first
> > > > > > > > > > > "blocking"
> > > > > > > > > > > > > > > segment
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > the method.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Thu, May 13, 2021 at 2:38 PM Ryanne Dolan
> > <
> > > > > > > > > > > > > ryannedolan@gmail.com>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hey Moses, I like the direction here. My
> > > > thinking
> > > > > > is
> > > > > > > > > that a
> > > > > > > > > > > > > single
> > > > > > > > > > > > > > > > > > additional work queue, s.t. send() can
> > enqueue
> > > > and
> > > > > > > > > return,
> > > > > > > > > > > > seems
> > > > > > > > > > > > > like
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > lightest touch. However, I don't think we
> > can
> > > > > > > trivially
> > > > > > > > > > > process
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > queue
> > > > > > > > > > > > > > > > > > in an internal thread pool without subtly
> > > > changing
> > > > > > > > > behavior
> > > > > > > > > > > for
> > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > > > > For example, users will often run send() in
> > > > > > multiple
> > > > > > > > > > threads
> > > > > > > > > > > in
> > > > > > > > > > > > > order
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > serialize faster, but that wouldn't work
> > quite
> > > > the
> > > > > > > same
> > > > > > > > > if
> > > > > > > > > > > > there
> > > > > > > > > > > > > were
> > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > internal thread pool.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > For this reason I'm thinking we need to
> > make
> > > > sure
> > > > > > any
> > > > > > > > > such
> > > > > > > > > > > > > changes
> > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > opt-in. Maybe a new constructor with an
> > > > additional
> > > > > > > > > > > > ThreadFactory
> > > > > > > > > > > > > > > > > parameter.
> > > > > > > > > > > > > > > > > > That would at least clearly indicate that
> > work
> > > > will
> > > > > > > > > happen
> > > > > > > > > > > > > > > off-thread,
> > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > would require opt-in for the new behavior.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Under the hood, this ThreadFactory could be
> > > > used to
> > > > > > > > > create
> > > > > > > > > > > the
> > > > > > > > > > > > > worker
> > > > > > > > > > > > > > > > > > thread that process queued sends, which
> > could
> > > > > > fan-out
> > > > > > > > to
> > > > > > > > > > > > > > > per-partition
> > > > > > > > > > > > > > > > > > threads from there.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > So then you'd have two ways to send: the
> > > > existing
> > > > > > > way,
> > > > > > > > > > where
> > > > > > > > > > > > > serde
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > interceptors and whatnot are executed on
> > the
> > > > > > calling
> > > > > > > > > > thread,
> > > > > > > > > > > > and
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > > way, which returns right away and uses an
> > > > internal
> > > > > > > > > > Executor.
> > > > > > > > > > > As
> > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > point
> > > > > > > > > > > > > > > > > > out, the semantics would be identical in
> > either
> > > > > > case,
> > > > > > > > and
> > > > > > > > > > it
> > > > > > > > > > > > > would be
> > > > > > > > > > > > > > > > > very
> > > > > > > > > > > > > > > > > > easy for clients to switch.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Ryanne
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Thu, May 13, 2021, 9:00 AM Nakamura <
> > > > > > > > nnythm@gmail.com
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hey Folks,
> > > > > > > > > > > > > > > > > > > I just posted a new proposal
> > > > > > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=181306446
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > in the wiki.  I think we have an
> > opportunity
> > > > to
> > > > > > > > improve
> > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > KafkaProducer#send user experience.  It
> > would
> > > > > > > > certainly
> > > > > > > > > > > make
> > > > > > > > > > > > > our
> > > > > > > > > > > > > > > > lives
> > > > > > > > > > > > > > > > > > > easier.  Please take a look!  There are
> > two
> > > > > > > > subproblems
> > > > > > > > > > > that
> > > > > > > > > > > > I
> > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > guidance on, so I would appreciate
> > feedback
> > > > on
> > > > > > both
> > > > > > > > of
> > > > > > > > > > > them.
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > Moses
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Matthew de Detrich
> > > > > > > >
> > > > > > > > *Aiven Deutschland GmbH*
> > > > > > > >
> > > > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > > > >
> > > > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > > > >
> > > > > > > > *m:* +491603708037
> > > > > > > >
> > > > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Matthew de Detrich
> > > > > >
> > > > > > *Aiven Deutschland GmbH*
> > > > > >
> > > > > > Immanuelkirchstraße 26, 10405 Berlin
> > > > > >
> > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > >
> > > > > > *m:* +491603708037
> > > > > >
> > > > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > > > >
> > > > >
> > > >
> > >
> >
>