Posted to dev@kafka.apache.org by Vahid Hashemian <va...@gmail.com> on 2021/02/10 20:58:15 UTC

Re: [DISCUSS] KIP-712: Shallow Mirroring

Retitled the thread to conform to the common format.

On Fri, Feb 5, 2021 at 4:00 PM Ning Zhang <ni...@gmail.com> wrote:

> Hello Henry,
>
> This is a very interesting proposal.
> https://issues.apache.org/jira/browse/KAFKA-10728 reflects a similar
> concern about re-compressing data in mirror maker.
>
> One thing that may need clarifying is how "shallow" mirroring can be
> limited to the mirror maker use case if the changes have to be made in the
> generic consumer and producer (e.g. by adding `fetch.raw.bytes` and
> `send.raw.bytes` to the consumer and producer configs).
>
> On 2021/02/05 00:59:57, Henry Cai <hc...@pinterest.com.INVALID> wrote:
> > Dear Community members,
> >
> > We are proposing a new feature to improve the performance of Kafka mirror
> > maker:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-712%3A+Shallow+Mirroring
> >
> > The current Kafka MirrorMaker process (with the underlying Consumer and
> > Producer library) uses significant CPU cycles and memory to
> > decompress/recompress and deserialize/re-serialize messages, and copies
> > message bytes multiple times along the mirroring/replication stages.
> >
> > The KIP proposes a *shallow mirror* feature which brings back the shallow
> > iterator concept to the mirror process and also proposes to skip the
> > unnecessary message decompression and recompression steps.  We argue that
> > in many cases users just want a simple replication pipeline that replicates
> > messages as-is from the source cluster to the destination cluster.  The
> > messages in the source cluster are often already compressed and properly
> > batched; users just need an identical copy of the message bytes through the
> > mirror, without any transformation or repartitioning.
> >
> > We have a prototype implementation in house with MirrorMaker v1 and
> > observed *CPU usage dropped from 50% to 15%* for some mirror pipelines.
> >
> > We name this feature *shallow mirroring* since it has some resemblance to
> > the old Kafka 0.7 namesake feature, but the implementations are not quite
> > the same.  ‘*Shallow*’ means: 1. we *shallowly* iterate over RecordBatches
> > inside the MemoryRecords structure instead of deeply iterating over records
> > inside each RecordBatch; 2. we *shallowly* copy (share) pointers inside a
> > ByteBuffer instead of deep copying and deserializing bytes into objects.
> >
> > Please share discussions/feedback along this email thread.
> >
>


-- 

Thanks!
--Vahid
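
To make the "shallow" idea in the quoted announcement concrete, here is a
minimal sketch in terms of Kafka's internal record classes
(org.apache.kafka.common.record). These are internal APIs, and the sketch is
purely illustrative; it is not the KIP's implementation.

```java
import java.nio.ByteBuffer;
import org.apache.kafka.common.record.MemoryRecords;
import org.apache.kafka.common.record.Record;
import org.apache.kafka.common.record.RecordBatch;

// Illustrative only: shallow vs. deep iteration over a fetched buffer.
public class ShallowIterationSketch {

    // Shallow: walk RecordBatch headers only; no decompression and no
    // per-record deserialization, so each batch can be forwarded as-is.
    static long shallow(MemoryRecords records) {
        long bytes = 0;
        for (RecordBatch batch : records.batches()) {
            bytes += batch.sizeInBytes();
        }
        return bytes;
    }

    // Deep (what the mirror path effectively does today): records()
    // decompresses each batch and materializes every record.
    static long deep(MemoryRecords records) {
        long count = 0;
        for (Record record : records.records()) {
            ByteBuffer key = record.key();     // per-record access
            ByteBuffer value = record.value();
            count++;
        }
        return count;
    }
}
```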

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Henry Cai <hc...@pinterest.com.INVALID>.
Ismael,

On the network performance side, the issue is throughput.  For networking
purposes, you gain throughput by combining data into bigger batches;
otherwise you pay higher overhead in network handshaking/round-trip delay
and waste space in the underlying network packet buffers.  MirrorMaker
fetches data inbound through FetchResponses from the source broker, which
can return data in megabytes (comprising multiple batches in the same
response); however, the outbound ProduceRequests to the target broker are in
the kilobyte batch-size range if we don't optimize.  (The reason those
batches are in kilobytes is that when the application client originally
produced to the first broker, it tended to select smaller batch sizes to
achieve low latency.)  The outbound throughput is not going to be able to
match the inbound throughput.

To restore parity between the inbound FetchResponse (which allows packing
multiple batches) and the outbound ProduceRequest, we propose to restore
ProduceRequest's multiple-batch packing capability.  I think KIP-98 took a
shortcut by removing that capability to save the extra work needed to
support transactions across multiple batches.

For our use case, we view MM as a mere replication pipe that gets data from
one data center into another; the remote broker is almost like a follower of
the broker in the source cluster.  Broker-to-broker replication uses
FetchRequest/FetchResponse, which does pack multiple batches to achieve
optimal network throughput.  On the broker code path, FetchResponse and
ProduceRequest go through the same handling code, and the broker appends the
MemoryRecords (which contain multiple batches) to the same log segment file
as one unit.  So it's not particularly hard to restore the multiple-batch
packing feature for ProduceRequest.
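
To make the batching argument concrete, here is a hypothetical sketch in
plain Java (no Kafka APIs); the class name and the 1 MB threshold are
assumptions for illustration, not part of the KIP:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical: small raw source batches are collected until they add up
// to roughly one FetchResponse's worth, then shipped as one multi-batch
// ProduceRequest payload.
public class MultiBatchAccumulator {
    private static final int TARGET_REQUEST_BYTES = 1 << 20; // ~1 MB

    private final List<ByteBuffer> pending = new ArrayList<>();
    private int pendingBytes = 0;

    // Returns a full request's worth of raw batches once the target size
    // is reached, otherwise null.
    public List<ByteBuffer> add(ByteBuffer rawBatch) {
        pending.add(rawBatch);
        pendingBytes += rawBatch.remaining();
        if (pendingBytes < TARGET_REQUEST_BYTES) {
            return null;
        }
        List<ByteBuffer> request = new ArrayList<>(pending);
        pending.clear();
        pendingBytes = 0;
        return request;
    }
}
```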



On Tue, Mar 30, 2021 at 12:34 AM Ismael Juma <is...@juma.me.uk> wrote:

> Hi Henry,
>
> Can you clarify why this "network performance" issue is only related to
> shallow mirroring? Generally, we want the protocol to be generic and not
> have a number of special cases. The more special cases you have, the
> tougher it becomes to test all the edge cases.
>
> Ismael
>
> On Mon, Mar 29, 2021 at 9:51 PM Henry Cai <hc...@pinterest.com.invalid>
> wrote:
>
> > It's interesting this VOTE thread finally becomes a DISCUSS thread.
> >
> > For the MM2 concern, I will take a look to see whether I can add support
> > for MM2.
> >
> > For Ismael's concern on multiple batches in the ProduceRequest (conflicting
> > with KIP-98), here is my take:
> >
> > 1. We do need to group multiple batches in the same request, otherwise
> > network performance will suffer.
> > 2. For the concern on transactional message support as in KIP-98: since MM1
> > and MM2 currently don't support transactional messages, KIP-712 will not
> > attempt to support transactions either.  I will add a producer config
> > option, allowMultipleBatches.  By default this option will be off, and the
> > user needs to explicitly turn it on to use the shallow mirror feature.  If
> > we detect that both this option and transactions are turned on, we will
> > throw an exception to protect current transaction processing.
> > 3. In the future, when MM2 starts to support exactly-once and transactional
> > messages (is that KIP-656?), we can revisit this code.  The current
> > transactional message format already makes the compromise that the messages
> > in the same RecordBatch (MessageSet) share the same
> > sequence-id/transaction-id, so those messages need to be committed
> > together.  I think when we support the shallow mirror with transactional
> > semantics, we will group all batches in the same ProduceRequest within the
> > same transaction boundary, so they are committed together.  On the broker
> > side, all batches coming from a ProduceRequest (or FetchResponse) are
> > committed to the same log segment file as one unit (current behavior).
> >
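
A minimal sketch of the guard described in point 2 above: allowMultipleBatches
is the option proposed in the thread, while the validation code itself is an
assumption of this sketch.

```java
import java.util.Map;
import org.apache.kafka.common.config.ConfigException;

// Hypothetical guard: the proposed allowMultipleBatches option must not
// be combined with transactions, per point 2 above.
final class MultiBatchConfigGuard {
    static void validate(Map<String, Object> producerConfigs) {
        boolean multiBatch = Boolean.parseBoolean(String.valueOf(
                producerConfigs.getOrDefault("allowMultipleBatches", "false")));
        boolean transactional = producerConfigs.get("transactional.id") != null;
        if (multiBatch && transactional) {
            throw new ConfigException(
                    "allowMultipleBatches cannot be enabled together with transactional.id");
        }
    }
}
```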
> > On Mon, Mar 29, 2021 at 8:46 AM Ryanne Dolan <ry...@gmail.com>
> > wrote:
> >
> > > Ah, I see, thanks Ismael. Now I understand your concern.
> > >
> > > From KIP-98, re this change in v3:
> > >
> > > "This allows us to remove the message set size since each message set
> > > already contains a field for the size. More importantly, since there is
> > > only one message set to be written to the log, partial produce failures
> > are
> > > no longer possible. The full message set is either successfully written
> > to
> > > the log (and replicated) or it is not."
> > >
> > > The schema and size field don't seem to be an issue, as KIP-712 already
> > > addresses.
> > >
> > > The partial produce failure issue is something I don't understand. I
> > can't
> > > tell if this was done out of convenience at the time or if there is
> > > something incompatible with partial produce success/failure and EOS.
> Does
> > > anyone know?
> > >
> > > Ryanne
> > >
> > > On Mon, Mar 29, 2021, 1:41 AM Ismael Juma <is...@juma.me.uk> wrote:
> > >
> > > > Ryanne,
> > > >
> > > > You misunderstood the referenced comment. It is about the produce
> > > > request change to have multiple batches:
> > > >
> > > > "Up to ProduceRequest V2, a ProduceRequest can contain multiple batches
> > > > of messages stored in the record_set field, but this was disabled in V3.
> > > > We are proposing to bring the multiple batches feature back to improve
> > > > the network throughput of the mirror maker producer when the original
> > > > batch size from source broker is too small."
> > > >
> > > > This is unrelated to shallow iteration.
> > > >
> > > > Ismael
> > > >
> > > > On Sun, Mar 28, 2021, 10:15 PM Ryanne Dolan <ry...@gmail.com>
> > > wrote:
> > > >
> > > > > Ismael, I don't think KIP-98 is related. Shallow iteration was
> > > > > removed in KAFKA-732, which predates KIP-98 by a few years.
> > > > >
> > > > > Ryanne
> > > > >
> > > > > On Sun, Mar 28, 2021, 11:25 PM Ismael Juma <is...@juma.me.uk>
> > wrote:
> > > > >
> > > > > > Thanks for the KIP. I have a few high level comments:
> > > > > >
> > > > > > 1. Like Tom, I'm not convinced about the proposal to make this
> > > > > > change to MirrorMaker 1 if we intend to deprecate it and remove it.
> > > > > > I would rather us focus our efforts on the implementation we intend
> > > > > > to support going forward.
> > > > > > 2. The producer/consumer configs seem pretty dangerous for general
> > > > > > usage, but the KIP doesn't address the potential downsides.
> > > > > > 3. How does the ProducerRequest change impact exactly-once (if at
> > > > > > all)? The change we are reverting was done as part of KIP-98. Have we
> > > > > > considered the original reasons for the change?
> > > > > >
> > > > > > Thanks,
> > > > > > Ismael
> > > > > >

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Ryanne Dolan <ry...@gmail.com>.
Henry, a suggestion: instead of introducing a new configuration property
which enables the proposed behavior in the existing clients, we could
expose this behavior only through package-private classes
(RecordBatchConsumer/Producer or something) which wrap or extend the
existing clients. In other words, the KIP remains as proposed except that
client code must instantiate a different class.

Then, of course, MM could use these internal clients without exposing a new
public API. This would allay many of the concerns raised so far, re
exposing unsafe APIs, weird generics, etc., and would require no significant
changes to Connect or MM.

I'm not familiar enough with the EOS implementation to know, but I hope
there is a way forward there. Very often the sort of topic that gets
replicated is not transactional anyway.

Ryanne
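
A minimal sketch of what such a wrapper could look like; RecordBatchConsumer
is the name floated above, while ConsumerBatch and the pollBatch mechanics
are assumptions of this sketch.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Placeholder for the opaque batch type (sketched in more detail later in
// the thread).
final class ConsumerBatch { }

// Package-private on purpose: no new public client API. Only MM (in the
// same package) would see batch-level polling.
class RecordBatchConsumer {
    private final KafkaConsumer<byte[], byte[]> inner;

    RecordBatchConsumer(Properties configs) {
        this.inner = new KafkaConsumer<>(configs);
    }

    // Would reach into the fetcher internals to surface raw RecordBatch
    // buffers instead of deserialized records.
    List<ConsumerBatch> pollBatch(Duration timeout) {
        throw new UnsupportedOperationException("illustrative sketch");
    }
}
```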

On Thu, Apr 1, 2021, 3:06 AM Henry Cai <hc...@pinterest.com.invalid> wrote:

> Jun,
>
> Thanks for your insight looking into this KIP; we do believe the shallow
> iteration will give quite a significant performance boost.
>
> On your concerns:
>
> 1. Cleaner API.  One alternative is to create new batch APIs.  On the
> consumer side, Consumer.pollBatch would return a ConsumerBatch object which
> contains topic/partition/firstOffsetOfBatch/pointerToByteBuffer; similarly,
> the producer side gets Producer.sendBatch(ProducerBatch).  Both
> ConsumerBatch and ProducerBatch are fixed types (no generics), the
> serializer is gone, and interceptors are probably not needed initially
> (unless people see the need to intercept at the batch level).  On the MM2
> side, the current flow is ConsumerRecord -> Connect's SourceRecord ->
> ProducerRecord; we would need to enhance the Connect framework to add a
> SourceTask.pollBatch() method which returns a SourceBatch object, so the
> object conversion flow becomes ConsumerBatch -> SourceBatch ->
> ProducerBatch.  We probably won't support any transformers on Batch
> objects.
> 2. PID/ProducerEpoch/SeqNo passing through RecordBatch.  I think those
> transaction fields are only meaningful in the original source Kafka
> cluster; the producer id/seqNo are not the same for the target Kafka
> cluster.  So if MM is not going to support transactions at the moment, we
> can clear those fields when they pass through MM.  Once MM starts to
> support transactions in the future, it will probably start its own
> PID/SeqNo etc.
> 3. For EOS and read_committed/read_uncommitted support, we can do phased
> support.  Phase 1: don't support transactional messages in the source
> cluster (i.e. abort if we see control batch records).  Phase 2: apply
> commit/abort at the batch boundary level.  I am not too familiar with the
> isolation level and abort-transaction code paths, but it seems the control
> unit is currently the batch boundary (commit/abort the whole batch); if
> so, this should also be doable.
> 4. MessageHandler in MM1 or SMT in MM2: initially we don't need to support
> them, since the object is now a ConsumerBatch and the existing handlers are
> written for individual records.  Deserializing the batch into individual
> objects would defeat the purpose of the performance optimization.
> 5. Multiple-batch performance: will do some testing on this.
>
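
A minimal sketch of the ConsumerBatch shape described in point 1 above; the
field names follow the email, everything else is an assumption of this sketch.

```java
import java.nio.ByteBuffer;

// Sketch: a fixed (non-generic) batch type, as point 1 above describes.
public final class ConsumerBatch {
    private final String topic;
    private final int partition;
    private final long firstOffsetOfBatch;
    private final ByteBuffer rawBatch; // shared pointer, not a deep copy

    // Package-private: only the consumer internals create instances.
    ConsumerBatch(String topic, int partition, long firstOffsetOfBatch,
                  ByteBuffer rawBatch) {
        this.topic = topic;
        this.partition = partition;
        this.firstOffsetOfBatch = firstOffsetOfBatch;
        this.rawBatch = rawBatch;
    }

    public String topic() { return topic; }
    public int partition() { return partition; }
    public long firstOffsetOfBatch() { return firstOffsetOfBatch; }

    // A read-only view keeps callers from mutating the shared buffer.
    public ByteBuffer payload() { return rawBatch.asReadOnlyBuffer(); }
}
```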
> On Wed, Mar 31, 2021 at 10:14 AM Jun Rao <ju...@confluent.io.invalid> wrote:
>
> > Hi, Henry,
> >
> > Thanks for the KIP. Sorry for the late reply. A few comments below.
> >
> > 1. The 'shallow' feature is potentially useful. I do agree with Tom that
> > the proposed API changes seem unclean. Quite a few existing features don't
> > really work together with this (e.g., generics, serializers, interceptors,
> > configs like max.poll.records, etc.). It's also hard to explain this change
> > to the common users of the consumer/producer API. I think it would be
> > useful to explore whether there is another, cleaner way of adding this. For
> > example, you mentioned that creating a new set of APIs doesn't work for
> > MM2. However, we could potentially change the Connect interface to allow
> > MM2 to use the new API. If this doesn't work, it would be useful to explain
> > that in the rejected alternatives section.
> >
> > 2. I am not sure that we could pass through all fields in RecordBatch. For
> > example, a MM instance could be receiving RecordBatches from different
> > source partitions. Mixing the PID/ProducerEpoch/FirstSequence fields across
> > them in a single producer will be weird. So, it would be useful to document
> > this part more clearly.
> >
> > 3. EOS. While MM itself doesn't support mirroring data in an
> > exactly-once way, it needs to support reading from a topic with EOS data.
> > So, it would be useful to document whether both read_committed and
> > read_uncommitted modes are supported and what kind of RecordBatch the
> > consumer returns in each case.
> >
> > 4. With the 'shallow' feature, it seems that some existing features in MM
> > won't work. For example, I am not sure if SMT works in MM2
> > and MirrorMakerMessageHandler works in MM1. It would be useful to document
> > this kind of impact in the KIP.
> >
> > 5. Multiple batches per partition in the produce request. This seems not
> > strictly required in KIP-98. However, changing this will probably add a bit
> > more complexity in the producer. So, it would be useful to understand its
> > benefits, especially since it doesn't seem to directly help reduce the CPU
> > cost in MM. For example, do you have performance numbers with and without
> > this enabled in your MM tests?
> >
> > Thanks,
> >
> > Jun
> >
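
For reference on point 3: the isolation level is an existing consumer config,
so a mirror consumer reading EOS data would run in one of these two modes. The
snippet below shows only that existing setting; how a shallow consumer would
surface control batches is the open question Jun raises.

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class IsolationLevelExample {
    public static Properties mirrorConsumerProps() {
        Properties props = new Properties();
        // "read_committed" filters out aborted transactional data;
        // "read_uncommitted" (the default) returns everything.
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
        return props;
    }
}
```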
> > On Tue, Mar 30, 2021 at 1:27 PM Henry Cai <hc...@pinterest.com.invalid>
> > wrote:
> >
> > > Tom,
> > >
> > > Thanks for your comments.  Yes, it's a bit clumsy to use the existing
> > > consumer and producer APIs to carry the underlying record batch, but
> > > creating a new set of APIs would also mean other use cases (e.g. MM2)
> > > wouldn't be able to use the feature easily.  We can throw exceptions if
> > > we see clients setting serializer/compression options in the consumer
> > > config.
> > >
> > > The consumer is essentially getting back a collection of
> > > RecordBatchByteBuffer records and passing them to the producer.  Most of
> > > the internal APIs inside the consumer and producer code paths actually
> > > take ByteBuffer as the argument, so it's not too much work to get the
> > > byte buffer through.
> > >
> > > As for the worry that the client might see inside that byte buffer, we
> > > can create a RecordBatchByteBufferRecord class to wrap the underlying
> > > byte buffer, so hopefully they will not drill too deep into that object.
> > > Java's ByteBuffer does have an asReadOnlyBuffer() method that returns a
> > > read-only buffer, which can be explored as well.
> > >
> > > On Tue, Mar 30, 2021 at 4:24 AM Tom Bentley <tb...@redhat.com>
> wrote:
> > >
> > > > Hi Henry and Ryanne,
> > > >
> > > > Related to Ismael's point about the producer & consumer configs being
> > > > dangerous, I can see two parts to this:
> > > >
> > > > 2a. Both the proposed configs seem to be fundamentally incompatible
> > > > with the Producer's existing key.serializer, value.serializer and
> > > > compression.type configs, and likewise with the consumer's
> > > > key.deserializer and value.deserializer. I don't see a way to avoid
> > > > this, since those existing configs are already separate things. (I did
> > > > consider whether special-case Deserializer and Serializer
> > > > implementations could be used instead, but that doesn't work nicely; in
> > > > this use case they're necessarily all configured together). I think all
> > > > we could do would be to reject configs which tried to set those
> > > > existing client configs in conjunction with fetch.raw.bytes and
> > > > send.raw.bytes.
> > > >
> > > > 2b. That still leaves a public Java API which would allow access to the
> > > > raw byte buffers. AFAICS we don't actually need user code to have
> > > > access to the raw buffers. It would be enough to get an opaque object
> > > > that wrapped the ByteBuffer from the consumer and pass it to the
> > > > producer. It's only the consumer and producer code which needs to be
> > > > able to obtain the wrapped buffer.
> > > >
> > > > Kind regards,
> > > >
> > > > Tom
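
A minimal sketch of the config rejection Tom describes in 2a; the
fetch.raw.bytes/send.raw.bytes names come from the KIP discussion, and the
check itself is an assumption of this sketch.

```java
import java.util.Map;
import org.apache.kafka.common.config.ConfigException;

// Hypothetical: raw-bytes mode is incompatible with the existing
// (de)serializer and compression configs, so reject any combination
// where both are explicitly set.
final class RawBytesConfigCheck {
    private static final String[] INCOMPATIBLE = {
            "key.serializer", "value.serializer", "compression.type",
            "key.deserializer", "value.deserializer",
    };

    static void check(Map<String, ?> configs) {
        boolean raw =
                Boolean.parseBoolean(String.valueOf(configs.get("fetch.raw.bytes")))
             || Boolean.parseBoolean(String.valueOf(configs.get("send.raw.bytes")));
        if (!raw) {
            return;
        }
        for (String key : INCOMPATIBLE) {
            if (configs.containsKey(key)) {
                throw new ConfigException(
                        key + " cannot be set together with raw-bytes mode");
            }
        }
    }
}
```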

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Jun Rao <ju...@confluent.io.INVALID>.
Hi, Henry,

Thanks for the reply.

3. Regarding EOS, supporting read_committed makes sense.

1. Regarding re-implementing 30 methods. Could we structure the code such
that the implementation of most of the methods could be shared?

Thanks,

Jun
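
One possible shape for the sharing Jun asks about, sketched under the
assumption of a delegating base class; all names besides KafkaConsumer are
made up for illustration.

```java
import java.time.Duration;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

// Placeholder for the opaque batch type discussed below.
final class ConsumerBatchRecord { }

// Hypothetical: the ~30 shared Consumer methods live once, as simple
// forwards to a wrapped KafkaConsumer.
abstract class AbstractDelegatingConsumer {
    protected final KafkaConsumer<byte[], byte[]> delegate;

    AbstractDelegatingConsumer(Properties configs) {
        this.delegate = new KafkaConsumer<>(configs);
    }

    // Two of the shared methods; the rest forward the same way.
    public Set<TopicPartition> assignment() { return delegate.assignment(); }
    public void close() { delegate.close(); }
}

final class KafkaBatchConsumer extends AbstractDelegatingConsumer {
    KafkaBatchConsumer(Properties configs) { super(configs); }

    // The only genuinely new surface.
    ConsumerBatchRecord pollBatch(Duration timeout) {
        throw new UnsupportedOperationException("illustrative sketch");
    }
}
```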

On Thu, Apr 1, 2021 at 11:18 PM Henry Cai <hc...@pinterest.com.invalid>
wrote:

> Tom,
>
> I don't think we need to refactor the Consumer/Producer class hierarchy for
> this KIP.  If we introduce a separate class for BatchConsumer, there are
> more than 30 methods in the Consumer interface that this new class needs to
> implement, and the implementations would be exactly the same as
> KafkaConsumer's.
>
> If we just introduce a new method 'ConsumerBatchRecord pollBatch(Duration)'
> on the Consumer interface, there is no need to make this method handle
> generic types, since the return type is a concrete class.  Your worry that
> the user could later tweak the content and send a ProducerRecord<NewFoo,
> NewBar> can be avoided if we make the ConsumerBatchRecord class final,
> define only one method, 'toProducerBatchRecord()', on it, and make the
> constructor of ProducerBatchRecord package-private as well, so people
> cannot create a ProducerBatchRecord with arbitrary content.
>
> The other design is to have just one BatchRecord class (instead of two
> classes, ConsumerBatchRecord and ProducerBatchRecord);
> Consumer.pollBatch(Duration) returns a BatchRecord, and
> Producer.sendBatch(BatchRecord) sends the batch.  The BatchRecord class is
> final and its constructor hidden, so it's unmodifiable by user code.
>
> On Thu, Apr 1, 2021 at 6:27 AM Tom Bentley <tb...@redhat.com> wrote:
>
> > Hi Henry, Jun and Ismael,
> >
> > A few things make me wonder if building this into the existing Producer
> > and Consumer APIs is really the right thing to do:
> >
> > 1. Type safety. The existing Producer and Consumer are both generic in K
> > and V, but those type parameters are meaningless in the batch case. For
> > example, the apparent type safety of a Producer<Foo, Bar> would be
> > violated by using the batch method to actually send a <Baz, Quux>. Another
> > example: what happens if I pass a producer configured for records to
> > someone that requires one configured for batches (and vice versa)?
> >
> > 2. The existing Producer and Consumer would both accept a number of
> > configs which didn't apply in the batch case.
> >
> > In the discussion for KIP-706, Jason was imagining a more abstracted set
> > of client APIs which separated the data from the topic destination/origin,
> > and he mentioned basically this exact use case. This got me thinking, and
> > although I don't want to derail this conversation, I thought I'd sketch
> > what I came up with.
> >
> >
> > On the Consumer side:
> >
> > ```java
> > // Sketch only: imports (java.time.Duration, the existing client types)
> > // and method bodies are elided.
> >
> > // Abstraction over where messages come from
> > interface ReceiveSource {}
> > class TopicPartition implements SendTarget, ReceiveSource {}
> > class Topic implements SendTarget, ReceiveSource {}
> > class TopicId implements SendTarget, ReceiveSource {}
> > class TopicPattern implements ReceiveSource {}
> >
> > // New abstraction for consumer-like things
> > interface Receiver<X> {
> >   void assign(ReceiveSource source);
> >   void subscribe(ReceiveSource source);
> >   // etc
> >   X poll(Duration timeout);
> > }
> >
> > // Consumer doesn't change, except for the extends clause
> > interface Consumer<K, V> extends Receiver<ConsumerRecords<K, V>> {
> >   void assign(ReceiveSource source);
> >   void subscribe(ReceiveSource source);
> >   ConsumerRecords<K, V> poll(Duration timeout);
> > }
> >
> > // KafkaConsumer doesn't change at all
> > class KafkaConsumer<K, V> implements Consumer<K, V> {
> > }
> >
> > // Specialise Receiver for batch-based consumption.
> > interface BatchConsumer extends Receiver<ConsumerBatch> {
> > }
> >
> > // Implementation
> > class KafkaBatchConsumer implements BatchConsumer {
> > }
> >
> > interface ConsumerBatch {
> >   // For KIP-712, a way to convert batches without exposing low-level
> >   // details like ByteBuffer
> >   ProducerBatchPayload toProducerBatch();
> > }
> > ```
> >
> > On the producer side:
> > ```java
> > // Abstraction over targets (see the ReceiveSource for the impls)
> > interface SendTarget {}
> >
> > // Abstraction over data that can be sent to a target
> > interface Payload {}
> > class ProducerRecordPayload<K, V> implements Payload {
> >   // Like ProducerRecord, but without the topic and partition
> > }
> > class ProducerBatchPayload implements Payload {
> >   // For the KIP-712 case
> > }
> >
> > // A new abstraction over producer-like things
> > interface Transmitter<P extends Payload> {
> >   CompletionStage<RecordMetadata> send(SendTarget target, P payload);
> > }
> >
> > // Producer gains an extends clause
> > interface Producer<K, V> extends Transmitter<ProducerRecordPayload<K, V>> {
> > }
> >
> > class KafkaProducer<K, V> implements Producer<K, V> {
> >   // Unchanged, included for completeness
> > }
> >
> > interface BatchProducer extends Transmitter<ProducerBatchPayload> {
> >   CompletionStage<RecordMetadata> send(SendTarget target,
> >                                        ProducerBatchPayload payload);
> > }
> >
> > class KafkaBatchProducer implements BatchProducer {
> >   // New. In practice a lot of common code between this and KafkaProducer
> >   // could be factored into an abstract class.
> > }
> > ```
> >
> > Really I'm just re-stating Jason's KIP-706 idea in the context of this
> > KIP, but it would address the type safety issue and also enable a batch
> > consumer to have its own set of configs. It also allows the new
> > Producer.send return type to be CompletionStage, which is KIP-706's
> > objective. And, of course, it's compatible with possible future work
> > around produce to/consume from topic id.
> >
> > Kind regards,
> >
> > Tom
> >
> >
> >
> > On Thu, Apr 1, 2021 at 9:11 AM Henry Cai <hc...@pinterest.com.invalid>
> > wrote:
> >
> > > Jun,
> > >
> > > Thanks for your insight looking into this KIP, we do believe the
> shallow
> > > iteration will give quite a significant performance boost.
> > >
> > > On your concerns:
> > >
> > > 1. Cleaner API.  One alternative is to create new batch APIs.  On
> > consumer,
> > > it would become Consumer.pollBatch returns a ConsumerBatch object which
> > > contains topic/partition/firstOffsetOfBatch/pointerToByteBuffer,
> > similarly
> > > Producer.sendBatch(ProducerBatch).  Both ConsumerBatch and
> ProducerBatch
> > > objects are fixed types (no generics), serializer is gone, interceptors
> > are
> > > probably not needed initially (unless people see the need to intercept
> on
> > > the batch level).  On MM2 side, the current flow is ConsumerRecord ->
> > > Connect's SourceRecord -> ProducerRecord, we would need to enhance
> > Connect
> > > framework to add SourceTask.pollBatch() method which returns a
> > SourceBatch
> > > object, so the object conversion flow becomes ConsumerBatch ->
> > SourceBatch
> > > -> ProducerBatch, we probably won't support any transformers on Batch
> > > objects.
> > > 2. PID/ProducerEpoch/SeqNo passing through RecordBatch.  I think those
> > > transaction fields are only meaningful in the original source kafka
> > > cluster, producer id/seqNo are not the same for the target kafka
> cluster.
> > > So if MM is not going to support transactions at the moment, we can
> clear
> > > those fields when they are going through MM.  Once MM starts to support
> > > transactions in the future, it probably will start its own PID/SeqNo
> etc.
> > > 3. For EOS and read_committed/read_uncommitted support, we can do
> phased
> > > support.  Phase 1, don't support transactional messages in the source
> > > cluster (i.e. abort if it sees control batch records).  Phase 2:
> applying
> > > commit/abort on the batch boundary level.  I am not too familiar with
> the
> > > isolation level and abort transaction code path, but it seems the
> control
> > > unit is currently on the batch boundary (commit/abort the whole batch),
> > if
> > > so, this should also be doable.
> > > 4. MessageHandler in MM1 or SMT in MM2, initially we don't need to
> > support
> > > them.  Since now the object is a ConsumerBatch and the existing handler
> > is
> > > written for the individual object.  Deserialize the batch into
> individual
> > > objects would defeat the purpose of performance optimization.
> > > 5. Multiple batch performances, will do some testing on this.
> > >
> > > On Wed, Mar 31, 2021 at 10:14 AM Jun Rao <ju...@confluent.io.invalid>
> > wrote:
> > >
> > > > Hi, Henry,
> > > >
> > > > Thanks for the KIP. Sorry for the late reply. A few comments below.
> > > >
> > > > 1. The 'shallow' feature is potentially useful. I do agree with Tom
> > that
> > > > the proposed API changes seem unclean. Quite a few existing stuff
> don't
> > > > really work together with this (e.g., generics, serializer,
> > interceptors,
> > > > configs like max.poll.records, etc). It's also hard to explain this
> > > change
> > > > to the common users of the consumer/producer API. I think it would be
> > > > useful to explore if there is another cleaner way of adding this. For
> > > > example, you mentioned that creating a new set of APIs doesn't work
> for
> > > > MM2. However, we could potentially change the connect interface to
> > allow
> > > > MM2 to use the new API. If this doesn't work, it would be useful to
> > > explain
> > > > that in the rejected alternative section.
> > > >
> > > > 2. I am not sure that we could pass through all fields in
> RecordBatch.
> > > For
> > > > example, a MM instance could be receiving RecordBatch from different
> > > source
> > > > partitions. Mixing the PID/ProducerEpoch/FirstSequence fields across
> > them
> > > > in a single producer will be weird. So, it would be useful to
> document
> > > this
> > > > part clearer.
> > > >
> > > > 3. EOS. While MM itself doesn't support mirroring data in an
> > > > exactly-once way, it needs to support reading from a topic with EOS
> > data.
> > > > So, it would be useful to document whether both read_committed and
> > > > read_uncommitted mode are supported and what kind of RecordBatch the
> > > > consumer returns in each case.
> > > >
> > > > 4. With the 'shallow' feature, it seems that some existing features
> in
> > MM
> > > > won't work. For example, I am not sure if SMT works in MM2
> > > > and MirrorMakerMessageHandler works in MM1. It would be useful to
> > > document
> > > > this kind of impact in the KIP.
> > > >
> > > > 5. Multiple batches per partition in the produce request. This seems
> > not
> > > > strictly required in KIP-98. However, changing this will probably
> add a
> > > bit
> > > > more complexity in the producer. So, it would be useful to understand
> > its
> > > > benefits, especially since it doesn't seem to directly help reduce
> the
> > > CPU
> > > > cost in MM. For example, do you have performance numbers with and
> > without
> > > > this enabled in your MM tests?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Tue, Mar 30, 2021 at 1:27 PM Henry Cai <hcai@pinterest.com.invalid
> >
> > > > wrote:
> > > >
> > > > > Tom,
> > > > >
> > > > > Thanks for your comments.  Yes it's a bit clumsy to use the
> existing
> > > > > consumer and producer API to carry the underlying record batch, but
> > > > > creating a new set of API would also mean other use cases (e.g.
> MM2)
> > > > > wouldn't be able to use that feature easily.  We can throw
> exceptions
> > > if
> > > > we
> > > > > see clients are setting serializer/compression in the consumer
> config
> > > > > option.
> > > > >
> > > > > The consumer is essentially getting back a collection of
> > > > > RecordBatchByteBuffer records and passing them to the producer.
> Most
> > > of
> > > > > the internal APIs inside consumer and producer code paths are
> > actually
> > > > > taking on ByteBuffer as the argument so it's not too much work to
> get
> > > the
> > > > > byte buffer through.
> > > > >
> > > > > For the worry that the client might see the inside of that byte
> > buffer,
> > > > we
> > > > > can create a RecordBatchByteBufferRecord class to wrap the
> underlying
> > > > byte
> > > > > buffer so hopefully they will not drill too deep into that object.
> > > > Java's
> > > > > ByteBuffer does have a asReadOnlyBuffer() method to return a
> > read-only
> > > > > buffer, that can be explored as well.
> > > > >
> > > > > On Tue, Mar 30, 2021 at 4:24 AM Tom Bentley <tb...@redhat.com>
> > > wrote:
> > > > >
> > > > > > Hi Henry and Ryanne,
> > > > > >
> > > > > > Related to Ismael's point about the producer & consumer configs
> > being
> > > > > > dangerous, I can see two parts to this:
> > > > > >
> > > > > > 2a. Both the proposed configs seem to be fundamentally
> incompatible
> > > > with
> > > > > > the Producer's existing key.serializer, value.serializer and
> > > > > > compression.type configs, likewise the consumers key.deserializer
> > and
> > > > > > value.deserializer. I don't see a way to avoid this, since those
> > > > existing
> > > > > > configs are already separate things. (I did consider whether
> using
> > > > > > special-case Deserializer and Serializer could be used instead,
> but
> > > > that
> > > > > > doesn't work nicely; in this use case they're necessarily all
> > > > configured
> > > > > > together). I think all we could do would be to reject configs
> which
> > > > tried
> > > > > > to set those existing client configs in conjunction with
> > > > fetch.raw.bytes
> > > > > > and send.raw.bytes.
> > > > > >
> > > > > > 2b. That still leaves a public Java API which would allow access
> to
> > > the
> > > > > raw
> > > > > > byte buffers. AFAICS we don't actually need user code to have
> > access
> > > to
> > > > > the
> > > > > > raw buffers. It would be enough to get an opaque object that
> > wrapped
> > > > the
> > > > > > ByteBuffer from the consumer and pass it to the producer. It's
> only
> > > the
> > > > > > consumer and producer code which needs to be able to obtain the
> > > wrapped
> > > > > > buffer.
> > > > > >
> > > > > > Kind regards,
> > > > > >
> > > > > > Tom
> > > > > >
> > > > > > On Tue, Mar 30, 2021 at 8:41 AM Ismael Juma <is...@juma.me.uk>
> > > wrote:
> > > > > >
> > > > > > > Hi Henry,
> > > > > > >
> > > > > > > Can you clarify why this "network performance" issue is only
> > > related
> > > > to
> > > > > > > shallow mirroring? Generally, we want the protocol to be
> generic
> > > and
> > > > > not
> > > > > > > have a number of special cases. The more special cases you
> have,
> > > the
> > > > > > > tougher it becomes to test all the edge cases.
> > > > > > >
> > > > > > > Ismael
> > > > > > >
> > > > > > > On Mon, Mar 29, 2021 at 9:51 PM Henry Cai
> > > <hcai@pinterest.com.invalid
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > It's interesting this VOTE thread finally becomes a DISCUSS
> > > thread.
> > > > > > > >
> > > > > > > > For MM2 concern, I will take a look to see whether I can add
> > the
> > > > > > support
> > > > > > > > for MM2.
> > > > > > > >
> > > > > > > > For Ismael's concern on multiple batches in the
> ProduceRequest
> > > > > > > (conflicting
> > > > > > > > with KIP-98), here is my take:
> > > > > > > >
> > > > > > > > 1. We do need to group multiple batches in the same request
> > > > otherwise
> > > > > > the
> > > > > > > > network performance will suffer.
> > > > > > > > 2. For the concern on transactional message support as in
> > KIP-98,
> > > > > since
> > > > > > > MM1
> > > > > > > > and MM2 currently don't support transactional messages,
> KIP-712
> > > > will
> > > > > > not
> > > > > > > > attempt to support transactions either.  I will add a config
> > > option
> > > > > on
> > > > > > > > producer config: allowMultipleBatches.  By default this
> option
> > > will
> > > > > be
> > > > > > > off
> > > > > > > > and the user needs to explicitly turn on this option to use
> the
> > > > > shallow
> > > > > > > > mirror feature.  And if we detect both this option and
> > > transaction
> > > > is
> > > > > > > > turned on we will throw an exception to protect current
> > > transaction
> > > > > > > > processing.
> > > > > > > > 3. In the future, when MM2 starts to support exact-once and
> > > > > > transactional
> > > > > > > > messages (is that KIP-656?), we can revisit this code.  The
> > > current
> > > > > > > > transactional message already makes the compromise that the
> > > > messages
> > > > > in
> > > > > > > the
> > > > > > > > same RecordBatch (MessageSet) are sharing the same
> > > > > > > > sequence-id/transaction-id, so those messages need to be
> > > committed
> > > > > all
> > > > > > > > together.  I think when we support the shallow mirror with
> > > > > > transactional
> > > > > > > > semantics, we will group all batches in the same
> ProduceRequest
> > > in
> > > > > the
> > > > > > > same
> > > > > > > > transaction boundary, they need to be committed all together.
> > On
> > > > the
> > > > > > > > broker side, all batches coming from ProduceRequest (or
> > > > > FetchResponse)
> > > > > > > are
> > > > > > > > committed in the same log segment file as one unit (current
> > > > > behavior).
> > > > > > > >
> > > > > > > > On Mon, Mar 29, 2021 at 8:46 AM Ryanne Dolan <
> > > > ryannedolan@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Ah, I see, thanks Ismael. Now I understand your concern.
> > > > > > > > >
> > > > > > > > > From KIP-98, re this change in v3:
> > > > > > > > >
> > > > > > > > > "This allows us to remove the message set size since each
> > > message
> > > > > set
> > > > > > > > > already contains a field for the size. More importantly,
> > since
> > > > > there
> > > > > > is
> > > > > > > > > only one message set to be written to the log, partial
> > produce
> > > > > > failures
> > > > > > > > are
> > > > > > > > > no longer possible. The full message set is either
> > successfully
> > > > > > written
> > > > > > > > to
> > > > > > > > > the log (and replicated) or it is not."
> > > > > > > > >
> > > > > > > > > The schema and size field don't seem to be an issue, as
> > KIP-712
> > > > > > already
> > > > > > > > > addresses.
> > > > > > > > >
> > > > > > > > > The partial produce failure issue is something I don't
> > > > understand.
> > > > > I
> > > > > > > > can't
> > > > > > > > > tell if this was done out of convenience at the time or if
> > > there
> > > > is
> > > > > > > > > something incompatible with partial produce success/failure
> > and
> > > > > EOS.
> > > > > > > Does
> > > > > > > > > anyone know?
> > > > > > > > >
> > > > > > > > > Ryanne
> > > > > > > > >
> > > > > > > > > On Mon, Mar 29, 2021, 1:41 AM Ismael Juma <
> ismael@juma.me.uk
> > >
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Ryanne,
> > > > > > > > > >
> > > > > > > > > > You misunderstood the referenced comment. It is about the
> > > > produce
> > > > > > > > request
> > > > > > > > > > change to have multiple batches:
> > > > > > > > > >
> > > > > > > > > > "Up to ProduceRequest V2, a ProduceRequest can contain
> > > multiple
> > > > > > > batches
> > > > > > > > > of
> > > > > > > > > > messages stored in the record_set field, but this was
> > > disabled
> > > > in
> > > > > > V3.
> > > > > > > > We
> > > > > > > > > > are proposing to bring the multiple batches feature back
> to
> > > > > improve
> > > > > > > the
> > > > > > > > > > network throughput of the mirror maker producer when the
> > > > original
> > > > > > > batch
> > > > > > > > > > size from source broker is too small."
> > > > > > > > > >
> > > > > > > > > > This is unrelated to shallow iteration.
> > > > > > > > > >
> > > > > > > > > > Ismael
> > > > > > > > > >

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Henry Cai <hc...@pinterest.com.INVALID>.
Tom,

I don't think we need to refactor the Consumer/Producer class hierarchy for
this KIP.  If we introduce a separate BatchConsumer class, there are more
than 30 methods in the Consumer interface that the new class would need to
implement, and the implementations would be exactly the same as
KafkaConsumer's.

If we just introduce a new method 'ConsumerBatchRecord pollBatch(duration)'
on the Consumer class, there is no need for this method to handle generic
types, since the return type is a concrete class.  Your worry that a user
could later tweak the content and send a ProducerRecord<NewFoo, NewBar> can
be avoided by making the ConsumerBatchRecord class final and defining only
one method, 'toProducerBatchRecord()', on it, and by making the constructor
of ProducerBatchRecord package-private as well, so that users cannot create
a ProducerBatchRecord with arbitrary content.

The other design is to have just one class, BatchRecord, instead of the two
classes ConsumerBatchRecord and ProducerBatchRecord:
Consumer.pollBatch(duration) returns a BatchRecord, and
Producer.sendBatch(BatchRecord) sends the batch.  The BatchRecord class is
final and its constructor is hidden, so it is unmodifiable by user code.
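
To make that single-class design concrete, here is a minimal sketch; the
class name comes from the paragraph above, but the fields and accessors are
illustrative assumptions, not an actual Kafka API:

```java
import java.nio.ByteBuffer;

// Hypothetical opaque batch handle: final, with a package-private
// constructor, so user code can neither subclass it nor build one
// with arbitrary content.
public final class BatchRecord {
    private final String topic;
    private final int partition;
    private final long firstOffset;
    private final ByteBuffer batchBytes; // shared (shallow) reference, never deep-copied

    // Package-private: only the client internals create instances.
    BatchRecord(String topic, int partition, long firstOffset, ByteBuffer batchBytes) {
        this.topic = topic;
        this.partition = partition;
        this.firstOffset = firstOffset;
        this.batchBytes = batchBytes;
    }

    public String topic() { return topic; }
    public int partition() { return partition; }
    public long firstOffset() { return firstOffset; }

    // Expose the bytes read-only so callers cannot mutate the batch.
    ByteBuffer buffer() { return batchBytes.asReadOnlyBuffer(); }
}
```

With this in place the mirroring loop is essentially
producer.sendBatch(consumer.pollBatch(timeout)), and the batch bytes never
leave the ByteBuffer.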


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Tom Bentley <tb...@redhat.com>.
Hi Henry, Jun and Ismael,

A few things make me wonder if building this into the existing Producer and
Consumer APIs is really the right thing to do:

1. Type safety. The existing Producer and Consumer are both generic in K
and V, but those type parameters are meaningless in the batch case. For
example, the apparent type safety of a Producer<Foo, Bar> would be violated
by using the batch method to actually send a <Baz, Quux>. Another example:
What happens if I pass a producer configured for records to someone that
requires one configured for batches (and vice versa)?

2. The existing Producer and Consumer would both accept a number of configs
which didn't apply in the batch case.

In the discussion for KIP-706 Jason was imagining a more abstracted set of
client APIs which separated the data from the topic destination/origin, and
he mentioned basically this exact use case. This got me thinking, and
although I don't want to derail this conversation, I thought I'd sketch what
I came up with.


On the Consumer side:

```java
// Abstraction over where messages come from
interface ReceiveSource {}
class TopicPartition implements SendTarget, ReceiveSource {}
class Topic implements SendTarget, ReceiveSource {}
class TopicId implements SendTarget, ReceiveSource {}
class TopicPattern implements ReceiveSource {}

// New abstraction for consumer-like things
interface Receiver<X> {
  void assign(ReceiveSource source);
  void subscribe(ReceiveSource source);
  // etc
  X poll(Duration timeout);
}

// Consumer doesn't change, except for the extends clause
interface Consumer<K, V> extends Receiver<ConsumerRecords<K, V>> {
  void assign(ReceiveSource source);
  void subscribe(ReceiveSource source);
  ConsumerRecords<K, V> poll(Duration timeout);
}

// KafkaConsumer doesn't change at all (implementation elided)
class KafkaConsumer<K, V> implements Consumer<K, V> {
}

// Specialise Receiver for batch-based consumption.
interface BatchConsumer extends Receiver<ConsumerBatch> {
}

// Implementation (elided)
class KafkaBatchConsumer implements BatchConsumer {
}

abstract class ConsumerBatch {
  // For KIP-712, a way to convert batches without exposing low-level
  // details like ByteBuffer
  abstract ProducerBatchPayload toProducerBatch();
}
```

On the producer side:
```java
// Abstraction over targets (see the ReceiveSource for the impls)
interface SendTarget {}

// Abstraction over data that can be sent to a target
interface Payload {}
class ProducerRecordPayload<K, V> implements Payload {
  // Like ProducerRecord, but without the topic and partition
}
class ProducerBatchPayload implements Payload {
  // For the KIP-712 case
}

// A new abstraction over producer-like things
interface Transmitter<P extends Payload> {
  CompletionStage<RecordMetadata> send(SendTarget target, P payload);
}

// Producer gains an extends clause
interface Producer<K, V> extends Transmitter<ProducerRecordPayload<K, V>> {
}

class KafkaProducer<K, V> implements Producer<K, V> {
  // Unchanged, included for completeness
}

interface BatchProducer extends Transmitter<ProducerBatchPayload> {
  CompletionStage<RecordMetadata> send(SendTarget target, ProducerBatchPayload payload);
}

class KafkaBatchProducer implements BatchProducer {
  // New. In practice a lot of common code between this and KafkaProducer
  // could be factored into an abstract class.
}
```
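
As a usage illustration only (assuming the sketched types above and eliding
construction, subscription details, and error handling), shallow mirroring
would reduce to handing opaque batches from a Receiver to a Transmitter:

```java
// Hypothetical mirroring loop over the sketched batch API: no per-record
// iteration and no deserialization, just opaque batch hand-off.
class ShallowMirrorLoop {
    void mirror(BatchConsumer consumer, BatchProducer producer,
                ReceiveSource source, SendTarget target) {
        consumer.subscribe(source);
        while (true) {
            ConsumerBatch batch = consumer.poll(Duration.ofMillis(500));
            if (batch != null)
                producer.send(target, batch.toProducerBatch());
        }
    }
}
```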

Really I'm just re-stating Jason's KIP-706 idea in the context of this KIP,
but it would address the type safety issue and also enable a batch consumer
to have its own set of configs. It also allows the new Producer.send return
type to be CompletionStage, which is KIP-706's objective. And, of course
it's compatible with possible future work around produce to/consume from
topic id.

Kind regards,

Tom



On Thu, Apr 1, 2021 at 9:11 AM Henry Cai <hc...@pinterest.com.invalid> wrote:

> Jun,
>
> Thanks for your insight looking into this KIP, we do believe the shallow
> iteration will give quite a significant performance boost.
>
> On your concerns:
>
> 1. Cleaner API.  One alternative is to create new batch APIs.  On consumer,
> it would become Consumer.pollBatch returns a ConsumerBatch object which
> contains topic/partition/firstOffsetOfBatch/pointerToByteBuffer, similarly
> Producer.sendBatch(ProducerBatch).  Both ConsumerBatch and ProducerBatch
> objects are fixed types (no generics), serializer is gone, interceptors are
> probably not needed initially (unless people see the need to intercept on
> the batch level).  On MM2 side, the current flow is ConsumerRecord ->
> Connect's SourceRecord -> ProducerRecord, we would need to enhance Connect
> framework to add SourceTask.pollBatch() method which returns a SourceBatch
> object, so the object conversion flow becomes ConsumerBatch -> SourceBatch
> -> ProducerBatch, we probably won't support any transformers on Batch
> objects.
> 2. PID/ProducerEpoch/SeqNo passing through RecordBatch.  I think those
> transaction fields are only meaningful in the original source kafka
> cluster, producer id/seqNo are not the same for the target kafka cluster.
> So if MM is not going to support transactions at the moment, we can clear
> those fields when they are going through MM.  Once MM starts to support
> transactions in the future, it probably will start its own PID/SeqNo etc.
> 3. For EOS and read_committed/read_uncommitted support, we can do phased
> support.  Phase 1, don't support transactional messages in the source
> cluster (i.e. abort if it sees control batch records).  Phase 2: applying
> commit/abort on the batch boundary level.  I am not too familiar with the
> isolation level and abort transaction code path, but it seems the control
> unit is currently on the batch boundary (commit/abort the whole batch), if
> so, this should also be doable.
> 4. MessageHandler in MM1 or SMT in MM2: initially we don't need to support
> them, since the object is now a ConsumerBatch and the existing handlers are
> written for individual objects.  Deserializing the batch into individual
> objects would defeat the purpose of the performance optimization.
> 5. Multiple batch performances, will do some testing on this.
>
> On Wed, Mar 31, 2021 at 10:14 AM Jun Rao <ju...@confluent.io.invalid> wrote:
>
> > Hi, Henry,
> >
> > Thanks for the KIP. Sorry for the late reply. A few comments below.
> >
> > 1. The 'shallow' feature is potentially useful. I do agree with Tom that
> > the proposed API changes seem unclean. Quite a few existing things don't
> > really work together with this (e.g., generics, serializer, interceptors,
> > configs like max.poll.records, etc). It's also hard to explain this
> change
> > to the common users of the consumer/producer API. I think it would be
> > useful to explore if there is another cleaner way of adding this. For
> > example, you mentioned that creating a new set of APIs doesn't work for
> > MM2. However, we could potentially change the connect interface to allow
> > MM2 to use the new API. If this doesn't work, it would be useful to
> explain
> > that in the rejected alternative section.
> >
> > 2. I am not sure that we could pass through all fields in RecordBatch.
> For
> > example, a MM instance could be receiving RecordBatch from different
> source
> > partitions. Mixing the PID/ProducerEpoch/FirstSequence fields across them
> > in a single producer will be weird. So, it would be useful to document
> this
> > part clearer.
> >
> > 3. EOS. While MM itself doesn't support mirroring data in an
> > exactly-once way, it needs to support reading from a topic with EOS data.
> > So, it would be useful to document whether both read_committed and
> > read_uncommitted mode are supported and what kind of RecordBatch the
> > consumer returns in each case.
> >
> > 4. With the 'shallow' feature, it seems that some existing features in MM
> > won't work. For example, I am not sure if SMT works in MM2
> > and MirrorMakerMessageHandler works in MM1. It would be useful to
> document
> > this kind of impact in the KIP.
> >
> > 5. Multiple batches per partition in the produce request. This seems not
> > strictly required in KIP-98. However, changing this will probably add a
> bit
> > more complexity in the producer. So, it would be useful to understand its
> > benefits, especially since it doesn't seem to directly help reduce the
> CPU
> > cost in MM. For example, do you have performance numbers with and without
> > this enabled in your MM tests?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Mar 30, 2021 at 1:27 PM Henry Cai <hc...@pinterest.com.invalid>
> > wrote:
> >
> > > Tom,
> > >
> > > Thanks for your comments.  Yes it's a bit clumsy to use the existing
> > > consumer and producer API to carry the underlying record batch, but
> > > creating a new set of API would also mean other use cases (e.g. MM2)
> > > wouldn't be able to use that feature easily.  We can throw exceptions
> if
> > we
> > > see clients are setting serializer/compression in the consumer config
> > > option.
> > >
> > > The consumer is essentially getting back a collection of
> > > RecordBatchByteBuffer records and passing them to the producer.  Most
> of
> > > the internal APIs inside consumer and producer code paths are actually
> > > taking on ByteBuffer as the argument so it's not too much work to get
> the
> > > byte buffer through.
> > >
> > > For the worry that the client might see the inside of that byte buffer,
> > we
> > > can create a RecordBatchByteBufferRecord class to wrap the underlying
> > byte
> > > buffer so hopefully they will not drill too deep into that object.
> > Java's
> > > ByteBuffer does have a asReadOnlyBuffer() method to return a read-only
> > > buffer, that can be explored as well.
> > >
> > > On Tue, Mar 30, 2021 at 4:24 AM Tom Bentley <tb...@redhat.com>
> wrote:
> > >
> > > > Hi Henry and Ryanne,
> > > >
> > > > Related to Ismael's point about the producer & consumer configs being
> > > > dangerous, I can see two parts to this:
> > > >
> > > > 2a. Both the proposed configs seem to be fundamentally incompatible
> > with
> > > > the Producer's existing key.serializer, value.serializer and
> > > > compression.type configs, likewise the consumers key.deserializer and
> > > > value.deserializer. I don't see a way to avoid this, since those
> > existing
> > > > configs are already separate things. (I did consider whether using
> > > > special-case Deserializer and Serializer could be used instead, but
> > that
> > > > doesn't work nicely; in this use case they're necessarily all
> > configured
> > > > together). I think all we could do would be to reject configs which
> > tried
> > > > to set those existing client configs in conjunction with
> > fetch.raw.bytes
> > > > and send.raw.bytes.
> > > >
> > > > 2b. That still leaves a public Java API which would allow access to
> the
> > > raw
> > > > byte buffers. AFAICS we don't actually need user code to have access
> to
> > > the
> > > > raw buffers. It would be enough to get an opaque object that wrapped
> > the
> > > > ByteBuffer from the consumer and pass it to the producer. It's only
> the
> > > > consumer and producer code which needs to be able to obtain the
> wrapped
> > > > buffer.
> > > >
> > > > Kind regards,
> > > >
> > > > Tom
> > > >
> > > > On Tue, Mar 30, 2021 at 8:41 AM Ismael Juma <is...@juma.me.uk>
> wrote:
> > > >
> > > > > Hi Henry,
> > > > >
> > > > > Can you clarify why this "network performance" issue is only
> related
> > to
> > > > > shallow mirroring? Generally, we want the protocol to be generic
> and
> > > not
> > > > > have a number of special cases. The more special cases you have,
> the
> > > > > tougher it becomes to test all the edge cases.
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Mon, Mar 29, 2021 at 9:51 PM Henry Cai
> <hcai@pinterest.com.invalid
> > >
> > > > > wrote:
> > > > >
> > > > > > It's interesting this VOTE thread finally becomes a DISCUSS
> thread.
> > > > > >
> > > > > > For MM2 concern, I will take a look to see whether I can add the
> > > > support
> > > > > > for MM2.
> > > > > >
> > > > > > For Ismael's concern on multiple batches in the ProduceRequest
> > > > > (conflicting
> > > > > > with KIP-98), here is my take:
> > > > > >
> > > > > > 1. We do need to group multiple batches in the same request
> > otherwise
> > > > the
> > > > > > network performance will suffer.
> > > > > > 2. For the concern on transactional message support as in KIP-98,
> > > since
> > > > > MM1
> > > > > > and MM2 currently don't support transactional messages, KIP-712
> > will
> > > > not
> > > > > > attempt to support transactions either.  I will add a config
> option
> > > on
> > > > > > producer config: allowMultipleBatches.  By default this option
> will
> > > be
> > > > > off
> > > > > > and the user needs to explicitly turn on this option to use the
> > > shallow
> > > > > > mirror feature.  And if we detect both this option and
> transaction
> > is
> > > > > > turned on we will throw an exception to protect current
> transaction
> > > > > > processing.
> > > > > > 3. In the future, when MM2 starts to support exactly-once and
> > > > transactional
> > > > > > messages (is that KIP-656?), we can revisit this code.  The
> current
> > > > > > transactional message already makes the compromise that the
> > messages
> > > in
> > > > > the
> > > > > > same RecordBatch (MessageSet) are sharing the same
> > > > > > sequence-id/transaction-id, so those messages need to be
> committed
> > > all
> > > > > > together.  I think when we support the shallow mirror with
> > > > transactional
> > > > > > semantics, we will group all batches in the same ProduceRequest
> in
> > > the
> > > > > same
> > > > > > transaction boundary, they need to be committed all together.  On
> > the
> > > > > > broker side, all batches coming from ProduceRequest (or
> > > FetchResponse)
> > > > > are
> > > > > > committed in the same log segment file as one unit (current
> > > behavior).
> > > > > >
> > > > > > On Mon, Mar 29, 2021 at 8:46 AM Ryanne Dolan <
> > ryannedolan@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Ah, I see, thanks Ismael. Now I understand your concern.
> > > > > > >
> > > > > > > From KIP-98, re this change in v3:
> > > > > > >
> > > > > > > "This allows us to remove the message set size since each
> message
> > > set
> > > > > > > already contains a field for the size. More importantly, since
> > > there
> > > > is
> > > > > > > only one message set to be written to the log, partial produce
> > > > failures
> > > > > > are
> > > > > > > no longer possible. The full message set is either successfully
> > > > written
> > > > > > to
> > > > > > > the log (and replicated) or it is not."
> > > > > > >
> > > > > > > The schema and size field don't seem to be an issue, as KIP-712
> > > > already
> > > > > > > addresses.
> > > > > > >
> > > > > > > The partial produce failure issue is something I don't
> > understand.
> > > I
> > > > > > can't
> > > > > > > tell if this was done out of convenience at the time or if
> there
> > is
> > > > > > > something incompatible with partial produce success/failure and
> > > EOS.
> > > > > Does
> > > > > > > anyone know?
> > > > > > >
> > > > > > > Ryanne
> > > > > > >
> > > > > > > On Mon, Mar 29, 2021, 1:41 AM Ismael Juma <is...@juma.me.uk>
> > > wrote:
> > > > > > >
> > > > > > > > Ryanne,
> > > > > > > >
> > > > > > > > You misunderstood the referenced comment. It is about the
> > produce
> > > > > > request
> > > > > > > > change to have multiple batches:
> > > > > > > >
> > > > > > > > "Up to ProduceRequest V2, a ProduceRequest can contain
> multiple
> > > > > batches
> > > > > > > of
> > > > > > > > messages stored in the record_set field, but this was
> disabled
> > in
> > > > V3.
> > > > > > We
> > > > > > > > are proposing to bring the multiple batches feature back to
> > > improve
> > > > > the
> > > > > > > > network throughput of the mirror maker producer when the
> > original
> > > > > batch
> > > > > > > > size from source broker is too small."
> > > > > > > >
> > > > > > > > This is unrelated to shallow iteration.
> > > > > > > >
> > > > > > > > Ismael
> > > > > > > >
> > > > > > > > On Sun, Mar 28, 2021, 10:15 PM Ryanne Dolan <
> > > ryannedolan@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Ismael, I don't think KIP-98 is related. Shallow iteration
> > was
> > > > > > removed
> > > > > > > in
> > > > > > > > > KAFKA-732, which predates KIP-98 by a few years.
> > > > > > > > >
> > > > > > > > > Ryanne
> > > > > > > > >
> > > > > > > > > On Sun, Mar 28, 2021, 11:25 PM Ismael Juma <
> > ismael@juma.me.uk>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Thanks for the KIP. I have a few high level comments:
> > > > > > > > > >
> > > > > > > > > > 1. Like Tom, I'm not convinced about the proposal to make
> > > this
> > > > > > change
> > > > > > > > to
> > > > > > > > > > MirrorMaker 1 if we intend to deprecate it and remove
> it. I
> > > > would
> > > > > > > > rather
> > > > > > > > > us
> > > > > > > > > > focus our efforts on the implementation we intend to
> > support
> > > > > going
> > > > > > > > > forward.
> > > > > > > > > > 2. The producer/consumer configs seem pretty dangerous
> for
> > > > > general
> > > > > > > > usage,
> > > > > > > > > > but the KIP doesn't address the potential downsides.
> > > > > > > > > > 3. How does the ProducerRequest change impact
> exactly-once
> > > (if
> > > > at
> > > > > > > all)?
> > > > > > > > > The
> > > > > > > > > > change we are reverting was done as part of KIP-98. Have
> we
> > > > > > > considered
> > > > > > > > > the
> > > > > > > > > > original reasons for the change?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Ismael

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Henry Cai <hc...@pinterest.com.INVALID>.
Jun,

On the EOS support, I looked at KIP-98 and it seems to me all the control
information is specified in the RecordBatch header fields:
1. Whether the batch is a control batch, i.e. one containing a commit or
abort marker
2. Whether the batch is transactional, in which case it carries a PID

In read_committed isolation mode (I think that's the mode that makes sense
for mirroring), the existing code in Fetcher.nextFetchRecord() decides
whether to skip a control batch or an aborted batch based on the
abortedTransactions field of the FetchResponse.  This logic would also work
in shallow iterator mode; we would just skip to the next batch, as sketched
below.
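
To illustrate, here is a rough sketch of that batch-level filtering. It is
simplified on purpose: the real fetcher scopes aborted transactions by
producer id plus offset range from the FetchResponse, not a flat set of
producer ids, and the surrounding plumbing is omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

import org.apache.kafka.common.record.RecordBatch;

public final class ShallowReadCommittedFilter {

    // Keep only the batches a read_committed shallow consumer may mirror:
    // drop control batches (commit/abort markers) and any transactional
    // batch whose producer id is in the aborted set.
    public static List<RecordBatch> filter(Iterable<RecordBatch> batches,
                                           Set<Long> abortedProducerIds) {
        List<RecordBatch> result = new ArrayList<>();
        for (RecordBatch batch : batches) {
            if (batch.isControlBatch())
                continue; // markers are consumed, not mirrored
            if (batch.isTransactional()
                    && abortedProducerIds.contains(batch.producerId()))
                continue; // the whole batch belongs to an aborted transaction
            result.add(batch); // pass the batch through, bytes untouched
        }
        return result;
    }
}
```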

On Thu, Apr 1, 2021 at 9:46 AM Jun Rao <ju...@confluent.io.invalid> wrote:

> Hi, Henry,
>
> Thanks for the response.
>
> 1. I agree with Tom that it's worth thinking about a separate class for
> shallow iteration instead of trying to add more complexity into the
> existing producer/consumer API. We could potentially make the new class an
> internal API if it's only useful for MM.
>
> 3. I am not sure that we could ignore transactional messages in the first
> phase. The usage of EOS is increasing. Also, one can't tell from the
> metadata of a topic whether it has EOS data or not. So, there is no easy
> way to skip EOS data from the source.
>
> Jun

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Colin McCabe <cm...@apache.org>.
On Thu, Apr 1, 2021, at 09:45, Jun Rao wrote:
> Hi, Henry,
> 
> Thanks for the response.
> 
> 1. I agree with Tom that it's worth thinking about a separate class for
> shallow iteration instead of trying to add more complexity into the
> existing producer/consumer API. We could potentially make the new class an
> internal API if it's only useful for MM.
> 

Yes, I think a separate API that just allowed sending / receiving record bytes might be the best way to go.
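
For example, the whole surface could be as small as the following (names
are purely illustrative, not a concrete proposal; the referenced types
come from the existing clients library):

    // An opaque handle over fetched batch bytes; callers never see the
    // decoded records, so there is nothing to deserialize or re-serialize.
    public interface RawBatch {
        TopicPartition sourcePartition();
        long baseOffset();
    }

    public interface RawConsumer extends Closeable {
        List<RawBatch> pollBatches(Duration timeout);
    }

    public interface RawProducer extends Closeable {
        // Forwards the batch bytes unmodified to the target partition.
        Future<RecordMetadata> sendBatch(TopicPartition target, RawBatch batch);
    }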

> 3. I am not sure that we could ignore transactional messages in the first
> phase. The usage of EOS is increasing. Also, one can't tell from the
> metadata of a topic whether it has EOS data or not. So, there is no easy
> way to skip EOS data from the source.

I also agree... we should not create more technical debt here.

Colin


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Jun Rao <ju...@confluent.io.INVALID>.
Hi, Henry,

Thanks for the response.

1. I agree with Tom that it's worth thinking about a separate class for
shallow iteration instead of trying to add more complexity into the
existing producer/consumer API. We could potentially make the new class an
internal API if it's only useful for MM.

3. I am not sure that we could ignore transactional messages in the first
phase. The usage of EOS is increasing. Also, one can't tell from the
metadata of a topic whether it has EOS data or not. So, there is no easy
way to skip EOS data from the source.

Jun

On Thu, Apr 1, 2021 at 1:07 AM Henry Cai <hc...@pinterest.com.invalid> wrote:

> Jun,
>
> Thanks for your insight looking into this KIP; we do believe the shallow
> iteration will give quite a significant performance boost.
>
> On your concerns:
>
> 1. Cleaner API.  One alternative is to create new batch APIs.  On the
> consumer, Consumer.pollBatch would return a ConsumerBatch object which
> contains topic/partition/firstOffsetOfBatch/pointerToByteBuffer; the
> producer counterpart would be Producer.sendBatch(ProducerBatch).  Both
> ConsumerBatch and ProducerBatch objects are fixed types (no generics), the
> serializer is gone, and interceptors are probably not needed initially
> (unless people see the need to intercept at the batch level).  On the MM2
> side, the current flow is ConsumerRecord -> Connect's SourceRecord ->
> ProducerRecord; we would need to enhance the Connect framework with a
> SourceTask.pollBatch() method returning a SourceBatch object, so the
> conversion flow becomes ConsumerBatch -> SourceBatch -> ProducerBatch.  We
> probably won't support any transformers on Batch objects.
> 2. PID/ProducerEpoch/SeqNo passing through RecordBatch.  I think those
> transaction fields are only meaningful in the original source Kafka
> cluster; the producer id/seqNo are not the same for the target Kafka
> cluster.  So if MM is not going to support transactions at the moment, we
> can clear those fields when they are going through MM.  Once MM starts to
> support transactions in the future, it probably will start its own
> PID/SeqNo etc.
> 3. For EOS and read_committed/read_uncommitted support, we can do phased
> support.  Phase 1: don't support transactional messages in the source
> cluster (i.e. abort if we see control batch records).  Phase 2: apply
> commit/abort at the batch boundary level.  I am not too familiar with the
> isolation-level and abort-transaction code path, but it seems the control
> unit is currently the batch boundary (commit/abort the whole batch); if
> so, this should also be doable.
> 4. MessageHandler in MM1 or SMT in MM2: initially we don't need to support
> them, since the object is now a ConsumerBatch and the existing handlers are
> written for individual records.  Deserializing the batch into individual
> objects would defeat the purpose of the performance optimization.
> 5. Multiple-batch performance: will do some testing on this.

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Henry Cai <hc...@pinterest.com.INVALID>.
Jun,

Thanks for your insightful review of this KIP. We do believe the shallow
iteration will give quite a significant performance boost.

On your concerns:

1. Cleaner API.  One alternative is to create new batch-level APIs.  On the
consumer side, Consumer.pollBatch would return a ConsumerBatch object which
contains topic/partition/firstOffsetOfBatch/pointerToByteBuffer; similarly,
the producer side would gain Producer.sendBatch(ProducerBatch).  Both
ConsumerBatch and ProducerBatch objects are fixed types (no generics), the
serializer is gone, and interceptors are probably not needed initially
(unless people see the need to intercept on the batch level).  On the MM2
side, the current flow is ConsumerRecord -> Connect's SourceRecord ->
ProducerRecord; we would need to enhance the Connect framework to add a
SourceTask.pollBatch() method which returns a SourceBatch object, so the
object conversion flow becomes ConsumerBatch -> SourceBatch ->
ProducerBatch.  We probably won't support any transformers on batch
objects.  (A rough sketch of this shape follows at the end of this list.)
2. PID/ProducerEpoch/SeqNo passing through RecordBatch.  I think those
transaction fields are only meaningful in the original source Kafka
cluster; producer id/seqNo are not the same for the target Kafka cluster.
So if MM is not going to support transactions at the moment, we can clear
those fields when they are going through MM.  Once MM starts to support
transactions in the future, it will probably start its own PID/SeqNo etc.
3. For EOS and read_committed/read_uncommitted support, we can do phased
support.  Phase 1: don't support transactional messages in the source
cluster (i.e. abort if we see control batch records).  Phase 2: apply
commit/abort on the batch boundary level.  I am not too familiar with the
isolation level and abort-transaction code path, but it seems the control
unit is currently on the batch boundary (commit/abort the whole batch); if
so, this should also be doable.
4. MessageHandler in MM1 or SMT in MM2: initially we don't need to support
them, since the object is now a ConsumerBatch and the existing handlers are
written for individual records.  Deserializing the batch into individual
objects would defeat the purpose of the performance optimization.
5. Multiple-batch performance: we will do some testing on this.
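
For concreteness, here is a rough sketch of the batch-level API shape
described in point 1.  Every name in it (ConsumerBatch, ProducerBatch,
BatchConsumer.pollBatch, BatchProducer.sendBatch) is tentative and does
not exist in Kafka today; treat it as an illustration under those
assumptions, not a finalized interface.

    import java.nio.ByteBuffer;
    import java.time.Duration;
    import java.util.List;
    import org.apache.kafka.common.TopicPartition;

    // Fixed-type handle on one RecordBatch: no generics, no serializer,
    // and the payload stays an opaque, shared ByteBuffer.
    final class ConsumerBatch {
        private final TopicPartition topicPartition;
        private final long firstOffset;
        private final ByteBuffer batchBuffer;

        ConsumerBatch(TopicPartition tp, long firstOffset, ByteBuffer buf) {
            this.topicPartition = tp;
            this.firstOffset = firstOffset;
            this.batchBuffer = buf;
        }

        TopicPartition topicPartition() { return topicPartition; }
        long firstOffsetOfBatch()       { return firstOffset; }
        ByteBuffer buffer()             { return batchBuffer; }
    }

    // Mirrors ConsumerBatch on the outbound side: a target partition plus
    // the same shared bytes, so nothing is recompressed or reserialized.
    final class ProducerBatch {
        final TopicPartition topicPartition;
        final ByteBuffer batchBuffer;

        ProducerBatch(TopicPartition tp, ByteBuffer buf) {
            this.topicPartition = tp;
            this.batchBuffer = buf;
        }
    }

    // Batch-level poll/send, bypassing per-record deserialization.
    interface BatchConsumer extends AutoCloseable {
        List<ConsumerBatch> pollBatch(Duration timeout);
    }

    interface BatchProducer extends AutoCloseable {
        void sendBatch(ProducerBatch batch);
    }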


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Jun Rao <ju...@confluent.io.INVALID>.
Hi, Henry,

Thanks for the KIP. Sorry for the late reply. A few comments below.

1. The 'shallow' feature is potentially useful. I do agree with Tom that
the proposed API changes seem unclean. Quite a few existing features don't
really work together with this (e.g., generics, serializers, interceptors,
configs like max.poll.records, etc.). It's also hard to explain this change
to the common users of the consumer/producer API. I think it would be
useful to explore whether there is another, cleaner way of adding this. For
example, you mentioned that creating a new set of APIs doesn't work for
MM2. However, we could potentially change the Connect interface to allow
MM2 to use the new API. If this doesn't work, it would be useful to explain
that in the rejected alternatives section.

2. I am not sure that we can pass through all fields in RecordBatch. For
example, a MM instance could be receiving RecordBatches from different
source partitions; mixing the PID/ProducerEpoch/FirstSequence fields
across them in a single producer would be weird. So, it would be useful to
document this part more clearly.

3. EOS. While MM itself doesn't support mirroring data in an exactly-once
way, it needs to support reading from a topic with EOS data. So, it would
be useful to document whether both read_committed and read_uncommitted
modes are supported and what kind of RecordBatch the consumer returns in
each case. (A small config sketch of the two modes follows at the end of
this list.)

4. With the 'shallow' feature, it seems that some existing features in MM
won't work. For example, I am not sure whether SMTs work in MM2 or
MirrorMakerMessageHandler works in MM1. It would be useful to document
this kind of impact in the KIP.

5. Multiple batches per partition in the produce request. This seems not
strictly required by KIP-98. However, changing it will probably add a bit
more complexity to the producer. So, it would be useful to understand its
benefits, especially since it doesn't seem to directly help reduce the CPU
cost in MM. For example, do you have performance numbers with and without
this enabled in your MM tests?
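
For illustration, the two modes in point 3 are selected with the standard
isolation.level consumer config; a minimal sketch follows, with
placeholder broker address and group id. What a raw-bytes consumer should
return for aborted transactional data is exactly the open question.

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;

    final class IsolationLevelSketch {
        static Properties consumerProps() {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
                      "source-broker:9092");                     // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "mm-shallow"); // placeholder
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                      "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                      "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            // read_committed returns only records from committed
            // transactions; read_uncommitted (the default) returns all
            // records. Today's record-level consumer filters out control
            // batches in both modes.
            props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
            return props;
        }
    }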

Thanks,

Jun


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Henry Cai <hc...@pinterest.com.INVALID>.
Tom,

Thanks for your comments.  Yes, it's a bit clumsy to use the existing
consumer and producer APIs to carry the underlying record batch, but
creating a new set of APIs would also mean other use cases (e.g. MM2)
wouldn't be able to use the feature easily.  We can throw exceptions if we
see clients setting serializer/compression configs alongside the raw-bytes
options.

The consumer essentially gets back a collection of RecordBatchByteBuffer
records and passes them to the producer.  Most of the internal APIs inside
the consumer and producer code paths actually take ByteBuffer as the
argument, so it's not too much work to get the byte buffer through.

For the worry that the client might see the inside of that byte buffer, we
can create a RecordBatchByteBufferRecord class to wrap the underlying byte
buffer, so hopefully callers will not drill too deep into that object.
Java's ByteBuffer does have an asReadOnlyBuffer() method to return a
read-only buffer; that can be explored as well.  (A sketch of such a
wrapper follows below.)
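
As a rough illustration of that wrapping idea (the class name
RecordBatchByteBufferRecord comes from this discussion; nothing like it
exists in Kafka today):

    import java.nio.ByteBuffer;

    // Hypothetical opaque handle around one raw RecordBatch's bytes.
    public final class RecordBatchByteBufferRecord {
        private final ByteBuffer batchBuffer;

        public RecordBatchByteBufferRecord(ByteBuffer batchBuffer) {
            // asReadOnlyBuffer() shares the underlying content but gives
            // this view its own position/limit, and any put() on it throws
            // ReadOnlyBufferException -- callers can look but not touch.
            this.batchBuffer = batchBuffer.asReadOnlyBuffer();
        }

        // Package-private on purpose: only the consumer/producer internals
        // should ever unwrap the raw bytes.
        ByteBuffer unwrap() {
            return batchBuffer.duplicate();
        }
    }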


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Tom Bentley <tb...@redhat.com>.
Hi Henry and Ryanne,

Related to Ismael's point about the producer & consumer configs being
dangerous, I can see two parts to this:

2a. Both the proposed configs seem to be fundamentally incompatible with
the Producer's existing key.serializer, value.serializer and
compression.type configs, likewise the consumer's key.deserializer and
value.deserializer. I don't see a way to avoid this, since those existing
configs are already separate things. (I did consider whether special-case
Deserializer and Serializer implementations could be used instead, but
that doesn't work nicely; in this use case they're necessarily all
configured together.) I think all we could do would be to reject configs
which tried to set those existing client configs in conjunction with
fetch.raw.bytes and send.raw.bytes. (A small sketch of such a rejection
check follows after these two points.)

2b. That still leaves a public Java API which would allow access to the
raw byte buffers. AFAICS we don't actually need user code to have access
to the raw buffers. It would be enough to get an opaque object that
wrapped the ByteBuffer from the consumer and pass it to the producer. It's
only the consumer and producer code which needs to be able to obtain the
wrapped buffer.
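
For what it's worth, a minimal sketch of that rejection check, assuming
the KIP's proposed fetch.raw.bytes config name (neither the config nor
this check exists in Kafka today):

    import java.util.List;
    import java.util.Map;
    import org.apache.kafka.common.config.ConfigException;

    final class RawModeValidator {
        // Fail fast if raw-bytes mode is combined with deserializer configs.
        static void validate(Map<String, Object> configs) {
            if (!Boolean.TRUE.equals(configs.get("fetch.raw.bytes"))) {
                return; // raw mode off: nothing to check
            }
            for (String conflicting : List.of("key.deserializer",
                                              "value.deserializer")) {
                if (configs.get(conflicting) != null) {
                    throw new ConfigException(
                        "fetch.raw.bytes cannot be combined with " + conflicting);
                }
            }
        }
    }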

Kind regards,

Tom


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Ismael Juma <is...@juma.me.uk>.
Hi Henry,

Can you clarify why this "network performance" issue is only related to
shallow mirroring? Generally, we want the protocol to be generic and not
have a number of special cases. The more special cases you have, the
tougher it becomes to test all the edge cases.

Ismael


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Henry Cai <hc...@pinterest.com.INVALID>.
It's interesting that this VOTE thread finally became a DISCUSS thread.

For the MM2 concern, I will take a look to see whether I can add the
support for MM2.

For Ismael's concern on multiple batches in the ProduceRequest (conflicting
with KIP-98), here is my take:

1. We do need to group multiple batches in the same request, otherwise the
network performance will suffer.
2. For the concern on transactional message support as in KIP-98: since MM1
and MM2 currently don't support transactional messages, KIP-712 will not
attempt to support transactions either.  I will add a config option on the
producer config: allowMultipleBatches.  By default this option will be off,
and the user needs to explicitly turn it on to use the shallow mirror
feature.  If we detect that both this option and transactions are turned
on, we will throw an exception to protect current transaction processing.
3. In the future, when MM2 starts to support exactly-once and transactional
messages (is that KIP-656?), we can revisit this code.  The current
transactional message format already makes the compromise that the messages
in the same RecordBatch (MessageSet) share the same
sequence-id/transaction-id, so those messages need to be committed all
together.  I think when we support the shallow mirror with transactional
semantics, we will group all batches in the same ProduceRequest in the same
transaction boundary, so they need to be committed all together.  On the
broker side, all batches coming from a ProduceRequest (or FetchResponse)
are committed in the same log segment file as one unit (current behavior).
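
For illustration, here is a minimal sketch of the guard in point 2.  The
config key (`allow.multiple.batches` below), the exception wiring, and the
surrounding variable names are illustrative, not final:

    import org.apache.kafka.common.config.ConfigException;

    // Hypothetical check during producer config validation: multiple
    // batches per ProduceRequest is mutually exclusive with transactions.
    // `config` stands for the producer's parsed AbstractConfig.
    boolean allowMultipleBatches = config.getBoolean("allow.multiple.batches");
    boolean transactional = config.getString("transactional.id") != null;
    if (allowMultipleBatches && transactional) {
        throw new ConfigException(
            "allow.multiple.batches cannot be combined with transactional.id");
    }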


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Ryanne Dolan <ry...@gmail.com>.
Ah, I see, thanks Ismael. Now I understand your concern.

From KIP-98, re this change in v3:

"This allows us to remove the message set size since each message set
already contains a field for the size. More importantly, since there is
only one message set to be written to the log, partial produce failures are
no longer possible. The full message set is either successfully written to
the log (and replicated) or it is not."

The schema and size field don't seem to be an issue, as KIP-712 already
addresses them.
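
For reference, the magic v2 RecordBatch header already carries its own
length field (batchLength), which is why the request-level size became
redundant.  Trimmed from the protocol docs:

    baseOffset: int64
    batchLength: int32
    partitionLeaderEpoch: int32
    magic: int8 (current magic value is 2)
    crc: int32
    attributes: int16
    lastOffsetDelta: int32
    firstTimestamp: int64
    maxTimestamp: int64
    producerId: int64
    producerEpoch: int16
    baseSequence: int32
    records: [Record]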

The partial produce failure issue is something I don't understand. I can't
tell if this was done out of convenience at the time or if there is
something incompatible with partial produce success/failure and EOS. Does
anyone know?

Ryanne


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Ismael Juma <is...@juma.me.uk>.
Ryanne,

You misunderstood the referenced comment. It is about the produce request
change to have multiple batches:

"Up to ProduceRequest V2, a ProduceRequest can contain multiple batches of
messages stored in the record_set field, but this was disabled in V3.  We
are proposing to bring the multiple batches feature back to improve the
network throughput of the mirror maker producer when the original batch
size from source broker is too small."

This is unrelated to shallow iteration.

Ismael


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Ryanne Dolan <ry...@gmail.com>.
Ismael, I don't think KIP-98 is related. Shallow iteration was removed in
KAFKA-732, which predates KIP-98 by a few years.

Ryanne
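
For anyone unfamiliar with the term: shallow iteration walks batch
boundaries without opening the batches.  A minimal sketch against the
client record classes (`forwardRawBatch` is a made-up sink; the method is
only meant to show the iteration granularity):

    import org.apache.kafka.common.record.MemoryRecords;
    import org.apache.kafka.common.record.RecordBatch;

    // Iterate RecordBatch boundaries only; the (possibly compressed)
    // records inside each batch are never decompressed or deserialized.
    void mirrorShallow(MemoryRecords records) {
        for (RecordBatch batch : records.batches()) {
            // batch.sizeInBytes() covers the whole batch, header plus
            // payload, so the bytes can be forwarded as-is.
            forwardRawBatch(batch);
        }
    }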


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Ismael Juma <is...@juma.me.uk>.
Thanks for the KIP. I have a few high level comments:

1. Like Tom, I'm not convinced about the proposal to make this change to
MirrorMaker 1 if we intend to deprecate it and remove it. I would rather we
focus our efforts on the implementation we intend to support going forward.
2. The producer/consumer configs seem pretty dangerous for general usage,
but the KIP doesn't address the potential downsides.
3. How does the ProduceRequest change impact exactly-once (if at all)? The
change we are reverting was done as part of KIP-98. Have we considered the
original reasons for the change?

Thanks,
Ismael
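
To make point 2 concrete: a generic application opting into raw fetches
might look like the sketch below (the `fetch.raw.bytes` name comes from the
earlier discussion in this thread; the behavior in the comments is one
reading of the proposal, not settled API):

    import java.util.Properties;

    Properties props = new Properties();
    props.put("bootstrap.servers", "source-cluster:9092");
    props.put("group.id", "mirror-group");
    // Opt into shallow fetching: the consumer would hand back compressed
    // batch bytes rather than individual deserialized records, so ordinary
    // record-by-record processing logic would silently stop applying.
    props.put("fetch.raw.bytes", "true");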


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Ryanne Dolan <ry...@gmail.com>.
> so as to avoid creating a perverse incentive for users to not adopt MM2?

Tom, I know there are several of us chomping at the bit to enable this in
MM2, so I don't think that's a problem. AFAICT there won't be a separate
KIP required -- we just need to enable this new option in MM2 by default.

Ryanne
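
Concretely, I'd expect enabling it in MM2 to be a per-flow override in the
MM2 properties, something like the sketch below (property names taken from
the consumer/producer configs discussed earlier in this thread; the exact
MM2 wiring is a guess):

    # mm2.properties (illustrative only)
    source->target.consumer.fetch.raw.bytes = true
    source->target.producer.send.raw.bytes = true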


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Tom Bentley <tb...@redhat.com>.
Hi,

Thanks for the KIP. I've not yet read it in detail, so have no technical
points to make at this point.

I'm having trouble reconciling KIP-720's deprecation of MM1 with this
proposal to add a new feature to MM1 and not to MM2. I think this would
create a confusing impression for users. Given that the change to MM2 is
supposed to be straightforward, is there any reason not to implement it as
part of this KIP, so as to avoid creating a perverse incentive for users
not to adopt MM2?

Kind regards,

Tom


Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Henry Cai <hc...@pinterest.com.INVALID>.
If there are no additional comments, I'd start a vote in a couple of days.

On Sat, Feb 27, 2021 at 9:26 AM A S <as...@gmail.com> wrote:

> +1 to adding latency metrics.
>
> To add context on why CPU, memory, and GC have a bigger impact than network
> in a Mirror for compressed topics without KIP-712: *a failing /
> unstable mirror cluster will have lag perpetually spiking, with a much
> larger impact on e2e latencies*. To explain a bit more:
>
> Less data moved:
> Compressed topics "usually" should move less data over the network and are
> useful to reduce the network cost / footprint of replication. Therefore,
> network usage may naturally be less than if this data were uncompressed.
> Instead the CPU usage bottleneck hits first due to decompression of data.
> Prior to KIP-712 we had never been able to operate a mirror at wire speed.
>
> Stability:
> If there is a load spike, a few scenarios can play out:
> - more data in a batch, i.e. a larger uncompressed size, i.e. a larger
> memory footprint
> - a larger number of batches, i.e. a larger memory footprint
>
> In either case, higher memory usage and more CPU cycles are used due to
> this.
> If the GC throughput or heap size is insufficient, this leads to OOMs.
>
> Domino Effect:
> Just like any Kafka Consumer, if a consumer instance in a consumer group
> terminates, it triggers a rebalance. In this case that rebalance happens due
> to an OOM. A Mirror instance that fails due to an OOM triggered by
> traffic load (explained above) will result in a domino effect of more
> Mirror instances having OOMs as the load increases on the even smaller
> number of running instances remaining in the group, eventually leading to a
> total failure of the mirror cluster.
>
> Memory Limits & Ineffective workarounds:
> A question that could be asked is: couldn't we configure the Mirror
> instance in such a way that this doesn't happen? The answer is that it's
> expensive and difficult.
> Let's say we are using a 4-core host with X GBs of memory and configure
> the Mirror to use 4 Streams, and this configuration leads to an OOM. We
> could try to reduce the number of Streams to 3 or 2. That's a 25-50% loss
> in efficiency, i.e. we may now need up to 2x the number of nodes (& 2x
> cost), without any guarantee that this configuration will never result in
> an OOM (since future traffic characteristics are unpredictable); it may
> only reduce the probability of an OOM.
>
> Summary:
> Since the root cause is memory usage due to decompression of data in
> flight, the ideal way to resolve this was to eliminate the decompression
> of data, which isn't a hard requirement for the mirror to operate since it
> was not performing any transformation or repartitioning in our case.
>
> Thanks,
> - Ambud
>
> On Mon, Feb 22, 2021 at 9:20 AM Vahid Hashemian <va...@gmail.com>
> wrote:
>
>> As Henry mentions in the KIP, we are seeing a great deal of improvement
>> and efficiency by using the mirroring enhancement proposed in this KIP,
>> and believe it would be equally beneficial to everyone who runs Kafka and
>> Kafka Mirror at scale.
>>
>> I'm bumping up this thread in case there is any additional feedback or
>> comments.
>>
>> Thanks,
>> --Vahid
>>
>> On Sat, Feb 13, 2021, 13:59 Ryanne Dolan <ry...@gmail.com> wrote:
>>
>> > Glad to hear that latency and thruput aren't negatively affected
>> somehow. I
>> > would love to see this KIP move forward.
>> >
>> > Ryanne
>> >
>> > On Sat, Feb 13, 2021, 3:00 PM Henry Cai <hc...@pinterest.com> wrote:
>> >
>> > > Ryanne,
>> > >
>> > > Yes, network performance is also important.  In our deployment, we are
>> > > bottlenecked on the CPU/memory on the mirror hosts.  We are using c5.2x
>> > > and m5.2x nodes in AWS; before the deployment, CPU would peak at 80%
>> > > but there was enough network bandwidth left on those hosts.  Having
>> > > said that, we maintain the same network throughput before and after
>> > > the switch.
>> > >
>> > > On Fri, Feb 12, 2021 at 12:20 PM Ryanne Dolan <ry...@gmail.com>
>> > > wrote:
>> > >
>> > >> Hey Henry, great KIP. The performance improvements are impressive!
>> > >> However, often cpu, ram, gc are not the metrics most important to a
>> > >> replication pipeline -- often the network is mostly saturated
>> anyway. Do
>> > >> you know how this change affects latency or thruput? I suspect less
>> GC
>> > >> pressure means slightly less p99 latency, but it would be great to
>> see
>> > that
>> > >> confirmed. I don't think it's necessary that this KIP improves these
>> > >> metrics, but I think it's important to show that they at least aren't
>> > made
>> > >> worse.
>> > >>
>> > >> I suspect any improvement in MM1 would be magnified in MM2, given
>> there
>> > >> is a lot more machinery between consumer and producer in MM2.
>> > >>
>> > >>
>> > >> I'd like to do some performance analysis based on these changes.
>> Looking
>> > >> forward to a PR!
>> > >>
>> > >> Ryanne
>> > >>
>> > >> On Wed, Feb 10, 2021, 3:50 PM Henry Cai <hc...@pinterest.com> wrote:
>> > >>
>> > >>> On the question "whether shallow mirror is only applied on mirror
>> maker
>> > >>> v1", the code change is mostly on consumer and producer code path,
>> the
>> > >>> change to mirrormaker v1 is very trivial.  We chose to modify the
>> > >>> consumer/producer path (instead of creating a new mirror product) so
>> > other
>> > >>> use cases can use that feature as well.  The change to mirror maker
>> v2
>> > >>> should be straightforward as well but we don't have that
>> environment in
>> > >>> house.  I think the community can easily port this change to mirror
>> > maker
>> > >>> v2.
>> > >>>
>> > >>>
>> > >>>
>> > >>> On Wed, Feb 10, 2021 at 12:58 PM Vahid Hashemian <
>> > >>> vahid.hashemian@gmail.com> wrote:
>> > >>>
>> > >>>> Retitled the thread to conform to the common format.
>> > >>>>
>> > >>>> On Fri, Feb 5, 2021 at 4:00 PM Ning Zhang <ni...@gmail.com>
>> > >>>> wrote:
>> > >>>>
>> > >>>> > Hello Henry,
>> > >>>> >
>> > >>>> > This is a very interesting proposal.
>> > >>>> > https://issues.apache.org/jira/browse/KAFKA-10728 reflects the
>> > >>>> similar
>> > >>>> > concern of re-compressing data in mirror maker.
>> > >>>> >
>> > >>>> > Probably one thing may need to clarify is: how "shallow"
>> mirroring
>> > is
>> > >>>> only
>> > >>>> > applied to mirrormaker use case, if the changes need to be made
>> on
>> > >>>> generic
>> > >>>> > consumer and producer (e.g. by adding `fetch.raw.bytes` and
>> > >>>> > `send.raw.bytes` to producer and consumer config)
>> > >>>> >
>> > >>>> > On 2021/02/05 00:59:57, Henry Cai <hc...@pinterest.com.INVALID>
>> > wrote:
>> > >>>> > > Dear Community members,
>> > >>>> > >
>> > >>>> > > We are proposing a new feature to improve the performance of
>> Kafka
>> > >>>> mirror
>> > >>>> > > maker:
>> > >>>> > >
>> > >>>> >
>> > >>>>
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-712%3A+Shallow+Mirroring
>> > >>>> > >
>> > >>>> > > The current Kafka MirrorMaker process (with the underlying
>> > Consumer
>> > >>>> and
>> > >>>> > > Producer library) uses significant CPU cycles and memory to
>> > >>>> > > decompress/recompress, deserialize/re-serialize messages and
>> copy
>> > >>>> > multiple
>> > >>>> > > times of messages bytes along the mirroring/replicating stages.
>> > >>>> > >
>> > >>>> > > The KIP proposes a *shallow mirror* feature which brings back
>> the
>> > >>>> shallow
>> > >>>> > > iterator concept to the mirror process and also proposes to
>> skip
>> > the
>> > >>>> > > unnecessary message decompression and recompression steps.  We
>> > >>>> argue in
>> > >>>> > > many cases users just want a simple replication pipeline to
>> > >>>> replicate the
>> > >>>> > > message as it is from the source cluster to the destination
>> > >>>> cluster.  In
>> > >>>> > > many cases the messages in the source cluster are already
>> > >>>> compressed and
>> > >>>> > > properly batched, users just need an identical copy of the
>> message
>> > >>>> bytes
>> > >>>> > > through the mirroring without any transformation or
>> > repartitioning.
>> > >>>> > >
>> > >>>> > > We have a prototype implementation in house with MirrorMaker v1
>> > and
>> > >>>> > > observed *CPU usage dropped from 50% to 15%* for some mirror
>> > >>>> pipelines.
>> > >>>> > >
>> > >>>> > > We name this feature: *shallow mirroring* since it has some
>> > >>>> resemblance
>> > >>>> > to
>> > >>>> > > the old Kafka 0.7 namesake feature but the implementations are
>> not
>> > >>>> quite
>> > >>>> > > the same.  ‘*Shallow*’ means 1. we *shallowly* iterate
>> > RecordBatches
>> > >>>> > inside
>> > >>>> > > MemoryRecords structure instead of deep iterating records
>> inside
>> > >>>> > > RecordBatch; 2. We *shallowly* copy (share) pointers inside
>> > >>>> ByteBuffer
>> > >>>> > > instead of deep copying and deserializing bytes into objects.
>> > >>>> > >
>> > >>>> > > Please share discussions/feedback along this email thread.
>> > >>>> > >
>> > >>>> >
>> > >>>>
>> > >>>>
>> > >>>> --
>> > >>>>
>> > >>>> Thanks!
>> > >>>> --Vahid
>> > >>>>
>> > >>>
>> >
>>
>

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by A S <as...@gmail.com>.
+1 to adding latency metrics.

To add context on why CPU, memory, and GC have a bigger impact than network
in a Mirror for compressed topics without KIP-712: *a failing / unstable
mirror cluster will have lag perpetually spiking, which has a much larger
impact on e2e latencies*. To explain a bit more:

Less data moved:
Compressed topics "usually" should move less data over the network and are
useful to reduce the network cost / footprint of replication. Therefore,
network usage may naturally be less than if this data were uncompressed.
Instead, the CPU bottleneck hits first, due to decompression of data.
Prior to KIP-712, we had never been able to operate a mirror at wire speed.

Stability:
If there is a load spike, there can be a few scenarios played out:
- more data in a batch i.e. larger uncompressed size i.e. larger memory
footprint
- a larger number of batches i.e. larger memory footprint

In either case, memory usage is higher and more CPU cycles are consumed.
If the GC throughput or heap size is insufficient, this leads to OOMs.
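
As a rough back-of-envelope (every number below is an assumption for
illustration, not a measurement from our clusters), a single in-flight
fetch can expand several-fold on the heap once its batches are
deep-iterated:

public class MirrorHeapSketch {
    public static void main(String[] args) {
        long compressedFetchBytes = 10L * 1024 * 1024; // one 10 MiB FetchResponse (assumed)
        double compressionRatio = 4.0;                 // assumed 4:1 ratio for this topic
        int recordCount = 200_000;                     // assumed records in that fetch
        long perRecordOverhead = 48;                   // assumed per-record object overhead

        long decompressed = (long) (compressedFetchBytes * compressionRatio);
        long deepHeap = decompressed + (long) recordCount * perRecordOverhead;
        System.out.printf("deep mirror: ~%d MiB transient heap per in-flight fetch%n",
                deepHeap / (1024 * 1024));
        System.out.printf("shallow mirror: ~%d MiB (compressed bytes only)%n",
                compressedFetchBytes / (1024 * 1024));
    }
}

Multiply that by the number of streams and concurrent fetches and the heap
pressure adds up quickly.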

Domino Effect:
Just like any Kafka Consumer, if a consumer instance in a consumer group
terminates, it triggers a rebalance. In this case that rebalance happens due
to an OOM. A Mirror instance that fails due to an OOM triggered by traffic
load (explained above) will set off a domino effect: more Mirror instances
hit OOMs as the load increases on the even smaller number of running
instances remaining in the group, eventually leading to a total failure of
the mirror cluster.

Memory Limits & Ineffective workarounds:
A question that could be asked is: couldn't we configure the Mirror instance
in such a way that this doesn't happen? The answer is that it's expensive
and difficult.
Let's say we are using a 4-core host with X GBs of memory and configure the
Mirror to use 4 Streams, and this configuration leads to an OOM. We could
try to reduce the number of Streams to 3 or 2. That's a 25-50% loss in
efficiency, i.e. we may now need up to 2x the number of nodes (& 2x cost),
without any guarantee that this configuration will never result in an OOM
(since future traffic characteristics are unpredictable); it may only
reduce the probability of an OOM.

Summary:
Since the root cause is memory usage due to decompression of data in
flight, the ideal way to resolve this was to eliminate the decompression of
data, which isn't a hard requirement for the mirror to operate since it was
not performing any transformation or repartitioning in our case.
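
To make the decompression point concrete: the difference is which iterator
the mirror drives over the fetched bytes. Below is a minimal sketch against
the record classes shipped in kafka-clients (MemoryRecords.batches() vs
MemoryRecords.records()); it only illustrates the iteration difference and
is not the KIP patch itself:

import org.apache.kafka.common.record.MemoryRecords;
import org.apache.kafka.common.record.Record;
import org.apache.kafka.common.record.RecordBatch;

public class ShallowVsDeep {
    // Deep iteration: decompresses every batch and materializes every record.
    static long deepRecordCount(MemoryRecords records) {
        long n = 0;
        for (Record r : records.records()) { // decompression happens here
            n++;
        }
        return n;
    }

    // Shallow iteration: walks batch headers only; the compressed payload is
    // never decompressed and can be forwarded to the producer as-is.
    static long shallowBatchBytes(MemoryRecords records) {
        long bytes = 0;
        for (RecordBatch batch : records.batches()) {
            bytes += batch.sizeInBytes(); // header read, payload untouched
        }
        return bytes;
    }
}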

Thanks,
- Ambud

On Mon, Feb 22, 2021 at 9:20 AM Vahid Hashemian <va...@gmail.com>
wrote:

> As Henry mentions in the KIP, we are seeing a great deal of improvement
> and efficiency by using the mirroring enhancement proposed in this KIP,
> and believe it would be equally beneficial to everyone who runs Kafka and
> Kafka Mirror at scale.
>
> I'm bumping up this thread in case there is any additional feedback or
> comments.
>
> Thanks,
> --Vahid
>
> On Sat, Feb 13, 2021, 13:59 Ryanne Dolan <ry...@gmail.com> wrote:
>
> > Glad to hear that latency and thruput aren't negatively affected
> somehow. I
> > would love to see this KIP move forward.
> >
> > Ryanne
> >
> > On Sat, Feb 13, 2021, 3:00 PM Henry Cai <hc...@pinterest.com> wrote:
> >
> > > Ryanne,
> > >
> > > Yes, network performance is also important.  In our deployment, we are
> > > bottlenecked on the CPU/memory on the mirror hosts.  We are using c5.2x
> > > and m5.2x nodes in AWS; before the deployment, CPU would peak at 80%
> > > but there was enough network bandwidth left on those hosts.  Having
> > > said that, we maintain the same network throughput before and after
> > > the switch.
> > >
> > > On Fri, Feb 12, 2021 at 12:20 PM Ryanne Dolan <ry...@gmail.com>
> > > wrote:
> > >
> > >> Hey Henry, great KIP. The performance improvements are impressive!
> > >> However, often cpu, ram, gc are not the metrics most important to a
> > >> replication pipeline -- often the network is mostly saturated anyway.
> Do
> > >> you know how this change affects latency or thruput? I suspect less GC
> > >> pressure means slightly less p99 latency, but it would be great to see
> > that
> > >> confirmed. I don't think it's necessary that this KIP improves these
> > >> metrics, but I think it's important to show that they at least aren't
> > made
> > >> worse.
> > >>
> > >> I suspect any improvement in MM1 would be magnified in MM2, given
> there
> > >> is a lot more machinery between consumer and producer in MM2.
> > >>
> > >>
> > >> I'd like to do some performance analysis based on these changes.
> Looking
> > >> forward to a PR!
> > >>
> > >> Ryanne
> > >>
> > >> On Wed, Feb 10, 2021, 3:50 PM Henry Cai <hc...@pinterest.com> wrote:
> > >>
> > >>> On the question "whether shallow mirror is only applied on mirror
> maker
> > >>> v1", the code change is mostly on consumer and producer code path,
> the
> > >>> change to mirrormaker v1 is very trivial.  We chose to modify the
> > >>> consumer/producer path (instead of creating a new mirror product) so
> > other
> > >>> use cases can use that feature as well.  The change to mirror maker
> v2
> > >>> should be straightforward as well but we don't have that environment
> in
> > >>> house.  I think the community can easily port this change to mirror
> > maker
> > >>> v2.
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Feb 10, 2021 at 12:58 PM Vahid Hashemian <
> > >>> vahid.hashemian@gmail.com> wrote:
> > >>>
> > >>>> Retitled the thread to conform to the common format.
> > >>>>
> > >>>> On Fri, Feb 5, 2021 at 4:00 PM Ning Zhang <ni...@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>> > Hello Henry,
> > >>>> >
> > >>>> > This is a very interesting proposal.
> > >>>> > https://issues.apache.org/jira/browse/KAFKA-10728 reflects the
> > >>>> similar
> > >>>> > concern of re-compressing data in mirror maker.
> > >>>> >
> > >>>> > Probably one thing may need to clarify is: how "shallow" mirroring
> > is
> > >>>> only
> > >>>> > applied to mirrormaker use case, if the changes need to be made on
> > >>>> generic
> > >>>> > consumer and producer (e.g. by adding `fetch.raw.bytes` and
> > >>>> > `send.raw.bytes` to producer and consumer config)
> > >>>> >
> > >>>> > On 2021/02/05 00:59:57, Henry Cai <hc...@pinterest.com.INVALID>
> > wrote:
> > >>>> > > Dear Community members,
> > >>>> > >
> > >>>> > > We are proposing a new feature to improve the performance of
> Kafka
> > >>>> mirror
> > >>>> > > maker:
> > >>>> > >
> > >>>> >
> > >>>>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-712%3A+Shallow+Mirroring
> > >>>> > >
> > >>>> > > The current Kafka MirrorMaker process (with the underlying
> > Consumer
> > >>>> and
> > >>>> > > Producer library) uses significant CPU cycles and memory to
> > >>>> > > decompress/recompress, deserialize/re-serialize messages and
> copy
> > >>>> > multiple
> > >>>> > > times of messages bytes along the mirroring/replicating stages.
> > >>>> > >
> > >>>> > > The KIP proposes a *shallow mirror* feature which brings back
> the
> > >>>> shallow
> > >>>> > > iterator concept to the mirror process and also proposes to skip
> > the
> > >>>> > > unnecessary message decompression and recompression steps.  We
> > >>>> argue in
> > >>>> > > many cases users just want a simple replication pipeline to
> > >>>> replicate the
> > >>>> > > message as it is from the source cluster to the destination
> > >>>> cluster.  In
> > >>>> > > many cases the messages in the source cluster are already
> > >>>> compressed and
> > >>>> > > properly batched, users just need an identical copy of the
> message
> > >>>> bytes
> > >>>> > > through the mirroring without any transformation or
> > repartitioning.
> > >>>> > >
> > >>>> > > We have a prototype implementation in house with MirrorMaker v1
> > and
> > >>>> > > observed *CPU usage dropped from 50% to 15%* for some mirror
> > >>>> pipelines.
> > >>>> > >
> > >>>> > > We name this feature: *shallow mirroring* since it has some
> > >>>> resemblance
> > >>>> > to
> > >>>> > > the old Kafka 0.7 namesake feature but the implementations are
> not
> > >>>> quite
> > >>>> > > the same.  ‘*Shallow*’ means 1. we *shallowly* iterate
> > RecordBatches
> > >>>> > inside
> > >>>> > > MemoryRecords structure instead of deep iterating records inside
> > >>>> > > RecordBatch; 2. We *shallowly* copy (share) pointers inside
> > >>>> ByteBuffer
> > >>>> > > instead of deep copying and deserializing bytes into objects.
> > >>>> > >
> > >>>> > > Please share discussions/feedback along this email thread.
> > >>>> > >
> > >>>> >
> > >>>>
> > >>>>
> > >>>> --
> > >>>>
> > >>>> Thanks!
> > >>>> --Vahid
> > >>>>
> > >>>
> >
>

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Vahid Hashemian <va...@gmail.com>.
As Henry mentions in the KIP, we are seeing a great deal of improvement and
efficiency by using the mirroring enhancement proposed in this KIP, and
believe it would be equally beneficial to everyone who runs Kafka and Kafka
Mirror at scale.

I'm bumping up this thread in case there is any additional feedback or
comments.

Thanks,
--Vahid

On Sat, Feb 13, 2021, 13:59 Ryanne Dolan <ry...@gmail.com> wrote:

> Glad to hear that latency and thruput aren't negatively affected somehow. I
> would love to see this KIP move forward.
>
> Ryanne
>
> On Sat, Feb 13, 2021, 3:00 PM Henry Cai <hc...@pinterest.com> wrote:
>
> > Ryanne,
> >
> > Yes, network performance is also important.  In our deployment, we are
> > bottlenecked on the CPU/memory on the mirror hosts.  We are using c5.2x
> > and m5.2x nodes in AWS; before the deployment, CPU would peak at 80% but
> > there was enough network bandwidth left on those hosts.  Having said
> > that, we maintain the same network throughput before and after the
> > switch.
> >
> > On Fri, Feb 12, 2021 at 12:20 PM Ryanne Dolan <ry...@gmail.com>
> > wrote:
> >
> >> Hey Henry, great KIP. The performance improvements are impressive!
> >> However, often cpu, ram, gc are not the metrics most important to a
> >> replication pipeline -- often the network is mostly saturated anyway. Do
> >> you know how this change affects latency or thruput? I suspect less GC
> >> pressure means slightly less p99 latency, but it would be great to see
> that
> >> confirmed. I don't think it's necessary that this KIP improves these
> >> metrics, but I think it's important to show that they at least aren't
> made
> >> worse.
> >>
> >> I suspect any improvement in MM1 would be magnified in MM2, given there
> >> is a lot more machinery between consumer and producer in MM2.
> >>
> >>
> >> I'd like to do some performance analysis based on these changes. Looking
> >> forward to a PR!
> >>
> >> Ryanne
> >>
> >> On Wed, Feb 10, 2021, 3:50 PM Henry Cai <hc...@pinterest.com> wrote:
> >>
> >>> On the question "whether shallow mirror is only applied on mirror maker
> >>> v1", the code change is mostly on consumer and producer code path, the
> >>> change to mirrormaker v1 is very trivial.  We chose to modify the
> >>> consumer/producer path (instead of creating a new mirror product) so
> other
> >>> use cases can use that feature as well.  The change to mirror maker v2
> >>> should be straightforward as well but we don't have that environment in
> >>> house.  I think the community can easily port this change to mirror
> maker
> >>> v2.
> >>>
> >>>
> >>>
> >>> On Wed, Feb 10, 2021 at 12:58 PM Vahid Hashemian <
> >>> vahid.hashemian@gmail.com> wrote:
> >>>
> >>>> Retitled the thread to conform to the common format.
> >>>>
> >>>> On Fri, Feb 5, 2021 at 4:00 PM Ning Zhang <ni...@gmail.com>
> >>>> wrote:
> >>>>
> >>>> > Hello Henry,
> >>>> >
> >>>> > This is a very interesting proposal.
> >>>> > https://issues.apache.org/jira/browse/KAFKA-10728 reflects the
> >>>> similar
> >>>> > concern of re-compressing data in mirror maker.
> >>>> >
> >>>> > Probably one thing may need to clarify is: how "shallow" mirroring
> is
> >>>> only
> >>>> > applied to mirrormaker use case, if the changes need to be made on
> >>>> generic
> >>>> > consumer and producer (e.g. by adding `fetch.raw.bytes` and
> >>>> > `send.raw.bytes` to producer and consumer config)
> >>>> >
> >>>> > On 2021/02/05 00:59:57, Henry Cai <hc...@pinterest.com.INVALID>
> wrote:
> >>>> > > Dear Community members,
> >>>> > >
> >>>> > > We are proposing a new feature to improve the performance of Kafka
> >>>> mirror
> >>>> > > maker:
> >>>> > >
> >>>> >
> >>>>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-712%3A+Shallow+Mirroring
> >>>> > >
> >>>> > > The current Kafka MirrorMaker process (with the underlying
> Consumer
> >>>> and
> >>>> > > Producer library) uses significant CPU cycles and memory to
> >>>> > > decompress/recompress, deserialize/re-serialize messages and copy
> >>>> > multiple
> >>>> > > times of messages bytes along the mirroring/replicating stages.
> >>>> > >
> >>>> > > The KIP proposes a *shallow mirror* feature which brings back the
> >>>> shallow
> >>>> > > iterator concept to the mirror process and also proposes to skip
> the
> >>>> > > unnecessary message decompression and recompression steps.  We
> >>>> argue in
> >>>> > > many cases users just want a simple replication pipeline to
> >>>> replicate the
> >>>> > > message as it is from the source cluster to the destination
> >>>> cluster.  In
> >>>> > > many cases the messages in the source cluster are already
> >>>> compressed and
> >>>> > > properly batched, users just need an identical copy of the message
> >>>> bytes
> >>>> > > through the mirroring without any transformation or
> repartitioning.
> >>>> > >
> >>>> > > We have a prototype implementation in house with MirrorMaker v1
> and
> >>>> > > observed *CPU usage dropped from 50% to 15%* for some mirror
> >>>> pipelines.
> >>>> > >
> >>>> > > We name this feature: *shallow mirroring* since it has some
> >>>> resemblance
> >>>> > to
> >>>> > > the old Kafka 0.7 namesake feature but the implementations are not
> >>>> quite
> >>>> > > the same.  ‘*Shallow*’ means 1. we *shallowly* iterate
> RecordBatches
> >>>> > inside
> >>>> > > MemoryRecords structure instead of deep iterating records inside
> >>>> > > RecordBatch; 2. We *shallowly* copy (share) pointers inside
> >>>> ByteBuffer
> >>>> > > instead of deep copying and deserializing bytes into objects.
> >>>> > >
> >>>> > > Please share discussions/feedback along this email thread.
> >>>> > >
> >>>> >
> >>>>
> >>>>
> >>>> --
> >>>>
> >>>> Thanks!
> >>>> --Vahid
> >>>>
> >>>
>

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Ryanne Dolan <ry...@gmail.com>.
Glad to hear that latency and thruput aren't negatively affected somehow. I
would love to see this KIP move forward.

Ryanne

On Sat, Feb 13, 2021, 3:00 PM Henry Cai <hc...@pinterest.com> wrote:

> Ryanne,
>
> Yes, network performance is also important.  In our deployment, we are
> bottlenecked on the CPU/memory on the mirror hosts.  We are using c5.2x and
> m5.2x nodes in AWS; before the deployment, CPU would peak at 80% but there
> was enough network bandwidth left on those hosts.  Having said that, we
> maintain the same network throughput before and after the switch.
>
> On Fri, Feb 12, 2021 at 12:20 PM Ryanne Dolan <ry...@gmail.com>
> wrote:
>
>> Hey Henry, great KIP. The performance improvements are impressive!
>> However, often cpu, ram, gc are not the metrics most important to a
>> replication pipeline -- often the network is mostly saturated anyway. Do
>> you know how this change affects latency or thruput? I suspect less GC
>> pressure means slightly less p99 latency, but it would be great to see that
>> confirmed. I don't think it's necessary that this KIP improves these
>> metrics, but I think it's important to show that they at least aren't made
>> worse.
>>
>> I suspect any improvement in MM1 would be magnified in MM2, given there
>> is a lot more machinery between consumer and producer in MM2.
>>
>>
>> I'd like to do some performance analysis based on these changes. Looking
>> forward to a PR!
>>
>> Ryanne
>>
>> On Wed, Feb 10, 2021, 3:50 PM Henry Cai <hc...@pinterest.com> wrote:
>>
>>> On the question "whether shallow mirror is only applied on mirror maker
>>> v1", the code change is mostly on consumer and producer code path, the
>>> change to mirrormaker v1 is very trivial.  We chose to modify the
>>> consumer/producer path (instead of creating a new mirror product) so other
>>> use cases can use that feature as well.  The change to mirror maker v2
>>> should be straightforward as well but we don't have that environment in
>>> house.  I think the community can easily port this change to mirror maker
>>> v2.
>>>
>>>
>>>
>>> On Wed, Feb 10, 2021 at 12:58 PM Vahid Hashemian <
>>> vahid.hashemian@gmail.com> wrote:
>>>
>>>> Retitled the thread to conform to the common format.
>>>>
>>>> On Fri, Feb 5, 2021 at 4:00 PM Ning Zhang <ni...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hello Henry,
>>>> >
>>>> > This is a very interesting proposal.
>>>> > https://issues.apache.org/jira/browse/KAFKA-10728 reflects the
>>>> similar
>>>> > concern of re-compressing data in mirror maker.
>>>> >
>>>> > Probably one thing may need to clarify is: how "shallow" mirroring is
>>>> only
>>>> > applied to mirrormaker use case, if the changes need to be made on
>>>> generic
>>>> > consumer and producer (e.g. by adding `fetch.raw.bytes` and
>>>> > `send.raw.bytes` to producer and consumer config)
>>>> >
>>>> > On 2021/02/05 00:59:57, Henry Cai <hc...@pinterest.com.INVALID> wrote:
>>>> > > Dear Community members,
>>>> > >
>>>> > > We are proposing a new feature to improve the performance of Kafka
>>>> mirror
>>>> > > maker:
>>>> > >
>>>> >
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-712%3A+Shallow+Mirroring
>>>> > >
>>>> > > The current Kafka MirrorMaker process (with the underlying Consumer
>>>> and
>>>> > > Producer library) uses significant CPU cycles and memory to
>>>> > > decompress/recompress, deserialize/re-serialize messages and copy
>>>> > multiple
>>>> > > times of messages bytes along the mirroring/replicating stages.
>>>> > >
>>>> > > The KIP proposes a *shallow mirror* feature which brings back the
>>>> shallow
>>>> > > iterator concept to the mirror process and also proposes to skip the
>>>> > > unnecessary message decompression and recompression steps.  We
>>>> argue in
>>>> > > many cases users just want a simple replication pipeline to
>>>> replicate the
>>>> > > message as it is from the source cluster to the destination
>>>> cluster.  In
>>>> > > many cases the messages in the source cluster are already
>>>> compressed and
>>>> > > properly batched, users just need an identical copy of the message
>>>> bytes
>>>> > > through the mirroring without any transformation or repartitioning.
>>>> > >
>>>> > > We have a prototype implementation in house with MirrorMaker v1 and
>>>> > > observed *CPU usage dropped from 50% to 15%* for some mirror
>>>> pipelines.
>>>> > >
>>>> > > We name this feature: *shallow mirroring* since it has some
>>>> resemblance
>>>> > to
>>>> > > the old Kafka 0.7 namesake feature but the implementations are not
>>>> quite
>>>> > > the same.  ‘*Shallow*’ means 1. we *shallowly* iterate RecordBatches
>>>> > inside
>>>> > > MemoryRecords structure instead of deep iterating records inside
>>>> > > RecordBatch; 2. We *shallowly* copy (share) pointers inside
>>>> ByteBuffer
>>>> > > instead of deep copying and deserializing bytes into objects.
>>>> > >
>>>> > > Please share discussions/feedback along this email thread.
>>>> > >
>>>> >
>>>>
>>>>
>>>> --
>>>>
>>>> Thanks!
>>>> --Vahid
>>>>
>>>

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Henry Cai <hc...@pinterest.com.INVALID>.
Ryanne,

Yes, network performance is also important.  In our deployment, we are
bottlenecked on the CPU/memory on the mirror hosts.  We are using c5.2x and
m5.2x nodes in AWS; before the deployment, CPU would peak at 80% but there
was enough network bandwidth left on those hosts.  Having said that, we
maintain the same network throughput before and after the switch.

On Fri, Feb 12, 2021 at 12:20 PM Ryanne Dolan <ry...@gmail.com> wrote:

> Hey Henry, great KIP. The performance improvements are impressive!
> However, often cpu, ram, gc are not the metrics most important to a
> replication pipeline -- often the network is mostly saturated anyway. Do
> you know how this change affects latency or thruput? I suspect less GC
> pressure means slightly less p99 latency, but it would be great to see that
> confirmed. I don't think it's necessary that this KIP improves these
> metrics, but I think it's important to show that they at least aren't made
> worse.
>
> I suspect any improvement in MM1 would be magnified in MM2, given there is
> a lot more machinery between consumer and producer in MM2.
>
>
> I'd like to do some performance analysis based on these changes. Looking
> forward to a PR!
>
> Ryanne
>
> On Wed, Feb 10, 2021, 3:50 PM Henry Cai <hc...@pinterest.com> wrote:
>
>> On the question "whether shallow mirror is only applied on mirror maker
>> v1", the code change is mostly on consumer and producer code path, the
>> change to mirrormaker v1 is very trivial.  We chose to modify the
>> consumer/producer path (instead of creating a new mirror product) so other
>> use cases can use that feature as well.  The change to mirror maker v2
>> should be straightforward as well but we don't have that environment in
>> house.  I think the community can easily port this change to mirror maker
>> v2.
>>
>>
>>
>> On Wed, Feb 10, 2021 at 12:58 PM Vahid Hashemian <
>> vahid.hashemian@gmail.com> wrote:
>>
>>> Retitled the thread to conform to the common format.
>>>
>>> On Fri, Feb 5, 2021 at 4:00 PM Ning Zhang <ni...@gmail.com>
>>> wrote:
>>>
>>> > Hello Henry,
>>> >
>>> > This is a very interesting proposal.
>>> > https://issues.apache.org/jira/browse/KAFKA-10728 reflects the similar
>>> > concern of re-compressing data in mirror maker.
>>> >
>>> > Probably one thing may need to clarify is: how "shallow" mirroring is
>>> only
>>> > applied to mirrormaker use case, if the changes need to be made on
>>> generic
>>> > consumer and producer (e.g. by adding `fetch.raw.bytes` and
>>> > `send.raw.bytes` to producer and consumer config)
>>> >
>>> > On 2021/02/05 00:59:57, Henry Cai <hc...@pinterest.com.INVALID> wrote:
>>> > > Dear Community members,
>>> > >
>>> > > We are proposing a new feature to improve the performance of Kafka
>>> mirror
>>> > > maker:
>>> > >
>>> >
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-712%3A+Shallow+Mirroring
>>> > >
>>> > > The current Kafka MirrorMaker process (with the underlying Consumer
>>> and
>>> > > Producer library) uses significant CPU cycles and memory to
>>> > > decompress/recompress, deserialize/re-serialize messages and copy
>>> > multiple
>>> > > times of messages bytes along the mirroring/replicating stages.
>>> > >
>>> > > The KIP proposes a *shallow mirror* feature which brings back the
>>> shallow
>>> > > iterator concept to the mirror process and also proposes to skip the
>>> > > unnecessary message decompression and recompression steps.  We argue
>>> in
>>> > > many cases users just want a simple replication pipeline to
>>> replicate the
>>> > > message as it is from the source cluster to the destination
>>> cluster.  In
>>> > > many cases the messages in the source cluster are already compressed
>>> and
>>> > > properly batched, users just need an identical copy of the message
>>> bytes
>>> > > through the mirroring without any transformation or repartitioning.
>>> > >
>>> > > We have a prototype implementation in house with MirrorMaker v1 and
>>> > > observed *CPU usage dropped from 50% to 15%* for some mirror
>>> pipelines.
>>> > >
>>> > > We name this feature: *shallow mirroring* since it has some
>>> resemblance
>>> > to
>>> > > the old Kafka 0.7 namesake feature but the implementations are not
>>> quite
>>> > > the same.  ‘*Shallow*’ means 1. we *shallowly* iterate RecordBatches
>>> > inside
>>> > > MemoryRecords structure instead of deep iterating records inside
>>> > > RecordBatch; 2. We *shallowly* copy (share) pointers inside
>>> ByteBuffer
>>> > > instead of deep copying and deserializing bytes into objects.
>>> > >
>>> > > Please share discussions/feedback along this email thread.
>>> > >
>>> >
>>>
>>>
>>> --
>>>
>>> Thanks!
>>> --Vahid
>>>
>>

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Ryanne Dolan <ry...@gmail.com>.
Hey Henry, great KIP. The performance improvements are impressive! However,
often cpu, ram, gc are not the metrics most important to a replication
pipeline -- often the network is mostly saturated anyway. Do you know how
this change affects latency or thruput? I suspect less GC pressure means
slightly less p99 latency, but it would be great to see that confirmed. I
don't think it's necessary that this KIP improves these metrics, but I
think it's important to show that they at least aren't made worse.
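
For example, a quick way to eyeball e2e latency on the target cluster is a
timestamp-based probe -- rough sketch below (the bootstrap address and topic
name are placeholders, and it assumes CreateTime timestamps are carried
through the mirror unchanged):

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MirrorLagProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "target-cluster:9092"); // placeholder
        props.put("group.id", "mirror-lag-probe");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("mirrored-topic")); // placeholder topic
            while (true) {
                for (ConsumerRecord<byte[], byte[]> rec :
                        consumer.poll(Duration.ofSeconds(1))) {
                    // CreateTime is stamped by the original producer, so
                    // now - timestamp approximates source-to-target latency.
                    long lagMs = System.currentTimeMillis() - rec.timestamp();
                    System.out.println("e2e lag ms: " + lagMs);
                }
            }
        }
    }
}

Feeding those samples into a histogram would show whether p99 moves.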

I suspect any improvement in MM1 would be magnified in MM2, given there is
a lot more machinery between consumer and producer in MM2.


I'd like to do some performance analysis based on these changes. Looking
forward to a PR!

Ryanne

On Wed, Feb 10, 2021, 3:50 PM Henry Cai <hc...@pinterest.com> wrote:

> On the question "whether shallow mirror is only applied on mirror maker
> v1", the code change is mostly on consumer and producer code path, the
> change to mirrormaker v1 is very trivial.  We chose to modify the
> consumer/producer path (instead of creating a new mirror product) so other
> use cases can use that feature as well.  The change to mirror maker v2
> should be straightforward as well but we don't have that environment in
> house.  I think the community can easily port this change to mirror maker
> v2.
>
>
>
> On Wed, Feb 10, 2021 at 12:58 PM Vahid Hashemian <
> vahid.hashemian@gmail.com> wrote:
>
>> Retitled the thread to conform to the common format.
>>
>> On Fri, Feb 5, 2021 at 4:00 PM Ning Zhang <ni...@gmail.com> wrote:
>>
>> > Hello Henry,
>> >
>> > This is a very interesting proposal.
>> > https://issues.apache.org/jira/browse/KAFKA-10728 reflects the similar
>> > concern of re-compressing data in mirror maker.
>> >
>> > Probably one thing may need to clarify is: how "shallow" mirroring is
>> only
>> > applied to mirrormaker use case, if the changes need to be made on
>> generic
>> > consumer and producer (e.g. by adding `fetch.raw.bytes` and
>> > `send.raw.bytes` to producer and consumer config)
>> >
>> > On 2021/02/05 00:59:57, Henry Cai <hc...@pinterest.com.INVALID> wrote:
>> > > Dear Community members,
>> > >
>> > > We are proposing a new feature to improve the performance of Kafka
>> mirror
>> > > maker:
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-712%3A+Shallow+Mirroring
>> > >
>> > > The current Kafka MirrorMaker process (with the underlying Consumer
>> and
>> > > Producer library) uses significant CPU cycles and memory to
>> > > decompress/recompress, deserialize/re-serialize messages and copy
>> > multiple
>> > > times of messages bytes along the mirroring/replicating stages.
>> > >
>> > > The KIP proposes a *shallow mirror* feature which brings back the
>> shallow
>> > > iterator concept to the mirror process and also proposes to skip the
>> > > unnecessary message decompression and recompression steps.  We argue
>> in
>> > > many cases users just want a simple replication pipeline to replicate
>> the
>> > > message as it is from the source cluster to the destination cluster.
>> In
>> > > many cases the messages in the source cluster are already compressed
>> and
>> > > properly batched, users just need an identical copy of the message
>> bytes
>> > > through the mirroring without any transformation or repartitioning.
>> > >
>> > > We have a prototype implementation in house with MirrorMaker v1 and
>> > > observed *CPU usage dropped from 50% to 15%* for some mirror
>> pipelines.
>> > >
>> > > We name this feature: *shallow mirroring* since it has some
>> resemblance
>> > to
>> > > the old Kafka 0.7 namesake feature but the implementations are not
>> quite
>> > > the same.  ‘*Shallow*’ means 1. we *shallowly* iterate RecordBatches
>> > inside
>> > > MemoryRecords structure instead of deep iterating records inside
>> > > RecordBatch; 2. We *shallowly* copy (share) pointers inside ByteBuffer
>> > > instead of deep copying and deserializing bytes into objects.
>> > >
>> > > Please share discussions/feedback along this email thread.
>> > >
>> >
>>
>>
>> --
>>
>> Thanks!
>> --Vahid
>>
>

Re: [DISCUSS] KIP-712: Shallow Mirroring

Posted by Henry Cai <hc...@pinterest.com.INVALID>.
On the question "whether shallow mirror is only applied on mirror maker
v1", the code change is mostly on consumer and producer code path, the
change to mirrormaker v1 is very trivial.  We chose to modify the
consumer/producer path (instead of creating a new mirror product) so other
use cases can use that feature as well.  The change to mirror maker v2
should be straightforward as well but we don't have that environment in
house.  I think the community can easily port this change to mirror maker
v2.
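
To give a feel for the shape of the change, here is a hypothetical sketch
of the mirror loop if the consumer/producer are started with the
fetch.raw.bytes / send.raw.bytes configs mentioned earlier in the thread.
The RawConsumer/RawProducer interfaces are stand-ins invented for
illustration; the actual API surface in the KIP may differ:

import java.nio.ByteBuffer;
import java.time.Duration;

public class ShallowMirrorLoop {
    // Stand-in for a consumer configured with fetch.raw.bytes=true
    // (hypothetical interface, for illustration only).
    interface RawConsumer extends AutoCloseable {
        Iterable<ByteBuffer> pollRawBatches(Duration timeout); // compressed RecordBatch bytes
    }

    // Stand-in for a producer configured with send.raw.bytes=true.
    interface RawProducer extends AutoCloseable {
        void sendRawBatch(String topic, ByteBuffer batch); // forwarded verbatim
    }

    static void mirror(RawConsumer consumer, RawProducer producer, String topic)
            throws Exception {
        while (true) {
            for (ByteBuffer batch : consumer.pollRawBatches(Duration.ofMillis(500))) {
                // No decompress/recompress and no deserialize/re-serialize:
                // the ByteBuffer slice from the FetchResponse is handed
                // straight to the ProduceRequest.
                producer.sendRawBatch(topic, batch);
            }
        }
    }
}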



On Wed, Feb 10, 2021 at 12:58 PM Vahid Hashemian <va...@gmail.com>
wrote:

> Retitled the thread to conform to the common format.
>
> On Fri, Feb 5, 2021 at 4:00 PM Ning Zhang <ni...@gmail.com> wrote:
>
> > Hello Henry,
> >
> > This is a very interesting proposal.
> > https://issues.apache.org/jira/browse/KAFKA-10728 reflects the similar
> > concern of re-compressing data in mirror maker.
> >
> > Probably one thing may need to clarify is: how "shallow" mirroring is
> only
> > applied to mirrormaker use case, if the changes need to be made on
> generic
> > consumer and producer (e.g. by adding `fetch.raw.bytes` and
> > `send.raw.bytes` to producer and consumer config)
> >
> > On 2021/02/05 00:59:57, Henry Cai <hc...@pinterest.com.INVALID> wrote:
> > > Dear Community members,
> > >
> > > We are proposing a new feature to improve the performance of Kafka
> mirror
> > > maker:
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-712%3A+Shallow+Mirroring
> > >
> > > The current Kafka MirrorMaker process (with the underlying Consumer and
> > > Producer library) uses significant CPU cycles and memory to
> > > decompress/recompress, deserialize/re-serialize messages and copy
> > multiple
> > > times of messages bytes along the mirroring/replicating stages.
> > >
> > > The KIP proposes a *shallow mirror* feature which brings back the
> shallow
> > > iterator concept to the mirror process and also proposes to skip the
> > > unnecessary message decompression and recompression steps.  We argue in
> > > many cases users just want a simple replication pipeline to replicate
> the
> > > message as it is from the source cluster to the destination cluster.
> In
> > > many cases the messages in the source cluster are already compressed
> and
> > > properly batched, users just need an identical copy of the message
> bytes
> > > through the mirroring without any transformation or repartitioning.
> > >
> > > We have a prototype implementation in house with MirrorMaker v1 and
> > > observed *CPU usage dropped from 50% to 15%* for some mirror pipelines.
> > >
> > > We name this feature: *shallow mirroring* since it has some resemblance
> > to
> > > the old Kafka 0.7 namesake feature but the implementations are not
> quite
> > > the same.  ‘*Shallow*’ means 1. we *shallowly* iterate RecordBatches
> > inside
> > > MemoryRecords structure instead of deep iterating records inside
> > > RecordBatch; 2. We *shallowly* copy (share) pointers inside ByteBuffer
> > > instead of deep copying and deserializing bytes into objects.
> > >
> > > Please share discussions/feedback along this email thread.
> > >
> >
>
>
> --
>
> Thanks!
> --Vahid
>