Posted to jira@kafka.apache.org by "Guozhang Wang (Jira)" <ji...@apache.org> on 2020/04/16 18:20:00 UTC

[jira] [Commented] (KAFKA-9860) Transactional Producer could pre-add partitions

    [ https://issues.apache.org/jira/browse/KAFKA-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085166#comment-17085166 ] 

Guozhang Wang commented on KAFKA-9860:
--------------------------------------

I think this is a good idea. We should probably list all the pros and cons in more detail in the description above. For example, in this model, if the txn coordinator times out a txn without seeing an EndTxn request, it has to send a marker to all the recorded partitions, which may be a superset of the partitions that actually had data sent to them.
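
To make the timeout case concrete, here is a rough sketch in Java. It is only a toy model of the bookkeeping, not the coordinator's actual code: the class name, method names and partition names are all made up for illustration, and partitions are plain strings.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of which partitions would receive a transaction marker.
// None of these names exist in Kafka; they only illustrate the trade-off.
public class MarkerTargets {

    // Normal completion: the proposed EndTxn carries the partitions that were
    // actually written, so markers go only to those.
    static Set<String> onEndTxn(Set<String> writtenPartitions) {
        return new TreeSet<>(writtenPartitions);
    }

    // Timeout: no EndTxn was seen, so the coordinator only knows the recorded
    // (pre-added) partitions and has to write markers to all of them.
    static Set<String> onTimeout(Set<String> recordedPartitions) {
        return new TreeSet<>(recordedPartitions);
    }

    public static void main(String[] args) {
        Set<String> recorded = Set.of("out-0", "out-1", "out-2"); // pre-added at txn start
        Set<String> written = Set.of("out-0");                    // actually received data

        System.out.println("markers on EndTxn:  " + onEndTxn(written));
        System.out.println("markers on timeout: " + onTimeout(recorded));
    }
}
```

In this toy example the timeout path sends markers to two partitions that never received data, which is the extra cost mentioned above.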

> Transactional Producer could pre-add partitions
> -----------------------------------------------
>
>                 Key: KAFKA-9860
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9860
>             Project: Kafka
>          Issue Type: Improvement
>          Components: producer 
>            Reporter: Boyang Chen
>            Priority: Major
>              Labels: need-kip
>
> As of today, the Producer transaction manager keeps track of the partitions involved in the current transaction. Each time it sees a new partition, it sends a request to the broker to add the involved partitions, which results in multiple requests. If we could batch this work at the beginning of the transaction, we would save unnecessary round trips.
> The idea is that, most of the time, the output partitions for a Producer stay constant over time, so we could use the partitions affected by the last transaction to do a batched `AddPartitionsToTxn` first, and bump the EndTxn request with a field listing the partitions that were actually written to. The transaction coordinator would then send markers only to the partitions included in the EndTxn. If the first batch is not a superset of the partitions affected as we produce data, we would still need a second `AddPartitionsToTxn` call.
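
The flow described in the issue can be sketched roughly as follows. This is a simplified illustration under assumed names (PreAddSketch, followUpAdd and the partition strings are hypothetical and not part of the Kafka producer); it models only the set arithmetic, not the actual requests.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the pre-add idea; not the producer's real transaction manager.
public class PreAddSketch {

    // Partitions touched in this transaction that were not covered by the
    // initial batch and therefore still need a follow-up add request.
    static Set<String> followUpAdd(Set<String> preAdded, Set<String> touched) {
        Set<String> missing = new TreeSet<>(touched);
        missing.removeAll(preAdded);
        return missing;
    }

    public static void main(String[] args) {
        Set<String> lastTxn = Set.of("out-0", "out-1"); // partitions of the previous transaction
        Set<String> touched = Set.of("out-1", "out-2"); // partitions written in this transaction

        // One batched add at the start of the transaction, based on the last one.
        Set<String> preAdded = new TreeSet<>(lastTxn);

        System.out.println("pre-added:     " + preAdded);
        System.out.println("follow-up add: " + followUpAdd(preAdded, touched));
        // The proposed EndTxn field would list only the partitions actually written.
        System.out.println("EndTxn field:  " + new TreeSet<>(touched));
    }
}
```

When the previous transaction's partitions already cover everything written, followUpAdd returns an empty set and no second add request is needed.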



--
This message was sent by Atlassian Jira
(v8.3.4#803005)