Posted to jira@kafka.apache.org by "hachikuji (via GitHub)" <gi...@apache.org> on 2023/02/15 19:55:45 UTC

[GitHub] [kafka] hachikuji commented on a diff in pull request #13231: KAFKA-14402: Update AddPartitionsToTxn protocol to batch and handle verifyOnly requests

hachikuji commented on code in PR #13231:
URL: https://github.com/apache/kafka/pull/13231#discussion_r1107606857


##########
clients/src/main/java/org/apache/kafka/common/requests/AddPartitionsToTxnRequest.java:
##########
@@ -35,21 +44,43 @@ public class AddPartitionsToTxnRequest extends AbstractRequest {
     private final AddPartitionsToTxnRequestData data;
 
     private List<TopicPartition> cachedPartitions = null;
+    
+    private Map<String, List<TopicPartition>> cachedPartitionsByTransaction = null;
+
+    private final short version;
 
     public static class Builder extends AbstractRequest.Builder<AddPartitionsToTxnRequest> {
         public final AddPartitionsToTxnRequestData data;
+        public final boolean isClientRequest;
 
-        public Builder(final AddPartitionsToTxnRequestData data) {
+        // Only used for versions < 4
+        public Builder(String transactionalId,
+                       long producerId,
+                       short producerEpoch,
+                       List<TopicPartition> partitions) {
             super(ApiKeys.ADD_PARTITIONS_TO_TXN);
-            this.data = data;
+            this.isClientRequest = true;
+
+            AddPartitionsToTxnTopicCollection topics = compileTopics(partitions);
+
+            this.data = new AddPartitionsToTxnRequestData()
+                    .setTransactionalId(transactionalId)
+                    .setProducerId(producerId)
+                    .setProducerEpoch(producerEpoch)
+                    .setTopics(topics);
         }
 
-        public Builder(final String transactionalId,
-                       final long producerId,
-                       final short producerEpoch,
-                       final List<TopicPartition> partitions) {
+        public Builder(AddPartitionsToTxnTransactionCollection transactions,
+                       boolean verifyOnly) {
             super(ApiKeys.ADD_PARTITIONS_TO_TXN);
+            this.isClientRequest = false;
 
+            this.data = new AddPartitionsToTxnRequestData()
+                    .setTransactions(transactions)
+                    .setVerifyOnly(verifyOnly);
+        }
+
+        private AddPartitionsToTxnTopicCollection compileTopics(final List<TopicPartition> partitions) {

Review Comment:
   nit: how about `buildTxnTopicCollection`?



##########
clients/src/main/java/org/apache/kafka/common/requests/AddPartitionsToTxnRequest.java:
##########
@@ -35,21 +44,43 @@ public class AddPartitionsToTxnRequest extends AbstractRequest {
     private final AddPartitionsToTxnRequestData data;
 
     private List<TopicPartition> cachedPartitions = null;
+    
+    private Map<String, List<TopicPartition>> cachedPartitionsByTransaction = null;
+
+    private final short version;
 
     public static class Builder extends AbstractRequest.Builder<AddPartitionsToTxnRequest> {
         public final AddPartitionsToTxnRequestData data;
+        public final boolean isClientRequest;
 
-        public Builder(final AddPartitionsToTxnRequestData data) {
+        // Only used for versions < 4
+        public Builder(String transactionalId,
+                       long producerId,
+                       short producerEpoch,
+                       List<TopicPartition> partitions) {
             super(ApiKeys.ADD_PARTITIONS_TO_TXN);
-            this.data = data;
+            this.isClientRequest = true;
+
+            AddPartitionsToTxnTopicCollection topics = compileTopics(partitions);
+
+            this.data = new AddPartitionsToTxnRequestData()
+                    .setTransactionalId(transactionalId)

Review Comment:
   Does it make sense to set `verifyOnly` to false explicitly here?



##########
clients/src/main/java/org/apache/kafka/common/requests/AddPartitionsToTxnRequest.java:
##########
@@ -66,24 +97,22 @@ public Builder(final String transactionalId,
             AddPartitionsToTxnTopicCollection topics = new AddPartitionsToTxnTopicCollection();
             for (Map.Entry<String, List<Integer>> partitionEntry : partitionMap.entrySet()) {
                 topics.add(new AddPartitionsToTxnTopic()
-                               .setName(partitionEntry.getKey())
-                               .setPartitions(partitionEntry.getValue()));
+                        .setName(partitionEntry.getKey())
+                        .setPartitions(partitionEntry.getValue()));
             }
-
-            this.data = new AddPartitionsToTxnRequestData()
-                            .setTransactionalId(transactionalId)
-                            .setProducerId(producerId)
-                            .setProducerEpoch(producerEpoch)
-                            .setTopics(topics);
+            return topics;
         }
 
         @Override
         public AddPartitionsToTxnRequest build(short version) {
-            return new AddPartitionsToTxnRequest(data, version);
+            short clampedVersion = (isClientRequest && version > 3) ? 3 : version;

Review Comment:
   It's a little strange to silently override the requested version. Another way to do this is to set `latestAllowedVersion` to 3 in the client builder, which ensures that the client does not try to use a higher version even if the broker supports it. Similarly, we can set a minimum version of 4 for the server.
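
A minimal sketch of the version-range idea, assuming simplified stand-ins rather than the real Kafka classes. `VersionRangeSketch`, `VersionedBuilder`, `clientBuilder`, `serverBuilder`, and `chooseVersion` are hypothetical names, and the negotiation shown only approximates how a client might intersect a builder's allowed range with the versions a broker advertises. The point is that the cap lives in the declared range, so `build(version)` never has to override the version it is handed.

```java
public class VersionRangeSketch {

    /** Hypothetical stand-in for a builder that declares the versions it may be built with. */
    public static final class VersionedBuilder {
        public final short oldestAllowedVersion;
        public final short latestAllowedVersion;

        public VersionedBuilder(short oldestAllowedVersion, short latestAllowedVersion) {
            this.oldestAllowedVersion = oldestAllowedVersion;
            this.latestAllowedVersion = latestAllowedVersion;
        }

        /**
         * Picks the highest version inside both the builder's range and the
         * range advertised by the broker, or throws if the ranges do not overlap.
         */
        public short chooseVersion(short brokerMin, short brokerMax) {
            short low = (short) Math.max(oldestAllowedVersion, brokerMin);
            short high = (short) Math.min(latestAllowedVersion, brokerMax);
            if (low > high) {
                throw new IllegalStateException("No version shared by builder and broker");
            }
            return high;
        }
    }

    /** Client-side requests: capped at version 3 (assumed range 0-3). */
    public static VersionedBuilder clientBuilder() {
        return new VersionedBuilder((short) 0, (short) 3);
    }

    /** Broker-side requests: version 4 is the minimum (assumed range 4-4). */
    public static VersionedBuilder serverBuilder() {
        return new VersionedBuilder((short) 4, (short) 4);
    }
}
```

With ranges like these, a client talking to a broker that supports versions 0-4 still settles on 3, and a broker-side request fails fast against a peer that only supports up to 3.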



##########
clients/src/main/java/org/apache/kafka/common/requests/AddPartitionsToTxnRequest.java:
##########
@@ -118,11 +193,41 @@ public AddPartitionsToTxnRequestData data() {
 
     @Override
     public AddPartitionsToTxnResponse getErrorResponse(int throttleTimeMs, Throwable e) {
-        final HashMap<TopicPartition, Errors> errors = new HashMap<>();
-        for (TopicPartition partition : partitions()) {
-            errors.put(partition, Errors.forException(e));
+        Errors error = Errors.forException(e);
+        if (version < 4) {
+            final HashMap<TopicPartition, Errors> errors = new HashMap<>();
+            for (TopicPartition partition : partitions()) {
+                errors.put(partition, error);
+            }
+            return new AddPartitionsToTxnResponse(throttleTimeMs, errors);
+        } else {
+            AddPartitionsToTxnResponseData response = new AddPartitionsToTxnResponseData();

Review Comment:
   The logic in here makes me wonder if we should add a top-level error code in the response.
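
To make the trade-off concrete, here is a small sketch contrasting the two shapes of error response. It uses plain maps and strings in place of the generated message classes, and the `Response` type with its `errorCode` field is purely hypothetical: the schema in this PR has no such field.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TopLevelErrorSketch {

    /** Without a top-level code: repeat the same error for every partition of every transaction. */
    public static Map<String, Map<String, String>> fanOut(Map<String, List<String>> partitionsByTxn,
                                                          String error) {
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : partitionsByTxn.entrySet()) {
            Map<String, String> perPartition = new LinkedHashMap<>();
            for (String partition : entry.getValue()) {
                perPartition.put(partition, error);
            }
            result.put(entry.getKey(), perPartition);
        }
        return result;
    }

    /** Hypothetical response shape with a request-wide error code next to the per-transaction results. */
    public static final class Response {
        public final String errorCode;
        public final Map<String, Map<String, String>> resultsByTransaction;

        public Response(String errorCode, Map<String, Map<String, String>> resultsByTransaction) {
            this.errorCode = errorCode;
            this.resultsByTransaction = resultsByTransaction;
        }
    }

    /** With a top-level code: one field carries the failure and the per-partition results stay empty. */
    public static Response topLevel(String error) {
        return new Response(error, new LinkedHashMap<>());
    }
}
```

The fan-out version has to walk every transaction and partition just to say the same thing many times, which is roughly the logic the `else` branch above needs.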



##########
core/src/main/scala/kafka/coordinator/transaction/TransactionCoordinator.scala:
##########
@@ -352,7 +353,12 @@ class TransactionCoordinator(txnConfig: TransactionConfig,
               // this is an optimization: if the partitions are already in the metadata reply OK immediately
               Left(Errors.NONE)
             } else {
-              Right(coordinatorEpoch, txnMetadata.prepareAddPartitions(partitions.toSet, time.milliseconds()))
+              // If verifyOnly, we should have returned in the step above. If we didn't the partitions are not present in the transaction.
+              if (verifyOnly) {
+                Left(Errors.INVALID_TXN_STATE)

Review Comment:
   Suppose that some of the partitions have been added to the transaction and some have not. We return `INVALID_TXN_STATE`, which means the broker cannot tell which partitions were added correctly. Does it matter? I think there are two main consequences:
   1. We have to reject the full `Produce` request, which may contain writes to multiple partitions. This means the producer also cannot tell which partition was not added correctly. Maybe this is fine since the producer must abort anyway?
   2. The `AddPartitionsToTxn` request cannot batch across multiple `Produce` requests from the same `transactionalId`. Or, if it does batch across requests, then we would have to return `INVALID_TXN_STATE` in all of the responses. Maybe this is also fine for the same reason?
   
   I do think we'll want to log, at a minimum, which partitions were not present in the transaction.
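
For the logging point, a minimal sketch of how the missing partitions could be computed before returning the error. It is written in Java with partitions modelled as plain strings such as `foo-0`; `VerifyOnlySketch`, `missingPartitions`, and `describe` are hypothetical helpers, not part of the coordinator.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class VerifyOnlySketch {

    /** Returns the requested partitions that are not yet part of the transaction, in request order. */
    public static Set<String> missingPartitions(List<String> requested, Set<String> alreadyInTxn) {
        Set<String> missing = new LinkedHashSet<>(requested);
        missing.removeAll(alreadyInTxn);
        return missing;
    }

    /** Builds the message a broker could log before answering with INVALID_TXN_STATE. */
    public static String describe(String transactionalId, Set<String> missing) {
        return "verifyOnly failed for transactionalId=" + transactionalId
                + ": partitions not in transaction: " + missing;
    }
}
```

An empty result from `missingPartitions` corresponds to the fast path that replies `Errors.NONE`; anything else is what would be worth writing to the log alongside the single error code.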



##########
clients/src/main/java/org/apache/kafka/common/requests/AddPartitionsToTxnRequest.java:
##########
@@ -35,21 +44,43 @@ public class AddPartitionsToTxnRequest extends AbstractRequest {
     private final AddPartitionsToTxnRequestData data;
 
     private List<TopicPartition> cachedPartitions = null;
+    
+    private Map<String, List<TopicPartition>> cachedPartitionsByTransaction = null;
+
+    private final short version;
 
     public static class Builder extends AbstractRequest.Builder<AddPartitionsToTxnRequest> {
         public final AddPartitionsToTxnRequestData data;
+        public final boolean isClientRequest;
 
-        public Builder(final AddPartitionsToTxnRequestData data) {
+        // Only used for versions < 4

Review Comment:
   Perhaps we could create factory methods: `Builder.forClient` and `Builder.forServer` (or something like that).
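
A minimal sketch of what such factory methods might look like, assuming simplified stand-in types: transactions are reduced to a list of transactional IDs, and everything other than the names `forClient` and `forServer` is illustrative. The private constructor means callers can no longer pick the wrong overload, and the client path sets `verifyOnly` to false explicitly.

```java
import java.util.Collections;
import java.util.List;

public class BuilderFactorySketch {

    public static final class Builder {
        public final boolean isClientRequest;
        public final boolean verifyOnly;
        public final List<String> transactionalIds;

        // Private: instances are only created through the named factories below.
        private Builder(boolean isClientRequest, boolean verifyOnly, List<String> transactionalIds) {
            this.isClientRequest = isClientRequest;
            this.verifyOnly = verifyOnly;
            this.transactionalIds = transactionalIds;
        }

        /** Client path: exactly one transaction, never verify-only. */
        public static Builder forClient(String transactionalId) {
            return new Builder(true, false, Collections.singletonList(transactionalId));
        }

        /** Broker path: a batch of transactions, optionally verify-only. */
        public static Builder forServer(List<String> transactionalIds, boolean verifyOnly) {
            return new Builder(false, verifyOnly, transactionalIds);
        }
    }
}
```

Call sites then read as `Builder.forClient("txn-1")` or `Builder.forServer(ids, true)`, which documents the intent without a "versions < 4" comment on a constructor.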



##########
clients/src/main/java/org/apache/kafka/common/requests/AddPartitionsToTxnRequest.java:
##########
@@ -101,15 +130,61 @@ public String toString() {
     public AddPartitionsToTxnRequest(final AddPartitionsToTxnRequestData data, short version) {
         super(ApiKeys.ADD_PARTITIONS_TO_TXN, version);
         this.data = data;
+        this.version = version;
     }
-
+    
+    // Only used for versions < 4
     public List<TopicPartition> partitions() {
         if (cachedPartitions != null) {
             return cachedPartitions;
         }
         cachedPartitions = Builder.getPartitions(data);
         return cachedPartitions;
     }
+    
+    private List<TopicPartition> partitionsForTransaction(String transaction) {

Review Comment:
   nit: could we use `transactionalId` as the argument name?



##########
clients/src/main/java/org/apache/kafka/common/requests/AddPartitionsToTxnResponse.java:
##########
@@ -99,6 +112,7 @@ public void maybeSetThrottleTimeMs(int throttleTimeMs) {
         data.setThrottleTimeMs(throttleTimeMs);
     }
 
+    // Only used for versions < 4

Review Comment:
   Do we need to continue exposing some of these version-specific methods? The intent is for the request/response objects to abstract away most of the version-specific details.



##########
clients/src/main/java/org/apache/kafka/common/requests/AddPartitionsToTxnResponse.java:
##########
@@ -49,28 +52,37 @@ public class AddPartitionsToTxnResponse extends AbstractResponse {
     private final AddPartitionsToTxnResponseData data;
 
     private Map<TopicPartition, Errors> cachedErrorsMap = null;
+    
+    private Map<String, Map<TopicPartition, Errors>> cachedAllErrorsMap = null;
 
     public AddPartitionsToTxnResponse(AddPartitionsToTxnResponseData data) {
         super(ApiKeys.ADD_PARTITIONS_TO_TXN);
         this.data = data;
     }
 
+    // Only used for versions < 4

Review Comment:
   I do wonder if we need both constructors. The old version is a subset of the new version where the number of producers happens to be 1.
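
A minimal sketch of the "subset" observation, assuming plain maps in place of the generated response data. `UnifiedResponseSketch` and its methods are hypothetical, and how a pre-v4 response would obtain its transactional ID key is left open here, so the caller supplies it.

```java
import java.util.Collections;
import java.util.Map;

public class UnifiedResponseSketch {

    // Single internal representation: errors per partition, grouped by transactional ID.
    private final Map<String, Map<String, String>> errorsByTransactionalId;

    /** The one general constructor: any number of transactions. */
    public UnifiedResponseSketch(Map<String, Map<String, String>> errorsByTransactionalId) {
        this.errorsByTransactionalId = errorsByTransactionalId;
    }

    /** The old single-transaction shape is just the batched shape with one entry. */
    public static UnifiedResponseSketch forSingleTransaction(String transactionalId,
                                                             Map<String, String> errors) {
        return new UnifiedResponseSketch(Collections.singletonMap(transactionalId, errors));
    }

    /** Errors for one transaction, or an empty map if the ID is unknown. */
    public Map<String, String> errorsFor(String transactionalId) {
        return errorsByTransactionalId.getOrDefault(transactionalId, Collections.emptyMap());
    }

    public int transactionCount() {
        return errorsByTransactionalId.size();
    }
}
```

Here the legacy path is a thin convenience wrapper rather than a second constructor with its own cached map.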



##########
clients/src/main/resources/common/message/AddPartitionsToTxnResponse.json:
##########
@@ -22,22 +22,35 @@
   // Version 2 adds the support for new error code PRODUCER_FENCED.
   //
   // Version 3 enables flexible versions.
-  "validVersions": "0-3",
+  //
+  // Version 4 adds support to batch multiple transactions.
+  "validVersions": "0-4",
   "flexibleVersions": "3+",
   "fields": [
     { "name": "ThrottleTimeMs", "type": "int32", "versions": "0+",
       "about": "Duration in milliseconds for which the request was throttled due to a quota violation, or zero if the request did not violate any quota." },
-    { "name": "Results", "type": "[]AddPartitionsToTxnTopicResult", "versions": "0+",
-      "about": "The results for each topic.", "fields": [
+    { "name": "ResultsByTransaction", "type": "[]AddPartitionsToTxnResult", "versions": "4+",
+      "about": "Results categorized by transactional ID.", "fields": [
+      { "name": "TransactionalId", "type": "string", "versions": "4+", "mapKey": true, "entityType": "transactionalId",
+        "about": "The transactional id corresponding to the transaction."},
+      { "name": "TopicResults", "type": "[]AddPartitionsToTxnTopicResult", "versions": "4+",
+        "about": "The results for each topic." }
+    ]},
+    { "name": "Results", "type": "[]AddPartitionsToTxnTopicResult", "versions": "0-3",

Review Comment:
   I wonder if it makes sense to use a different naming convention for the fields which are being removed. For example, instead of `Results` we could have `ResultsV3AndBelow`. It is not pretty, but it does make the usage in the code clearer.


