Posted to dev@kafka.apache.org by "Dong Lin (JIRA)" <ji...@apache.org> on 2014/08/01 01:49:38 UTC

[jira] [Created] (KAFKA-1565) Transaction manager failover handling

Dong Lin created KAFKA-1565:
-------------------------------

             Summary: Transaction manager failover handling
                 Key: KAFKA-1565
                 URL: https://issues.apache.org/jira/browse/KAFKA-1565
             Project: Kafka
          Issue Type: New Feature
            Reporter: Dong Lin


The transaction manager should guarantee that, once a pre-commit/pre-abort request is acknowledged, the corresponding commit/abort request will be delivered to every partition involved in the transaction.

In particular, we handle the following failover scenarios:

1) The transaction manager or its followers fail before the txRequest is replicated to the local log and the followers.
Solution: The transaction manager, if it is still alive, responds to the request with an error status. The producer keeps retrying the commit.

2) The txPartition’s leader is not available.
Solution: Put the txRequest on unSentTxRequestQueue. When metadataCache is updated, check unSentTxRequestQueue and re-send each txRequest whose leader is now available.

3) The txPartition’s leader fails while the txRequest is in the channel manager.
Solution: Retrieve all txRequests queued for transmission to this broker and put them on unSentTxRequestQueue.

4) The transaction manager does not receive a success response from the txPartition’s leader within the timeout period.
Solution: The transaction manager expires the txRequest and re-sends it.

5) Transaction manager fails.
Solution: The new transaction manager reads transactionHW from ZooKeeper and sends txRequests starting from the transactionHW.
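
Scenarios 2 through 4 can be pictured with a small, self-contained Java sketch. It is only an illustration under stated assumptions, not the proposed implementation: the class TxRequestRetrySketch, the TxRequest fields, the per-broker in-flight map standing in for the channel manager, and all method names are hypothetical. Only unSentTxRequestQueue and metadataCache are names taken from the description above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch of scenarios 2-4; not Kafka's actual code. */
class TxRequestRetrySketch {
    /** A pending commit/abort request for one txPartition (fields are assumptions). */
    static final class TxRequest {
        final String txPartition;
        long sentAtMs = -1; // -1 means "never handed to a broker"
        TxRequest(String txPartition) { this.txPartition = txPartition; }
    }

    // txPartition -> leader broker id; no entry when the leader is unavailable.
    private final Map<String, Integer> metadataCache = new HashMap<>();
    // Requests that cannot be sent right now (scenarios 2 and 3).
    private final Queue<TxRequest> unSentTxRequestQueue = new ArrayDeque<>();
    // Stand-in for the channel manager: requests handed off per broker.
    private final Map<Integer, List<TxRequest>> inFlightByBroker = new HashMap<>();
    private final long timeoutMs;

    TxRequestRetrySketch(long timeoutMs) { this.timeoutMs = timeoutMs; }

    /** Scenario 2: send if a leader is known, otherwise park the request. */
    void send(TxRequest request, long nowMs) {
        Integer leader = metadataCache.get(request.txPartition);
        if (leader == null) {
            unSentTxRequestQueue.add(request);
            return;
        }
        request.sentAtMs = nowMs;
        inFlightByBroker.computeIfAbsent(leader, id -> new ArrayList<>()).add(request);
    }

    /** Scenario 2: after a metadata update, retry every parked request once. */
    void onMetadataUpdate(Map<String, Integer> newLeaders, long nowMs) {
        metadataCache.clear();
        metadataCache.putAll(newLeaders);
        int parked = unSentTxRequestQueue.size();
        for (int i = 0; i < parked; i++) {
            // send() parks the request again if its leader is still unknown.
            send(unSentTxRequestQueue.poll(), nowMs);
        }
    }

    /** Scenario 3: move everything queued for a failed broker to the unsent queue. */
    void onBrokerFailure(int brokerId) {
        metadataCache.values().removeIf(leader -> leader == brokerId);
        List<TxRequest> stranded = inFlightByBroker.remove(brokerId);
        if (stranded != null) {
            unSentTxRequestQueue.addAll(stranded);
        }
    }

    /** Scenario 4: expire requests with no success response in time, then re-send. */
    void expireAndResend(long nowMs) {
        List<TxRequest> expired = new ArrayList<>();
        for (List<TxRequest> requests : inFlightByBroker.values()) {
            Iterator<TxRequest> it = requests.iterator();
            while (it.hasNext()) {
                TxRequest request = it.next();
                if (nowMs - request.sentAtMs >= timeoutMs) {
                    it.remove();
                    expired.add(request);
                }
            }
        }
        for (TxRequest request : expired) {
            send(request, nowMs);
        }
    }

    int unsentCount() { return unSentTxRequestQueue.size(); }

    int inFlightCount(int brokerId) {
        List<TxRequest> requests = inFlightByBroker.get(brokerId);
        return requests == null ? 0 : requests.size();
    }
}
```

The sketch passes the clock in as a parameter so that the timeout path can be exercised deterministically. A request is never dropped: it is always either in unSentTxRequestQueue or in flight to some broker, which mirrors the delivery guarantee stated at the top of this issue. Handling of success responses and of scenario 5 is left out.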





--
This message was sent by Atlassian JIRA
(v6.2#6252)