You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/12/17 18:44:25 UTC

[GitHub] [pulsar] poorbarcode opened a new pull request, #18971: [fix] [tx] [branch-2.10] Transaction buffer recover blocked by readNext

poorbarcode opened a new pull request, #18971:
URL: https://github.com/apache/pulsar/pull/18971

   ### Motivation
   Since PR #18833 can not cherry-pick to `branch-2.10`, create a separate PR.
   
   
   #### Context for Transaction Buffer
   - If turn on `transactionCoordinatorEnabled`,  then `TransactionBuffer` will be initialized when a user topic create.
   - The `TransactionBuffer` reads the aborted logs of transactions from topic `__transaction_buffer_snapshot`  -- this process is called `recovery`.
   - During recovery, the reading from that snapshot ledger is done via a `Reader`; the reader works like this:
   ```
   while (reader.hasMessageAvailable()){
       reader.readNext();
   }
   ``` 
   
   #### Context for Compaction
   - After [pip-14](https://github.com/apache/pulsar/wiki/PIP-14:-Topic-compaction), the consumer that enabled feature read-compacted will read messages from the compacted topic instead of the original topic if the task-compaction is done, and read messages from the original topic if task-compaction is not done.
   - If the data of the last message with key k sent to a topic is null, the compactor will mark all messages for that key as deleted.
   
   #### Issue
   There is a race condition: after executing `reader.hasMessageAvailable`,  the following messages have been deleted by compaction-task, so read next will be blocked because there have no messages to read.
   
   ----
   
   ### Modifications
   
   - If hits this issue, do recover again.
   
   ----
   
   #### Why not just let the client try to load the topic again to retry the recover?
   If the topic load is failed, the client will receive an error response. This is a behavior that we can handle, so should not be perceived by the users.
   
   ### Documentation
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   - [ ] `doc` <!-- Your PR contains doc changes. Please attach the local preview screenshots (run `sh start.sh` at `pulsar/site2/website`) to your PR description, or else your PR might not get merged. -->
   - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
   - [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
   - [ ] `doc-complete` <!-- Docs have been already added -->
   
   ### Matching PR in forked repository
   
   PR in forked repository: 
   - https://github.com/poorbarcode/pulsar/pull/48
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] congbobo184 merged pull request #18971: [fix] [tx] [branch-2.10] Transaction buffer recover blocked by readNext

Posted by GitBox <gi...@apache.org>.
congbobo184 merged PR #18971:
URL: https://github.com/apache/pulsar/pull/18971


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] congbobo184 commented on a diff in pull request #18971: [fix] [tx] [branch-2.10] Transaction buffer recover blocked by readNext

Posted by GitBox <gi...@apache.org>.
congbobo184 commented on code in PR #18971:
URL: https://github.com/apache/pulsar/pull/18971#discussion_r1051548349


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/transaction/buffer/impl/TopicTransactionBuffer.java:
##########
@@ -632,6 +641,14 @@ public void run() {
                                     callBack.noNeedToRecover();
                                     return;
                                 }
+                            } catch (TimeoutException ex) {
+                                Throwable t = FutureUtil.unwrapCompletionException(ex);
+                                String errorMessage = String.format("[%s] Transaction buffer recover fail by read "
+                                        + "transactionBufferSnapshot timeout!", topic.getName());
+                                log.error(errorMessage, t);
+                                callBack.recoverExceptionally(
+                                        new BrokerServiceException.ServiceUnitNotReadyException(errorMessage, t));

Review Comment:
   should close the reader, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] poorbarcode commented on a diff in pull request #18971: [fix] [tx] [branch-2.10] Transaction buffer recover blocked by readNext

Posted by GitBox <gi...@apache.org>.
poorbarcode commented on code in PR #18971:
URL: https://github.com/apache/pulsar/pull/18971#discussion_r1051564590


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/transaction/buffer/impl/TopicTransactionBuffer.java:
##########
@@ -632,6 +641,14 @@ public void run() {
                                     callBack.noNeedToRecover();
                                     return;
                                 }
+                            } catch (TimeoutException ex) {
+                                Throwable t = FutureUtil.unwrapCompletionException(ex);
+                                String errorMessage = String.format("[%s] Transaction buffer recover fail by read "
+                                        + "transactionBufferSnapshot timeout!", topic.getName());
+                                log.error(errorMessage, t);
+                                callBack.recoverExceptionally(
+                                        new BrokerServiceException.ServiceUnitNotReadyException(errorMessage, t));

Review Comment:
   Yes, Already fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org