You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by xy...@apache.org on 2022/05/01 15:39:00 UTC

[pulsar] branch master updated: [Java Client] Fix flaky BatchMessageTest by initializing lastBatchSendNanoTime (#15406)

This is an automated email from the ASF dual-hosted git repository.

xyz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/pulsar.git


The following commit(s) were added to refs/heads/master by this push:
     new 72bbb97528d [Java Client] Fix flaky BatchMessageTest by initializing lastBatchSendNanoTime (#15406)
72bbb97528d is described below

commit 72bbb97528d9c07fa7bbee8073358bfc3b22b8ca
Author: Yunze Xu <xy...@163.com>
AuthorDate: Sun May 1 23:38:54 2022 +0800

    [Java Client] Fix flaky BatchMessageTest by initializing lastBatchSendNanoTime (#15406)
    
    ### Motivation
    
    `BatchMessageTest` is very flaky, some tests failed at checking the
    backlog of a topic. This flakiness, maybe the bug, was introduced from
    https://github.com/apache/pulsar/pull/14185. `batchFlushTask` calculates
    the time duration between now and `lastBatchSendNanoTime`. However,
    for the first time `batchFlushTask` is called, `lastBatchSendNanoTime`
    is not initialized, so the time diff duration satified so that
    `batchMessageAndSend` will always be called for the first time.
    
    For example, if all messages can be split into two batches for a given
    batch size limit, there is a chance that when first few messages are in
    the batch, `batchFlushTask` is called at the same time and these
    messages could be sent first.
    
    Here are some extra logs added by myself:
    
    ```
    2022-05-01T12:35:23,545+0800 [main] INFO  org.apache.pulsar.client.impl.BatchMessageKeyBasedContainer - XYZ add 0 | 1 (0) 12 (600) | false
    2022-05-01T12:35:23,545+0800 [pulsar-client-io-35-1] INFO  org.apache.pulsar.client.impl.ProducerImpl - XYZ batchFlushTask 52779290583 us (5000000)
    2022-05-01T12:35:23,545+0800 [pulsar-client-io-35-1] INFO  org.apache.pulsar.client.impl.BatchMessageKeyBasedContainer - XYZ before create: [0]
    2022-05-01T12:35:23,546+0800 [pulsar-client-io-35-1] INFO  org.apache.pulsar.client.impl.BatchMessageKeyBasedContainer - XYZ create 1 Ops: 1 (0) 12 (600)
    ```
    
    We can see `batchFlushTask` was called while the batch contains only 1
    message. The time duration is nearly 15 hours, which is appearly wrong.
    
    ### Modifications
    
    Initialize `lastBatchSendNanoTime` with the current time point
    immediately after the batch container is constructed.
---
 .../src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java      | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java b/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java
index 7c3e7d1b0c1..80a003ebd81 100644
--- a/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java
+++ b/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java
@@ -255,6 +255,7 @@ public class ProducerImpl<T> extends ProducerBase<T> implements TimerTask, Conne
             }
             this.batchMessageContainer = (BatchMessageContainerBase) containerBuilder.build();
             this.batchMessageContainer.setProducer(this);
+            this.lastBatchSendNanoTime = System.nanoTime();
         } else {
             this.batchMessageContainer = null;
         }
@@ -2303,4 +2304,4 @@ public class ProducerImpl<T> extends ProducerBase<T> implements TimerTask, Conne
     }
 
     private static final Logger log = LoggerFactory.getLogger(ProducerImpl.class);
-}
\ No newline at end of file
+}