Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/07/09 14:57:50 UTC

[GitHub] [pulsar] kezhuw opened a new issue #7490: Entry duplicated in pulsar when bookie failed and zookeeper disconnected

kezhuw opened a new issue #7490:
URL: https://github.com/apache/pulsar/issues/7490


   **Describe the bug**
   When a new entry is being added, a bookie failure combined with a ZooKeeper disconnection can leave `LedgerInfo.getEntries` inconsistent with `LedgerMetadata.getLastEntryId`. This results in a duplicated entry.
   
   **To Reproduce**
   1. Add `entry-a` to `ledger1`.
   2. The entry is persisted to disk successfully, but the bookie fails to respond due to, say, a machine crash.
   3. ZooKeeper is disconnected, so the attempt to write the metadata and close the ledger fails.
   4. `ledgerClosed` is reported to `ManagedLedger`.
   5. ZooKeeper reconnects.
   6. Roll over to `ledger2`, where `entry-a` is added successfully.
   7. `ManagedLedger` does not count `entry-a` in `LedgerInfo.getEntries`, but after recovery `entry-a` does count toward `LedgerMetadata.getLastEntryId` and `LedgerHandle.getLastAddConfirmed`.
   8. `ManagedCursor` does not use `LedgerInfo.getEntries` to restrict its reading.
   9. `LedgerOffloader` does not use `LedgerInfo.getEntries` either.
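   
   The steps above can be modeled with a small, self-contained sketch. It does not use any Pulsar or BookKeeper API: `FakeLedger`, `FakeManagedLedger`, `addAcked`, `addWithLostAck`, and `readAll` are hypothetical names invented for illustration, and the model assumes that a reader bounded only by the ledger's last entry id sees every persisted entry.
   
   ```java
   import java.util.ArrayList;
   import java.util.Collections;
   import java.util.List;
   
   public class LedgerInconsistencyDemo {
   
       /** Hypothetical stand-in for a ledger: the entries actually persisted on bookies. */
       static final class FakeLedger {
           final List<String> persisted = new ArrayList<>();
   
           /** Last entry id as seen after recovery (models LedgerMetadata.getLastEntryId). */
           long lastEntryId() {
               return persisted.size() - 1;
           }
       }
   
       /** Hypothetical stand-in for the managed ledger's own bookkeeping. */
       static final class FakeManagedLedger {
           final FakeLedger ledger1 = new FakeLedger();
           final FakeLedger ledger2 = new FakeLedger();
           long ledger1Entries = 0; // models LedgerInfo.getEntries for ledger1
   
           /** Normal add: the entry is persisted and the acknowledgement arrives. */
           void addAcked(String payload) {
               ledger1.persisted.add(payload);
               ledger1Entries++;
           }
   
           /** Steps 2 to 6: persisted on ledger1, acknowledgement lost, retried on ledger2. */
           void addWithLostAck(String payload) {
               ledger1.persisted.add(payload); // step 2: on disk, but no response
               // steps 3-4: ledger1 is reported closed; ledger1Entries is NOT incremented
               ledger2.persisted.add(payload); // step 6: the same entry succeeds on ledger2
           }
   
           /** Steps 8 and 9: a reader that ignores ledger1Entries and reads up to the last entry id. */
           List<String> readAll() {
               List<String> out = new ArrayList<>(ledger1.persisted);
               out.addAll(ledger2.persisted);
               return out;
           }
       }
   
       /** Builds the scenario: ten acknowledged entries, then one entry whose acknowledgement is lost. */
       static FakeManagedLedger runScenario() {
           FakeManagedLedger ml = new FakeManagedLedger();
           for (int i = 0; i < 10; i++) {
               ml.addAcked("foo-" + i);
           }
           ml.addWithLostAck("foo-10");
           return ml;
       }
   
       public static void main(String[] args) {
           FakeManagedLedger ml = runScenario();
           System.out.println("entries counted for ledger1: " + ml.ledger1Entries);
           System.out.println("last entry id after recovery: " + ml.ledger1.lastEntryId());
           System.out.println("copies of foo-10 seen by reader: "
                   + Collections.frequency(ml.readAll(), "foo-10"));
       }
   }
   ```
   
   In this model the counted entries imply a last entry id of 9, while the recovered ledger reports 10, and the unbounded reader sees `foo-10` twice. A reader bounded by the counted entries would skip the extra copy in `ledger1`.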
   
   I have added a test case that reproduces this: https://github.com/kezhuw/pulsar/commit/bc0de5e9e110d931340ce6fd2a85911892630f33.
   
   Strictly speaking, I think `OpAddEntry.handleAddTimeoutFailure` has this issue too.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] kezhuw commented on issue #7490: Entry duplicated in pulsar when bookie failed and zookeeper disconnected

Posted by GitBox <gi...@apache.org>.
kezhuw commented on issue #7490:
URL: https://github.com/apache/pulsar/issues/7490#issuecomment-656180614


   I forgot to attach the test case's assertion message for https://github.com/kezhuw/pulsar/commit/bc0de5e9e110d931340ce6fd2a85911892630f33. Here it is:
   
   ```shell
   java.lang.AssertionError: The following asserts failed:
   	unexpected last entry id expected [9] but found [10],
   	unexpected last add confirmed expected [9] but found [10],
   	unexpected sequence id expected [12] but found [11],
   	unexpected message content expected [foo-12] but found [foo-11],
   	unexpected sequence id expected [13] but found [12],
   	unexpected message content expected [foo-13] but found [foo-12]
   
   
   	at org.testng.asserts.SoftAssert.assertAll(SoftAssert.java:43)
   	at org.apache.pulsar.client.api.ClientDeduplicationFailureTest.testClientDeduplicationWithBkFailureAndZkDisconnected(ClientDeduplicationFailureTest.java:593)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:124)
   	at org.testng.internal.Invoker.invokeMethod(Invoker.java:583)
   	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:719)
   	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:989)
   	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:125)
   	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
   	at org.testng.TestRunner.privateRun(TestRunner.java:648)
   	at org.testng.TestRunner.run(TestRunner.java:505)
   	at org.testng.SuiteRunner.runTest(SuiteRunner.java:455)
   	at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:450)
   	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:415)
   	at org.testng.SuiteRunner.run(SuiteRunner.java:364)
   	at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
   	at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:84)
   	at org.testng.TestNG.runSuitesSequentially(TestNG.java:1208)
   	at org.testng.TestNG.runSuitesLocally(TestNG.java:1137)
   	at org.testng.TestNG.runSuites(TestNG.java:1049)
   	at org.testng.TestNG.run(TestNG.java:1017)
   	at com.intellij.rt.testng.IDEARemoteTestNG.run(IDEARemoteTestNG.java:66)
   	at com.intellij.rt.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:110)
   
   ```

