Posted to dev@lucene.apache.org by "Amrit Sarkar (JIRA)" <ji...@apache.org> on 2018/08/27 09:31:00 UTC

[jira] [Issue Comment Deleted] (SOLR-12524) CdcrBidirectionalTest.testBiDir() regularly fails

     [ https://issues.apache.org/jira/browse/SOLR-12524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amrit Sarkar updated SOLR-12524:
--------------------------------
    Comment: was deleted

(was: Attached an updated patch; the explanation follows. Please correct me if I am wrong.

A single assertion in CdcrUpdateLog has been failing since SOLR-9922; the failure is strictly harmless. Because SOLR-9922 removed the {{TransactionLog:snapshot()}} and {{TransactionLog:rollback()}} functions, the CDCR buffered-update behavior changed slightly with respect to {{recoveryInfo.positionOfStart}}.
In the function {{CdcrUpdateLog:forwardSeek}}, tlogs whose entries have already been forwarded to the target are de-referenced. The assertion:
{code:java}
      assert this.tlogs.peekLast().id == subReader.tlogs.peekLast().id : this.tlogs.peekLast().id+" != "+subReader.tlogs.peekLast().id;
{code}
validates that we have purged all tlogs that we no longer want to keep (those already forwarded); subReader is the mainTlogReader itself. However, after SOLR-9922, tlogs are no longer buffered the way they were before when cores are in recovery (please correct me, as I don't understand every nuance of tlogs very well).

{{this.tlogs.peekLast().id}} can be greater than {{subReader.tlogs.peekLast().id}}, which means all useless tlogs are already purged and {{forwardSeek}} doesn't have to do anything. That is fine as long as no updates are missed.
If we change the assertion to {{this.tlogs.peekLast().id >= subReader.tlogs.peekLast().id}}, so that in this case the following while loop in {{forwardSeek}} is, correctly, not executed:
{code:java}
      while (this.tlogs.peekLast().id < subReader.tlogs.peekLast().id) {
        tlogs.removeLast();
        currentTlog = tlogs.peekLast();
      }
{code}
everything is fine and all tests pass.
I ran 300 rounds of beasting on {{CdcrBidirectionalTest}} with this assertion, and all of them passed.
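
To make the argument above concrete, here is a minimal, self-contained sketch of the two cases. It is not the actual CdcrUpdateLog code: the class name {{ForwardSeekSketch}} and the helpers {{purge}}, {{strictCheck}} and {{relaxedCheck}} are hypothetical, tlogs are modelled as bare ids, and the sketch assumes the deque is ordered newest first, so that {{peekLast()}} is the oldest tlog still retained.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class ForwardSeekSketch {

  // Same shape as the while loop quoted above: drop tlogs from the tail
  // until the reader's last id is no longer behind the sub-reader's.
  // Assumes both deques are non-empty. Returns how many ids were removed.
  static int purge(Deque<Long> readerTlogs, Deque<Long> subReaderTlogs) {
    int removed = 0;
    while (readerTlogs.peekLast() < subReaderTlogs.peekLast()) {
      readerTlogs.removeLast();
      removed++;
    }
    return removed;
  }

  // Condition of the original assertion: the last ids must be equal.
  static boolean strictCheck(Deque<Long> readerTlogs, Deque<Long> subReaderTlogs) {
    return readerTlogs.peekLast().longValue() == subReaderTlogs.peekLast().longValue();
  }

  // Condition of the proposed assertion: the reader may already be ahead.
  static boolean relaxedCheck(Deque<Long> readerTlogs, Deque<Long> subReaderTlogs) {
    return readerTlogs.peekLast() >= subReaderTlogs.peekLast();
  }

  static void report(String label, Deque<Long> reader, Deque<Long> subReader) {
    int removed = purge(reader, subReader);
    System.out.println(label + ": removed=" + removed
        + " strict=" + strictCheck(reader, subReader)
        + " relaxed=" + relaxedCheck(reader, subReader));
  }

  public static void main(String[] args) {
    // Case 1: the reader still holds tlogs 7 and 6, which the sub-reader
    // has already moved past; the loop removes them and both checks hold.
    report("behind",
        new ArrayDeque<>(Arrays.asList(9L, 8L, 7L, 6L)),
        new ArrayDeque<>(Arrays.asList(9L, 8L)));

    // Case 2: the reader's last id (9) is already greater than the
    // sub-reader's (8); the loop body never runs, the strict check fails,
    // and the relaxed check holds.
    report("ahead",
        new ArrayDeque<>(Arrays.asList(9L)),
        new ArrayDeque<>(Arrays.asList(9L, 8L)));
  }
}
```

In the second case nothing is removed, which matches the claim that {{forwardSeek}} has nothing left to do; only the equality check objects. Whether that state can ever hide a missed update is not something this sketch can show.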

I missed this bug in SOLR-9922 because I didn't expect the above scenario to happen; even though it does happen, it is legitimate and harmless.)

> CdcrBidirectionalTest.testBiDir() regularly fails
> -------------------------------------------------
>
>                 Key: SOLR-12524
>                 URL: https://issues.apache.org/jira/browse/SOLR-12524
>             Project: Solr
>          Issue Type: Test
>          Components: CDCR, Tests
>            Reporter: Christine Poerschke
>            Priority: Major
>         Attachments: SOLR-12524.patch, SOLR-12524.patch, SOLR-12524.patch, SOLR-12524.patch, SOLR-12524.patch, beast-test-run
>
>
> e.g. from https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/4701/consoleText
> {code}
> [junit4] ERROR   20.4s J0 | CdcrBidirectionalTest.testBiDir <<<
> [junit4]    > Throwable #1: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=28371, name=cdcr-replicator-11775-thread-1, state=RUNNABLE, group=TGRP-CdcrBidirectionalTest]
> [junit4]    > 	at __randomizedtesting.SeedInfo.seed([CA5584AC7009CD50:8F8E744E68278112]:0)
> [junit4]    > Caused by: java.lang.AssertionError
> [junit4]    > 	at __randomizedtesting.SeedInfo.seed([CA5584AC7009CD50]:0)
> [junit4]    > 	at org.apache.solr.update.CdcrUpdateLog$CdcrLogReader.forwardSeek(CdcrUpdateLog.java:611)
> [junit4]    > 	at org.apache.solr.handler.CdcrReplicator.run(CdcrReplicator.java:125)
> [junit4]    > 	at org.apache.solr.handler.CdcrReplicatorScheduler.lambda$null$0(CdcrReplicatorScheduler.java:81)
> [junit4]    > 	at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
> [junit4]    > 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [junit4]    > 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [junit4]    > 	at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
