Posted to issues@kudu.apache.org by "Will Berkeley (JIRA)" <ji...@apache.org> on 2018/10/18 17:21:00 UTC

[jira] [Commented] (KUDU-2610) TestSimultaneousLeaderTransferAndAbruptStepdown is Flaky

    [ https://issues.apache.org/jira/browse/KUDU-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655619#comment-16655619 ] 

Will Berkeley commented on KUDU-2610:
-------------------------------------

Hm, that's funny: I recently doubled the timeout when this flakiness showed up in a precommit run, then ran the test many times through dist-test and saw no problems. I should probably reduce the frequency of leader changes in ASAN builds, or even disable the test.

> TestSimultaneousLeaderTransferAndAbruptStepdown is Flaky
> --------------------------------------------------------
>
>                 Key: KUDU-2610
>                 URL: https://issues.apache.org/jira/browse/KUDU-2610
>             Project: Kudu
>          Issue Type: Bug
>    Affects Versions: 1.8.0
>            Reporter: Hao Hao
>            Assignee: Will Berkeley
>            Priority: Major
>         Attachments: kudu-admin-test.5.txt
>
>
> AdminCliTest.TestSimultaneousLeaderTransferAndAbruptStepdown is sometimes flaky in ASAN builds, failing with the following error:
> {noformat}
> b01d528fd3c74eb5b42b8d4888591ed2 (127.18.62.194:38185) has failed: Timed out: Write RPC to 127.18.62.194:38185 timed out after 60.000s (SENT)
> W1017 23:33:47.772014 20038 batcher.cc:348] Timed out: Failed to write batch of 1 ops to tablet 9b4b2dea960941bcb38197b51c55baf4 after 1 attempt(s): Failed to write to server: b01d528fd3c74eb5b42b8d4888591ed2 (127.18.62.194:38185): Write RPC to 127.18.62.194:38185 timed out after 60.000s (SENT)
> F1017 23:33:47.772820 20042 test_workload.cc:202] Timed out: Failed to write batch of 1 ops to tablet 9b4b2dea960941bcb38197b51c55baf4 after 1 attempt(s): Failed to write to server: b01d528fd3c74eb5b42b8d4888591ed2 (127.18.62.194:38185): Write RPC to 127.18.62.194:38185 timed out after 60.000s (SENT)
> *** Check failure stack trace: ***
> *** Aborted at 1539844427 (unix time) try "date -d @1539844427" if you are using GNU date ***
> PC: @ 0x3c74632625 __GI_raise
> *** SIGABRT (@0x452000048fb) received by PID 18683 (TID 0x7f13ebe5b700) from PID 18683; stack trace: ***
>  @ 0x3c74a0f710 (unknown) at ??:0
>  @ 0x3c74632625 __GI_raise at ??:0
>  @ 0x3c74633e05 __GI_abort at ??:0
>  @ 0x7f13fd43da29 (unknown) at ??:0
>  @ 0x7f13fd43f31d (unknown) at ??:0
>  @ 0x7f13fd4411dd (unknown) at ??:0
>  @ 0x7f13fd43ee59 (unknown) at ??:0
>  @ 0x7f13fd441c7f (unknown) at ??:0
>  @ 0x7f1412f7ba6e (unknown) at ??:0
>  @ 0x3c796b6470 (unknown) at ??:0
>  @ 0x3c74a079d1 start_thread at ??:0
>  @ 0x3c746e88fd clone at ??:0
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)