You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2017/09/06 04:14:57 UTC

[kudu-CR] [tests] fixed flake in delete table-itest

Alexey Serbin has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7972

Change subject: [tests] fixed flake in delete_table-itest
......................................................................

[tests] fixed flake in delete_table-itest

Fixed flake in DeleteTableITest.TestNoDeleteTombstonedTablets: if leader
election happened in the middle of the AddServer/RemoveServer sequence,
the test failed with error messages like the following:

  src/kudu/integration-tests/delete_table-itest.cc:1217: Failure
  Failed
  Bad status: Illegal state: Replica 37783b00d5d34ffe87953cb90fa60e7c \
    is not leader of this config. Role: FOLLOWER.

If running tests against a DEBUG build, it were about 3 failures
per 1K run:
  http://dist-test.cloudera.org/job?job_id=aserbin.1504655654.10466

After the fix, there were no failures in multiple 1K runs:
  http://dist-test.cloudera.org/job?job_id=aserbin.1504670739.3127

Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
---
M src/kudu/integration-tests/delete_table-itest.cc
1 file changed, 104 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/7972/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7972
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>

[kudu-CR] [tests] fix flakes in delete table-itest

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: [tests] fix flakes in delete_table-itest
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7972/3/src/kudu/integration-tests/delete_table-itest.cc
File src/kudu/integration-tests/delete_table-itest.cc:

PS3, Line 1216: 
              : 
              : 
> Instead of adding the extra functions, could we do this in less by just wra
I think that's a good idea.  I thought about using ASSERT_EVENTUALLY(), but I was concerned with not spotting some other transitional issues (like if AddServer/RemoveServer returning something else but IllegalState()).  But after you addressed my concerns for the similar situations in tablet_copy-itest, I think using ASSERT_EVENTUALLY() is good enough.

Let me adopt this.


-- 
To view, visit http://gerrit.cloudera.org:8080/7972
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

[kudu-CR] [tests] fix flakes in delete table-itest

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: [tests] fix flakes in delete_table-itest
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7972/3/src/kudu/integration-tests/delete_table-itest.cc
File src/kudu/integration-tests/delete_table-itest.cc:

PS3, Line 1216: 
              : 
              : 
Instead of adding the extra functions, could we do this in less by just wrapping the logic of finding the leader and changing the config in an ASSERT_EVENTUALLY()? i.e.:

  // Loop in case the leader changes between adding and removing a replica.
  for (int i : {0, 1}) {
    ASSERT_EVENTUALLY([&] {
      // Find leader.
      TServerDetails* leader = nullptr;
      ASSERT_OK(FindTabletLeader(ts_map_, tablet_id, kTimeout, &leader));
      ASSERT_OK(WaitForOpFromCurrentTerm(leader, tablet_id, COMMITTED_OPID, kTimeout));

      // Do the config changes and wait for the master to delete the old node.
      if (i == 0) {
        ASSERT_OK(AddServer(leader, tablet_id, to_add, RaftPeerPB::VOTER, boost::none, kTimeout));
      } else {
        ASSERT_OK(RemoveServer(leader, tablet_id, to_remove, boost::none, kTimeout));
      }
    });
  }


-- 
To view, visit http://gerrit.cloudera.org:8080/7972
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

[kudu-CR] [tests] fix flakes in delete table-itest

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7972

to look at the new patch set (#3).

Change subject: [tests] fix flakes in delete_table-itest
......................................................................

[tests] fix flakes in delete_table-itest

Fixed flakes in DeleteTableITest.TestNoDeleteTombstonedTablets:

* If a leader election happened in the middle of the
  AddServer/RemoveServer sequence, the test failed with an error
  message like the following:

    src/kudu/integration-tests/delete_table-itest.cc:1217: Failure
    Failed
    Bad status: Illegal state: Replica 37783b00d5d34ffe87953cb90fa60e7c\
      is not leader of this config. Role: FOLLOWER.

* If more than a couple of heartbeats from the evicted node were
  received, the test failed with an error message like the following:

    src/kudu/integration-tests/delete_table-itest.cc:1241: Failure
    Expected: (num_delete_attempts) <= (kMaxDeleteAttemptsPerEviction),\
      actual: 5 vs 3
    src/kudu/util/test_util.cc:283: Failure
    Failed
    Timed out waiting for assertion to pass

Prior to the fix, running tests against a DEBUG build,
there were about 3 failures per 1K run:
  http://dist-test.cloudera.org/job?job_id=aserbin.1504655654.10466

After the fix, there were no failures in multiple 1K runs:
  http://dist-test.cloudera.org/job?job_id=aserbin.1504670739.3127

Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
---
M src/kudu/integration-tests/delete_table-itest.cc
1 file changed, 108 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/7972/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7972
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot

[kudu-CR] [tests] fix flakes in delete table-itest

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: [tests] fix flakes in delete_table-itest
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7972/3/src/kudu/integration-tests/delete_table-itest.cc
File src/kudu/integration-tests/delete_table-itest.cc:

PS3, Line 1216: 
              : 
              : 
> Instead of adding the extra functions, could we do this in less by just wra
And another catch is: the server to remove should be different from the server just added, and the logic to find appropriate server is different for different operations.  So, the result code would not be that simple at all. :)

But anyway, I'm going to use ASSERT_EVENTUALLY() to have blanket-handling for any type of errors appear during the process.


-- 
To view, visit http://gerrit.cloudera.org:8080/7972
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

[kudu-CR] [tests] fixed flake in delete table-itest

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has uploaded a new patch set (#2).

Change subject: [tests] fixed flake in delete_table-itest
......................................................................

[tests] fixed flake in delete_table-itest

Fixed flake in DeleteTableITest.TestNoDeleteTombstonedTablets: if leader
election happened in the middle of the AddServer/RemoveServer sequence,
the test failed with error messages like the following:

  src/kudu/integration-tests/delete_table-itest.cc:1217: Failure
  Failed
  Bad status: Illegal state: Replica 37783b00d5d34ffe87953cb90fa60e7c \
    is not leader of this config. Role: FOLLOWER.

If running tests against a DEBUG build, it were about 3 failures
per 1K run:
  http://dist-test.cloudera.org/job?job_id=aserbin.1504655654.10466

After the fix, there were no failures in multiple 1K runs:
  http://dist-test.cloudera.org/job?job_id=aserbin.1504670739.3127

Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
---
M src/kudu/integration-tests/delete_table-itest.cc
1 file changed, 106 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/7972/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7972
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot

[kudu-CR] [tests] fix flakes in delete table-itest

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/7972 )

Change subject: [tests] fix flakes in delete_table-itest
......................................................................


Patch Set 3:

Maybe we should just merge this patch the way it is to reduce the flakiness issues... thoughts?


-- 
To view, visit http://gerrit.cloudera.org:8080/7972
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
Gerrit-Change-Number: 7972
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Mon, 02 Oct 2017 21:53:32 +0000
Gerrit-HasComments: No

[kudu-CR] [tests] fix flakes in delete table-itest

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/7972 )

Change subject: [tests] fix flakes in delete_table-itest
......................................................................


Patch Set 3:

> Patch Set 3:
> 
> (1 comment)

+1 to ASSERT_EVENTUALLY


-- 
To view, visit http://gerrit.cloudera.org:8080/7972
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7dda40ae1054becaf25963a6d301ecaed5a926f9
Gerrit-Change-Number: 7972
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Mon, 27 Nov 2017 19:32:43 +0000
Gerrit-HasComments: No