You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@kudu.apache.org by al...@apache.org on 2020/06/24 04:22:53 UTC

[kudu] branch master updated (da1c66b -> cd37d7c)

This is an automated email from the ASF dual-hosted git repository.

alexey pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git.


    from da1c66b  [consensus] KUDU-2727 lock-free CheckLeadershipAndBindTerm()
     new fb53b1b  [docs] Add a section about using Gradle checkstyle
     new 2bd3d82  [autn_token_expire-itest] make one scenario more robust
     new cd37d7c  [mini-cluster] fix GetLeaderMasterIndex()

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 docs/contributing.adoc                             | 11 +++++++
 .../integration-tests/auth_token_expire-itest.cc   | 37 ++++++++++++++--------
 src/kudu/mini-cluster/external_mini_cluster.cc     | 12 +++++--
 3 files changed, 44 insertions(+), 16 deletions(-)


[kudu] 02/03: [autn_token_expire-itest] make one scenario more robust

Posted by al...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

alexey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit 2bd3d8281fc6260c441657d378fd4d023d526b6b
Author: Alexey Serbin <al...@apache.org>
AuthorDate: Tue Jun 23 11:43:24 2020 -0700

    [autn_token_expire-itest] make one scenario more robust
    
    This patch addresses a TOCTOU condition in ClientReacquiresAuthnToken
    scenario of the MultiMasterIdleConnectionsITest test and replaces ad-hoc
    master re-election requests with graceful leadership transfer.
    
    I haven't seen the TOCTOU condition manifesting itself yet, but since
    I already spent time on looking around to find the reason behind the
    test failure (the root cause has been found and is fixed by
    https://gerrit.cloudera.org/#/c/16106/), it doesn't hurt to make the
    test scenario a bit more robust.
    
    Change-Id: Id06be851c92958c2c134d9e06a7dc133283aeba2
    Reviewed-on: http://gerrit.cloudera.org:8080/16107
    Reviewed-by: Grant Henke <gr...@apache.org>
    Tested-by: Alexey Serbin <as...@cloudera.com>
---
 .../integration-tests/auth_token_expire-itest.cc   | 37 ++++++++++++++--------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/src/kudu/integration-tests/auth_token_expire-itest.cc b/src/kudu/integration-tests/auth_token_expire-itest.cc
index 6c0806c..a77fcfd 100644
--- a/src/kudu/integration-tests/auth_token_expire-itest.cc
+++ b/src/kudu/integration-tests/auth_token_expire-itest.cc
@@ -522,13 +522,12 @@ class MultiMasterIdleConnectionsITest : public AuthTokenExpireITestBase {
 // when the client tried to open the test table after master leader re-election:
 //   Timed out: GetTableSchema timed out after deadline expired
 TEST_F(MultiMasterIdleConnectionsITest, ClientReacquiresAuthnToken) {
-  const string kTableName = "keep-connection-to-former-master-leader";
-
   if (!AllowSlowTests()) {
     LOG(WARNING) << "test is skipped; set KUDU_ALLOW_SLOW_TESTS=1 to run";
     return;
   }
 
+  const string kTableName = "keep-connection-to-former-master-leader";
   const auto time_start = MonoTime::Now();
 
   shared_ptr<KuduClient> client;
@@ -560,6 +559,9 @@ TEST_F(MultiMasterIdleConnectionsITest, ClientReacquiresAuthnToken) {
     SleepFor(MonoDelta::FromMilliseconds(250));
   }
 
+  int former_leader_master_idx;
+  ASSERT_OK(cluster_->GetLeaderMasterIndex(&former_leader_master_idx));
+
   // Given the relation between the master_rpc_keepalive_time_ms_ and
   // authn_token_validity_seconds_ parameters, the original authn token should
   // expire and connections to follower masters should be torn down due to
@@ -567,26 +569,33 @@ TEST_F(MultiMasterIdleConnectionsITest, ClientReacquiresAuthnToken) {
   // after waiting for additional token expiration interval.
   SleepFor(MonoDelta::FromSeconds(authn_token_validity_seconds_));
 
-  {
-    int former_leader_master_idx;
-    ASSERT_OK(cluster_->GetLeaderMasterIndex(&former_leader_master_idx));
-    const int leader_idx = (former_leader_master_idx + 1) % num_masters_;
-    ASSERT_EVENTUALLY([&] {
+  ASSERT_EVENTUALLY([&] {
+    // The leadership could change behind the scenes, so if a new leader master
+    // is already around, another leadership change isn't needed.
+    int leader_idx;
+    ASSERT_OK(cluster_->GetLeaderMasterIndex(&leader_idx));
+    if (former_leader_master_idx == leader_idx) {
+      // Make a request to the current leader master to step down, transferring
+      // the leadership role to other master.
       consensus::ConsensusServiceProxy proxy(
           cluster_->messenger(), cluster_->master(leader_idx)->bound_rpc_addr(),
           cluster_->master(leader_idx)->bound_rpc_hostport().host());
-      consensus::RunLeaderElectionRequestPB req;
-      req.set_tablet_id(master::SysCatalogTable::kSysCatalogTabletId);
+      consensus::LeaderStepDownRequestPB req;
       req.set_dest_uuid(cluster_->master(leader_idx)->uuid());
+      req.set_tablet_id(master::SysCatalogTable::kSysCatalogTabletId);
+      req.set_mode(consensus::LeaderStepDownMode::GRACEFUL);
+      consensus::LeaderStepDownResponsePB resp;
       rpc::RpcController rpc;
-      rpc.set_timeout(MonoDelta::FromSeconds(1));
-      consensus::RunLeaderElectionResponsePB resp;
-      ASSERT_OK(proxy.RunLeaderElection(req, &resp, &rpc));
+      rpc.set_timeout(MonoDelta::FromSeconds(3));
+      ASSERT_OK(proxy.LeaderStepDown(req, &resp, &rpc));
+
+      // Make sure the leader has actually changed. If not, the step-down
+      // request is retried until ASSERT_EVENTUALLY() times out.
       int idx;
       ASSERT_OK(cluster_->GetLeaderMasterIndex(&idx));
       ASSERT_NE(former_leader_master_idx, idx);
-    });
-  }
+    }
+  });
 
   // Try to open the table after leader master re-election. The former leader
   // responds with NOT_THE_LEADER error even if the authn token has expired


[kudu] 03/03: [mini-cluster] fix GetLeaderMasterIndex()

Posted by al...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

alexey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit cd37d7cbcae4ff9112ee66123ead62b404f656ef
Author: Alexey Serbin <al...@apache.org>
AuthorDate: Tue Jun 23 15:51:20 2020 -0700

    [mini-cluster] fix GetLeaderMasterIndex()
    
    Prior to this patch, when running external mini-cluster in
    BindMode::UNIQUE_LOOPBACK mode, in rare cases masters might be bound
    to the same port number at different loopback IP addresses.  In such
    cases, GetLeaderMasterIndex() might return incorrect results.
    
    I saw a manifestation of such an issue with failure of the
    MultiMasterIdleConnectionsITest.ClientReacquiresAuthnToken scenario at
    http://dist-test.cloudera.org/job?job_id=jenkins-slave.1592866280.18987
    
    I think in that run the leader master index was detected as 0 (due to
    the issue this patch fixes), but in reality the index was 1, so all
    the attempts to start an election with current leader master didn't
    change the leadership role, where the leader master output messages
    like the following for every RunLeaderElectionRequest() call:
    
      I0622 22:58:08.909113 20844 raft_consensus.cc:461] T 00000000000000000000000000000000 P d31fb2325643457e8787dc92f8c3a4cf [term 1 LEADER]: Not starting forced leader election -- already a leader
      src/kudu/integration-tests/auth_token_expire-itest.cc:587: Failure
    
    Eventually, the test failed with the message:
      Expected: (former_leader_master_idx) != (idx), actual: 1 vs 1
    
    Change-Id: I3f445e587fe2152efbb67e4a36d36c760b486dac
    Reviewed-on: http://gerrit.cloudera.org:8080/16106
    Tested-by: Alexey Serbin <as...@cloudera.com>
    Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
 src/kudu/mini-cluster/external_mini_cluster.cc | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/kudu/mini-cluster/external_mini_cluster.cc b/src/kudu/mini-cluster/external_mini_cluster.cc
index c2b649a..daff355 100644
--- a/src/kudu/mini-cluster/external_mini_cluster.cc
+++ b/src/kudu/mini-cluster/external_mini_cluster.cc
@@ -757,6 +757,7 @@ Status ExternalMiniCluster::GetLeaderMasterIndex(int* idx) {
   Synchronizer sync;
   vector<pair<Sockaddr, string>> addrs_with_names;
   Sockaddr leader_master_addr;
+  string leader_master_hostname;
   MonoTime deadline = MonoTime::Now() + MonoDelta::FromSeconds(5);
 
   for (const scoped_refptr<ExternalMaster>& master : masters_) {
@@ -764,9 +765,10 @@ Status ExternalMiniCluster::GetLeaderMasterIndex(int* idx) {
   }
   const auto& cb = [&](const Status& status,
                        const pair<Sockaddr, string>& leader_master,
-                       const master::ConnectToMasterResponsePB& resp) {
+                       const master::ConnectToMasterResponsePB& /*resp*/) {
     if (status.ok()) {
       leader_master_addr = leader_master.first;
+      leader_master_hostname = leader_master.second;
     }
     sync.StatusCB(status);
   };
@@ -782,7 +784,13 @@ Status ExternalMiniCluster::GetLeaderMasterIndex(int* idx) {
   RETURN_NOT_OK(sync.Wait());
   bool found = false;
   for (int i = 0; i < masters_.size(); i++) {
-    if (masters_[i]->bound_rpc_hostport().port() == leader_master_addr.port()) {
+    const auto& bound_hp = masters_[i]->bound_rpc_hostport();
+    // If using BindMode::UNIQUE_LOOPBACK mode, in rare cases different masters
+    // might bind to different local IP addresses but use same port numbers.
+    // So, it's necessary to check both the returned hostnames and IP addresses
+    // to point to leader master.
+    if (bound_hp.port() == leader_master_addr.port() &&
+        bound_hp.host() == leader_master_hostname) {
       found = true;
       *idx = i;
       break;


[kudu] 01/03: [docs] Add a section about using Gradle checkstyle

Posted by al...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

alexey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit fb53b1b18d1f0950b31a668c05a313e271b6864c
Author: Greg Solovyev <gs...@cloudera.com>
AuthorDate: Tue Jun 23 14:22:00 2020 -0700

    [docs] Add a section about using Gradle checkstyle
    
    Coming from Ant/Maven world, I was not aware of Gradle
    checkStyle plugin and didn't use it before submitting
    a patch to Gerrit. Knowing about checkStyle would have
    saved one Gerrit iteration.
    
    To save time for future first time Java committers, I added
    a section about using Gradle checkstyle task to check
    Java code style.
    
    Pre-rendered on my fork: https://github.com/grishick/kudu/blob/master/docs/contributing.adoc#java-code-style
    
    Change-Id: Ica45e5fce418fe40fbdcd354bb0c98dc0a653a65
    Reviewed-on: http://gerrit.cloudera.org:8080/16105
    Tested-by: Kudu Jenkins
    Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
 docs/contributing.adoc | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/docs/contributing.adoc b/docs/contributing.adoc
index 4bb8339..09634ca 100644
--- a/docs/contributing.adoc
+++ b/docs/contributing.adoc
@@ -432,6 +432,17 @@ if (!isCriticalConditionSatisfied) {
 }
 ----
 
+===== Checking code style with Gradle checkStyle
+
+Before posting a Java patch to Gerrit for review, make sure to check Java code
+style with Gradle `checkstyle` plugin. See
+link:https://docs.gradle.org/current/userguide/checkstyle_plugin.html[Gradle Checkstyle Plugin documentation]
+for more information.
+[source,bash]
+----
+./gradlew checkstyle
+----
+
 ===== References
 * link:https://docs.oracle.com/javase/8/docs/technotes/guides/language/assert.html[Programming With Assertions]
 * link:https://github.com/google/guava/wiki/PreconditionsExplained[Guava Preconditions Explained]