Posted to commits@kudu.apache.org by ad...@apache.org on 2017/09/19 18:41:28 UTC

kudu git commit: Link to troubleshooting docs for two common problems

Repository: kudu
Updated Branches:
  refs/heads/master 5c8a8a2ab -> 501112b5d


Link to troubleshooting docs for two common problems

From experience, users don't know what to do when they see timeouts or
logs with "Soft memory limit exceeded" messages. Let's add a link to
some new troubleshooting docs, to give users an easy way to find solutions.

Likewise, when a disk failure or unexpected change to Kudu's data dirs
happens, users often don't know what to do when seeing the
"FSManager root is not empty" message, so let's add a link to some new
troubleshooting docs for that, too.

The links go to the proper version of the docs, so a 1.6 server will
provide a link to the 1.6 docs. I added a couple of helper functions to
make these links, and put them in a new file since they didn't seem to
fit into any existing place.
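
As a rough illustration of the versioned-link idea, here is a minimal
standalone sketch. It assumes the docs live under
<site>/releases/<version>/docs, as in the patch below. The function names
DocsUrlFor and TroubleshootingUrlFor are hypothetical and take the version as
a parameter so the sketch can be compiled without the Kudu tree; the real
helpers take no arguments and read the version from VersionInfo.

```cpp
#include <string>

// Hypothetical stand-in for KuduDocsUrl(): builds the base docs URL for a
// given short version string such as "1.2.3" or "1.9.3-SNAPSHOT".
std::string DocsUrlFor(const std::string& short_version) {
  return "https://kudu.apache.org/releases/" + short_version + "/docs";
}

// Hypothetical stand-in for KuduDocsTroubleshootingUrl(): appends the
// troubleshooting page to the versioned docs base URL.
std::string TroubleshootingUrlFor(const std::string& short_version) {
  return DocsUrlFor(short_version) + "/troubleshooting.html";
}
```

Calling TroubleshootingUrlFor("1.2.3") yields a link under releases/1.2.3, so
each server build points at the docs for its own version. As the header
comment in the patch notes, such a link only resolves for released versions.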

Change-Id: Ida7e1495e4ba68f9e9b4d424650d84c0019c9b0f
Reviewed-on: http://gerrit.cloudera.org:8080/8093
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <da...@gmail.com>
Reviewed-by: Adar Dembo <ad...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/501112b5
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/501112b5
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/501112b5

Branch: refs/heads/master
Commit: 501112b5d779464210326b219ad1929a74ae2aed
Parents: 5c8a8a2
Author: Will Berkeley <wd...@apache.org>
Authored: Wed Sep 13 16:15:19 2017 -0700
Committer: Adar Dembo <ad...@cloudera.com>
Committed: Tue Sep 19 18:41:01 2017 +0000

----------------------------------------------------------------------
 docs/troubleshooting.adoc                       | 40 ++++++++++++++++++
 src/kudu/client/client.cc                       |  2 +-
 src/kudu/fs/fs_manager.cc                       |  5 ++-
 src/kudu/integration-tests/registration-test.cc |  6 +--
 src/kudu/master/master-test.cc                  |  4 +-
 src/kudu/master/master.cc                       |  2 +-
 src/kudu/server/server_base.cc                  |  2 +-
 src/kudu/tserver/heartbeater.cc                 |  2 +-
 src/kudu/tserver/tablet_service.cc              |  5 ++-
 src/kudu/util/CMakeLists.txt                    |  1 +
 src/kudu/util/version_info.cc                   |  6 ++-
 src/kudu/util/version_info.h                    |  7 +++-
 src/kudu/util/website_util.cc                   | 43 ++++++++++++++++++++
 src/kudu/util/website_util.h                    | 35 ++++++++++++++++
 14 files changed, 145 insertions(+), 15 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/docs/troubleshooting.adoc
----------------------------------------------------------------------
diff --git a/docs/troubleshooting.adoc b/docs/troubleshooting.adoc
index 554303b..c43aa19 100644
--- a/docs/troubleshooting.adoc
+++ b/docs/troubleshooting.adoc
@@ -289,6 +289,46 @@ User stack:
 These traces can be useful for diagnosing root-cause latency issues when they are caused by systems
 below Kudu, such as disk controllers or file systems.
 
+[[memory_limits]]
+=== Memory Limits
+
+Kudu has a hard and a soft memory limit. The hard memory limit is the maximum amount of memory a
+Kudu process is allowed to use, and is controlled by the `--memory_limit_hard_bytes` flag. The soft
+memory limit is a percentage of the hard memory limit, controlled by the
+`--memory_limit_soft_percentage` flag with a default value of 80%. It determines the amount of
+memory a process may use before it starts rejecting some write operations.
+
+If the logs or RPC traces contain messages like
+
+----
+Service unavailable: Soft memory limit exceeded (at 96.35% of capacity)
+----
+
+then Kudu is rejecting writes due to memory backpressure. This may result in write timeouts. There
+are several ways to relieve the memory pressure on Kudu:
+
+- If the host has more memory available for Kudu, increase `--memory_limit_hard_bytes`.
+- Increase the rate at which Kudu can flush writes from memory to disk by increasing the number of
+  disks or increasing the number of maintenance manager threads `--maintenance_manager_num_threads`.
+  Generally, the recommended ratio of maintenance manager threads to data directories is 1:3.
+- Reduce the volume of writes flowing to Kudu on the application side.
+
+[[disk_issues]]
+=== Disk Issues
+
+When Kudu starts, it checks each configured data directory, expecting either for all to be
+initialized or for all to be empty. If a server fails to start with a log message like
+
+----
+Check failed: _s.ok() Bad status: Already present: Could not create new FS layout: FSManager root is not empty: /data0/kudu/data
+----
+
+then this precondition has failed. This could be because Kudu was configured with non-empty data
+directories on first startup, or because a previously-running, healthy Kudu process was restarted
+and at least one data directory was deleted or is somehow corrupted, perhaps because of a disk
+error. If in the latter situation, consult the
+link:administration.html#change_dir_config[Changing Directory Configurations] documentation.
+
 == Issues using Kudu
 
 [[hive_handler]]

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/client/client.cc
----------------------------------------------------------------------
diff --git a/src/kudu/client/client.cc b/src/kudu/client/client.cc
index 5cec57c..8f76581 100644
--- a/src/kudu/client/client.cc
+++ b/src/kudu/client/client.cc
@@ -238,7 +238,7 @@ Status DisableOpenSSLInitialization() {
 }
 
 string GetShortVersionString() {
-  return VersionInfo::GetShortVersionString();
+  return VersionInfo::GetVersionInfo();
 }
 
 string GetAllVersionInfo() {

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/fs/fs_manager.cc
----------------------------------------------------------------------
diff --git a/src/kudu/fs/fs_manager.cc b/src/kudu/fs/fs_manager.cc
index 90494de..39a386a 100644
--- a/src/kudu/fs/fs_manager.cc
+++ b/src/kudu/fs/fs_manager.cc
@@ -59,6 +59,7 @@
 #include "kudu/util/path_util.h"
 #include "kudu/util/pb_util.h"
 #include "kudu/util/stopwatch.h"
+#include "kudu/util/website_util.h"
 
 DEFINE_bool(enable_data_block_fsync, true,
             "Whether to enable fsync() of data blocks, metadata, and their parent directories. "
@@ -370,7 +371,9 @@ Status FsManager::CreateInitialFileSystemLayout(boost::optional<string> uuid) {
     RETURN_NOT_OK_PREPEND(IsDirectoryEmpty(root.path, &is_empty),
                           "Unable to check if FSManager root is empty");
     if (!is_empty) {
-      return Status::AlreadyPresent("FSManager root is not empty", root.path);
+      return Status::AlreadyPresent(
+          Substitute("FSManager root is not empty. See $0", KuduDocsTroubleshootingUrl()),
+          root.path);
     }
   }
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/integration-tests/registration-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/integration-tests/registration-test.cc b/src/kudu/integration-tests/registration-test.cc
index 40945ef..6fc8902 100644
--- a/src/kudu/integration-tests/registration-test.cc
+++ b/src/kudu/integration-tests/registration-test.cc
@@ -167,8 +167,8 @@ class RegistrationTest : public KuduTest {
     ASSERT_STR_CONTAINS(buf_str, expected_uuid);
 
     // Should check that the TS software version is included on the page.
-    // tserver version should be the same as returned by GetShortVersionString()
-    ASSERT_STR_CONTAINS(buf_str, VersionInfo::GetShortVersionString());
+    // tserver version should be the same as returned by GetVersionInfo()
+    ASSERT_STR_CONTAINS(buf_str, VersionInfo::GetVersionInfo());
     if (contents != nullptr) {
       *contents = std::move(buf_str);
     }
@@ -248,7 +248,7 @@ TEST_F(RegistrationTest, TestMasterSoftwareVersion) {
     SCOPED_TRACE(SecureShortDebugString(reg));
     ASSERT_TRUE(reg.has_software_version());
     ASSERT_STR_CONTAINS(reg.software_version(),
-                        VersionInfo::GetShortVersionString());
+                        VersionInfo::GetVersionInfo());
   }
 }
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/master/master-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/master/master-test.cc b/src/kudu/master/master-test.cc
index 7d35e00..499e9c2 100644
--- a/src/kudu/master/master-test.cc
+++ b/src/kudu/master/master-test.cc
@@ -189,7 +189,7 @@ TEST_F(MasterTest, TestRegisterAndHeartbeat) {
   ServerRegistrationPB fake_reg;
   MakeHostPortPB("localhost", 1000, fake_reg.add_rpc_addresses());
   MakeHostPortPB("localhost", 2000, fake_reg.add_http_addresses());
-  fake_reg.set_software_version(VersionInfo::GetShortVersionString());
+  fake_reg.set_software_version(VersionInfo::GetVersionInfo());
 
   {
     TSHeartbeatRequestPB req;
@@ -339,7 +339,7 @@ TEST_F(MasterTest, TestRegisterAndHeartbeat) {
     ASSERT_STREQ("my-ts-uuid", tablet_server["uuid"].GetString());
     ASSERT_TRUE(tablet_server["millis_since_heartbeat"].GetInt64() >= 0);
     ASSERT_EQ(true, tablet_server["live"].GetBool());
-    ASSERT_STREQ(VersionInfo::GetShortVersionString().c_str(),
+    ASSERT_STREQ(VersionInfo::GetVersionInfo().c_str(),
         tablet_server["version"].GetString());
   }
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/master/master.cc
----------------------------------------------------------------------
diff --git a/src/kudu/master/master.cc b/src/kudu/master/master.cc
index 1a06973..14f8d28 100644
--- a/src/kudu/master/master.cc
+++ b/src/kudu/master/master.cc
@@ -258,7 +258,7 @@ Status Master::InitMasterRegistration() {
     RETURN_NOT_OK(AddHostPortPBs(http_addrs, reg.mutable_http_addresses()));
     reg.set_https_enabled(web_server()->IsSecure());
   }
-  reg.set_software_version(VersionInfo::GetShortVersionString());
+  reg.set_software_version(VersionInfo::GetVersionInfo());
 
   registration_.Swap(&reg);
   registration_initialized_.store(true);

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/server/server_base.cc
----------------------------------------------------------------------
diff --git a/src/kudu/server/server_base.cc b/src/kudu/server/server_base.cc
index c3f6bb0..7cd7612 100644
--- a/src/kudu/server/server_base.cc
+++ b/src/kudu/server/server_base.cc
@@ -455,7 +455,7 @@ void ServerBase::ExcessLogFileDeleterThread() {
 
 std::string ServerBase::FooterHtml() const {
   return Substitute("<pre>$0\nserver uuid $1</pre>",
-                    VersionInfo::GetShortVersionString(),
+                    VersionInfo::GetVersionInfo(),
                     instance_pb_->permanent_uuid());
 }
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/tserver/heartbeater.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tserver/heartbeater.cc b/src/kudu/tserver/heartbeater.cc
index fe548fe..7ecfd90 100644
--- a/src/kudu/tserver/heartbeater.cc
+++ b/src/kudu/tserver/heartbeater.cc
@@ -339,7 +339,7 @@ Status Heartbeater::Thread::SetupRegistration(ServerRegistrationPB* reg) {
                           "Failed to add HTTP addresses to registration");
     reg->set_https_enabled(server_->web_server()->IsSecure());
   }
-  reg->set_software_version(VersionInfo::GetShortVersionString());
+  reg->set_software_version(VersionInfo::GetVersionInfo());
 
   return Status::OK();
 }

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/tserver/tablet_service.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tserver/tablet_service.cc b/src/kudu/tserver/tablet_service.cc
index 1267570..be86cb3 100644
--- a/src/kudu/tserver/tablet_service.cc
+++ b/src/kudu/tserver/tablet_service.cc
@@ -97,6 +97,7 @@
 #include "kudu/util/status_callback.h"
 #include "kudu/util/trace.h"
 #include "kudu/util/trace_metrics.h"
+#include "kudu/util/website_util.h"
 
 DEFINE_int32(scanner_default_batch_size_bytes, 1024 * 1024,
              "The default size for batches of scan results");
@@ -861,8 +862,8 @@ void TabletServiceImpl::Write(const WriteRequestPB* req,
   if (process_memory::SoftLimitExceeded(&capacity_pct)) {
     tablet->metrics()->leader_memory_pressure_rejections->Increment();
     string msg = StringPrintf(
-        "Soft memory limit exceeded (at %.2f%% of capacity)",
-        capacity_pct);
+        "Soft memory limit exceeded (at %.2f%% of capacity). See %s",
+        capacity_pct, KuduDocsTroubleshootingUrl().c_str());
     if (capacity_pct >= FLAGS_memory_limit_warn_threshold_percentage) {
       KLOG_EVERY_N_SECS(WARNING, 1) << "Rejecting Write request: " << msg << THROTTLE_MSG;
     } else {

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/util/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/src/kudu/util/CMakeLists.txt b/src/kudu/util/CMakeLists.txt
index 077f156..8f58a26 100644
--- a/src/kudu/util/CMakeLists.txt
+++ b/src/kudu/util/CMakeLists.txt
@@ -196,6 +196,7 @@ set(UTIL_SRCS
   user.cc
   url-coding.cc
   version_info.cc
+  website_util.cc
   zlib.cc
 )
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/util/version_info.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/version_info.cc b/src/kudu/util/version_info.cc
index 3d63126..1dfcdec 100644
--- a/src/kudu/util/version_info.cc
+++ b/src/kudu/util/version_info.cc
@@ -36,7 +36,11 @@ string VersionInfo::GetGitHash() {
   return ret;
 }
 
-string VersionInfo::GetShortVersionString() {
+string VersionInfo::GetShortVersionInfo() {
+  return KUDU_VERSION_STRING;
+}
+
+string VersionInfo::GetVersionInfo() {
   return strings::Substitute("kudu $0 (rev $1)",
                              KUDU_VERSION_STRING,
                              GetGitHash());

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/util/version_info.h
----------------------------------------------------------------------
diff --git a/src/kudu/util/version_info.h b/src/kudu/util/version_info.h
index 5bda97e..e19830d 100644
--- a/src/kudu/util/version_info.h
+++ b/src/kudu/util/version_info.h
@@ -28,8 +28,11 @@ class VersionInfoPB;
 // Static functions related to fetching information about the current build.
 class VersionInfo {
  public:
-  // Get a short version string ("kudu 1.2.3 (rev abcdef...)")
-  static std::string GetShortVersionString();
+  // Get a short version string ("1.2.3" or "1.9.3-SNAPSHOT").
+  static std::string GetShortVersionInfo();
+
+  // Get a version string ("kudu 1.2.3 (rev abcdef...)").
+  static std::string GetVersionInfo();
 
   // Get a multi-line string including version info, build time, etc.
   static std::string GetAllVersionInfo();

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/util/website_util.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/website_util.cc b/src/kudu/util/website_util.cc
new file mode 100644
index 0000000..b7d14e5
--- /dev/null
+++ b/src/kudu/util/website_util.cc
@@ -0,0 +1,43 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "kudu/util/website_util.h"
+
+#include "kudu/gutil/strings/substitute.h"
+#include "kudu/util/version_info.h"
+
+using std::string;
+using strings::Substitute;
+
+namespace kudu {
+
+const char* const kKuduUrl = "https://kudu.apache.org";
+
+// Returns a URL for the Kudu website.
+string KuduUrl() {
+  return kKuduUrl;
+}
+
+string KuduDocsUrl() {
+  return Substitute("$0/releases/$1/docs", kKuduUrl, VersionInfo::GetShortVersionInfo());
+}
+
+string KuduDocsTroubleshootingUrl() {
+  return Substitute("$0/troubleshooting.html", KuduDocsUrl());
+}
+
+} // namespace kudu

http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/util/website_util.h
----------------------------------------------------------------------
diff --git a/src/kudu/util/website_util.h b/src/kudu/util/website_util.h
new file mode 100644
index 0000000..6dcf810
--- /dev/null
+++ b/src/kudu/util/website_util.h
@@ -0,0 +1,35 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+#pragma once
+
+#include <string>
+
+namespace kudu {
+
+// Returns a URL for the Kudu website.
+std::string KuduUrl();
+
+// Returns the base URL for this Kudu version's documentation.
+// Of course, if this version of Kudu isn't released, the link won't work.
+std::string KuduDocsUrl();
+
+// Returns a link to this Kudu version's troubleshooting docs. Useful to put in
+// error messages for common problems covered in the troubleshooting docs,
+// but whose solutions are too complex or varied to put in a log message.
+std::string KuduDocsTroubleshootingUrl();
+
+} // namespace kudu