Posted to commits@kudu.apache.org by ad...@apache.org on 2017/09/19 18:41:28 UTC
kudu git commit: Link to troubleshooting docs for two common problems
Repository: kudu
Updated Branches:
refs/heads/master 5c8a8a2ab -> 501112b5d
Link to troubleshooting docs for two common problems
From experience, users don't know what to do when they see timeouts or
logs with "Soft memory limit exceeded" messages. Let's add a link to
some new troubleshooting docs, to give users an easy way to find solutions.
Likewise, when a disk failure or unexpected change to Kudu's data dirs
happens, users often don't know what to do when seeing the
"FSManager root is not empty" message, so let's add a link to some new
troubleshooting docs for that, too.
The links go to the proper version of the docs, so a 1.6 server will
provide a link to the 1.6 docs. I added a couple of helper functions to
make these links, and put them in a new file since they didn't seem to
fit into any existing place.
Change-Id: Ida7e1495e4ba68f9e9b4d424650d84c0019c9b0f
Reviewed-on: http://gerrit.cloudera.org:8080/8093
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <da...@gmail.com>
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/501112b5
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/501112b5
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/501112b5
Branch: refs/heads/master
Commit: 501112b5d779464210326b219ad1929a74ae2aed
Parents: 5c8a8a2
Author: Will Berkeley <wd...@apache.org>
Authored: Wed Sep 13 16:15:19 2017 -0700
Committer: Adar Dembo <ad...@cloudera.com>
Committed: Tue Sep 19 18:41:01 2017 +0000
----------------------------------------------------------------------
docs/troubleshooting.adoc | 40 ++++++++++++++++++
src/kudu/client/client.cc | 2 +-
src/kudu/fs/fs_manager.cc | 5 ++-
src/kudu/integration-tests/registration-test.cc | 6 +--
src/kudu/master/master-test.cc | 4 +-
src/kudu/master/master.cc | 2 +-
src/kudu/server/server_base.cc | 2 +-
src/kudu/tserver/heartbeater.cc | 2 +-
src/kudu/tserver/tablet_service.cc | 5 ++-
src/kudu/util/CMakeLists.txt | 1 +
src/kudu/util/version_info.cc | 6 ++-
src/kudu/util/version_info.h | 7 +++-
src/kudu/util/website_util.cc | 43 ++++++++++++++++++++
src/kudu/util/website_util.h | 35 ++++++++++++++++
14 files changed, 145 insertions(+), 15 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/docs/troubleshooting.adoc
----------------------------------------------------------------------
diff --git a/docs/troubleshooting.adoc b/docs/troubleshooting.adoc
index 554303b..c43aa19 100644
--- a/docs/troubleshooting.adoc
+++ b/docs/troubleshooting.adoc
@@ -289,6 +289,46 @@ User stack:
These traces can be useful for diagnosing root-cause latency issues when they are caused by systems
below Kudu, such as disk controllers or file systems.
+[[memory_limits]]
+=== Memory Limits
+
+Kudu has a hard and soft memory limit. The hard memory limit is the maximum amount a Kudu process
+is allowed to use, and is controlled by the `--memory_limit_hard_bytes` flag. The soft memory limit
+is a percentage of the hard memory limit, controlled by the flag `--memory_limit_soft_percentage` and
+with a default value of 80%, that determines the amount of memory a process may use before it will
+start rejecting some write operations.
+
+If the logs or RPC traces contain messages like
+
+----
+Service unavailable: Soft memory limit exceeded (at 96.35% of capacity)
+----
+
+then Kudu is rejecting writes due to memory backpressure. This may result in write timeouts. There
+are several ways to relieve the memory pressure on Kudu:
+
+- If the host has more memory available for Kudu, increase `--memory_limit_hard_bytes`.
+- Increase the rate at which Kudu can flush writes from memory to disk by increasing the number of
+ disks or increasing the number of maintenance manager threads `--maintenance_manager_num_threads`.
+ Generally, the recommended ratio of maintenance manager threads to data directories is 1:3.
+- Reduce the volume of writes flowing to Kudu on the application side.
+
+[[disk_issues]]
+=== Disk Issues
+
+When Kudu starts, it checks each configured data directory, expecting either for all to be
+initialized or for all to be empty. If a server fails to start with a log message like
+
+----
+Check failed: _s.ok() Bad status: Already present: Could not create new FS layout: FSManager root is not empty: /data0/kudu/data
+----
+
+then this precondition has failed. This could be because Kudu was configured with non-empty data
+directories on first startup, or because a previously-running, healthy Kudu process was restarted
+and at least one data directory was deleted or is somehow corrupted, perhaps because of a disk
+error. If in the latter situation, consult the
+link:administration.html#change_dir_config[Changing Directory Configurations] documentation.
+
== Issues using Kudu
[[hive_handler]]
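
The relationship between the hard and soft limits described in the Memory Limits section of the docs/troubleshooting.adoc diff above can be sketched as follows. This is a minimal illustration of the arithmetic only, not Kudu's implementation: the function names are hypothetical, and the simple threshold comparison is an assumption (the real check in Kudu may behave differently). The percentage of capacity is assumed to be measured against the hard limit, as the "at 96.35% of capacity" message suggests.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; these are not Kudu functions.

// Soft limit in bytes, derived from the hard limit (--memory_limit_hard_bytes)
// and the soft percentage (--memory_limit_soft_percentage, default 80).
int64_t SoftLimitBytes(int64_t hard_limit_bytes, int soft_percentage) {
  assert(hard_limit_bytes > 0);
  return hard_limit_bytes * soft_percentage / 100;
}

// Current usage expressed as a percentage of the hard limit.
double CapacityPct(int64_t used_bytes, int64_t hard_limit_bytes) {
  assert(hard_limit_bytes > 0);
  return 100.0 * static_cast<double>(used_bytes) /
         static_cast<double>(hard_limit_bytes);
}

// Simplified check (an assumption): usage above the soft limit means
// some writes may be rejected.
bool OverSoftLimit(int64_t used_bytes, int64_t hard_limit_bytes,
                   int soft_percentage) {
  return used_bytes > SoftLimitBytes(hard_limit_bytes, soft_percentage);
}
```

With a hard limit of 1000 bytes and the default 80%, the soft limit is 800 bytes, so a process using 900 bytes would be over the soft limit in this sketch, while one using 700 bytes would not.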
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/client/client.cc
----------------------------------------------------------------------
diff --git a/src/kudu/client/client.cc b/src/kudu/client/client.cc
index 5cec57c..8f76581 100644
--- a/src/kudu/client/client.cc
+++ b/src/kudu/client/client.cc
@@ -238,7 +238,7 @@ Status DisableOpenSSLInitialization() {
}
string GetShortVersionString() {
- return VersionInfo::GetShortVersionString();
+ return VersionInfo::GetVersionInfo();
}
string GetAllVersionInfo() {
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/fs/fs_manager.cc
----------------------------------------------------------------------
diff --git a/src/kudu/fs/fs_manager.cc b/src/kudu/fs/fs_manager.cc
index 90494de..39a386a 100644
--- a/src/kudu/fs/fs_manager.cc
+++ b/src/kudu/fs/fs_manager.cc
@@ -59,6 +59,7 @@
#include "kudu/util/path_util.h"
#include "kudu/util/pb_util.h"
#include "kudu/util/stopwatch.h"
+#include "kudu/util/website_util.h"
DEFINE_bool(enable_data_block_fsync, true,
"Whether to enable fsync() of data blocks, metadata, and their parent directories. "
@@ -370,7 +371,9 @@ Status FsManager::CreateInitialFileSystemLayout(boost::optional<string> uuid) {
RETURN_NOT_OK_PREPEND(IsDirectoryEmpty(root.path, &is_empty),
"Unable to check if FSManager root is empty");
if (!is_empty) {
- return Status::AlreadyPresent("FSManager root is not empty", root.path);
+ return Status::AlreadyPresent(
+ Substitute("FSManager root is not empty. See $0", KuduDocsTroubleshootingUrl()),
+ root.path);
}
}
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/integration-tests/registration-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/integration-tests/registration-test.cc b/src/kudu/integration-tests/registration-test.cc
index 40945ef..6fc8902 100644
--- a/src/kudu/integration-tests/registration-test.cc
+++ b/src/kudu/integration-tests/registration-test.cc
@@ -167,8 +167,8 @@ class RegistrationTest : public KuduTest {
ASSERT_STR_CONTAINS(buf_str, expected_uuid);
// Should check that the TS software version is included on the page.
- // tserver version should be the same as returned by GetShortVersionString()
- ASSERT_STR_CONTAINS(buf_str, VersionInfo::GetShortVersionString());
+ // tserver version should be the same as returned by GetVersionInfo()
+ ASSERT_STR_CONTAINS(buf_str, VersionInfo::GetVersionInfo());
if (contents != nullptr) {
*contents = std::move(buf_str);
}
@@ -248,7 +248,7 @@ TEST_F(RegistrationTest, TestMasterSoftwareVersion) {
SCOPED_TRACE(SecureShortDebugString(reg));
ASSERT_TRUE(reg.has_software_version());
ASSERT_STR_CONTAINS(reg.software_version(),
- VersionInfo::GetShortVersionString());
+ VersionInfo::GetVersionInfo());
}
}
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/master/master-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/master/master-test.cc b/src/kudu/master/master-test.cc
index 7d35e00..499e9c2 100644
--- a/src/kudu/master/master-test.cc
+++ b/src/kudu/master/master-test.cc
@@ -189,7 +189,7 @@ TEST_F(MasterTest, TestRegisterAndHeartbeat) {
ServerRegistrationPB fake_reg;
MakeHostPortPB("localhost", 1000, fake_reg.add_rpc_addresses());
MakeHostPortPB("localhost", 2000, fake_reg.add_http_addresses());
- fake_reg.set_software_version(VersionInfo::GetShortVersionString());
+ fake_reg.set_software_version(VersionInfo::GetVersionInfo());
{
TSHeartbeatRequestPB req;
@@ -339,7 +339,7 @@ TEST_F(MasterTest, TestRegisterAndHeartbeat) {
ASSERT_STREQ("my-ts-uuid", tablet_server["uuid"].GetString());
ASSERT_TRUE(tablet_server["millis_since_heartbeat"].GetInt64() >= 0);
ASSERT_EQ(true, tablet_server["live"].GetBool());
- ASSERT_STREQ(VersionInfo::GetShortVersionString().c_str(),
+ ASSERT_STREQ(VersionInfo::GetVersionInfo().c_str(),
tablet_server["version"].GetString());
}
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/master/master.cc
----------------------------------------------------------------------
diff --git a/src/kudu/master/master.cc b/src/kudu/master/master.cc
index 1a06973..14f8d28 100644
--- a/src/kudu/master/master.cc
+++ b/src/kudu/master/master.cc
@@ -258,7 +258,7 @@ Status Master::InitMasterRegistration() {
RETURN_NOT_OK(AddHostPortPBs(http_addrs, reg.mutable_http_addresses()));
reg.set_https_enabled(web_server()->IsSecure());
}
- reg.set_software_version(VersionInfo::GetShortVersionString());
+ reg.set_software_version(VersionInfo::GetVersionInfo());
 registration_.Swap(&reg);
registration_initialized_.store(true);
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/server/server_base.cc
----------------------------------------------------------------------
diff --git a/src/kudu/server/server_base.cc b/src/kudu/server/server_base.cc
index c3f6bb0..7cd7612 100644
--- a/src/kudu/server/server_base.cc
+++ b/src/kudu/server/server_base.cc
@@ -455,7 +455,7 @@ void ServerBase::ExcessLogFileDeleterThread() {
std::string ServerBase::FooterHtml() const {
return Substitute("<pre>$0\nserver uuid $1</pre>",
- VersionInfo::GetShortVersionString(),
+ VersionInfo::GetVersionInfo(),
instance_pb_->permanent_uuid());
}
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/tserver/heartbeater.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tserver/heartbeater.cc b/src/kudu/tserver/heartbeater.cc
index fe548fe..7ecfd90 100644
--- a/src/kudu/tserver/heartbeater.cc
+++ b/src/kudu/tserver/heartbeater.cc
@@ -339,7 +339,7 @@ Status Heartbeater::Thread::SetupRegistration(ServerRegistrationPB* reg) {
"Failed to add HTTP addresses to registration");
reg->set_https_enabled(server_->web_server()->IsSecure());
}
- reg->set_software_version(VersionInfo::GetShortVersionString());
+ reg->set_software_version(VersionInfo::GetVersionInfo());
return Status::OK();
}
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/tserver/tablet_service.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tserver/tablet_service.cc b/src/kudu/tserver/tablet_service.cc
index 1267570..be86cb3 100644
--- a/src/kudu/tserver/tablet_service.cc
+++ b/src/kudu/tserver/tablet_service.cc
@@ -97,6 +97,7 @@
#include "kudu/util/status_callback.h"
#include "kudu/util/trace.h"
#include "kudu/util/trace_metrics.h"
+#include "kudu/util/website_util.h"
DEFINE_int32(scanner_default_batch_size_bytes, 1024 * 1024,
"The default size for batches of scan results");
@@ -861,8 +862,8 @@ void TabletServiceImpl::Write(const WriteRequestPB* req,
if (process_memory::SoftLimitExceeded(&capacity_pct)) {
tablet->metrics()->leader_memory_pressure_rejections->Increment();
string msg = StringPrintf(
- "Soft memory limit exceeded (at %.2f%% of capacity)",
- capacity_pct);
+ "Soft memory limit exceeded (at %.2f%% of capacity). See %s",
+ capacity_pct, KuduDocsTroubleshootingUrl().c_str());
if (capacity_pct >= FLAGS_memory_limit_warn_threshold_percentage) {
KLOG_EVERY_N_SECS(WARNING, 1) << "Rejecting Write request: " << msg << THROTTLE_MSG;
} else {
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/util/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/src/kudu/util/CMakeLists.txt b/src/kudu/util/CMakeLists.txt
index 077f156..8f58a26 100644
--- a/src/kudu/util/CMakeLists.txt
+++ b/src/kudu/util/CMakeLists.txt
@@ -196,6 +196,7 @@ set(UTIL_SRCS
user.cc
url-coding.cc
version_info.cc
+ website_util.cc
zlib.cc
)
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/util/version_info.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/version_info.cc b/src/kudu/util/version_info.cc
index 3d63126..1dfcdec 100644
--- a/src/kudu/util/version_info.cc
+++ b/src/kudu/util/version_info.cc
@@ -36,7 +36,11 @@ string VersionInfo::GetGitHash() {
return ret;
}
-string VersionInfo::GetShortVersionString() {
+string VersionInfo::GetShortVersionInfo() {
+ return KUDU_VERSION_STRING;
+}
+
+string VersionInfo::GetVersionInfo() {
return strings::Substitute("kudu $0 (rev $1)",
KUDU_VERSION_STRING,
GetGitHash());
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/util/version_info.h
----------------------------------------------------------------------
diff --git a/src/kudu/util/version_info.h b/src/kudu/util/version_info.h
index 5bda97e..e19830d 100644
--- a/src/kudu/util/version_info.h
+++ b/src/kudu/util/version_info.h
@@ -28,8 +28,11 @@ class VersionInfoPB;
// Static functions related to fetching information about the current build.
class VersionInfo {
public:
- // Get a short version string ("kudu 1.2.3 (rev abcdef...)")
- static std::string GetShortVersionString();
+ // Get a short version string ("1.2.3" or "1.9.3-SNAPSHOT").
+ static std::string GetShortVersionInfo();
+
+ // Get a version string ("kudu 1.2.3 (rev abcdef...)").
+ static std::string GetVersionInfo();
// Get a multi-line string including version info, build time, etc.
static std::string GetAllVersionInfo();
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/util/website_util.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/website_util.cc b/src/kudu/util/website_util.cc
new file mode 100644
index 0000000..b7d14e5
--- /dev/null
+++ b/src/kudu/util/website_util.cc
@@ -0,0 +1,43 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "kudu/util/website_util.h"
+
+#include "kudu/gutil/strings/substitute.h"
+#include "kudu/util/version_info.h"
+
+using std::string;
+using strings::Substitute;
+
+namespace kudu {
+
+const char* const kKuduUrl = "https://kudu.apache.org";
+
+// Returns a URL for the Kudu website.
+string KuduUrl() {
+ return kKuduUrl;
+}
+
+string KuduDocsUrl() {
+ return Substitute("$0/releases/$1/docs", kKuduUrl, VersionInfo::GetShortVersionInfo());
+}
+
+string KuduDocsTroubleshootingUrl() {
+ return Substitute("$0/troubleshooting.html", KuduDocsUrl());
+}
+
+} // namespace kudu
http://git-wip-us.apache.org/repos/asf/kudu/blob/501112b5/src/kudu/util/website_util.h
----------------------------------------------------------------------
diff --git a/src/kudu/util/website_util.h b/src/kudu/util/website_util.h
new file mode 100644
index 0000000..6dcf810
--- /dev/null
+++ b/src/kudu/util/website_util.h
@@ -0,0 +1,35 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+#pragma once
+
+#include <string>
+
+namespace kudu {
+
+// Returns a URL for the Kudu website.
+std::string KuduUrl();
+
+// Returns the base URL for this Kudu version's documentation.
+// Of course, if this version of Kudu isn't released, the link won't work.
+std::string KuduDocsUrl();
+
+// Returns a link to this Kudu version's troubleshooting docs. Useful to put in
+// error messages for common problems covered in the troubleshooting docs,
+// but whose solutions are too complex or varied to put in a log message.
+std::string KuduDocsTroubleshootingUrl();
+
+} // namespace kudu
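
The versioned-link scheme introduced by website_util.cc can be tried outside the Kudu tree with a small standalone sketch. The version below uses only the standard library and takes the short version string as a parameter instead of calling VersionInfo::GetShortVersionInfo(); the names DocsUrlFor and TroubleshootingUrlFor are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Base URL, as in website_util.cc.
const char* const kSiteUrl = "https://kudu.apache.org";

// Hypothetical stand-in for KuduDocsUrl(): builds the docs URL for a given
// short version string such as "1.6.0".
std::string DocsUrlFor(const std::string& short_version) {
  assert(!short_version.empty());
  return std::string(kSiteUrl) + "/releases/" + short_version + "/docs";
}

// Hypothetical stand-in for KuduDocsTroubleshootingUrl().
std::string TroubleshootingUrlFor(const std::string& short_version) {
  return DocsUrlFor(short_version) + "/troubleshooting.html";
}
```

For example, TroubleshootingUrlFor("1.6.0") yields https://kudu.apache.org/releases/1.6.0/docs/troubleshooting.html. As the header comment in website_util.h notes, a link built from an unreleased version string (for example, a SNAPSHOT build) will not resolve.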