Posted to commits@kudu.apache.org by gr...@apache.org on 2019/02/19 23:00:21 UTC
[kudu] 01/03: Fix tracing of log appending and reduce log-related log spam
This is an automated email from the ASF dual-hosted git repository.
granthenke pushed a commit to branch branch-1.9.x
in repository https://gitbox.apache.org/repos/asf/kudu.git
commit 6854cec5a0509d91738d9dc6778748ee1a50b50e
Author: Will Berkeley <wd...@gmail.com>
AuthorDate: Thu Feb 14 16:25:18 2019 -0800
Fix tracing of log appending and reduce log-related log spam
While messing around on a server I noticed a funny trace:
0212 16:29:26.554846 (+ 0us) service_pool.cc:163] Inserting onto call queue
0212 16:29:26.554854 (+ 8us) service_pool.cc:222] Handling call
0212 16:29:28.219369 (+1664515us) inbound_call.cc:157] Queueing success response
Related trace 'txn':
0212 16:29:26.561719 (+ 0us) write_transaction.cc:101] PREPARE: Starting
0212 16:29:26.562102 (+ 383us) write_transaction.cc:268] Acquiring schema lock in shared mode
0212 16:29:26.562103 (+ 1us) write_transaction.cc:271] Acquired schema lock
0212 16:29:26.562104 (+ 1us) tablet.cc:400] PREPARE: Decoding operations
0212 16:29:26.599420 (+ 37316us) tablet.cc:422] PREPARE: Acquiring locks for 6376 operations
0212 16:29:26.611285 (+ 11865us) tablet.cc:426] PREPARE: locks acquired
0212 16:29:26.611286 (+ 1us) write_transaction.cc:126] PREPARE: finished.
0212 16:29:26.611389 (+ 103us) write_transaction.cc:136] Start()
0212 16:29:26.611392 (+ 3us) write_transaction.cc:141] Timestamp: P: 1550017766611388 usec, L: 0
0212 16:29:26.613188 (+ 1796us) log.cc:582] Serialized 10493083 byte log entry
0212 16:29:26.735023 (+121835us) write_transaction.cc:149] APPLY: Starting
0212 16:29:28.213010 (+1477987us) tablet_metrics.cc:365] ProbeStats: bloom_lookups=27143,key_file_lookups=5,delta_file_lookups=0,mrs_lookups=6376
0212 16:29:28.214317 (+ 1307us) log.cc:582] Serialized 38279 byte log entry
0212 16:29:28.214376 (+ 59us) write_transaction.cc:309] Releasing row and schema locks
0212 16:29:28.216505 (+ 2129us) write_transaction.cc:277] Released schema lock
0212 16:29:28.219357 (+ 2852us) write_transaction.cc:196] FINISH: updating metrics
0212 16:29:28.219709 (+ 352us) write_transaction.cc:309] Releasing row and schema locks
0212 16:29:28.219709 (+ 0us) write_transaction.cc:277] Released schema lock
0212 16:29:28.824261 (+604552us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegment4us8LW
0212 16:29:29.531241 (+706980us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentGJU550
0212 16:29:30.254932 (+723691us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmenteWEBF9
0212 16:29:31.005554 (+750622us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentsAIJzm
0212 16:29:31.695339 (+689785us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentsBimGD
0212 16:29:32.443351 (+748012us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentamea7Y
0212 16:29:33.180797 (+737446us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentSam2No
0212 16:29:33.905360 (+724563us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmenthfhQDS
0212 16:29:34.634994 (+729634us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentxDVwMq
0212 16:29:35.384800 (+749806us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentjMi862
0212 16:29:36.080099 (+695299us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentRgaDAJ
0212 16:29:36.822293 (+742194us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentwSXUHR
0212 16:29:37.540434 (+718141us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentCp3SMG
0212 16:29:38.289865 (+749431us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentyEttfA
0212 16:29:38.993878 (+704013us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegment7OTQ23
0212 16:29:39.759563 (+765685us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentzk7y0x
0212 16:29:40.495284 (+735721us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmenttvkp8B
0212 16:29:41.289037 (+793753us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentnhuvEK
0212 16:29:41.993331 (+704294us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentVCJgnX
0212 16:29:42.710348 (+717017us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentrZTzve
0212 16:29:43.458553 (+748205us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentffNUWz
The issue is that this txn caused the append thread to wake up, so the
append thread adopted the txn's trace. Several batches were then written
to the WAL during that one active period of the append thread, and some
of them triggered WAL preallocation. The resulting trace messages were
appended to the trace buffer of the original txn that woke the append
thread, even though that txn had nothing to do with those preallocations.
In the default configuration the txn isn't even blocked by them, because
preallocation of WAL segments is asynchronous. So the trace is
misleading: none of those preallocations delayed the write.
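To make the adoption mechanism concrete, here is a minimal,
self-contained C++ sketch. It is a simplified model, not Kudu's actual
Trace/ADOPT_TRACE implementation: the Trace struct, TraceMessage(), and
AppendThreadWakeup() below are illustrative stand-ins.

#include <iostream>
#include <string>
#include <vector>

// Simplified stand-in for a per-request trace buffer.
struct Trace {
  std::vector<std::string> messages;
};

// The thread's currently adopted trace, analogous to the thread-local
// "current trace" that a TRACE() macro would write to.
thread_local Trace* current_trace = nullptr;

// Stand-in for TRACE(): appends to whatever trace the current thread
// has adopted, if any.
void TraceMessage(const std::string& msg) {
  if (current_trace != nullptr) {
    current_trace->messages.push_back(msg);
  }
}

// Models one active period of the append thread. It adopts the trace of
// the txn that woke it, then drains many queued batches. Any
// TraceMessage() call made while draining -- including preallocations
// triggered by other txns' batches -- lands in the adopted trace.
void AppendThreadWakeup(Trace* waking_txns_trace, int queued_batches) {
  current_trace = waking_txns_trace;  // adopt the waker's trace
  for (int i = 0; i < queued_batches; i++) {
    // Suppose some batches overflow the segment and preallocate a new one.
    TraceMessage("Preallocating 8388608 byte segment");
  }
  current_trace = nullptr;
}

int main() {
  Trace txn_trace;
  AppendThreadWakeup(&txn_trace, 3);
  // All three preallocation messages were attributed to the one txn that
  // happened to wake the thread, even though none of them blocked it.
  for (const auto& m : txn_trace.messages) std::cout << m << "\n";
}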
The simple fix is to remove the TRACE() call for preallocation. There
are still TRACE_EVENT()s tracking the activity of the append thread
through the tracing interface. To cover the case of synchronous segment
allocation and rolling, I added a scoped latency counter for
synchronous log rolls.
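The scoped latency counter is the standard RAII timing pattern: start a
clock on construction, record elapsed microseconds on destruction. A
rough sketch of the idea (an illustration of the pattern only, not the
body of Kudu's TRACE_COUNTER_SCOPE_LATENCY_US macro, which records into
the current trace rather than printing):

#include <chrono>
#include <iostream>

// RAII helper: measures how long the enclosing scope takes and records
// the elapsed microseconds against a named counter on destruction.
class ScopedLatencyCounterUs {
 public:
  explicit ScopedLatencyCounterUs(const char* counter_name)
      : counter_name_(counter_name),
        start_(std::chrono::steady_clock::now()) {}

  ~ScopedLatencyCounterUs() {
    auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::steady_clock::now() - start_).count();
    // A real implementation would bump a counter on the current trace;
    // here we just print it.
    std::cout << counter_name_ << ": " << elapsed << "us\n";
  }

 private:
  const char* counter_name_;
  std::chrono::steady_clock::time_point start_;
};

int main() {
  {
    ScopedLatencyCounterUs latency("log_roll");
    // ... a synchronous log roll would happen here ...
  }  // elapsed time is recorded when the scope closes
}

Because the counter lives inside the synchronous-roll branch, it
measures exactly the time a writer actually spent waiting on a roll.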
Additionally, I noticed that logs tend to get spammed with log rolling
and preallocation messages like
I0214 16:38:30.307582 14666 log.cc:649] T f445674ede8d4b6590ad5002764eeae7 P 09d6bf7a02124145b43f43cb7a667b3d: Max segment size reached. Starting new segment allocation
I0214 16:38:30.430410 14666 log.cc:576] T f445674ede8d4b6590ad5002764eeae7 P 09d6bf7a02124145b43f43cb7a667b3d: Rolled over to a new log segment at /data/15/kudu/wals/f445674ede8d4b6590ad5002764eeae7/wal-000001116
These messages aren't really useful. If preallocation or rolling is
slow, there are already slow-execution logging scopes that will alert us
through the logs. I dropped these messages to VLOG(1).
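The slow-execution scope mentioned above is the inverse pattern: time
the scope, stay silent on the fast path, and log only past a threshold.
A hedged sketch of the pattern (not Kudu's LOG_SLOW_EXECUTION macro
itself; the 50ms threshold mirrors the one visible in the diff below):

#include <chrono>
#include <iostream>
#include <string>
#include <utility>

// RAII helper: warns only if the enclosing scope takes longer than the
// given threshold, so fast executions produce no log output at all.
class SlowExecutionWarner {
 public:
  SlowExecutionWarner(std::string label, int threshold_ms)
      : label_(std::move(label)),
        threshold_ms_(threshold_ms),
        start_(std::chrono::steady_clock::now()) {}

  ~SlowExecutionWarner() {
    auto elapsed_ms = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start_).count();
    if (elapsed_ms > threshold_ms_) {
      std::cerr << "WARNING: " << label_ << " took " << elapsed_ms << "ms\n";
    }
  }

 private:
  std::string label_;
  int threshold_ms_;
  std::chrono::steady_clock::time_point start_;
};

int main() {
  {
    SlowExecutionWarner warn("Log roll took a long time", 50);
    // ... roll over to a new log segment ...
  }  // warns only if the roll took more than 50ms
}

And since VLOG(1) messages are gated by glog's -v/--vmodule flags, they
can still be turned back on when debugging, without a code change.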
Change-Id: Ia50698e3af321b4ab87ee3974525dea6fc551613
Reviewed-on: http://gerrit.cloudera.org:8080/12491
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Reviewed-by: Andrew Wong <aw...@cloudera.com>
Tested-by: Kudu Jenkins
(cherry picked from commit 3574e3eb55a6bc815a9095636c4e08edcfa7c6b0)
Reviewed-on: http://gerrit.cloudera.org:8080/12509
---
src/kudu/consensus/log.cc | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/src/kudu/consensus/log.cc b/src/kudu/consensus/log.cc
index 845b446..c61fcde 100644
--- a/src/kudu/consensus/log.cc
+++ b/src/kudu/consensus/log.cc
@@ -573,7 +573,8 @@ Status Log::RollOver() {
 
   RETURN_NOT_OK(SwitchToAllocatedSegment());
 
-  LOG_WITH_PREFIX(INFO) << "Rolled over to a new log segment at " << active_segment_->path();
+  VLOG_WITH_PREFIX(1) << "Rolled over to a new log segment at "
+                      << active_segment_->path();
 
   return Status::OK();
 }
@@ -646,10 +647,11 @@ Status Log::DoAppend(LogEntryBatch* entry_batch) {
     // if the size of this entry overflows the current segment, get a new one
     if (allocation_state() == kAllocationNotStarted) {
       if ((active_segment_->Size() + entry_batch_bytes + 4) > max_segment_size_) {
-        LOG_WITH_PREFIX(INFO) << "Max segment size reached. Starting new segment allocation";
+        VLOG_WITH_PREFIX(1) << "Max segment size reached. Starting new segment allocation";
         RETURN_NOT_OK(AsyncAllocateSegment());
         if (!options_.async_preallocate_segments) {
           LOG_SLOW_EXECUTION(WARNING, 50, Substitute("$0Log roll took a long time", LogPrefix())) {
+            TRACE_COUNTER_SCOPE_LATENCY_US("log_roll");
             RETURN_NOT_OK(RollOver());
           }
         }
@@ -1038,7 +1040,6 @@ Status Log::PreAllocateNewSegment() {
                        Status::IOError("Injected IOError in Log::PreAllocateNewSegment()"));
 
   if (options_.preallocate_segments) {
-    TRACE("Preallocating $0 byte segment in $1", max_segment_size_, next_segment_path_);
     RETURN_NOT_OK(env_util::VerifySufficientDiskSpace(fs_manager_->env(),
                                                       next_segment_path_,
                                                       max_segment_size_,