Posted to commits@kudu.apache.org by gr...@apache.org on 2019/02/19 23:00:20 UTC
[kudu] branch branch-1.9.x updated (8f16041 -> 272e98d)
This is an automated email from the ASF dual-hosted git repository.
granthenke pushed a change to branch branch-1.9.x
in repository https://gitbox.apache.org/repos/asf/kudu.git.
from 8f16041 log_block_manager: fix invalid pointer
new 6854cec Fix tracing of log appending and reduce log-related log spam
new 741db85 KUDU-2686 python: remove multiprocessing
new 272e98d [minicluster] Fix building with python 2.6
The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
Summary of changes:
.dockerignore | 5 +++++
.../mini-cluster/relocate_binaries_for_mini_cluster.py | 2 +-
docker/bootstrap-dev-env.sh | 2 ++
python/kudu/tests/test_scantoken.py | 17 ++++-------------
python/setup.py | 5 -----
src/kudu/consensus/log.cc | 7 ++++---
6 files changed, 16 insertions(+), 22 deletions(-)
[kudu] 02/03: KUDU-2686 python: remove multiprocessing
Posted by gr...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
granthenke pushed a commit to branch branch-1.9.x
in repository https://gitbox.apache.org/repos/asf/kudu.git
commit 741db8534109ed075a2dc9792cc773e165efd172
Author: Andrew Wong <aw...@cloudera.com>
AuthorDate: Thu Feb 14 14:25:32 2019 -0800
KUDU-2686 python: remove multiprocessing
The Python multiprocessing library doesn't play nicely when Pools are
created after complex state (e.g. the KuduClient) has been initialized.
Because it forks worker processes, the library can copy held lock state
into the children, leading to odd situations like multiple threads
waiting on a lock that isn't held by any thread in the process.
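A minimal, self-contained sketch of that hazard (illustrative only, not
Kudu code; POSIX-only since it uses os.fork directly, and Python 3 for
the acquire timeout):
import os
import threading
import time

lock = threading.Lock()

def hold_lock_briefly():
    with lock:
        time.sleep(2)  # simulate work done while holding the lock

t = threading.Thread(target=hold_lock_briefly)
t.start()
time.sleep(0.1)  # make sure the helper thread has the lock at fork time

pid = os.fork()  # a fork-based Pool does this under the hood
if pid == 0:
    # Child: the copied lock is "held", but its owner thread was never
    # forked, so nothing will ever release it. Without the timeout this
    # acquire would hang forever -- the shape of the KUDU-2686 hang.
    acquired = lock.acquire(timeout=5)
    print('child acquired lock:', acquired)  # False: no owner exists here
    os._exit(0)
else:
    t.join()
    os.waitpid(pid, 0)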
This was the case in KUDU-2686, which resulted in a hang in
test_scantoken.py. Upon inspection[1], there appeared to be multiple
threads in a process waiting on the same sys_futex, though none of them
held it. [2] has some tips for making the issue more reproducible
(tested on Ubuntu 14.04, where it was first reported).
Starting with python3.4 there is supposedly a way around this: using a
different process start method (e.g. 'spawn' instead of 'fork'), though
this isn't available in python2.
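For reference, the Python 3.4+ escape hatch is the selectable start
method; with 'spawn', workers begin life as fresh interpreters rather
than forked copies, so no held-lock state is inherited. A sketch of that
API (not something these Python 2 based tests could use):
import multiprocessing as mp

def work(x):
    return x * 2

if __name__ == '__main__':
    # 'spawn' starts each worker from a fresh interpreter; nothing from
    # the parent's threads or locks is inherited, avoiding the fork hazard.
    mp.set_start_method('spawn')
    with mp.Pool(4) as pool:
        print(pool.map(work, range(8)))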
This patch removes usage of multiprocessing entirely. It can be argued
that multiprocessing provides extra test coverage, but given that
multiprocessing is known to have such issues[3][4], that this isn't the
first time we've been bitten by this forking issue, that the change is
test-only, and that scan tokens are also tested in the C++ client,
"dumbing down" the test doesn't seem unreasonable.
[1] https://gist.github.com/andrwng/d2d21c551362ddd564926c2a4ec406ae
[2] https://gist.github.com/andrwng/cc6c211c62b1235cc58944d513ba6655
[3] https://github.com/pytest-dev/pytest/issues/958
[4] https://codewithoutrules.com/2018/09/04/python-multiprocessing/
Change-Id: Ia9aa91191d54801731da27e5f132b3c96af0efa1
Reviewed-on: http://gerrit.cloudera.org:8080/12494
Tested-by: Kudu Jenkins
Reviewed-by: Jordan Birdsell <jt...@apache.org>
Reviewed-by: Alexey Serbin <as...@cloudera.com>
Reviewed-on: http://gerrit.cloudera.org:8080/12515
Reviewed-by: Grant Henke <gr...@apache.org>
---
python/kudu/tests/test_scantoken.py | 17 ++++-------------
python/setup.py | 5 -----
2 files changed, 4 insertions(+), 18 deletions(-)
diff --git a/python/kudu/tests/test_scantoken.py b/python/kudu/tests/test_scantoken.py
index c453659..37b273d 100644
--- a/python/kudu/tests/test_scantoken.py
+++ b/python/kudu/tests/test_scantoken.py
@@ -20,7 +20,6 @@ from kudu.compat import unittest
from kudu.tests.util import TestScanBase
from kudu.tests.common import KuduTestBase
import kudu
-from multiprocessing import Pool
import datetime
import time
@@ -44,21 +43,13 @@ class TestScanToken(TestScanBase):
def _subtest_serialize_thread_and_verify(self, tokens, expected_tuples, count_only=False):
"""
- Given the input serialized tokens, spawn new threads,
- execute them and validate the results
+ Given the input serialized tokens, hydrate the scanners, execute the
+ scans, and validate the results
"""
- input = [(token.serialize(), self.master_hosts, self.master_ports)
- for token in tokens]
-
- # Begin process pool
- pool = Pool(len(input))
- try:
- results = pool.map(_get_scan_token_results, input)
- finally:
- pool.close()
- pool.join()
+ input = [(token.serialize(), self.master_hosts, self.master_ports) for token in tokens]
# Validate results
+ results = [_get_scan_token_results(i) for i in input]
actual_tuples = []
for result in results:
actual_tuples += result
diff --git a/python/setup.py b/python/setup.py
index 87872bc..a09f7a9 100644
--- a/python/setup.py
+++ b/python/setup.py
@@ -30,11 +30,6 @@ import os
import re
import subprocess
-# Workaround a Python bug in which multiprocessing's atexit handler doesn't
-# play well with pytest. See http://bugs.python.org/issue15881 for details
-# and this suggested workaround (comment msg170215 in the thread).
-import multiprocessing
-
if Cython.__version__ < '0.21.0':
raise Exception('Please upgrade to Cython 0.21.0 or newer')
[kudu] 03/03: [minicluster] Fix building with python 2.6
Posted by gr...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
granthenke pushed a commit to branch branch-1.9.x
in repository https://gitbox.apache.org/repos/asf/kudu.git
commit 272e98dea63afb54b50347fdae9491c27c7dbce4
Author: Grant Henke <gr...@apache.org>
AuthorDate: Tue Feb 19 15:53:16 2019 -0600
[minicluster] Fix building with python 2.6
This patch fixes building the minicluster binary
jar with python 2.6. This is important because python
2.6 is the default version on CentOS 6, which is the
primary Linux OS that the kudu-binary jar is built on.
Because I used Docker to test this, the related Docker
build fixes are included too.
Change-Id: Id9e7ce9a1fd7ac813866678d6ec804fdf91ea729
Reviewed-on: http://gerrit.cloudera.org:8080/12527
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Reviewed-by: Andrew Wong <aw...@cloudera.com>
Tested-by: Grant Henke <gr...@apache.org>
---
.dockerignore | 5 +++++
build-support/mini-cluster/relocate_binaries_for_mini_cluster.py | 2 +-
docker/bootstrap-dev-env.sh | 2 ++
3 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/.dockerignore b/.dockerignore
index 5316c91..007dd33 100644
--- a/.dockerignore
+++ b/.dockerignore
@@ -21,6 +21,11 @@
*
# General top level source files.
+!CONTRIBUTING.adoc
+!LICENSE.txt
+!NOTICE.txt
+!README.adoc
+!RELEASING.adoc
!version.txt
# Docker files.
diff --git a/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py b/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py
index 30ecdba..87bb602 100755
--- a/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py
+++ b/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py
@@ -116,7 +116,7 @@ def check_for_command(command):
Ensure that the specified command is available on the PATH.
"""
try:
- _ = subprocess.check_output(['which', command])
+ _ = check_output(['which', command])
except subprocess.CalledProcessError as err:
logging.error("Unable to find %s command", command)
raise err
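The hunk above switches from subprocess.check_output to a module-local
check_output helper; subprocess.check_output only exists in Python 2.7+,
so a 2.6-compatible script needs a backport. A sketch of what such a
shim typically looks like (an assumption about the helper's shape, not
the actual code in this script):
import subprocess

try:
    from subprocess import check_output  # Python 2.7+
except ImportError:
    # Python 2.6: subprocess has no check_output(), so emulate it on
    # top of Popen with roughly the same contract.
    def check_output(cmd, **kwargs):
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, **kwargs)
        out, _ = proc.communicate()
        if proc.returncode != 0:
            raise subprocess.CalledProcessError(proc.returncode, cmd)
        return out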
diff --git a/docker/bootstrap-dev-env.sh b/docker/bootstrap-dev-env.sh
index bf966fb..4cbddcf 100755
--- a/docker/bootstrap-dev-env.sh
+++ b/docker/bootstrap-dev-env.sh
@@ -58,6 +58,7 @@ if [[ -f "/usr/bin/yum" ]]; then
yum install -y \
autoconf \
automake \
+ chrpath \
cyrus-sasl-devel \
cyrus-sasl-gssapi \
cyrus-sasl-plain \
@@ -142,6 +143,7 @@ elif [[ -f "/usr/bin/apt-get" ]]; then
apt-get install -y --no-install-recommends \
autoconf \
automake \
+ chrpath \
curl \
flex \
g++ \
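The chrpath package added above is presumably what the relocation script
shells out to in order to rewrite each bundled binary's RPATH so that
libraries resolve relative to the binary itself. A hypothetical
illustration (the actual invocation in
relocate_binaries_for_mini_cluster.py may differ):
import subprocess

def set_relative_rpath(binary_path, rpath='$ORIGIN/../lib'):
    # chrpath -r replaces the RPATH entry recorded in the ELF header
    # in place, so the relocated binary finds its bundled libraries.
    subprocess.check_call(['chrpath', '-r', rpath, binary_path])

set_relative_rpath('build/latest/bin/kudu-tserver')  # hypothetical path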
[kudu] 01/03: Fix tracing of log appending and reduce log-related log spam
Posted by gr...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
granthenke pushed a commit to branch branch-1.9.x
in repository https://gitbox.apache.org/repos/asf/kudu.git
commit 6854cec5a0509d91738d9dc6778748ee1a50b50e
Author: Will Berkeley <wd...@gmail.org>
AuthorDate: Thu Feb 14 16:25:18 2019 -0800
Fix tracing of log appending and reduce log-related log spam
While messing around on a server I noticed a funny trace:
0212 16:29:26.554846 (+ 0us) service_pool.cc:163] Inserting onto call queue
0212 16:29:26.554854 (+ 8us) service_pool.cc:222] Handling call
0212 16:29:28.219369 (+1664515us) inbound_call.cc:157] Queueing success response
Related trace 'txn':
0212 16:29:26.561719 (+ 0us) write_transaction.cc:101] PREPARE: Starting
0212 16:29:26.562102 (+ 383us) write_transaction.cc:268] Acquiring schema lock in shared mode
0212 16:29:26.562103 (+ 1us) write_transaction.cc:271] Acquired schema lock
0212 16:29:26.562104 (+ 1us) tablet.cc:400] PREPARE: Decoding operations
0212 16:29:26.599420 (+ 37316us) tablet.cc:422] PREPARE: Acquiring locks for 6376 operations
0212 16:29:26.611285 (+ 11865us) tablet.cc:426] PREPARE: locks acquired
0212 16:29:26.611286 (+ 1us) write_transaction.cc:126] PREPARE: finished.
0212 16:29:26.611389 (+ 103us) write_transaction.cc:136] Start()
0212 16:29:26.611392 (+ 3us) write_transaction.cc:141] Timestamp: P: 1550017766611388 usec, L: 0
0212 16:29:26.613188 (+ 1796us) log.cc:582] Serialized 10493083 byte log entry
0212 16:29:26.735023 (+121835us) write_transaction.cc:149] APPLY: Starting
0212 16:29:28.213010 (+1477987us) tablet_metrics.cc:365] ProbeStats: bloom_lookups=27143,key_file_lookups=5,delta_file_lookups=0,mrs_lookups=6376
0212 16:29:28.214317 (+ 1307us) log.cc:582] Serialized 38279 byte log entry
0212 16:29:28.214376 (+ 59us) write_transaction.cc:309] Releasing row and schema locks
0212 16:29:28.216505 (+ 2129us) write_transaction.cc:277] Released schema lock
0212 16:29:28.219357 (+ 2852us) write_transaction.cc:196] FINISH: updating metrics
0212 16:29:28.219709 (+ 352us) write_transaction.cc:309] Releasing row and schema locks
0212 16:29:28.219709 (+ 0us) write_transaction.cc:277] Released schema lock
0212 16:29:28.824261 (+604552us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegment4us8LW
0212 16:29:29.531241 (+706980us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentGJU550
0212 16:29:30.254932 (+723691us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmenteWEBF9
0212 16:29:31.005554 (+750622us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentsAIJzm
0212 16:29:31.695339 (+689785us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentsBimGD
0212 16:29:32.443351 (+748012us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentamea7Y
0212 16:29:33.180797 (+737446us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentSam2No
0212 16:29:33.905360 (+724563us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmenthfhQDS
0212 16:29:34.634994 (+729634us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentxDVwMq
0212 16:29:35.384800 (+749806us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentjMi862
0212 16:29:36.080099 (+695299us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentRgaDAJ
0212 16:29:36.822293 (+742194us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentwSXUHR
0212 16:29:37.540434 (+718141us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentCp3SMG
0212 16:29:38.289865 (+749431us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentyEttfA
0212 16:29:38.993878 (+704013us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegment7OTQ23
0212 16:29:39.759563 (+765685us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentzk7y0x
0212 16:29:40.495284 (+735721us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmenttvkp8B
0212 16:29:41.289037 (+793753us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentnhuvEK
0212 16:29:41.993331 (+704294us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentVCJgnX
0212 16:29:42.710348 (+717017us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentrZTzve
0212 16:29:43.458553 (+748205us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentffNUWz
The issue is that this txn caused the append thread to wake up, so the
append thread adopted the txn's trace. Several batches were then written
to the WAL during that one active period of the append thread, and some
of them triggered WAL preallocation. The resulting trace messages were
appended to the trace buffer of the original txn that woke the append
thread, even though that txn had nothing to do with those
preallocations; in the default case it isn't even blocked by them,
because preallocation of WAL segments is asynchronous. So the trace is
misleading, as none of those preallocations delayed the write.
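A toy model of that misattribution in plain Python threading
(deliberately simplified; Kudu's trace machinery is C++ and differs in
detail): the consumer adopts the trace of whichever batch woke it, then
records events for every batch it drains during that wakeup.
import queue
import threading

class Trace(object):
    def __init__(self, name):
        self.name = name
        self.events = []

wal_queue = queue.Queue()

def append_thread():
    done = False
    while not done:
        item = wal_queue.get()
        if item is None:
            return
        adopted, first = item  # adopt the trace of the batch that woke us
        batches = [first]
        # Drain everything queued during this single active period.
        while not wal_queue.empty():
            nxt = wal_queue.get_nowait()
            if nxt is None:
                done = True
                break
            batches.append(nxt[1])
        for b in batches:
            # Events for later batches -- e.g. segment preallocation they
            # trigger -- land in the adopted (first) trace anyway.
            adopted.events.append('preallocating segment while appending ' + b)

traces = [Trace('txn-%d' % i) for i in range(3)]
for tr in traces:
    wal_queue.put((tr, tr.name))
wal_queue.put(None)

t = threading.Thread(target=append_thread)
t.start()
t.join()
for tr in traces:
    print(tr.name, tr.events)
# txn-0's trace ends up carrying preallocation events for txn-1's and
# txn-2's batches, just like the trace above.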
The simple thing to do is to remove the TRACE() call for preallocation.
There are still TRACE_EVENT()s for tracking the activity of the append
thread via the tracing interface, and to cover the case of synchronous
segment allocation and rolling, I added a scoped latency counter for
synchronous log rolls.
Additionally, I noticed that logs tend to get spammed with log rolling
and preallocation messages like
I0214 16:38:30.307582 14666 log.cc:649] T f445674ede8d4b6590ad5002764eeae7 P 09d6bf7a02124145b43f43cb7a667b3d: Max segment size reached. Starting new segment allocation
I0214 16:38:30.430410 14666 log.cc:576] T f445674ede8d4b6590ad5002764eeae7 P 09d6bf7a02124145b43f43cb7a667b3d: Rolled over to a new log segment at /data/15/kudu/wals/f445674ede8d4b6590ad5002764eeae7/wal-000001116
These messages aren't really useful. If preallocation or rolling is
slow, there are already slow execution logging scopes that will alert us
through the logs. I dropped their level to VLOG(1).
Change-Id: Ia50698e3af321b4ab87ee3974525dea6fc551613
Reviewed-on: http://gerrit.cloudera.org:8080/12491
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Reviewed-by: Andrew Wong <aw...@cloudera.com>
Tested-by: Kudu Jenkins
(cherry picked from commit 3574e3eb55a6bc815a9095636c4e08edcfa7c6b0)
Reviewed-on: http://gerrit.cloudera.org:8080/12509
---
src/kudu/consensus/log.cc | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/src/kudu/consensus/log.cc b/src/kudu/consensus/log.cc
index 845b446..c61fcde 100644
--- a/src/kudu/consensus/log.cc
+++ b/src/kudu/consensus/log.cc
@@ -573,7 +573,8 @@ Status Log::RollOver() {
RETURN_NOT_OK(SwitchToAllocatedSegment());
- LOG_WITH_PREFIX(INFO) << "Rolled over to a new log segment at " << active_segment_->path();
+ VLOG_WITH_PREFIX(1) << "Rolled over to a new log segment at "
+ << active_segment_->path();
return Status::OK();
}
@@ -646,10 +647,11 @@ Status Log::DoAppend(LogEntryBatch* entry_batch) {
// if the size of this entry overflows the current segment, get a new one
if (allocation_state() == kAllocationNotStarted) {
if ((active_segment_->Size() + entry_batch_bytes + 4) > max_segment_size_) {
- LOG_WITH_PREFIX(INFO) << "Max segment size reached. Starting new segment allocation";
+ VLOG_WITH_PREFIX(1) << "Max segment size reached. Starting new segment allocation";
RETURN_NOT_OK(AsyncAllocateSegment());
if (!options_.async_preallocate_segments) {
LOG_SLOW_EXECUTION(WARNING, 50, Substitute("$0Log roll took a long time", LogPrefix())) {
+ TRACE_COUNTER_SCOPE_LATENCY_US("log_roll");
RETURN_NOT_OK(RollOver());
}
}
@@ -1038,7 +1040,6 @@ Status Log::PreAllocateNewSegment() {
Status::IOError("Injected IOError in Log::PreAllocateNewSegment()"));
if (options_.preallocate_segments) {
- TRACE("Preallocating $0 byte segment in $1", max_segment_size_, next_segment_path_);
RETURN_NOT_OK(env_util::VerifySufficientDiskSpace(fs_manager_->env(),
next_segment_path_,
max_segment_size_,