Posted to commits@kudu.apache.org by gr...@apache.org on 2019/02/19 23:00:20 UTC
[kudu] branch branch-1.9.x updated (8f16041 -> 272e98d)
This is an automated email from the ASF dual-hosted git repository.
granthenke pushed a change to branch branch-1.9.x
in repository https://gitbox.apache.org/repos/asf/kudu.git.
from 8f16041 log_block_manager: fix invalid pointer
new 6854cec Fix tracing of log appending and reduce log-related log spam
new 741db85 KUDU-2686 python: remove multiprocessing
new 272e98d [minicluster] Fix building with python 2.6
The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
Summary of changes:
.dockerignore | 5 +++++
.../mini-cluster/relocate_binaries_for_mini_cluster.py | 2 +-
docker/bootstrap-dev-env.sh | 2 ++
python/kudu/tests/test_scantoken.py | 17 ++++-------------
python/setup.py | 5 -----
src/kudu/consensus/log.cc | 7 ++++---
6 files changed, 16 insertions(+), 22 deletions(-)
[kudu] 02/03: KUDU-2686 python: remove multiprocessing
Posted by gr...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
granthenke pushed a commit to branch branch-1.9.x
in repository https://gitbox.apache.org/repos/asf/kudu.git
commit 741db8534109ed075a2dc9792cc773e165efd172
Author: Andrew Wong <aw...@cloudera.com>
AuthorDate: Thu Feb 14 14:25:32 2019 -0800
KUDU-2686 python: remove multiprocessing
The Python multiprocessing library doesn't play nicely when Pools are
created after complex state (e.g. the KuduClient) has been initialized.
Because it forks worker processes, the library can copy held lock state
into the children, leading to odd situations like multiple threads
waiting on a lock that isn't held by any thread in the process.
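A minimal, self-contained sketch of that hazard (illustrative only, not
Kudu code; POSIX-only since it uses os.fork directly, and Python 3 for
the acquire timeout):
import os
import threading
import time

lock = threading.Lock()

def hold_lock_briefly():
    with lock:
        time.sleep(2)  # simulate work done while holding the lock

t = threading.Thread(target=hold_lock_briefly)
t.start()
time.sleep(0.1)  # make sure the helper thread has the lock at fork time

pid = os.fork()  # a fork-based Pool does this under the hood
if pid == 0:
    # Child: the copied lock is "held", but its owner thread was never
    # forked, so nothing will ever release it. Without the timeout this
    # acquire would hang forever -- the shape of the KUDU-2686 hang.
    acquired = lock.acquire(timeout=5)
    print('child acquired lock:', acquired)  # False: no owner exists here
    os._exit(0)
else:
    t.join()
    os.waitpid(pid, 0)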
This was the case in KUDU-2686, which resulted in a hang in
test_scantoken.py. Upon inspection[1], there appeared to be multiple
threads in a process waiting on the same sys_futex, though none of them
held it. [2] has some tips for making the issue more reproducible
(tested on Ubuntu 14.04, where it was first reported).
Starting with python3.4 there is supposedly a way around this: using a
different process start method (e.g. 'spawn' instead of 'fork'), though
this isn't available in python2.
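For reference, the Python 3.4+ escape hatch is the selectable start
method; with 'spawn', workers begin life as fresh interpreters rather
than forked copies, so no held-lock state is inherited. A sketch of that
API (not something these Python 2 based tests could use):
import multiprocessing as mp

def work(x):
    return x * 2

if __name__ == '__main__':
    # 'spawn' starts each worker from a fresh interpreter; nothing from
    # the parent's threads or locks is inherited, avoiding the fork hazard.
    mp.set_start_method('spawn')
    with mp.Pool(4) as pool:
        print(pool.map(work, range(8)))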
This patch removes usage of multiprocessing entirely. It can be argued
that multiprocessing provides extra test coverage, but given that
multiprocessing is known to have such issues[3][4], that this isn't the
first time we've been bitten by this forking issue, that the change is
test-only, and that scan tokens are also tested in the C++ client,
"dumbing down" the test doesn't seem unreasonable.
[1] https://gist.github.com/andrwng/d2d21c551362ddd564926c2a4ec406ae
[2] https://gist.github.com/andrwng/cc6c211c62b1235cc58944d513ba6655
[3] https://github.com/pytest-dev/pytest/issues/958
[4] https://codewithoutrules.com/2018/09/04/python-multiprocessing/
Change-Id: Ia9aa91191d54801731da27e5f132b3c96af0efa1
Reviewed-on: http://gerrit.cloudera.org:8080/12494
Tested-by: Kudu Jenkins
Reviewed-by: Jordan Birdsell <jt...@apache.org>
Reviewed-by: Alexey Serbin <as...@cloudera.com>
Reviewed-on: http://gerrit.cloudera.org:8080/12515
Reviewed-by: Grant Henke <gr...@apache.org>
---
python/kudu/tests/test_scantoken.py | 17 ++++-------------
python/setup.py | 5 -----
2 files changed, 4 insertions(+), 18 deletions(-)
diff --git a/python/kudu/tests/test_scantoken.py b/python/kudu/tests/test_scantoken.py
index c453659..37b273d 100644
--- a/python/kudu/tests/test_scantoken.py
+++ b/python/kudu/tests/test_scantoken.py
@@ -20,7 +20,6 @@ from kudu.compat import unittest
from kudu.tests.util import TestScanBase
from kudu.tests.common import KuduTestBase
import kudu
-from multiprocessing import Pool
import datetime
import time
@@ -44,21 +43,13 @@ class TestScanToken(TestScanBase):
def _subtest_serialize_thread_and_verify(self, tokens, expected_tuples, count_only=False):
"""
- Given the input serialized tokens, spawn new threads,
- execute them and validate the results
+ Given the input serialized tokens, hydrate the scanners, execute the
+ scans, and validate the results
"""
- input = [(token.serialize(), self.master_hosts, self.master_ports)
- for token in tokens]
-
- # Begin process pool
- pool = Pool(len(input))
- try:
- results = pool.map(_get_scan_token_results, input)
- finally:
- pool.close()
- pool.join()
+ input = [(token.serialize(), self.master_hosts, self.master_ports) for token in tokens]
# Validate results
+ results = [_get_scan_token_results(i) for i in input]
actual_tuples = []
for result in results:
actual_tuples += result
diff --git a/python/setup.py b/python/setup.py
index 87872bc..a09f7a9 100644
--- a/python/setup.py
+++ b/python/setup.py
@@ -30,11 +30,6 @@ import os
import re
import subprocess
-# Workaround a Python bug in which multiprocessing's atexit handler doesn't
-# play well with pytest. See http://bugs.python.org/issue15881 for details
-# and this suggested workaround (comment msg170215 in the thread).
-import multiprocessing
-
if Cython.__version__ < '0.21.0':
raise Exception('Please upgrade to Cython 0.21.0 or newer')
[kudu] 03/03: [minicluster] Fix building with python 2.6
Posted by gr...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
granthenke pushed a commit to branch branch-1.9.x
in repository https://gitbox.apache.org/repos/asf/kudu.git
commit 272e98dea63afb54b50347fdae9491c27c7dbce4
Author: Grant Henke <gr...@apache.org>
AuthorDate: Tue Feb 19 15:53:16 2019 -0600
[minicluster] Fix building with python 2.6
This patch fixes building the minicluster binary
jar with python 2.6. This is important because python
2.6 is the default version on CentOS 6, which is the
primary Linux OS that the kudu-binary jar is built on.
Because I used Docker to test this, the related Docker
build fixes are included too.
Change-Id: Id9e7ce9a1fd7ac813866678d6ec804fdf91ea729
Reviewed-on: http://gerrit.cloudera.org:8080/12527
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Reviewed-by: Andrew Wong <aw...@cloudera.com>
Tested-by: Grant Henke <gr...@apache.org>
---
.dockerignore | 5 +++++
build-support/mini-cluster/relocate_binaries_for_mini_cluster.py | 2 +-
docker/bootstrap-dev-env.sh | 2 ++
3 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/.dockerignore b/.dockerignore
index 5316c91..007dd33 100644
--- a/.dockerignore
+++ b/.dockerignore
@@ -21,6 +21,11 @@
*
# General top level source files.
+!CONTRIBUTING.adoc
+!LICENSE.txt
+!NOTICE.txt
+!README.adoc
+!RELEASING.adoc
!version.txt
# Docker files.
diff --git a/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py b/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py
index 30ecdba..87bb602 100755
--- a/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py
+++ b/build-support/mini-cluster/relocate_binaries_for_mini_cluster.py
@@ -116,7 +116,7 @@ def check_for_command(command):
Ensure that the specified command is available on the PATH.
"""
try:
- _ = subprocess.check_output(['which', command])
+ _ = check_output(['which', command])
except subprocess.CalledProcessError as err:
logging.error("Unable to find %s command", command)
raise err
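The hunk above switches from subprocess.check_output to a module-local
check_output helper; subprocess.check_output only exists in Python 2.7+,
so a 2.6-compatible script needs a backport. A sketch of what such a
shim typically looks like (an assumption about the helper's shape, not
the actual code in this script):
import subprocess

try:
    from subprocess import check_output  # Python 2.7+
except ImportError:
    # Python 2.6: subprocess has no check_output(), so emulate it on
    # top of Popen with roughly the same contract.
    def check_output(cmd, **kwargs):
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, **kwargs)
        out, _ = proc.communicate()
        if proc.returncode != 0:
            raise subprocess.CalledProcessError(proc.returncode, cmd)
        return out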
diff --git a/docker/bootstrap-dev-env.sh b/docker/bootstrap-dev-env.sh
index bf966fb..4cbddcf 100755
--- a/docker/bootstrap-dev-env.sh
+++ b/docker/bootstrap-dev-env.sh
@@ -58,6 +58,7 @@ if [[ -f "/usr/bin/yum" ]]; then
yum install -y \
autoconf \
automake \
+ chrpath \
cyrus-sasl-devel \
cyrus-sasl-gssapi \
cyrus-sasl-plain \
@@ -142,6 +143,7 @@ elif [[ -f "/usr/bin/apt-get" ]]; then
apt-get install -y --no-install-recommends \
autoconf \
automake \
+ chrpath \
curl \
flex \
g++ \
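The chrpath package added above is presumably what the relocation script
shells out to in order to rewrite each bundled binary's RPATH so that
libraries resolve relative to the binary itself. A hypothetical
illustration (the actual invocation in
relocate_binaries_for_mini_cluster.py may differ):
import subprocess

def set_relative_rpath(binary_path, rpath='$ORIGIN/../lib'):
    # chrpath -r replaces the RPATH entry recorded in the ELF header
    # in place, so the relocated binary finds its bundled libraries.
    subprocess.check_call(['chrpath', '-r', rpath, binary_path])

set_relative_rpath('build/latest/bin/kudu-tserver')  # hypothetical path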
[kudu] 01/03: Fix tracing of log appending and reduce log-related log spam
Posted by gr...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
granthenke pushed a commit to branch branch-1.9.x
in repository https://gitbox.apache.org/repos/asf/kudu.git
commit 6854cec5a0509d91738d9dc6778748ee1a50b50e
Author: Will Berkeley <wd...@gmail.org>
AuthorDate: Thu Feb 14 16:25:18 2019 -0800
Fix tracing of log appending and reduce log-related log spam
While messing around on a server I noticed a funny trace:
0212 16:29:26.554846 (+ 0us) service_pool.cc:163] Inserting onto call queue
0212 16:29:26.554854 (+ 8us) service_pool.cc:222] Handling call
0212 16:29:28.219369 (+1664515us) inbound_call.cc:157] Queueing success response
Related trace 'txn':
0212 16:29:26.561719 (+ 0us) write_transaction.cc:101] PREPARE: Starting
0212 16:29:26.562102 (+ 383us) write_transaction.cc:268] Acquiring schema lock in shared mode
0212 16:29:26.562103 (+ 1us) write_transaction.cc:271] Acquired schema lock
0212 16:29:26.562104 (+ 1us) tablet.cc:400] PREPARE: Decoding operations
0212 16:29:26.599420 (+ 37316us) tablet.cc:422] PREPARE: Acquiring locks for 6376 operations
0212 16:29:26.611285 (+ 11865us) tablet.cc:426] PREPARE: locks acquired
0212 16:29:26.611286 (+ 1us) write_transaction.cc:126] PREPARE: finished.
0212 16:29:26.611389 (+ 103us) write_transaction.cc:136] Start()
0212 16:29:26.611392 (+ 3us) write_transaction.cc:141] Timestamp: P: 1550017766611388 usec, L: 0
0212 16:29:26.613188 (+ 1796us) log.cc:582] Serialized 10493083 byte log entry
0212 16:29:26.735023 (+121835us) write_transaction.cc:149] APPLY: Starting
0212 16:29:28.213010 (+1477987us) tablet_metrics.cc:365] ProbeStats: bloom_lookups=27143,key_file_lookups=5,delta_file_lookups=0,mrs_lookups=6376
0212 16:29:28.214317 (+ 1307us) log.cc:582] Serialized 38279 byte log entry
0212 16:29:28.214376 (+ 59us) write_transaction.cc:309] Releasing row and schema locks
0212 16:29:28.216505 (+ 2129us) write_transaction.cc:277] Released schema lock
0212 16:29:28.219357 (+ 2852us) write_transaction.cc:196] FINISH: updating metrics
0212 16:29:28.219709 (+ 352us) write_transaction.cc:309] Releasing row and schema locks
0212 16:29:28.219709 (+ 0us) write_transaction.cc:277] Released schema lock
0212 16:29:28.824261 (+604552us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegment4us8LW
0212 16:29:29.531241 (+706980us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentGJU550
0212 16:29:30.254932 (+723691us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmenteWEBF9
0212 16:29:31.005554 (+750622us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentsAIJzm
0212 16:29:31.695339 (+689785us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentsBimGD
0212 16:29:32.443351 (+748012us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentamea7Y
0212 16:29:33.180797 (+737446us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentSam2No
0212 16:29:33.905360 (+724563us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmenthfhQDS
0212 16:29:34.634994 (+729634us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentxDVwMq
0212 16:29:35.384800 (+749806us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentjMi862
0212 16:29:36.080099 (+695299us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentRgaDAJ
0212 16:29:36.822293 (+742194us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentwSXUHR
0212 16:29:37.540434 (+718141us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentCp3SMG
0212 16:29:38.289865 (+749431us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentyEttfA
0212 16:29:38.993878 (+704013us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegment7OTQ23
0212 16:29:39.759563 (+765685us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentzk7y0x
0212 16:29:40.495284 (+735721us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmenttvkp8B
0212 16:29:41.289037 (+793753us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentnhuvEK
0212 16:29:41.993331 (+704294us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentVCJgnX
0212 16:29:42.710348 (+717017us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentrZTzve
0212 16:29:43.458553 (+748205us) log.cc:1006] Preallocating 8388608 byte segment in /data/15/kudu/wals/b5ecd0b8d23a4a529650e5871741f23e/.kudutmp.newsegmentffNUWz
The issue is that this txn caused the append thread to wake up, so the
append thread adopted the txn's trace. Several batches were then written
to the WAL during that one active period of the append thread, and some
of them triggered WAL preallocation. The resulting trace messages were
appended to the trace buffer of the original txn that woke the append
thread, even though that txn had nothing to do with those
preallocations; in the default case it isn't even blocked by them,
because preallocation of WAL segments is asynchronous. So the trace is
misleading, as none of those preallocations delayed the write.
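A toy model of that misattribution in plain Python threading
(deliberately simplified; Kudu's trace machinery is C++ and differs in
detail): the consumer adopts the trace of whichever batch woke it, then
records events for every batch it drains during that wakeup.
import queue
import threading

class Trace(object):
    def __init__(self, name):
        self.name = name
        self.events = []

wal_queue = queue.Queue()

def append_thread():
    done = False
    while not done:
        item = wal_queue.get()
        if item is None:
            return
        adopted, first = item  # adopt the trace of the batch that woke us
        batches = [first]
        # Drain everything queued during this single active period.
        while not wal_queue.empty():
            nxt = wal_queue.get_nowait()
            if nxt is None:
                done = True
                break
            batches.append(nxt[1])
        for b in batches:
            # Events for later batches -- e.g. segment preallocation they
            # trigger -- land in the adopted (first) trace anyway.
            adopted.events.append('preallocating segment while appending ' + b)

traces = [Trace('txn-%d' % i) for i in range(3)]
for tr in traces:
    wal_queue.put((tr, tr.name))
wal_queue.put(None)

t = threading.Thread(target=append_thread)
t.start()
t.join()
for tr in traces:
    print(tr.name, tr.events)
# txn-0's trace ends up carrying preallocation events for txn-1's and
# txn-2's batches, just like the trace above.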
The simple thing to do is to remove the TRACE() call for preallocation.
There are still TRACE_EVENT()s for tracking the activity of the append
thread via the tracing interface, and to cover the case of synchronous
segment allocation and rolling, I added a scoped latency counter for
synchronous log rolls.
Additionally, I noticed that logs tend to get spammed with log rolling
and preallocation messages like
I0214 16:38:30.307582 14666 log.cc:649] T f445674ede8d4b6590ad5002764eeae7 P 09d6bf7a02124145b43f43cb7a667b3d: Max segment size reached. Starting new segment allocation
I0214 16:38:30.430410 14666 log.cc:576] T f445674ede8d4b6590ad5002764eeae7 P 09d6bf7a02124145b43f43cb7a667b3d: Rolled over to a new log segment at /data/15/kudu/wals/f445674ede8d4b6590ad5002764eeae7/wal-000001116
These messages aren't really useful. If preallocation or rolling is
slow, there are already slow execution logging scopes that will alert us
through the logs. I dropped their level to VLOG(1).
Change-Id: Ia50698e3af321b4ab87ee3974525dea6fc551613
Reviewed-on: http://gerrit.cloudera.org:8080/12491
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Reviewed-by: Andrew Wong <aw...@cloudera.com>
Tested-by: Kudu Jenkins
(cherry picked from commit 3574e3eb55a6bc815a9095636c4e08edcfa7c6b0)
Reviewed-on: http://gerrit.cloudera.org:8080/12509
---
src/kudu/consensus/log.cc | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/src/kudu/consensus/log.cc b/src/kudu/consensus/log.cc
index 845b446..c61fcde 100644
--- a/src/kudu/consensus/log.cc
+++ b/src/kudu/consensus/log.cc
@@ -573,7 +573,8 @@ Status Log::RollOver() {
RETURN_NOT_OK(SwitchToAllocatedSegment());
- LOG_WITH_PREFIX(INFO) << "Rolled over to a new log segment at " << active_segment_->path();
+ VLOG_WITH_PREFIX(1) << "Rolled over to a new log segment at "
+ << active_segment_->path();
return Status::OK();
}
@@ -646,10 +647,11 @@ Status Log::DoAppend(LogEntryBatch* entry_batch) {
// if the size of this entry overflows the current segment, get a new one
if (allocation_state() == kAllocationNotStarted) {
if ((active_segment_->Size() + entry_batch_bytes + 4) > max_segment_size_) {
- LOG_WITH_PREFIX(INFO) << "Max segment size reached. Starting new segment allocation";
+ VLOG_WITH_PREFIX(1) << "Max segment size reached. Starting new segment allocation";
RETURN_NOT_OK(AsyncAllocateSegment());
if (!options_.async_preallocate_segments) {
LOG_SLOW_EXECUTION(WARNING, 50, Substitute("$0Log roll took a long time", LogPrefix())) {
+ TRACE_COUNTER_SCOPE_LATENCY_US("log_roll");
RETURN_NOT_OK(RollOver());
}
}
@@ -1038,7 +1040,6 @@ Status Log::PreAllocateNewSegment() {
Status::IOError("Injected IOError in Log::PreAllocateNewSegment()"));
if (options_.preallocate_segments) {
- TRACE("Preallocating $0 byte segment in $1", max_segment_size_, next_segment_path_);
RETURN_NOT_OK(env_util::VerifySufficientDiskSpace(fs_manager_->env(),
next_segment_path_,
max_segment_size_,