Posted to commits@kudu.apache.org by al...@apache.org on 2020/01/28 05:51:23 UTC

[kudu] branch master updated: [tests] simpler wait-for-crash in TestFillMultipleDisks

This is an automated email from the ASF dual-hosted git repository.

alexey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git


The following commit(s) were added to refs/heads/master by this push:
     new 3ba5ec5  [tests] simpler wait-for-crash in TestFillMultipleDisks
3ba5ec5 is described below

commit 3ba5ec5d0ab3e56428d990afbff92e3dde241578
Author: Alexey Serbin <al...@apache.org>
AuthorDate: Mon Jan 27 18:45:02 2020 -0800

    [tests] simpler wait-for-crash in TestFillMultipleDisks
    
    I saw DiskReservationITest.TestFillMultipleDisks fail in a TSAN
    build.  The scenario timed out while waiting for the tserver process
    to crash, but the stack trace from the crash appeared a few
    milliseconds later.  Apparently, the configured timeout was slightly
    shorter than the time the tserver needs to crash in a TSAN build.
    
    I updated the code that asserts on the crash, increasing the wait
    time to 30 seconds (from the original 10).  In addition, the condition
    now uses ASSERT_EVENTUALLY and is easier to comprehend.
    
    The relevant part from the scenario's output is below.
    
    F0127 23:16:55.030217   561 tablet_replica_mm_ops.cc:197] Check failed: tablet->HasBeenStopped() Unrecoverable flush failure caused by error: IO error: Failed to open DiskRowSet for flush: Unable to open output file for column key INT32 NOT NULL: No directories available to add to ae0e35b8cce6441bae25213b1c314546's directory group (2 dirs total, 2 full, 0 failed). (error 28)
    *** Check failure stack trace: ***
        @     0x7fc93865f3e2  google::LogMessage::Flush() at ??:0
        @     0x7fc938663c1b  google::LogMessageFatal::~LogMessageFatal() at ??:0
        @     0x7fc93fe26d63  kudu::tablet::FlushMRSOp::Perform() at ??:0
        @     0x7fc938ec0316  kudu::MaintenanceManager::LaunchOp() at ??:0
        @     0x7fc938ec6381  boost::_mfi::mf1<>::operator()() at ??:0
        @     0x7fc938ec62be  boost::_bi::list2<>::operator()<>() at ??:0
        @     0x7fc938ec6224  boost::_bi::bind_t<>::operator()() at ??:0
        @     0x7fc938ec5fc2  boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
        @     0x7fc93aae4d72  boost::function0<>::operator()() at ??:0
        @     0x7fc938f3520e  kudu::FunctionRunnable::Run() at ??:0
        @     0x7fc938f30ba9  kudu::ThreadPool::DispatchThread() at ??:0
        @     0x7fc938f3bb1a  boost::_mfi::mf0<>::operator()() at ??:0
        @     0x7fc938f3ba6b  boost::_bi::list1<>::operator()<>() at ??:0
        @     0x7fc938f3b9f4  boost::_bi::bind_t<>::operator()() at ??:0
        @     0x7fc938f3b7ea  boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
        @     0x7fc93aae4d72  boost::function0<>::operator()() at ??:0
        @     0x7fc938f256a5  kudu::Thread::SuperviseThread() at ??:0
        @           0x448bee  __tsan_thread_start_func at /data0/jenkins/workspace/kudu-pre-commit-unittest-TSAN/thirdparty/src/llvm-6.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_clock.h:205
        @     0x7fc93c960184  start_thread at ??:0
        @     0x7fc934b64ffd  clone at ??:0
    I0127 23:16:55.600241   464 disk_reservation-itest.cc:118] Rows inserted: 18410
    /data0/jenkins/workspace/kudu-pre-commit-unittest-TSAN/src/kudu/integration-tests/disk_reservation-itest.cc:120: Failure
    Failed
    Bad status: Timed out: Process did not crash within 1.000s
    
    Change-Id: Iaf8c171c9b492e069e70a15a1b4bb2ae83950eef
    Reviewed-on: http://gerrit.cloudera.org:8080/15112
    Tested-by: Kudu Jenkins
    Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
 src/kudu/integration-tests/disk_reservation-itest.cc | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/src/kudu/integration-tests/disk_reservation-itest.cc b/src/kudu/integration-tests/disk_reservation-itest.cc
index 90a2adc..fc4f0b8 100644
--- a/src/kudu/integration-tests/disk_reservation-itest.cc
+++ b/src/kudu/integration-tests/disk_reservation-itest.cc
@@ -34,6 +34,7 @@
 #include "kudu/util/monotime.h"
 #include "kudu/util/status.h"
 #include "kudu/util/test_macros.h"
+#include "kudu/util/test_util.h"
 
 using std::string;
 using std::vector;
@@ -111,13 +112,9 @@ TEST_F(DiskReservationITest, TestFillMultipleDisks) {
                               "disk_reserved_override_prefix_2_bytes_free_for_testing", "0"));
 
   // Wait for crash due to inability to flush or compact.
-  Status s;
-  for (int i = 0; i < 10; i++) {
-    s = cluster_->tablet_server(0)->WaitForFatal(MonoDelta::FromSeconds(1));
-    if (s.ok()) break;
-    LOG(INFO) << "Rows inserted: " << workload.rows_inserted();
-  }
-  ASSERT_OK(s);
+  ASSERT_EVENTUALLY([&] {
+    ASSERT_OK(cluster_->tablet_server(0)->WaitForFatal(MonoDelta::FromSeconds(1)));
+  });
   workload.StopAndJoin();
 }
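
For readers unfamiliar with the retry-until-deadline idea behind the change above, here is a minimal standalone sketch. It is not Kudu's ASSERT_EVENTUALLY implementation: the helper name RetryUntil, its parameters, and the backoff values are hypothetical and chosen only for illustration. It shows a condition being re-evaluated until it holds or an overall deadline passes, which replaces a hand-written loop with a fixed iteration count.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not part of Kudu): calls `attempt` until it returns
// true or `timeout` elapses. Sleeps between attempts with a capped,
// doubling backoff. Returns true if some attempt succeeded.
inline bool RetryUntil(
    const std::function<bool()>& attempt,
    std::chrono::milliseconds timeout,
    std::chrono::milliseconds initial_backoff = std::chrono::milliseconds(1),
    std::chrono::milliseconds max_backoff = std::chrono::milliseconds(1000)) {
  using Clock = std::chrono::steady_clock;
  const auto deadline = Clock::now() + timeout;
  auto backoff = initial_backoff;
  while (true) {
    // The condition is always evaluated at least once.
    if (attempt()) return true;
    const auto now = Clock::now();
    if (now >= deadline) return false;
    // Never sleep past the deadline.
    const auto remaining =
        std::chrono::duration_cast<std::chrono::milliseconds>(deadline - now);
    std::this_thread::sleep_for(std::min(backoff, remaining));
    backoff = std::min(std::chrono::milliseconds(backoff * 2), max_backoff);
  }
}
```

A caller would pass a lambda that checks the condition of interest (in the test above, whether the tablet server has crashed) together with a generous overall timeout, so a slow build only lengthens the wait rather than failing the test.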