You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@kudu.apache.org by "Andrew Wong (JIRA)" <ji...@apache.org> on 2019/01/03 06:49:00 UTC

[jira] [Updated] (KUDU-2651) mt-tablet-test hits a DCHECK failure

     [ https://issues.apache.org/jira/browse/KUDU-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wong updated KUDU-2651:
------------------------------
    Description: 
I hit a couple failures of this in DEBUG and ASAN in pre-commit, but couldn't tell whether it's a known flake or not since the flaky test dashboard was idle for the holidays and has yet to return to its former glory.

I looped the test on commit c7a2d69fbd4f0a6f3b0938c31a439049f3733767 with 8 stress threads in DEBUG mode 100 times (with 4 shards each) and saw 10 failures (attached one such log). The other runs can be found here: [http://dist-test.cloudera.org/job?job_id=awong.1546496637.36553]

One snippet/stacktrace of interest:

 

{{I0103 06:27:17.085534 2512 multi_column_writer.cc:98] Opened CFile writers for 3 column(s)}}
 {{I0103 06:27:17.089375 2512 tablet.cc:1614] T test_tablet_id P 95cae2d4680545c9b1e8b1707076b7de: Flush: entering phase 2 (starting to duplicate updates in new rowsets)}}
 {{F0103 06:27:17.089998 2514 delta_tracker.cc:239] Check failed: first_copy->delta_stats().min_timestamp() >= second_copy->delta_stats().min_timestamp() (6334451043730628608 vs. 6334451043743322112) Found out-of-order deltas: [

{14621771211629594691 (ts range=[6334451043730628608, 6334451043881463808], delete_count=[0], reinsert_count=[0], update_counts_by_col_id=[12:11])}

,

{14621771211629594681 (ts range=[6334451043743322112, 6334451043743322112], delete_count=[0], reinsert_count=[0], update_counts_by_col_id=[12:1])}

]: type = 1}}
 {{*** Check failure stack trace: ***}}
 {{*** Aborted at 1546496837 (unix time) try "date -d @1546496837" if you are using GNU date ***}}
 {{PC: @ 0x7f6c8b6b0c37 gsignal}}
 {{*** SIGABRT (@0x3e80000094a) received by PID 2378 (TID 0x7f6c85411700) from PID 2378; stack trace: ***}}
 \{{ @ 0x7f6c93f88330 (unknown) at ??:0}}
 \{{ @ 0x7f6c8b6b0c37 gsignal at ??:0}}
 \{{ @ 0x7f6c8b6b4028 abort at ??:0}}
 \{{ @ 0x7f6c8d574a09 google::logging_fail() at ??:0}}
 \{{ @ 0x7f6c8d5762fd google::LogMessage::Fail() at ??:0}}
 \{{ @ 0x7f6c8d5781bd google::LogMessage::SendToLog() at ??:0}}
 \{{ @ 0x7f6c8d575e39 google::LogMessage::Flush() at ??:0}}
 \{{ @ 0x7f6c8d578c5f google::LogMessageFatal::~LogMessageFatal() at ??:0}}
 \{{ @ 0x7f6c96fc3a88 kudu::tablet::DeltaTracker::ValidateDeltaOrder() at ??:0}}
 {{@ 0x7f6c96fc4423 kudu::tablet::DeltaTracker::AtomicUpdateStores() at ??:0}}
 \{{ @ 0x7f6c96f9be79 kudu::tablet::MajorDeltaCompaction::UpdateDeltaTracker() at ??:0}}
 \{{ @ 0x7f6c96ef33d2 kudu::tablet::DiskRowSet::MajorCompactDeltaStoresWithColumnIds() at ??:0}}
 \{{ @ 0x7f6c96ef2b0b kudu::tablet::DiskRowSet::MajorCompactDeltaStores() at ??:0}}
 \{{ @ 0x7f6c96e1c73b kudu::tablet::Tablet::CompactWorstDeltas() at ??:0}}
 \{{ @ 0x4a3bb8 kudu::tablet::MultiThreadedTabletTest<>::CompactDeltas() at /data/8/awong/kudu/build/debug/../../src/kudu/tablet/mt-tablet-test.cc:332}}
 \{{ @ 0x49f19c kudu::tablet::MultiThreadedTabletTest<>::MajorCompactDeltasThread() at /data/8/awong/kudu/build/debug/../../src/kudu/tablet/mt-tablet-test.cc:326}}
 \{{ @ 0x4a281f boost::_mfi::mf1<>::operator()() at /data/8/awong/kudu/build/debug/../../thirdparty/installed/uninstrumented/include/boost/bind/mem_fn_template.hpp:165}}
 \{{ @ 0x4a277e boost::_bi::list2<>::operator()<>() at /data/8/awong/kudu/build/debug/../../thirdparty/installed/uninstrumented/include/boost/bind/bind.hpp:320}}
 \{{ @ 0x4a270a boost::_bi::bind_t<>::operator()() at /data/8/awong/kudu/build/debug/../../thirdparty/installed/uninstrumented/include/boost/bind/bind.hpp:1222}}
 \{{ @ 0x4a2490 boost::detail::function::void_function_obj_invoker0<>::invoke() at /data/8/awong/kudu/build/debug/../../thirdparty/installed/uninstrumented/include/boost/function/function_template.hpp:160}}
 \{{ @ 0x7f6c906d47c8 boost::function0<>::operator()() at ??:0}}
 \{{ @ 0x7f6c8f691a8d kudu::Thread::SuperviseThread() at ??:0}}
 \{{ @ 0x7f6c93f80184 start_thread at ??:0}}
 {

{ @ 0x7f6c8b777ffd clone at ??:0}

}

  was:
I hit a couple failures of this in DEBUG and ASAN in pre-commit, but couldn't tell whether it's a known flake or not since the flaky test dashboard was idle for the holidays and has yet to return to its former glory.

I looped the test on commit c7a2d69fbd4f0a6f3b0938c31a439049f3733767 with 8 stress threads in DEBUG mode 400 times and saw 10 failures (attached one such log). The other runs can be found here: [http://dist-test.cloudera.org/job?job_id=awong.1546496637.36553]

One snippet/stacktrace of interest:

 

{{I0103 06:27:17.085534 2512 multi_column_writer.cc:98] Opened CFile writers for 3 column(s)}}
 {{I0103 06:27:17.089375 2512 tablet.cc:1614] T test_tablet_id P 95cae2d4680545c9b1e8b1707076b7de: Flush: entering phase 2 (starting to duplicate updates in new rowsets)}}
 {{F0103 06:27:17.089998 2514 delta_tracker.cc:239] Check failed: first_copy->delta_stats().min_timestamp() >= second_copy->delta_stats().min_timestamp() (6334451043730628608 vs. 6334451043743322112) Found out-of-order deltas: [

{14621771211629594691 (ts range=[6334451043730628608, 6334451043881463808], delete_count=[0], reinsert_count=[0], update_counts_by_col_id=[12:11])}

,

{14621771211629594681 (ts range=[6334451043743322112, 6334451043743322112], delete_count=[0], reinsert_count=[0], update_counts_by_col_id=[12:1])}

]: type = 1}}
 {{*** Check failure stack trace: ***}}
 {{*** Aborted at 1546496837 (unix time) try "date -d @1546496837" if you are using GNU date ***}}
 {{PC: @ 0x7f6c8b6b0c37 gsignal}}
 {{*** SIGABRT (@0x3e80000094a) received by PID 2378 (TID 0x7f6c85411700) from PID 2378; stack trace: ***}}
 \{{ @ 0x7f6c93f88330 (unknown) at ??:0}}
 \{{ @ 0x7f6c8b6b0c37 gsignal at ??:0}}
 \{{ @ 0x7f6c8b6b4028 abort at ??:0}}
 \{{ @ 0x7f6c8d574a09 google::logging_fail() at ??:0}}
 \{{ @ 0x7f6c8d5762fd google::LogMessage::Fail() at ??:0}}
 \{{ @ 0x7f6c8d5781bd google::LogMessage::SendToLog() at ??:0}}
 \{{ @ 0x7f6c8d575e39 google::LogMessage::Flush() at ??:0}}
 \{{ @ 0x7f6c8d578c5f google::LogMessageFatal::~LogMessageFatal() at ??:0}}
 \{{ @ 0x7f6c96fc3a88 kudu::tablet::DeltaTracker::ValidateDeltaOrder() at ??:0}}
 {{@ 0x7f6c96fc4423 kudu::tablet::DeltaTracker::AtomicUpdateStores() at ??:0}}
 \{{ @ 0x7f6c96f9be79 kudu::tablet::MajorDeltaCompaction::UpdateDeltaTracker() at ??:0}}
 \{{ @ 0x7f6c96ef33d2 kudu::tablet::DiskRowSet::MajorCompactDeltaStoresWithColumnIds() at ??:0}}
 \{{ @ 0x7f6c96ef2b0b kudu::tablet::DiskRowSet::MajorCompactDeltaStores() at ??:0}}
 \{{ @ 0x7f6c96e1c73b kudu::tablet::Tablet::CompactWorstDeltas() at ??:0}}
 \{{ @ 0x4a3bb8 kudu::tablet::MultiThreadedTabletTest<>::CompactDeltas() at /data/8/awong/kudu/build/debug/../../src/kudu/tablet/mt-tablet-test.cc:332}}
 \{{ @ 0x49f19c kudu::tablet::MultiThreadedTabletTest<>::MajorCompactDeltasThread() at /data/8/awong/kudu/build/debug/../../src/kudu/tablet/mt-tablet-test.cc:326}}
 \{{ @ 0x4a281f boost::_mfi::mf1<>::operator()() at /data/8/awong/kudu/build/debug/../../thirdparty/installed/uninstrumented/include/boost/bind/mem_fn_template.hpp:165}}
 \{{ @ 0x4a277e boost::_bi::list2<>::operator()<>() at /data/8/awong/kudu/build/debug/../../thirdparty/installed/uninstrumented/include/boost/bind/bind.hpp:320}}
 \{{ @ 0x4a270a boost::_bi::bind_t<>::operator()() at /data/8/awong/kudu/build/debug/../../thirdparty/installed/uninstrumented/include/boost/bind/bind.hpp:1222}}
 \{{ @ 0x4a2490 boost::detail::function::void_function_obj_invoker0<>::invoke() at /data/8/awong/kudu/build/debug/../../thirdparty/installed/uninstrumented/include/boost/function/function_template.hpp:160}}
 \{{ @ 0x7f6c906d47c8 boost::function0<>::operator()() at ??:0}}
 \{{ @ 0x7f6c8f691a8d kudu::Thread::SuperviseThread() at ??:0}}
 \{{ @ 0x7f6c93f80184 start_thread at ??:0}}
 {

{ @ 0x7f6c8b777ffd clone at ??:0}

}


> mt-tablet-test hits a DCHECK failure
> ------------------------------------
>
>                 Key: KUDU-2651
>                 URL: https://issues.apache.org/jira/browse/KUDU-2651
>             Project: Kudu
>          Issue Type: Bug
>          Components: tablet
>    Affects Versions: 1.9.0
>            Reporter: Andrew Wong
>            Priority: Major
>         Attachments: mt-tablet-test.1.txt
>
>
> I hit a couple failures of this in DEBUG and ASAN in pre-commit, but couldn't tell whether it's a known flake or not since the flaky test dashboard was idle for the holidays and has yet to return to its former glory.
> I looped the test on commit c7a2d69fbd4f0a6f3b0938c31a439049f3733767 with 8 stress threads in DEBUG mode 100 times (with 4 shards each) and saw 10 failures (attached one such log). The other runs can be found here: [http://dist-test.cloudera.org/job?job_id=awong.1546496637.36553]
> One snippet/stacktrace of interest:
>  
> {{I0103 06:27:17.085534 2512 multi_column_writer.cc:98] Opened CFile writers for 3 column(s)}}
>  {{I0103 06:27:17.089375 2512 tablet.cc:1614] T test_tablet_id P 95cae2d4680545c9b1e8b1707076b7de: Flush: entering phase 2 (starting to duplicate updates in new rowsets)}}
>  {{F0103 06:27:17.089998 2514 delta_tracker.cc:239] Check failed: first_copy->delta_stats().min_timestamp() >= second_copy->delta_stats().min_timestamp() (6334451043730628608 vs. 6334451043743322112) Found out-of-order deltas: [
> {14621771211629594691 (ts range=[6334451043730628608, 6334451043881463808], delete_count=[0], reinsert_count=[0], update_counts_by_col_id=[12:11])}
> ,
> {14621771211629594681 (ts range=[6334451043743322112, 6334451043743322112], delete_count=[0], reinsert_count=[0], update_counts_by_col_id=[12:1])}
> ]: type = 1}}
>  {{*** Check failure stack trace: ***}}
>  {{*** Aborted at 1546496837 (unix time) try "date -d @1546496837" if you are using GNU date ***}}
>  {{PC: @ 0x7f6c8b6b0c37 gsignal}}
>  {{*** SIGABRT (@0x3e80000094a) received by PID 2378 (TID 0x7f6c85411700) from PID 2378; stack trace: ***}}
>  \{{ @ 0x7f6c93f88330 (unknown) at ??:0}}
>  \{{ @ 0x7f6c8b6b0c37 gsignal at ??:0}}
>  \{{ @ 0x7f6c8b6b4028 abort at ??:0}}
>  \{{ @ 0x7f6c8d574a09 google::logging_fail() at ??:0}}
>  \{{ @ 0x7f6c8d5762fd google::LogMessage::Fail() at ??:0}}
>  \{{ @ 0x7f6c8d5781bd google::LogMessage::SendToLog() at ??:0}}
>  \{{ @ 0x7f6c8d575e39 google::LogMessage::Flush() at ??:0}}
>  \{{ @ 0x7f6c8d578c5f google::LogMessageFatal::~LogMessageFatal() at ??:0}}
>  \{{ @ 0x7f6c96fc3a88 kudu::tablet::DeltaTracker::ValidateDeltaOrder() at ??:0}}
>  {{@ 0x7f6c96fc4423 kudu::tablet::DeltaTracker::AtomicUpdateStores() at ??:0}}
>  \{{ @ 0x7f6c96f9be79 kudu::tablet::MajorDeltaCompaction::UpdateDeltaTracker() at ??:0}}
>  \{{ @ 0x7f6c96ef33d2 kudu::tablet::DiskRowSet::MajorCompactDeltaStoresWithColumnIds() at ??:0}}
>  \{{ @ 0x7f6c96ef2b0b kudu::tablet::DiskRowSet::MajorCompactDeltaStores() at ??:0}}
>  \{{ @ 0x7f6c96e1c73b kudu::tablet::Tablet::CompactWorstDeltas() at ??:0}}
>  \{{ @ 0x4a3bb8 kudu::tablet::MultiThreadedTabletTest<>::CompactDeltas() at /data/8/awong/kudu/build/debug/../../src/kudu/tablet/mt-tablet-test.cc:332}}
>  \{{ @ 0x49f19c kudu::tablet::MultiThreadedTabletTest<>::MajorCompactDeltasThread() at /data/8/awong/kudu/build/debug/../../src/kudu/tablet/mt-tablet-test.cc:326}}
>  \{{ @ 0x4a281f boost::_mfi::mf1<>::operator()() at /data/8/awong/kudu/build/debug/../../thirdparty/installed/uninstrumented/include/boost/bind/mem_fn_template.hpp:165}}
>  \{{ @ 0x4a277e boost::_bi::list2<>::operator()<>() at /data/8/awong/kudu/build/debug/../../thirdparty/installed/uninstrumented/include/boost/bind/bind.hpp:320}}
>  \{{ @ 0x4a270a boost::_bi::bind_t<>::operator()() at /data/8/awong/kudu/build/debug/../../thirdparty/installed/uninstrumented/include/boost/bind/bind.hpp:1222}}
>  \{{ @ 0x4a2490 boost::detail::function::void_function_obj_invoker0<>::invoke() at /data/8/awong/kudu/build/debug/../../thirdparty/installed/uninstrumented/include/boost/function/function_template.hpp:160}}
>  \{{ @ 0x7f6c906d47c8 boost::function0<>::operator()() at ??:0}}
>  \{{ @ 0x7f6c8f691a8d kudu::Thread::SuperviseThread() at ??:0}}
>  \{{ @ 0x7f6c93f80184 start_thread at ??:0}}
>  {
> { @ 0x7f6c8b777ffd clone at ??:0}
> }



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)