Posted to issues@kudu.apache.org by "zhangsong (JIRA)" <ji...@apache.org> on 2016/06/05 09:53:59 UTC

[jira] [Comment Edited] (KUDU-1472) kudu-tserver crash unexpected

    [ https://issues.apache.org/jira/browse/KUDU-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15315825#comment-15315825 ] 

zhangsong edited comment on KUDU-1472 at 6/5/16 9:53 AM:
---------------------------------------------------------

@todd, I hit this crash again, with a slightly different backtrace:
(gdb) bt
#0  kudu::BlockIdPB::set_has_id (this=<optimized out>) at /export/ldb/kudu_build/kudu-gitlab/build/release/src/kudu/fs/fs.pb.h:1016
#1  kudu::BlockIdPB::set_id (value=1909031780344067001, this=0xd00) at /export/ldb/kudu_build/kudu-gitlab/build/release/src/kudu/fs/fs.pb.h:1030
#2  kudu::BlockId::CopyToPB (this=this@entry=0x42a70848, pb=0xd00) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/fs/block_id.cc:44
#3  0x00000000008e7e9b in kudu::tablet::RowSetMetadata::ToProtobuf (this=0x42a70820, pb=0x1234fe100) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/rowset_metadata.cc:129
#4  0x00000000008e208f in kudu::tablet::TabletMetadata::ToSuperBlockUnlocked (this=this@entry=0x42ab6480, super_block=super_block@entry=0x7fe0bcc8cdf0, rowsets=...)
    at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet_metadata.cc:540
#5  0x00000000008e26ac in kudu::tablet::TabletMetadata::Flush (this=this@entry=0x42ab6480) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet_metadata.cc:433
#6  0x00000000008e3889 in kudu::tablet::TabletMetadata::UpdateAndFlush (this=0x42ab6480, to_remove=..., to_add=..., last_durable_mrs_id=last_durable_mrs_id@entry=3502)
    at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet_metadata.cc:354
#7  0x000000000086c423 in kudu::tablet::Tablet::FlushMetadata (this=this@entry=0x88266dc0, to_remove=..., to_add=..., mrs_being_flushed=mrs_being_flushed@entry=3502)
    at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet.cc:1228
#8  0x000000000086d736 in kudu::tablet::Tablet::DoCompactionOrFlush (this=this@entry=0x88266dc0, input=..., mrs_being_flushed=mrs_being_flushed@entry=3502)
    at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet.cc:1410
#9  0x000000000086eb35 in kudu::tablet::Tablet::FlushInternal (this=this@entry=0x88266dc0, input=..., old_ms=...) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet.cc:777
#10 0x000000000086ef27 in kudu::tablet::Tablet::FlushUnlocked (this=this@entry=0x88266dc0) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet.cc:712
#11 0x0000000000903c0c in kudu::tablet::FlushMRSOp::Perform (this=0xac5588c0) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet_peer_mm_ops.cc:127
#12 0x00000000008b83fa in kudu::MaintenanceManager::LaunchOp (this=0x3896300, op=0xac5588c0) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/maintenance_manager.cc:360
#13 0x0000000001901c3e in boost::function0<void>::operator() (this=<optimized out>) at /usr/local/include/boost/function/function_template.hpp:767
#14 kudu::FunctionRunnable::Run (this=<optimized out>) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/util/threadpool.cc:48
#15 kudu::ThreadPool::DispatchThread (this=0x3917380, permanent=true) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/util/threadpool.cc:343
#16 0x00000000018fc7ba in boost::function0<void>::operator() (this=0x3880368) at /usr/local/include/boost/function/function_template.hpp:767
#17 kudu::Thread::SuperviseThread (arg=0x3880340) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/util/thread.cc:586
#18 0x0000003296a079d1 in start_thread () from /export/servers/kudu/lib64/libpthread.so.0
#19 0x00000032966e8b6d in clone () from /export/servers/kudu/lib64/libc.so.6

Now it is getting weirder: the adhoc_index_block_ seems to have been freed or something similar. When set_id is called (frame #1, this=0xd00), it behaves like a dangling pointer.
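
To make the observation concrete: in frames #1 and #2 the protobuf pointer is 0xd00, while the other object pointers in the trace (for example this=0x42a70848 in frame #2 and pb=0x1234fe100 in frame #3) look like ordinary heap addresses. Below is a minimal sketch of that sanity check. The helper name LooksLikePlausiblePointer and the 4 KiB threshold are assumptions made for illustration and are not part of Kudu; values printed by gdb in a release build can also be unreliable, so this is only a hint, not a diagnosis.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of Kudu): takes an address as printed by gdb,
// e.g. "0xd00", and reports whether it lies above the first 4 KiB of the
// address space. The threshold is an assumption: on Linux the lowest page is
// typically unmapped, so a pointer below it cannot refer to a live object.
bool LooksLikePlausiblePointer(const std::string& hex) {
  assert(!hex.empty());
  // std::stoull with base 16 accepts an optional "0x" prefix.
  const std::uint64_t addr = std::stoull(hex, nullptr, 16);
  return addr >= 4096;
}
```

Applied to the values above, the helper rejects 0xd00 and accepts 0x42a70848 and 0x1234fe100, which fits the idea that the pointer passed to set_id was read from memory that was no longer valid.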


was (Author: brucesz):
@todd, I hit this crash again, with a slightly different backtrace:
(gdb) bt
#0  kudu::BlockIdPB::set_has_id (this=<optimized out>) at /export/ldb/kudu_build/kudu-gitlab/build/release/src/kudu/fs/fs.pb.h:1016
#1  kudu::BlockIdPB::set_id (value=1909031780344067001, this=0xd00) at /export/ldb/kudu_build/kudu-gitlab/build/release/src/kudu/fs/fs.pb.h:1030
#2  kudu::BlockId::CopyToPB (this=this@entry=0x42a70848, pb=0xd00) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/fs/block_id.cc:44
#3  0x00000000008e7e9b in kudu::tablet::RowSetMetadata::ToProtobuf (this=0x42a70820, pb=0x1234fe100) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/rowset_metadata.cc:129
#4  0x00000000008e208f in kudu::tablet::TabletMetadata::ToSuperBlockUnlocked (this=this@entry=0x42ab6480, super_block=super_block@entry=0x7fe0bcc8cdf0, rowsets=...)
    at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet_metadata.cc:540
#5  0x00000000008e26ac in kudu::tablet::TabletMetadata::Flush (this=this@entry=0x42ab6480) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet_metadata.cc:433
#6  0x00000000008e3889 in kudu::tablet::TabletMetadata::UpdateAndFlush (this=0x42ab6480, to_remove=..., to_add=..., last_durable_mrs_id=last_durable_mrs_id@entry=3502)
    at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet_metadata.cc:354
#7  0x000000000086c423 in kudu::tablet::Tablet::FlushMetadata (this=this@entry=0x88266dc0, to_remove=..., to_add=..., mrs_being_flushed=mrs_being_flushed@entry=3502)
    at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet.cc:1228
#8  0x000000000086d736 in kudu::tablet::Tablet::DoCompactionOrFlush (this=this@entry=0x88266dc0, input=..., mrs_being_flushed=mrs_being_flushed@entry=3502)
    at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet.cc:1410
#9  0x000000000086eb35 in kudu::tablet::Tablet::FlushInternal (this=this@entry=0x88266dc0, input=..., old_ms=...) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet.cc:777
#10 0x000000000086ef27 in kudu::tablet::Tablet::FlushUnlocked (this=this@entry=0x88266dc0) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet.cc:712
#11 0x0000000000903c0c in kudu::tablet::FlushMRSOp::Perform (this=0xac5588c0) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/tablet_peer_mm_ops.cc:127
#12 0x00000000008b83fa in kudu::MaintenanceManager::LaunchOp (this=0x3896300, op=0xac5588c0) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/tablet/maintenance_manager.cc:360
#13 0x0000000001901c3e in boost::function0<void>::operator() (this=<optimized out>) at /usr/local/include/boost/function/function_template.hpp:767
#14 kudu::FunctionRunnable::Run (this=<optimized out>) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/util/threadpool.cc:48
#15 kudu::ThreadPool::DispatchThread (this=0x3917380, permanent=true) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/util/threadpool.cc:343
#16 0x00000000018fc7ba in boost::function0<void>::operator() (this=0x3880368) at /usr/local/include/boost/function/function_template.hpp:767
#17 kudu::Thread::SuperviseThread (arg=0x3880340) at /export/ldb/kudu_build/kudu-gitlab/src/kudu/util/thread.cc:586
#18 0x0000003296a079d1 in start_thread () from /export/servers/kudu/lib64/libpthread.so.0
#19 0x00000032966e8b6d in clone () from /export/servers/kudu/lib64/libc.so.6

> kudu-tserver crash unexpected
> -----------------------------
>
>                 Key: KUDU-1472
>                 URL: https://issues.apache.org/jira/browse/KUDU-1472
>             Project: Kudu
>          Issue Type: Bug
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: zhangsong
>            Priority: Critical
>
> kudu-tserver crashes in some cases; in the jd.com 200-node environment this occurs frequently.
> Some crash info from the core file:
> (gdb) bt
> #0  0x0000000000a2489f in kudu::tablet::RowSetDataPB::SharedDtor (this=0x58fb5b180)
>    at /export/ldb/kudu-master/build/release/src/kudu/tablet/metadata.pb.cc:815
> #1  kudu::tablet::RowSetDataPB::~RowSetDataPB (this=0x58fb5b180, __in_chrg=<optimized out>)
>    at /export/ldb/kudu-master/build/release/src/kudu/tablet/metadata.pb.cc:809
> #2  kudu::tablet::RowSetDataPB::~RowSetDataPB (this=0x58fb5b180, __in_chrg=<optimized out>)
>    at /export/ldb/kudu-master/build/release/src/kudu/tablet/metadata.pb.cc:810
> #3  google::protobuf::internal::GenericTypeHandler<kudu::tablet::RowSetDataPB>::Delete (value=0x58fb5b180)
>    at /export/ldb/kudu-master/thirdparty/installed-deps/include/google/protobuf/repeated_field.h:363
> #4  google::protobuf::internal::RepeatedPtrFieldBase::Destroy<google::protobuf::RepeatedPtrField<kudu::tablet::RowSetDataPB>::TypeHandler> (
>    this=<optimized out>, this=<optimized out>) at /export/ldb/kudu-master/thirdparty/installed-deps/include/google/protobuf/repeated_field.h:869
> Backtrace stopped: Cannot access memory at address 0x7fc1f230fd08
> After the crash, kudu-tserver cannot be restarted successfully because a PB validation check fails, for example:
>  check failed: _s.ok() Bad status: IO error: Could not init Tablet Manager: Failed to open tablet metadata for tablet: 260359a41a134c1f91631e9094847bcf: Failed to load tablet metadata for tablet id 260359a41a134c1f91631e9094847bcf: Could not load tablet metadata from /export/servers/kudu/tserver_data_7052/tablet-meta/260359a41a134c1f91631e9094847bcf: Unable to parse PB from path: /export/servers/kudu/tserver_data_7052/tablet-meta/260359a41a134c1f91631e9094847bcf
> The Kudu version is 0.9.0-snapshot, last commit id: be10f8514c48950b64c7d59bbce848f3792ec52d
> Workload: several write tasks keep inserting into Kudu tables, some using the Java API and others using Impala.
> The Kudu tables are scanned while those tasks are running.
> There is a crash almost every day, with the same symptoms as described above.


