Posted to issues@impala.apache.org by "Pranay Singh (JIRA)" <ji...@apache.org> on 2018/03/28 22:23:00 UTC

[jira] [Created] (IMPALA-6762) DataStreamRecvr::SenderQueue::GetBatch encounters an exception doing a data_arrival_cv_.Wait(l)

Pranay Singh created IMPALA-6762:
------------------------------------

             Summary:  DataStreamRecvr::SenderQueue::GetBatch encounters an exception doing a data_arrival_cv_.Wait(l)
                 Key: IMPALA-6762
                 URL: https://issues.apache.org/jira/browse/IMPALA-6762
             Project: IMPALA
          Issue Type: Bug
            Reporter: Pranay Singh
             Fix For: Impala 2.12.0, Impala 2.13.0, Impala 2.11.0, Impala 2.10.0, Impala 2.9.0, Impala 2.8.0, Impala 2.7.0, Impala 2.6.0


Problem: In impala::DataStreamRecvr::SenderQueue::GetBatch(), the call to
         data_arrival_cv_.Wait() hits an exception inside the boost library, which
         results in a SIGABRT. The probable cause is that the lock has already been freed.

Evidence: A minidump of the crash is available; the two threads suspected of being involved are listed below.

Thread that encountered the SIGABRT

Crash reason:  SIGABRT
Crash address: 0x3d300008b2f
Process uptime: not available

Thread 959 (crashed)
 0  libc-2.17.so + 0x351f7
    rax = 0x0000000000000000   rdx = 0x0000000000000006
    rcx = 0xffffffffffffffff   rbx = 0x00007f1291116f18
    rsi = 0x000000000001a041   rdi = 0x0000000000008b2f
    rbp = 0x0000000002ad97c0   rsp = 0x00007f102ac0cd48
     r8 = 0x000000000000000a    r9 = 0x00007f102ac0e700
    r10 = 0x0000000000000008   r11 = 0x0000000000000202
    r12 = 0x00007f1291116f00   r13 = 0x00007f102ac0cfb0
    r14 = 0x0000000000000000   r15 = 0x0000000000000000
    rip = 0x00007f13ec6601f7
    Found by: given as instruction pointer in context
 1  libc-2.17.so + 0x368e8
    rsp = 0x00007f102ac0cd50   rip = 0x00007f13ec6618e8
    Found by: stack scanning
     .
     .
     .
  9  impalad!<name omitted>
    rax = 0x0000000000000001   rdx = 0x0000000000000001
    rbx = 0x00007f102ac0d390   rbp = 0x00007f12c68c13a0
    rsp = 0x00007f102ac0d390   r12 = 0x00007f12cc820cc0
    r13 = 0x00007f1244ab5600   r14 = 0x00007f102ac0d4e0
    r15 = 0x0000000000000001   rip = 0x000000000080fe65
    Found by: call frame info
10  impalad!<name omitted>
    rbx = 0x00007f102ac0d4e0   rbp = 0x00007f1244ab5630
    rsp = 0x00007f102ac0d3e0   r12 = 0x00007f12cc820cc0
    r13 = 0x00007f1244ab5600   r14 = 0x00007f102ac0d4e0
    r15 = 0x0000000000000001   rip = 0x000000000080fe8c
    Found by: call frame info
11  impalad!<name omitted>
    rbx = 0x0000000000000000   rbp = 0x00007f1244ab5630
    rsp = 0x00007f102ac0d430   r12 = 0x00007f12cc820cc0
    r13 = 0x00007f1244ab5600   r14 = 0x00007f102ac0d4e0
    r15 = 0x0000000000000001   rip = 0x0000000000810294
    Found by: call frame info
12  impalad!impala::DataStreamRecvr::(impala::RowBatch**)
    rbx = 0x00007f12cc820c60   rbp = 0x00007f102ac0d500
    rsp = 0x00007f102ac0d4c0   r12 = 0x00007f102ac0d530
    r13 = 0x00007f12cc820c90   r14 = 0x00007f127242f338
    r15 = 0x00007f12cc820d48   rip = 0x0000000000a280f3
    Found by: call frame info
13  impalad!impala::DataStreamRecvr::GetBatch(impala::RowBatch**)
    rbx = 0x00007f102ac0d5c0   rbp = 0x00007f102ac0d5c0
    rsp = 0x00007f102ac0d5a0   r12 = 0x00007f121f464100
    r13 = 0x00007f127242f180   r14 = 0x00007f121f464100
    r15 = 0x00007f102ac0d760   rip = 0x0000000000a284c3
    Found by: call frame info
14  impalad!impala::ExchangeNode::FillInputRowBatch(impala::RuntimeState*)
    rbx = 0x00007f102ac0d690   rbp = 0x00007f102ac0d5c0
    rsp = 0x00007f102ac0d5b0   r12 = 0x00007f121f464100
    r13 = 0x00007f127242f180   r14 = 0x00007f121f464100
    r15 = 0x00007f102ac0d760   rip = 0x0000000000beffa5
    Found by: call frame info
15  impalad!impala::ExchangeNode::Open(impala::RuntimeState*)
    rbx = 0x00007f121f464100   rbp = 0x00007f102ac0d8d0
    rsp = 0x00007f102ac0d640   r12 = 0x00007f127242f180
    r13 = 0x00007f102ac0d690   r14 = 0x00007f121f464100
    r15 = 0x00007f102ac0d760   rip = 0x0000000000bf0d9e
    Found by: call frame info


Thread 336
----------------
13  impalad!<name omitted> [TBufferTransports.h : 69 + 0xe]
    rbx = 0x0000000000000000   rbp = 0x0000000000000004
    rsp = 0x00007f13077b9840   r12 = 0x0000000000000004
    r13 = 0x00007f13077b98b0   r14 = 0x00007f12c3f6f270
    r15 = 0x00007f12d5a7c034   rip = 0x000000000080be6e
    Found by: call frame info
14  impalad!apache::thrift::protocol::TBinaryProtocolT<apache::thrift::transport::TTransport>::readMessageBegin(std::string&, apache::thrift::protocol::TMessageType&, int&)
    rbx = 0x00007f13077b98b0   rbp = 0x00007f13077b98f8
    rsp = 0x00007f13077b98a0   r12 = 0x00007f13077b98fc
    r13 = 0x00007f13077b9900   r14 = 0x00007f12406cd0e0
    r15 = 0x00007f13077b9b80   rip = 0x00000000009ca5bf
    Found by: call frame info
15  impalad!impala::ImpalaInternalServiceClient::recv_CancelPlanFragment(impala::TCancelPlanFragmentResult&)
    rbx = 0x000000001f9241c0   rbp = 0x00007f13ed2106a0
    rsp = 0x00007f13077b98f0   r12 = 0x00007f13077b9900
    r13 = 0x00007f13077b9b80   r14 = 0x00007f13077b9b50
    r15 = 0x00007f13077b9b80   rip = 0x0000000000cba069
    Found by: call frame info

16  impalad!impala::Status impala::ClientConnection<impala::ImpalaBackendClient>::DoRpc<void (impala::ImpalaInternalServiceClient::*)(impala::TCancelPlanFragmentResult&, impala::TCancelPlanFragmentParams const&), impala::TCancelPlanFragmentParams, impala::TCancelPlanFragmentResult>(void (impala::ImpalaInternalServiceClient::* const&)(impala::TCancelPlanFragmentResult&, impala::TCancelPlanFragmentParams const&), impala::TCancelPlanFragmentParams const&, impala::TCancelPlanFragmentResult*, bool*) 
    rbx = 0x00007f13077b9b20   rbp = 0x00007f13077b9ae0
    rsp = 0x00007f13077b9970   r12 = 0x00007f13077b9bc0
    r13 = 0x00007f13077b9acf   r14 = 0x00007f13077b9b50
    r15 = 0x00007f13077b9b80   rip = 0x0000000000d79031
    Found by: call frame info
17  impalad!impala::Coordinator::CancelRemoteFragments() 
    rbx = 0x0000000000000000   rbp = 0x00007f12d8533f40
    rsp = 0x00007f13077b9a60   r12 = 0x00007f12d8533fa0
    r13 = 0x00007f13077b9bc0   r14 = 0x000000003dc58000
    r15 = 0x00007f13077b9b20   rip = 0x0000000000d6818f
    Found by: call frame info
18  impalad!impala::Coordinator::CancelInternal()
    rbx = 0x000000003dc58000   rbp = 0x00007f13077b9d70
    rsp = 0x00007f13077b9d70   r12 = 0x00007f127209f600
    r13 = 0x00007f13077b9ff0   r14 = 0x000000003dc58000
    r15 = 0x00007f13077b9de0   rip = 0x0000000000d6f7f2
    Found by: call frame info
19  impalad!impala::Coordinator::Cancel(impala::Status const*)
    rbx = 0x000000003dc58000   rbp = 0x000000003dc58390
    rsp = 0x00007f13077b9da0   r12 = 0x00007f13077b9ff0
    r13 = 0x00007f13077b9ff0   r14 = 0x000000003dc58000
    r15 = 0x00007f13077b9de0   rip = 0x0000000000d71b83
    Found by: call frame info
20  impalad!impala::ImpalaServer::QueryExecState::Cancel(bool, impala::Status const*)
    rbx = 0x00007f12b928e000   rbp = 0x00007f12b928e2b8
    rsp = 0x00007f13077b9dc0   r12 = 0x00007f13077b9e60
    r13 = 0x00007f13077b9ff0   r14 = 0x000000003dc58000
    r15 = 0x00007f13077b9de0   rip = 0x0000000000adba06
    Found by: call frame info
21  impalad!impala::ImpalaServer::CancelInternal(impala::TUniqueId const&, bool, impala::Status const*) 
    rbx = 0x00007f13077b9e70   rbp = 0x00007f13077b9f50
    rsp = 0x00007f13077b9e30   r12 = 0x00007f13077b9e60
    r13 = 0x00007f13ed2106a0   r14 = 0x000000000f8b1100
    r15 = 0x00007f13077b9ff0   rip = 0x0000000000a8597a
    Found by: call frame info

Cause of the issue
------------------------
Neither DataStreamRecvr::SenderQueue::Cancel() nor DataStreamRecvr::CancelStream() waits for threads inside impala::DataStreamRecvr::SenderQueue::GetBatch() to finish. As a result, ~DataStreamRecvr() can be called while a thread is still inside impala::DataStreamRecvr::SenderQueue::GetBatch(), which may occasionally result in this crash.
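
One possible way to close this kind of race is for the cancel path to drain waiters before the object can be destroyed. The following is a minimal sketch of that idea using the C++ standard library only. It is not Impala's code and not necessarily the fix adopted for this issue: SketchQueue, NumWaiters(), waiters_done_cv_ and CancelDrainsWaiters() are hypothetical names introduced for illustration, and std::mutex / std::condition_variable stand in for the boost primitives mentioned above.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a sender queue; NOT Impala's actual class.
class SketchQueue {
 public:
  // Blocks until data arrives or the queue is cancelled.
  // Returns true if a batch was consumed, false if cancelled.
  bool GetBatch() {
    std::unique_lock<std::mutex> l(lock_);
    ++num_waiters_;
    data_arrival_cv_.wait(l, [this] { return has_data_ || cancelled_; });
    --num_waiters_;
    bool got_batch = has_data_ && !cancelled_;
    if (got_batch) has_data_ = false;
    // Wake a Cancel() call that is waiting for the queue to drain.
    if (num_waiters_ == 0) waiters_done_cv_.notify_all();
    return got_batch;
  }

  // Marks the queue cancelled, wakes all waiters, and (the assumed fix)
  // does not return until every thread has left the wait in GetBatch().
  void Cancel() {
    std::unique_lock<std::mutex> l(lock_);
    cancelled_ = true;
    data_arrival_cv_.notify_all();
    waiters_done_cv_.wait(l, [this] { return num_waiters_ == 0; });
  }

  int NumWaiters() {
    std::lock_guard<std::mutex> l(lock_);
    return num_waiters_;
  }

 private:
  std::mutex lock_;
  std::condition_variable data_arrival_cv_;
  std::condition_variable waiters_done_cv_;
  bool has_data_ = false;
  bool cancelled_ = false;
  int num_waiters_ = 0;
};

// Hypothetical demo: returns true if Cancel() returned only after the
// blocked consumer had left the wait and the consumer saw a cancellation.
bool CancelDrainsWaiters() {
  SketchQueue queue;
  bool got_batch = true;
  std::thread consumer([&] { got_batch = queue.GetBatch(); });
  // Wait until the consumer is actually blocked inside GetBatch().
  while (queue.NumWaiters() == 0) std::this_thread::yield();
  queue.Cancel();
  bool drained = (queue.NumWaiters() == 0);
  consumer.join();
  return drained && !got_batch;
}
```

Calling CancelDrainsWaiters() from a main() exercises the sequence described above: a consumer blocks in GetBatch(), Cancel() wakes it, and only then is the queue destroyed. Whether blocking in the cancel path is acceptable in the real code (for example, with respect to lock ordering or cancellation latency) is a design question this sketch does not answer.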








