Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/08/20 01:15:00 UTC
[jira] [Commented] (IMPALA-10855) Hang in PartitionedHashJoinNode::Close() for cancel the query when disk spilling
[ https://issues.apache.org/jira/browse/IMPALA-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401942#comment-17401942 ]
Quanlong Huang commented on IMPALA-10855:
-----------------------------------------
I think IMPALA-9611 fixes this. Could you cherry-pick it to your branch and verify it?
> Hang in PartitionedHashJoinNode::Close() for cancel the query when disk spilling
> --------------------------------------------------------------------------------
>
> Key: IMPALA-10855
> URL: https://issues.apache.org/jira/browse/IMPALA-10855
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.4.0
> Reporter: carolinchen
> Priority: Major
>
> The query was cancelled because it timed out, but two fragment instances were still not released. Running pstack on the two remaining fragment instances shows where their threads are blocked.
> These are the stacks of the blocked fragment instances.
> * thread A
> {code:java}
> Thread 1 (process 1947185):
> #0 0x00007fbaa033a945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> #1 0x0000000001529834 in Wait (lock=..., this=0x706c44a8) at /opt/Impala/be/src/util/condition-variable.h:49
> #2 impala::JoinBuilder::HandoffToProbesAndWait (this=this@entry=0x706c43c0, build_side_state=build_side_state@entry=0xdcbb4e00) at /opt/Impala/be/src/exec/join-builder.cc:103
> #3 0x00000000014889d5 in impala::PhjBuilder::FlushFinal (this=0x706c43c0, state=0xdcbb4e00) at /opt/Impala/be/src/exec/partitioned-hash-join-builder.cc:367
> #4 0x00000000010eb2d6 in impala::FragmentInstanceState::ExecInternal (this=this@entry=0x1961a8700) at /opt/Impala/be/src/runtime/fragment-instance-state.cc:403
> #5 0x00000000010ed51a in impala::FragmentInstanceState::Exec (this=this@entry=0x1961a8700) at /opt/Impala/be/src/runtime/fragment-instance-state.cc:98
> #6 0x00000000010cc7b7 in impala::QueryState::ExecFInstance (this=0x21e8278000, fis=0x1961a8700) at /opt/Impala/be/src/runtime/query-state.cc:719
> #7 0x00000000013807bb in operator() (this=0x7fb131b7fc00) at /opt/Impala/toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:771
> #8 impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long, (impala::PromiseMode)0>*) (name=..., category=..., functor=..., parent_thread_info=<optimized out>, thread_started=0x7fb12fb7ae10) at /opt/Impala/be/src/util/thread.cc:360
> #9 0x000000000138168a in operator()<void (*)(const std::basic_string<char>&, const std::basic_string<char>&, boost::function<void()>, const impala::ThreadDebugInfo*, impala::Promise<long int>*), boost::_bi::list0> (f=@0x5ecd8fb8: 0x13804b0 <impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long, (impala::PromiseMode)0>*)>, a=<synthetic pointer>, this=0x5ecd8fc0) at /opt/Impala/toolchain/boost-1.61.0-p2/include/boost/bind/bind.hpp:531
> #10 operator() (this=0x5ecd8fb8) at /opt/Impala/toolchain/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222
> #11 boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long, (impala::PromiseMode)0>*), boost::_bi::list5<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::ThreadDebugInfo*>, boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> > > >::run() (this=0x5ecd8e00) at /opt/Impala/toolchain/boost-1.61.0-p2/include/boost/thread/detail/thread.hpp:116
> #12 0x0000000001b55f4a in thread_proxy ()
> #13 0x00007fbaa0336e25 in start_thread () from /lib64/libpthread.so.0
> #14 0x00007fba9cf2e35d in clone () from /lib64/libc.so.6
> {code}
> * thread B
> {code:java}
> Thread 1 (process 1947174):
> #0 0x00007fbaa033a945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> #1 0x00000000013ab45d in Wait (lock=..., this=0x21b9e3b50) at /opt/Impala/be/src/util/condition-variable.h:49
> #2 impala::BufferPool::Client::CleanPages (this=this@entry=0x21b9e3a40, client_lock=client_lock@entry=0x7fb139b8f530, len=len@entry=524288, lazy_flush=lazy_flush@entry=true) at /opt/Impala/be/src/runtime/bufferpool/buffer-pool.cc:684
> #3 0x00000000013ac487 in impala::BufferPool::Client::TransferReservationTo (this=0x21b9e3a40, dst=0x455c3498, bytes=524288, transferred=transferred@entry=0x7fb139b8f5cf) at /opt/Impala/be/src/runtime/bufferpool/buffer-pool.cc:639
> #4 0x00000000013ac554 in impala::BufferPool::ClientHandle::TransferReservationTo (this=this@entry=0xcd27f7b0, dst=<optimized out>, bytes=<optimized out>, transferred=transferred@entry=0x7fb139b8f5cf) at /opt/Impala/be/src/runtime/bufferpool/buffer-pool.cc:343
> #5 0x00000000013ac578 in impala::BufferPool::ClientHandle::TransferReservationTo (this=this@entry=0xcd27f7b0, dst=<optimized out>, bytes=<optimized out>, transferred=transferred@entry=0x7fb139b8f5cf) at /opt/Impala/be/src/runtime/bufferpool/buffer-pool.cc:349
> #6 0x0000000001481b96 in impala::PhjBuilder::ReturnReservation (this=<optimized out>, probe_client=probe_client@entry=0xcd27f7b0, bytes=<optimized out>) at /opt/Impala/be/src/exec/partitioned-hash-join-builder.cc:978
> #7 0x0000000001496cf6 in impala::PartitionedHashJoinNode::Close (this=0xcd27f680, state=0xa7491a00) at /opt/Impala/be/src/exec/partitioned-hash-join-node.cc:316
> #8 0x00000000013f2a91 in impala::ExecNode::Close (this=0x612b861c0, state=0xa7491a00) at /opt/Impala/be/src/exec/exec-node.cc:305
> #9 0x00000000010ea410 in impala::FragmentInstanceState::Close (this=this@entry=0x7cccc820) at /opt/Impala/be/src/runtime/fragment-instance-state.cc:431
> #10 0x00000000010ed327 in impala::FragmentInstanceState::Exec (this=this@entry=0x7cccc820) at /opt/Impala/be/src/runtime/fragment-instance-state.cc:104
> #11 0x00000000010cc7b7 in impala::QueryState::ExecFInstance (this=0x21e8278000, fis=0x7cccc820) at /opt/Impala/be/src/runtime/query-state.cc:719
> #12 0x00000000013807bb in operator() (this=0x7fb139b8fc00) at /opt/Impala/toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:771
> #13 impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long, (impala::PromiseMode)0>*) (name=..., category=..., functor=..., parent_thread_info=<optimized out>, thread_started=0x7fb12fb7ae10) at /opt/Impala/be/src/util/thread.cc:360
> #14 0x000000000138168a in operator()<void (*)(const std::basic_string<char>&, const std::basic_string<char>&, boost::function<void()>, const impala::ThreadDebugInfo*, impala::Promise<long int>*), boost::_bi::list0> (f=@0x1133db5b8: 0x13804b0 <impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long, (impala::PromiseMode)0>*)>, a=<synthetic pointer>, this=0x1133db5c0) at /opt/Impala/toolchain/boost-1.61.0-p2/include/boost/bind/bind.hpp:531
> #15 operator() (this=0x1133db5b8) at /opt/Impala/toolchain/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222
> #16 boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long, (impala::PromiseMode)0>*), boost::_bi::list5<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::ThreadDebugInfo*>, boost::_bi::value<impala::Promise<long, (impala::PromiseMode)0>*> > > >::run() (this=0x1133db400) at /opt/Impala/toolchain/boost-1.61.0-p2/include/boost/thread/detail/thread.hpp:116
> #17 0x0000000001b55f4a in thread_proxy ()
> #18 0x00007fbaa0336e25 in start_thread () from /lib64/libpthread.so.0
> #19 0x00007fba9cf2e35d in clone () from /lib64/libc.so.6
> {code}
> # Thread A hangs in HandoffToProbesAndWait(), waiting on the condition variable build_wakeup_cv to be notified.
> # Under normal circumstances, thread B notifies thread A, which then reacquires the lock and stops waiting.
> # In this case, thread B never notifies thread A, because thread B is itself blocked in CleanPages(), waiting on the condition variable write_complete_cv_ to be notified.
> As far as I know, reproducing the issue requires these factors (it reproduces only some of the time):
> # join
> # spill
> # cancel (due to a timeout or a manual cancel)
> # mt
>
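The wait chain described in the report can be illustrated with a minimal, standalone sketch. It is not Impala code: SimulateCancel, HangResult and the two flags are hypothetical names, and the condition variables are plain std::condition_variable objects that only borrow the names from the report. The indefinite waits are replaced by timed waits so that the sketch terminates and can report which thread was never woken.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the outcome of the two waits in the stack traces.
struct HangResult {
  bool builder_woken;    // thread A: did the handoff wait end with a notification?
  bool prober_finished;  // thread B: did the page-cleaning wait end with a notification?
};

// Simulates the wait chain. If write_completes is false, nothing ever signals
// write_complete_cv, so thread B never reaches the code that wakes thread A.
HangResult SimulateCancel(bool write_completes, std::chrono::milliseconds timeout) {
  assert(timeout.count() > 0);
  std::mutex mu;
  std::condition_variable build_wakeup_cv;    // thread A waits here; thread B should notify
  std::condition_variable write_complete_cv;  // thread B waits here; the write path should notify
  bool probe_done = false;  // guarded by mu
  bool write_done = false;  // guarded by mu
  HangResult result{false, false};

  // Thread A: plays the role of the builder waiting for the probe side.
  std::thread builder([&] {
    std::unique_lock<std::mutex> l(mu);
    result.builder_woken =
        build_wakeup_cv.wait_for(l, timeout, [&] { return probe_done; });
  });

  // Thread B: plays the role of the probe side closing down. It must get past
  // its own wait before it can wake thread A.
  std::thread prober([&] {
    std::unique_lock<std::mutex> l(mu);
    result.prober_finished =
        write_complete_cv.wait_for(l, timeout, [&] { return write_done; });
    if (!result.prober_finished) return;  // stuck: the notify below is never reached
    probe_done = true;
    build_wakeup_cv.notify_all();
  });

  if (write_completes) {
    std::lock_guard<std::mutex> l(mu);
    write_done = true;
    write_complete_cv.notify_all();
  }

  builder.join();
  prober.join();
  return result;
}
```

Calling SimulateCancel(true, ...) lets both waits finish, while SimulateCancel(false, ...) leaves both threads un-notified until their timeouts expire, which corresponds to the two fragment instances that never exit when the waits are unbounded.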