Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/02/04 17:57:00 UTC

[jira] [Commented] (IMPALA-11105) Impala crashes in PhjBuilder::Close() when Prepare() fails

    [ https://issues.apache.org/jira/browse/IMPALA-11105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487209#comment-17487209 ] 

ASF subversion and git services commented on IMPALA-11105:
----------------------------------------------------------

Commit a0e6b6b618ba9996bc2bc9bd644a340c2cad8b79 in impala's branch refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a0e6b6b ]

IMPALA-11105: Impala crashes in PhjBuilder::Close() when Prepare() fails

In PhjBuilder::Close() we invoke
'ht_ctx_->StatsCountersAdd(ht_stats_profile_.get())' when 'ht_ctx_' is
not null. But in Prepare() we create 'ht_ctx_' first, and only after a
few operations that might fail do we create 'ht_stats_profile_'. This
means that if an operation fails in Prepare() between the creation of
'ht_ctx_' and 'ht_stats_profile_', Close() later passes a null profile
to StatsCountersAdd() and we get a SEGFAULT.
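
The ordering hazard described above can be sketched in isolation. This is
only an illustration, not the Impala source: 'Profile', 'HashTableCtx' and
'OldOrderBuilder' are hypothetical stand-ins, and 'fail_midway' is an
assumed flag that plays the role of a failing operation in Prepare(). The
sketch reports the dangerous state instead of dereferencing the null
pointer.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the Impala types; not the real classes.
struct Profile { long probes = 0; };
struct HashTableCtx { long probes = 0; };

struct OldOrderBuilder {
  std::unique_ptr<HashTableCtx> ht_ctx_;
  std::unique_ptr<Profile> ht_stats_profile_;

  // Old ordering: the context is created before the fallible step,
  // the profile only after it.
  bool Prepare(bool fail_midway) {
    ht_ctx_ = std::make_unique<HashTableCtx>();
    if (fail_midway) return false;  // assumed failure point
    ht_stats_profile_ = std::make_unique<Profile>();
    return true;
  }

  // True when Close() would pass a null profile to StatsCountersAdd().
  bool CloseWouldPassNullProfile() const {
    return ht_ctx_ != nullptr && ht_stats_profile_ == nullptr;
  }
};
```

When Prepare(true) returns false, 'ht_ctx_' is set but 'ht_stats_profile_'
is still null, which is exactly the state in which the unguarded call in
Close() crashes.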

This patch restructures the code in PhjBuilder::Prepare(), so at first
it creates the counters and profile, then it creates 'ht_ctx_',
similarly to what we do in grouping-aggregator.cc. It also modifies
HashTableCtx::StatsCountersAdd(), so in release mode it is a no-op
if 'profile' is null.
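
A minimal sketch of the two changes, under stated assumptions: the types
below are hypothetical stand-ins rather than the real Impala classes,
'fail_midway' is an assumed flag simulating a failure after 'ht_ctx_' is
created, and any debug-build check the real StatsCountersAdd() may perform
is omitted.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the Impala types; not the real classes.
struct Profile { long probes = 0; };

struct HashTableCtx {
  long probes = 0;
  // Null-tolerant: does nothing when no profile was created.
  void StatsCountersAdd(Profile* profile) const {
    if (profile == nullptr) return;
    profile->probes += probes;
  }
};

struct ReorderedBuilder {
  std::unique_ptr<HashTableCtx> ht_ctx_;
  std::unique_ptr<Profile> ht_stats_profile_;
  bool closed_ = false;

  // New ordering: the profile exists before the context is created.
  bool Prepare(bool fail_midway) {
    ht_stats_profile_ = std::make_unique<Profile>();
    ht_ctx_ = std::make_unique<HashTableCtx>();
    if (fail_midway) return false;  // assumed failure point
    return true;
  }

  // Safe even if Prepare() failed partway through.
  void Close() {
    if (closed_) return;
    if (ht_ctx_ != nullptr) {
      ht_ctx_->StatsCountersAdd(ht_stats_profile_.get());
      ht_ctx_.reset();
    }
    closed_ = true;
  }
};
```

Either change alone would avoid the crash in this sketch. Applying both
means Close() does not depend on a particular initialization order.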

Testing:
 * added a debug action that fails PhjBuilder::Prepare() after the
   creation of 'ht_ctx_'

Change-Id: Id41b0c45d9693cb3433e02737048cb9f50ba59c1
Reviewed-on: http://gerrit.cloudera.org:8080/18195
Reviewed-by: Joe McDonnell <jo...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Impala crashes in PhjBuilder::Close() when Prepare() fails
> ----------------------------------------------------------
>
>                 Key: IMPALA-11105
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11105
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>
> Currently PhjBuilder::Close() looks like the following:
> {noformat}
> void PhjBuilder::Close(RuntimeState* state) {
>   if (closed_) return;
>   CloseAndDeletePartitions(nullptr);
>   if (ht_ctx_ != nullptr) {
>     ht_ctx_->StatsCountersAdd(ht_stats_profile_.get());
>     ht_ctx_->Close(state);
>     ht_ctx_.reset();
>   }
> ...
> {noformat}
> When 'ht_ctx_' is not null, we invoke ht_ctx_->StatsCountersAdd(ht_stats_profile_.get());
> But in Prepare() we create 'ht_ctx_' first, and only after a few operations that might fail do we create 'ht_stats_profile_'.
> This means that if an operation fails in Prepare() between the creation of 'ht_ctx_' and 'ht_stats_profile_', Close() later passes a null profile to StatsCountersAdd() and we get a SEGFAULT.
> The stack trace would look like the following:
> {noformat}
> #0  impala::HashTableCtx::StatsCountersAdd (this=0x3ea4ddc80, profile=0x0) at hash-table.cc:229
> #1  0x0000000001545c4c in impala::PhjBuilder::Close (this=0xbf6523c0, state=0x2eb586c00) at partitioned-hash-join-builder.cc:395
> #2  0x00000000015e4056 in impala::JoinBuilder::CloseFromProbe (this=0xbf6523c0, join_node_state=join_node_state@entry=0x2eb586c00) at join-builder.cc:64
> #3  0x0000000001550fdb in impala::PartitionedHashJoinNode::Close (this=0x214654480, state=0x2eb586c00) at partitioned-hash-join-node.cc:306
> #4  0x00000000014ab5e1 in impala::ExecNode::Close (this=0xc34db5680, state=0x2eb586c00) at exec-node.cc:313
> #5  0x00000000014ab5e1 in impala::ExecNode::Close (this=0x1980bdcd80, state=0x2eb586c00) at exec-node.cc:313
> ...
> #16 0x00000000014ab5e1 in impala::ExecNode::Close (this=0x6faf3e000, state=0x2eb586c00) at exec-node.cc:313
> #17 0x00000000014ab5e1 in impala::ExecNode::Close (this=0x86bc5e000, state=0x2eb586c00) at exec-node.cc:313
> #18 0x0000000001198ebe in impala::FragmentInstanceState::Close (this=this@entry=0x12183fb180) at fragment-instance-state.cc:411
> #19 0x000000000119ba47 in impala::FragmentInstanceState::Exec (this=this@entry=0x12183fb180) at fragment-instance-state.cc:104
> {noformat}
> The solution could be to initialize the counters first, as we do in grouping-aggregator.cc.


