Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2023/06/12 05:04:00 UTC

[jira] [Commented] (IMPALA-12204) Redundant codegen info of HashJoinBuilder inside a subplan

    [ https://issues.apache.org/jira/browse/IMPALA-12204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17731435#comment-17731435 ] 

Quanlong Huang commented on IMPALA-12204:
-----------------------------------------

FWIW, here is the stack trace showing where the string is added:
{noformat}
    @          0x35d28f7  impala::DataSink::Open()
    @          0x36b295d  impala::PhjBuilder::Open()
    @          0x3779788  impala::BlockingJoinNode::SendBuildInputToSink<>()
    @          0x37742ff  impala::BlockingJoinNode::ProcessBuildInputAndOpenProbe()
    @          0x36daab7  impala::PartitionedHashJoinNode::Open()
    @          0x3696fac  impala::NestedLoopJoinNode::Open()
    @          0x371a809  impala::SubplanNode::GetNext()
    @          0x37edb98  impala::AggregationNode::Open()
    @          0x2b41779  impala::FragmentInstanceState::Open()
    @          0x2b3e1c0  impala::FragmentInstanceState::Exec()
    @          0x2a2b5ef  impala::QueryState::ExecFInstance()
    @          0x297ac77  boost::function0<>::operator()()
    @          0x34ae6f0  impala::Thread::SuperviseThread()
    @          0x34baaf9  boost::_bi::list5<>::operator()<>()
    @          0x34ba94c  boost::_bi::bind_t<>::operator()()
    @          0x4b8fd77  thread_proxy
    @     0x7f42a69ee6db  start_thread
    @     0x7f42a376361f  clone
{noformat}
Code snippet:
{code:cpp}
Status DataSink::Open(RuntimeState* state) {
  DCHECK_EQ(output_exprs_.size(), output_expr_evals_.size());
  for (const string& codegen_msg : sink_config_.codegen_status_msgs_) {
    profile_->AppendExecOption(codegen_msg);
  }
  return ScalarExprEvaluator::Open(output_expr_evals_, state);
}
{code}
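
To illustrate why the string grows, here is a minimal standalone sketch of the pattern above. FakeProfile, FakeSink, ExecOptionAfterOpens and the append_once flag are hypothetical stand-ins invented for this example, not Impala classes, and the guard shown is only one possible way to avoid the repetition, not necessarily how this issue will be fixed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a profile node; NOT Impala's RuntimeProfile.
class FakeProfile {
 public:
  // Appends an option to a comma-separated "ExecOption" string.
  void AppendExecOption(const std::string& opt) {
    if (!exec_option_.empty()) exec_option_ += ", ";
    exec_option_ += opt;
  }
  const std::string& exec_option() const { return exec_option_; }

 private:
  std::string exec_option_;
};

// Hypothetical stand-in for a sink that may be opened many times,
// e.g. when it sits under a SUBPLAN node.
class FakeSink {
 public:
  FakeSink(FakeProfile* profile, std::vector<std::string> msgs, bool append_once)
    : profile_(profile),
      codegen_msgs_(std::move(msgs)),
      append_once_(append_once) {
    assert(profile != nullptr);
  }

  void Open() {
    // Assumed guard: when enabled, only the first Open() touches the profile.
    if (append_once_ && msgs_added_) return;
    for (const std::string& msg : codegen_msgs_) {
      profile_->AppendExecOption(msg);
    }
    msgs_added_ = true;
  }

 private:
  FakeProfile* profile_;
  std::vector<std::string> codegen_msgs_;
  bool append_once_;
  bool msgs_added_ = false;
};

// Opens one sink `num_opens` times and returns the resulting ExecOption string.
std::string ExecOptionAfterOpens(std::size_t num_opens, bool append_once) {
  FakeProfile profile;
  FakeSink sink(&profile,
                {"Build Side Codegen Enabled",
                 "Hash Table Construction Codegen Enabled"},
                append_once);
  for (std::size_t i = 0; i < num_opens; ++i) sink.Open();
  return profile.exec_option();
}
```

Without the guard, ExecOptionAfterOpens(n, false) grows linearly with n, which matches the repeated ExecOption seen in the profile. With the guard, the string stays the same no matter how many times the sink is reopened.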

> Redundant codegen info of HashJoinBuilder inside a subplan
> ----------------------------------------------------------
>
>                 Key: IMPALA-12204
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12204
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>
> In the query profile, the info strings of a hash join builder contain an ExecOption with content like "Build Side Codegen Enabled, Hash Table Construction Codegen Enabled". When there is a HASH JOIN node inside a SUBPLAN node, this string can be repeated many times, since the SUBPLAN node opens its right child many times. This can blow up the profile size.
> I can reproduce this with the following query:
> {code:sql}
> select count(*) from
>   tpch_nested_parquet.customer c1,
>   tpch_nested_parquet.customer c2,
>   (select x.* from c1.c_orders x, c2.c_orders y
>   where x.o_orderkey = y.o_orderkey) v
> where c1.c_custkey = c2.c_custkey;
> {code}
> In the query plan, there is a HASH JOIN node inside a SUBPLAN node:
> {noformat}
> 08:SUBPLAN
> |  row-size=56B cardinality=1.50M
> |
> |--06:NESTED LOOP JOIN [CROSS JOIN]
> |  |  row-size=56B cardinality=10
> |  |
> |  |--02:SINGULAR ROW SRC
> |  |     row-size=40B cardinality=1
> |  |
> |  05:HASH JOIN [INNER JOIN]
> |  |  hash predicates: x.o_orderkey = y.o_orderkey
> |  |  row-size=16B cardinality=10
> |  |
> |  |--04:UNNEST [c2.c_orders y]
> |  |     row-size=0B cardinality=10
> |  |
> |  03:UNNEST [c1.c_orders x]
> |     row-size=0B cardinality=10
>  {noformat}
> The query profile has extremely long strings:
> {noformat}
> Hash Join Builder (join_node_id=5):
>   ExecOption: Build Side Codegen Enabled, Hash Table Construction Codegen Enabled, Build Side Codegen Enabled, Hash Table Construction Codegen Enabled,...
> {noformat}


