Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2020/10/12 08:00:00 UTC
[jira] [Updated] (IMPALA-10233) Hit DCHECK in
DmlExecState::AddPartition when inserting to a partitioned table with
zorder
[ https://issues.apache.org/jira/browse/IMPALA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Quanlong Huang updated IMPALA-10233:
------------------------------------
Summary: Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder (was: Hit DCHECK in DmlExecState::AddPartition)
> Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder
> ------------------------------------------------------------------------------------------
>
> Key: IMPALA-10233
> URL: https://issues.apache.org/jira/browse/IMPALA-10233
> Project: IMPALA
> Issue Type: Bug
> Reporter: Quanlong Huang
> Priority: Major
>
> Hit the following DCHECK when inserting into a partitioned Parquet table with zorder. I'm on the master branch (commit=b8a2b75).
> {code:java}
> F1012 15:04:27.726274 3868 dml-exec-state.cc:432] a6479cc4725101fd:b86db2a100000003] Check failed: per_partition_status_.find(name) == per_partition_status_.end()
> *** Check failure stack trace: ***
> @ 0x51ff3cc google::LogMessage::Fail()
> @ 0x5200cbc google::LogMessage::SendToLog()
> @ 0x51fed2a google::LogMessage::Flush()
> @ 0x5202928 google::LogMessageFatal::~LogMessageFatal()
> @ 0x234ba18 impala::DmlExecState::AddPartition()
> @ 0x2817786 impala::HdfsTableSink::GetOutputPartition()
> @ 0x2813151 impala::HdfsTableSink::WriteClusteredRowBatch()
> @ 0x28156c4 impala::HdfsTableSink::Send()
> @ 0x23139dd impala::FragmentInstanceState::ExecInternal()
> @ 0x230fe10 impala::FragmentInstanceState::Exec()
> @ 0x227bb79 impala::QueryState::ExecFInstance()
> @ 0x2279f7b _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> @ 0x227e2c2 _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
> @ 0x2137699 boost::function0<>::operator()()
> @ 0x2715d7d impala::Thread::SuperviseThread()
> @ 0x271dd1a boost::_bi::list5<>::operator()<>()
> @ 0x271dc3e boost::_bi::bind_t<>::operator()()
> @ 0x271dbff boost::detail::thread_data<>::run()
> @ 0x3f05f01 thread_proxy
> @ 0x7fb18bebb6b9 start_thread
> @ 0x7fb188a474dc clone
> {code}
> It seems the zorder sort node doesn't keep the rows sorted by the partition keys. This violates the assumption of HdfsTableSink::WriteClusteredRowBatch() that its input must be ordered by the partition key expressions. As a result, a partition key was deleted from the {{partition_keys_to_output_partitions_}} map and then inserted again.
> {code:c++}
> /// Maps all rows in 'batch' to partitions and appends them to their temporary Hdfs
> /// files. The input must be ordered by the partition key expressions.
> Status WriteClusteredRowBatch(RuntimeState* state, RowBatch* batch) WARN_UNUSED_RESULT;
> {code}
> The key was removed here, when processing a new partition key: https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L334
> It was reinserted here, which hit the DCHECK: https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L590
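The remove-then-reinsert sequence described above can be reproduced with a small standalone model. This is only a sketch under simplifying assumptions, not Impala code: the function name FirstReopenedRow and the use of plain strings as partition keys are hypothetical, and the set 'registered' merely stands in for per_partition_status_. It assumes the clustered writer keeps only the current partition open and registers a partition each time it opens one.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a clustered writer (not the real HdfsTableSink).
// Each element of 'partition_key_per_row' is the partition key of one input row.
// Returns the index of the first row that would register an already-registered
// partition (the situation the DCHECK rejects), or -1 if no such row exists.
int FirstReopenedRow(const std::vector<std::string>& partition_key_per_row) {
  // Stands in for per_partition_status_: grows for the whole query, never shrinks.
  std::set<std::string> registered;
  for (std::size_t i = 0; i < partition_key_per_row.size(); ++i) {
    const std::string& key = partition_key_per_row[i];
    // Same key as the previous row: keep appending to the open partition.
    if (i > 0 && key == partition_key_per_row[i - 1]) continue;
    // Key changed: the previous partition is closed and forgotten, and the new
    // one is opened and registered. A second registration of the same key means
    // the input was not clustered by partition key.
    if (!registered.insert(key).second) return static_cast<int>(i);
  }
  return -1;
}
```

With input clustered by key, such as {"p=1", "p=1", "p=2"}, every partition is registered once and the function returns -1. With interleaved input, such as {"p=1", "p=2", "p=1"}, the third row reopens "p=1" and the function returns 2, which corresponds to the point where the real code would hit the DCHECK.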