Posted to issues@impala.apache.org by "Mostafa Mokhtar (JIRA)" <ji...@apache.org> on 2017/04/04 23:51:41 UTC
[jira] [Created] (IMPALA-5168) Codegen hash computation in
DataStreamSender::Send for partition exchange.
Mostafa Mokhtar created IMPALA-5168:
---------------------------------------
Summary: Codegen hash computation in DataStreamSender::Send for partition exchange.
Key: IMPALA-5168
URL: https://issues.apache.org/jira/browse/IMPALA-5168
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 2.6.0
Reporter: Mostafa Mokhtar
Hash partition computation for exchange operators can benefit from codegen. Profile data shows that ~20% of CPU in the fragment thread is consumed by RawValue::GetHashValueFnv and ExprContext::GetValue.
{code}
// hash-partition batch's rows across channels
int num_channels = channels_.size();
for (int i = 0; i < batch->num_rows(); ++i) {
  TupleRow* row = batch->GetRow(i);
  uint32_t hash_val = HashUtil::FNV_SEED;
  // Chain the hash across all partition expressions for this row.
  for (int j = 0; j < partition_expr_ctxs_.size(); ++j) {
    ExprContext* ctx = partition_expr_ctxs_[j];
    void* partition_val = ctx->GetValue(row);
    // We can't use the crc hash function here because it does not result
    // in uncorrelated hashes with different seeds. Instead we must use
    // fnv hash.
    // TODO: fix crc hash/GetHashValue()
    hash_val =
        RawValue::GetHashValueFnv(partition_val, ctx->root()->type(), hash_val);
  }
  ExprContext::FreeLocalAllocations(partition_expr_ctxs_);
  RETURN_IF_ERROR(channels_[hash_val % num_channels]->AddRow(row));
}
{code}
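For readers without the Impala tree at hand, the loop above can be modelled in a few lines of standalone C++. This is only a sketch under stated assumptions: Row, Fnv1aHash, PickChannel and PartitionBatch are hypothetical names, rows are reduced to vectors of int64_t, and the hash is the standard 32-bit FNV-1a (offset basis 2166136261, prime 16777619), which may differ in detail from what RawValue::GetHashValueFnv does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Standard 32-bit FNV constants (assumed here; not taken from Impala's source).
constexpr uint32_t kFnvSeed = 2166136261u;
constexpr uint32_t kFnvPrime = 16777619u;

// FNV-1a over a byte range: xor each byte in, then multiply by the prime.
// The incoming hash acts as the seed, so calls can be chained.
inline uint32_t Fnv1aHash(const void* data, size_t len, uint32_t hash) {
  const uint8_t* bytes = static_cast<const uint8_t*>(data);
  for (size_t i = 0; i < len; ++i) {
    hash ^= bytes[i];
    hash *= kFnvPrime;
  }
  return hash;
}

// Hypothetical row type: one int64_t per partition expression.
using Row = std::vector<int64_t>;

// Chains the hash across all partition values, then maps it to a channel.
inline size_t PickChannel(const Row& row, size_t num_channels) {
  uint32_t hash_val = kFnvSeed;
  for (int64_t value : row) {
    hash_val = Fnv1aHash(&value, sizeof(value), hash_val);
  }
  return hash_val % num_channels;
}

// Distributes every row of the batch to exactly one channel.
inline std::vector<std::vector<Row>> PartitionBatch(const std::vector<Row>& batch,
                                                    size_t num_channels) {
  assert(num_channels > 0);
  std::vector<std::vector<Row>> channels(num_channels);
  for (const Row& row : batch) {
    channels[PickChannel(row, num_channels)].push_back(row);
  }
  return channels;
}
```

Calling PartitionBatch(batch, 4) returns four vectors whose sizes sum to batch.size(), and rows with equal partition values always land in the same vector.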
||Function Stack||Effective Time %||
|Total|100%|
| clone|99%|
| start_thread|99%|
| thread_proxy|99%|
| boost::detail::thread_data<boost::_bi::bind_t<>::run|99%|
| boost::_bi::bind_t<void, void (*)(), ::operator()|99%|
| operator()<void (*)(const std::basic_string<|99%|
| impala::Thread::SuperviseThread|99%|
| boost::function0<void>::operator()|99%|
| impala::QueryExecMgr::ExecFInstance|99%|
| impala::FragmentInstanceState::Exec|99%|
| impala::PlanFragmentExecutor::Exec|99%|
| impala::PlanFragmentExecutor::ExecInternal|96%|
| impala::DataStreamSender::Send|91%|
| impala::DataStreamSender::Channel::AddRow|56%|
| impala::RawValue::GetHashValueFnv|11%|
| impala::ExprContext::GetValue|11%|
| impala::ExprContext::FreeLocalAllocations|6%|
| impala::RowBatch::GetRow|1%|
| std::vector<impala::ExprContext*, std::allocator<impala::ExprContext*>>::size|1%|
| impala::Expr::type|0%|
| impala::ExprContext::GetValue|0%|
| impala::RuntimeState::CheckQueryState|0%|
| impala::HdfsScanNode::GetNext|3%|
| impala::RowBatch::Reset|1%|
| Status|0%|
| ~ScopedTimer|0%|
| [Unknown stack frame(s)]|4%|
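The cost attributed to RawValue::GetHashValueFnv and ExprContext::GetValue in the profile above is per-value interpretation overhead, which is the kind of work codegen is meant to remove. The sketch below illustrates the idea by analogy only: it uses C++ templates, whereas Impala's codegen emits code at run time, and ColTag, HashRange, HashInterpreted and HashSpecialized are hypothetical names, not Impala APIs. The interpreted path resolves the value's type on every call; the specialized path has the width fixed at compile time, so the compiler can inline and unroll it.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical type tag standing in for a column type descriptor.
enum class ColTag { kInt32, kInt64 };

// Standard 32-bit FNV constants (assumed for illustration).
constexpr uint32_t kSeed = 2166136261u;
constexpr uint32_t kPrime = 16777619u;

// FNV-1a over a byte range, seeded with the incoming hash.
inline uint32_t HashRange(const void* data, size_t len, uint32_t hash) {
  const uint8_t* bytes = static_cast<const uint8_t*>(data);
  for (size_t i = 0; i < len; ++i) {
    hash ^= bytes[i];
    hash *= kPrime;
  }
  return hash;
}

// Interpreted path: the type is inspected for every value at run time.
inline uint32_t HashInterpreted(const void* value, ColTag tag, uint32_t seed) {
  switch (tag) {
    case ColTag::kInt32:
      return HashRange(value, sizeof(int32_t), seed);
    case ColTag::kInt64:
      return HashRange(value, sizeof(int64_t), seed);
  }
  return seed;  // unreachable for valid tags
}

// Specialized path: sizeof(T) is a compile-time constant, so there is no
// type dispatch per value and the byte loop has a fixed trip count.
template <typename T>
inline uint32_t HashSpecialized(const T& value, uint32_t seed) {
  return HashRange(&value, sizeof(T), seed);
}
```

Both paths return the same hash for the same input, so a specialized version could replace the interpreted one without changing which channel a row is sent to.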