You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@kudu.apache.org by "Mostafa Mokhtar (JIRA)" <ji...@apache.org> on 2016/04/14 01:53:25 UTC
[jira] [Created] (KUDU-1417) Kudu scanner :
KuduScanBatch::RowPtr::Get* has very expensive checks in the hot path
Mostafa Mokhtar created KUDU-1417:
-------------------------------------
Summary: Kudu scanner : KuduScanBatch::RowPtr::Get* has very expensive checks in the hot path
Key: KUDU-1417
URL: https://issues.apache.org/jira/browse/KUDU-1417
Project: Kudu
Issue Type: Bug
Components: impala, rpc
Affects Versions: 0.7.0
Reporter: Mostafa Mokhtar
{code}KuduScanBatch::RowPtr::Get{code} gets called by the Impala scanner per row, per column and RowPtr::Get has check on type which not necessarily needed, this burns about 10% of the overall CPU.
Also similar to IMPALA-3347, it would help if the Batch has a column level nullability bit to skip the NULL check.
{code}
template<typename T>
Status KuduScanBatch::RowPtr::Get(int col_idx, typename T::cpp_type* val) const {
const ColumnSchema& col = schema_->column(col_idx);
if (PREDICT_FALSE(col.type_info()->type() != T::type)) {
// TODO: at some point we could allow type coercion here.
return Status::InvalidArgument(
Substitute("invalid type $0 provided for column '$1' (expected $2)",
T::name(),
col.name(), col.type_info()->name()));
}
if (col.is_nullable() && IsNull(col_idx)) {
return Status::NotFound("column is NULL");
}
memcpy(val, row_data_ + schema_->column_offset(col_idx), sizeof(*val));
return Status::OK();
}
{code}
>From perf
{code}
Samples: 182K of event 'cycles', Event count (approx.): 65636591846
11.59% libkudu_client.so.0.1.0 [.] kudu::client::KuduScanBatch::RowPtr::GetInt64(int, long*) const
10.03% impalad [.] impala::KuduScanner::KuduRowToImpalaTuple(kudu::client::KuduScanBatch::RowPtr const&, impala::RowBatch*, impala::Tuple*)
7.28% impalad [.] impala::KuduScanner::DecodeRowsIntoRowBatch(impala::RowBatch*, impala::Tuple**, bool*)
3.69% [kernel] [k] copy_user_enhanced_fast_string
3.26% kudu-tserver [.] kudu::SerializeRowBlock(kudu::RowBlock const&, kudu::RowwiseRowBlockPB*, kudu::Schema const*, kudu::faststring*, kudu::faststring*)
3.01% libkudu_client.so.0.1.0 [.] kudu::client::KuduScanBatch::RowPtr::IsNull(int) const
{code}
And this shows that most of the cost is from the check before the memcpy
{code}
kudu::client::KuduScanBatch::RowPtr::GetInt64(int, long*) const /opt/cloudera/parcels/KUDU-0.7.1-1.kudu0.7.1.p0.36/lib64/libkudu_client.so.0.1.0
2.37 │ mov %rdi,%r12
0.42 │ sub $0xd8,%rsp
2.62 │ mov (%rsi),%r8
2.37 │ lea (%rbx,%rbx,8),%rsi
0.46 │ mov (%r8),%rax
0.61 │ lea (%rax,%rsi,8),%r9
2.76 │ mov 0x8(%r9),%rax
6.81 │ cmpl $0x7,(%rax)
12.14 │ ↓ jne e8
2.57 │ cmpb $0x0,0x10(%r9)
0.57 │ ↓ jne 68
1.95 │ 3b: mov 0x40(%r8),%rax
0.68 │ mov (%rax,%rbx,8),%rax
2.92 │ add 0x8(%r13),%rax
0.75 │ mov (%rax),%rax
32.71 │ mov %rax,(%rcx)
2.62 │ movq $0x0,(%r12)
0.52 │ 55: lea -0x18(%rbp),%rsp
0.21 │ mov %r12,%rax
2.13 │ pop %rbx
2.48 │ pop %r12
0.52 │ pop %r13
0.22 │ pop %rbp
2.46 │ ← retq
│ nop
│ 68: mov %edx,%esi
│ mov %r13,%rdi
│ mov %rcx,-0xe8(%rbp)
│ → callq kudu::client::KuduScanBatch::RowPtr::IsNull(int) const@plt
│ test %al,%al
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)