You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@impala.apache.org by "Bikramjeet Vig (JIRA)" <ji...@apache.org> on 2017/07/24 19:10:00 UTC
[jira] [Resolved] (IMPALA-5275) Avoid printing Status stack trace
on hot paths
[ https://issues.apache.org/jira/browse/IMPALA-5275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bikramjeet Vig resolved IMPALA-5275.
------------------------------------
Resolution: Fixed
IMPALA-5275: Avoid printing status stack trace on hot paths
Currently, creation of a Status object (non-OK and non-EXPECTED)
prints the stack trace to the log. Fetching the stack trace takes
a large chunk of CPU time close to 130ms and results in a significant
perf hit when encountered on hot paths.
Five such hot paths were identified and the following changes were
made to fix it:
1. In ImpalaServer::GetExecSummary(), create Status() without holding
the query_log_lock_.
2, 3 and 4. In impala::DeserializeThriftMsg<>(),
PartitionedAggregationNode::CodegenUpdateTuple() and
HdfsScanner::CodegenWriteCompleteTuple, use Status::Expected where
appropriate.
5. In Status::MemLimitExceeded(), create Status object without
printing stacktrace
Change-Id: Ief083f558fba587381aa7fe8f99da279da02f1f2
Reviewed-on: http://gerrit.cloudera.org:8080/7449
Reviewed-by: Matthew Jacobs <mj...@cloudera.com>
Tested-by: Impala Public Jenkins
> Avoid printing Status stack trace on hot paths
> ----------------------------------------------
>
> Key: IMPALA-5275
> URL: https://issues.apache.org/jira/browse/IMPALA-5275
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Mostafa Mokhtar
> Assignee: Bikramjeet Vig
> Priority: Critical
> Labels: ramp-up
> Attachments: cancellation_printing_expensive_stack_impalad-stacks-full.out.zip
>
>
> h3. Summary from [~mjacobs]:
> There are 3 known issues to fix ASAP:
> 1) In ImpalaServer::GetExecSummary(), do not create Status() while holding the lock.
> 2) Change error status in impala::DeserializeThriftMsg<>() to use Status::Expected
> 3) PartitionedAggregationNode::CodegenUpdateTuple() handling for unsupported types (e.g. char) should use Status::Expected
> Then separately (less time critical) we should investigate how to improve our stack walking as these issues may come up again in the future. I filed IMPALA-5651.
> h3. More detailed initial report for #2 above, from Mostafa:
> When reading Parquet data attempts to find the Parquet header prints a stack trace to the log even if the error was found successfully.
> Also printing the stack trace without logging which Parquet file was being read makes the log message useless.
> {code}
> Status::Status(const string& error_msg)
> : msg_(new ErrorMsg(TErrorCode::GENERAL, error_msg)) {
> VLOG(1) << msg_->msg() << "\n" << GetStackTrace();
> }
> {code}
> Vtune data showed that capturing the stack trace is very expensive and spends 130ms CPU per invocation.
> {code}
> This is the expensive call stack btw
> CPU Time
> 1 of 8: 77.9% (0.106s of 0.136s)
> vmlinux ! KSTK_ESP - [unknown source file]
> vmlinux ! vm_is_stack + 0xa1 - [unknown source file]
> vmlinux ! show_map_vma + 0x268 - [unknown source file]
> vmlinux ! show_map + 0x20 - [unknown source file]
> vmlinux ! show_pid_map + 0x12 - [unknown source file]
> vmlinux ! seq_read + 0x24f - [unknown source file]
> vmlinux ! vfs_read + 0x9d - [unknown source file]
> vmlinux ! SyS_read + 0x7e - [unknown source file]
> vmlinux ! tracesys + 0xdc - [unknown source file]
> libpthread-2.17.so ! __read + 0x2c - [unknown source file]
> impalad ! google::OpenObjectFileContainingPcAndGetStartAddress + 0x2cf - [unknown source file]
> impalad ! google::SymbolizeAndDemangle + 0x488 - [unknown source file]
> impalad ! google::DumpStackTrace.constprop.9 + 0x6e - [unknown source file]
> impalad ! impala::GetStackTrace + 0x1e - debug-util.cc:321
> impalad ! impala::Status::Status + 0x68 - status.cc:114
> impalad ! impala::DeserializeThriftMsg<parquet::PageHeader> + 0x2ce - thrift-util.h:128
> impalad ! impala::BaseScalarColumnReader::ReadPageHeader + 0x182 - parquet-column-readers.cc:787
> impalad ! impala::BaseScalarColumnReader::ReadDataPage + 0x9b - parquet-column-readers.cc:966
> impalad ! impala::BaseScalarColumnReader::NextPage + 0x6e - parquet-column-readers.cc:1093
> impalad ! ReadValueBatch<false> + 0x196 - parquet-column-readers.cc:306
> impalad ! impala::ScalarColumnReader<long, (bool)1>::ReadNonRepeatedValueBatch + 0x60 - parquet-column-readers.cc:253
> impalad ! impala::HdfsParquetScanner::AssembleRows + 0x313 - hdfs-parquet-scanner.cc:950
> impalad ! impala::HdfsParquetScanner::GetNextInternal + 0x277 - hdfs-parquet-scanner.cc:482
> impalad ! impala::HdfsParquetScanner::ProcessSplit + 0x73 - hdfs-parquet-scanner.cc:409
> impalad ! impala::HdfsScanNode::ProcessSplit + 0x395 - hdfs-scan-node.cc:526
> impalad ! impala::HdfsScanNode::ScannerThread + 0xbb8 - hdfs-scan-node.cc:417
> impalad ! boost::function0<void>::operator() + 0x1a - function_template.hpp:767
> {code}
> What gets printed to the log
> {code}
> I0501 14:13:24.154422 21797 status.cc:119] couldn't deserialize thrift msg:
> No more data to read.
> @ 0x838a89 impala::Status::Status()
> @ 0xc955af impala::DeserializeThriftMsg<>()
> @ 0xc8c973 impala::BaseScalarColumnReader::ReadPageHeader()
> @ 0xc8e0fc impala::BaseScalarColumnReader::ReadDataPage()
> @ 0xc8ee6f impala::BaseScalarColumnReader::NextPage()
> @ 0xc9d5a8 impala::ScalarColumnReader<>::ReadNonRepeatedValueBatch()
> @ 0xc72764 impala::HdfsParquetScanner::AssembleRows()
> @ 0xc77238 impala::HdfsParquetScanner::GetNextInternal()
> @ 0xc71e74 impala::HdfsParquetScanner::ProcessSplit()
> @ 0xc4eab6 impala::HdfsScanNode::ProcessSplit()
> @ 0xc50719 impala::HdfsScanNode::ScannerThread()
> @ 0xbdd739 impala::Thread::SuperviseThread()
> @ 0xbde184 boost::detail::thread_data<>::run()
> @ 0xe4f3ba (unknown)
> @ 0x3797407aa1 (unknown)
> @ 0x37970e893d (unknown)
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)