Posted to issues-all@impala.apache.org by "Mostafa Mokhtar (JIRA)" <ji...@apache.org> on 2018/06/28 21:28:00 UTC

[jira] [Resolved] (IMPALA-2682) Address exchange operator CPU bottlenecks in Thrift

     [ https://issues.apache.org/jira/browse/IMPALA-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mostafa Mokhtar resolved IMPALA-2682.
-------------------------------------
    Resolution: Fixed

Fixed in 2.13

> Address exchange operator CPU bottlenecks in Thrift
> ---------------------------------------------------
>
>                 Key: IMPALA-2682
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2682
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Distributed Exec
>    Affects Versions: Impala 2.2
>            Reporter: Mostafa Mokhtar
>            Assignee: Mostafa Mokhtar
>            Priority: Minor
>              Labels: performance
>         Attachments: Screen Shot 2015-11-25 at 4.19.28 PM.png
>
>
> Currently the exchange operator is a major bottleneck for partitioned table inserts, broadcast and shuffle joins, and aggregates. Exchange alone accounts for 60% of the slowdown in partitioned table inserts compared to un-partitioned inserts.
> The CPU cost of exchange operators can vary between 10% and 50%, depending on the cardinality and the complexity of the other operators.
> Part of the bottleneck in Thrift is due to the 512-byte buffer used by TBufferedTransport, which causes reads to go through a "slow" path where the payload is mem-copied in N chunks of 512 bytes.
> These are the top contributing call stacks in the exchange operator. 
> Read path in Thrift
> {code}
> Data Of Interest (CPU Metrics)
> 1 of 18: 58.1% (43.860s of 75.474s)
> impalad!apache::thrift::transport::TSocket::read - TSocket.cpp
> impalad!apache::thrift::transport::TTransport::read+0x5 - TTransport.h:109
> impalad!apache::thrift::transport::TBufferedTransport::readSlow+0x4b - TBufferTransports.cpp:52
> impalad!apache::thrift::transport::TBufferBase::read+0xb9 - TBufferTransports.h:69
> impalad!apache::thrift::transport::readAll<apache::thrift::transport::TBufferBase>+0x28 - TTransport.h:44
> impalad!apache::thrift::transport::TTransport::readAll+0xa - TTransport.h:126
> impalad!apache::thrift::protocol::TBinaryProtocolT<apache::thrift::transport::TTransport>::readStringBody<std::string>+0xae - TBinaryProtocol.tcc:458
> impalad!readString<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >+0x24 - TBinaryProtocol.tcc:412
> impalad!apache::thrift::protocol::TVirtualProtocol<apache::thrift::protocol::TBinaryProtocolT<apache::thrift::transport::TTransport>, apache::thrift::protocol::TProtocolDefaults>::readString_virt+0x11 - TVirtualProtocol.h:515
> impalad!apache::thrift::protocol::TProtocol::readString+0x10 - TProtocol.h:621
> impalad!impala::TRowBatch::read+0x17d - Results_types.cpp:89
> {code}
> Compression in the send path
> {code}
> Data Of Interest (CPU Metrics)
> 1 of 1: 100.0% (65.480s of 65.480s)
> impalad!LZ4_compress64kCtx - [Unknown]
> impalad!impala::Lz4Compressor::ProcessBlock+0x32 - compress.cc:294
> impalad!impala::RowBatch::Serialize+0x20b - row-batch.cc:211
> impalad!impala::RowBatch::Serialize+0x34 - row-batch.cc:168
> impalad!impala::DataStreamSender::SerializeBatch+0x117 - data-stream-sender.cc:463
> impalad!impala::DataStreamSender::Channel::SendCurrentBatch+0x44 - data-stream-sender.cc:256
> impalad!impala::DataStreamSender::Channel::AddRow+0x48 - data-stream-sender.cc:232
> impalad!impala::DataStreamSender::Send+0x6d2 - data-stream-sender.cc:444
> {code}
> Memory copy in send path
> {code}
> Data Of Interest (CPU Metrics)
> 1 of 137: 24.6% (6.870s of 27.911s)
> libc.so.6!memcpy - [Unknown]
> impalad!impala::Tuple::DeepCopyVarlenData+0xfe - tuple.cc:89
> impalad!impala::Tuple::DeepCopy+0xc3 - tuple.cc:69
> impalad!impala::DataStreamSender::Channel::AddRow+0xf5 - data-stream-sender.cc:245
> impalad!impala::DataStreamSender::Send+0x6d2 - data-stream-sender.cc:444
> impalad!impala::PlanFragmentExecutor::OpenInternal+0x3e2 - plan-fragment-executor.cc:355
> {code}
> Decompression in the read path
> {code}
> Data Of Interest (CPU Metrics)
> 1 of 1: 100.0% (25.100s of 25.100s)
> impalad!LZ4_uncompress - [Unknown]
> impalad!impala::Lz4Decompressor::ProcessBlock+0x47 - decompress.cc:454
> impalad!impala::RowBatch::RowBatch+0x3ec - row-batch.cc:108
> impalad!impala::DataStreamRecvr::SenderQueue::AddBatch+0x1a1 - data-stream-recvr.cc:207
> impalad!impala::DataStreamMgr::AddData+0x134 - data-stream-mgr.cc:103
> impalad!impala::ImpalaServer::TransmitData+0x175 - impala-server.cc:1077
> impalad!impala::ImpalaInternalService::TransmitData+0x43 - impala-internal-service.h:60
> {code}
> Write path in Thrift
> {code}
> Data Of Interest (CPU Metrics)
> 1 of 14: 31.9% (7.010s of 21.980s)
> libpthread.so.0!__send - [Unknown]
> impalad!apache::thrift::transport::TSocket::write_partial+0x36 - TSocket.cpp:567
> impalad!apache::thrift::transport::TSocket::write+0x3c - TSocket.cpp:542
> impalad!apache::thrift::transport::TTransport::write+0xa - TTransport.h:158
> impalad!apache::thrift::transport::TBufferedTransport::writeSlow+0x79 - TBufferTransports.cpp:93
> impalad!apache::thrift::transport::TTransport::write+0xb - TTransport.h:158
> impalad!writeString<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >+0x39 - TBinaryProtocol.tcc:186
> impalad!apache::thrift::protocol::TVirtualProtocol<apache::thrift::protocol::TBinaryProtocolT<apache::thrift::transport::TTransport>, apache::thrift::protocol::TProtocolDefaults>::writeString_virt+0x1b - TVirtualProtocol.h:417
> impalad!apache::thrift::protocol::TProtocol::writeString+0xf - TProtocol.h:463
> impalad!impala::TRowBatch::write+0x18a - Results_types.cpp:169
> impalad!impala::TTransmitDataParams::write+0x128 - ImpalaInternalService_types.cpp:2914
> impalad!impala::ImpalaInternalService_TransmitData_pargs::write+0x59 - ImpalaInternalService.cpp:571
> impalad!impala::ImpalaInternalServiceClient::send_TransmitData+0x65 - ImpalaInternalService.cpp:862
> impalad!impala::ImpalaInternalServiceClient::TransmitData+0x1b - ImpalaInternalService.cpp:851
> impalad!impala::ClientConnection<impala::ImpalaInternalServiceClient>::DoRpc<void (impala::TTransmitDataResult&, impala::TTransmitDataParams const&) impala::ImpalaInternalServiceClient::*, impala::TTransmitDataParams, impala::TTransmitDataResult>+0x4a - client-cache.h:229
> impalad!impala::DataStreamSender::Channel::TransmitDataHelper+0x5b1 - data-stream-sender.cc:206
> {code}
