Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/06/11 07:29:00 UTC
[jira] [Commented] (IMPALA-6294) Concurrent hung with lots of spilling make slow progress due to blocking in DataStreamRecvr and DataStreamSender
[ https://issues.apache.org/jira/browse/IMPALA-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17361492#comment-17361492 ]
Quanlong Huang commented on IMPALA-6294:
----------------------------------------
FWIW, IMPALA-10578 is a similar issue, but the root cause there turned out to be a poor configuration: only one rotational disk was configured for spilling, and that disk was also used for logging. The spilling saturated the disk, which blocked logging and ultimately blocked RPCs.
> Concurrent hung with lots of spilling make slow progress due to blocking in DataStreamRecvr and DataStreamSender
> ----------------------------------------------------------------------------------------------------------------
>
> Key: IMPALA-6294
> URL: https://issues.apache.org/jira/browse/IMPALA-6294
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.11.0
> Reporter: Mostafa Mokhtar
> Assignee: Michael Ho
> Priority: Critical
> Attachments: IMPALA-6285 TPCDS Q3 slow broadcast, slow_broadcast_q3_reciever.txt, slow_broadcast_q3_sender.txt
>
>
> While running a highly concurrent spilling workload on a large cluster, queries start running slower; even lightweight queries that are not spilling are affected by this slowdown.
> {code}
> EXCHANGE_NODE (id=9):(Total: 3m1s, non-child: 3m1s, % non-child: 100.00%)
> - ConvertRowBatchTime: 999.990us
> - PeakMemoryUsage: 0
> - RowsReturned: 108.00K (108001)
> - RowsReturnedRate: 593.00 /sec
> DataStreamReceiver:
> BytesReceived(4s000ms): 254.47 KB, 338.82 KB, 338.82 KB, 852.43 KB, 1.32 MB, 1.33 MB, 1.50 MB, 2.53 MB, 2.99 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.16 MB, 3.49 MB, 3.80 MB, 4.15 MB, 4.55 MB, 4.84 MB, 4.99 MB, 5.07 MB, 5.41 MB, 5.75 MB, 5.92 MB, 6.00 MB, 6.00 MB, 6.00 MB, 6.07 MB, 6.28 MB, 6.33 MB, 6.43 MB, 6.67 MB, 6.91 MB, 7.29 MB, 8.03 MB, 9.12 MB, 9.68 MB, 9.90 MB, 9.97 MB, 10.44 MB, 11.25 MB
> - BytesReceived: 11.73 MB (12301692)
> - DeserializeRowBatchTimer: 957.990ms
> - FirstBatchArrivalWaitTime: 0.000ns
> - PeakMemoryUsage: 644.44 KB (659904)
> - SendersBlockedTimer: 0.000ns
> - SendersBlockedTotalTimer(*): 0.000ns
> {code}
> {code}
> DataStreamSender (dst_id=9):(Total: 1s819ms, non-child: 1s819ms, % non-child: 100.00%)
> - BytesSent: 234.64 MB (246033840)
> - NetworkThroughput(*): 139.58 MB/sec
> - OverallThroughput: 128.92 MB/sec
> - PeakMemoryUsage: 33.12 KB (33920)
> - RowsReturned: 108.00K (108001)
> - SerializeBatchTime: 133.998ms
> - TransmitDataRPCTime: 1s680ms
> - UncompressedRowBatchSize: 446.42 MB (468102200)
> {code}
> The timeouts seen in IMPALA-6285 are caused by this issue:
> {code}
> I1206 12:44:14.925405 25274 status.cc:58] RPC recv timed out: Client foo-17.domain.com:22000 timed-out during recv call.
> @ 0x957a6a impala::Status::Status()
> @ 0x11dd5fe impala::DataStreamSender::Channel::DoTransmitDataRpc()
> @ 0x11ddcd4 impala::DataStreamSender::Channel::TransmitDataHelper()
> @ 0x11de080 impala::DataStreamSender::Channel::TransmitData()
> @ 0x11e1004 impala::ThreadPool<>::WorkerThread()
> @ 0xd10063 impala::Thread::SuperviseThread()
> @ 0xd107a4 boost::detail::thread_data<>::run()
> @ 0x128997a (unknown)
> @ 0x7f68c5bc7e25 start_thread
> @ 0x7f68c58f534d __clone
> {code}
> A similar behavior was also observed with KRPC enabled (IMPALA-6048).
--
This message was sent by Atlassian Jira
(v8.3.4#803005)