Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2023/04/06 10:06:00 UTC

[jira] [Updated] (IMPALA-12046) Add profile counter for scan range queueing time on disk queues

     [ https://issues.apache.org/jira/browse/IMPALA-12046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Quanlong Huang updated IMPALA-12046:
------------------------------------
    Priority: Critical  (was: Major)

> Add profile counter for scan range queueing time on disk queues
> ---------------------------------------------------------------
>
>                 Key: IMPALA-12046
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12046
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>              Labels: observability, supportability
>
> I saw a profile in which the total time of a ScanNode is dominated by {{ScannerIoWaitTime}}. However, the HDFS open-file time and read time are both small, and no other counter explains why {{ScannerIoWaitTime}} is so long.
> {code:java}
> - DecompressionTime: 964.648ms
> - InactiveTotalTime: 0.000ns
> - MaterializeTupleTime: 2s132ms
> - ScannerIoWaitTime: 11s641ms          <-- Dominates the total time
> - TotalRawHdfsOpenFileTime: 14.501ms
> - TotalRawHdfsReadTime: 1s374ms
> - TotalReadThroughput: 29.94 MB/sec
> - TotalTime: 15s865ms{code}
> After some debugging, I realized the time is spent waiting in the disk queues. If the scanner consumes data faster than the disk queue threads can read it, scan ranges queue up in the disk queues. This queueing time is not counted in either TotalRawHdfsOpenFileTime or TotalRawHdfsReadTime, but it is counted in ScannerIoWaitTime. We should add a profile counter for the queueing time on disk queues to better explain ScannerIoWaitTime.
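
As a rough illustration of the counter proposed above, here is a minimal Python sketch. It is a toy model, not Impala's actual implementation. The names ScanRange, DiskQueue, queue_wait_ns, read_ns, FakeClock and simulate are all hypothetical. Each scan range is stamped when it is enqueued, and the time until a disk thread picks it up is added to a counter kept separate from the read time. A fake clock keeps the example deterministic; real code would use a monotonic clock.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque


class FakeClock:
    """Deterministic nanosecond clock (assumption: stands in for a monotonic timer)."""

    def __init__(self) -> None:
        self.now_ns = 0

    def __call__(self) -> int:
        return self.now_ns

    def advance(self, ns: int) -> None:
        self.now_ns += ns


@dataclass
class ScanRange:
    """Hypothetical stand-in for a scan range; not Impala's class."""
    name: str
    enqueue_time_ns: int = 0


@dataclass
class DiskQueue:
    """Toy FIFO disk queue served by a single reader thread."""
    clock: Callable[[], int]
    queue_wait_ns: int = 0  # hypothetical counter: time ranges sat in the queue
    read_ns: int = 0        # plays the role of the raw read time
    _ranges: Deque[ScanRange] = field(default_factory=deque)

    def enqueue(self, scan_range: ScanRange) -> None:
        # Stamp the range so the wait can be computed when it is dequeued.
        scan_range.enqueue_time_ns = self.clock()
        self._ranges.append(scan_range)

    def read_next(self, do_read: Callable[[ScanRange], None]) -> None:
        scan_range = self._ranges.popleft()
        start = self.clock()
        # Queueing time: from enqueue until the disk thread starts the read.
        self.queue_wait_ns += start - scan_range.enqueue_time_ns
        do_read(scan_range)
        # Read time: only the duration of the read itself.
        self.read_ns += self.clock() - start


def simulate(num_ranges: int, read_cost_ns: int):
    """Enqueue all ranges at t=0, then read them one by one."""
    clock = FakeClock()
    queue = DiskQueue(clock=clock)
    for i in range(num_ranges):
        queue.enqueue(ScanRange(f"range-{i}"))
    for _ in range(num_ranges):
        queue.read_next(lambda r: clock.advance(read_cost_ns))
    return queue.queue_wait_ns, queue.read_ns


if __name__ == "__main__":
    wait_ns, read_ns = simulate(num_ranges=3, read_cost_ns=100)
    print(f"queue wait: {wait_ns} ns, read: {read_ns} ns")
```

In this toy model, three ranges enqueued together with a 100 ns read each accumulate 0 + 100 + 200 ns of queue wait against 300 ns of read time. Without the separate counter, that wait would only show up indirectly in a wait-time total such as ScannerIoWaitTime.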



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
