Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2023/04/06 10:06:00 UTC

[jira] [Created] (IMPALA-12046) Add profile counter for scan range queueing time on disk queues

Quanlong Huang created IMPALA-12046:
---------------------------------------

             Summary: Add profile counter for scan range queueing time on disk queues
                 Key: IMPALA-12046
                 URL: https://issues.apache.org/jira/browse/IMPALA-12046
             Project: IMPALA
          Issue Type: New Feature
          Components: Backend
            Reporter: Quanlong Huang
            Assignee: Quanlong Huang


I saw a profile showing that the total time of a ScanNode is dominated by {{ScannerIoWaitTime}}. However, TotalRawHdfsOpenFileTime and TotalRawHdfsReadTime are both small. No other counters explain why {{ScannerIoWaitTime}} is so long.
{code:java}
- DecompressionTime: 964.648ms
- InactiveTotalTime: 0.000ns
- MaterializeTupleTime: 2s132ms
- ScannerIoWaitTime: 11s641ms          <-- Dominates the total time
- TotalRawHdfsOpenFileTime: 14.501ms
- TotalRawHdfsReadTime: 1s374ms
- TotalReadThroughput: 29.94 MB/sec
- TotalTime: 15s865ms{code}
After some debugging, I realized the time is spent queueing in the disk queues. If the scanner consumes data faster than the disk I/O threads can read it, scan ranges pile up in the disk queues waiting to be serviced. That queueing time is counted in neither TotalRawHdfsOpenFileTime nor TotalRawHdfsReadTime, but it is counted in ScannerIoWaitTime. We should add a profile counter for the queueing time on disk queues to better explain ScannerIoWaitTime.
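To make the idea concrete, here is a minimal, self-contained C++ sketch of what such a counter could measure: stamp each scan range when it is enqueued on a disk queue, and accumulate the elapsed time when a disk I/O thread dequeues it. The accumulated total is what the proposed counter would report. All names here (ScanRange, DiskQueue, queue_wait_ns_) are illustrative placeholders, not Impala's actual classes or the final counter name.
{code:cpp}
// Hypothetical sketch of measuring scan-range queueing time on a disk queue.
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <iostream>
#include <mutex>
#include <thread>

using Clock = std::chrono::steady_clock;

struct ScanRange {
  int64_t offset;
  int64_t len;
  Clock::time_point enqueue_time;  // stamped when the range enters the queue
};

class DiskQueue {
 public:
  void Enqueue(ScanRange range) {
    range.enqueue_time = Clock::now();
    std::lock_guard<std::mutex> l(lock_);
    queue_.push_back(range);
    cv_.notify_one();
  }

  // Called by a disk I/O thread. Accumulates how long the range waited in the
  // queue; this total would back the proposed queueing-time profile counter.
  ScanRange Dequeue() {
    std::unique_lock<std::mutex> l(lock_);
    cv_.wait(l, [this] { return !queue_.empty(); });
    ScanRange range = queue_.front();
    queue_.pop_front();
    queue_wait_ns_ += std::chrono::duration_cast<std::chrono::nanoseconds>(
        Clock::now() - range.enqueue_time).count();
    return range;
  }

  int64_t queue_wait_ns() const { return queue_wait_ns_.load(); }

 private:
  std::mutex lock_;
  std::condition_variable cv_;
  std::deque<ScanRange> queue_;
  std::atomic<int64_t> queue_wait_ns_{0};
};

int main() {
  DiskQueue queue;
  queue.Enqueue({0, 1 << 20, {}});
  // Simulate busy disk threads that only reach this range after 50ms.
  std::this_thread::sleep_for(std::chrono::milliseconds(50));
  queue.Dequeue();
  std::cout << "queue wait: " << queue.queue_wait_ns() / 1e6 << " ms\n";
  return 0;
}
{code}
In the profile above, a counter like this would account for most of the 11s641ms of ScannerIoWaitTime that TotalRawHdfsOpenFileTime and TotalRawHdfsReadTime currently leave unexplained.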


