Posted to issues@impala.apache.org by "Gabor Kaszab (JIRA)" <ji...@apache.org> on 2019/03/14 14:25:00 UTC

[jira] [Resolved] (IMPALA-7980) High system CPU time usage (and waste) when runtime filters filter out files

     [ https://issues.apache.org/jira/browse/IMPALA-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Kaszab resolved IMPALA-7980.
----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 3.2.0

> High system CPU time usage (and waste) when runtime filters filter out files
> ----------------------------------------------------------------------------
>
>                 Key: IMPALA-7980
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7980
>             Project: IMPALA
>          Issue Type: Task
>            Reporter: Philip Zeyliger
>            Priority: Major
>             Fix For: Impala 3.2.0
>
>
> When running TPC-DS query 1 on scale factor 10,000 (10TB) on a 140-node cluster with {{replica_preference=remote}}, we observed really high system CPU usage for some of the scan nodes:
> {code}
> HDFS_SCAN_NODE (id=6):(Total: 59s107ms, non-child: 59s107ms, % non-child: 100.00%)
> - BytesRead: 80.50 MB (84408563)
> - ScannerThreadsSysTime: 36m17s
> {code}
> Using {{perf}}, we discovered a lot of time spent in {{futex_wait}}, {{pthread_cond_wait}}, and so on. (We also used {{perf}} to record context switches and cycles.) Interestingly, watching the process in {{top}}, we saw the really high system CPU usage spike only some time into the query.
> We believe what's going on is that we start many ScannerThread instances, which first wait until the initial ranges have been issued and then grab data using {{impala::io::ScanRange::GetNext()}}. They do this in a loop, acquiring two locks on each iteration, until the query is done or {{num_unqueued_files_}} reaches zero. If {{num_unqueued_files_}} stays above zero, these threads just spin through the two lock acquisitions and do nothing else. We believe this hot loop is eating system CPU aggressively.
> It's a bit interesting that this is exacerbated in the case with more remote reads. Our best guess is that some of the reads take significantly longer in this case, and a single outlier can extend this period of waste.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)