Posted to issues@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/01/18 05:29:00 UTC
[jira] [Resolved] (IMPALA-9302) Multithreaded scanners don't check for filter effectiveness
[ https://issues.apache.org/jira/browse/IMPALA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong resolved IMPALA-9302.
-----------------------------------
Fix Version/s: Impala 3.4.0
Resolution: Fixed
> Multithreaded scanners don't check for filter effectiveness
> -----------------------------------------------------------
>
> Key: IMPALA-9302
> URL: https://issues.apache.org/jira/browse/IMPALA-9302
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 3.3.0
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Labels: multithreading, performance
> Fix For: Impala 3.4.0
>
>
> This can be reproduced with TPC-H Q9. I saw it locally at scale factor 30, where the mt_dop=4 version of the query uses much more CPU in the scan than the mt_dop=0 version.
> The cause is that none of the runtime filters are being disabled, not even the ineffective ones.
> {noformat}
> Filter 2 (16.00 MB):
> - Files processed: 0 (0)
> - Files rejected: 0 (0)
> - Files total: 0 (0)
> - RowGroups processed: 0 (0)
> - RowGroups rejected: 0 (0)
> - RowGroups total: 0 (0)
> - Rows processed: 30.97M (30970695)
> - Rows rejected: 0 (0)
> - Rows total: 31.01M (31009074)
> - Splits processed: 0 (0)
> - Splits rejected: 0 (0)
> - Splits total: 0 (0)
> Filter 4 (8.00 MB):
> - Files processed: 0 (0)
> - Files rejected: 0 (0)
> - Files total: 0 (0)
> - RowGroups processed: 0 (0)
> - RowGroups rejected: 0 (0)
> - RowGroups total: 0 (0)
> - Rows processed: 30.97M (30970695)
> - Rows rejected: 0 (0)
> - Rows total: 31.01M (31009074)
> - Splits processed: 0 (0)
> - Splits rejected: 0 (0)
> - Splits total: 0 (0)
> Filter 5 (8.00 MB):
> - Files processed: 0 (0)
> - Files rejected: 0 (0)
> - Files total: 0 (0)
> - RowGroups processed: 0 (0)
> - RowGroups rejected: 0 (0)
> - RowGroups total: 0 (0)
> - Rows processed: 30.97M (30970695)
> - Rows rejected: 0 (0)
> - Rows total: 31.01M (31009074)
> - Splits processed: 0 (0)
> - Splits rejected: 0 (0)
> - Splits total: 0 (0)
> Filter 8 (1.00 MB):
> - Files processed: 0 (0)
> - Files rejected: 0 (0)
> - Files total: 0 (0)
> - RowGroups processed: 0 (0)
> - RowGroups rejected: 0 (0)
> - RowGroups total: 0 (0)
> - Rows processed: 31.01M (31009074)
> - Rows rejected: 0 (0)
> - Rows total: 31.01M (31009074)
> - Splits processed: 0 (0)
> - Splits rejected: 0 (0)
> - Splits total: 0 (0)
> Filter 10 (1.00 MB):
> - Files processed: 0 (0)
> - Files rejected: 0 (0)
> - Files total: 0 (0)
> - RowGroups processed: 0 (0)
> - RowGroups rejected: 0 (0)
> - RowGroups total: 0 (0)
> - Rows processed: 31.01M (31009074)
> - Rows rejected: 29.32M (29317263)
> - Rows total: 31.01M (31009074)
> - Splits processed: 0 (0)
> - Splits rejected: 0 (0)
> - Splits total: 0 (0)
> {noformat}
> In contrast, here are the filters for mt_dop=0, where not all the rows are processed.
> {noformat}
> Filter 2 (16.00 MB):
> - Files processed: 0 (0)
> - Files rejected: 0 (0)
> - Files total: 0 (0)
> - RowGroups processed: 0 (0)
> - RowGroups rejected: 0 (0)
> - RowGroups total: 0 (0)
> - Rows processed: 8.18M (8180257)
> - Rows rejected: 0 (0)
> - Rows total: 180.00M (179998372)
> - Splits processed: 0 (0)
> - Splits rejected: 0 (0)
> - Splits total: 0 (0)
> Filter 4 (8.00 MB):
> - Files processed: 0 (0)
> - Files rejected: 0 (0)
> - Files total: 0 (0)
> - RowGroups processed: 0 (0)
> - RowGroups rejected: 0 (0)
> - RowGroups total: 0 (0)
> - Rows processed: 8.18M (8180257)
> - Rows rejected: 0 (0)
> - Rows total: 180.00M (179998372)
> - Splits processed: 0 (0)
> - Splits rejected: 0 (0)
> - Splits total: 0 (0)
> Filter 5 (8.00 MB):
> - Files processed: 0 (0)
> - Files rejected: 0 (0)
> - Files total: 0 (0)
> - RowGroups processed: 0 (0)
> - RowGroups rejected: 0 (0)
> - RowGroups total: 0 (0)
> - Rows processed: 8.18M (8180257)
> - Rows rejected: 0 (0)
> - Rows total: 180.00M (179998372)
> - Splits processed: 0 (0)
> - Splits rejected: 0 (0)
> - Splits total: 0 (0)
> Filter 8 (1.00 MB):
> - Files processed: 0 (0)
> - Files rejected: 0 (0)
> - Files total: 0 (0)
> - RowGroups processed: 0 (0)
> - RowGroups rejected: 0 (0)
> - RowGroups total: 0 (0)
> - Rows processed: 8.41M (8406914)
> - Rows rejected: 0 (0)
> - Rows total: 180.00M (179998372)
> - Splits processed: 0 (0)
> - Splits rejected: 0 (0)
> - Splits total: 0 (0)
> Filter 10 (1.00 MB):
> - Files processed: 0 (0)
> - Files rejected: 0 (0)
> - Files total: 0 (0)
> - RowGroups processed: 0 (0)
> - RowGroups rejected: 0 (0)
> - RowGroups total: 0 (0)
> - Rows processed: 180.00M (179998372)
> - Rows rejected: 170.18M (170177099)
> - Rows total: 180.00M (179998372)
> - Splits processed: 0 (0)
> - Splits rejected: 0 (0)
> - Splits total: 0 (0)
> {noformat}
> perf top showed 28% of CPU time in impala::BloomFilter::BucketFindAVX2, which is consistent with the filters never being disabled.
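
For readers who want to check the numbers, here is a minimal sketch that recomputes the reject ratio and the fraction of rows evaluated from the row counters quoted in the two profiles. It is not Impala code: the FilterCounters class, the helper names, and the 10% MIN_REJECT_RATIO threshold are assumptions made for illustration. Only the counter values come from the profiles in this issue.

```python
from dataclasses import dataclass

# Assumed threshold for illustration only: a filter rejecting fewer than
# 10% of the rows it has evaluated is treated as ineffective.
MIN_REJECT_RATIO = 0.1


@dataclass
class FilterCounters:
    """Row-level counters for one runtime filter (hypothetical helper type)."""
    filter_id: int
    rows_processed: int
    rows_rejected: int
    rows_total: int

    def reject_ratio(self) -> float:
        """Fraction of evaluated rows that the filter rejected."""
        if self.rows_processed == 0:
            return 0.0
        return self.rows_rejected / self.rows_processed

    def processed_fraction(self) -> float:
        """Fraction of all rows that were actually run through the filter."""
        if self.rows_total == 0:
            return 0.0
        return self.rows_processed / self.rows_total


def ineffective_filters(counters, min_reject_ratio=MIN_REJECT_RATIO):
    """Return the ids of filters whose reject ratio is below the threshold."""
    return [c.filter_id for c in counters
            if c.reject_ratio() < min_reject_ratio]


# "Rows processed / rejected / total" copied from the mt_dop=4 profile.
MT_DOP_4 = [
    FilterCounters(2, 30970695, 0, 31009074),
    FilterCounters(4, 30970695, 0, 31009074),
    FilterCounters(5, 30970695, 0, 31009074),
    FilterCounters(8, 31009074, 0, 31009074),
    FilterCounters(10, 31009074, 29317263, 31009074),
]

# The same counters copied from the mt_dop=0 profile.
MT_DOP_0 = [
    FilterCounters(2, 8180257, 0, 179998372),
    FilterCounters(4, 8180257, 0, 179998372),
    FilterCounters(5, 8180257, 0, 179998372),
    FilterCounters(8, 8406914, 0, 179998372),
    FilterCounters(10, 179998372, 170177099, 179998372),
]

if __name__ == "__main__":
    for label, profile in (("mt_dop=4", MT_DOP_4), ("mt_dop=0", MT_DOP_0)):
        print(label, "ineffective filters:", ineffective_filters(profile))
        for c in profile:
            print(f"  filter {c.filter_id}: "
                  f"evaluated {c.processed_fraction():.2%} of rows, "
                  f"rejected {c.reject_ratio():.2%} of those")
```

Under the assumed threshold, filters 2, 4, 5 and 8 count as ineffective in both profiles because they reject no rows. The two runs differ in how many rows were evaluated against those filters: nearly all of them with mt_dop=4, but only a small fraction with mt_dop=0.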
--
This message was sent by Atlassian Jira
(v8.3.4#803005)