Posted to issues@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2017/07/11 03:24:00 UTC
[jira] [Resolved] (IMPALA-5629) list::size() in BufferedTupleStreamV2::AdvanceWritePage() is expensive
[ https://issues.apache.org/jira/browse/IMPALA-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong resolved IMPALA-5629.
-----------------------------------
Resolution: Fixed
Fix Version/s: Impala 2.10.0
IMPALA-5629: avoid expensive list::size() call
As a workaround until we move to GCC5+, explicitly track the size of the
pages_ list. This is not too bad in practice since the list is only
mutated in 3 places.
Testing:
Ran buffered-tuple-stream-v2-test (the only coverage of
BufferedTupleStreamV2 currently).
Reran the query with the perf issue, confirmed that it was no longer
spending lots of time in BufferedTupleStreamV2::AdvanceWritePage().
Change-Id: Id83fcf68dcc3ea729df167885f999ff32b861e66
Reviewed-on: http://gerrit.cloudera.org:8080/7382
Reviewed-by: Dan Hecht <dh...@cloudera.com>
Tested-by: Impala Public Jenkins
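
The workaround described in the commit message above (tracking the list size explicitly instead of calling list::size()) can be sketched roughly as follows. This is a minimal illustration, not the actual Impala change: Page, PageList, num_pages_ and the method names are hypothetical, and the three mutators here only stand in for whichever places the real code mutates pages_.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical stand-in for a page in the stream; not Impala's Page type.
struct Page {
  int id;
};

// Hypothetical wrapper: keeps a counter next to the list so that asking for
// the size never walks the list, even where std::list::size() is O(n).
class PageList {
 public:
  // Every mutation of pages_ must update num_pages_ in the same place.
  void PushBack(const Page& page) {
    pages_.push_back(page);
    ++num_pages_;
  }

  void PopFront() {
    assert(!pages_.empty());
    pages_.pop_front();
    --num_pages_;
  }

  void Clear() {
    pages_.clear();
    num_pages_ = 0;
  }

  // O(1) regardless of the standard library implementation.
  std::size_t size() const { return num_pages_; }

  // Consistency check intended for debug builds only; it may walk the list.
  bool CheckConsistency() const { return num_pages_ == pages_.size(); }

 private:
  std::list<Page> pages_;
  std::size_t num_pages_ = 0;
};
```

The cost of this approach is that each mutation site has to keep the counter in sync, which stays manageable when the list is mutated in only a few places; a debug-only check like CheckConsistency() above is one way to catch a missed update.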
> list::size() in BufferedTupleStreamV2::AdvanceWritePage() is expensive
> ----------------------------------------------------------------------
>
> Key: IMPALA-5629
> URL: https://issues.apache.org/jira/browse/IMPALA-5629
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Affects Versions: Impala 2.10.0
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Labels: perf
> Fix For: Impala 2.10.0
>
>
> In a test run executing a very large join I saw a lot of CPU being burnt in BufferedTupleStreamV2::AdvanceWritePage().
> It looks like it's all being spent iterating over the pages_ linked list. list::size() is an O(n) operation in some implementations (for example libstdc++ before GCC 5), so each call walks the entire list.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)