Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/03/18 23:32:00 UTC

[jira] [Commented] (IMPALA-11185) Reuse orc::ColumnVectorBatch in the scanner life-cycle

    [ https://issues.apache.org/jira/browse/IMPALA-11185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509113#comment-17509113 ] 

ASF subversion and git services commented on IMPALA-11185:
----------------------------------------------------------

Commit 4d32ab7122557ca3336354301a3a467a206913a9 in impala's branch refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=4d32ab7 ]

IMPALA-11185: Reuse orc row batch in the scanner life-cycle

In HdfsOrcScanner::AssembleRows(), we always re-create an
orc::ColumnVectorBatch. The ideal pattern is to reuse the batch and
destroy it only when the scanner is closed.
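
The reuse pattern can be illustrated with a minimal, self-contained sketch.
It is not the actual Impala change and does not use the real ORC library:
FakeBatch, FakeRowReader, FakeScanner and CountBatchesCreated are
hypothetical stand-ins, loosely modelled on a reader that hands out a batch
once and then refills it on every read. The sketch only counts how many
batches get allocated with and without reuse.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for a column batch: a buffer that is costly to build.
struct FakeBatch {
  explicit FakeBatch(std::size_t capacity) : values(capacity), num_elements(0) {}
  std::vector<int64_t> values;
  std::size_t num_elements;
};

// Hypothetical stand-in for a row reader that creates and fills batches.
class FakeRowReader {
 public:
  explicit FakeRowReader(std::size_t total_rows) : remaining_(total_rows) {}

  std::unique_ptr<FakeBatch> CreateRowBatch(std::size_t capacity) {
    ++batches_created_;
    return std::unique_ptr<FakeBatch>(new FakeBatch(capacity));
  }

  // Refills 'batch' in place; returns false once no rows are left.
  bool Next(FakeBatch& batch) {
    if (remaining_ == 0) {
      batch.num_elements = 0;
      return false;
    }
    std::size_t n = std::min(remaining_, batch.values.size());
    batch.num_elements = n;
    remaining_ -= n;
    return true;
  }

  int batches_created() const { return batches_created_; }

 private:
  std::size_t remaining_;
  int batches_created_ = 0;
};

// Hypothetical scanner. With reuse enabled, the batch is created lazily on
// the first AssembleRows() call and destroyed only in Close().
class FakeScanner {
 public:
  FakeScanner(FakeRowReader* reader, std::size_t capacity, bool reuse)
      : reader_(reader), capacity_(capacity), reuse_(reuse) {
    assert(reader != nullptr);
  }

  // Returns the number of rows read by this call, 0 at end of input.
  std::size_t AssembleRows() {
    if (!reuse_ || batch_ == nullptr) {
      batch_ = reader_->CreateRowBatch(capacity_);
    }
    if (!reader_->Next(*batch_)) return 0;
    return batch_->num_elements;
  }

  void Close() { batch_.reset(); }

 private:
  FakeRowReader* reader_;
  std::size_t capacity_;
  bool reuse_;
  std::unique_ptr<FakeBatch> batch_;
};

// Scans all rows and reports how many batches were allocated along the way.
int CountBatchesCreated(bool reuse, std::size_t total_rows, std::size_t capacity) {
  FakeRowReader reader(total_rows);
  FakeScanner scanner(&reader, capacity, reuse);
  while (scanner.AssembleRows() > 0) {
  }
  scanner.Close();
  return reader.batches_created();
}
```

In this sketch, CountBatchesCreated(true, ...) allocates a single batch for
the whole scan, while CountBatchesCreated(false, ...) allocates one batch
per AssembleRows() call, which is the per-call allocation and destruction
cost the commit aims to avoid.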

This saves half of the scanner time in some TPC-H queries. See the flame
graph in the JIRA description.

Tests:
 - Ran CORE tests

Change-Id: I03887ed94af2ff03d67cd00c79375c734a75af62
Reviewed-on: http://gerrit.cloudera.org:8080/18325
Reviewed-by: Quanlong Huang <hu...@gmail.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Reuse orc::ColumnVectorBatch in the scanner life-cycle
> ------------------------------------------------------
>
>                 Key: IMPALA-11185
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11185
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>         Attachments: tpch-q1-scanner-flame-graph.jpg
>
>
> In HdfsOrcScanner::AssembleRows(), we always re-create an orc::ColumnVectorBatch. The ideal pattern is to reuse the batch and destroy it only when the scanner is closed.
> In the flame graph of TPC-H Q1 collected by [~drorke], createRowBatch and the destructors occupy almost half of the scanner time.
> [Attached image: tpch-q1-scanner-flame-graph.jpg]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
