Posted to issues-all@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2019/09/16 23:08:00 UTC

[jira] [Resolved] (IMPALA-1580) Optimize conversion of row batch to query result set

     [ https://issues.apache.org/jira/browse/IMPALA-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-1580.
-----------------------------------
    Resolution: Duplicate

> Optimize conversion of row batch to query result set
> ----------------------------------------------------
>
>                 Key: IMPALA-1580
>                 URL: https://issues.apache.org/jira/browse/IMPALA-1580
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Perf Investigation
>    Affects Versions: Impala 2.0.1
>            Reporter: casey
>            Priority: Minor
>              Labels: performance, ramp-up
>         Attachments: select_lineitem.profile
>
>
> For simple queries that produce a large result set, such as "select * from tpch.lineitem", the server execution time is limited by the time required to convert row batches (results in the internal structure) to query results (the structure to be sent to the client). The data conversion is the limiting factor in this case because the query plan execution happens in parallel.
> Here are some data points from the profile of "select * from tpch.lineitem" using HS2. The profile was taken with --exchg_node_buffer_size_bytes=2048576000 so that the exchange node would never block because of a full buffer. Beeswax takes even longer to convert the rows.
> * Query Timeline: 1m9s
> * Execution Profile -- Total: 1s295ms
> * ClientFetchWaitTimer: 52s553ms
> * RowMaterializationTimer: 15s216ms
> * Coordinator Fragment F01:(Total: 1s092ms
> * Averaged Fragment F00:(Total: 5s608ms
> So the "RowMaterializationTimer", which is actually conversion time, adds ~9 seconds, or ~2x the plan execution time, to the overall time.
> Ideally the conversion would be codegen'd, but even without that there should be a lot of room for improvement by reducing function calls.
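
The point about reducing function calls can be illustrated with a small sketch. Everything below is a hypothetical stand-in rather than Impala's actual code: the types RowBatch, Value and ResultColumn, and the functions AppendValue, ConvertRowAtATime and ConvertColumnAtATime, are invented for illustration. The first conversion makes one call and one type switch per value; the second switches once per column and then runs a tight loop over the rows. Both are assumed to produce the same output, and any actual speedup would have to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only, not Impala's classes.
enum class ColType { kInt64, kDouble };

// A single slot; only the member matching the column type is meaningful.
struct Value {
  int64_t i;
  double d;
};

using Row = std::vector<Value>;

// Simplified row-oriented batch: one type per column, then the rows.
struct RowBatch {
  std::vector<ColType> types;
  std::vector<Row> rows;
};

// Simplified columnar result: only the vector matching the type is filled.
struct ResultColumn {
  std::vector<int64_t> ints;
  std::vector<double> doubles;
};

using ResultSet = std::vector<ResultColumn>;

// Per-value path: one function call and one type switch for every value.
void AppendValue(ColType type, const Value& v, ResultColumn* out) {
  switch (type) {
    case ColType::kInt64:
      out->ints.push_back(v.i);
      break;
    case ColType::kDouble:
      out->doubles.push_back(v.d);
      break;
  }
}

ResultSet ConvertRowAtATime(const RowBatch& batch) {
  ResultSet result(batch.types.size());
  for (const Row& row : batch.rows) {
    assert(row.size() == batch.types.size());
    for (size_t c = 0; c < row.size(); ++c) {
      AppendValue(batch.types[c], row[c], &result[c]);
    }
  }
  return result;
}

// Per-column path: the type switch is hoisted out of the row loop, so the
// inner loop is a plain copy with no per-value dispatch.
ResultSet ConvertColumnAtATime(const RowBatch& batch) {
  ResultSet result(batch.types.size());
  const size_t num_rows = batch.rows.size();
  for (size_t c = 0; c < batch.types.size(); ++c) {
    switch (batch.types[c]) {
      case ColType::kInt64: {
        std::vector<int64_t>& dst = result[c].ints;
        dst.reserve(num_rows);
        for (const Row& row : batch.rows) dst.push_back(row[c].i);
        break;
      }
      case ColType::kDouble: {
        std::vector<double>& dst = result[c].doubles;
        dst.reserve(num_rows);
        for (const Row& row : batch.rows) dst.push_back(row[c].d);
        break;
      }
    }
  }
  return result;
}
```

Calling both functions on the same batch should yield identical columns. The column-at-a-time variant also reserves the output capacity up front, which avoids repeated reallocation for large result sets.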



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
