Posted to issues-all@impala.apache.org by "Csaba Ringhofer (Jira)" <ji...@apache.org> on 2023/04/27 17:27:00 UTC

[jira] [Closed] (IMPALA-12101) Inconsistent speeds with result spooling

     [ https://issues.apache.org/jira/browse/IMPALA-12101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Csaba Ringhofer closed IMPALA-12101.
------------------------------------
    Resolution: Invalid

> Inconsistent speeds with result spooling
> ----------------------------------------
>
>                 Key: IMPALA-12101
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12101
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend, Clients
>            Reporter: Csaba Ringhofer
>            Priority: Major
>              Labels: performance
>
> Noticed a case where enabling result spooling makes query execution much slower:
> impala-shell  -B -q "set spool_query_results=1; select cast(l_shipdate as timestamp)  from tpch_parquet.lineitem;" > /dev/null
> Fetched 6001215 row(s) in 23.81s
> impala-shell -B -q "set spool_query_results=0; select cast(l_shipdate as timestamp)  from tpch_parquet.lineitem;" > /dev/null
> Fetched 6001215 row(s) in 9.92s
> Using beeswax leads to completely different results:
> impala-shell  --protocol=beeswax -B -q "set spool_query_results=1; select cast(l_shipdate as timestamp)  from tpch_parquet.lineitem;" > /dev/null
> Fetched 6001215 row(s) in 10.32s
> impala-shell  --protocol=beeswax -B -q "set spool_query_results=0; select cast(l_shipdate as timestamp)  from tpch_parquet.lineitem;" > /dev/null
> Fetched 6001215 row(s) in 11.87s
> This anomaly seems to occur when both the client and the coordinator need significant time to process the returned rows.
> Note that slow result generation from timestamps (and dates) is a known performance issue in the coordinator: most of the time is spent converting dates/timestamps to strings. On the other hand, I don't understand how enabling result spooling can slow down a query.
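
The four runs above can be repeated with a small timing harness. This is a minimal sketch, not part of the original report. It assumes impala-shell is on the PATH and can reach a coordinator with the tpch_parquet database loaded. The helper names (build_command, time_run) are hypothetical. It uses only the flags shown in the commands above, and passing no protocol keeps the shell's default. The harness measures wall-clock time for the whole process, so shell startup is included and the numbers will not match the "Fetched ... in" figures exactly.

```python
import shlex
import subprocess
import time

# Query taken from the report; the table is assumed to exist on the cluster.
QUERY = "select cast(l_shipdate as timestamp) from tpch_parquet.lineitem;"


def build_command(spooling, protocol=None):
    """Return the impala-shell argv for one run.

    protocol=None omits --protocol, so the shell's default is used.
    """
    cmd = ["impala-shell"]
    if protocol is not None:
        cmd.append(f"--protocol={protocol}")
    cmd += ["-B", "-q", f"set spool_query_results={int(spooling)}; {QUERY}"]
    return cmd


def time_run(cmd):
    """Run one command, discard its stdout, and return elapsed seconds."""
    start = time.monotonic()
    subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
    return time.monotonic() - start


def main():
    # 2x2 matrix: default protocol vs. beeswax, spooling on vs. off.
    for protocol in (None, "beeswax"):
        for spooling in (True, False):
            cmd = build_command(spooling, protocol)
            print(f"{time_run(cmd):8.2f}s  {shlex.join(cmd)}")


if __name__ == "__main__":
    main()
```

Running each combination several times and comparing medians would make it easier to tell a real difference from run-to-run noise.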



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org