Posted to issues-all@impala.apache.org by "Sahil Takiar (Jira)" <ji...@apache.org> on 2019/08/23 17:51:00 UTC

[jira] [Commented] (IMPALA-8888) Profile fetch performance when result spooling is enabled

    [ https://issues.apache.org/jira/browse/IMPALA-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914506#comment-16914506 ] 

Sahil Takiar commented on IMPALA-8888:
--------------------------------------

After talking with Tim offline, it seems that benchmarking with a JDBC driver would be better than using impala-shell: impala-shell is slow enough that server-side perf improvements to this code probably don't affect its latency. So I will benchmark with JDBC instead.

> Profile fetch performance when result spooling is enabled
> ---------------------------------------------------------
>
>                 Key: IMPALA-8888
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8888
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>
> Profile the performance of fetching rows when result spooling is enabled. There are a few queries that can be used to benchmark the performance:
> {{time ./bin/impala-shell.sh -B -q "select l_orderkey from tpch_parquet.lineitem" > /dev/null}}
> {{time ./bin/impala-shell.sh -B -q "select * from tpch_parquet.orders" > /dev/null}}
> The first fetches one column and 6,001,215 rows; the second fetches 9 columns and 1,500,000 rows, so together they cover a mix of rows fetched vs. columns fetched.
> The baseline for the benchmark should be the commit prior to IMPALA-8780.
> The benchmark should check for both latency and CPU usage (to see if the copy into {{BufferedTupleStream}} has a significant overhead).
> Various fetch sizes should be used in the benchmark as well, to see whether increasing the fetch size improves performance when result spooling is enabled (ideally it should). It would also be nice to run some fetches between machines, as that will better reflect network round-trip latencies.
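
The fetch-size sweep described above could be driven by a small client-side harness. The following is only a minimal sketch, not the benchmark actually used for this issue: `run_benchmark`, `batch_sizes`, and `fake_fetch` are hypothetical names, and `fake_fetch` is a stand-in that simply pretends the server returned the requested number of rows. A real run would replace it with a call into whatever client is being tested (for example, a loop over a JDBC result set). The harness records wall-clock latency, client-side CPU time, and the number of fetch round trips per fetch size. Server-side CPU usage, which is what the copy into BufferedTupleStream would affect, has to be collected separately.

```python
import time
from typing import Callable, Dict, Iterator, List


def batch_sizes(total_rows: int, fetch_size: int) -> Iterator[int]:
    """Yield the number of rows requested in each fetch call."""
    if fetch_size <= 0:
        raise ValueError("fetch_size must be positive")
    remaining = total_rows
    while remaining > 0:
        n = min(fetch_size, remaining)
        yield n
        remaining -= n


def run_benchmark(
    fetch_batch: Callable[[int], int],
    total_rows: int,
    fetch_sizes: List[int],
) -> List[Dict[str, float]]:
    """Drain total_rows once per fetch size and record timings.

    fetch_batch(n) is a hypothetical hook: it should fetch up to n rows
    from the client under test and return how many rows it received.
    """
    results = []
    for fetch_size in fetch_sizes:
        wall_start = time.perf_counter()
        cpu_start = time.process_time()  # CPU time of this client process only
        rows = 0
        round_trips = 0
        for n in batch_sizes(total_rows, fetch_size):
            rows += fetch_batch(n)
            round_trips += 1
        results.append({
            "fetch_size": fetch_size,
            "rows": rows,
            "round_trips": round_trips,
            "wall_seconds": time.perf_counter() - wall_start,
            "client_cpu_seconds": time.process_time() - cpu_start,
        })
    return results


def fake_fetch(n: int) -> int:
    """Stand-in for a real fetch call; pretends all n rows came back."""
    return n


if __name__ == "__main__":
    # 1,500,000 matches the row count quoted for tpch_parquet.orders above.
    for result in run_benchmark(fake_fetch, 1_500_000, [1024, 10240, 102400]):
        print(result)
```

With the stand-in fetcher the timings are meaningless; only the round-trip counts are informative. They show how quickly the number of client/server exchanges drops as the fetch size grows, which is the effect the cross-machine runs are meant to expose.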



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
