Posted to issues-all@impala.apache.org by "Sahil Takiar (JIRA)" <ji...@apache.org> on 2019/08/06 23:41:00 UTC

[jira] [Commented] (IMPALA-8786) BufferedPlanRootSink should directly write to a QueryResultSet if one is available

    [ https://issues.apache.org/jira/browse/IMPALA-8786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901557#comment-16901557 ] 

Sahil Takiar commented on IMPALA-8786:
--------------------------------------

Agree. If the batch boundaries don't match up properly, this optimization won't work. Another issue is that the client waits until rows are available before calling {{GetNext}}, so a {{QueryResultSet}} will not always be available when the first {{RowBatch}} is produced.

> BufferedPlanRootSink should directly write to a QueryResultSet if one is available
> ----------------------------------------------------------------------------------
>
>                 Key: IMPALA-8786
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8786
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>
> {{BufferedPlanRootSink}} uses a {{RowBatchQueue}} to buffer {{RowBatch}}-es and then the consumer thread reads them and writes them to a given {{QueryResultSet}}. Implementations of {{RowBatchQueue}} might end up copying the buffered {{RowBatch}}-es (e.g. if the queue is backed by a {{BufferedTupleStream}}). An optimization would be for the producer thread to directly write to the consumer {{QueryResultSet}}. This optimization would only be triggered if (1) the queue is empty, and (2) the consumer thread has a {{QueryResultSet}} available for writing.
> This "fast path" is useful in a few different scenarios:
>  * If the consumer is faster at reading rows than the producer is at sending them; in this case, the overhead of buffering rows in a {{RowBatchQueue}} can be completely avoided
>  * For queries that return fewer than 1024 rows, it's likely that the consumer will produce a {{QueryResultSet}} before the first {{RowBatch}} is returned (except perhaps for very trivial queries)
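
The fast-path condition described in the issue can be sketched roughly as follows. This is a simplified, single-threaded model and not Impala's actual code: {{Row}}, {{RowBatch}}, {{QueryResultSet}} and {{BufferedSinkSketch}} here are hypothetical stand-ins, and a real sink would need locking and condition variables between the producer and consumer threads. The sketch also assumes a whole batch always fits into the waiting result set, which is exactly the batch-boundary assumption questioned in the comment above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-ins for illustration only; not Impala's real classes.
using Row = int;
using RowBatch = std::vector<Row>;

struct QueryResultSet {
  std::vector<Row> rows;
  void AddRows(const RowBatch& batch) {
    rows.insert(rows.end(), batch.begin(), batch.end());
  }
};

class BufferedSinkSketch {
 public:
  // Producer side: write directly to the consumer's result set when
  // (1) the queue is empty and (2) a result set is waiting; otherwise buffer.
  void Send(const RowBatch& batch) {
    if (queue_.empty() && pending_results_ != nullptr) {
      pending_results_->AddRows(batch);  // fast path: no queue copy
      pending_results_ = nullptr;        // the waiting fetch is satisfied
      ++fast_path_batches_;
      return;
    }
    queue_.push_back(batch);  // slow path: the batch is copied into the queue
    ++buffered_batches_;
  }

  // Consumer side: drain one buffered batch if there is one and return true;
  // otherwise register the result set for the producer and return false.
  bool GetNext(QueryResultSet* results) {
    if (!queue_.empty()) {
      results->AddRows(queue_.front());
      queue_.pop_front();
      return true;
    }
    pending_results_ = results;
    return false;
  }

  std::size_t fast_path_batches() const { return fast_path_batches_; }
  std::size_t buffered_batches() const { return buffered_batches_; }

 private:
  std::deque<RowBatch> queue_;
  QueryResultSet* pending_results_ = nullptr;
  std::size_t fast_path_batches_ = 0;
  std::size_t buffered_batches_ = 0;
};
```

If the consumer calls {{GetNext}} before the producer calls {{Send}}, the batch goes straight into the result set and nothing is buffered. If {{Send}} runs first, the batch is queued and the later {{GetNext}} reads it from the queue, which is the ordering the comment above expects to be common because clients wait for rows before fetching.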



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
