Posted to issues@impala.apache.org by "Sahil Takiar (JIRA)" <ji...@apache.org> on 2019/07/31 19:08:00 UTC
[jira] [Created] (IMPALA-8819) BufferedPlanRootSink should handle non-default fetch sizes
Sahil Takiar created IMPALA-8819:
------------------------------------
Summary: BufferedPlanRootSink should handle non-default fetch sizes
Key: IMPALA-8819
URL: https://issues.apache.org/jira/browse/IMPALA-8819
Project: IMPALA
Issue Type: Sub-task
Components: Backend
Reporter: Sahil Takiar
Assignee: Sahil Takiar
As of IMPALA-8780, the {{BufferedPlanRootSink}} returns an error whenever a client sets the fetch size to a value lower than {{BATCH_SIZE}}. The problem is that a {{RowBatch}} read from the queue might contain more rows than the client requested, so the {{BufferedPlanRootSink}} needs to be able to partially read a {{RowBatch}} and remember how far into the batch it has read. Furthermore, {{num_results}} in {{BufferedPlanRootSink::GetNext}} could be lower than {{BATCH_SIZE}} if the query results cache in {{ClientRequestState}} has a cache hit (which only happens if the client cursor is reset).
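The partial-read bookkeeping described above could look roughly like the following. This is a minimal sketch under stated assumptions, not Impala's code: {{FakeRowBatch}}, {{PartialBatchReader}}, {{AddBatch}}, {{Fetch}}, and {{next_row_idx_}} are hypothetical names, rows are plain integers, and the real {{BufferedPlanRootSink::GetNext}} has a different signature and handles synchronization that is omitted here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical stand-in for a RowBatch: a vector of row values.
using FakeRowBatch = std::vector<int>;

// Hypothetical reader that remembers how far into the front batch it has read,
// so a fetch smaller than the batch can be served without losing rows.
class PartialBatchReader {
 public:
  void AddBatch(FakeRowBatch batch) { queue_.push_back(std::move(batch)); }

  // Appends at most num_results rows from the front batch to out.
  // Returns the number of rows appended (0 if the queue is empty).
  std::size_t Fetch(std::size_t num_results, std::vector<int>* out) {
    assert(out != nullptr);
    if (queue_.empty()) return 0;

    const FakeRowBatch& batch = queue_.front();
    const std::size_t remaining = batch.size() - next_row_idx_;
    const std::size_t n = std::min(num_results, remaining);

    const auto first = batch.begin() + static_cast<std::ptrdiff_t>(next_row_idx_);
    out->insert(out->end(), first, first + static_cast<std::ptrdiff_t>(n));
    next_row_idx_ += n;

    // Only drop the batch once every row in it has been handed out.
    if (next_row_idx_ == batch.size()) {
      queue_.pop_front();
      next_row_idx_ = 0;
    }
    return n;
  }

 private:
  std::deque<FakeRowBatch> queue_;
  std::size_t next_row_idx_ = 0;  // index of the next unread row in queue_.front()
};
```

In this sketch, a batch of five rows fetched two at a time is returned across three calls (2, 2, then 1 row), and the batch is removed from the queue only after the last call.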
Another issue is that the {{BufferedPlanRootSink}} can only read up to a single {{RowBatch}} at a time. So if a fetch size larger than {{BATCH_SIZE}} is specified, only {{BATCH_SIZE}} rows will be written to the given {{QueryResultSet}}. This is consistent with the legacy behavior of {{PlanRootSink}} (now {{BlockingPlanRootSink}}), but it is not ideal because clients can then read at most {{BATCH_SIZE}} rows per fetch. A higher fetch size could reduce the number of round trips between the client and the coordinator, which could improve fetch performance (but only if the {{BlockingPlanRootSink}} is capable of filling all the requested rows).
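One way to serve a fetch size larger than a single batch is to keep draining queued batches until the request is satisfied or the queue runs dry. The sketch below illustrates that idea only. {{FakeRowBatch}}, {{FillResultSet}}, and its parameters are hypothetical, rows are plain integers, and whether a real sink should block waiting for more batches instead of returning early is left open.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a RowBatch: a vector of row values.
using FakeRowBatch = std::vector<int>;

// Hypothetical helper: appends up to fetch_size rows to results, reading across
// as many queued batches as needed. *next_row_idx tracks the first unread row
// of the front batch between calls. Returns the number of rows appended.
std::size_t FillResultSet(std::deque<FakeRowBatch>* queue,
                          std::size_t* next_row_idx,
                          std::size_t fetch_size,
                          std::vector<int>* results) {
  assert(queue != nullptr && next_row_idx != nullptr && results != nullptr);
  std::size_t copied = 0;
  while (copied < fetch_size && !queue->empty()) {
    const FakeRowBatch& batch = queue->front();
    const std::size_t remaining = batch.size() - *next_row_idx;
    const std::size_t n = std::min(fetch_size - copied, remaining);

    const auto first = batch.begin() + static_cast<std::ptrdiff_t>(*next_row_idx);
    results->insert(results->end(), first, first + static_cast<std::ptrdiff_t>(n));
    *next_row_idx += n;
    copied += n;

    // Move on to the next batch once the front one is fully consumed.
    if (*next_row_idx == batch.size()) {
      queue->pop_front();
      *next_row_idx = 0;
    }
  }
  return copied;
}
```

With batches of 3, 3, and 1 rows queued, a fetch size of 5 in this sketch consumes the first batch entirely and two rows of the second in a single call, rather than stopping at the first batch boundary.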