Posted to issues-all@impala.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2018/11/07 22:56:00 UTC

[jira] [Commented] (IMPALA-3652) Fix resource transfer in subplans with limits

    [ https://issues.apache.org/jira/browse/IMPALA-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16678915#comment-16678915 ] 

ASF subversion and git services commented on IMPALA-3652:
---------------------------------------------------------

Commit c28bc3e4f3900bfd0b14084ca19e51b36ea4dca7 in impala's branch refs/heads/master from [~twmarshall]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=c28bc3e ]

IMPALA-3652: Fix resource transfer in subplans with limits

Impala assumes that when Reset() is called on an ExecNode, all of the
memory returned from that node by GetNext() has been attached to the
output RowBatch. In a query with a LIMIT on the subplan, such that
some nodes don't reach 'eos', this may not be the case.

The solution is to have Reset() take a RowBatch that any such memory
can be attached to. I examined all ExecNodes for resources being
transferred on 'eos' and added transferring of those resources in
Reset().
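
To make the idea concrete, here is a minimal sketch of the pattern described above. It is not Impala's code: Buffer, RowBatch, ToyNode, and RunIterationWithLimit are hypothetical stand-ins, and the only thing taken from the commit message is that Reset() receives a RowBatch to which not-yet-transferred memory can be attached. The toy node normally hands its buffer to the output batch only at 'eos', so a limit that stops the scan early would leave returned rows pointing at memory the node still owns.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a block of memory that returned rows point into.
struct Buffer { std::vector<int> data; };

// Hypothetical, heavily simplified stand-in for an output row batch.
class RowBatch {
 public:
  // Takes ownership of a buffer that rows in this batch may reference.
  void AttachBuffer(std::unique_ptr<Buffer> buf) { attached_.push_back(std::move(buf)); }
  void AddRow(const int* row) { rows_.push_back(row); }
  std::size_t num_attached() const { return attached_.size(); }
  const std::vector<const int*>& rows() const { return rows_; }

 private:
  std::vector<const int*> rows_;
  std::vector<std::unique_ptr<Buffer>> attached_;
};

// Hypothetical exec node that returns one row per GetNext() call.
class ToyNode {
 public:
  explicit ToyNode(std::vector<int> values) : input_(std::move(values)) {}

  // Rows point into buffer_; ownership moves to the batch only at eos.
  void GetNext(RowBatch* batch, bool* eos) {
    if (buffer_ == nullptr) {
      buffer_ = std::make_unique<Buffer>(Buffer{input_});
      pos_ = 0;
    }
    if (pos_ < buffer_->data.size()) batch->AddRow(&buffer_->data[pos_++]);
    *eos = pos_ == buffer_->data.size();
    if (*eos) batch->AttachBuffer(std::move(buffer_));
  }

  // If the node was stopped before eos, rows already returned still reference
  // buffer_, so hand it to the caller's batch instead of freeing it here.
  void Reset(RowBatch* batch) {
    if (buffer_ != nullptr) batch->AttachBuffer(std::move(buffer_));
    pos_ = 0;
  }

 private:
  std::vector<int> input_;
  std::unique_ptr<Buffer> buffer_;
  std::size_t pos_ = 0;
};

// Simulates one subplan iteration that stops once `limit` rows were produced,
// then resets the node so it can be re-executed.
void RunIterationWithLimit(ToyNode* node, RowBatch* out, std::size_t limit) {
  bool eos = false;
  while (!eos && out->rows().size() < limit) node->GetNext(out, &eos);
  node->Reset(out);
}
```

Calling RunIterationWithLimit() on a four-row node with a limit of 2 leaves the batch holding two rows and the buffer they point into, even though the node never reached 'eos'. If Reset() simply dropped the buffer, those two rows would dangle.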

Testing:
- Added e2e tests that repro the issue for hash and nested loop joins.

Change-Id: I3968a379fcbb5d30fcec304995d3e44933dbbc77
Reviewed-on: http://gerrit.cloudera.org:8080/11852
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Fix resource transfer in subplans with limits
> ---------------------------------------------
>
>                 Key: IMPALA-3652
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3652
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.6.0
>            Reporter: Tim Armstrong
>            Assignee: Thomas Tauber-Marshall
>            Priority: Major
>              Labels: resource-management
>
> There is a tricky corner case in our resource transfer model with subplans and limits. The problem is that the limit in the subplan may mean that the exec node is reset before it has returned its full output. The resource transfer logic generally attaches resources to batches at specific points in the output, e.g. end of partition, end of block, so it's possible that batches returned before the Reset() may reference resources that have not yet been transferred. It's unclear if we test this scenario consistently or if it's always handled correctly.
> One example is this query, reported in IMPALA-5456:
> {code}
> select c_custkey, c_mktsegment, o_orderkey, o_orderdate
> from customer c,
>   (select o1.o_orderkey, o2.o_orderdate
>    from c.c_orders o1, c.c_orders o2
>    where o1.o_orderkey = o2.o_orderkey limit 10) v limit 500;
> {code}


