You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues-all@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2018/04/30 16:20:00 UTC

[jira] [Resolved] (IMPALA-6679) Don't claim ideal reservation in scanner until actually processing scan ranges

     [ https://issues.apache.org/jira/browse/IMPALA-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-6679.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 3.1.0
                   Impala 2.13.0

> Don't claim ideal reservation in scanner until actually processing scan ranges
> ------------------------------------------------------------------------------
>
>                 Key: IMPALA-6679
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6679
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>             Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> One cause of the memory regression in IMPALA-4835 was that scans, particularly Parquet, claimed memory very aggressively in HdfsScanNode::Open() based on the planner's estimated ideal memory. There were two problems here:
> # In many cases the ideal memory increase was not at all needed, because the total amount of data scanned in the Parquet file was actually less than the minimum reservation. We don't know this for sure until the read the Parquet footer and check the column sizes.
> # HdfsScanNode::Open() may happen a long time before the first HdfsScanNode::GetNext(), particularly if the scan node is on the left side of a broadcast join. The scan node may grab a lot of reservation before other plan nodes have started running, eventually resulting in starving a lot of nodes.
> It would make sense to wait until we are actually processing a scan range to increase a scanner thread's reservation from the minimum. There are various more complex things we could do here, but I suspect the simplest possible thing would help a lot. E.g. we could extend this by  releasing reservation after processing each scan range.
> I believe the second effect may be fairly significant when running many queries with high concurrency. In that case the async join build will tend to be disabled, which means that HdfsScanNode::GetNext() on the left child of a broadcast join will only be called after the join build has completed, so overlap between that scans and the nodes on the right side of the join would be minimal (there's not enough synchronisation to guarantee no overlap).
> See https://docs.google.com/document/d/1kR0zfevNNUJom3sH1XmposacVZ-QALan7NSwnR5CkSA/edit# for some more analysis.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org