Posted to issues@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2017/08/05 03:21:00 UTC
[jira] [Resolved] (IMPALA-4674) Port spilling ExecNodes to new buffer pool
[ https://issues.apache.org/jira/browse/IMPALA-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong resolved IMPALA-4674.
-----------------------------------
Resolution: Fixed
Fix Version/s: Impala 2.10.0
IMPALA-4674: Part 2: port backend exec to BufferPool
Always create global BufferPool at startup using 80% of memory and
limit reservations to 80% of query memory (same as BufferedBlockMgr).
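To make the two 80% figures concrete, here is a small arithmetic sketch. It is not Impala code: the function names and the integer rounding are assumptions chosen for illustration, and only the 80% ratio comes from the commit message.

```python
# Toy arithmetic only. All names here are hypothetical, not Impala identifiers.
RESERVATION_PERCENT = 80  # the 80% figure quoted in the commit message

GIB = 1024 ** 3


def buffer_pool_capacity(process_mem_limit: int) -> int:
    """Bytes given to the global buffer pool at startup (assumed rounding: floor)."""
    return process_mem_limit * RESERVATION_PERCENT // 100


def query_reservation_limit(query_mem_limit: int) -> int:
    """Upper bound on the reservations a single query may hold."""
    return query_mem_limit * RESERVATION_PERCENT // 100


pool_bytes = buffer_pool_capacity(10 * GIB)
query_bytes = query_reservation_limit(2 * GIB)
print(pool_bytes, query_bytes)
```

Integer arithmetic is used so the results are exact; a process limit of 10 GiB yields a pool of exactly 8 GiB.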
The query's initial reservation is computed in the planner, claimed
centrally (managed by the InitialReservations class) and distributed
to query operators from there.
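The claim-then-distribute flow described above can be modelled roughly as follows. This is a sketch under stated assumptions: the class and method names are invented for illustration and do not reflect the real interface of InitialReservations, which this message does not show.

```python
class InitialReservationsSketch:
    """Hypothetical model: claim the planner's total once, then hand out shares."""

    def __init__(self, query_reservation_limit: int):
        self.limit = query_reservation_limit
        self.unclaimed = 0  # bytes claimed centrally but not yet given to an operator

    def claim(self, planned_total: int) -> None:
        # The planner-computed total is claimed in one step, up front.
        if planned_total > self.limit:
            raise ValueError("initial reservation exceeds the query's limit")
        self.unclaimed = planned_total

    def hand_out(self, operator_bytes: int) -> int:
        # Each operator takes its share from the central claim.
        if operator_bytes > self.unclaimed:
            raise ValueError("operator asked for more than remains")
        self.unclaimed -= operator_bytes
        return operator_bytes


reservations = InitialReservationsSketch(query_reservation_limit=800)
reservations.claim(600)
sort_share = reservations.hand_out(400)
```

Claiming the whole amount centrally means the query either gets all of its planned reservation or fails before any operator starts.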
min_spillable_buffer_size and default_spillable_buffer_size query
options control the buffer size that the planner selects for
spilling operators.
Port ExecNodes to use BufferPool:
* Each ExecNode has to claim its reservation during Open()
* Port Sorter to use BufferPool.
* Switch from BufferedTupleStream to BufferedTupleStreamV2
* Port HashTable to use BufferPool via a Suballocator.
This also makes PartitionedAggregationNode (PAGG) memory consumption more
efficient (it avoids wasting buffers) and improves the spilling algorithm:
* Allow preaggs to execute with 0 reservation - if streams and hash tables
cannot be allocated, the preagg passes rows through.
* Halve the buffer requirement for spilling aggs - avoid allocating
buffers for aggregated and unaggregated streams simultaneously.
* Rebuild spilled partitions instead of repartitioning (IMPALA-2708)
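The pass-through behaviour in the first bullet can be illustrated with a toy model. This is not the PartitionedAggregationNode implementation: a cap on the number of groups stands in for the memory reservation, and all names are hypothetical.

```python
from collections import defaultdict


def preaggregate(rows, max_groups):
    """Sum values per key while the table has room; otherwise pass rows through.

    max_groups is a stand-in for the reservation: 0 means no hash table at all.
    """
    table = {}
    passed_through = []
    for key, value in rows:
        if key in table:
            table[key] += value
        elif len(table) < max_groups:
            table[key] = value
        else:
            passed_through.append((key, value))  # no room: emit the row as-is
    return list(table.items()) + passed_through


def final_aggregate(partials):
    """Downstream merge step that finishes the aggregation."""
    totals = defaultdict(int)
    for key, value in partials:
        totals[key] += value
    return dict(totals)


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
with_no_memory = final_aggregate(preaggregate(rows, max_groups=0))
with_some_memory = final_aggregate(preaggregate(rows, max_groups=2))
```

Because a later merge step still combines rows by key, the final result is the same whether the pre-aggregation had room for a table or not; only the amount of data reduced early changes.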
TODO in follow-up patches:
* Rename BufferedTupleStreamV2 to BufferedTupleStream
* Implement max_row_size query option.
Testing:
* Updated tests to reflect new memory requirements
Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Reviewed-on: http://gerrit.cloudera.org:8080/5801
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins
> Port spilling ExecNodes to new buffer pool
> ------------------------------------------
>
> Key: IMPALA-4674
> URL: https://issues.apache.org/jira/browse/IMPALA-4674
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Affects Versions: Impala 2.8.0
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Labels: resource-management
> Fix For: Impala 2.10.0
>
>
> Once the buffer pool is functional we need to port the spilling exec nodes to use it and remove BufferedBlockMgr:
> # PartitionedHashJoinNode
> # PartitionedAggregationNode
> # SortNode
> # AnalyticEvalNode