Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2022/12/16 18:09:03 UTC

[kudu-CR] KUDU-3406 memory budgeting for CompactRowSetsOp

Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19368 )


Change subject: KUDU-3406 memory budgeting for CompactRowSetsOp
......................................................................

KUDU-3406 memory budgeting for CompactRowSetsOp

This patch implements memory budgeting for performing rowset merge
compactions (i.e. CompactRowSetsOp maintenance operations).

The idea is to check whether enough memory is left before reaching the
hard memory limit when starting a CompactRowSetsOp.  An estimate for the
amount of memory necessary to perform the operation is based on the
total on-disk size of all deltas in rowsets selected for the merge
compaction and the ratio of memory-to-disk size when loading those
deltas into memory to perform the merge rowset compaction.  If there is
enough memory, then a rowset is considered as an input for merge
compaction, otherwise it's not.  Meanwhile, REDO deltas become UNDO
deltas after major delta compactions run on the rowset, and UNDO deltas
eventually become ancient, so UndoDeltaBlockGCOp drops those.  With
that, the amount of memory required to load a rowset's delta data into
memory shrinks over the long run, and eventually the rowset is back
among the inputs for one of the future runs of the CompactRowSetsOp
maintenance operation.
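
The check described above can be sketched roughly as follows.  This is
an illustration in Python, not the code from the patch: RowSetDeltas,
estimate_compaction_memory() and select_within_budget() are hypothetical
names, and charging each admitted rowset against a shared budget is an
assumption about how the estimates are combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowSetDeltas:
    """Hypothetical stand-in for a rowset's delta statistics."""
    name: str
    deltas_on_disk_bytes: int


def estimate_compaction_memory(rowset: RowSetDeltas,
                               mem_to_disk_ratio: float) -> float:
    """Estimated bytes needed to load the rowset's deltas into memory."""
    return rowset.deltas_on_disk_bytes * mem_to_disk_ratio


def select_within_budget(candidates: List[RowSetDeltas],
                         mem_to_disk_ratio: float,
                         current_usage_bytes: int,
                         hard_limit_bytes: int) -> List[RowSetDeltas]:
    """Keep only rowsets whose estimated memory fits under the hard limit."""
    # Assumption: admitted rowsets share one budget, so each admission
    # reduces the headroom left for the remaining candidates.
    budget = hard_limit_bytes - current_usage_bytes
    selected = []
    for rs in candidates:
        needed = estimate_compaction_memory(rs, mem_to_disk_ratio)
        if needed <= budget:
            selected.append(rs)
            budget -= needed
    return selected
```

A rowset skipped here is not excluded forever: once its deltas shrink,
the same check admits it in a later run.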

Prior to this patch, CompactRowSetsOp could run out of memory because
it tried to allocate too much memory, at least due to the following
factors:
  * many UNDO deltas might accumulate in rowsets selected for the
    compaction operation because of the relatively high setting for the
    --tablet_history_max_age_sec flag (7 days) and a particular workload
    that issues many updates for rows in the same rowset
  * even though it's a merge-like operation by its nature, the current
    implementation of CompactRowSetsOp allocates all the memory
    necessary to load the UNDO deltas at once, and it keeps all the
    preliminary results in memory as well before persisting
    the result data to disk
  * the current implementation of CompactRowSetsOp loads all the UNDO
    deltas from the rowsets selected for compaction regardless of
    whether they are ancient or not; it discards the data sourced from
    the ancient deltas at the very end, before persisting the result data

Ideally, the current implementation of CompactRowSetsOp should be
refactored to merge the deltas in participating rowsets sequentially,
chunk by chunk, persisting the results and allocating memory only for
a small batch of processed deltas, not loading all the deltas at once.
A future patch should take care of that, while this patch provides an
interim approach using memory budgeting on top of the current
CompactRowSetsOp implementation as-is.
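
To illustrate the chunk-by-chunk idea (not the future patch itself),
here is a minimal Python sketch of a streaming merge.  The Delta tuple,
merge_in_chunks() and the flush callback are hypothetical
simplifications; real deltas carry timestamps and column changes that
this sketch ignores.

```python
import heapq
from typing import Callable, Iterable, List, Tuple

# Hypothetical simplification: a delta is (row key, change).
Delta = Tuple[int, str]


def merge_in_chunks(sorted_inputs: List[Iterable[Delta]],
                    flush: Callable[[List[Delta]], None],
                    chunk_size: int) -> None:
    """Merge already-sorted delta streams, flushing every chunk_size items.

    heapq.merge() is lazy, so besides the current chunk only one pending
    item per input stream is held in memory.
    """
    chunk: List[Delta] = []
    for delta in heapq.merge(*sorted_inputs):
        chunk.append(delta)
        if len(chunk) >= chunk_size:
            flush(chunk)   # e.g. persist the merged chunk to disk
            chunk = []     # start a new chunk; the flushed one can be freed
    if chunk:
        flush(chunk)       # flush the final, possibly shorter chunk
```

With this shape, peak memory depends on chunk_size and the number of
inputs rather than on the total size of the deltas.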

The newly introduced behavior is gated by the following two flags:
  * rowset_compaction_memory_estimate_enabled: whether to enable memory
    budgeting for CompactRowSetsOp (default is 'false').
  * rowset_compaction_ancient_delta_threshold_enabled: whether to
    check against the ratio of ancient UNDO deltas across rowsets
    selected for compaction (default is 'true').

In addition, the following two flags allow for tweaking the new
behavior gated by the corresponding flags above:
  * rowset_compaction_delta_memory_factor: the multiplication factor for
    the total size of rowset's deltas to estimate how much memory
    CompactRowSetsOp would consume if operating on those deltas when
    no runtime stats for the compact_rs_mem_usage_to_deltas_size_ratio
    metric are available yet (default is 3.0)
  * rowset_compaction_ancient_delta_max_ratio: the threshold for the
    ratio of the data size in ancient UNDO deltas to the total data size
    of UNDO deltas in the rowsets selected for merge compaction
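
The ratio that the last flag refers to can be computed as in the Python
sketch below.  The function names are hypothetical, and what the
maintenance operation does once the threshold is exceeded is not
spelled out in this message, so the sketch stops at the comparison.

```python
def ancient_undo_ratio(ancient_undo_bytes: int,
                       total_undo_bytes: int) -> float:
    """Share of UNDO delta data that is already ancient."""
    if total_undo_bytes == 0:
        return 0.0  # no UNDO deltas at all, so nothing is ancient
    return ancient_undo_bytes / total_undo_bytes


def exceeds_ancient_threshold(ancient_undo_bytes: int,
                              total_undo_bytes: int,
                              max_ratio: float,
                              check_enabled: bool = True) -> bool:
    """True if the check is enabled and the ratio is above max_ratio.

    check_enabled mirrors rowset_compaction_ancient_delta_threshold_enabled
    (default 'true'); max_ratio mirrors
    rowset_compaction_ancient_delta_max_ratio, whose default is not
    stated here, so callers must supply a value.
    """
    if not check_enabled:
        return False
    return ancient_undo_ratio(ancient_undo_bytes, total_undo_bytes) > max_ratio
```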

To complement the --rowset_compaction_delta_memory_factor flag with more
tablet-specific stats, two new per-tablet metrics have been introduced:
  * compact_rs_mem_usage is a histogram to gather statistics on how much
    memory rowset merge compaction consumed
  * compact_rs_mem_usage_to_deltas_size_ratio is a histogram to track
    the memory-to-disk size ratio for a tablet's rowsets participating
    in merge compaction -- this metric provides the average that is used
    as a more precise factor to estimate the amount of memory a rowset's
    deltas would use when undergoing merge compaction given the total
    on-disk size of the rowset's deltas
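
The interplay between the flag and the metric can be sketched in Python
as below.  memory_factor() is a hypothetical helper, and a plain list of
observed ratios stands in for the histogram; only the fallback to the
flag value of 3.0 and the use of the average come from the description
above.

```python
from typing import Sequence


def memory_factor(observed_ratios: Sequence[float],
                  default_factor: float = 3.0) -> float:
    """Pick the memory-to-disk factor for the memory estimate.

    observed_ratios stands in for the samples recorded in the
    compact_rs_mem_usage_to_deltas_size_ratio histogram; default_factor
    mirrors --rowset_compaction_delta_memory_factor.
    """
    if not observed_ratios:
        # No runtime stats yet: fall back to the flag value.
        return default_factor
    # Otherwise use the average of what earlier compactions consumed.
    return sum(observed_ratios) / len(observed_ratios)
```

For example, memory_factor([]) falls back to the flag value, while
memory_factor([1.0, 2.0]) yields the observed average.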

Change-Id: I15bef59dd5052d4a54d85a3f2759562fdc614c20
---
M src/kudu/tablet/compaction.cc
M src/kudu/tablet/compaction.h
M src/kudu/tablet/compaction_policy-test.cc
M src/kudu/tablet/compaction_policy.cc
M src/kudu/tablet/compaction_policy.h
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_metrics.cc
M src/kudu/tablet/tablet_metrics.h
M src/kudu/tablet/tablet_mm_ops-test.cc
M src/kudu/tablet/tablet_mm_ops.cc
M src/kudu/tablet/tablet_mm_ops.h
21 files changed, 379 insertions(+), 35 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/19368/1
-- 
To view, visit http://gerrit.cloudera.org:8080/19368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I15bef59dd5052d4a54d85a3f2759562fdc614c20
Gerrit-Change-Number: 19368
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <al...@apache.org>

[kudu-CR] KUDU-3406 memory budgeting for CompactRowSetsOp

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has abandoned this change. ( http://gerrit.cloudera.org:8080/19368 )

Change subject: KUDU-3406 memory budgeting for CompactRowSetsOp
......................................................................


Abandoned

Mistakenly posted as a new changelist instead of revving https://gerrit.cloudera.org/#/c/19281/
-- 
To view, visit http://gerrit.cloudera.org:8080/19368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I15bef59dd5052d4a54d85a3f2759562fdc614c20
Gerrit-Change-Number: 19368
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)