Posted to issues@kudu.apache.org by "Will Berkeley (JIRA)" <ji...@apache.org> on 2019/02/20 19:37:00 UTC
[jira] [Created] (KUDU-2704) Rowsets that are much bigger than the target size discourage compactions
Will Berkeley created KUDU-2704:
-----------------------------------
Summary: Rowsets that are much bigger than the target size discourage compactions
Key: KUDU-2704
URL: https://issues.apache.org/jira/browse/KUDU-2704
Project: Kudu
Issue Type: Bug
Affects Versions: 1.9.0
Reporter: Will Berkeley
Assignee: Will Berkeley
In KUDU-2701, I fixed a KUDU-1400-related compaction loop. The size used for compaction counted only the base data and redos, so compacting rowsets that looked small but weren't was effectively a no-op, resulting in a compaction loop. Now, rowset count / KUDU-1400 compactions use the whole rowset size.

While testing something on a table with 279 columns, I noticed that almost all rowsets were being flushed at a size of 80-90MB and that, even though the tablet height was increasing rapidly and was above 20, almost no compactions were happening. Looking into it: when the total size of a rowset is far above the target size, we assign a big negative score to including the rowset in a compaction, since the score is proportional to 1 - size/target size. This problem has always existed; it just got worse because the size now includes more things.
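
To make the arithmetic concrete, here is a minimal sketch of a score term proportional to 1 - size/target size. It is not Kudu's actual compaction policy code: the function name `size_score` and the 32MB target used below are assumptions chosen for illustration only.

```python
def size_score(rowset_size_mb: float, target_size_mb: float) -> float:
    """Illustrative score term proportional to 1 - size/target size.

    Hypothetical helper, not taken from the Kudu source.
    """
    return 1.0 - rowset_size_mb / target_size_mb


# Assumed target rowset size in MB; a placeholder for illustration.
TARGET_MB = 32.0

# A small rowset, one exactly at the target, and one in the 80-90MB
# range described above.
for size_mb in (8.0, 32.0, 85.0):
    score = size_score(size_mb, TARGET_MB)
    print(f"{size_mb:5.1f} MB -> score term {score:+.2f}")
```

Under these assumptions, a rowset at or below the target contributes a non-negative term, while an 85MB rowset contributes a term below -1, so including it makes a candidate compaction look worse the more such rowsets it contains.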
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)