Posted to issues@kudu.apache.org by "Adar Dembo (JIRA)" <ji...@apache.org> on 2017/03/16 22:11:41 UTC
[jira] [Created] (KUDU-1943) log containers should be reusable without first closing in-flight writable blocks
Adar Dembo created KUDU-1943:
--------------------------------
Summary: log containers should be reusable without first closing in-flight writable blocks
Key: KUDU-1943
URL: https://issues.apache.org/jira/browse/KUDU-1943
Project: Kudu
Issue Type: Bug
Components: fs
Affects Versions: 1.3.0
Reporter: Adar Dembo
The log block manager (LBM) has a longstanding issue: a container cannot be reused for a new block until the writable block currently occupying it has been closed. However, we like to delay the close (and sync) of blocks until the very end of a Kudu flush/compact operation, to maximize the time the kernel has to asynchronously flush dirty pages to disk. As a result, the LBM can easily generate about a thousand containers when flushing a very modest tablet of ~30 columns. To be precise, the number of containers equals the flush threshold (1 GB by default) divided by the rowset size (32 MB by default), multiplied by the number of columns in the tablet. Coupled with the LBM's default preallocation buffer size (32 MB), a single tablet flush can cause the tserver's space consumption to skyrocket to roughly 32 GB.
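The arithmetic above can be sketched as a quick back-of-the-envelope calculation. This is only an illustration of the formula in the description: the function and parameter names are hypothetical (they are not Kudu flags or APIs), and it assumes one container per column per rowset, with every container preallocating the full buffer.

```python
# Rough estimate of container count and preallocated space for one flush.
# All names here are illustrative; none of them are Kudu flags or APIs.

def estimate_containers(flush_threshold_mb: int, rowset_size_mb: int,
                        num_columns: int) -> int:
    """Containers = (flush threshold / rowset size) * number of columns."""
    rowsets = flush_threshold_mb // rowset_size_mb
    return rowsets * num_columns


def estimate_preallocated_mb(containers: int, prealloc_mb: int) -> int:
    """Assumes every container preallocates the full buffer."""
    return containers * prealloc_mb


if __name__ == "__main__":
    # Defaults quoted in the description: 1 GB flush threshold,
    # 32 MB rowsets, 32 MB preallocation, and a ~30 column tablet.
    containers = estimate_containers(1024, 32, 30)
    space_mb = estimate_preallocated_mb(containers, 32)
    print(f"{containers} containers, {space_mb / 1024:.0f} GB preallocated")
```

With the defaults quoted above and exactly 30 columns, this gives 960 containers and 30 GB, which is in line with the "thousand containers" and 32 GB figures.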
In and of itself this isn't fatal; the tserver will make use of this space over time. But it's a pretty bad first impression for a novice who is trying to calculate just how much disk space Kudu uses, and it means Kudu's disk space consumption is very "bursty" instead of linear.