Posted to commits@hbase.apache.org by vj...@apache.org on 2020/06/01 08:42:44 UTC

[hbase] branch branch-1 updated: HBASE-24455 Correct the doc of "On the number of column families" (#1799)

This is an automated email from the ASF dual-hosted git repository.

vjasani pushed a commit to branch branch-1
in repository https://gitbox.apache.org/repos/asf/hbase.git


The following commit(s) were added to refs/heads/branch-1 by this push:
     new 4096925  HBASE-24455 Correct the doc of "On the number of column families" (#1799)
4096925 is described below

commit 4096925b983d7d2ed2f69e95cda160fc73d76c80
Author: bsglz <18...@qq.com>
AuthorDate: Mon Jun 1 16:38:07 2020 +0800

    HBASE-24455 Correct the doc of "On the number of column families" (#1799)
    
    Signed-off-by: Wellington Ramos Chevreuil <wc...@apache.org>
    Signed-off-by: Viraj Jasani <vj...@apache.org>
---
 src/main/asciidoc/_chapters/schema_design.adoc | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/main/asciidoc/_chapters/schema_design.adoc b/src/main/asciidoc/_chapters/schema_design.adoc
index 9319c65..3031397 100644
--- a/src/main/asciidoc/_chapters/schema_design.adoc
+++ b/src/main/asciidoc/_chapters/schema_design.adoc
@@ -68,8 +68,9 @@ See <<store,store>> for more information on StoreFiles.
 ==  On the number of column families
 
 HBase currently does not do well with anything above two or three column families so keep the number of column families in your schema low.
-Currently, flushing and compactions are done on a per Region basis so if one column family is carrying the bulk of the data bringing on flushes, the adjacent families will also be flushed even though the amount of data they carry is small.
-When many column families exist the flushing and compaction interaction can make for a bunch of needless i/o (To be addressed by changing flushing and compaction to work on a per column family basis). For more information on compactions, see <<compaction>>.
+Currently, flushing is done on a per Region basis so if one column family is carrying the bulk of the data bringing on flushes, the adjacent families will also be flushed even though the amount of data they carry is small.
+When many column families exist the flushing interaction can make for a bunch of needless i/o (To be addressed by changing flushing to work on a per column family basis).
+In addition, compactions triggered at table/region level will happen per store too.
 
 Try to make do with one column family if you can in your schemas.
 Only introduce a second and third column family in the case where data access is usually column scoped; i.e.