Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2017/03/06 23:33:33 UTC

[jira] [Commented] (HBASE-15248) BLOCKSIZE 4k should result in 4096 bytes on disk; i.e. fit inside a BucketCache 'block' of 4k

    [ https://issues.apache.org/jira/browse/HBASE-15248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898383#comment-15898383 ] 

stack commented on HBASE-15248:
-------------------------------

Added this note to BLOCKSIZE:


--- a/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
+++ b/hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
@@ -103,7 +103,10 @@ public class HColumnDescriptor implements Comparable<HColumnDescriptor> {
   /**
    * Size of storefile/hfile 'blocks'.  Default is {@link #DEFAULT_BLOCKSIZE}.
    * Use smaller block sizes for faster random-access at expense of larger
-   * indices (more memory consumption).
+   * indices (more memory consumption). Note that this is a soft limit and that
+   * blocks have overhead (metadata, CRCs) so blocks will tend to be the size
+   * specified here and then some; i.e. don't expect that setting BLOCKSIZE=4k
+   * means hbase data will align with an SSD's 4k page accesses (TODO).
    */
   public static final String BLOCKSIZE = "BLOCKSIZE";


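For concreteness, a minimal sketch of asking for 4k blocks through the client API (the table and family names are placeholders, not from this issue); per the note above, expect the on-disk blocks to come out somewhat over 4096 bytes once headers and checksums are added:

import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;

public class BlocksizeExample {
  public static void main(String[] args) {
    // Ask for 4k hfile blocks on family 'f' of table 't'. BLOCKSIZE is a
    // soft limit: on-disk blocks carry header/checksum overhead on top of
    // the 4096 bytes of payload requested here.
    HTableDescriptor table = new HTableDescriptor(TableName.valueOf("t"));
    HColumnDescriptor family = new HColumnDescriptor("f");
    family.setBlocksize(4 * 1024);
    table.addFamily(family);
    System.out.println(family.getBlocksize()); // prints 4096
  }
}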

> BLOCKSIZE 4k should result in 4096 bytes on disk; i.e. fit inside a BucketCache 'block' of 4k
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-15248
>                 URL: https://issues.apache.org/jira/browse/HBASE-15248
>             Project: HBase
>          Issue Type: Sub-task
>          Components: BucketCache
>            Reporter: stack
>
> Chatting w/ a gentleman named Daniel Pol who is messing w/ bucketcache: he wants blocks to be the size specified in the configuration and no bigger. His hardware setup fetches pages of 4k, so a block that has 4k of payload but then also carries its own header and the header of the next block (which helps figure what's next when scanning) ends up being 4203 bytes or so, and this then translates into two seeks per block fetch (see the back-of-envelope sketch below).
> This issue is about what it would take to stay inside our configured size boundary when writing out blocks.
> If that is not possible, give back a better signal on what to do so you can fit inside a particular size constraint.
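A back-of-envelope sketch of why a 4k BLOCKSIZE overflows a 4k page (the 33-byte header matches HConstants.HFILEBLOCK_HEADER_SIZE in current code, but the checksum figure is an assumption, and the total does not try to reproduce the exact 4203 bytes reported above):

public class BlockOverheadSketch {
  public static void main(String[] args) {
    // Illustrative numbers: header size tracks HConstants.HFILEBLOCK_HEADER_SIZE;
    // the checksum byte count is assumed (one CRC chunk for a small block).
    final int payload = 4 * 1024; // what BLOCKSIZE=4k asks for
    final int header = 33;        // this block's on-disk header
    final int nextHeader = 33;    // next block's header, read ahead for scans
    final int checksums = 4;      // assumed CRC bytes
    int onDisk = payload + header + nextHeader + checksums;
    int pages = (onDisk + 4095) / 4096; // 4k device pages touched per fetch
    // Prints "4166 bytes -> 2 pages": anything over 4096 costs a second read.
    System.out.println(onDisk + " bytes -> " + pages + " pages");
  }
}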


