Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2010/09/14 08:38:32 UTC

[jira] Updated: (HBASE-2899) hfile.min.blocksize.size ignored/documentation wrong

     [ https://issues.apache.org/jira/browse/HBASE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-2899:
-------------------------

    Attachment: 2899.txt

Small patch to address this issue. (Not on Review Board because I didn't make the patch with git, so I'm having issues uploading.)

Renames hfile.min.blocksize.size to hbase.mapreduce.hfileoutputformat.blocksize.
Blocksize is normally a column family attribute, set using the BLOCKSIZE key on
an HColumnDescriptor.  The above configuration is for the mapreduce output file
context, where there is no table schema available.
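
As a rough illustration of how a job-side lookup of the renamed key could behave, here is a minimal sketch. It uses java.util.Properties as a stand-in for the Hadoop Configuration object, and the class name, the helper resolveBlockSize, and the 64 KB fallback are assumptions made for this example, not code from the patch.

```java
import java.util.Properties;

public class OutputFormatBlockSize {
    // Key name as given in the patch description.
    static final String BLOCKSIZE_KEY = "hbase.mapreduce.hfileoutputformat.blocksize";

    // Assumed fallback of 64 KB, matching the 65536 quoted in the issue.
    static final int DEFAULT_BLOCKSIZE = 64 * 1024;

    // Hypothetical helper: use the configured value if present, else the default.
    static int resolveBlockSize(Properties conf) {
        String raw = conf.getProperty(BLOCKSIZE_KEY);
        return raw == null ? DEFAULT_BLOCKSIZE : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Key not set: falls back to the default.
        System.out.println(resolveBlockSize(conf));

        // Key set by the job: the configured value wins.
        conf.setProperty(BLOCKSIZE_KEY, "16384");
        System.out.println(resolveBlockSize(conf));
    }
}
```

The point of the sketch is only that this key matters where no column family schema exists to supply a BLOCKSIZE.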

I did not move HFile.DEFAULT_BLOCKSIZE to HConstants.  HFile is trying to
minimize its dependency on the backing hbase, so it doesn't want to depend on
Configuration or HConstants (though I notice the main method has Configuration
and HConstants). It wants configuration passed in on construction (which is
how we set hfile blocksize: by passing what's set in HColumnDescriptor
into the hfile constructor).

The default BLOCKSIZE is set to HFile.DEFAULT_BLOCKSIZE.
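
To make the pass-it-in-on-construction idea concrete, here is a minimal sketch. FamilySchema and BlockWriter are hypothetical stand-ins for the column family descriptor and the file writer; they are not the real HBase classes, and the 64 KB default is assumed.

```java
public class BlockSizeInjection {
    // Assumed 64 KB default, standing in for HFile.DEFAULT_BLOCKSIZE.
    static final int DEFAULT_BLOCKSIZE = 64 * 1024;

    // Hypothetical stand-in for a column family descriptor carrying BLOCKSIZE.
    static final class FamilySchema {
        private final int blockSize;

        // No explicit value: the schema falls back to the file-level default.
        FamilySchema() {
            this(DEFAULT_BLOCKSIZE);
        }

        FamilySchema(int blockSize) {
            this.blockSize = blockSize;
        }

        int getBlockSize() {
            return blockSize;
        }
    }

    // Hypothetical stand-in for the file writer. It never looks anything up
    // in a configuration object; the caller hands it the block size.
    static final class BlockWriter {
        private final int blockSize;

        BlockWriter(int blockSize) {
            if (blockSize <= 0) {
                throw new IllegalArgumentException("blockSize must be positive");
            }
            this.blockSize = blockSize;
        }

        int getBlockSize() {
            return blockSize;
        }
    }

    // The caller reads the schema and passes the value into the constructor.
    static BlockWriter openWriter(FamilySchema family) {
        return new BlockWriter(family.getBlockSize());
    }

    public static void main(String[] args) {
        System.out.println(openWriter(new FamilySchema()).getBlockSize());
        System.out.println(openWriter(new FamilySchema(8192)).getBlockSize());
    }
}
```

Keeping the lookup in the caller means the writer needs no dependency on Configuration or HConstants.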

So, I think, regarding Karthik's comment: the blocksize is HColumnDescriptor#BLOCKSIZE,
never HBaseConfiguration.get("hfile.min.blocksize.size") (as of this patch), and
yes, the default is what's in HFile.DEFAULT_BLOCKSIZE.

> hfile.min.blocksize.size ignored/documentation wrong
> ----------------------------------------------------
>
>                 Key: HBASE-2899
>                 URL: https://issues.apache.org/jira/browse/HBASE-2899
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Francke
>            Priority: Trivial
>         Attachments: 2899.txt
>
>
> There is a property in hbase-default.xml called {{hfile.min.blocksize.size}} set to {{65536}}.
> The description says: Minimum store file block size.  The smaller you make this, the  bigger your index and the less you fetch on a random-access.  Set size down  if you have small cells and want faster random-access of individual cells.
> This property is only used in the HFileOutputFormat and nowhere else. So we should at least change the description to something more meaningful.
> The other option I see would be: HFile now has a DEFAULT_BLOCKSIZE field which could be moved to HConstants and HFile could somehow read the {{hfile.min.blocksize.size}} from the Configuration or use HConstants.DEFAULT_BLOCKSIZE if it's not defined. I believe this is what's happening to the other config variables?
