Posted to dev@hbase.apache.org by "Ishan Chhabra (JIRA)" <ji...@apache.org> on 2014/01/12 08:59:50 UTC
[jira] [Created] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
Ishan Chhabra created HBASE-10323:
-------------------------------------
Summary: Auto detect data block encoding in HFileOutputFormat
Key: HBASE-10323
URL: https://issues.apache.org/jira/browse/HBASE-10323
Project: HBase
Issue Type: Improvement
Reporter: Ishan Chhabra
Assignee: Ishan Chhabra
Currently, the data block encoding of the table has to be specified explicitly through the config parameter "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulk load. This option is easily missed, is not documented, and works differently from compression, block size and bloom filter type, which are auto detected.
The solution would be to add support for auto detecting the data block encoding, as is already done for the other parameters.
The current patch does the following:
1. Automatically detects datablock encoding in HFileOutputFormat.
2. Keeps the legacy option of manually specifying the data block encoding
as a way to override auto detection.
3. Moves string conf parsing to the start of the program so that it fails
fast at startup instead of failing during record writes. It also
makes the internals of the program type safe.
4. Adds missing doc strings and unit tests for the code serializing and
deserializing config parameters for bloom filter type, block size and
data block encoding.
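
To make points 2 to 4 concrete, here is a minimal, self-contained sketch of the idea, not the code from the patch. The class name EncodingConfSketch, the Encoding enum (a stand-in for HBase's DataBlockEncoding), the "family=VALUE&family=VALUE" string layout and the three method names are all assumptions made for illustration. It shows per-family settings serialized into one config string, parsed eagerly into typed values so that a bad value fails before any record is written, and an explicit setting taking precedence over the detected one.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingConfSketch {

    // Hypothetical stand-in for HBase's DataBlockEncoding enum.
    public enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    // Serialize family -> encoding as "family=ENCODING&family=ENCODING".
    // Both parts are URL-encoded so '=' and '&' inside a family name are safe.
    // The layout is an assumption for this sketch.
    public static String serialize(Map<String, Encoding> perFamily) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Encoding> e : perFamily.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(urlEncode(e.getKey()))
              .append('=')
              .append(urlEncode(e.getValue().name()));
        }
        return sb.toString();
    }

    // Parse the whole string up front into typed values. Enum.valueOf throws
    // IllegalArgumentException for an unknown name, so a bad config value
    // surfaces at startup rather than during record writes.
    public static Map<String, Encoding> deserialize(String confValue) {
        Map<String, Encoding> result = new LinkedHashMap<>();
        if (confValue == null || confValue.isEmpty()) {
            return result;
        }
        for (String pair : confValue.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Malformed entry: " + pair);
            }
            result.put(urlDecode(kv[0]), Encoding.valueOf(urlDecode(kv[1])));
        }
        return result;
    }

    // An explicit override (the legacy option) wins; otherwise use the
    // detected per-family value, falling back to NONE if nothing is known.
    public static Encoding resolve(String family, Encoding override,
                                   Map<String, Encoding> detected) {
        if (override != null) {
            return override;
        }
        Encoding e = detected.get(family);
        return e != null ? e : Encoding.NONE;
    }

    private static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new AssertionError(ex); // UTF-8 is always available
        }
    }

    private static String urlDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new AssertionError(ex); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        Map<String, Encoding> detected = new LinkedHashMap<>();
        detected.put("info", Encoding.FAST_DIFF);
        detected.put("meta", Encoding.PREFIX);

        String confValue = serialize(detected);
        System.out.println(confValue);

        Map<String, Encoding> parsed = deserialize(confValue);
        System.out.println(resolve("info", null, parsed));
        System.out.println(resolve("info", Encoding.DIFF, parsed));
    }
}
```

Parsing into an enum once, at job setup, is what gives both the fail-fast behaviour and the type safety mentioned in point 3: the record writer only ever sees Encoding values, never raw strings.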
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)