Posted to issues@hbase.apache.org by "Phabricator (Updated) (JIRA)" <ji...@apache.org> on 2012/03/22 00:44:23 UTC
[jira] [Updated] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool
[ https://issues.apache.org/jira/browse/HBASE-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Phabricator updated HBASE-5469:
-------------------------------
Attachment: D2409.1.patch
mbautin requested code review of "[jira] [HBASE-5469] Add baseline compression efficiency to DataBlockEncodingTool".
Reviewers: JIRA, dhruba, tedyu, stack
DataBlockEncodingTool currently does not report baseline compression
efficiency, i.e. the result of applying a Hadoop compression codec to unencoded
data. For example, if we are using LZO to compress blocks, we would like the
report to include the following columns (possibly as percentages of raw data
size):
Baseline K+V in block cache | Baseline K+V on disk (LZO-compressed) | K+V
DataBlockEncoded in block cache | K+V DataBlockEncoded + LZO-compressed (on
disk)
Background: we never store compressed blocks in cache, but we always store
encoded data blocks in cache if data block encoding is enabled for the column
family.
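To make the four proposed columns concrete, here is a minimal sketch of how each could be computed as a percentage of raw size. It is not the code in D2409: the class and method names are hypothetical, java.util.zip.Deflater stands in for LZO (which is not part of the JDK), and the "encoded" bytes are a toy prefix-stripped buffer rather than the output of a real DataBlockEncoder.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical illustration only; not part of HBase or of the attached patch.
class BaselineCompressionSketch {

  /** Size in bytes of data after DEFLATE compression (assumed stand-in for LZO). */
  static int compressedSize(byte[] data) {
    Deflater deflater = new Deflater();
    deflater.setInput(data);
    deflater.finish();
    byte[] chunk = new byte[4096];
    int total = 0;
    // Drain the compressor; only the byte count matters here, not the output.
    while (!deflater.finished()) {
      total += deflater.deflate(chunk);
    }
    deflater.end();
    return total;
  }

  /** Expresses size as a percentage of rawSize. */
  static double percentOfRaw(long size, long rawSize) {
    return 100.0 * size / rawSize;
  }

  /**
   * One report row with the four columns from the description:
   * baseline in cache | baseline on disk | encoded in cache | encoded on disk.
   * "In cache" sizes are uncompressed; "on disk" sizes are compressed.
   */
  static String reportRow(byte[] raw, byte[] encoded) {
    long rawSize = raw.length;
    return String.format("%.1f%% | %.1f%% | %.1f%% | %.1f%%",
        percentOfRaw(rawSize, rawSize),
        percentOfRaw(compressedSize(raw), rawSize),
        percentOfRaw(encoded.length, rawSize),
        percentOfRaw(compressedSize(encoded), rawSize));
  }

  public static void main(String[] args) {
    StringBuilder raw = new StringBuilder();
    StringBuilder encoded = new StringBuilder();
    for (int i = 0; i < 1000; i++) {
      String suffix = String.format("%04d/value-%d;", i, i % 7);
      raw.append("row-cf:qual-").append(suffix);
      // Toy "encoding": drop the prefix shared by every key.
      encoded.append(suffix);
    }
    System.out.println(reportRow(
        raw.toString().getBytes(StandardCharsets.UTF_8),
        encoded.toString().getBytes(StandardCharsets.UTF_8)));
  }
}
```

The second column is the baseline this issue asks for: without it, the encoded-and-compressed figure in the fourth column has nothing to be compared against.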
TEST PLAN
* Run unit tests.
* Run DataBlockEncodingTool on a variety of real-world HFiles.
REVISION DETAIL
https://reviews.facebook.net/D2409
AFFECTED FILES
src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
> Add baseline compression efficiency to DataBlockEncodingTool
> ------------------------------------------------------------
>
> Key: HBASE-5469
> URL: https://issues.apache.org/jira/browse/HBASE-5469
> Project: HBase
> Issue Type: Improvement
> Reporter: Mikhail Bautin
> Assignee: Mikhail Bautin
> Priority: Minor
> Attachments: D2409.1.patch