Posted to issues@hbase.apache.org by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org> on 2012/02/01 03:30:57 UTC

[jira] [Commented] (HBASE-4608) HLog Compression

    [ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197521#comment-13197521 ] 

jiraposter@reviews.apache.org commented on HBASE-4608:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2740/#review4732
-----------------------------------------------------------


Only got about halfway through. Will continue to look soon. Overall looking pretty good!


src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java
<https://reviews.apache.org/r/2740/#comment10459>

    I'd rename this class to KeyValueCompression or even KVCompression, and rename readFields to just "read", since this is just a set of utility functions, not actually an instance of a compressed KeyValue.



src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java
<https://reviews.apache.org/r/2740/#comment10460>

    Rather than using keyVal.getRow(), keyVal.getFamily(), and keyVal.getQualifier(), you should use the versions of those functions that just return offsets and lengths (e.g. getKeyOffset, getKeyLength). Then expand the writeCompressed API to take (byte[] buf, int off, int len). Otherwise you're making needless copies and garbage here.
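
A minimal sketch of the offset/length style suggested above. The class name OffsetWriteSketch, the writeSlice helper, and the 2-byte length prefix are all hypothetical, not the patch's actual writeCompressed. The point is only that a slice can be written straight from the backing array without copying it into a fresh byte[] first.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class OffsetWriteSketch {
    // Hypothetical stand-in for a writeCompressed(byte[] buf, int off, int len):
    // writes a 2-byte length, then the slice itself, reading directly from
    // the backing array so no intermediate byte[] is allocated.
    static void writeSlice(DataOutputStream out, byte[] buf, int off, int len)
            throws IOException {
        out.writeShort(len);
        out.write(buf, off, len);
    }

    // Writes three slices of one backing array, the way row, family and
    // qualifier could be written from a single KeyValue buffer.
    public static byte[] demo() throws IOException {
        byte[] backing = "row1cfqual".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        writeSlice(out, backing, 0, 4); // "row1"
        writeSlice(out, backing, 4, 2); // "cf"
        writeSlice(out, backing, 6, 4); // "qual"
        out.flush();
        return bytes.toByteArray();
    }
}
```

With a real KeyValue, the arguments would presumably come from its backing buffer plus the matching offset and length getters, rather than from the hard-coded offsets used here.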



src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressionContext.java
<https://reviews.apache.org/r/2740/#comment10461>

    Since this is so simple, I'd make it a static inner class of KVCompression above.



src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
<https://reviews.apache.org/r/2740/#comment10462>

    I think we can merge this with the other class that just has static methods as well.



src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
<https://reviews.apache.org/r/2740/#comment10463>

    This function requires that the whole log's data fit in RAM, which is not a great assumption.
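
One way to avoid that assumption is to process the log one record at a time. The sketch below is illustrative only: the class StreamingCopySketch, the method copyRecords, and the 4-byte length-prefixed record layout are assumptions, not the HLog format. It holds at most one record in memory at any point.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

public class StreamingCopySketch {
    // Copies length-prefixed records from in to out one at a time and
    // returns how many records were copied. A real tool would compress or
    // decompress each record where this sketch simply rewrites it.
    static int copyRecords(DataInputStream in, DataOutputStream out)
            throws IOException {
        int count = 0;
        while (true) {
            int len;
            try {
                len = in.readInt();
            } catch (EOFException endOfInput) {
                break; // no further length prefix: treat as end of input
            }
            byte[] record = new byte[len];
            in.readFully(record); // throws EOFException on a truncated record
            out.writeInt(len);
            out.write(record);
            count++;
        }
        out.flush();
        return count;
    }
}
```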



src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
<https://reviews.apache.org/r/2740/#comment10464>

    Why is this split into two if/elses? It looks like the top clauses can be combined, as can the bottom clauses.



src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
<https://reviews.apache.org/r/2740/#comment10465>

    Switch the order of "in" and "offset" here.

    Perhaps it would be clearer to name this "uncompressIntoArray"?



src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
<https://reviews.apache.org/r/2740/#comment10467>

    Worth a comment here to explain that the "status" byte actually holds the high-order byte of the dictionary entry when the value is in the dictionary.
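
To make the dual role of that byte concrete, here is a sketch under stated assumptions: the dictionary entry is identified by a non-negative 2-byte index, and -1 is the marker for "not in dictionary". The class StatusByteSketch, the marker value, and the literal layout are hypothetical and not taken from the patch.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class StatusByteSketch {
    // Assumed marker value. It can never collide with the high-order byte
    // of a non-negative short index, which is always in the range 0..0x7F.
    static final byte NOT_IN_DICTIONARY = -1;

    // Dictionary hit: the first byte written is both the status byte and
    // the high-order byte of the index; the second byte is the low-order byte.
    static void writeIndex(DataOutputStream out, short idx) throws IOException {
        if (idx < 0) {
            throw new IllegalArgumentException("index must be non-negative");
        }
        out.writeByte(idx >> 8);
        out.writeByte(idx);
    }

    // Dictionary miss: marker byte, 2-byte length, then the raw bytes.
    static void writeLiteral(DataOutputStream out, byte[] buf, int off, int len)
            throws IOException {
        out.writeByte(NOT_IN_DICTIONARY);
        out.writeShort(len);
        out.write(buf, off, len);
    }

    // Returns the dictionary index, or -1 if a literal follows in the stream.
    static int readStatus(DataInputStream in) throws IOException {
        byte status = in.readByte();
        if (status == NOT_IN_DICTIONARY) {
            return -1;
        }
        int low = in.readUnsignedByte();
        return ((status & 0xff) << 8) | low;
    }
}
```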



src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java
<https://reviews.apache.org/r/2740/#comment10466>

    *un*compressed value, right?


- Todd


On 2012-01-24 22:29:18, Li Pi wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2740/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-01-24 22:29:18)
bq.  
bq.  
bq.  Review request for hbase, Eli Collins and Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  HLog compression. Has unit tests and a command line tool for compressing/decompressing.
bq.  
bq.  
bq.  This addresses bug HBase-4608.
bq.      https://issues.apache.org/jira/browse/HBase-4608
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 8370ef8 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressionContext.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java e46a7a0 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java f067221 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/LRUDictionary.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java d9cd6de 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java cbef70f 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java e1117ef 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLRUDictionary.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java 59910bf 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplayCompressed.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2740/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Li
bq.  
bq.


                
> HLog Compression
> ----------------
>
>                 Key: HBASE-4608
>                 URL: https://issues.apache.org/jira/browse/HBASE-4608
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Li Pi
>            Assignee: Li Pi
>         Attachments: 4608v1.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog.
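
The dictionary idea in the description can be illustrated with a simplified sketch. The class DictionarySketch is hypothetical: it uses String keys and never evicts, whereas the patch lists an LRUDictionary whose details are not shown here. The first time a value is seen it is added and would be written out in full; later occurrences can be written as a short index instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DictionarySketch {
    private final Map<String, Short> toIndex = new HashMap<>();
    private final List<String> fromIndex = new ArrayList<>();

    // Returns the existing index for value, or -1 after adding it as a new
    // entry (the caller would then write the value out in full).
    short findOrAdd(String value) {
        Short idx = toIndex.get(value);
        if (idx != null) {
            return idx;
        }
        toIndex.put(value, (short) fromIndex.size());
        fromIndex.add(value);
        return -1;
    }

    // Reader side: resolves an index back to the original value.
    String get(short idx) {
        return fromIndex.get(idx);
    }
}
```

The reader has to build its dictionary in the same order as the writer, so that an index means the same thing on both sides.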
