Posted to issues@hbase.apache.org by "ryan rawson (Issue Comment Edited) (JIRA)" <ji...@apache.org> on 2012/02/09 08:18:59 UTC

[jira] [Issue Comment Edited] (HBASE-5355) Compressed RPC's for HBase

    [ https://issues.apache.org/jira/browse/HBASE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204318#comment-13204318 ] 

ryan rawson edited comment on HBASE-5355 at 2/9/12 7:17 AM:
------------------------------------------------------------

Oh, btw, on the custom compression: it wasn't so much compression as a new form of serialization that did not duplicate fields. It was similar to your constant pool. The cost of figuring out what those constants are, then re-serializing the data, ended up being net-neutral at best, and took slightly more time at worst.

Determining the contents of the constant pool could have been the expensive bit. Zero-copy with protobuf would be great, but if the cost is in assembling the constant pool, then it might not be as beneficial as one would like.
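
For illustration only, here is a minimal Java sketch of the kind of de-duplicating serialization described above. It is not the code from that experiment: the class name ConstantPoolSketch, its methods, and the wire layout (a pool of unique strings followed by int indexes) are all assumptions made up for this example. It shows where the extra work sits: one pass with a hash lookup per field to build the pool, then a second pass to write the pool and the references.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and wire layout are assumptions, not HBase code.
class ConstantPoolSketch {

    // Baseline: write every field in full, duplicates included.
    static byte[] encodeNaive(List<String> fields) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(fields.size());
            for (String field : fields) {
                out.writeUTF(field);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // De-duplicating form: unique values go into a pool, fields become indexes.
    static byte[] encodePooled(List<String> fields) {
        // Pass 1: assemble the constant pool (one hash lookup per field).
        Map<String, Integer> pool = new LinkedHashMap<>();
        int[] refs = new int[fields.size()];
        for (int i = 0; i < refs.length; i++) {
            String field = fields.get(i);
            Integer index = pool.get(field);
            if (index == null) {
                index = pool.size();
                pool.put(field, index);
            }
            refs[i] = index;
        }
        // Pass 2: re-serialize as pool entries followed by references.
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(pool.size());
            for (String constant : pool.keySet()) {
                out.writeUTF(constant);
            }
            out.writeInt(refs.length);
            for (int ref : refs) {
                out.writeInt(ref);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Inverse of encodePooled: read the pool, then resolve each reference.
    static List<String> decodePooled(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            String[] pool = new String[in.readInt()];
            for (int i = 0; i < pool.length; i++) {
                pool[i] = in.readUTF();
            }
            int count = in.readInt();
            List<String> fields = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                fields.add(pool[in.readInt()]);
            }
            return fields;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up input with heavy repetition, e.g. repeated column names.
        List<String> fields = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            fields.add("cf:column_family_" + (i % 4));
        }
        System.out.println("naive bytes:  " + encodeNaive(fields).length);
        System.out.println("pooled bytes: " + encodePooled(fields).length);
    }
}
```

The pooled form is smaller on repetitive input, but the sketch says nothing about CPU time. Whether pass 1 and the extra write in pass 2 cost more than the network time saved is exactly the open question above, and would need to be measured on real RPC payloads.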
                
      was (Author: ryanobjc):
    oh btw, on the custom-compression, it wasn't so much compression as a new form of serialization that did not duplicate fields.  Figuring out the de-duplication, then writing it out, ended up being as expensive as, or more expensive than, just sending the data. Just the cost of re-copying the data appeared to be too much. And even with the best compression algorithms, you'd still have to incur the data copying cost at least 1x (even if it's over previously used memory).
                  
> Compressed RPC's for HBase
> --------------------------
>
>                 Key: HBASE-5355
>                 URL: https://issues.apache.org/jira/browse/HBASE-5355
>             Project: HBase
>          Issue Type: Improvement
>          Components: ipc
>    Affects Versions: 0.89.20100924
>            Reporter: Karthik Ranganathan
>            Assignee: Karthik Ranganathan
>
> Some applications need the ability to do large batched writes and reads from a remote MR cluster. These eventually get bottlenecked on the network. The results are also sometimes quite compressible.
> The aim here is to add the ability to do compressed calls to the server on both the send and receive paths.
