Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2010/11/14 18:13:14 UTC

[jira] Commented: (CASSANDRA-1735) Using MessagePack for reducing data size

    [ https://issues.apache.org/jira/browse/CASSANDRA-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931844#action_12931844 ] 

Jonathan Ellis commented on CASSANDRA-1735:
-------------------------------------------

Thanks, this is exciting!

What kind of performance improvement do you see with this patch?

> Using MessagePack for reducing data size
> ----------------------------------------
>
>                 Key: CASSANDRA-1735
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1735
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API
>    Affects Versions: 0.7 beta 3
>         Environment: Fedora11,  JDK1.6.0_20
>            Reporter: Muga Nishizawa
>         Attachments: 0001-implement-a-Cassandra-RPC-part-with-MessagePack.patch, dependency_libs.zip
>
>
> To improve Cassandra performance, I implemented Cassandra's RPC layer with MessagePack.  The implementation details are in the attached patch, which works on Cassandra 0.7.0-beta3.  Please check it.  
> MessagePack is a cross-language object serialization library, like Thrift and Protocol Buffers, but it is much faster, smaller, and easier to implement.  MessagePack reduces serialization cost and data size, both on the network and on disk.  
> MessagePack resources:
>     * website: http://msgpack.org/
>         This website compares MessagePack, Thrift and JSON.  
>     * design details: http://redmine.msgpack.org/projects/msgpack/wiki/FormatDesign
>     * source code: https://github.com/msgpack/msgpack/
> The performance of the data serialization library is one of the most important issues when developing a distributed database in Java.  If serialization is slow, it significantly reduces overall database performance, and Java's GC also runs more often.  Cassandra has this problem as well.  
> To reduce the size of the data sent over the network between a client and Cassandra, I prototyped Cassandra's RPC layer with MessagePack and MessagePack-RPC.  The implementation is very simple: MessagePack-RPC reuses the existing Thrift-based CassandraServer (org.apache.cassandra.thrift.CassandraServer) while adopting MessagePack's communication protocol and data serialization.  
> The major features of MessagePack-RPC are:
>     * Asynchronous RPC
>     * Parallel Pipelining
>     * Connection pooling
>     * Delayed return
>     * Event-driven I/O
>     * more details: http://redmine.msgpack.org/projects/msgpack/wiki/RPCDesign
>     * source code: https://github.com/msgpack/msgpack-rpc/
> The attached patch includes a ring cache program for MessagePack and a test program for it.  
> You can use them to check the behavior of the Cassandra RPC with MessagePack.  
> Thanks in advance, 
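
As a rough illustration of the data-size claim in the quoted description, the following Python sketch hand-encodes one small record using a few MessagePack type tags (nil, bool, positive fixint, uint 8/16/32, fixstr, fixmap) and compares its length with compact JSON. It is not taken from the attached patch: the pack helper and the record's field names are hypothetical, only a small subset of the format is covered, and JSON is used as the baseline only because msgpack.org makes that comparison. Cassandra's existing RPC uses Thrift, whose encoding is not shown here.

```python
import json
import struct


def pack(obj):
    """Encode a small subset of MessagePack (illustrative helper, not a full encoder)."""
    if obj is None:
        return b"\xc0"                               # nil
    if isinstance(obj, bool):                        # must be checked before int
        return b"\xc3" if obj else b"\xc2"           # true / false
    if isinstance(obj, int):
        if obj < 0:
            raise ValueError("negative integers are not covered by this sketch")
        if obj <= 0x7F:
            return struct.pack("B", obj)             # positive fixint: value is the tag
        if obj <= 0xFF:
            return b"\xcc" + struct.pack(">B", obj)  # uint 8
        if obj <= 0xFFFF:
            return b"\xcd" + struct.pack(">H", obj)  # uint 16
        if obj <= 0xFFFFFFFF:
            return b"\xce" + struct.pack(">I", obj)  # uint 32
        raise ValueError("integers above 32 bits are not covered by this sketch")
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) > 31:
            raise ValueError("only strings up to 31 bytes are covered by this sketch")
        return struct.pack("B", 0xA0 | len(data)) + data  # fixstr: length in the tag
    if isinstance(obj, dict):
        if len(obj) > 15:
            raise ValueError("only maps up to 15 entries are covered by this sketch")
        out = struct.pack("B", 0x80 | len(obj))      # fixmap: entry count in the tag
        for key, value in obj.items():
            out += pack(key) + pack(value)
        return out
    raise TypeError("type not covered by this sketch: %r" % type(obj))


# Hypothetical record; the field names are made up for illustration.
record = {"key": "row1", "column": "name", "timestamp": 1289758394}

packed = pack(record)
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")

print("MessagePack bytes:", len(packed))
print("compact JSON bytes:", len(as_json))
```

In this record the saving comes from length-prefixed strings and a fixed-width integer in place of quotes, separators, and decimal digits; actual savings depend on the shape of the data.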
