Posted to dev@thrift.apache.org by "Larry Hastings (JIRA)" <ji...@apache.org> on 2009/02/05 15:05:59 UTC

[jira] Issue Comment Edited: (THRIFT-110) A more compact format

    [ https://issues.apache.org/jira/browse/THRIFT-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670752#action_12670752 ] 

larryhastings edited comment on THRIFT-110 at 2/5/09 6:05 AM:
---------------------------------------------------------------

If you're willing to go *that* far, then let's go all the way: dump all the type information from the type-header.  The nibble would therefore be:

stop => 0
value-zero => 1 # effective payload is the value zero, 0x0, maps to integer 0 / boolean false
value-one => 2 #  effective payload is the value one, 0x1, maps to integer 1 / boolean true
payload-length-0 => 3 # empty payload
payload-length-1 => 4 # payload is 1 byte long
payload-length-2 => 5 # ...
payload-length-3 => 6
payload-length-4 => 7
payload-length-5 => 8
payload-length-6 => 9
payload-length-7 => 0xA
payload-length-8 => 0xB
payload-length-9 => 0xC
payload-length-10 => 0xD # payload is 10 bytes long
payload-length-variable => 0xE # payload length follows--zigzag int?
extended-type => 0xF # extended type follows -- zigzag int?
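
As a rough illustration of the table above, here is a minimal Python sketch of picking the header nibble for a payload and reading the length back out. The constant and function names are made up for this example and do not come from any Thrift patch; it also assumes the payload has already been serialized to bytes.

```python
# Nibble values copied from the table above; all names are illustrative only.
STOP = 0x0
VALUE_ZERO = 0x1          # integer 0 / boolean false, no payload bytes
VALUE_ONE = 0x2           # integer 1 / boolean true, no payload bytes
LENGTH_0 = 0x3            # payload-length-N is 0x3 + N for N in 0..10
MAX_INLINE_LENGTH = 10
LENGTH_VARIABLE = 0xE     # explicit length follows the header
EXTENDED_TYPE = 0xF


def header_nibble(payload: bytes) -> int:
    """Return the header nibble for an already-serialized payload."""
    n = len(payload)
    if n <= MAX_INLINE_LENGTH:
        return LENGTH_0 + n
    return LENGTH_VARIABLE


def inline_length(nibble: int):
    """Return the payload length carried by the nibble, or None if it carries none."""
    if LENGTH_0 <= nibble <= LENGTH_0 + MAX_INLINE_LENGTH:
        return nibble - LENGTH_0
    return None
```

With this mapping, an 8-byte payload gets nibble 0xB, and anything longer than 10 bytes falls through to payload-length-variable.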

If empty payloads are not possible, we can drop "payload-length-0", shift all the payload-length fields down by one, and add payload-length-11 to the end.
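
A sketch of that shifted variant, assuming empty payloads really cannot occur: payload-length-1 takes over 0x3, and payload-length-11 lands on 0xD. The function name is hypothetical.

```python
LENGTH_VARIABLE = 0xE   # unchanged from the table above
MAX_INLINE_LENGTH = 11  # one more inline length than before


def header_nibble_no_empty(length: int) -> int:
    """Header nibble when payload-length-0 is dropped (illustrative only)."""
    if length < 1:
        raise ValueError("empty payloads are assumed impossible in this variant")
    if length <= MAX_INLINE_LENGTH:
        return 0x2 + length   # length 1 -> 0x3, ..., length 11 -> 0xD
    return LENGTH_VARIABLE
```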

On the other hand, if -1 is anything like a common value, we could add a "value-negative-one", which would stand in for any all-bits-set fixed-size value.  I'm guessing it's rare, so it's probably not worth it, but as already stated I'm not really a Thrift guy.
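
For what it's worth, detecting a "value-negative-one" candidate would be cheap. The sketch below assumes fixed-size two's-complement payloads, where -1 is all 0xFF bytes at any width; the helper name is invented for this example.

```python
def is_all_bits_set(payload: bytes) -> bool:
    """True if a non-empty fixed-size payload has every bit set (i.e. encodes -1)."""
    return len(payload) > 0 and all(b == 0xFF for b in payload)


# -1 as a 4-byte and an 8-byte two's-complement integer
minus_one_i32 = (-1).to_bytes(4, "big", signed=True)
minus_one_i64 = (-1).to_bytes(8, "big", signed=True)
```

Both minus_one_i32 and minus_one_i64 pass the check, so a single nibble value could stand in for either width.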

/larry/

p.s. Hi Ben!

      was (Author: larryhastings):
    
If you're willing to go *that* far, then let's go all the way: dump all the type information from the type-header.  The nibble would therefore be:

stop => 0
value-zero => 2 # effective payload is the value zero, 0x0, maps to integer 0 / boolean false
value-one => 3 #  effective payload is the value one, 0x1, maps to integer 1 / boolean true
payload-length-0 => 1 # empty payload
payload-length-1 => 4 # payload is 1 byte long
payload-length-2 => 5 # ...
payload-length-3 => 6
payload-length-4 => 7
payload-length-5 => 8
payload-length-6 => 9
payload-length-7 => 0xA
payload-length-8 => 0xB
payload-length-9 => 0xC
payload-length-10 => 0xD # payload is 10 bytes long
payload-length-variable => 0xE # payload length follows--zigzag int?
extended-type => 0xF # extended type follows -- zigzag int?

If empty payloads are not possible, we can drop "payload-length-0", shift all the payload-length fields down by one, and add payload-length-11 to the end.

On the other hand, if -1 is anything like a common value, we could add a "value-negative-one", which would stand in for any all-bits-set fixed-size value.  I'm guessing it's rare, so it's probably not worth it, but as already stated I'm not really a Thrift guy.
  
> A more compact format 
> ----------------------
>
>                 Key: THRIFT-110
>                 URL: https://issues.apache.org/jira/browse/THRIFT-110
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Noble Paul
>            Assignee: Bryan Duxbury
>         Attachments: compact-proto-spec-2.txt, compact_proto_spec.txt, compact_proto_spec.txt, thrift-110-v2.patch, thrift-110-v3.patch, thrift-110-v4.patch, thrift-110-v5.patch, thrift-110-v6.patch, thrift-110-v7.patch, thrift-110-v8.patch, thrift-110-v9.patch, thrift-110.patch
>
>
> Thrift does not write data as compactly as, say, protobuf. It has no concept of variable-length integers and lacks various other possible optimizations. In Solr we use many such optimizations to produce a very compact payload. Thrift has a lot in common with that format.
> It is all done in a single class:
> http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/common/util/NamedListCodec.java?revision=685640&view=markup
> The other optimizations include writing the type and value in the same byte, very fast writes of Strings, externalizable strings, etc.
> We could use a Thrift format for non-Java clients, and I would like to see it be as compact as the current Java version.
