Posted to dev@thrift.apache.org by "Noble Paul (JIRA)" <ji...@apache.org> on 2008/08/19 07:05:44 UTC

[jira] Issue Comment Edited: (THRIFT-110) A more compact format

    [ https://issues.apache.org/jira/browse/THRIFT-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623556#action_12623556 ] 

noble.paul edited comment on THRIFT-110 at 8/18/08 10:04 PM:
-------------------------------------------------------------

bq.I'm a little hesitant to do this. For one thing, it would mean that clients in languages that didn't support this encoding would be unable to communicate with servers that used it

We must ensure that every language we support implements this before we roll it out, so that all clients for a given version of Thrift are able to consume it.

The mandate for a library like Thrift is to make transport efficient and fast. If we shy away from that, we can consider this project complete and move on.

bq.For example, we don't support inheritance of data structures or shared structure

The IDL does not support inheritance, so that is not really an API limitation. Maps and lists, however, are supported by the IDL, hence the difference.

bq. the amount of extra metadata required to have a list that could contain two different types of structures would be huge.

We can always add new types for heterogeneous collections, so that the user can make the choice and pay the price for it.

We do not have to assume that there is only one way of doing something. There are probably easier ways to achieve this if we are willing to have a dialogue.

bq.In order to implement extern strings, I would create a separate list of strings, store indexes into the list in your main data structure, and send both across.

While it is possible to do so, it takes a lot of extra effort from the library user. It would be much easier if I could use it straight off the shelf. In our case the workaround is really expensive, because the strings can only be obtained by reading an object that may reside on disk, and we wish to load that object only just before writing.
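
For reference, the workaround quoted above (a separate list of strings, with indexes stored in the main structure) could look roughly like this in user code. It is only a minimal sketch: `StringTable` and its methods are hypothetical names chosen for illustration and are not part of Thrift or Solr.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Thrift or Solr class): stores each distinct
// string once and hands out small integer indexes to put in the payload.
final class StringTable {
    private final Map<String, Integer> indexByString = new HashMap<>();
    private final List<String> strings = new ArrayList<>();

    // Returns the index for s, adding s to the table the first time it is seen.
    int intern(String s) {
        Integer existing = indexByString.get(s);
        if (existing != null) {
            return existing;
        }
        int index = strings.size();
        strings.add(s);
        indexByString.put(s, index);
        return index;
    }

    // The list that has to be sent alongside the index-based structure.
    List<String> table() {
        return new ArrayList<>(strings);
    }

    // Receiver side: turns an index back into its string.
    static String resolve(List<String> table, int index) {
        return table.get(index);
    }
}
```

With this approach the table is only complete once every string has been seen, so the sender has to walk all of its objects before it can send the table, which is the cost described above when those objects are loaded lazily from disk.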

We are discussing this as if the window for discussion were closed and Thrift had finalized everything during incubation itself. Not every feature needs to be there at once. But if we want to gain acceptance, we must be willing to embrace ideas from others and to accommodate whatever is possible. Moreover, we must look at our competitors and incorporate their good ideas where possible.

  
> A more compact format 
> ----------------------
>
>                 Key: THRIFT-110
>                 URL: https://issues.apache.org/jira/browse/THRIFT-110
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Noble Paul
>
> Thrift is not as compact in writing out data as, say, protobuf. It has no concept of variable-length integers or of the various other optimizations that are possible. In Solr we use many such optimizations to produce a very compact payload. Thrift has a lot in common with that format.
> It is all done in a single class:
> http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/common/util/NamedListCodec.java?revision=685640&view=markup
> The other optimizations include writing the type and value in the same byte, very fast writes of Strings, externalizable strings, etc.
> We could use a Thrift format for non-Java clients, and I would like to see it as compact as the current Java version.
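
To make two of the optimizations named in the description concrete, here is a minimal sketch of a variable-length integer and of a type and small value sharing one byte. It is not copied from NamedListCodec or from any Thrift protocol; the class name `CompactSketch`, the method names, and the 3-bit/5-bit split are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical illustration class; names and bit layout are assumptions.
final class CompactSketch {

    // Writes value using 7 data bits per byte; a set high bit means
    // "more bytes follow". Small values take 1 byte, negative ints take 5.
    static void writeVInt(int value, ByteArrayOutputStream out) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Reads back an int written by writeVInt.
    static int readVInt(ByteArrayInputStream in) {
        int result = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) {
                throw new IllegalStateException("input ended inside a varint");
            }
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    // Packs a 3-bit type code (0..7) and a 5-bit value (0..31) into one byte,
    // so a short length or small number needs no separate type byte.
    static int packTag(int type, int small) {
        if (type < 0 || type > 7 || small < 0 || small > 31) {
            throw new IllegalArgumentException("type or value out of range");
        }
        return (type << 5) | small;
    }

    static int tagType(int tag) {
        return (tag >>> 5) & 0x07;
    }

    static int tagValue(int tag) {
        return tag & 0x1F;
    }
}
```

Under this scheme the value 300 occupies two bytes (0xAC, 0x02) instead of the four bytes of a fixed-width 32-bit integer, and a value that fits in five bits travels in the same byte as its type code.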
