Posted to dev@thrift.apache.org by "Bryan Duxbury (JIRA)" <ji...@apache.org> on 2008/12/18 00:32:44 UTC

[jira] Created: (THRIFT-236) Structs should be serialized in a consistent order

Structs should be serialized in a consistent order
--------------------------------------------------

                 Key: THRIFT-236
                 URL: https://issues.apache.org/jira/browse/THRIFT-236
             Project: Thrift
          Issue Type: Improvement
            Reporter: Bryan Duxbury


As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 

The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?
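To make the problem concrete, here is a minimal sketch in Python rather than Ruby. The struct contents, the helper names, and the toy encoding are all hypothetical and are not the Thrift wire format; the only point is that walking the metadata in storage order can yield different bytes for equal structs, while walking it in field id order cannot.

```python
import struct

# Hypothetical field metadata: field id -> (name, value). The two structs are
# equal, but their metadata happens to be stored in a different order.
struct_a = {2: ("name", "bob"), 1: ("id", 7)}
struct_b = {1: ("id", 7), 2: ("name", "bob")}


def encode_field(field_id, value):
    """Toy encoding (not the Thrift wire format): 2-byte id, then the value."""
    header = struct.pack(">h", field_id)
    if isinstance(value, int):
        return header + struct.pack(">i", value)
    data = value.encode("utf-8")
    return header + struct.pack(">i", len(data)) + data


def serialize_in_storage_order(fields):
    # Walks the mapping in whatever order it happens to be stored.
    return b"".join(encode_field(fid, v) for fid, (_, v) in fields.items())


def serialize_in_id_order(fields):
    # Walks the fields from lowest to highest field id.
    return b"".join(encode_field(fid, fields[fid][1]) for fid in sorted(fields))


# Equal structs, but storage order leaks into the bytes.
print(serialize_in_storage_order(struct_a) == serialize_in_storage_order(struct_b))
# Sorting by field id removes the dependence on storage order.
print(serialize_in_id_order(struct_a) == serialize_in_id_order(struct_b))
```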

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-236:
---------------------------------

    Attachment: thrift-236-v7.patch

How's this?

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236-v4.patch, thrift-236-v5.patch, thrift-236-v6.patch, thrift-236-v7.patch, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch


[jira] Resolved: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury resolved THRIFT-236.
----------------------------------

    Resolution: Fixed
      Assignee: Alexander Shigin  (was: Bryan Duxbury)

Ok. Since the global field ordering thing is fixed, I'm calling this one committed. Ruby doesn't need to be patched further right now, though it might be kind of nice to improve the performance a little bit at some point in the future.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683771#action_12683771 ] 

David Reiss commented on THRIFT-236:
------------------------------------

Nope.  Sending out an email now.  Sorry.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689609#action_12689609 ] 

David Reiss commented on THRIFT-236:
------------------------------------

There is a url in the patch that promises to point to more information, but it just points to a table of contents.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch, thrift-consistent-order.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Kevin Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665544#action_12665544 ] 

Kevin Clark commented on THRIFT-236:
------------------------------------

Do we still need to modify the generator to use FIELDS_IN_IDL_ORDER? (See my comments before)

And if the accelerated binary protocol won't be tweaked with this ticket, we should open a new ticket to handle the change. THRIFT-248 may need to be tweaked as well.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Kevin Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683696#action_12683696 ] 

Kevin Clark commented on THRIFT-236:
------------------------------------

Is this something we want to move forward on? Should this still be included in the 0.1 roadmap?

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688918#action_12688918 ] 

David Reiss commented on THRIFT-236:
------------------------------------

Oh, also.  The url is a table of contents.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch, thrift-consistent-order.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689738#action_12689738 ] 

David Reiss commented on THRIFT-236:
------------------------------------

I could go either way.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683707#action_12683707 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

Yes, I would still like to do this. TCompactProtocol benefits from it nicely when writing. David, did you ever decide if this would massively destroy TDenseProtocol-written stuff?
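One plausible reason for that benefit, sketched under an assumption about the compact protocol's field headers: a field whose id is 1 to 15 higher than the previously written id gets a one-byte header (delta and type packed together), and any other field gets a type byte followed by the id as a zigzag varint. The function names below are made up for illustration, and the sketch only counts header bytes.

```python
def zigzag_varint_len(n):
    """Bytes needed to write a 16-bit signed value as a zigzag varint."""
    z = (n << 1) ^ (n >> 15)
    length = 1
    while z >= 0x80:
        z >>= 7
        length += 1
    return length


def field_header_bytes(field_ids):
    """Total header bytes for fields written in the given order."""
    total = 0
    last_id = 0
    for fid in field_ids:
        delta = fid - last_id
        if 0 < delta <= 15:
            total += 1  # short form: delta and type share one byte
        else:
            total += 1 + zigzag_varint_len(fid)  # type byte plus explicit id
        last_id = fid
    return total


# Ascending ids always qualify for the short form when the gaps are small.
print(field_header_bytes([1, 2, 3, 4]))
# The same fields out of order fall back to the long form on each backwards step.
print(field_header_bytes([4, 1, 3, 2]))
```

Under that assumption, writing in ascending field id order keeps every small-gap field on the short form, which is why a guaranteed order helps the writer.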

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689748#action_12689748 ] 

David Reiss commented on THRIFT-236:
------------------------------------

http://www.open-std.org/jtc1/sc22/open/n2356/template.html#temp.arg

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697989#action_12697989 ] 

David Reiss commented on THRIFT-236:
------------------------------------

<dreiss> bryanduxbury: adds trailing whitespace in t_struct.h
<bryanduxbury> other than that?
<dreiss> checking
<dreiss> use braces in the "if" in t_struct.h
<dreiss> generate_fingerprint doesn't need to be in order, but it doesn't matter

Otherwise, LG.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236-v4.patch, thrift-236-v5.patch, thrift-236-v6.patch, thrift-236-v7.patch, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688450#action_12688450 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

@Alexander: This patch is totally simple and awesome. I tested it out, and it does exactly the right thing with the java generated code. I would say, combine that with the existing ruby lib patch, and we're good to go.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch, thrift-consistent-order.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663094#action_12663094 ] 

Todd Lipcon commented on THRIFT-236:
------------------------------------

The mapreduce use case does make sense, but Nathan brings up a good point as well that I didn't think about.

The sloppy code I was imagining was people using the assumption of a consistent order to avoid hash lookups on field ids in deserialization code. However, I guess that's in generated code, so we can take care that the generators never emit code relying on the consistent-order property, which makes it a moot point.
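As a rough illustration of a reader that does not lean on write order, here is a small Python sketch. The wire representation (a list of field id and value pairs), the `KNOWN_FIELDS` table, and `read_struct` are all hypothetical stand-ins, not generated Thrift code.

```python
# Hypothetical table a generator might emit: field id -> attribute name.
KNOWN_FIELDS = {1: "id", 2: "name"}


def read_struct(wire_fields, known_fields=KNOWN_FIELDS):
    """Build a struct by looking up each field id as it arrives.

    Nothing here assumes the fields come in any particular order, so the
    reader keeps working whether or not the writer sorted them.
    """
    result = {}
    for field_id, value in wire_fields:
        name = known_fields.get(field_id)
        if name is None:
            continue  # unknown field id: skip it
        result[name] = value
    return result


ordered = [(1, 7), (2, "bob")]
shuffled = [(2, "bob"), (9, "ignored"), (1, 7)]

print(read_struct(ordered) == read_struct(shuffled))
```

The cost is one lookup per field, which is exactly the lookup the "sloppy" shortcut would try to avoid.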

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch


[jira] Reopened: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury reopened THRIFT-236:
----------------------------------


Let's fix this a slightly different way...

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12690002#action_12690002 ] 

David Reiss commented on THRIFT-236:
------------------------------------

I'd prefer to see the members_type typedef inside the t_program class.  If people are okay with this, I can just do it before committing.  Otherwise, LG.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689564#action_12689564 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

Is this committable, then? I'm not sure I understand the "url is a table of contents" comment.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch, thrift-consistent-order.patch


[jira] Updated: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-236:
---------------------------------

    Patch Info:   (was: [Patch Available])

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Alexander Shigin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689650#action_12689650 ] 

Alexander Shigin commented on THRIFT-236:
-----------------------------------------

@Bryan: The patch sorts fields after the last one.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693930#action_12693930 ] 

David Reiss commented on THRIFT-236:
------------------------------------

So is this resolved now, or are there still cases where Ruby serializes in hash-order?  And what about sets and maps?

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Kevin Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657616#action_12657616 ] 

Kevin Clark commented on THRIFT-236:
------------------------------------

You mention Ruby in the description, but is what you're really looking for consistently ordered serialization across all language implementations?

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Will Pierce (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662977#action_12662977 ] 

Will Pierce commented on THRIFT-236:
------------------------------------

I strongly agree that encoding order should be explicitly defined.  IDL order is fine here. One alternative could be to order by numeric field ID, but then you would eliminate the possibility of being able to change encoding order at a later point in time, while keeping the fields unchanged.  IDL order keeps that option open.
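
To make the two candidate orderings concrete, here is a minimal Python sketch. The field list, names, and helper functions are hypothetical and are not taken from any Thrift library; they only illustrate how IDL order and numeric field-id order differ when declarations are rearranged.

```python
# Hypothetical struct fields as (field_id, name), listed in the order
# in which they are declared in the IDL file.
IDL_FIELDS = [(4, "center"), (3, "radius"), (1, "label")]

# The same fields after someone rearranges the declarations in the IDL.
REORDERED_FIELDS = [(1, "label"), (4, "center"), (3, "radius")]


def idl_order(fields):
    """Return the fields exactly as declared in the IDL."""
    return list(fields)


def field_id_order(fields):
    """Return the fields sorted from lowest to highest field id."""
    return sorted(fields, key=lambda field: field[0])


# IDL order follows the declarations, so it changes when they move.
print(idl_order(IDL_FIELDS))
print(idl_order(REORDERED_FIELDS))

# Field-id order is fixed by the ids and ignores declaration order.
print(field_id_order(IDL_FIELDS))
print(field_id_order(REORDERED_FIELDS))
```

Under IDL order, the encoding order can be changed later by moving declarations; under field-id order it is pinned to the ids.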

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Kevin Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661348#action_12661348 ] 

Kevin Clark commented on THRIFT-236:
------------------------------------

THRIFT-246 requires this to modify the generator to use FIELDS_IN_IDL_ORDER instead of FIELDS in struct_fields. There is also going to need to be a change in the accelerated binary protocol.

And we should really have a spec confirming this behavior (and we could share that across the C and Ruby impls).

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689651#action_12689651 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

I did a really quick benchmark of serializing with and without the sorting, and it looks like serialization is about 2% faster without the sorting. That's a pretty small difference, but the amount of work I'd have to do to reclaim that performance is pretty small, too, so I'm apt to do it.
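
One way the sorting cost could be reclaimed is to sort the field ids once, when the struct is defined, instead of on every write. The following Python sketch only illustrates that idea; the `StructDef` class and its method names are hypothetical and do not reflect the actual Ruby library or the generator patch.

```python
class StructDef:
    """Hypothetical struct metadata mapping field id -> field name."""

    def __init__(self, fields):
        self.fields = dict(fields)
        # Sort once, at definition time, rather than on every write.
        self.sorted_ids = sorted(self.fields)

    def write_order_sorting_each_time(self):
        """Field names in id order, re-sorting the hash on each call."""
        return [self.fields[fid] for fid in sorted(self.fields)]

    def write_order_precomputed(self):
        """Field names in id order, using the order computed up front."""
        return [self.fields[fid] for fid in self.sorted_ids]


circle = StructDef({4: "center", 3: "radius", 1: "label"})
print(circle.write_order_sorting_each_time())
print(circle.write_order_precomputed())
```

Both methods yield the same order; the second avoids a sort per serialization call.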

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697951#action_12697951 ] 

David Reiss commented on THRIFT-236:
------------------------------------

The fingerprint is used by the dense protocol, so it should be in the same order as serialization.  You should also hit line 873 of the cpp generator.  I agree with you about defaulting to the old behavior, for now at least.
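
To see why a fingerprint has to be computed in the same field order as serialization, consider this small Python sketch. It is not Thrift's real fingerprint algorithm: the `fingerprint` function, the field tuples, and the use of SHA-256 are all stand-ins chosen for illustration.

```python
import hashlib


def fingerprint(fields):
    """Hash (field_id, type_name) pairs in the order given.

    Illustrative stand-in only, not Thrift's actual fingerprint.
    """
    text = ";".join(f"{fid}:{type_name}" for fid, type_name in fields)
    return hashlib.sha256(text.encode("ascii")).hexdigest()


# Hypothetical struct: the same two fields in IDL order and in id order.
fields_idl_order = [(4, "point"), (3, "i32")]
fields_id_order = sorted(fields_idl_order)

# The two orders hash differently, so a writer and a reader that disagree
# on field order would also disagree on the fingerprint.
print(fingerprint(fields_idl_order))
print(fingerprint(fields_id_order))
```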

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236-v4.patch, thrift-236-v5.patch, thrift-236-v6.patch, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12690092#action_12690092 ] 

David Reiss commented on THRIFT-236:
------------------------------------

Oh, right.  I just read the diff wrong.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Alexander Shigin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697739#action_12697739 ] 

Alexander Shigin commented on THRIFT-236:
-----------------------------------------

I'd prefer that get_members return the sorted vector and that get_unsorted_members return the members in IDL order. You've missed at least fingerprint generation.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236-v4.patch, thrift-236-v5.patch, thrift-236-v6.patch, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Updated: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Reiss updated THRIFT-236:
-------------------------------

    Attachment: sort-fields.diff

You could do something like this.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657621#action_12657621 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

Yes, I'm picking out Ruby in particular because it doesn't behave like the others. However, if we specified a thrift-wide ordering, then clearly Ruby (and any other library that doesn't play by the rules) can be fixed.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Nathan Marz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663087#action_12663087 ] 

Nathan Marz commented on THRIFT-236:
------------------------------------

Besides serializing the fields in a consistent order, wouldn't we also need to sort sets and maps before writing them for two equivalent objects to serialize identically?

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672178#action_12672178 ] 

David Reiss commented on THRIFT-236:
------------------------------------

TDenseProtocol currently depends on it being IDL order.  I have to check to see if we have any permanently-stored TDense data here at Facebook.  If anyone else does, they should speak up too.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Updated: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-236:
---------------------------------

    Patch Info: [Patch Available]

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Assigned: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury reassigned THRIFT-236:
------------------------------------

    Assignee: Bryan Duxbury

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Updated: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-236:
---------------------------------

    Attachment: thrift-236-v4.patch

Ok, in this version of Alex's patch, only the struct writer explicitly uses struct->get_sorted_members(). This patch only affects the Java library at the moment.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236-v4.patch, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657614#action_12657614 ] 

David Reiss commented on THRIFT-236:
------------------------------------

C++, Java, and Python serialize them in IDL order.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664554#action_12664554 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

It's true, we'd have to sort maps and sets for them to be equal in all cases. However, it's rare that maps and sets play a "key" role in map/reduce, and if you really need them to be sorted on the wire right now, you can just use SortedMap and SortedSet implementations yourself. The ordering of the serialized fields, however, MUST be handled in Thrift.
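
As a rough illustration of sorting containers yourself before they reach the wire, here is a minimal Python sketch. The `canonical_items` helper is hypothetical, and it assumes the set elements or map keys are mutually comparable; it is not part of any Thrift library.

```python
def canonical_items(value):
    """Return the elements of a set or map in a deterministic order.

    Maps yield (key, value) pairs sorted by key; sets yield sorted
    elements. Assumes keys/elements can be compared with one another.
    """
    if isinstance(value, dict):
        return sorted(value.items())
    if isinstance(value, (set, frozenset)):
        return sorted(value)
    raise TypeError("expected a set or a map")


# Two equal maps built in different insertion orders now produce the
# same sequence of entries to write.
first = {"b": 2, "a": 1}
second = {"a": 1, "b": 2}
print(canonical_items(first))
print(canonical_items(second))
print(canonical_items({3, 1, 2}))
```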

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697239#action_12697239 ] 

David Reiss commented on THRIFT-236:
------------------------------------

Another solution would be to add an annotation to the circle structure, like

{noformat}
struct circle {
  required 4: point center;
  required 3: i32 radius;
} (
  cpp.constructor = "4,3"
)
{noformat}

Then check the annotation in the code generator and put the arguments in that order.  You could even leave fields out of it to have them default-constructed.
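
A generator could interpret such an annotation roughly as in the Python sketch below. This is only an assumption about how the value might be handled: the `constructor_args` function and the field table are hypothetical, and only the "4,3" value format comes from the example above.

```python
def constructor_args(fields, annotation):
    """Return field names in the order given by a "4,3"-style annotation.

    fields: dict mapping field id -> field name (hypothetical metadata).
    Fields whose ids are not listed are simply left out, so they would
    be default-constructed.
    """
    ids = [int(part) for part in annotation.split(",") if part.strip()]
    return [fields[fid] for fid in ids]


circle_fields = {4: "center", 3: "radius"}
print(constructor_args(circle_fields, "4,3"))
print(constructor_args(circle_fields, "3"))
```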

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Updated: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-236:
---------------------------------

    Fix Version/s: 0.1

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689576#action_12689576 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

I just looked at my old Ruby patch, and it's wrong now. It looks like the current code will sort the fields by field id before serializing, which is at least correct behavior, if potentially suboptimal in the current implementation. I might fix the suboptimality with an appropriate generator patch before committing.

Ruby's constructor won't be impacted by a change in the order of the constructor arguments, since it takes a map as its input. Java's constructor arguments will get scrambled, which may break some users' code. I don't think we should go out of our way to solve this problem, though.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch, thrift-consistent-order.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697447#action_12697447 ] 

David Reiss commented on THRIFT-236:
------------------------------------

Or, I should say, this syntax works now.  It just doesn't do anything since the "cpp.constructor" annotation is not read by the generator.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667820#action_12667820 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

After the latest changes to the compact protocol (THRIFT-110), I'm switching my opinion from IDL order to numeric order. This will make it a lot easier to get optimal performance from the compact protocol.
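To illustrate why numeric order helps here, below is a rough Python sketch. It assumes (from my reading of the compact protocol's field-header encoding, not from this thread) that a field whose id is 1 to 15 higher than the previous field's id gets a one-byte header, and that any other field gets a type byte followed by the zigzag-varint-encoded id. The helper names are hypothetical and this is not the library's code.

```python
def zigzag(n: int) -> int:
    """Zigzag-encode a signed 16-bit value so small magnitudes stay small."""
    return (n << 1) ^ (n >> 15)


def varint_len(value: int) -> int:
    """Number of bytes a non-negative integer needs as a base-128 varint."""
    length = 1
    while value >= 0x80:
        value >>= 7
        length += 1
    return length


def field_header_bytes(field_ids):
    """Total header bytes for fields written in the given order (assumed encoding)."""
    total = 0
    last_id = 0
    for fid in field_ids:
        delta = fid - last_id
        if 1 <= delta <= 15:
            total += 1  # short form: delta and type packed into one byte
        else:
            total += 1 + varint_len(zigzag(fid))  # long form: type byte + full id
        last_id = fid
    return total


print(field_header_bytes([1, 2, 3]))  # ascending ids: every header is short
print(field_header_bytes([3, 1, 2]))  # out of order: one header needs the long form
```

Under these assumptions, writing fields in ascending id order keeps every delta positive, so the short form applies whenever ids are reasonably dense.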

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665548#action_12665548 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

When I say IDL order, I mean "in the order that the lines in the .thrift file proceed". This is how almost all the existing libraries encode their structs. It's true that someone could "clean up" the .thrift file and have a negative effect on the bitwise comparison. To mitigate this possibility, we'd have to use numeric ordering, and that would necessitate changing ALL language libraries.

THRIFT-248 probably does need to be tweaked. I would say we should open a separate issue for the existing accelerated binary protocol, and then just not do it when we decide to commit 248 :)

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662996#action_12662996 ] 

Todd Lipcon commented on THRIFT-236:
------------------------------------

Can someone point out why the advantages outweigh the disadvantages here?

It seems like the advantage is that you could have slightly more efficient deserialization code if you know what order the fields are coming in. The disadvantage is in the assumption that the two sides have identical IDL order, which is not an assumption I think most end-users would make. In almost every use case, both sides have identical thrift files, but I can certainly see a situation in which a new version of a service would "clean up the RPC file" and reorder struct fields while maintaining field tags, expecting backwards compatibility and failing miserably.



> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688914#action_12688914 ] 

David Reiss commented on THRIFT-236:
------------------------------------

Yeah, this is how I was planning on solving this.  It does affect the code in some (possibly) unexpected ways, though, like the order of arguments in constructors.

It appears that we do not have any TDenseProtocol data stored permanently, so I'm okay with this.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch, thrift-consistent-order.patch
>
>


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663018#action_12663018 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

Kevin is right. This issue is NOT about any sort of efficiency gains. The goal is so that bitwise comparison would work in more cases. In particular, most languages' libraries do the serialization in the order in which the compiler read the fields from the IDL file, whereas Ruby's just does some random ordering based on the output of iterating a hash.

In general, I don't really feel like people should have that much of a problem with IDL ordering. If the IDL isn't the same on both sides, I'd expect the common case to be that a new field is added in one place or another, which, by definition, would have to be optional for things to keep working. Either way, objects created from out-of-sync IDL definitions probably can't be bitwise equal anyway, since one side should have more new fields or different field ids. 

If it would make the community happier, I think it'd be fine to use numeric ordering. It just means changing more libraries to be compatible. 
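A small Python sketch of the bitwise-comparison problem follows. The `encode` function and its wire layout (a big-endian i16 id followed by an i32 value) are made up for illustration and are not Thrift's actual format. Two dicts with equal contents but different insertion order stand in for a hash with arbitrary iteration order.

```python
import struct


def encode(fields, sort_by_id=False):
    """Toy encoder: fields maps field id -> i32 value (hypothetical layout)."""
    ids = sorted(fields) if sort_by_id else list(fields)
    return b"".join(struct.pack(">hi", fid, fields[fid]) for fid in ids)


a = {1: 10, 2: 20, 3: 30}
b = {3: 30, 1: 10, 2: 20}  # same values, different iteration order

print(a == b)                                # the structs are logically equal
print(encode(a) == encode(b))                # iteration order leaks into the bytes
print(encode(a, True) == encode(b, True))    # sorting by id makes the bytes match
```

Sorting by field id is only one way to get a stable encoding; IDL order would work equally well in this toy, as long as every writer agrees on it.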

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Alexander Shigin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689571#action_12689571 ] 

Alexander Shigin commented on THRIFT-236:
-----------------------------------------

Oh, I've missed the discussion in my mailbox.

Bryan, you don't need the Ruby patch anymore. There is no more "IDL order", because fields always have "tag order".

I missed the order of fields in the constructor.
  # The Python test suite doesn't seem to depend on the order of fields.
  # The C++ generator doesn't create a constructor right now.

And I don't know much about other languages. What about Java and Ruby?

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch, thrift-consistent-order.patch
>
>


[jira] Updated: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Alexander Shigin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Shigin updated THRIFT-236:
------------------------------------

    Attachment: thrift-consistent-order-v3.patch

I'd prefer to declare the typedef in the t_struct class.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Alexander Shigin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689603#action_12689603 ] 

Alexander Shigin commented on THRIFT-236:
-----------------------------------------

I don't think a Thrift struct with thousands of fields is a common case. So I think we couldn't even measure the slowdown on any test struct (like CompactProtoTestStruct, with 49 fields).

I don't like ugly code, so we could sort the fields in the get_members routine and set a flag like sorted_.
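The lazy-sort idea can be sketched roughly as below. This is a Python stand-in for the compiler's C++ t_struct, so the class name and its methods are hypothetical; only get_members and the sorted_ flag come from the comment above.

```python
class StructDef:
    """Hypothetical stand-in for the compiler's struct type."""

    def __init__(self):
        self._members = []   # list of (field_id, name) tuples
        self._sorted = True  # an empty list is trivially sorted

    def append(self, field_id, name):
        self._members.append((field_id, name))
        self._sorted = False  # invalidate: the next get_members() must re-sort

    def get_members(self):
        # Sort at most once per batch of appends, then reuse the result.
        if not self._sorted:
            self._members.sort(key=lambda member: member[0])
            self._sorted = True
        return self._members


s = StructDef()
s.append(2, "y")
s.append(1, "x")
s.append(3, "radius")
print(s.get_members())
```

The flag means repeated calls cost nothing after the first sort, which fits the point that the slowdown is unlikely to be measurable for structs of realistic size.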

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch, thrift-consistent-order.patch
>
>


[jira] Updated: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-236:
---------------------------------

    Attachment: thrift-236-v6.patch

Ok, good catch. How's this?

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236-v4.patch, thrift-236-v5.patch, thrift-236-v6.patch, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Eric Anderson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697441#action_12697441 ] 

Eric Anderson commented on THRIFT-236:
--------------------------------------

Is that syntax (cpp.constructor = "4,3") in the upcoming 1.0 release, or is it proposed? (I haven't seen it before and it's not in the last release.) I was thinking of a general option at the top of the file, because we want IDL-ordered constructors for everything.  It turns out we also have patches for allowing arbitrary stuff to be added to the C++ class declaration (sundry helper functions, special constructors and such), but I haven't submitted them because they're not properly general and they change the IDL.  It's mainly useful for static languages; most of the dynamic ones can add methods whenever they want.


> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Eric Anderson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697227#action_12697227 ] 

Eric Anderson commented on THRIFT-236:
--------------------------------------

There is one advantage to recording the order of fields in the IDL.  If you have constructors for objects that don't name their parameters, then it's useful to be able to keep the IDL order in "human-natural" order, yet let the tag numbers change as needed to support schema evolution.  Consider the following evolution:

struct circle { 1: required i32 x; 2: required i32 y; 3: required i32 radius; }

V2 (assuming a series of intermediate versions with the necessary optionals):

struct point { 1: required i32 x; 2: required i32 y; }

struct circle { 4: required point center; 3: required i32 radius; }

The problem is that the latter circle would be initialized as circle(radius, center), which seems inverted to me from the "expected" order for defining such an object. However, I can't reuse the 1 and 2 tags because they are obsolete (as I understand Thrift's rules for evolving struct tags).
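The effect on positional constructors can be shown with a short Python sketch. The `constructor_params` helper is hypothetical and only models how a generator might choose parameter order; it is not taken from any Thrift generator.

```python
# V2 circle fields in the order they appear in the IDL: (tag, name)
idl_fields = [(4, "center"), (3, "radius")]


def constructor_params(fields, by_tag):
    """Return parameter names in tag order or in IDL order (illustrative only)."""
    ordered = sorted(fields, key=lambda field: field[0]) if by_tag else list(fields)
    return [name for _, name in ordered]


print(constructor_params(idl_fields, by_tag=True))   # tag order: radius first
print(constructor_params(idl_fields, by_tag=False))  # IDL order: center first
```

Languages whose constructors take named arguments or a map are not affected by this difference, since the caller never relies on position.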

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>


[jira] Closed: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury closed THRIFT-236.
--------------------------------

    Resolution: Fixed

OK, I made the few changes David wanted locally and committed.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236-v4.patch, thrift-236-v5.patch, thrift-236-v6.patch, thrift-236-v7.patch, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>


[jira] Updated: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Alexander Shigin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Shigin updated THRIFT-236:
------------------------------------

    Attachment: thrift-consistent-order.patch

This really simple (but a bit ugly) patch makes all languages write structures in tag order. It solves THRIFT-84 as a side effect.

The 0.1 release includes a couple of backward-incompatible changes, and my view is that we should fix TDenseProtocol now, before it is widely used.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch, thrift-consistent-order.patch
>
>


[jira] Updated: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Alexander Shigin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Shigin updated THRIFT-236:
------------------------------------

    Attachment: thrift-consistent-order-v2.patch

The new version of the patch. Please review.

There is still one problem: the compiler doesn't check whether two fields have the same name. I'll make a new patch on Monday.
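For illustration, the missing check could look something like the Python sketch below. The function name and the (tag, name) tuple representation are assumptions; the real compiler is written in C++ and may structure this differently.

```python
def find_duplicate_names(fields):
    """Return field names that occur more than once, in order of first repeat.

    fields is a list of (tag, name) tuples (a hypothetical representation).
    """
    seen = set()
    duplicates = []
    for _, name in fields:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates


print(find_duplicate_names([(1, "x"), (2, "y")]))
print(find_duplicate_names([(1, "x"), (2, "y"), (3, "x")]))
```

A compiler would presumably report an error when the returned list is non-empty, rather than generating code with clashing members.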

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order.patch
>
>


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697864#action_12697864 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

Ok... I feel like we're going back and forth a little bit here. The point of making the sorted behavior explicit is that it allows us to fulfill the single use case this issue is actually calling out, rather than arbitrarily affecting all situations where we iterate over the fields of a struct. I tend to think this is best, and has the fewest unintended consequences.

As far as the fingerprint generation - what is that exactly? Why do we need to do that in id order?

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236-v4.patch, thrift-236-v5.patch, thrift-236-v6.patch, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>


[jira] Updated: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-236:
---------------------------------

    Attachment: thrift-236-v5.patch

After looking at all the generators, I think this should do it. Can others with more familiarity with various language libraries chime in if I've made any mistakes?

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236-v4.patch, thrift-236-v5.patch, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>


[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689642#action_12689642 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

@David: I don't understand what your latest patch is supposed to do.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697446#action_12697446 ] 

David Reiss commented on THRIFT-236:
------------------------------------

Proposed.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665485#action_12665485 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

Does anyone still have objections to this issue? If not, I'll commit the attached Ruby patch.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Kevin Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657626#action_12657626 ] 

Kevin Clark commented on THRIFT-236:
------------------------------------

Ok. I'd support this. +1

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697676#action_12697676 ] 

David Reiss commented on THRIFT-236:
------------------------------------

The new type should be members_in_id_order_, not idl_order.  You missed generate_struct_result_writer in (at least) C++.  Not sure if those matter since they are unlikely to be used in any context where order matters.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Alexander Shigin
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236-v4.patch, thrift-236-v5.patch, thrift-236.patch, thrift-consistent-order-v2.patch, thrift-consistent-order-v3.patch, thrift-consistent-order.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Alexander Shigin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689649#action_12689649 ] 

Alexander Shigin commented on THRIFT-236:
-----------------------------------------

The URL is from the original Thrift code. git blame points to dreiss for this line :)

I don't like the idea of a sort_members method. It'd be really hard to understand why the members are sorted. I'd prefer to localize the behavior in one place.
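
One way to read "localize the behavior in one place" is sketched below in Ruby. This is only an illustration, not the compiler's actual C++ code: the hash layout and the helper name append_member are hypothetical. The id-ordered list is maintained at the single point where members are added, so no generator has to call a separate sort step.

```ruby
# Hypothetical stand-in for a struct definition: it keeps two views of its members.
def new_struct_def
  { members: [], members_in_id_order: [] }
end

# The only place where ordering is decided. Callers never sort anything.
def append_member(struct_def, id, name)
  field = { id: id, name: name }
  struct_def[:members] << field                 # declaration (IDL line) order
  sorted = struct_def[:members_in_id_order]     # kept sorted by field id
  index = sorted.index { |m| m[:id] > id } || sorted.size
  sorted.insert(index, field)
  struct_def
end

struct_def = new_struct_def
append_member(struct_def, 3, "email")
append_member(struct_def, 1, "id")
append_member(struct_def, 2, "name")

declared_ids = struct_def[:members].map { |m| m[:id] }
ordered_ids  = struct_def[:members_in_id_order].map { |m| m[:id] }
```

Here declared_ids keeps the order in which the fields were appended, while ordered_ids is ascending by field id. A generator that needs a stable wire order would read only the second list.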

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: sort-fields.diff, thrift-236.patch, thrift-consistent-order.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672169#action_12672169 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

Does anyone object to making the Thrift-wide spec for serialization order numeric order? It doesn't have to be done immediately, of course. We should create sub-issues for each language as appropriate.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.1
>
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663082#action_12663082 ] 

Bryan Duxbury commented on THRIFT-236:
--------------------------------------

Map/reduce keys is my use case. If you can avoid deserializing an object, you save a lot of time.

I don't think I follow your final comment, though - what kind of sloppy code will people write, exactly?
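
The map/reduce key use case can be sketched roughly as follows. This is a toy Ruby example under stated assumptions: toy_serialize is a made-up encoding, not the Thrift protocol, and the two producers are hypothetical. It shows that when fields are written in field-id order, equal keys from different writers are byte-identical and can be grouped without deserializing.

```ruby
# Toy serializer, NOT the Thrift protocol: each field becomes "id:value;".
# `pairs` is an array of [field_id, value]; `sort` toggles field-id ordering.
def toy_serialize(pairs, sort)
  pairs = pairs.sort_by { |fid, _| fid } if sort
  pairs.map { |fid, value| "#{fid}:#{value};" }.join
end

# Group [key_bytes, count] records by the raw serialized key, never decoding it.
def group_by_raw_key(records)
  records.group_by { |key_bytes, _| key_bytes }
end

# Two hypothetical producers hold the same logical key but emit fields in a different order.
producer_a = [[1, "us"], [2, "2008-12-18"]]
producer_b = [[2, "2008-12-18"], [1, "us"]]

consistent = group_by_raw_key([
  [toy_serialize(producer_a, true), 10],
  [toy_serialize(producer_b, true), 5]
])

arbitrary = group_by_raw_key([
  [toy_serialize(producer_a, false), 10],
  [toy_serialize(producer_b, false), 5]
])
```

With ordering enabled, both records land in one group; without it, the same logical key is split across two groups, and the reducer would have to deserialize to merge them.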

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Kevin Clark (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663007#action_12663007 ] 

Kevin Clark commented on THRIFT-236:
------------------------------------

To be clear, on the Ruby side, this isn't going to change anything except the order of serialization. Deserialization still happens by reading field headers. The patch being put up for consideration is just going to make it so the same structure is going to look the same (bitwise) on Ruby as it does in Python, C++, et al. There's no expected efficiency gain or loss, and backwards compatibility shouldn't be an issue.

Oh. Maybe I misunderstand. Is IDL order defined as the order the lines go in, or the order of the field tags? If it's the second, I don't see an issue. If it's the first, I understand Todd's point.
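
A minimal Ruby sketch of the first paragraph above, assuming field-id order. It is not the actual Thrift library code: the toy encoding and the names write_struct, read_struct, and fields are hypothetical. Writing in field-id order makes the output independent of how the metadata hash happens to be ordered, while reading dispatches on each field header and so accepts either order.

```ruby
# Toy encoding, NOT the Thrift binary protocol: each field is written as
# [field id byte][length byte][value bytes], followed by a 0 stop byte.
def write_struct(fields, values, sort_by_id)
  ids = sort_by_id ? fields.keys.sort : fields.keys
  out = "".b
  ids.each do |fid|
    value = values[fields[fid]]
    next if value.nil?
    bytes = value.to_s.b
    out << [fid, bytes.bytesize].pack("CC") << bytes
  end
  out << [0].pack("C")
end

# Reading looks up each field by the id in its header, so write order is irrelevant.
def read_struct(fields, data)
  result = {}
  pos = 0
  loop do
    fid = data.getbyte(pos)
    break if fid == 0
    len = data.getbyte(pos + 1)
    result[fields[fid]] = data.byteslice(pos + 2, len)
    pos += 2 + len
  end
  result
end

# Hypothetical metadata keyed by field id, deliberately stored out of order.
fields = { 3 => :email, 1 => :id, 2 => :name }
values = { id: "7", name: "bob", email: "b@example.com" }

sorted   = write_struct(fields, values, true)
unsorted = write_struct(fields, values, false)
```

The two byte strings differ, yet read_struct returns the same field values for both, which is why changing only the write order should not affect backwards compatibility.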

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Updated: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-236:
---------------------------------

    Attachment: thrift-236.patch

This patch fixes the Ruby library and compiler to serialize (and, incidentally, print inspect output) in IDL order.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?



[jira] Commented: (THRIFT-236) Structs should be serialized in a consistent order

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663078#action_12663078 ] 

Todd Lipcon commented on THRIFT-236:
------------------------------------

Isn't bitwise comparison in effect an efficiency gain? What's the use case for bitwise comparison? I guess it makes sense in the case of mapreduce keys when aggregating data that may have been output by serialization code in different thrift client implementations.

My worry is that if we make this the requirement then people will be very tempted to write sloppy code that assumes it to be the case for the sake of efficiency.

> Structs should be serialized in a consistent order
> --------------------------------------------------
>
>                 Key: THRIFT-236
>                 URL: https://issues.apache.org/jira/browse/THRIFT-236
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>         Attachments: thrift-236.patch
>
>
> As it stands right now, Ruby generated structs will be serialized in arbitrary order (due to storage of metadata in a hash). This leads to different binary encoding for the same struct values. Ideally, it should be the same for any two serializations of equivalent structs, and between languages if possible. 
> The two approaches that seem to make the most sense are in lowest-to-highest field id order, and in IDL-defined order. What do people think of this idea, and which approach would be preferred?
