You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by "Johan Stuyts (JIRA)" <ji...@apache.org> on 2008/05/16 09:50:55 UTC

[jira] Created: (THRIFT-10) Descriptors used during serialization should be immutable objects

Descriptors used during serialization should be immutable objects
-----------------------------------------------------------------

                 Key: THRIFT-10
                 URL: https://issues.apache.org/jira/browse/THRIFT-10
             Project: Thrift
          Issue Type: Improvement
          Components: Library (Java)
            Reporter: Johan Stuyts
            Priority: Minor


The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.

By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606555#action_12606555 ] 

Bryan Duxbury commented on THRIFT-10:
-------------------------------------

I would also add, can Johan publish the benchmark being used? Perhaps that sort of thing should be in source control so we can have it to test things later.

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606165#action_12606165 ] 

Bryan Duxbury commented on THRIFT-10:
-------------------------------------

If all these attributes are meant to be read-only after initialization, how about we just make the all final?

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606552#action_12606552 ] 

Bryan Duxbury commented on THRIFT-10:
-------------------------------------

Even if it's a small performance benefit, that's something. We could try and squeeze another few percent out of it by making the members final in addition to using accessors. I also do appreciate the conceptual cleanliness afforded by having the structs be immutable.

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-10:
--------------------------------

    Attachment: thrift-10-v3.patch

How's this? Not a gitster, so not 100% I got all the whitespace.

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Assignee: Bryan Duxbury
>            Priority: Minor
>             Fix For: 0.1
>
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java, thrift-10-v2.patch, thrift-10-v3.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-10:
--------------------------------

    Fix Version/s: 0.1

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>             Fix For: 0.1
>
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671537#action_12671537 ] 

David Reiss commented on THRIFT-10:
-----------------------------------

{code}
+    bool is_upper = isupper(character) && character != '_' && (character < '0' || '9' < character);
{code}
should just be
{code}
+    bool is_upper = isupper(character);
{code}

I didn't actually use git to find the trailing whitespace this time, just grep for "^+.* $" (add a \ before the + if you are using extended regexps).  There are still two more:

{code}
-  public TField() {}
-
+  public TField() {
+    this("", TType.STOP, (short)0);
+  }
+  
{code}

{code}
-    out <<
-      indent() << "field.name = \"" << (*f_iter)->get_name() << "\";" << endl <<
-      indent() << "field.type = " << type_to_enum((*f_iter)->get_type()) << ";" << endl <<
-      indent() << "field.id = " << upcase_string((*f_iter)->get_name()) << ";" << endl <<
-      indent() << "oprot.writeFieldBegin(field);" << endl;
-
+    indent(out) << "oprot.writeFieldBegin(" << constant_name((*f_iter)->get_name()) << "_FIELD_DESC);" << endl;
+    
{code}


> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Assignee: Bryan Duxbury
>            Priority: Minor
>             Fix For: 0.1
>
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java, thrift-10-v2.patch, thrift-10-v3.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Johan Stuyts (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Stuyts updated THRIFT-10:
-------------------------------

    Attachment: SerializationBenchmarkMain.java

Benchmark that serializes a structure many times to measure the effect on performance of changes to the library and/or the generated structures.

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606446#action_12606446 ] 

Bryan Duxbury commented on THRIFT-10:
-------------------------------------

I've looked over this patch, and it seems good. There's no reason to go back to the drawing board and change this (fairly extensive) patch to use final attributes instead of getters. I think we should apply it.

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-10:
--------------------------------

    Attachment: thrift-10-v2.patch

Here's an updated version of this patch. I made a few changes:
 * Used final members instead of accessors in descriptor classes
 * Made the accesor classes final
 * Instead of MYSTRUCT_STRUCT, I just named the static struct descriptor STRUCT_DESC
 * Instead of MY_FIELD, I named the field descriptors MY_FIELD_DESC
 * I had to update if/reflection_limited.thrift to have the correct apache namespace, and regenerated the reflection.limited code
 * I found a duplicate FieldMetaData.java that should have been deleted but was not, so I whacked that

In a previous tiny commit, I added SerializationBenchmark.java, which is geared up to test this change. In trunk, writing 10M OneOfEachs without any real io takes 17879, while with this patch it takes 17022, which is about 4.8% speedup. The deserialization test also shows a measurable improvement, but because io has to be included in that test, it's not as big of a percent speedup (about 2%). (I could probably amend that test to not count some of the non-serialization time, but I'm being lazy.) So, it seems like this patch should be a modest performance gain and a simplification to the generated code and protocols. 

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>             Fix For: 0.1
>
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java, thrift-10-v2.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Johan Stuyts (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606282#action_12606282 ] 

Johan Stuyts commented on THRIFT-10:
------------------------------------

That is another solution to the same problem. It is fine with me. I only used getters because I prefer those. There is no other reason for using them.

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Johan Stuyts (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606660#action_12606660 ] 

Johan Stuyts commented on THRIFT-10:
------------------------------------

I doubt if making the members final will have any effect on the performance. The compiler and runtime are very smart (as I found out again). When I run the benchmark, with memory limited to 16 MB to force garbage collection to occur during the benchmark, I only get a performance improvement of less than 6 %. And the only thing the benchmark actually does besides serializing structs is counting the number of bytes that are written. In production environments the improvement will probably be immeasurable. You would expect that instantiating millions of objects, initializing their fields and garbage collecting them would consume more time.

Here are the numbers:
- without the patch (ms): 2093, 2094, 2110, 2110, 2093
- with the patch (ms): 2015, 1968, 1984, 1969, 1985

I removed the highest and lowest measurement from each set, added the remaining three measurements and divided the results:
- without the patch (ms): 2093 + 2094 + 2110 = 6297
- with the patch (ms): 1984 + 1969 + 1985 = 5938
- divided: 5938 / 6297 = 0,943 = 5,7 % improvement

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Johan Stuyts (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Stuyts updated THRIFT-10:
-------------------------------

    Attachment: ImmutableProtocolComponentDescriptors2.patch

Make the component descriptors used by protocols in package {{com.facebook.thrift.protocol}} immutable. This allows singleton, constant instances to be used during serialization, and during deserialization if a non-reflective protocol is used.

I do not expect a performance boost. It is a minor optimization (and might even be a micro-optimization).

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671538#action_12671538 ] 

David Reiss commented on THRIFT-10:
-----------------------------------

Feel free to ci if you fix all three of these things.

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Assignee: Bryan Duxbury
>            Priority: Minor
>             Fix For: 0.1
>
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java, thrift-10-v2.patch, thrift-10-v3.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599951#action_12599951 ] 

David Reiss commented on THRIFT-10:
-----------------------------------

Can you post the patch again here so you can mark it as Apache-okay?

Can you also post either the second patch or a combined patch so we can measure the performance boost before committing this?  (Committing it will require regenerating all generated code, so I'd rather wait until we have everything ready before breaking stuff.)

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613518#action_12613518 ] 

Todd Lipcon commented on THRIFT-10:
-----------------------------------

+1 on the idea of making everything mutability.

I haven't had a chance to apply the patch, recompile the generator, and run the tests.

Assuming someone has done this and everything checked out, I say commit away!

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury reassigned THRIFT-10:
-----------------------------------

    Assignee: Bryan Duxbury

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Assignee: Bryan Duxbury
>            Priority: Minor
>             Fix For: 0.1
>
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java, thrift-10-v2.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671394#action_12671394 ] 

David Reiss commented on THRIFT-10:
-----------------------------------

This is awesome.  In the future, it might even be worth pre-generating all of the TSet, TList, and maybe even TMap objects so that we can just do a lookup when reading instead of allocating a new object.

I don't know if it is worth making TMessage final, but it doesn't hurt.

In the expression "is_upper && character != '_'", the check against '_' is redundant.

There is a libc function isupper.  For most optimal C++ style, #include <cctype>, then call it as std::isupper(c).

You've added a little bit of trailing whitespace.

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Assignee: Bryan Duxbury
>            Priority: Minor
>             Fix For: 0.1
>
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java, thrift-10-v2.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "David Reiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606506#action_12606506 ] 

David Reiss commented on THRIFT-10:
-----------------------------------

Bryan, do you agree that this will not result in a performance improvement?  If so, what is the benefit?

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury resolved THRIFT-10.
---------------------------------

    Resolution: Fixed

Thanks for the thoughtful review. I just committed this.

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Assignee: Bryan Duxbury
>            Priority: Minor
>             Fix For: 0.1
>
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java, thrift-10-v2.patch, thrift-10-v3.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669445#action_12669445 ] 

Bryan Duxbury commented on THRIFT-10:
-------------------------------------

Glancing at this issue again, I wonder if there would be more substantial benefits to be had if we changed the code generator to have constant, static field descriptors and such. We could avoid lots of repeated object allocations with this change, especially for structs with lots of fields.

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Johan Stuyts (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597374#action_12597374 ] 

Johan Stuyts commented on THRIFT-10:
------------------------------------

I sent a patch to the mailing list:
http://publists.facebook.com/pipermail/thrift/2008-May/001006.html

As can be read in the post this is a partial patch as it only made the descriptors immutable. The descriptors are still created for each use. Another patch will have to change the object creations to constant instances.

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649479#action_12649479 ] 

Bryan Duxbury commented on THRIFT-10:
-------------------------------------

Should we pick up this issue again? It'd be a nice conceptual improvement and a tiny but measurable performance boost. We could also make the descriptor classes final to make them a little faster again. 

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch, SerializationBenchmarkMain.java
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-10) Descriptors used during serialization should be immutable objects

Posted by "Johan Stuyts (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606511#action_12606511 ] 

Johan Stuyts commented on THRIFT-10:
------------------------------------

I can confirm that the performance improvement is neglible. I ran some tests that only counted the bytes that were written. The performance improvement was just a few percent.

The benefit of the patch is that the code is a bit cleaner, but of course that is a matter of taste.

> Descriptors used during serialization should be immutable objects
> -----------------------------------------------------------------
>
>                 Key: THRIFT-10
>                 URL: https://issues.apache.org/jira/browse/THRIFT-10
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Library (Java)
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: ImmutableProtocolComponentDescriptors2.patch
>
>
> The descriptors for structures, messages, types, etc. in package {{com.facebook.thrift.protocol}} which are used during (de-)serialization have mutable attributes. This forces the creation of these descriptors for each use even if the data is constant in many cases.
> By changing the descriptors to be immutable, structures and protocols can use singleton, constant instances during (de-)serialization. This will improve performance a bit by preventing the creation of hundreds (thousands?) of short-lived objects per second during heavy use of Thrift.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.