You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by "Will Pierce (JIRA)" <ji...@apache.org> on 2011/03/28 01:53:05 UTC

[jira] [Issue Comment Edited] (THRIFT-1115) dynamic python object (de)serialization, using __slots__ for less memory, and custom object inheritance

    [ https://issues.apache.org/jira/browse/THRIFT-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011872#comment-13011872 ] 

Will Pierce edited comment on THRIFT-1115 at 3/27/11 11:52 PM:
---------------------------------------------------------------

Attaching test benchmarking script "test_size.py" that measures the RAM used and the Time elapsed to instantiate 1 million thrift objects.

To use this code:
# cd into the current thrift-svn trunk/
# apply the patch to trunk
# run ./bootstrap.sh && ./configure --with-ruby=no && make
# cd test/py  (put the test_size.py script here into this test/py/ directory)
# make check
# hit control-C after the tests start up, since all want is the test-suite to build out the various gen-py-\* directories with different variations of generated python code.

The test code takes 3 arguments:
* arg1: directory to add to sys.path to import the generated code. One of gen-py (same as gen-py-default), gen-py-dynamicslots, gen-py-newstyle, gen-py-newstyleslots, gen-py-dynamic
* arg2: name of class from DebugProtoTest and ThriftTest to instantiate for size/speed tests: i.e. Bonk, Empty, OneOfEach, CompactProtoTestStruct, Bools, etc...
* args: optional, the number of objects to instantiate.  Defaults to 1 million.  If you have the RAM, try 10 million, i.e. 10000000

This requires linux, as it gets the RAM usage at end-of-execution from the VmRSS line of /proc/PID/status

The two particular test-cases that are most interesting (to me) are: gen-py-default vs. gen-py-dynamicslots   The gen-py-default is what "-gen py" produces, using old-style non-dynamic generated code.  The gen-py-dynamicslots directory contains the results of running "-gen py:dynamic,slots" which produces new-style classes that inherit from the TProtocolDynamic class for serialization/deserialization and uses {{__slots__}} for more efficient storage and faster execution.

I will post some test results from this script shortly, with the time and space savings that the dynamic/slots based code produces.  The savings surprised me.

      was (Author: willp):
    Attaching test benchmarking script "test_size.py" that measures the RAM used and the Time elapsed to instantiate 1 million thrift objects.

To use this code:
# cd into the current thrift-svn trunk/
# apply the patch to trunk
# run ./bootstrap.sh && ./configure --with-ruby=no && make
# cd test/py
# make check
# hit control-C after the tests start up, since all want is the test-suite to build out the various gen-py-\* directories with different variations of generated python code.

The test code takes 3 arguments:
* arg1: directory to add to sys.path to import the generated code. One of gen-py (same as gen-py-default), gen-py-dynamicslots, gen-py-newstyle, gen-py-newstyleslots, gen-py-dynamic
* arg2: name of class from DebugProtoTest and ThriftTest to instantiate for size/speed tests: i.e. Bonk, Empty, OneOfEach, CompactProtoTestStruct, Bools, etc...
* args: optional, the number of objects to instantiate.  Defaults to 1 million.  If you have the RAM, try 10 million, i.e. 10000000

This requires linux, as it gets the RAM usage at end-of-execution from the VmRSS line of /proc/PID/status

The two particular test-cases that are most interesting (to me) are: gen-py-default vs. gen-py-dynamicslots   The gen-py-default is what "-gen py" produces, using old-style non-dynamic generated code.  The gen-py-dynamicslots directory contains the results of running "-gen py:dynamic,slots" which produces new-style classes that inherit from the TProtocolDynamic class for serialization/deserialization and uses {{__slots__}} for more efficient storage and faster execution.

I will post some test results from this script shortly, with the time and space savings that the dynamic/slots based code produces.  The savings surprised me.
  
> dynamic python object (de)serialization, using __slots__ for less memory, and custom object inheritance
> -------------------------------------------------------------------------------------------------------
>
>                 Key: THRIFT-1115
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1115
>             Project: Thrift
>          Issue Type: New Feature
>          Components: Python - Compiler, Python - Library
>            Reporter: Will Pierce
>            Assignee: Will Pierce
>         Attachments: THRIFT-1115.python_dynamic_code_and_slots_v1.patch, test_size.py
>
>
> This patch adds several new features to the compiler for python and the python libraries, and exercises the new features with expanded unit testing.
> This adds support for generating python classes that have no {{read()}} or {{write()}} methods. Instead, the generated classes inherit from a new base class, {{TProtocolDynamic}}. This new class implements the de/serialization with {{read()}} and {{write()}} methods that iterate over the derived class's "{{thrift_spec}}" class member that describes the structure and types of the thrift.  This dynamic evaluation works with both binary and compact protocols, and has the same hook as before for delegating de/serialization to fastbinary.so for the "accelerated binary" protocol.  This new baseclass {{read()}} method may even be more efficient than the generated explicit {{read()}} code for objects with lots of attributes because it doesn't have a case/switch style series of "{{if field_id == X...}}" nested inside a loop.  Instead, it indexes the decoded field ID into the {{thrift_spec}} tuple directly.  That efficiency gain is probably just noise though, since the dynamic process probably uses more CPU later on, though I haven't benchmarked it.
> If the 'dynamic' flag is given as a -gen py: flag to the compiler, then the generated classes no longer get individual {{\_\_repr\_\_}} and {{\_\_eq\_\_}} and {{\_\_ne\_\_}} methods, instead they inherit from the TProtocolDynamic base class implementation, which uses {{\_\_slots\_\_}} instead of {{\_\_dict\_\_}} for repr and equality testing.
> When "dynamic" python classes are generated, they have very little code, just a constructor and class data.  All the work of serialization and deserialization is done by the base class.  This produces about 980 lines for DebugProtoTest vs. 3540 lines in default "\-gen py" mode, or about 1/3 the original code size.
> The {{\_\_slots\_\_}} support is available without requiring the dynamic base class, so users can save memory using the slots flag to generate non-dict based instances.  The memory difference between dict and slots based objects is hard to measure, but seems to be around 10x smaller using slots, as long as the base class also uses {{\_\_slots\_\_}}.  If the generated classes are old-style, and use slots, there's no memory savings at all, because the base class still creates a {{\_\_dict\_\_}} object for every instance.  Python is just tricky when it comes to using {{\_\_slots\_\_}} best.
> The memory savings is pretty astounding using new-style classes and {{\_\_slots\_\_}}.  Building DebugProtoTest.thrift with: -gen py:dynamic,slots versus \-gen py results in some pretty amazing memory savings.  I tested by instantiating 1 million of the heavy DebugProtoTest.thrift's {{CompactProtoTestStruct()}}, which has 49 attributes in it, using regular "\-gen py" code versus "{{\-gen py:dynamic,slots}}" and compared the VmRSS resident memory usage of both processes.  I didn't set any values to any attributes, so every attribute was left with the null value, None.  The slots technique used 441 MB with slots vs. 3485 MB using non-slots, non-dynamic generated code.  That's about 13% of the original size, or 87% memory savings.
> I tried the same test using a very tiny object instead, the DebugThrift.thift {{Bonk()}} class, which only has two fields.  For this, I made 10 million instances of {{Bonk()}} and the results were very similar: 730 MB with slots vs. 2622 MB using non-slots.
> The speed difference between non-slots and slots is about 25% better with slots, or 14.6 seconds for non-slots 10 million Bonk() test, versus 11.1 seconds for the same slots test, or a savings of 24% time.  This only measured object creation time, not serialization/deserialization or instance attribute access.
> If the user wants to subclass the {{TProtocolDynamic}} class and override the read/write methods, or add more functionality, this patch also adds a {{dynbase=CLS}} thrift compiler option and a "{{dynimport='from foo import *}}" option, to specify a different base class (CLS) for the generated "dynamic" objects, and an import line that brings that base class into scope.
> The python unit tests now get yet another test loop layer, with the {{Makefile.am}} specifying multiple thrift generation lines, exercising the possible combinations of {{thrift \-gen py:blah}}, sending the output to multiple gen-py-blah output directories (like gen\-py\-default, gen\-py\-dynamicslots, gen\-py\-newstyle, etc), so that {{RunClientServer.py}} can then loop over each of these and do all of its testing (which is combinatorially exhaustive) on each of those subdirectories.  It's a pretty insane number of tests now, over 390 individual tests on python >= 2.5 (ssl + multiprocessing).  Since the test/py "make check" tests aren't done for the 'casual' installer, I think it's acceptable to have the "make check" tests take this long to complete- but it is annoying to have to wait a few minutes for all the tests to finish/succeed.  I think it could be improved a fair bit.
> This patch also makes a few other small changes:
> * adds a comment to the top of each generated .py file with the options to \-gen, i.e. "{{py:dynamic,slots}}", to make it easier to figure out how the thrift compiler generated code was created, after the fact.  This should probably be backported to other languages.
> * added another test for inequality to {{SerializationTest.py}} to compare objects of different types, to exercise the {{\_\_eq\_\_}} code more
> * added \-\-genpydir cmdline options to {{TestServer.py}} and {{TestClient.py}} to let {{RunClientServer.py}} loop over the gen\-py\-\* directories as a kind of test controller.
> * reduced verbosity on the TestServer/Client.py code from 2 to 1, since the number of invocations is now huge.
> * changed TestServer's intentional Exception to have a message "Exception test PASSES." instead of "foo", which previously made the unit test output look kind of scary.
> * Adds container read/write methods to {{TProtocol}}, only used by {{TProtocolDynamic}} for "dynamic" generated classes
> * Added a {{_VALUES_TO_NAMES}} list to {{TType}} in {{Thrift.py}} to make it easier to log about {{TType}} related (de)serialization issues.
> * Changes the import line in generated code from "{{from thrift.Thrift import *}}" to "{{from thrift.Thrift import TType, TMessageType}}" to be a little more controlled about what is imported into the generated\-code namespace.  May be slightly more efficient too.
> * reduces the inter\-test {{sleep()}} delays somewhat, to try to speed up the "make check" unit testing a bit.  hopefully the delays are large enough that no spurious errors pop up.
> Things to ponder:
> Using -gen py:dynamic,slots actually results in python rejecting any attempt to assign to a non-existent member variable of a thrift object.  It's not the same as statically-typed languages blowing up at compile time for the same reason, but it could expose more subtle user-code errors that otherwise are undetected during serialization.
> There's an interesting edge case in python 2.4 that arises due to old-style classes and exception processing that only shows up with new-style classes (those derived ultimately from object).  I'm not sure how to handle this, since it's a problem that you don't know exists at code-generation time, since we can't really know what version of python is going to execute the generated code.  Usually it just doesn't matter, but the case of exceptions with multiple inheritance is an edge case, it seems.  Since this is only a tricky problem for the unit tests, I put some failure detection code into {{TestClient.py}} to skip the exception testing if the generated class causes a {{TypeError}} exception when raised, instead of itself.  The test code prints a warning: "{{Skipping testing of Xception because python (2, 4) does not support this type of exception (gen py is: gen\-py\-dynamic)}}".
> I'm not sure the {{TProtocolDynamic}} class name is the best possible name for the base class for "dynamic" objects to derive from.  The {{TProtocolDynamic}} class does embody the underlying protocol, at an abstract level, but the only other class in thrift/protocol/ that it is similar to is the {{TProtocolBase}} class.
> This patch adds more dynamic behavior to the thrift library, and it makes sense that we'd want to take it another step and implement a python class that can parse a .thrift file, and generate the corresponding message/struct/etc classes at runtime.  This patch doesn't do that, but I think it makes it easier.  Obviously the use-cases are compelling for generating the actual per-class {{read()}} and {{write()}} methods at thrift-compile time.  But there's some interesting possibilities ahead.
> The unit testing code needs a little more TLC... I'm not too familiar with best practices here, but I'd love to see some code-coverage metrics produced by the unit testing / test cases.  It would be nice to know if some code paths aren't being exercised.  I won't be surprised if there are edge-cases that break the dynamic/slots generated code.  It would be nice to know of any, so we can update the unit tests to explore it (and fix whatever it is).  The unit tests don't exercise the py:twisted compatible code at all, so there's likely some lurking bugs in there.
> With this patch, the new output from {{thrift -h}} relating to python is:
> {noformat}
>   --gen STR   Generate code with a dynamically-registered generator.
>                 STR has the form language[:key1=val1[,key2,[key3=val3]]].
>                 Keys and values are options passed to the generator.
>                 Many options will not require values.
> Available generators (and options):
> [ ... snip ... ]
>   py (Python):
>     new_style:       Generate new-style classes.
>     twisted:         Generate Twisted-friendly RPC services.
>     slots:           Generate code using slots for instance members.
>     dynamic:         Generate dynamic code, less code generated but slower.
>     dynbase=CLS      Derive generated classes from class CLS instead of TProtocolDynamic.
>     dynimport='from foo.bar import CLS'
>                      Add an import line to generated code to find the dynbase class.
> {noformat}
> Sorry for such a long ticket body.  I'll attach the patch in a minute or so,

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira