Posted to dev@thrift.apache.org by "Deepak Muley (JIRA)" <ji...@apache.org> on 2012/10/14 08:50:03 UTC

[jira] [Commented] (THRIFT-597) Python THttpServer performance improvements

    [ https://issues.apache.org/jira/browse/THRIFT-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475781#comment-13475781 ] 

Deepak Muley commented on THRIFT-597:
-------------------------------------

One question:

I want to use THttpServer in my Python project, but an initial performance review with a single client issuing one request at a time (no ThreadingMixIn required at this point)
shows that THttpServer is consistently 2 to 4 times slower than TNonblockingServer in the tutorial Calculator client/server app. I observed the same in my product testing.
Is there a way to bring THttpServer's performance in line with TNonblockingServer's?

Here is the code for the TNonblockingServer client and server in Python, and its performance on Ubuntu, with client and server on a single machine.
------------------
client:
def non_blocking_server_client():

   try:

      # Make socket
      transport = TSocket.TSocket('localhost', 9090)

      # TNonblockingServer requires a framed transport; it also buffers, and raw sockets are very slow
      transport = TTransport.TFramedTransport(transport)

      # Wrap in a protocol
      protocol = TBinaryProtocol.TBinaryProtocol(transport)

      # Create a client to use the protocol encoder
      client = Calculator.Client(protocol)

      # Connect!
      transport.open()

      perform_ops(client)

      # Close!
      transport.close()

   except Thrift.TException, tx:
      print '%s' % (tx.message)


server:
def non_blocking_server():
   handler = CalculatorHandler()
   processor = Calculator.Processor(handler)
   transport = TSocket.TServerSocket(port=9090)

   server = TNonblockingServer.TNonblockingServer(processor, transport)

   print 'Starting the server...'
   server.serve()
   print 'done.'


performance timings:
$ ./PythonClient.py
ping took 0.633 ms <==================
ping()
add took 0.395 ms
1+1=2
InvalidOperation: InvalidOperation(what=4, why='Cannot divide by 0')
calculate took 0.399 ms
15-10=5
getStruct took 0.361 ms
Check log: 5

$ ./PythonClient.py
ping took 0.536 ms <==================
ping()
add took 0.362 ms
1+1=2
InvalidOperation: InvalidOperation(what=4, why='Cannot divide by 0')
calculate took 0.403 ms
15-10=5
getStruct took 0.364 ms
Check log: 5
------------------
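
For reference, perform_ops is not shown in this post. The "X took N ms" lines above could be produced by a small wrapper along these lines. This is only a sketch: timed_call and fake_add are hypothetical names, and fake_add stands in for a real client method such as client.add.

```python
import time


def timed_call(name, func, *args):
    """Call func(*args), print the elapsed wall-clock time in ms,
    and return (result, elapsed_ms)."""
    start = time.time()
    result = func(*args)
    elapsed_ms = (time.time() - start) * 1000.0
    print('%s took %.3f ms' % (name, elapsed_ms))
    return result, elapsed_ms


def fake_add(a, b):
    # Hypothetical stand-in for client.add(a, b); no Thrift server needed.
    return a + b


result, elapsed_ms = timed_call('add', fake_add, 1, 1)
print('1+1=%d' % result)
```

With a real client, the call would look like timed_call('add', client.add, 1, 1). Note that this measures the full round trip as seen by the client, not the server-side handler time.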

Here is the code for the THttpServer client and server in Python, and its performance.
------------------
client:
def http_server_client():
   try:

      path = "http://%s:%s/" % ('127.0.0.1', 9090)

      transport = THttpClient.THttpClient(uri_or_host=path)

      # Wrap in a protocol
      protocol = TBinaryProtocol.TBinaryProtocol(transport)

      # Create a client to use the protocol encoder
      client = Calculator.Client(protocol)

      # Connect!
      transport.open()

      perform_ops(client)

      # Close!
      transport.close()

   except Thrift.TException, tx:
      print '%s' % (tx.message)


server:
def http_server():
   handler = CalculatorHandler()
   processor = Calculator.Processor(handler)
   pfactory = TBinaryProtocol.TBinaryProtocolFactory()

   server = THttpServer.THttpServer(processor, ('127.0.0.1', 9090), pfactory)

   print 'Starting the server...'
   server.serve()
   print 'done.'


performance timings:
$ ./PythonClient.py
ping took 1.535 ms <==================
ping()
add took 0.972 ms
1+1=2
InvalidOperation: InvalidOperation(what=4, why='Cannot divide by 0')
calculate took 0.929 ms
15-10=5
getStruct took 0.943 ms
Check log: 5

$ ./PythonClient.py
ping took 1.243 ms <==================
ping()
add took 0.944 ms
1+1=2
InvalidOperation: InvalidOperation(what=4, why='Cannot divide by 0')
calculate took 0.930 ms
15-10=5
getStruct took 0.925 ms
Check log: 5
------------------

Server-side timings are quite low.
In my project I observed that on the client side, the generated send_<api> code takes most of the time. It internally calls httplib's getresponse, which is probably waiting for something to arrive on the wire?
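
One way to check where the client time goes is to run a single call under cProfile and sort by cumulative time. The sketch below only shows the profiling mechanics: slow_send and client_call are hypothetical stand-ins for the generated send_<api> code and a client method, and the sleep merely simulates waiting on the wire.

```python
import cProfile
import pstats
import time

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3


def slow_send():
    # Hypothetical stand-in for generated send_<api> code that blocks
    # while waiting for the HTTP response.
    time.sleep(0.01)


def client_call():
    # Hypothetical stand-in for a client method such as client.ping().
    slow_send()


def profile_call(func):
    """Run func under cProfile and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stream = StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats('cumulative').print_stats(10)
    return stream.getvalue()


report = profile_call(client_call)
print(report)
```

With a real client, passing something like client.ping to profile_call should show whether the time is spent inside httplib's getresponse or elsewhere.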

I would really appreciate it if someone could shed some light on how to improve THttpServer performance in Thrift 0.8.0.
                
> Python THttpServer performance improvements
> -------------------------------------------
>
>                 Key: THRIFT-597
>                 URL: https://issues.apache.org/jira/browse/THRIFT-597
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Python - Library
>            Reporter: David Reiss
>            Assignee: David Reiss
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: ASF.LICENSE.NOT.GRANTED--v1-0002-THRIFT-597.-python-Make-THttpServer-use-buffering.patch, ASF.LICENSE.NOT.GRANTED--v1-0003-THRIFT-597.-python-Allow-THttpServer-to-use-Threadi.patch
>
>
> This class was originally meant for functional testing only, so performance wasn't a concern.  But now I'm using it for load testing. :)  Two patches here.  The first enables buffered I/O.  The second allows the http server class to be specified, which allows users to use the ThreadingMixin.
