Posted to user@thrift.apache.org by Per Knudsgaard <pk...@rgbnetworks.com> on 2011/05/26 01:33:21 UTC

Connection issues with Python server / C++ client

Hi,

I am having a small problem with a small client/server application and I am hoping for an easy answer :)

The server is written in Python, the client in C++, and I am using Thrift with a buffered transport. I have tried TSimpleServer and TThreadedServer with the same behavior. The Thrift version is 0.5.0.

What I am trying to do is have the client send oneway updates to the server on a regular basis. Some of the updates are large (700+ bytes) and some are smaller (10-20 bytes). What I am seeing is the following:

1. I kill the Python server (kill -9...)

2. The next message throws an exception on the client.

3. The client drops the message (single message loss is ok) and marks the connection as failed.

4. The next message causes the connection to be re-opened before it is sent.

At this point, the server will not get any messages (the message from step 4 will disappear, and further messages will be dropped). Neither the server nor the client will produce any indication that there is a problem.

Looking at a tcpdump, I find that when the connection is re-opened in step 4, the message from step 2 is re-sent, followed by the new message. More precisely, it looks like the first ~500 bytes of the first message are sent and the rest are dropped (it is hard to tell exactly what is dropped since I am using a BinaryProtocol). Adding some instrumentation to the generated Thrift code shows it blocking in a read call, waiting for half a megabyte of data. I assume that means the parser went off the tracks when it didn't get the full message?
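
The "waiting for half a megabyte" symptom is what one would expect if a length-prefixed reader starts decoding in the middle of a message. The following is a minimal sketch of that effect using only the Python standard library. It is not Thrift code: `encode_string` and `decode_length` are hypothetical helpers that only mimic the 4-byte big-endian length prefix the binary protocol puts in front of strings, and the two-byte misalignment is an arbitrary choice for illustration.

```python
import struct

def encode_string(data: bytes) -> bytes:
    """Length-prefixed string: 4-byte big-endian signed length, then the bytes."""
    return struct.pack(">i", len(data)) + data

def decode_length(stream: bytes, offset: int) -> int:
    """Return the length prefix a reader would see if it started at `offset`."""
    (length,) = struct.unpack_from(">i", stream, offset)
    return length

# Two back-to-back strings, standing in for the fields of one message.
stream = encode_string(b"status") + encode_string(b"")

aligned = decode_length(stream, 0)     # 6: the reader starts on a field boundary
misaligned = decode_length(stream, 2)  # 422772 (0x00067374): payload bytes read as a length

print(aligned, misaligned, len(stream))
```

A reader that trusts the misaligned value will try to read hundreds of kilobytes from a 14-byte stream and, on a blocking socket, simply wait. Nothing in the byte stream marks the error, which would match the lack of any diagnostic on either side.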

Does any of this sound familiar? How much of the client should be re-created when a connection fails?

Thanks,

-- Per.

Re: Connection issues with Python server / C++ client

Posted by Vinod Gupta Tankala <tv...@socialyantra.com>.

I faced something similar a couple of days ago, though my server was in Java.
I then upgraded to Thrift 0.6.0.
The other thing I had to do was make the server multi-threaded. That only
solves part of your problem. The main issue is that the server has an
established but stale connection with a client that has probably died without
sending a reset, so the server is stuck forever. I verified this with
netstat and a jstack dump.
So the other change you need to make is to give a client timeout to the
TServerTransport that you pass to TSimpleServer.
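
The effect of such a client timeout can be sketched with plain sockets. This is an illustration only, not the Thrift server API: `read_exact` is a hypothetical helper, and the timeout values are arbitrary. The point is that a read with a timeout returns control to the server instead of blocking forever on a peer that has gone silent.

```python
import socket

def read_exact(conn, nbytes, timeout_s):
    """Read exactly nbytes, or return None if the peer goes quiet or closes."""
    conn.settimeout(timeout_s)
    chunks = []
    remaining = nbytes
    try:
        while remaining > 0:
            chunk = conn.recv(remaining)
            if not chunk:          # orderly close by the peer
                return None
            chunks.append(chunk)
            remaining -= len(chunk)
    except socket.timeout:         # no data arrived within timeout_s
        return None
    return b"".join(chunks)

# A connected pair of sockets stands in for server side and client side.
server_side, client_side = socket.socketpair()

silent = read_exact(server_side, 4, 0.05)   # client sends nothing: gives up after 50 ms
client_side.sendall(b"ping")
answered = read_exact(server_side, 4, 0.5)  # data is available: returned normally

server_side.close()
client_side.close()
```

On a `None` result, a server would typically close that connection and go back to accepting new ones.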

I am guessing this is the issue you are facing. Hope this helps.

Thanks

On Wed, May 25, 2011 at 4:37 PM, Chris Morgan <ch...@gmail.com> wrote:

> I'd recommend trying the latest thrift release, your issue may have
> been fixed already. If that doesn't work you might try the latest
> snapshot release or straight from source control (can't recall which
> source control thrift is using).
>
> Chris
>

Re: Connection issues with Python server / C++ client

Posted by Chris Morgan <ch...@gmail.com>.

On May 25, 2011, at 8:09 PM, Per Knudsgaard <pk...@rgbnetworks.com> wrote:

>   Thanks Chris,
>
>   Updating to the latest feels a little like rebooting your windows machine.  It may have fixed the problem but it didn't actually help you understand why it needed rebooting or what to do to avoid problems in the future.
>

I recall seeing a patch for some Python connection issue. You might be
able to find it by searching the dev mailing list or just via
Google.

It's more like upgrading from one OS to a newer version. In your case,
you mention using an old version. A newer and hopefully more
featureful and less buggy version is available. Why beat your head
against the wall with the older version?


>   In the real world, services tend to come and go.  Re-establishing connections is an essential part of a networked system and my question is as much about how I solve my immediate problem as it is about what a client needs to do to reconnect to a server.  Should I be creating fresh sockets/transports/protocols or can I simply call open and reuse the ones I already created?
>

I don't quite understand. If the connection is lost, of course you need
to open it again and reconnect if you want to communicate.



>   -- Per.
>

RE: Connection issues with Python server / C++ client

Posted by Per Knudsgaard <pk...@rgbnetworks.com>.

Thanks Chris,

Updating to the latest feels a little like rebooting your Windows machine. It may fix the problem, but it doesn't actually help you understand why it needed rebooting or what to do to avoid problems in the future.

In the real world, services tend to come and go. Re-establishing connections is an essential part of a networked system, and my question is as much about how I solve my immediate problem as it is about what a client needs to do to reconnect to a server. Should I be creating fresh sockets/transports/protocols, or can I simply call open and reuse the ones I already created?
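
To make the two options concrete, here is a toy model of what the tcpdump earlier in the thread suggests: a write buffer that keeps its pending bytes when a flush fails. `FakeSocket`, `BufferedWriter`, and `send` are hypothetical stand-ins written for this sketch, not Thrift classes, and whether the 0.5.0 C++ buffered transport behaves this way is an assumption based only on the observed re-send.

```python
class FakeSocket:
    """Stand-in for a TCP socket that records what was sent."""
    def __init__(self, up=True):
        self.up = up
        self.sent = b""

    def sendall(self, data):
        if not self.up:
            raise ConnectionError("peer is gone")
        self.sent += data

class BufferedWriter:
    """Toy write buffer that keeps its pending bytes when a flush fails."""
    def __init__(self, sock):
        self.sock = sock
        self.pending = bytearray()

    def write(self, data):
        self.pending += data

    def flush(self):
        self.sock.sendall(bytes(self.pending))  # may raise, leaving pending intact
        self.pending.clear()

def send(writer, message):
    writer.write(message)
    writer.flush()

# The server dies, so the first send fails and b"OLD" stays in the buffer.
writer = BufferedWriter(FakeSocket(up=False))
try:
    send(writer, b"OLD")
except ConnectionError:
    pass

# Option 1: reuse the old writer on a new connection. The stale bytes go out first.
reused_sock = FakeSocket()
writer.sock = reused_sock
send(writer, b"NEW")

# Option 2: build a fresh writer along with the fresh connection.
fresh_sock = FakeSocket()
send(BufferedWriter(fresh_sock), b"NEW")

print(reused_sock.sent, fresh_sock.sent)
```

In this model, reusing the writer puts the leftover bytes of the failed message in front of the new one, while rebuilding the whole stack sends only the new message. If the real transport holds state in the same way, creating fresh socket, transport, and protocol objects after a failure would be the safer choice.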

   -- Per.

Re: Connection issues with Python server / C++ client

Posted by Chris Morgan <ch...@gmail.com>.

I'd recommend trying the latest Thrift release; your issue may have
been fixed already. If that doesn't work, you might try the latest
snapshot release or build straight from source control (I can't recall
which source control Thrift is using).

Chris