Posted to user@thrift.apache.org by Mario Emmenlauer <ma...@emmenlauer.de> on 2017/07/03 14:51:33 UTC

how to handle network downtime gracefully?

How can I gracefully handle network problems? In gRPC, I used to
create the full interface even if the network was down, and later,
when I tried to call RPC methods, gRPC would hang until it could
connect. That was quite simple: when the network came back, the RPC
eventually succeeded.


What is the most graceful way to handle an unreliable network
connection in Thrift?


Background:
I'm building a cross-platform API with a Java server and a C++ client
in Thrift. I use the binary protocol to send large files. I use two
transport channels: one that uses SSL to send the login credentials,
and a second one that may later be used to send large datasets (after
the login has succeeded).

Currently I create the full interface. But if the network is down,
I get an exception somewhere after creating the secure socket, with
error "No more data to read".

All the best,

    Mario Emmenlauer


--
BioDataAnalysis GmbH, Mario Emmenlauer      Tel. Buero: +49-89-74677203
Balanstr. 43                   mailto: memmenlauer * biodataanalysis.de
D-81669 München                          http://www.biodataanalysis.de/

Re: how to handle network downtime gracefully?

Posted by Mario Emmenlauer <ma...@emmenlauer.de>.

Dear Randy,

thanks a lot for the many hints and insights, it's very much appreciated!

I will certainly think about the chunked up- and download. Actually as a
first step, it seems already a reasonable improvement to implement a small
protocol for chunked data transfer on top of thrift RPC :-)

About the network disconnect and reconnect, I will do as you suggest! What
parts of the connection can be re-used? Basically my code currently boils
down to:
 - create a socket
 - create a transport on top of the socket
 - create a protocol on top of the transport
 - create the client interface on top of the protocol

I don't know if it's always like this, but I gathered this from examples.
After a disconnect, when I want to reconnect, which objects would be
sensible to re-create, and which ones can just be re-used?
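
For illustration only, the four layers listed above can be sketched as a single factory function. This is a rough sketch that does not settle which Thrift objects can safely be re-used after a disconnect: `Socket`, `Transport`, `Protocol`, `Client`, and `make_client` are hypothetical stand-ins, not the Thrift classes. It only shows the conservative option of re-creating all four layers on every reconnect.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in types, NOT the Thrift classes: they only mirror the layering
// socket -> transport -> protocol -> client described above.
struct Socket    { std::string host; int port; };
struct Transport { std::shared_ptr<Socket> socket; };
struct Protocol  { std::shared_ptr<Transport> transport; };
struct Client    { std::shared_ptr<Protocol> protocol; };

// Builds all four layers from scratch. Calling this again after a disconnect
// replaces every object, so no state from the broken connection survives.
inline std::unique_ptr<Client> make_client(const std::string& host, int port) {
    assert(port > 0);
    auto socket    = std::make_shared<Socket>(Socket{host, port});
    auto transport = std::make_shared<Transport>(Transport{socket});
    auto protocol  = std::make_shared<Protocol>(Protocol{transport});
    return std::unique_ptr<Client>(new Client{protocol});
}
```

Keeping the construction in one place means a reconnect is a single assignment such as `client = make_client(host, port);`, at the cost of re-creating objects that might have been re-usable.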

Thanks and all the best,

   Mario




On 03.07.2017 18:13, Randy Abernethy wrote:
> Hi Mario,
> 
> The simplest form of error recovery (though not necessarily always the most
> efficient) in RPC is to disconnect and reconnect. A reasonable starting
> place is to write call code that operates within a protected block (e.g. a
> "try" block). Then, when a non-application error is thrown, the catch block
> optionally disconnects (you may already be disconnected) and attempts to
> reconnect and/or retry the call. This is a simple but reliable approach, and
> once it is working you can optimize as needed.
> 
> It is worth pointing out that RPC (of any kind) is not ideal for large
> file transfer. RPC (Remote Procedure Call) is designed to let you invoke
> remote functions and retrieve their results. The function call is an atomic
> thing: it either completely succeeds or completely fails. "Procedure Call"
> also implies some manageably sized block of arguments and return values in
> most world views. This means that all of the many small and large
> architectural decisions made when creating Thrift were predicated on
> reasonably sized inputs and outputs (roughly under 1 MB).
> 
> If you try to transfer a file by passing its data as an argument to a
> server and the operation fails, you make no progress. It may make sense to
> use RPC directly as a file transfer scheme for small files, where retrying
> the entire transfer might be reasonable. For large files, though, it is
> better to create an application-level protocol where you pass modestly sized
> chunks of the file (in the 1 MB range, say). This way, if a chunk fails, you
> only re-transmit that chunk rather than the entire file. Also, transferring
> really large files (1 GB+) in one go can overflow (or overtax) buffers on
> the client, but particularly on the server. Using chunks avoids this issue.
> You can easily write a library wrapper for your chunked transfer that
> allows clients to make a single call to transfer a large file, with many RPC
> transfers happening behind the scenes.
> 
> There are lots of ways to skin a cat, of course. Just some thoughts.
> 
> Very best,
> Randy



Best regards,

    Mario Emmenlauer



Re: how to handle network downtime gracefully?

Posted by Randy Abernethy <ra...@apache.org>.

Hi Mario,

The simplest form of error recovery (though not necessarily always the most
efficient) in RPC is to disconnect and reconnect. A reasonable starting
place is to write call code that operates within a protected block (e.g. a
"try" block). Then, when a non-application error is thrown, the catch block
optionally disconnects (you may already be disconnected) and attempts to
reconnect and/or retry the call. This is a simple but reliable approach, and
once it is working you can optimize as needed.
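
As a minimal sketch of that try/catch pattern, assuming nothing about the actual client code: `call_with_reconnect`, its parameters, and the fixed delay are hypothetical and not part of Thrift. The `call` argument stands for any RPC invocation, and `reconnect` for whatever closes and reopens the connection.

```cpp
#include <cassert>
#include <chrono>
#include <exception>
#include <functional>
#include <thread>

// Hypothetical helper, not part of Thrift: runs `call`, and when it throws,
// waits, invokes `reconnect`, and tries again, up to `max_attempts` calls.
// This sketch catches every std::exception for brevity. A real client would
// catch only transport-level errors (in Thrift C++, presumably
// apache::thrift::transport::TTransportException) and let application
// exceptions propagate unchanged.
template <typename Result>
Result call_with_reconnect(const std::function<Result()>& call,
                           const std::function<void()>& reconnect,
                           int max_attempts,
                           std::chrono::milliseconds delay) {
    assert(max_attempts > 0);
    for (int attempt = 1;; ++attempt) {
        try {
            return call();
        } catch (const std::exception&) {
            if (attempt >= max_attempts) {
                throw;  // give up and surface the last error to the caller
            }
        }
        std::this_thread::sleep_for(delay);
        try {
            reconnect();  // close (if still open) and reopen the connection
        } catch (const std::exception&) {
            // The network may still be down; the next call() will then fail
            // and count against max_attempts.
        }
    }
}
```

With a large `max_attempts`, this roughly approximates the "hang until the network is back" behaviour described in the original question, although a growing delay between attempts would probably be kinder to the server than a fixed one.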

It is worth pointing out that RPC (of any kind) is not ideal for large
file transfer. RPC (Remote Procedure Call) is designed to let you invoke
remote functions and retrieve their results. The function call is an atomic
thing: it either completely succeeds or completely fails. "Procedure Call"
also implies some manageably sized block of arguments and return values in
most world views. This means that all of the many small and large
architectural decisions made when creating Thrift were predicated on
reasonably sized inputs and outputs (roughly under 1 MB).

If you try to transfer a file by passing its data as an argument to a
server and the operation fails, you make no progress. It may make sense to
use RPC directly as a file transfer scheme for small files, where retrying
the entire transfer might be reasonable. For large files, though, it is
better to create an application-level protocol where you pass modestly sized
chunks of the file (in the 1 MB range, say). This way, if a chunk fails, you
only re-transmit that chunk rather than the entire file. Also, transferring
really large files (1 GB+) in one go can overflow (or overtax) buffers on
the client, but particularly on the server. Using chunks avoids this issue.
You can easily write a library wrapper for your chunked transfer that
allows clients to make a single call to transfer a large file, with many RPC
transfers happening behind the scenes.
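
A rough sketch of such a wrapper, under stated assumptions: `SendChunk` and `send_in_chunks` are hypothetical names, and the callback stands in for one generated RPC method that uploads a single chunk at a given offset. The payload is held in a `std::string` only to keep the example short; a real wrapper would more likely read each chunk from a file stream.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <functional>
#include <string>

// Hypothetical callback standing in for one RPC that uploads a single chunk.
// It is expected to throw on a transport failure.
using SendChunk =
    std::function<void(std::size_t offset, const std::string& data)>;

// Sends `payload` in pieces of at most `chunk_size` bytes. A failed chunk is
// retried up to `max_attempts` times before the error is passed on, so a
// failure costs one chunk rather than the whole transfer.
// Returns the number of chunks sent.
inline std::size_t send_in_chunks(const std::string& payload,
                                  std::size_t chunk_size,
                                  int max_attempts,
                                  const SendChunk& send_chunk) {
    assert(chunk_size > 0 && max_attempts > 0);
    std::size_t chunks = 0;
    for (std::size_t offset = 0; offset < payload.size();
         offset += chunk_size) {
        // substr clamps the length, so the last chunk may be shorter.
        const std::string chunk = payload.substr(offset, chunk_size);
        for (int attempt = 1;; ++attempt) {
            try {
                send_chunk(offset, chunk);
                break;  // this chunk arrived; move on to the next one
            } catch (const std::exception&) {
                if (attempt >= max_attempts) throw;
                // A real client would reconnect here before retrying.
            }
        }
        ++chunks;
    }
    return chunks;
}
```

Passing the offset with every chunk lets the server place the data correctly even when a chunk is sent twice, for example when the first attempt reached the server but its reply was lost.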

There are lots of ways to skin a cat, of course. Just some thoughts.

Very best,
Randy
