Posted to dev@thrift.apache.org by "James E. King, III (JIRA)" <ji...@apache.org> on 2010/03/23 13:41:27 UTC
[jira] Commented: (THRIFT-66) Allow multiplexing multiple services over a single TCP connection

    [ https://issues.apache.org/jira/browse/THRIFT-66?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12848677#action_12848677 ] 

James E. King, III commented on THRIFT-66:
------------------------------------------

I have implemented a similar multiplexed runtime in C# and tested it with 200 concurrent clients that are both pushing and receiving.  It uses no sleep loops and blocks properly when waiting to send or receive data.  I think it would be a good idea to define a standard specification for multiplexed communications so that all the language runtimes stay compatible with each other.

Definition:  Once the client connects to the server, multiple virtual communication channels are opened in both directions for reliable RPC that can be initiated by either end.  Because there are multiple communication channels, each side can choose to act as the client or the server on a given channel.  Each virtual transport handles a single Thrift service declaration.  An added benefit, beyond two-way communication, is that you can split your RPCs into functional areas and service them on different virtual channels.  This may help to clean up larger .thrift files.

Implementation:  The wire format is simple, and as follows:

01: byte, range 01-FF, indicates the virtual transport ID
02-05: int32, network byte order, indicates the payload length
06-end: payload
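
To make the layout above concrete, here is a minimal Python sketch of packing and unpacking one frame.  The names encode_frame and decode_frame are hypothetical and are not part of any Thrift library; the sketch also assumes the length field is a signed int32 and that transport ID 0 is invalid, as the 01-FF range suggests.

```python
import struct

# 1-byte virtual transport ID followed by an int32 payload length,
# both in network (big-endian) byte order: 5 header bytes in total.
HEADER = struct.Struct(">Bi")


def encode_frame(transport_id: int, payload: bytes) -> bytes:
    """Build one frame: header plus payload (hypothetical helper)."""
    if not 1 <= transport_id <= 0xFF:
        raise ValueError("transport ID must be in the range 0x01-0xFF")
    return HEADER.pack(transport_id, len(payload)) + payload


def decode_frame(buf: bytes):
    """Return (transport_id, payload, remaining_bytes), or None if buf
    does not yet contain a complete frame (hypothetical helper)."""
    if len(buf) < HEADER.size:
        return None
    transport_id, length = HEADER.unpack_from(buf)
    if transport_id == 0 or length < 0:
        raise ValueError("malformed frame header")
    end = HEADER.size + length
    if len(buf) < end:
        return None
    return transport_id, buf[HEADER.size:end], buf[end:]
```

Returning the unconsumed bytes lets a reader keep appending socket data to a buffer and call decode_frame repeatedly until it returns None.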

This requires no changes to the compiler / compiled code.  Note that fully integrating this into the mainline will likely tuck it under the version exchange.  Here is the preferred runtime implementation:

* As there is a single communication channel, a single thread should be used to read from the socket, and a single thread should be used to write.
* The read thread should block on a read of the actual transport.  When it gets some data it should route it to the appropriate virtual transport, which is kept in a map.  The virtual transport should store incoming binary data in order and hand it out through the overridden Read function.  When there is no more data to read, Read should block.  This means that the virtual transport ideally needs some form of event to wait on, which is signaled when data is enqueued to it.
* On the write side, when a virtual transport receives a Write call, it should store the outgoing binary data in order and use an event to signal the write thread to do some work.  Ideally the write thread can wait on multiple events, including a shutdown event, get notification of which virtual transport needs a write, and then push the data out.  This makes the write thread block when there is nothing to do, and wake up when it is time to stop.
* Sleep loops should be avoided where possible.
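
The threading model in the list above could be sketched in Python roughly as follows.  This is an illustration under stated assumptions, not the C# implementation described in this comment: VirtualTransport, read_loop, write_loop and SHUTDOWN are hypothetical names, recv_frame and send_frame stand in for the real socket code, and a single blocking queue with a sentinel value replaces the "wait on multiple events" mechanism.

```python
import queue
import threading

SHUTDOWN = object()  # sentinel that tells the write thread to stop


class VirtualTransport:
    """One virtual channel (hypothetical class, not a Thrift API)."""

    def __init__(self, transport_id, outbound):
        self.transport_id = transport_id
        self._outbound = outbound            # queue drained by the write thread
        self._buffer = bytearray()           # incoming bytes in arrival order
        self._cond = threading.Condition()   # the "event" that read() waits on
        self._closed = False

    def feed(self, data):
        """Called by the read thread when a frame for this channel arrives."""
        with self._cond:
            self._buffer += data
            self._cond.notify_all()

    def close(self):
        """Wake any blocked reader so it can observe end of stream."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def read(self, size):
        """Block until data is available; return b'' once closed and drained."""
        with self._cond:
            while not self._buffer and not self._closed:
                self._cond.wait()
            chunk = bytes(self._buffer[:size])
            del self._buffer[:size]
            return chunk

    def write(self, data):
        """Queue outgoing data; the put wakes the blocked write thread."""
        self._outbound.put((self.transport_id, bytes(data)))


def read_loop(recv_frame, transports):
    """Single read thread: recv_frame() blocks and returns
    (transport_id, payload), or None when the connection ends."""
    while True:
        frame = recv_frame()
        if frame is None:
            break
        transport_id, payload = frame
        transport = transports.get(transport_id)
        if transport is not None:
            transport.feed(payload)
    for transport in transports.values():
        transport.close()


def write_loop(outbound, send_frame):
    """Single write thread: blocks on the queue, no sleep loop."""
    while True:
        item = outbound.get()
        if item is SHUTDOWN:
            return
        transport_id, payload = item
        send_frame(transport_id, payload)
```

Each loop would run on its own thread (threading.Thread), and putting SHUTDOWN on the outbound queue plays the role of the shutdown event.  Unknown transport IDs are silently dropped here; a real runtime would need to decide how to report them.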

Acknowledgement:  charlie (d o t) mas (a t) gmail (d o t) com released an initial implementation which I overhauled

Comment:  Still waiting for Dell approval to release open-source code back to the community.  Once that happens I will push up a patch for review.

> Allow multiplexing multiple services over a single TCP connection
> -----------------------------------------------------------------
>
>                 Key: THRIFT-66
>                 URL: https://issues.apache.org/jira/browse/THRIFT-66
>             Project: Thrift
>          Issue Type: New Feature
>          Components: Library (C#), Library (C++), Library (Cocoa), Library (Erlang), Library (Java), Library (Perl), Library (Python), Library (Ruby)
>            Reporter: Johan Stuyts
>            Priority: Trivial
>         Attachments: CalculatorImpl.java, MultiplexTestClientMain.java, MultiplexTestServerMain.java, SharedImpl.java, ThriftMultiplexInvocationHandler.java, TMultiplexServer.java, TMultiplexServer.py, TSimpleMultiplexServer.java
>
>
> The current {{TServer}} implementations expose a single service on a port. If an application has many services, many ports have to be opened. This is cumbersome because:
> - you have to document which service is available on which port, and remembering the port numbers is difficult
> - to prevent the overhead of connection setup on each call, a client has to maintain many connections: at least one to each port
> - it requires opening many ports on a firewall if one is between the client and the server.
> By multiplexing multiple services on a single port the problems above are resolved:
> - instead of a port number a symbolic name can be assigned to a service
> - a client can maintain a small pool of connections to a single port
> - only one port has to be opened on the firewall
> The attached Java implementation simply wraps a normal {{CALL}} message with a (new) {{SERVICE_SELECTION}} message. It is not necessary to modify or wrap the response. No changes are needed to the generated classes. Only a new type of server is introduced, and an invocation handler for a dynamic proxy around the {{Client}} classes of services is provided for the client side. The implementation does not handle communication errors (invalid data, timeouts, etc.) yet.
