Posted to dev@arrow.apache.org by Wenbo Hu <hu...@gmail.com> on 2021/09/03 15:37:15 UTC

[C++] Decouple Flight RPC from GRPC

Hi all,

I've just posted an issue [ARROW-13889] on Jira, copied below. Maybe this is the right place to discuss it.
----
I'm trying to implement Flight RPC on top of an RPC framework with protobuf message support in a distributed system.

However, Flight RPC is tied to gRPC.
The gRPC classes used in the Flight server are:
1. `grpc::ServerContext`, used as a parameter in the gRPC-generated code and used to create `ServerCallContext`.
2. `grpc::Status`, used as the return type in the gRPC-generated code.
3. `grpc::ServerReaderWriter` and `grpc::ServerReader`, used in a large number of wrapping MessageReader/Writer classes.

Items 1 and 2 are not tightly coupled with Flight, while the third is the hard part.
Shall we introduce interface classes with the same semantics, such as `arrow::flight::ServerReaderWriter` and `arrow::flight::ServerReader`, so that anyone can implement how messages are written to the stream?
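
To make the proposal concrete, here is a minimal sketch of what such transport-neutral stream interfaces might look like. Everything in it is an assumption for illustration: the `sketch` namespace, the `Payload` stand-in, the `VectorStream` in-memory transport, and the `EchoAll` handler are hypothetical and not part of Arrow or gRPC. The `Read`/`Write` methods only mirror the rough shape of the gRPC stream classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a Flight message; real code would carry
// Arrow buffers rather than a std::string.
struct Payload {
  std::string body;
};

// Transport-neutral write side of a stream.
template <typename T>
class ServerWriter {
 public:
  virtual ~ServerWriter() = default;
  // Returns false once the stream can no longer accept messages.
  virtual bool Write(const T& message) = 0;
};

// Transport-neutral read side of a stream.
template <typename T>
class ServerReader {
 public:
  virtual ~ServerReader() = default;
  // Returns false when the stream has no more messages.
  virtual bool Read(T* message) = 0;
};

// Bidirectional stream: reads R, writes W.
template <typename R, typename W>
class ServerReaderWriter : public ServerReader<R>, public ServerWriter<W> {};

// Hypothetical in-memory transport: reads from one vector, records
// everything written into another. No gRPC types are involved.
class VectorStream : public ServerReaderWriter<Payload, Payload> {
 public:
  explicit VectorStream(std::vector<Payload> incoming)
      : incoming_(std::move(incoming)) {}

  bool Read(Payload* message) override {
    if (next_ >= incoming_.size()) return false;
    *message = incoming_[next_++];
    return true;
  }

  bool Write(const Payload& message) override {
    written_.push_back(message);
    return true;
  }

  const std::vector<Payload>& written() const { return written_; }

 private:
  std::vector<Payload> incoming_;
  std::size_t next_ = 0;
  std::vector<Payload> written_;
};

// A handler written only against the interface; it echoes every message
// it reads and returns how many messages were echoed.
inline std::size_t EchoAll(ServerReaderWriter<Payload, Payload>* stream) {
  std::size_t count = 0;
  Payload payload;
  while (stream->Read(&payload)) {
    if (!stream->Write(payload)) break;
    ++count;
  }
  return count;
}

}  // namespace sketch
```

The point of the sketch is that `EchoAll` never names a transport, so a gRPC-backed stream and an in-memory one could be passed to the same handler.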

That way, a shim layer between `FlightServiceImpl` and `FlightServerBase` could decouple Flight from gRPC while still taking advantage of its zero-copy messages.
All message conversion can be handled in the shim layer.
For example, the shim's `DoGet` could be declared as `arrow::Status DoGet(ServerCallContext* context, const pb::Ticket* request, ServerWriter<pb::FlightData>* writer)`; it would convert the pb messages to Flight's types and call the actual business logic implemented in `FlightServerBase` as `Status DoGet(const ServerCallContext& context, const Ticket& request, std::unique_ptr<FlightDataStream>* stream)`.
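
The `DoGet` shim described above could look roughly like the following sketch. It is self-contained and does not use the real Arrow or gRPC headers. All types in the `shim_sketch` namespace (`Status`, `Ticket`, `FlightPayload`, `FlightDataStream`, `FlightServerBase`, `ServerWriter`, and the `pb` structs) are simplified, hypothetical stand-ins that only borrow the names from the two signatures. In particular, the `Next(payload, has_next)` method is an assumption made for brevity and is not the real `FlightDataStream` API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace shim_sketch {

class Status {  // Stand-in for arrow::Status.
 public:
  static Status OK() { return Status(true, ""); }
  static Status IOError(std::string msg) { return Status(false, std::move(msg)); }
  bool ok() const { return ok_; }
 private:
  Status(bool ok, std::string msg) : ok_(ok), message_(std::move(msg)) {}
  bool ok_;
  std::string message_;
};

namespace pb {  // Stand-ins for the generated Protobuf messages.
struct Ticket { std::string ticket; };
struct FlightData { std::string data_body; };
}  // namespace pb

struct Ticket { std::string ticket; };       // Flight-side ticket.
struct FlightPayload { std::string body; };  // Flight-side data chunk.
struct ServerCallContext {};

class FlightDataStream {  // Simplified: yields payloads until exhausted.
 public:
  virtual ~FlightDataStream() = default;
  virtual Status Next(FlightPayload* payload, bool* has_next) = 0;
};

class FlightServerBase {  // Business logic, unaware of the transport.
 public:
  virtual ~FlightServerBase() = default;
  virtual Status DoGet(const ServerCallContext& context, const Ticket& request,
                       std::unique_ptr<FlightDataStream>* stream) = 0;
};

template <typename T>
class ServerWriter {  // Transport-neutral writer supplied by the RPC framework.
 public:
  virtual ~ServerWriter() = default;
  virtual bool Write(const T& message) = 0;
};

class TransportShim {  // Converts pb messages and drives the stream.
 public:
  explicit TransportShim(FlightServerBase* server) : server_(server) {}

  Status DoGet(ServerCallContext* context, const pb::Ticket* request,
               ServerWriter<pb::FlightData>* writer) {
    Ticket ticket{request->ticket};  // pb -> Flight
    std::unique_ptr<FlightDataStream> stream;
    Status st = server_->DoGet(*context, ticket, &stream);
    if (!st.ok()) return st;
    if (!stream) return Status::IOError("server returned no stream");
    while (true) {
      FlightPayload payload;
      bool has_next = false;
      st = stream->Next(&payload, &has_next);
      if (!st.ok()) return st;
      if (!has_next) break;
      pb::FlightData out{std::move(payload.body)};  // Flight -> pb
      if (!writer->Write(out)) return Status::IOError("could not write to stream");
    }
    return Status::OK();
  }

 private:
  FlightServerBase* server_;
};

// Demo pieces: a stream over fixed chunks, a server, and a collecting writer.
class VectorDataStream : public FlightDataStream {
 public:
  explicit VectorDataStream(std::vector<std::string> chunks) : chunks_(std::move(chunks)) {}
  Status Next(FlightPayload* payload, bool* has_next) override {
    *has_next = next_ < chunks_.size();
    if (*has_next) payload->body = chunks_[next_++];
    return Status::OK();
  }
 private:
  std::vector<std::string> chunks_;
  std::size_t next_ = 0;
};

class DemoServer : public FlightServerBase {
 public:
  Status DoGet(const ServerCallContext&, const Ticket& request,
               std::unique_ptr<FlightDataStream>* stream) override {
    std::vector<std::string> chunks = {request.ticket + "-0", request.ticket + "-1"};
    stream->reset(new VectorDataStream(std::move(chunks)));
    return Status::OK();
  }
};

class CollectingWriter : public ServerWriter<pb::FlightData> {
 public:
  bool Write(const pb::FlightData& m) override { messages.push_back(m); return true; }
  std::vector<pb::FlightData> messages;
};

}  // namespace shim_sketch
```

In this sketch the payload body is moved rather than copied when it is converted, which is only a loose analogy for the zero-copy path; the real implementation would have to hand Arrow buffers to the transport without serializing them again.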

The client, however, seems more complex because of cookie handling and other things.
If the idea above is feasible, I'll explore the client in more depth.
----
The problem I'm really facing is that I cannot get at the gRPC-generated service, in which all the zero-copy operations are implemented, from the RPC framework, even though the framework wraps a gRPC server running over TCP/TLS.
I need to decouple how messages and streams arrive at the FlightServerBase functions.

As far as I can tell from my attempts, in-process gRPC is not an option for proxying traffic from the wrapping RPC framework to the original gRPC Flight server, since the C++ implementation of in-process gRPC, unlike Java's, always serializes and deserializes pb messages.


Re: [C++] Decouple Flight RPC from GRPC

Posted by David Li <li...@apache.org>.
Hi Wenbo,

Thanks for reaching out on the mailing list. 

First I want to step back a bit: what is the goal here? It sounds like you're using Flight in-process, but you've found that gRPC/C++ still serializes/deserializes data. In that case, I suppose I'm curious what the motivation for in-process Flight is. You mention wrapping the server, so it almost sounds like there's some sort of in-process translation/proxying going on, and this is still actually conceptually a server?

More to the point, the immediate goal you have is taking a Flight Protobuf message you already have in memory, and being able to pass that directly to the Flight server implementation?

We have discussed supporting alternative transports in the past. However, there are a few things that would need to happen here. Being able to parameterize implementation classes with alternative reader/writers is reasonable, and we could certainly refactor things to not deal with the actual Protobuf classes in more places. (Though, note that the gRPC classes you mention are already interfaces. Also, we do some somewhat-iffy casting, such that those writers are actually being passed a Flight-internal struct, and not an actual Protobuf message.)

Finally, I would say FlightServiceImpl is already effectively the gRPC-specific part of Flight, albeit it's 90% of the implementation. So I'm not sure which piece exactly you want to reuse here. That goes back to my question about the actual goals: it seems you want to mostly reuse the implementation, but fake certain parts of gRPC since you're doing your own in-process proxying/translation? If so the implementation would be different than what we would do for a truly new transport. Also, it would effectively mean exposing the internals of Flight/gRPC as part of the API, which would be undesirable.

-David
