Posted to dev@reef.apache.org by Douglas Service <ds...@gmail.com> on 2018/03/07 01:36:28 UTC

Thrift and the REEF bridge

The Apache Thrift team has implemented full .NET Core 2.0 support in the
latest release. Thrift is similar to Avro and Protobuf, but it supports many
more languages than Avro (20+) and has RPC protocol support across all of
them, which Avro does not. Thrift also provides a robust cross-language test
suite that runs every test across every permutation of the languages
configured in the Thrift build environment. There are many online
comparisons of Thrift and Protobuf, and the two are similar in features and
performance. Apache Thrift is used in production systems, most notably by
Cloudera, Evernote, Facebook, Mendeley, and Uber, and is used or supported
by a number of Apache projects such as Hadoop, Aurora, HBase, Parquet, and
Storm. Most importantly for REEF, Thrift is controlled by Apache.

There have been some suggestions to change the approach to modifying the
bridge to run on Linux. The possible approaches are:

1) Continue with Avro.
2) Switch to Thrift using a similar approach to the Avro approach.
3) Keep the C++ code but convert it from managed to native, using PInvoke
to call from C# to C++ and C# delegates to call from C++ to C#, and continue
using the current JNI code to interface between Java and C#.

The advantage of using Avro is that all of the data types for any supported
language are autogenerated with the necessary marshaling/unmarshalling
code. If you look at the current bridge, you will see that most of the code
is handwritten data types duplicated across languages and associated
cross-language conversion code. The disadvantage of Avro is that it does
not support RPC protocol definitions between Java and C# and has no
transport support; thus we have to build the protocol and transport by hand.
In addition, we are using a combination of Microsoft and Apache Avro, which
means there is more work to do in the future on the Avro side.

Thrift has all of the advantages of Avro, and in addition, it supports full
RPC protocol definition and generates code for transports such as TCP and
pipes, and wire formats such as binary and JSON. Using Thrift would
eliminate all of the custom hand-coding and marshaling of types in the
bridge, as Avro does, and would also eliminate the need to write the
protocol code and transport code.
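
To make the moving parts concrete, here is a rough sketch of what an RPC
layer supplies on top of type marshaling: a client stub, a wire format, a
transport, and a server-side dispatcher. It does not use Thrift at all; it
uses only the Python standard library, with JSON as the wire format and an
in-memory object standing in for TCP or pipes. Every name in it
(TaskRequest, BridgeHandler, Processor, InMemoryTransport, BridgeClient,
submit_task) is hypothetical and is not taken from REEF or from Thrift.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TaskRequest:
    """Hypothetical message type; an IDL compiler would generate this."""
    task_id: str
    evaluator_id: str


class BridgeHandler:
    """Business logic: the part that would stay hand-written."""

    def submit_task(self, request: TaskRequest) -> str:
        return f"submitted {request.task_id} to {request.evaluator_id}"


class Processor:
    """Server side: decode a call, dispatch to the handler, encode the reply."""

    def __init__(self, handler):
        self._handler = handler

    def process(self, wire_bytes: bytes) -> bytes:
        call = json.loads(wire_bytes.decode("utf-8"))
        method = getattr(self._handler, call["method"])
        result = method(TaskRequest(**call["args"]))
        return json.dumps({"result": result}).encode("utf-8")


class InMemoryTransport:
    """Stand-in for a TCP or pipe transport: hands bytes to the processor."""

    def __init__(self, processor):
        self._processor = processor

    def round_trip(self, wire_bytes: bytes) -> bytes:
        return self._processor.process(wire_bytes)


class BridgeClient:
    """Client stub: looks like a local object, but each call crosses the transport."""

    def __init__(self, transport):
        self._transport = transport

    def submit_task(self, request: TaskRequest) -> str:
        call = {"method": "submit_task", "args": asdict(request)}
        reply = self._transport.round_trip(json.dumps(call).encode("utf-8"))
        return json.loads(reply.decode("utf-8"))["result"]


# Synchronous call through stub -> transport -> processor -> handler.
client = BridgeClient(InMemoryTransport(Processor(BridgeHandler())))
status = client.submit_task(TaskRequest(task_id="task-1", evaluator_id="eval-7"))
print(status)  # prints "submitted task-1 to eval-7"
```

With Avro we get the equivalent of TaskRequest generated for us but write
the other three classes ourselves; the attraction of Thrift is that only
the handler would remain hand-written.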

The advantage of delegates and PInvoke is that we would keep a lot of the
existing bridge code and possibly get done faster. In this case, all of the
code used for interop would need to move out of the current bridge
executable into a DLL on Windows and a library, possibly shared, on Linux.
All of the managed types in C++ would have to become unmanaged, and
conversion code would have to be written to convert from managed types to
unmanaged types and from unmanaged types to JNI types and back. The
disadvantage of delegates and PInvoke is that the work is most likely all
throwaway.

Going forward, there seems to be strong agreement that we should have only
a single primary language, such as C# or Java, in which most of the core
functionality is written, with an asynchronous cross-language messaging
API. (Talking to Spark would still require Java.)
This would allow us to stop implementing core functionality in both C# and
Java, and we would be able to support applications written in any language
supported by Thrift or a language that supports calling into C++ such as R
or Julia. Thrift is an excellent choice for this asynchronous messaging API
and adopting it now would put us on the road to this future architecture.
For now it would probably have to be synchronous due to the current
bridge design, but it could become asynchronous in the future. There is some
concern that a messaging API will be slower than interop, but as Markus
points out, the limiting factor will be the time it takes to get messages
between the driver and an evaluator on different nodes in the cluster,
not the time it takes to get a message between the two processes in the
driver running on the same node. I would also encourage us to keep REEF
code in the non-core languages as thin as possible.

Comments?

Doug