Posted to dev@reef.apache.org by Scott Inglis <ms...@gmail.com> on 2018/04/17 22:47:14 UTC

[DISCUSS] Proposal for supporting different platforms with .NET Core runtime

Greetings REEF devs,

This is a proposal to support multiple platforms within REEF. The
transition to .NET Core is already opening discussions on how we can
submit a job from a platform that is different from the cluster's (search
for Org.Apache.REEF.Api). By platform, I mean operating system: for
example, submitting a REEF job from Linux and having it run on a Windows
cluster, or vice versa.

Before discussing the proposal, I would like to give some context so that
you are aware of the issues that I have seen using .NET Core in the current
architecture:

Context:
I experimented with REEF running on .NET Core using the current
architecture with minor modifications (i.e., having the libraries target
.NET Standard 2.0, the Bridge target net461, and the Evaluator target .NET
Core 2.0). The idea is that a user could submit a job from Linux and have
it run on a Windows cluster.

NOTE: This is in parallel with the bridge work -- in this experiment I kept
the bridge at 4.6.1, allowing it to stay as it is now. The work that Tyson,
Doug, and I are doing will ultimately change the current bridge
architecture.

Through this effort, there were some interesting findings:
1) The current organization of files that REEF uses does not work with
multiple versions of the .NET framework. Typically an application will call
AddGlobalAssembliesInDirectoryOfExecutingAssembly, which copies all
assemblies from the application to the global folder, causing file
conflicts.

2) Not all System libraries that .NET Core uses are the same. To give some
background, .NET Core will copy a number of System libraries so that it can
generate a stand-alone application. This is nice because it removes the
need to have the .NET Core runtime installed on the cluster. However, these
System libraries are not the same when building a:
 - .netcore application targeting Windows
 - .netcore application targeting Linux
 - .net461 application that targets .netstandard 2.0 libraries

With the current architecture, if we were 100% on .NET Core, submitting a
REEF job from a Linux system to a Windows cluster would fail. The Linux
.NET Core application would submit the job containing Linux .NET Core
System libraries, and the job would then fail on the Windows cluster
because the wrong assemblies were provided.


Proposal:
I am making the following proposal -- note that this would happen after
the Bridge transition to .NET Core 2.0 is complete:

1) We build out all the platforms that we plan on supporting (this will
likely happen in the Evaluator project). For example, the runtimes could be
win7-x64 for Windows platforms and centos7.x64, fedora.x64 for Linux
platforms. This will generate a set of binaries that are required to run
the Evaluator and the Bridge.

2) When building a REEF application, all runtimes will be copied to the
output path of the project as a subfolder called "runtime". The "runtime"
folder will contain one subdirectory per platform (windows, centos,
fedora...).

3) The runtime assemblies will be a part of the nuget distribution, for
example, FatNuget. FatNuget will contain build steps to copy the runtime
folder to the application output directory. So as an app developer, if I
pull FatNuget into my project, the copy step will happen automatically as
part of my build.

4) Have a SetRuntime API on the JobRequestBuilder when submitting the job.
This is independent of what platform the application builds as. The API
would look like the following:
NewJobRequestBuilder().SetRuntime("centos") where "centos" is the name of
the subdirectory to copy from runtime.

5) At runtime, the driver folder preparation will copy all binaries from
the specified runtime directory to global. Note that while we copy all
runtimes to the build folder, only one runtime is copied when the job is
submitted.

6) As users still need to package their assembly plus any dependent
assemblies, we add a new API, AddGlobalAssemblyWithDependenciesForType().
This is a slight modification of the existing behavior: it adds the
assembly for that type plus any dependent assemblies, excluding any
System.* assemblies, as these are provided from the runtime folder.


Users can still add additional assemblies if needed, but this is meant to
do most of the work for the user.
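
To make steps 5 and 6 concrete, here is a minimal, language-neutral sketch
in Python of the folder preparation. It is not REEF code: the function names
(prepare_global_folder, is_system_assembly) and the directory layout beyond
the "runtime" subfolder are assumptions made for illustration only.

```python
import shutil
from pathlib import Path


def is_system_assembly(file_name: str) -> bool:
    """Step 6: System.* assemblies come from the runtime folder, not the app."""
    return file_name.startswith("System.")


def prepare_global_folder(app_output_dir, runtime_name, user_assemblies, global_dir):
    """Hypothetical helper: copy one runtime plus the user's own assemblies.

    app_output_dir  -- build output containing the "runtime" subfolder (step 2)
    runtime_name    -- subdirectory chosen at submission, e.g. "centos" (step 4)
    user_assemblies -- file names of the user's assembly and its dependencies
    global_dir      -- destination "global" folder of the job
    """
    app_output_dir, global_dir = Path(app_output_dir), Path(global_dir)
    runtime_dir = app_output_dir / "runtime" / runtime_name
    if not runtime_dir.is_dir():
        raise ValueError(f"unknown runtime: {runtime_name!r}")
    global_dir.mkdir(parents=True, exist_ok=True)

    # Step 5: all runtimes exist in the build output, but only the selected
    # one is copied when the job is submitted.
    for binary in sorted(runtime_dir.iterdir()):
        if binary.is_file():
            shutil.copy2(binary, global_dir / binary.name)

    # Step 6: add the user's assemblies, skipping System.* so the copies
    # built for the submitting platform never overwrite the runtime's.
    for name in user_assemblies:
        if not is_system_assembly(name):
            shutil.copy2(app_output_dir / name, global_dir / name)

    return sorted(p.name for p in global_dir.iterdir())
```

The point of the sketch is the ordering and the filter: the System libraries
in global always come from the runtime selected for the cluster, regardless
of the platform the application was built on.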


As this is a proposal, I am very interested in hearing feedback or
questions you may have.

Thanks,

Scott

Re: [DISCUSS] Proposal for supporting different platforms with .NET Core runtime

Posted by Markus Weimer <ma...@weimo.de>.
On Wed, Apr 18, 2018 at 10:57 AM, Scott Inglis <ms...@gmail.com> wrote:
> With all that said, I do agree with you though -- not having to deal with
> runtimes does ensure that we can just run anywhere. So if the .NET Core
> install is an acceptable requirement, then we can close this discussion and
> the changes that I am proposing can be disregarded.

Not so fast :)

I believe this problem is worth solving. But maybe not for REEF, but
for applications built using REEF. Because if we (only) solve it at
the REEF level, those apps still have the problem to solve. Is there
something we can do to make it easier for apps to package themselves,
REEF and .NET Core in a convenient package?

Markus

Re: [DISCUSS] Proposal for supporting different platforms with .NET Core runtime

Posted by Scott Inglis <ms...@gmail.com>.
For that to work, users would have to ensure that .NET Core is installed
not only on the cluster but also on the machine that is doing the job
submission. Really this is probably just a few lines in a script to
automate the process, but is having to install .NET Core on a large
cluster a problem?

You could argue that they already do this with Java, but I would say that
Java is very prevalent in this case simply because distributed frameworks
already require it. Therefore running REEF on Java just plays naturally
into those requirements of the distributed framework without adding
anything new. Even looking at the .NET Framework, we shipped on .NET
Framework 4.5.2 -- this too did not add any new install requirements, as
the .NET Framework has been installed by default since Windows Server 2012.

This is different with .NET Core, so I took the approach of trying to
add no new dependencies, putting the work of managing runtime binaries on
the REEF developers and the build system instead.

With all that said, I do agree with you though -- not having to deal with
runtimes does ensure that we can just run anywhere. So if the .NET Core
install is an acceptable requirement, then we can close this discussion and
the changes that I am proposing can be disregarded.



On Tue, Apr 17, 2018 at 4:08 PM, Markus Weimer <ma...@weimo.de> wrote:

> I am all for supporting cross-platform job submission and run. This
> has been a long standing issue in the HDI runtime, for example:
>
> https://issues.apache.org/jira/browse/REEF-322
>
> What I don't understand is the need for multiple different builds of
> REEF, one per platform. Once we move to .NET Core, we no longer have
> any native code in the project. Hence, we should run wherever we have
> .NET Core installed. The one hiccup is the Windows- and UNIX-isms
> around path and list separators. But I don't see how that creates a
> need for multiple builds.
>
> Markus
>

Re: [DISCUSS] Proposal for supporting different platforms with .NET Core runtime

Posted by Markus Weimer <ma...@weimo.de>.
I am all for supporting cross-platform job submission and run. This
has been a long standing issue in the HDI runtime, for example:

https://issues.apache.org/jira/browse/REEF-322

What I don't understand is the need for multiple different builds of
REEF, one per platform. Once we move to .NET Core, we no longer have
any native code in the project. Hence, we should run wherever we have
.NET Core installed. The one hiccup is the Windows- and UNIX-isms
around path and list separators. But I don't see how that creates a
need for multiple builds.

Markus
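
As an illustration of the path and list separator issue mentioned above,
here is a small Python sketch (not REEF code; the function names and the
"windows"/"unix" keys are assumptions). It builds paths and path lists using
the conventions of the target cluster rather than those of the machine
submitting the job.

```python
import ntpath
import posixpath

# Map a target OS name to the standard-library module that implements its
# path conventions. The keys are hypothetical labels for this sketch.
_PATH_MODULES = {"windows": ntpath, "unix": posixpath}


def join_for_target(target_os, *parts):
    """Join path components with the target platform's path separator."""
    return _PATH_MODULES[target_os].join(*parts)


def path_list_for_target(target_os, entries):
    """Join a list of paths with the target platform's list separator."""
    return _PATH_MODULES[target_os].pathsep.join(entries)


print(join_for_target("windows", "reef", "global"))   # reef\global
print(join_for_target("unix", "reef", "global"))      # reef/global
print(path_list_for_target("windows", ["a", "b"]))    # a;b
print(path_list_for_target("unix", ["a", "b"]))       # a:b
```

A submitter that always uses its own separators produces paths that are
wrong on the other platform, but handling this needs only a lookup of the
target's conventions, not a separate build per platform.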