Posted to dev@reef.apache.org by "Sergiy Matusevych (JIRA)" <ji...@apache.org> on 2017/09/13 23:09:01 UTC

[jira] [Comment Edited] (REEF-1791) Implement reef-runtime-spark

    [ https://issues.apache.org/jira/browse/REEF-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165441#comment-16165441 ] 

Sergiy Matusevych edited comment on REEF-1791 at 9/13/17 11:08 PM:
-------------------------------------------------------------------

bq. 1) What are the negatives in running in Unmanaged AM mode? What happens if the code runs into any performance issues, how will it recover, and how will the user manage this? It seems like this is placing more responsibility on the user, who may or may not have this knowledge

By design, REEF assumes that the user takes full responsibility for the application: we want the user to be in control as much as possible, while providing sane defaults. Running REEF from Spark is no different - we assume that the user will implement all the necessary handlers for failure events if the defaults are not sufficient for the use case. What is different in Unmanaged AM mode is that the REEF Driver launched from Spark must also respond to the (failure) events originating from Spark, and we currently have no mechanism to forward Spark events to the REEF application _transparently_ - the user has to do it by hand (see the sketch below). Other than Spark-to-REEF event forwarding, the issues you mention (performance, error recovery, usability, etc.) are not directly relevant to this PR, and we can discuss them elsewhere.
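For concreteness, here is a minimal sketch of such manual forwarding using Spark's {{SparkListener}} API. Note that {{SparkFailureHandler}} is hypothetical - it is NOT a REEF interface, just a stand-in for whatever callback the user wires into the Driver:

{code:java}
import org.apache.spark.scheduler.SparkListener;
import org.apache.spark.scheduler.SparkListenerApplicationEnd;
import org.apache.spark.scheduler.SparkListenerExecutorRemoved;

/** Hypothetical user-side callback; NOT part of REEF. */
interface SparkFailureHandler {
  void onSparkExecutorLost(String executorId, String reason);
  void onSparkShutdown(long timestamp);
}

/** Forwards Spark scheduler events to the user's REEF-side handler by hand. */
final class SparkToReefForwarder extends SparkListener {

  private final SparkFailureHandler handler;

  SparkToReefForwarder(final SparkFailureHandler handler) {
    this.handler = handler;
  }

  @Override
  public void onExecutorRemoved(final SparkListenerExecutorRemoved removed) {
    // Spark lost an executor; let the REEF Driver react to it.
    this.handler.onSparkExecutorLost(removed.executorId(), removed.reason());
  }

  @Override
  public void onApplicationEnd(final SparkListenerApplicationEnd end) {
    // The Spark application is shutting down; tear down the REEF side, too.
    this.handler.onSparkShutdown(end.time());
  }
}

// Registration on the Spark driver side:
// sparkContext.addSparkListener(new SparkToReefForwarder(myHandler));
{code}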

bq. 2) I would like to see an end-to-end User Interaction Diagram; maybe when we meet we can discuss this

Let's talk about it in the meeting and post a picture here.

bq. 3) Is it possible to make the partitions configurable in the DataLoader Service? In general, I'd like to understand how this can be specified

I am not sure which parameters you mean. Please take a look at {{DataLoadingRequestBuilder}} and let me know what other parameters we might need for Spark integration; a sketch of the current partitioning knob is below.
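For reference, here is roughly how the number of partitions (splits) is specified today, based on REEF's existing data-loading example. The exact setter names may differ between REEF versions, and the input path and driver configuration are placeholders:

{code:java}
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.reef.io.data.loading.api.DataLoadingRequestBuilder;
import org.apache.reef.tang.Configuration;

// driverConf is the application's DriverConfiguration module, built elsewhere.
final Configuration dataLoadConf = new DataLoadingRequestBuilder()
    .setMemoryMB(1024)                           // evaluator memory per split
    .setInputFormatClass(TextInputFormat.class)  // Hadoop input format
    .setInputPath("hdfs:///tmp/input")           // placeholder path
    .setNumberOfDesiredSplits(8)                 // the partitioning knob in question
    .setDriverConfigurationModule(driverConf)
    .build();
{code}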

bq. 4) What are the tradeoffs between using the EvaluatorRequestor versus the DataLoader? If the goal is to not have too much dependency on Spark internals, it seems like the DataLoader is a better approach

In my opinion, {{EvaluatorRequestor}} is more flexible, as it allows us to request additional partitions (and potentially new datasets) at runtime; see the sketch below. OTOH, {{DataLoader}} can be easier to implement, and it should cover 99% of our needs. In the long run, we may end up implementing both approaches.
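To illustrate the runtime flexibility, here is a minimal sketch of a Driver handler that requests additional Evaluators on demand; the counts and sizes are placeholders:

{code:java}
import javax.inject.Inject;
import org.apache.reef.driver.evaluator.EvaluatorRequest;
import org.apache.reef.driver.evaluator.EvaluatorRequestor;
import org.apache.reef.wake.EventHandler;
import org.apache.reef.wake.time.event.StartTime;

/** Requests Evaluators at runtime, e.g. when the Driver starts. */
final class StartHandler implements EventHandler<StartTime> {

  private final EvaluatorRequestor requestor;

  @Inject
  StartHandler(final EvaluatorRequestor requestor) {
    this.requestor = requestor;
  }

  @Override
  public void onNext(final StartTime startTime) {
    // Ask the underlying runtime for 4 more containers. With DataLoader,
    // this request would instead be derived from the input partitioning up front.
    this.requestor.submit(EvaluatorRequest.newBuilder()
        .setNumber(4)
        .setMemory(2048)
        .setNumberOfCores(1)
        .build());
  }
}
{code}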

bq. 5) I would postpone the low-level Spark API until the first part using the EvaluatorRequestor or the DataLoader is complete

That depends on how hard it is to implement the {{EvaluatorRequestor}} using the low-level Spark API. If done properly, it would give us a proper REEF+Spark runtime that is completely transparent to the end user; then we would not need workarounds like {{DataLoader}} or a custom data-driven {{SparkEvaluatorRequestor}}. Still, I would much prefer a workaround that lets us move forward with Spark+REEF.NET integration now and come back to the low-level solution later.

bq. 6) In the REEF.NET Bridge, I would recommend launching the .NET VM as a separate process, to avoid JNI and the resulting inability to use spark-submit.

I agree.
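For illustration, launching the .NET side out-of-process could look roughly like this; the executable name and arguments are placeholders, not actual REEF.NET artifacts:

{code:java}
import java.io.IOException;

public final class DotNetEvaluatorLauncher {

  /** Spawn the .NET Evaluator as a child process instead of going through JNI. */
  public static Process launch(final String evaluatorConfigPath) throws IOException {
    // "dotnet ReefEvaluator.dll" is a placeholder command line.
    return new ProcessBuilder("dotnet", "ReefEvaluator.dll", evaluatorConfigPath)
        .inheritIO()  // stdout/stderr flow to the parent, visible in spark-submit logs
        .start();
  }
}
{code}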



> Implement reef-runtime-spark
> ----------------------------
>
>                 Key: REEF-1791
>                 URL: https://issues.apache.org/jira/browse/REEF-1791
>             Project: REEF
>          Issue Type: New Feature
>          Components: REEF
>            Reporter: Sergiy Matusevych
>            Assignee: Saikat Kanjilal
>         Attachments: file-1.jpeg, file.jpeg
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> We need to run REEF Tasks on Spark Executors. Ideally, that should require only a few lines of changes in the REEF application configuration. All Spark-related logic must be encapsulated in the {{reef-runtime-spark}} module, similar to the existing runtimes, e.g. {{reef-runtime-yarn}} or {{reef-runtime-local}}. As a first step, we can have a Java-only solution, but later we'll need to run .NET Tasks on Executors as well.
> h3. P.S. Here's a REEF Wiki page with more details: *[Spark+REEF integration|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=73636401]*
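
To illustrate the "few lines of changes" goal: today an application selects a runtime via a configuration module (the {{LocalRuntimeConfiguration}} part below is the real API), and with the new module only that block would change. {{SparkRuntimeConfiguration}} is hypothetical; {{reef-runtime-spark}} does not exist yet:

{code:java}
import org.apache.reef.runtime.local.client.LocalRuntimeConfiguration;
import org.apache.reef.tang.Configuration;

// Today: run the application on the local runtime.
final Configuration localRuntimeConf = LocalRuntimeConfiguration.CONF
    .set(LocalRuntimeConfiguration.MAX_NUMBER_OF_EVALUATORS, 4)
    .build();

// Goal: the same application on Spark Executors, by swapping only the runtime.
// (SparkRuntimeConfiguration is hypothetical and does not exist yet.)
final Configuration sparkRuntimeConf = SparkRuntimeConfiguration.CONF.build();
{code}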



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)