Posted to commits@beam.apache.org by "Aljoscha Krettek (JIRA)" <ji...@apache.org> on 2017/04/18 12:48:42 UTC

[jira] [Comment Edited] (BEAM-1631) Flink runner: submit job to a Flink-on-YARN cluster

    [ https://issues.apache.org/jira/browse/BEAM-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15972593#comment-15972593 ] 

Aljoscha Krettek edited comment on BEAM-1631 at 4/18/17 12:48 PM:
------------------------------------------------------------------

Yes, I think we would still need to have {{HADOOP_CONF_DIR}} set. Right now I don't think this can be solved in a nice way. I was thinking of pointing the Flink Runner to the directory of a Flink installation and then using the normal Flink submission script from Java to submit the job JAR. That's the easiest solution I could come up with, but it's quite hacky.
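
To make the workaround above concrete, here is a minimal sketch of calling the submission script from Java. It is not code from the Flink Runner. The class name, the method names, and the example paths are hypothetical. The {{run}} subcommand is an assumption about how {{bin/flink}} would be invoked; the {{-yid}} flag is the one mentioned in the issue description.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Beam or Flink.
final class FlinkScriptSubmitter {

    // Builds the argument list: <flinkHome>/bin/flink run -yid <applicationId> <jobJar>
    // Assumption: "run" is the subcommand used together with -yid.
    static List<String> buildCommand(String flinkHome, String yarnApplicationId, String jobJar) {
        List<String> command = new ArrayList<>();
        command.add(new File(new File(flinkHome, "bin"), "flink").getPath());
        command.add("run");
        command.add("-yid");
        command.add(yarnApplicationId);
        command.add(jobJar);
        return command;
    }

    // Starts the script as a child process with HADOOP_CONF_DIR set
    // and returns the script's exit code.
    static int submit(String flinkHome, String hadoopConfDir,
                      String yarnApplicationId, String jobJar)
            throws IOException, InterruptedException {
        ProcessBuilder builder =
                new ProcessBuilder(buildCommand(flinkHome, yarnApplicationId, jobJar));
        builder.environment().put("HADOOP_CONF_DIR", hadoopConfDir);
        builder.inheritIO(); // forward the script's output to this process's console
        return builder.start().waitFor();
    }
}
```

A caller would pass its own values, for example {{FlinkScriptSubmitter.submit(flinkHome, hadoopConfDir, applicationId, jobJar)}}, and treat a non-zero return value as a failed submission. This also shows why the approach is hacky: the runner depends on a local Flink installation and learns nothing about the job beyond the exit code.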

There is work happening around FLIP-6 (https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077), which will introduce the notion of a dispatcher that accepts jobs from clients and then brings up a per-job cluster on YARN (or other cluster management systems). I think once we have this, job submission on the Flink Runner should be very smooth.


was (Author: aljoscha):
Yes, I think we would still need to have a {{HADOOP_CONF_DIR}} set. I think right now it's not possible to solve this in a nice way. I was thinking of pointing the Flink Runner to the directory of a Flink installation and then use the normal Flink submission script from Java to submit the Job Jar. That's the easiest solution I could come up with but it's quite hacky.

There is stuff happening around FLINK-6 (https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077) which will introduce the notion of a dispatcher that will accept jobs from clients and then bring up a per-job cluster on YARN (or other cluster management systems). I think once we have this job submission on the Flink Runner should be very smooth.

> Flink runner: submit job to a Flink-on-YARN cluster
> ---------------------------------------------------
>
>                 Key: BEAM-1631
>                 URL: https://issues.apache.org/jira/browse/BEAM-1631
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-flink
>            Reporter: Davor Bonaci
>            Assignee: Aljoscha Krettek
>
> As far as I understand, running Beam pipelines on a Flink cluster can be done in two ways:
> * Run directly with the Flink runner, specifying the {{--flinkMaster}} pipeline option via, say, {{mvn exec}}.
> * Produce a bundled JAR, and use {{bin/flink}} to submit the same pipeline.
> These two ways are equivalent, and work well on a standalone Flink cluster.
> Submitting to a Flink-on-YARN cluster is more complicated. You can still produce a bundled JAR, and use {{bin/flink -yid <applicationid>}} to submit such a job. However, that seems impossible with the Flink runner directly.
> If so, we should add the ability to the Flink runner to submit a job to a Flink-on-YARN cluster directly.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)