Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/04/21 18:13:04 UTC

[jira] [Commented] (FLINK-5815) Add resource files configuration for Yarn Mode

    [ https://issues.apache.org/jira/browse/FLINK-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979160#comment-15979160 ] 

ASF GitHub Bot commented on FLINK-5815:
---------------------------------------

Github user EronWright commented on the issue:

    https://github.com/apache/flink/pull/3388
  
    Just a note: for Mesos, we'll inevitably add support for arbitrary URLs to be downloaded by the container. The Mesos fetcher deals primarily with HTTP URLs but also supports HDFS URLs (if configured; this seems rarely used).
    
    Stepping back, we need the clustering layer to converge because supporting these features in a one-off way is becoming unsustainable.


> Add resource files configuration for Yarn Mode
> ----------------------------------------------
>
>                 Key: FLINK-5815
>                 URL: https://issues.apache.org/jira/browse/FLINK-5815
>             Project: Flink
>          Issue Type: Improvement
>          Components: Client, YARN
>    Affects Versions: 1.3.0
>            Reporter: Wenlong Lyu
>            Assignee: Wenlong Lyu
>
> Currently in Flink, to register a resource file with the distributed cache, we need to make the file remotely accessible via a URL, and a service that provides this is often difficult to maintain. What's more, when we want to add extra jar files to the job classpath, we need to copy the jar files to the blob server when submitting the JobGraph. In YARN, especially in FLIP-6, the blob server is not yet running when we try to start a Flink job.
> YARN has an efficient distributed cache implementation for applications running on it. What's more, files stored in HDFS can easily be shared across different applications through the distributed cache without extra IO operations.
> I suggest introducing -yfiles, -ylibjars and -yarchives options to FlinkYarnCLI so that YARN users can set up their job resource files through the YARN distributed cache. The options are compatible with those used in MapReduce, which makes them easy to use for YARN users, who generally have experience with MapReduce.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)