
Posted to issues@flink.apache.org by "Till Rohrmann (JIRA)" <ji...@apache.org> on 2019/05/02 09:02:00 UTC

[jira] [Commented] (FLINK-12343) Allow set file.replication in Yarn Configuration

    [ https://issues.apache.org/jira/browse/FLINK-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831489#comment-16831489 ] 

Till Rohrmann commented on FLINK-12343:
---------------------------------------

[~ZhenqiuHuang] I think the {{ResourceManager}} only stores the {{TaskExecutors}} configuration file in HDFS when setting up the {{TaskExecutorContext}}. Ideally we would treat this upload the same way as the upload of the other Flink cluster artifacts, which should not be hard to do. I guess we have agreed on the overall approach and can now start with the implementation. To test it, I think we need either an {{HdfsMiniCluster}} or the {{YarnMiniCluster}} so that we can deploy and then check the replication factor of the uploaded files.
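
As a rough illustration of the approach discussed above, the following is a minimal sketch of how a configured replication factor might be resolved before uploading files. It is not the actual Flink implementation: the class name, the option key {{yarn.file-replication}}, and the {{-1}} sentinel are all assumptions made for this example.

```java
import java.util.Map;

// Hypothetical helper, not part of Flink: picks the replication factor
// to apply to files uploaded to HDFS.
final class ReplicationFactorSketch {

    // Assumed option key, for illustration only.
    static final String FILE_REPLICATION_KEY = "yarn.file-replication";

    // Assumed sentinel meaning "no override, use the cluster default".
    static final int USE_CLUSTER_DEFAULT = -1;

    private ReplicationFactorSketch() {}

    // Returns the configured replication factor if it is a positive number,
    // otherwise falls back to the cluster default (e.g. 3).
    static int resolveReplication(Map<String, String> conf, int clusterDefault) {
        String raw = conf.get(FILE_REPLICATION_KEY);
        if (raw == null) {
            return clusterDefault;
        }
        int configured = Integer.parseInt(raw.trim());
        return configured > 0 ? configured : clusterDefault;
    }
}
```

The resolved value could then be applied to each uploaded file with Hadoop's {{FileSystem#setReplication(Path, short)}}, and a mini-cluster test could read it back via {{FileSystem#getFileStatus(path).getReplication()}}.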

> Allow set file.replication in Yarn Configuration
> ------------------------------------------------
>
>                 Key: FLINK-12343
>                 URL: https://issues.apache.org/jira/browse/FLINK-12343
>             Project: Flink
>          Issue Type: Improvement
>          Components: Command Line Client, Deployment / YARN
>    Affects Versions: 1.6.4, 1.7.2, 1.8.0
>            Reporter: Zhenqiu Huang
>            Assignee: Zhenqiu Huang
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, FlinkYarnSessionCli uploads jars to HDFS with the default replication factor of 3. From our production experience, a replication factor of 3 blocks large jobs (256 containers) from launching when HDFS is slow due to heavy load from batch pipelines. Thus, we want to make the replication factor customizable from FlinkYarnSessionCli by adding an option.


