You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "yuqi (JIRA)" <ji...@apache.org> on 2018/04/08 02:52:00 UTC

[jira] [Comment Edited] (FLINK-9143) Restart strategy defined in flink-conf.yaml is ignored

    [ https://issues.apache.org/jira/browse/FLINK-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429590#comment-16429590 ] 

yuqi edited comment on FLINK-9143 at 4/8/18 2:51 AM:
-----------------------------------------------------

En, Agree with you on this point [~Zentol], I also wonder why flink ignores configuration file and set restart strategy to fixed delay if users do not set strategy explicitly in code no matter configuration file is. [~StephanEwen] could you give us some hint ?

      in jobmanger, flink will sets strategy according to configuraion file if users forget to set this option, see jobmanager.scala 
{code:java}
val restartStrategy =
 Option(jobGraph.getSerializedExecutionConfig()
 .deserializeValue(userCodeLoader)
 .getRestartStrategy())
 .map(RestartStrategyFactory.createRestartStrategy)
 .filter(p => p != null) match {
 case Some(strategy) => strategy
 case None => restartStrategyFactory.createRestartStrategy()
 }{code}
Back to this issue. restart strategy will be set to fixed delay if no one was set by flink client. as show above, at this time restartStrategy on the jobmanager will never be `None` and thus cause problem.


was (Author: yuqi):
En, Agree with you on this point [~Zentol], I also wonder why flink ignores configuration file and set restart strategy to fixed delay if users do not set strategy explicitly in code no matter configuration file is. [~StephanEwen] could you give us some hint ?

      in jobmanger, flink will sets strategy according to configuraion file if users forget to set this option, see jobmanager.scala

 
{code:java}
val restartStrategy =
 Option(jobGraph.getSerializedExecutionConfig()
 .deserializeValue(userCodeLoader)
 .getRestartStrategy())
 .map(RestartStrategyFactory.createRestartStrategy)
 .filter(p => p != null) match {
 case Some(strategy) => strategy
 case None => restartStrategyFactory.createRestartStrategy()
 }{code}
 

 

Back to this issue. restart strategy will be set to fixed delay if no one was set by flink client. as show above, at this time restartStrategy on the jobmanager will never be `None` and thus cause problem.

> Restart strategy defined in flink-conf.yaml is ignored
> ------------------------------------------------------
>
>                 Key: FLINK-9143
>                 URL: https://issues.apache.org/jira/browse/FLINK-9143
>             Project: Flink
>          Issue Type: Bug
>          Components: Configuration
>    Affects Versions: 1.4.2
>            Reporter: Alex Smirnov
>            Assignee: yuqi
>            Priority: Major
>         Attachments: execution_config.png, jobmanager.log, jobmanager.png
>
>
> Restart strategy defined in flink-conf.yaml is disregarded, when user enables checkpointing.
> Steps to reproduce:
> 1. Download flink distribution (1.4.2), update flink-conf.yaml:
>  
> restart-strategy: none
> state.backend: rocksdb
> state.backend.fs.checkpointdir: file:///tmp/nfsrecovery/flink-checkpoints-metadata
> state.backend.rocksdb.checkpointdir: file:///tmp/nfsrecovery/flink-checkpoints-rocksdb
>  
> 2. create new java project as described at [https://ci.apache.org/projects/flink/flink-docs-release-1.4/quickstart/java_api_quickstart.html]
> here's the code:
> public class FailedJob
> {
>     static final Logger LOGGER = LoggerFactory.getLogger(FailedJob.class);
>     public static void main( String[] args ) throws Exception
>     {
>         final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
>         env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE);
>         DataStream<String> stream = env.fromCollection(Arrays.asList("test"));
>         stream.map(new MapFunction<String, String>(){
>             @Override
>             public String map(String obj) {
>                 throw new NullPointerException("NPE");
>             } 
>         });
>         env.execute("Failed job");
>     }
> }
>  
> 3. Compile: mvn clean package; submit it to the cluster
>  
> 4. Go to Job Manager configuration in WebUI, ensure settings from flink-conf.yaml is there (screenshot attached)
>  
> 5. Go to Job's configuration, see Execution Configuration section
>  
> *Expected result*: restart strategy as defined in flink-conf.yaml
>  
> *Actual result*: Restart with fixed delay (10000 ms). #[2147483647|tel:(214)%20748-3647] restart attempts.
>  
>  
> see attached screenshots and jobmanager log (line 1 and 31)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)