Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/06/02 10:26:04 UTC
[jira] [Commented] (FLINK-6643) Flink restarts job in HA even if NoRestartStrategy is set

    [ https://issues.apache.org/jira/browse/FLINK-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034458#comment-16034458 ] 

ASF GitHub Bot commented on FLINK-6643:
---------------------------------------

GitHub user zhangminglei opened a pull request:

    https://github.com/apache/flink/pull/4049

    [FLINK-6643] [JobManager] Flink restarts job in HA even if NoRestartStrategy is set
    
    Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration.
    If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html).
    In addition to going through the list, please provide a meaningful description of your changes.
    
    - [ ] General
      - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
      - The pull request addresses only one issue
      - Each commit in the PR has a meaningful commit message (including the JIRA id)
    
    - [ ] Documentation
      - Documentation has been added for new functionality
      - Old documentation affected by the pull request has been updated
      - JavaDoc for public methods has been added
    
    - [ ] Tests & Build
      - Functionality added by the pull request is covered by tests
      - `mvn clean verify` has been executed successfully locally or a Travis build has passed


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zhangminglei/flink flink-6643

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4049.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4049
    
----
commit 757fb9ddc57e76a0a854c91d8857520a1001e7aa
Author: zhangminglei <zm...@163.com>
Date:   2017-06-02T10:21:37Z

    [FLINK-6643] [JobManager] Flink restarts job in HA even if NoRestartStrategy is set

----


> Flink restarts job in HA even if NoRestartStrategy is set
> ---------------------------------------------------------
>
>                 Key: FLINK-6643
>                 URL: https://issues.apache.org/jira/browse/FLINK-6643
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 1.3.0
>            Reporter: Robert Metzger
>            Priority: Critical
>
> While testing Flink 1.3 RC1, I found that the JobManager is trying to recover a job that had the {{NoRestartStrategy}} set.
> {code}
> 2017-05-19 15:09:04,038 INFO  org.apache.flink.yarn.YarnJobManager                          - Attempting to recover all jobs.
> 2017-05-19 15:09:04,039 DEBUG org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore  - Retrieving all stored job ids from ZooKeeper under flink/application_1494870922226_0064/jobgraphs.
> 2017-05-19 15:09:04,041 INFO  org.apache.flink.yarn.YarnJobManager                          - There are 1 jobs to recover. Starting the job recovery.
> 2017-05-19 15:09:04,043 INFO  org.apache.flink.yarn.YarnJobManager                          - Attempting to recover job f94b1f7a0e9e3dbcb160c687e476ca77.
> 2017-05-19 15:09:04,043 DEBUG org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore  - Recovering job graph f94b1f7a0e9e3dbcb160c687e476ca77 from flink/application_1494870922226_0064/jobgraphs/f94b1f7a0e9e3dbcb160c687e476ca77.
> 2017-05-19 15:09:04,078 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2017-05-19 15:09:04,142 INFO  org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore  - Recovered SubmittedJobGraph(f94b1f7a0e9e3dbcb160c687e476ca77, JobInfo(clients: Set((Actor[akka.tcp://flink@permanent-qa-cluster-master.c.astral-sorter-757.internal:40391/user/$a#-155566858],EXECUTION_RESULT_AND_STATE_CHANGES)), start: 1495206476885)).
> 2017-05-19 15:09:04,142 INFO  org.apache.flink.yarn.YarnJobManager                          - Submitting recovered job f94b1f7a0e9e3dbcb160c687e476ca77.
> 2017-05-19 15:09:04,143 INFO  org.apache.flink.yarn.YarnJobManager                          - Submitting job f94b1f7a0e9e3dbcb160c687e476ca77 (CarTopSpeedWindowingExample) (Recovery).
> 2017-05-19 15:09:04,151 INFO  org.apache.flink.yarn.YarnJobManager                          - Using restart strategy NoRestartStrategy for f94b1f7a0e9e3dbcb160c687e476ca77.
> 2017-05-19 15:09:04,163 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job recovers via failover strategy: full graph restart
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)