Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/09/30 12:37:20 UTC

[jira] [Commented] (FLINK-4720) Implement an archived version of the execution graph

    [ https://issues.apache.org/jira/browse/FLINK-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15535878#comment-15535878 ] 

ASF GitHub Bot commented on FLINK-4720:
---------------------------------------

GitHub user zentol opened a pull request:

    https://github.com/apache/flink/pull/2577

    [FLINK-4720] Implement archived version of the ExecutionGraph

    This PR allows an ExecutionGraph to be archived. The archived version is a serializable snapshot of the EG at the time it was archived.
    
    To this end I implemented an archived version for the following classes:
    * ExecutionConfig
    * ExecutionGraph
    * ExecutionJobVertex
    * ExecutionVertex
    * Execution
    
    The archived versions are simple containers. They retain all fields required by the WebInterface handlers and expose them through corresponding getX() methods. They have no methods that change state; however, the containers are not guaranteed to be immutable. They are created by calling archive() on the respective original instance.
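    
    As a minimal sketch of the archive() idea, assuming two hypothetical classes (`RuntimeTask` and `ArchivedTask` are illustrative names, not classes from this PR): the runtime object copies its current field values into a getter-only, serializable container, and later changes to the runtime object do not affect the snapshot.

```java
import java.io.Serializable;

// Hypothetical stand-in for a runtime class such as Execution; not Flink's actual code.
class RuntimeTask {
    private final int attemptNumber;
    private volatile String state = "CREATED";

    RuntimeTask(int attemptNumber) {
        this.attemptNumber = attemptNumber;
    }

    // State-changing method that exists only on the runtime version.
    void transitionTo(String newState) {
        this.state = newState;
    }

    int getAttemptNumber() {
        return attemptNumber;
    }

    String getState() {
        return state;
    }

    // Copies the current field values into a serializable container.
    ArchivedTask archive() {
        return new ArchivedTask(attemptNumber, state);
    }
}

// Hypothetical archived counterpart: getters only, no methods that change state.
class ArchivedTask implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int attemptNumber;
    private final String state;

    ArchivedTask(int attemptNumber, String state) {
        this.attemptNumber = attemptNumber;
        this.state = state;
    }

    int getAttemptNumber() {
        return attemptNumber;
    }

    String getState() {
        return state;
    }
}
```

    In this sketch the snapshot holds only a primitive and a String, so it happens to be immutable; a container that copies references to mutable objects would not be, which is why immutability is not guaranteed in general.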
    
    To ensure compatibility with the existing handlers, a common 'Access' interface was added for each of the above classes. For example, there is now an `AccessExecutionGraph` interface, which both `ExecutionGraph` and `ArchivedExecutionGraph` implement. The name is more or less a placeholder. Several handlers, relevant `JobManagerMessages` and tests were adjusted accordingly. Apart from the `JobVertexBackPressureHandler`, all handlers now work on the `Access*` interfaces.
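    
    The shared-interface pattern could look roughly like the following. All names here (`AccessJob`, `LiveJob`, `ArchivedJob`, `JobSummaryHandler`) are hypothetical and much smaller than the real `Access*` interfaces; the point is only that a handler written against the interface accepts both the runtime and the archived object.

```java
import java.io.Serializable;

// Hypothetical read-only view, analogous in spirit to AccessExecutionGraph.
interface AccessJob {
    String getJobName();

    String getState();
}

// Hypothetical runtime implementation with mutable state.
class LiveJob implements AccessJob {
    private final String jobName;
    private String state = "RUNNING";

    LiveJob(String jobName) {
        this.jobName = jobName;
    }

    void finish() {
        this.state = "FINISHED";
    }

    @Override
    public String getJobName() {
        return jobName;
    }

    @Override
    public String getState() {
        return state;
    }

    ArchivedJob archive() {
        return new ArchivedJob(jobName, state);
    }
}

// Hypothetical archived implementation: a serializable container.
class ArchivedJob implements AccessJob, Serializable {
    private static final long serialVersionUID = 1L;

    private final String jobName;
    private final String state;

    ArchivedJob(String jobName, String state) {
        this.jobName = jobName;
        this.state = state;
    }

    @Override
    public String getJobName() {
        return jobName;
    }

    @Override
    public String getState() {
        return state;
    }
}

// Hypothetical handler: depends only on the interface, never on a concrete class.
class JobSummaryHandler {
    static String handle(AccessJob job) {
        return "{\"name\":\"" + job.getJobName() + "\",\"state\":\"" + job.getState() + "\"}";
    }
}
```

    With this shape, `JobSummaryHandler.handle(liveJob)` and `JobSummaryHandler.handle(liveJob.archive())` produce the same response as long as the job has not changed in between.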
    
    Furthermore, a number of methods were added to the existing `Execution*` classes. These serve two purposes:
     1. avoid user-defined classes, since the WebInterface/History server will (soon™) no longer have access to user classes
     2. prevent accesses upwards (i.e. from Execution to ExecutionVertex), otherwise the archived structure would be insane :)
    
    List of added methods:
    * `ExecutionConfig`:
     * `getRestartStrategyAsString()`; added due to the possibility of user-defined restart strategies in the future as per FLINK-4596
     * `getGlobalJobParametersMap()`; added since the `GlobalJobParameters` instances may be a user-defined class
    * `ExecutionGraph`:
     * `getFailureCauseAsString()`; added since the `Throwable` may be a user-defined exception
     * `getExecutionConfig()`, returns the deserialized `ExecutionConfig`; added since the user classloader is no longer available.
    * `ExecutionJobVertex`:
     * `getName()`, returns `getJobVertex().getName()`; added since `getJobVertex()` is no longer available.
     * `getCheckpointStats()`; added since `getGraph()` is no longer available.
    * `ExecutionVertex`:
     * `getFailureCauseAsString()`; see above
     * `getPriorExecutions()` (package private); required for construction of archived version
    * `Execution`:
     * `getFailureCauseAsString()`; see above
     * `getParallelSubtaskIndex()`; added since `getVertex()` is no longer available.
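    
    To illustrate both purposes behind the added methods, here is a sketch under stated assumptions: `SketchVertex`, `SketchExecution` and `ArchivedSketchExecution` are hypothetical names, and the stack-trace rendering uses plain JDK classes rather than whatever utility the PR actually uses. The failure cause is converted to a String so that no user-defined exception class has to be loaded later, and the subtask index is exposed directly so that the archived object needs no reference to its parent.

```java
import java.io.PrintWriter;
import java.io.Serializable;
import java.io.StringWriter;

// Hypothetical stand-in for ExecutionVertex; only the subtask index matters here.
class SketchVertex {
    private final int subtaskIndex;

    SketchVertex(int subtaskIndex) {
        this.subtaskIndex = subtaskIndex;
    }

    int getParallelSubtaskIndex() {
        return subtaskIndex;
    }
}

// Hypothetical stand-in for Execution.
class SketchExecution {
    private final SketchVertex vertex;
    private Throwable failureCause; // may be a user-defined exception type

    SketchExecution(SketchVertex vertex) {
        this.vertex = vertex;
    }

    void fail(Throwable cause) {
        this.failureCause = cause;
    }

    // Delegates to the parent so that callers never need an upward accessor.
    int getParallelSubtaskIndex() {
        return vertex.getParallelSubtaskIndex();
    }

    // Renders the failure as plain text; the empty string for "no failure"
    // is a choice made for this sketch only.
    String getFailureCauseAsString() {
        if (failureCause == null) {
            return "";
        }
        StringWriter text = new StringWriter();
        failureCause.printStackTrace(new PrintWriter(text, true));
        return text.toString();
    }

    // The archived version stores plain values: no parent reference, no Throwable.
    ArchivedSketchExecution archive() {
        return new ArchivedSketchExecution(getParallelSubtaskIndex(), getFailureCauseAsString());
    }
}

class ArchivedSketchExecution implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int parallelSubtaskIndex;
    private final String failureCause;

    ArchivedSketchExecution(int parallelSubtaskIndex, String failureCause) {
        this.parallelSubtaskIndex = parallelSubtaskIndex;
        this.failureCause = failureCause;
    }

    int getParallelSubtaskIndex() {
        return parallelSubtaskIndex;
    }

    String getFailureCauseAsString() {
        return failureCause;
    }
}
```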
    
    The `prepareForArchiving` methods were removed, since the `ExecutionGraph` and its components no longer need to be modified for the purpose of archiving.
    
    The `CheckpointStats` and `JobCheckpointStats` classes now implement the `Serializable` interface, so that we do not need a separate archived implementation for them as well.
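    
    Making a stats class `Serializable` lets an archived container embed it directly. The following sketch shows a Java serialization round trip with hypothetical classes (`SketchStats`, `SketchArchive`, `RoundTrip`); the field names are assumptions for illustration and do not mirror the real `CheckpointStats`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stats class; Serializable so it needs no separate archived twin.
class SketchStats implements Serializable {
    private static final long serialVersionUID = 1L;

    final long checkpointId;
    final long durationMillis;

    SketchStats(long checkpointId, long durationMillis) {
        this.checkpointId = checkpointId;
        this.durationMillis = durationMillis;
    }
}

// Hypothetical archived container that embeds the stats object as-is.
class SketchArchive implements Serializable {
    private static final long serialVersionUID = 1L;

    final String jobName;
    final SketchStats stats;

    SketchArchive(String jobName, SketchStats stats) {
        this.jobName = jobName;
        this.stats = stats;
    }
}

class RoundTrip {
    // Serializes the value to bytes and reads it back, as a transfer between processes would.
    static <T extends Serializable> T copy(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T result = (T) in.readObject();
                return result;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

    If `SketchStats` did not implement `Serializable`, `writeObject` would fail with a `NotSerializableException` for the enclosing container.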

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 4720_archive_exec_graph

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2577.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2577
    
----
commit 1af9853ff75b146acbd0076df786aca6b2ac829b
Author: zentol <ch...@apache.org>
Date:   2016-09-22T12:02:22Z

    Add common interface for runtime and archived EG

commit bdc16949e4af91cd5f9c78347e1a39eabd39cfd1
Author: zentol <ch...@apache.org>
Date:   2016-09-19T12:01:53Z

    Implement archived version of ExecutionGraph components

commit 5270fce2e847c0a310f1e25760b749cb3888886a
Author: zentol <ch...@apache.org>
Date:   2016-09-19T12:02:23Z

    Adjust ExecutionGraph components/ add archive() methods

commit c67147659c92d996823d31b4ff1b36011ad1f26a
Author: zentol <ch...@apache.org>
Date:   2016-09-22T12:28:52Z

    Adjust WebInterface handlers

commit 6554137f451d7e036b27dab25f5c282d3809f655
Author: zentol <ch...@apache.org>
Date:   2016-09-23T10:38:02Z

    Adjust JobManager/Archive messages

commit f8b3348721e213f0814d59ad2f4520fef2962b0e
Author: zentol <ch...@apache.org>
Date:   2016-09-28T15:02:51Z

    The mother of tests

----


> Implement an archived version of the execution graph
> ----------------------------------------------------
>
>                 Key: FLINK-4720
>                 URL: https://issues.apache.org/jira/browse/FLINK-4720
>             Project: Flink
>          Issue Type: Improvement
>          Components: JobManager, Webfrontend
>    Affects Versions: 1.1.2
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>             Fix For: 1.2.0
>
>
> In order to implement a job history server, as well as separate the JobManager from the WebInterface, we require an archived version of the ExecutionGraph that is Serializable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)