You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by zentol <gi...@git.apache.org> on 2016/09/30 12:36:46 UTC

[GitHub] flink pull request #2577: [FLINK-4720] Implement archived version of the Exe...

GitHub user zentol opened a pull request:

    https://github.com/apache/flink/pull/2577

    [FLINK-4720] Implement archived version of the ExecutionGraph

    This PR allows an ExecutionGraph to be archived. The archived version is a serializable snapshot of the EG at the time it was archived.
    
    To this end I implemented an archived version for the following classes:
    * ExecutionConfig
    * ExecutionGraph
    * ExecutionJobVertex
    * ExecutionJobVertex
    * Execution
    
    The archived versions are simple containers. They retain all fields that were required by the WebInterface handlers and have appropriate getX() methods. There are no methods that change state, however the containers are not guaranteed to be immutable. They are created by calling archive() on the respective original instance.
    
    To ensure compatibility with the existing handlers a common 'Access' interface was added for each of the above classes. This means that there is now an `AccessExecutionGraph` interface, which both `ExecutionGraph` and `ArchivedExecutionGraph` implement. The name is more or less a placeholder. Several handlers, relevant `JobManagerMessages` and tests were adjusted accordingly. Apart from the `JobVertexBackPressureHandler` all handlers now work on the `Access*` interfaces.
    
    Furthermore, a number of methods were be added to the existing `Execution*` classes. These serve 2 purposes:
     1. avoid user-defined classes, since the WebInterface/History server will (soon&#x2122;) no longer have access to user classes
     2. prevent accesses upwards (i.e from Execution to ExecutionVertex), otherwise the archived structure would be insane :)
    
    List of added methods:
    * ExecutionConfig:
     * `getRestartStrategyAsString(); added due to the possibility of user-defined restart strategies in the future as per FLINK-4596
     * `getGlobalJobParametersMap()`; added since the `GlobalJobParameters` instances may be a user-defined class
    * `ExecutionGraph`:
     * `getFailureCauseAsString()`; added since the `Throwable` may by a user-defined exception
     * `getExecutionConfig()`, returns the deserialized `ExecutionConfig`; added since the user classloader is no longer available.
    * `ExecutionJobVertex`:
     * `getName()`, returns `getJobVertex().getName()`; Added since `getJobVertex()` is no longer available.
     * `getCheckpointStats()`; Added since `getGraph()` is no longer available.
    * `ExecutionVertex`:
     * `getFailureCauseAsString()`: see above
     * `getPriorExecutions()` (package private); required for construction of archived version
    * `Execution`:
     * `getFailureCauseAsString()`; see above
     * `getParallelSubtaskIndex()`; added since `getVertex()` is no longer available.
    
    The `prepareForArchiving` methods were removed since changes to the `ExecutionGraph` or others, for the purpose of archiving, are no longer necessary.
    
    The `CheckpointStats` and `JobCheckpointStats` classes now implement the `Serializable` interface, so that we do not need a separate archived implementation for them as well.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 4720_archive_exec_graph

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2577.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2577
    
----
commit 1af9853ff75b146acbd0076df786aca6b2ac829b
Author: zentol <ch...@apache.org>
Date:   2016-09-22T12:02:22Z

    Add common interface for runtime and archived EG

commit bdc16949e4af91cd5f9c78347e1a39eabd39cfd1
Author: zentol <ch...@apache.org>
Date:   2016-09-19T12:01:53Z

    Implement archived version of ExecutionGraph components

commit 5270fce2e847c0a310f1e25760b749cb3888886a
Author: zentol <ch...@apache.org>
Date:   2016-09-19T12:02:23Z

    Adjust ExecutionGraph components/ add archive() methods

commit c67147659c92d996823d31b4ff1b36011ad1f26a
Author: zentol <ch...@apache.org>
Date:   2016-09-22T12:28:52Z

    Adjust WebInterface handlers

commit 6554137f451d7e036b27dab25f5c282d3809f655
Author: zentol <ch...@apache.org>
Date:   2016-09-23T10:38:02Z

    Adjust JobManager/Archive messages

commit f8b3348721e213f0814d59ad2f4520fef2962b0e
Author: zentol <ch...@apache.org>
Date:   2016-09-28T15:02:51Z

    The mother of tests

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the issue:

    https://github.com/apache/flink/pull/2577
  
    Thanks. Looks good to merge!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/2577
  
    I've added the Archiveable interface. I also added it to the ExecutionConfig, which required moving the ExecutionConfigSummary to flink-core. Since this move destroys the history anyway i renamed it to ArchivedExecutionConfig for consistency purposes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the issue:

    https://github.com/apache/flink/pull/2577
  
    >I am using the ExecutionConfigSummary (and removed my original ArchivedExecutionConfig) when i rebased, so I'm not sure what you mean :/
    
    That's what I meant, I wondered why you didn't use your archive design pattern for `ExecutionConfig`.
    
    What do you think about the `Archiveable<T extends Serializable>` interface I proposed? 
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the issue:

    https://github.com/apache/flink/pull/2577
  
    Thanks for the pr @zentol. Looks good to me! A couple of things I noticed: 
    
    - The prefix of the interfaces (`Access*`) could be a bit confusing. What about these?
      - interface: `ExecutionGraph`
      - runtime:`RuntimeExecutionGraph`
      - archive: `ArchivedExecutionGraph` 
    
      That said, you probably chose to not rename the existing `ExecutionGraph` class to minimize changes.
    
    - How about having an interface `Archiveable<T extends Serializable>` containing the `T archive()` method? This interface could be implemented by all runtime classes which are archievable.
    
    - You chose not to port the `ExecutionConfig` or `ExecutionConfigSummary` to your changes while rebasing?
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #2577: [FLINK-4720] Implement archived version of the Exe...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/2577


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/2577
  
    thank you for your review @mxm, will merge this now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/2577
  
    I did not want to rename all usages of ExecutionGraph, correct. Otherwise i would've went with exactly the names you suggested.
    
    I am using the ExecutionConfigSummary (and removed my original ArchivedExecutionConfig) when i rebased, so I'm not sure what you mean :/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/2577
  
    Ah i see. I removed it since it was pretty much identical to the ExecutionConfigSummary class; it would've been a rename essentially. I like the interface idea, makes it a bit more organized.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2577: [FLINK-4720] Implement archived version of the ExecutionG...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/2577
  
    Rebasing right now...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---