You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/10/20 10:23:01 UTC

[jira] [Commented] (FLINK-7876) Register TaskManagerMetricGroup under ResourceID instead of InstanceID

    [ https://issues.apache.org/jira/browse/FLINK-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212469#comment-16212469 ] 

ASF GitHub Bot commented on FLINK-7876:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/4872

    [FLINK-7876] Register TaskManagerMetricGroup under ResourceID

    ## What is the purpose of the change
    
    This commit changes that TaskManagerMetricGroups are now registered under the
    TaskManager's ResourceID instead of the InstanceID. This allows to create the
    TaskManagerMetricGroup at startup of the TaskManager.
    
    Moreover, it pulls the MetricRegistry out of JobManager and TaskManager. This
    allows to reuse the same MetricRegistry across multiple instances (e.g. in the
    FlinkMiniCluster case). Moreover, it ensures proper cleanup of a potentially
    started MetricyQueryServiceActor.
    
    Last but not least the PR corrects how the `MetricRegistry` is instantiated in the `ClusterEntrypoint`.
    
    ## Brief change log
    
    - Pull MetricRegistry out of `JobManager` and `TaskManager`
    - Provide `TaskManagerMetricGroup` and `JobManagerMetricGroup` at start up of respective component
    - Register `TaskManagerMetricGroup` under `TaskManager's` `ResourceID`
    - Adapt `TaskManagersHandler` to serve `ResourceID` instead of `InstanceID`
    - Adapt `MetricFetcher` to retain `TaskManagers` based on `ResourceID`
    - Correct instantiation of `MetricRegistry` in `ClusterEntrypoint`
    - Make `MetricRegistry` an interface and rename `MetricRegistry` into `MetricRegistryImpl` as one of the implementations of the `MetricRegistry` interface
    - Introduce `NoOpMetricRegistry` implementation which simply does nothing and can be used when testing components for which one does not want to register metrics
    
    ## Verifying this change
    
    Tested manually.
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (no)
      - If yes, how is the feature documented? (not applicable)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink taskManagerMetricGroupRegistration

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4872.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4872
    
----
commit 9106445f757099763c350b96dc0a0afad8f9b081
Author: Till <ti...@gmail.com>
Date:   2017-10-18T11:50:06Z

    [FLINK-7863] Generalize MetricFetcher to work with a RestfulGateway
    
    Add more logging to MetricFetcher

commit 577ac1d23afe9fe04d1026600e15f26270f8252e
Author: Till <ti...@gmail.com>
Date:   2017-10-19T12:51:07Z

    Fix MetricFetcherTest

commit 7371c68e09ed983e21ee46110566f60b57dcf942
Author: Till <ti...@gmail.com>
Date:   2017-10-19T15:22:53Z

    [FLINK-7876] Register TaskManagerMetricGroup under ResourceID
    
    This commit changes that TaskManagerMetricGroups are now registered under the
    TaskManager's ResourceID instead of the InstanceID. This allows to create the
    TaskManagerMetricGroup at startup of the TaskManager.
    
    Moreover, it pulls the MetricRegistry out of JobManager and TaskManager. This
    allows to reuse the same MetricRegistry across multiple instances (e.g. in the
    FlinkMiniCluster case). Moreover, it ensures proper cleanup of a potentially
    started MetricyQueryServiceActor.
    
    Change TaskManagersHandler to work with ResourceID instead of InstanceID
    
    Adapt MetricFetcher to use ResourceID instead of InstanceID

commit 660d33ee72de982996e49a88f99591a2e2062b22
Author: Till <ti...@gmail.com>
Date:   2017-10-20T10:13:10Z

    [FLINK-7876] Properly start and shutdown MetricRegistry by ClusterEntrypoint

----


> Register TaskManagerMetricGroup under ResourceID instead of InstanceID
> ----------------------------------------------------------------------
>
>                 Key: FLINK-7876
>                 URL: https://issues.apache.org/jira/browse/FLINK-7876
>             Project: Flink
>          Issue Type: Improvement
>          Components: Metrics
>    Affects Versions: 1.4.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Minor
>              Labels: flip-6
>
> Currently, the {{TaskManager}} registers the {{TaskManagerMetricGroup}} under its {{InstanceID}} and thereby binding its metrics effectively to the lifetime of its registration with the {{JobManager}}. This has also implications how the REST handler retrieve the TaskManager metrics, namely by its {{InstanceID}}.
> I would actually propose to register the {{TaskManagerMetricGroup}} under the {{TaskManager}}/{{TaskExecutor}} {{ResourceID}} which is valid over the whole lifetime of the {{TaskManager}}/{{TaskExecutor}}. That way we would also be able to query metrics independent of the connection status to the {{JobManager}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)