Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/08/21 16:47:00 UTC

[jira] [Commented] (FLINK-10150) Inconsistent number of "Records received" / "Records sent"

    [ https://issues.apache.org/jira/browse/FLINK-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16587716#comment-16587716 ] 

ASF GitHub Bot commented on FLINK-10150:
----------------------------------------

zentol opened a new pull request #6599: [FLINK-10150][metrics] Fix OperatorMetricGroup creation for Batch
URL: https://github.com/apache/flink/pull/6599
 
 
   ## What is the purpose of the change
   
   This PR fixes a severe issue in the metric system where chained batch operators would always operate on the same `OperatorMetricGroup`. As a result, most Flink-provided metrics were not exposed for chained operators at all, while other metrics, like task-level IO metrics, were rendered incorrect.
   
   The problem is that we used the task's `VertexID` to identify operators, which is necessarily identical for all operators in a chain. We now use the vertex ID together with the operator name to identify them.
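
A minimal sketch of why a vertex-ID-only key collapses a chain into one group, and how adding the operator name separates them. This is not the actual `TaskMetricGroup` code: the `Group` class, the method names, the `"/"` key separator, and the operator names are all hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class OperatorGroupKeySketch {

    /** Hypothetical stand-in for a per-operator metric group. */
    static final class Group {
        final String operatorName;

        Group(String operatorName) {
            this.operatorName = operatorName;
        }
    }

    /** Keyed by vertex ID only: every operator in the chain maps to the same entry. */
    static Map<String, Group> registerByVertexId(String vertexId, String... operatorNames) {
        Map<String, Group> groups = new HashMap<>();
        for (String name : operatorNames) {
            // The key is identical for all chained operators, so only the first one creates a group.
            groups.computeIfAbsent(vertexId, k -> new Group(name));
        }
        return groups;
    }

    /** Keyed by vertex ID plus operator name: each chained operator gets its own entry. */
    static Map<String, Group> registerByVertexIdAndName(String vertexId, String... operatorNames) {
        Map<String, Group> groups = new HashMap<>();
        for (String name : operatorNames) {
            // Assumed composite key format; the real fix may build its key differently.
            groups.computeIfAbsent(vertexId + "/" + name, k -> new Group(name));
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] chain = {"DataSource", "FlatMap", "Combine"}; // illustrative operator names
        System.out.println(registerByVertexId("vertex-1", chain).size());        // prints 1
        System.out.println(registerByVertexIdAndName("vertex-1", chain).size()); // prints 3
    }
}
```

With the first scheme, all three chained operators would report into a single group, which is consistent with the missing and miscounted metrics described above.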
   
   ## Brief change log
   
   * fix identification in `TaskMetricGroup` by using both the ID and operator name
   * extend `MockEnvironment[Builder]` to allow the `TaskMetricGroup` to be set
   
   ## Verifying this change
   
   This change added a test and can also be verified manually:
   * `ChainedOperatorsMetricTest`
   * run a basic WordCount as described in the JIRA and verify the results via the UI/reporter of your choice
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Inconsistent number of "Records received" / "Records sent"
> ----------------------------------------------------------
>
>                 Key: FLINK-10150
>                 URL: https://issues.apache.org/jira/browse/FLINK-10150
>             Project: Flink
>          Issue Type: Bug
>          Components: Metrics, Webfrontend
>    Affects Versions: 1.4.0, 1.5.0, 1.6.0, 1.7.0
>            Reporter: Helmut Zechmann
>            Assignee: Chesnay Schepler
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.4.3, 1.6.1, 1.7.0, 1.5.4
>
>         Attachments: record_counts_flink_1_3.png, record_counts_flink_1_4.png
>
>
> The Flink web UI displays an inconsistent number of "Records received" / "Records sent" in the job overview "Subtasks" view.
> When I run the example wordcount batch job with a small input file on flink 1.3.2 I get
>  * 3 records sent by the first subtask and
>  * 3 records received by the second subtask
> This is the result I would expect.
>  
> If I run the same job on flink 1.4.0 / 1.5.2 / 1.6.0 I get
>  * 13 records sent by the first subtask and
>  * 3 records received by the second subtask
> In real-life jobs the numbers are even stranger.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)