Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/02/01 19:49:39 UTC

[jira] [Commented] (FLINK-3160) Aggregate operator statistics by TaskManager

    [ https://issues.apache.org/jira/browse/FLINK-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126783#comment-15126783 ] 

ASF GitHub Bot commented on FLINK-3160:
---------------------------------------

Github user greghogan commented on the pull request:

    https://github.com/apache/flink/pull/1564#issuecomment-178124299
  
    With WordCount running on a single node hosting two TaskManagers, the first screenshot shows the current UI (with 'Overview' renamed to 'Subtasks') and the second shows the new 'TaskManagers' tab.
    
    The port is appended to the hostname in both tabs to disambiguate TaskManagers, although this deserves further consideration. It would be cleaner to use a per-host index for TaskManagers rather than the 5-digit port (i.e., 127.0.0.1:0, 127.0.0.1:1, ...).
    
    Subtasks and TaskManagers are now sorted by host.
    
    ![Subtasks](https://cloud.githubusercontent.com/assets/569655/12725776/54f0fa14-c8e2-11e5-975e-c156ec1c6688.png)
    
    ![TaskManagers](https://cloud.githubusercontent.com/assets/569655/12725820/8b92e960-c8e2-11e5-9596-a96120e99834.png)
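
The per-host index idea mentioned above could look roughly like the following. This is only a sketch, not code from the pull request: the class `TaskManagerLabels` and the method `perHostLabels` are hypothetical names, and it assumes each TaskManager is identified by a distinct `host:port` string.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of Flink. */
public class TaskManagerLabels {

    /**
     * Maps each distinct "host:port" identifier to a "host:index" label,
     * where the index counts TaskManagers on the same host in port order.
     * The returned map iterates sorted by host, then by port.
     */
    static Map<String, String> perHostLabels(List<String> hostPorts) {
        List<String> sorted = new ArrayList<>(hostPorts);
        // Sort by host name first, then numerically by port.
        sorted.sort(Comparator
                .comparing((String hp) -> hp.substring(0, hp.lastIndexOf(':')))
                .thenComparingInt(hp -> Integer.parseInt(hp.substring(hp.lastIndexOf(':') + 1))));

        Map<String, Integer> seenPerHost = new HashMap<>();
        Map<String, String> labels = new LinkedHashMap<>();
        for (String hp : sorted) {
            String host = hp.substring(0, hp.lastIndexOf(':'));
            // merge returns the updated count, so the first TaskManager on a host gets index 0.
            int index = seenPerHost.merge(host, 1, Integer::sum) - 1;
            labels.put(hp, host + ":" + index);
        }
        return labels;
    }
}
```

One consequence of this approach is that the labels are only stable while the set of TaskManagers on a host is unchanged; if one is restarted on a different port, the indices may shift.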



> Aggregate operator statistics by TaskManager
> --------------------------------------------
>
>                 Key: FLINK-3160
>                 URL: https://issues.apache.org/jira/browse/FLINK-3160
>             Project: Flink
>          Issue Type: Improvement
>          Components: Webfrontend
>    Affects Versions: 1.0.0
>            Reporter: Greg Hogan
>            Assignee: Greg Hogan
>
> The web client job info page presents a table of the following per-task statistics: start time, end time, duration, bytes received, records received, bytes sent, records sent, attempt, host, status.
> Flink supports clusters with thousands of slots, and a job with high parallelism makes this job info page unwieldy and difficult to analyze in real time.
> It would be helpful to optionally or automatically aggregate statistics by TaskManager. These rows could then be expanded to reveal the current per-task statistics.
> Start time, end time, duration, and attempt are not applicable to a TaskManager since new tasks for repeated attempts may be started. Bytes received, records received, bytes sent, and records sent are summed. Any throughput metrics can be averaged over the total task time or time window. Status could reference the number of running tasks on the TaskManager or an idle state.
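
The aggregation rules in the description above can be illustrated with a short sketch. It is not the implementation from the pull request: `SubtaskStats`, `TaskManagerStats`, and `aggregateByTaskManager` are hypothetical names, the boolean `running` flag stands in for the real task status, and throughput averaging is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; these types are not Flink classes. */
public class TaskManagerAggregation {

    /** One row of the existing per-task table (only the summed fields). */
    static final class SubtaskStats {
        final String host;
        final long bytesReceived, recordsReceived, bytesSent, recordsSent;
        final boolean running;

        SubtaskStats(String host, long bytesReceived, long recordsReceived,
                     long bytesSent, long recordsSent, boolean running) {
            this.host = host;
            this.bytesReceived = bytesReceived;
            this.recordsReceived = recordsReceived;
            this.bytesSent = bytesSent;
            this.recordsSent = recordsSent;
            this.running = running;
        }
    }

    /** One aggregated row per TaskManager, keeping the tasks for expansion. */
    static final class TaskManagerStats {
        long bytesReceived, recordsReceived, bytesSent, recordsSent;
        int runningTasks;
        final List<SubtaskStats> subtasks = new ArrayList<>();

        /** Status is the number of running tasks, or "idle" if there are none. */
        String status() {
            return runningTasks > 0 ? runningTasks + " running" : "idle";
        }
    }

    /** Sums the counters per host; the TreeMap keeps rows sorted by host string. */
    static Map<String, TaskManagerStats> aggregateByTaskManager(List<SubtaskStats> subtasks) {
        Map<String, TaskManagerStats> byHost = new TreeMap<>();
        for (SubtaskStats s : subtasks) {
            TaskManagerStats tm = byHost.computeIfAbsent(s.host, h -> new TaskManagerStats());
            tm.bytesReceived += s.bytesReceived;
            tm.recordsReceived += s.recordsReceived;
            tm.bytesSent += s.bytesSent;
            tm.recordsSent += s.recordsSent;
            if (s.running) {
                tm.runningTasks++;
            }
            tm.subtasks.add(s); // retained so a row can be expanded to per-task detail
        }
        return byHost;
    }
}
```

Start time, end time, duration, and attempt are deliberately absent from the aggregated row, matching the description's point that they do not apply to a TaskManager.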



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)