Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 05:37:23 UTC

[jira] [Resolved] (SPARK-5225) Support coalesced Input Metrics from different sources

     [ https://issues.apache.org/jira/browse/SPARK-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-5225.
---------------------------------
    Resolution: Incomplete

> Support coalesced Input Metrics from different sources
> ------------------------------------------------------
>
>                 Key: SPARK-5225
>                 URL: https://issues.apache.org/jira/browse/SPARK-5225
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Kostas Sakellis
>            Priority: Major
>              Labels: bulk-closed
>
> Currently, if a task reads data from more than one block and the blocks come from different read methods, we ignore the bytes read by the second read method. For example:
> {noformat}                
>               CoalescedRDD
>                    | 
>                  Task1 
>              /      |      \           
>          hadoop  hadoop  cached
> {noformat}
> If Task1 starts reading from the hadoop blocks first, then the input metrics for Task1 will only contain input metrics from the hadoop blocks and ignore the input metrics from the cached blocks. We need to change the way we collect input metrics so that a task reports a collection of input metrics rather than a single value.
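
The proposal in the description (a collection of input metrics per task instead of a single value) could be sketched roughly as below. This is only an illustration in Python, not Spark's actual API: the names ReadMethod, TaskInputMetrics, record and total_bytes are hypothetical, and the byte counts are made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from enum import Enum


class ReadMethod(Enum):
    """Hypothetical read methods, mirroring the diagram above."""
    HADOOP = "hadoop"
    CACHED = "cached"


@dataclass
class TaskInputMetrics:
    """Hypothetical per-task metrics: one byte counter per read method,
    rather than a single counter tied to the first read method seen."""
    bytes_by_method: dict = field(default_factory=lambda: defaultdict(int))

    def record(self, method: ReadMethod, num_bytes: int) -> None:
        # Accumulate into the counter for this read method, so a later
        # read with a different method is not dropped.
        self.bytes_by_method[method] += num_bytes

    @property
    def total_bytes(self) -> int:
        # Total input across all read methods used by the task.
        return sum(self.bytes_by_method.values())


# Task1 from the diagram: two hadoop blocks, then one cached block.
task1 = TaskInputMetrics()
task1.record(ReadMethod.HADOOP, 100)
task1.record(ReadMethod.HADOOP, 50)
task1.record(ReadMethod.CACHED, 30)
```

With this shape, the bytes read from the cached block are kept alongside the hadoop bytes, and a consumer can report either the per-method breakdown or the total.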



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
