Posted to issues@spark.apache.org by "Jackey Lee (Jira)" <ji...@apache.org> on 2022/03/18 05:17:00 UTC

[jira] [Updated] (SPARK-37831) Add task partition id in metrics

     [ https://issues.apache.org/jira/browse/SPARK-37831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jackey Lee updated SPARK-37831:
-------------------------------
    Affects Version/s:     (was: 3.2.1)

> Add task partition id in metrics
> --------------------------------
>
>                 Key: SPARK-37831
>                 URL: https://issues.apache.org/jira/browse/SPARK-37831
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.3.0
>            Reporter: Jackey Lee
>            Priority: Major
>
> There is no partition id in the current task metrics, which makes it difficult to trace stage-level metrics such as stage shuffle read, especially when there are stage retries. It also makes it impossible to compare task metrics between different applications.
> {code:scala}
> class TaskData private[spark](
>     val taskId: Long,
>     val index: Int,
>     val attempt: Int,
>     val launchTime: Date,
>     val resultFetchStart: Option[Date],
>     @JsonDeserialize(contentAs = classOf[JLong])
>     val duration: Option[Long],
>     val executorId: String,
>     val host: String,
>     val status: String,
>     val taskLocality: String,
>     val speculative: Boolean,
>     val accumulatorUpdates: Seq[AccumulableInfo],
>     val errorMessage: Option[String] = None,
>     val taskMetrics: Option[TaskMetrics] = None,
>     val executorLogs: Map[String, String],
>     val schedulerDelay: Long,
>     val gettingResultTime: Long) {code}
> Adding a partitionId field to TaskData would not only make it easy to trace task metrics, but would also make it possible to collect metrics for the actual stage output, especially when a stage is retried.
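For illustration, a minimal hypothetical sketch of the idea (the trimmed record and helper below are assumptions for this example, not the actual SPARK-37831 patch or the real org.apache.spark.status.api.v1.TaskData): with a partitionId on each task record, a metrics consumer can deduplicate stage retries by keeping only the latest attempt per partition.

```scala
// Hypothetical, trimmed task record with the proposed partitionId field.
case class TaskRecord(
    taskId: Long,
    partitionId: Int, // proposed field: which partition this task computed
    attempt: Int,
    shuffleReadBytes: Long)

object PartitionMetrics {
  // With partitionId available, metrics from stage retries can be
  // deduplicated: keep only the latest attempt for each partition.
  def latestPerPartition(tasks: Seq[TaskRecord]): Map[Int, TaskRecord] =
    tasks.groupBy(_.partitionId).map { case (pid, ts) =>
      pid -> ts.maxBy(_.attempt)
    }

  def main(args: Array[String]): Unit = {
    val tasks = Seq(
      TaskRecord(1L, partitionId = 0, attempt = 0, shuffleReadBytes = 100L),
      TaskRecord(2L, partitionId = 0, attempt = 1, shuffleReadBytes = 120L), // retry
      TaskRecord(3L, partitionId = 1, attempt = 0, shuffleReadBytes = 90L))
    val latest = latestPerPartition(tasks)
    // Partition 0's retry (attempt 1) replaces its first attempt.
    println(latest(0).shuffleReadBytes) // 120
  }
}
```

Without a partition id, the two attempts for partition 0 above are indistinguishable from two different tasks, so stage-level aggregates double-count the retried work.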



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
