Posted to issues@spark.apache.org by "Luca Canali (Jira)" <ji...@apache.org> on 2019/12/19 14:55:00 UTC
[jira] [Created] (SPARK-30306) Instrument Python UDF execution time and metrics using Spark Metrics system
Luca Canali created SPARK-30306:
-----------------------------------
Summary: Instrument Python UDF execution time and metrics using Spark Metrics system
Key: SPARK-30306
URL: https://issues.apache.org/jira/browse/SPARK-30306
Project: Spark
Issue Type: Improvement
Components: PySpark, Spark Core
Affects Versions: 3.0.0
Reporter: Luca Canali
This proposes extending Spark instrumentation with metrics aimed at understanding the performance of Python code called by Spark via UDF, Pandas UDF, or mapPartitions. The relevant performance counters are exposed through the Spark Metrics System (based on the Dropwizard library), which makes it easy to consume the metrics produced by executors, for example in a performance dashboard. See also the attached screenshot.
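
For executor metrics to reach a dashboard, a metrics sink has to be configured. The following is a minimal sketch, not part of this proposal, of the Spark properties that would enable the existing Graphite sink. The host name graphite.example.org, the default port, the reporting period, and the helper function names are assumptions chosen for illustration. The names of the proposed Python UDF metrics are not given in this issue, so none are assumed here.

```python
# Sketch only: builds Spark properties that enable the Graphite metrics sink.
# The endpoint and the helper names below are hypothetical placeholders.

def graphite_sink_conf(host, port=2003, period_seconds=10, prefix="spark"):
    """Return Spark properties that route metrics to a Graphite endpoint."""
    # Metrics settings can be passed as Spark properties using the
    # "spark.metrics.conf." prefix; "*" applies them to all instances
    # (driver, executors, and so on).
    base = "spark.metrics.conf.*.sink.graphite"
    return {
        f"{base}.class": "org.apache.spark.metrics.sink.GraphiteSink",
        f"{base}.host": host,
        f"{base}.port": str(port),
        f"{base}.period": str(period_seconds),
        f"{base}.unit": "seconds",
        f"{base}.prefix": prefix,
    }


def to_submit_args(conf):
    """Flatten the properties into spark-submit style --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# Hypothetical Graphite endpoint, used only for illustration.
conf = graphite_sink_conf("graphite.example.org")
for arg in to_submit_args(conf):
    print(arg)
```

The same key/value pairs could instead be set on a SparkConf or placed in a metrics.properties file (without the spark.metrics.conf. prefix), depending on how the job is deployed.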
--
This message was sent by Atlassian Jira
(v8.3.4#803005)