Posted to issues@spark.apache.org by "Luca Canali (Jira)" <ji...@apache.org> on 2019/12/19 14:55:00 UTC

[jira] [Created] (SPARK-30306) Instrument Python UDF execution time and metrics using Spark Metrics system

Luca Canali created SPARK-30306:
-----------------------------------

             Summary: Instrument Python UDF execution time and metrics using Spark Metrics system
                 Key: SPARK-30306
                 URL: https://issues.apache.org/jira/browse/SPARK-30306
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, Spark Core
    Affects Versions: 3.0.0
            Reporter: Luca Canali


This proposes to extend Spark instrumentation with metrics aimed at understanding the performance of Python code called by Spark via UDF, Pandas UDF, or mapPartitions. The relevant performance counters are exposed through the Spark Metrics System (based on the Dropwizard library). This makes it easy to consume the metrics produced by executors, for example in a performance dashboard. See also the attached screenshot.
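
To illustrate the kind of counters meant here, the following is a minimal plain-Python sketch, not the implementation proposed in this ticket. It wraps a function applied to the rows of one partition and accumulates a row count and the time spent inside the function. The names PythonUdfCounters and timed_map_partition are hypothetical and do not come from Spark; in the actual proposal the values would be collected on the executors and published through the Spark Metrics System rather than kept in a local object.

```python
import time
from dataclasses import dataclass


@dataclass
class PythonUdfCounters:
    """Hypothetical counters; the names are illustrative, not taken from Spark."""
    rows_processed: int = 0
    exec_time_ns: int = 0


def timed_map_partition(func, rows, counters):
    """Apply func to each row of one partition, recording the row count
    and the cumulative time spent inside func."""
    for row in rows:
        start = time.perf_counter_ns()
        result = func(row)
        counters.exec_time_ns += time.perf_counter_ns() - start
        counters.rows_processed += 1
        yield result


# Usage: instrument a simple function over an in-memory "partition".
counters = PythonUdfCounters()
partition = [1, 2, 3, 4]
squared = list(timed_map_partition(lambda x: x * x, partition, counters))
```

After the generator is consumed, counters.rows_processed holds the number of rows handled and counters.exec_time_ns the total time spent in the user function, which are the sort of values a dashboard could plot per executor.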



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org