Posted to issues@spark.apache.org by "Huon Wilson (JIRA)" <ji...@apache.org> on 2019/04/19 03:54:00 UTC

[jira] [Commented] (SPARK-22044) explain function with codegen and cost parameters

    [ https://issues.apache.org/jira/browse/SPARK-22044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821645#comment-16821645 ] 

Huon Wilson commented on SPARK-22044:
-------------------------------------

I think this would be great: the current ways to do this are moderately annoying, and the mismatch with SQL, which supports {{EXPLAIN CODEGEN}} and {{EXPLAIN COST}} directly, is a bit jarring.


* For {{codegen}}, there's the workaround of using {{df.queryExecution.debug.codegen}} in Scala. This is somewhat awkward from PySpark: {{df._jdf.queryExecution().debug().codegen()}} doesn't print through Python's {{stdout}}, so the output can't easily be captured if required. It is very awkward from SparkR: I believe the call is {{invisible(sparkR.callJMethod(sparkR.callJMethod(sparkR.callJMethod(df@sdf, "queryExecution"), "debug"), "codegen"))}}, and again the output can't easily be captured via {{capture.output}}.
* For {{cost}}, there's a similar workaround of using {{df.queryExecution.stringWithStats}}, which has the same awkwardness as {{codegen}} when called from PySpark and SparkR.
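
To make the PySpark side of these workarounds concrete, here is a minimal sketch of two helpers that return the plan text as a Python string. The helper names, the {{mode}} parameter, and the temporary view name are hypothetical. The first helper assumes the private {{_jdf}} handle and the internal {{stringWithStats}} method are reachable through py4j on the Spark version in use. The second goes through the SQL {{EXPLAIN}} statement instead, assuming the result comes back as a single row holding the plan text, and its output may differ slightly from the Dataset route because the query is planned over a temporary view.

```python
def cost_plan_string(df):
    """Return the plan-with-statistics text of a PySpark DataFrame as a str.

    Assumption: relies on the private `_jdf` handle and on Spark's internal
    `QueryExecution.stringWithStats`, neither of which is a stable public API.
    """
    # The JVM method returns a String, which py4j hands back as a Python str,
    # so nothing has to be scraped from the JVM's stdout.
    return df._jdf.queryExecution().stringWithStats()


def explain_via_sql(spark, df, mode="COST", view_name="explain_tmp_view"):
    """Run `EXPLAIN <mode>` over a temporary view of `df`; return the plan text.

    `view_name` is a hypothetical scratch name; pick one that cannot clash
    with an existing temporary view.
    """
    mode = mode.upper()
    if mode not in ("CODEGEN", "COST", "EXTENDED"):
        raise ValueError("unsupported EXPLAIN mode: {}".format(mode))

    df.createOrReplaceTempView(view_name)
    try:
        query = "EXPLAIN {} SELECT * FROM {}".format(mode, view_name)
        # Assumed result shape: one row with one string column.
        rows = spark.sql(query).collect()
    finally:
        # Always drop the scratch view, even if planning fails.
        spark.catalog.dropTempView(view_name)
    return rows[0][0]
```

With a live session, {{print(explain_via_sql(spark, df, "codegen"))}} would go through Python's {{stdout}} and so could be captured in the usual way. Neither helper replaces a proper {{explain}} overload; they only illustrate how much plumbing the workarounds need.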

> explain function with codegen and cost parameters
> -------------------------------------------------
>
>                 Key: SPARK-22044
>                 URL: https://issues.apache.org/jira/browse/SPARK-22044
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Jacek Laskowski
>            Priority: Minor
>
> The {{explain}} operator creates an {{ExplainCommand}} runnable command that accepts (among other things) {{codegen}} and {{cost}} arguments.
> However, no version of {{explain}} exposes them. They are available through SQL, which is kind of surprising (given how much focus is devoted to the Dataset API).
> This issue proposes another {{explain}} with {{codegen}} and {{cost}} arguments, i.e.
> {code}
> def explain(codegen: Boolean = false, cost: Boolean = false): Unit
> {code}


