Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/04/19 04:30:03 UTC
[GitHub] [spark] huonw opened a new pull request #24414: [SPARK-22044][SQL] Add `cost` and `codegen` arguments to `explain`
huonw opened a new pull request #24414: [SPARK-22044][SQL] Add `cost` and `codegen` arguments to `explain`
URL: https://github.com/apache/spark/pull/24414
## What changes were proposed in this pull request?
In SQL it's easy to see the inferred statistics (`EXPLAIN COST`) and
the generated code (`EXPLAIN CODEGEN`), but doing so via the
Dataset/DataFrame APIs was much more awkward. It was harder still from
PySpark, and hardest from SparkR, because the work-around in each case
required dropping down to call JVM functions directly.
This patch exposes both via an overload of `explain` that takes 3
boolean arguments (`extended`, `cost` and `codegen`). It doesn't replace
the old `explain` overloads (to keep backwards compatibility), and it
uses booleans so that it is easy to call from PySpark and SparkR.
The PySpark and SparkR `explain` functions are extended to take these
extra arguments too.
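
For comparison, here is a rough sketch of how three such booleans could map onto the existing SQL forms (`EXPLAIN EXTENDED`, `EXPLAIN COST`, `EXPLAIN CODEGEN`). The helper `explain_statement` is hypothetical and is not part of Spark or of this patch. The description above does not say how the new overload treats combinations of flags, so rejecting more than one flag is purely an assumption of this sketch, based on SQL accepting a single keyword.

```python
def explain_statement(query, extended=False, cost=False, codegen=False):
    """Build a Spark SQL EXPLAIN statement for `query`.

    Hypothetical helper for illustration only; not the patch's implementation.
    """
    # Map each boolean onto the SQL keyword it corresponds to.
    flags = {"EXTENDED": extended, "COST": cost, "CODEGEN": codegen}
    chosen = [keyword for keyword, enabled in flags.items() if enabled]

    # Assumption of this sketch: allow at most one keyword per statement.
    if len(chosen) > 1:
        raise ValueError("choose at most one of extended, cost, codegen")

    return " ".join(["EXPLAIN"] + chosen + [query])
```

Given an active `SparkSession` named `spark` (assumed here), the result could be passed to `spark.sql(...)`, for example `spark.sql(explain_statement("SELECT 1", cost=True)).show(truncate=False)`. This only works for queries that can be written as SQL text, which is part of why a direct `explain` overload on Dataset/DataFrame is more convenient.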
## How was this patch tested?
Added unit tests.