Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/04/19 04:30:03 UTC

[GitHub] [spark] huonw opened a new pull request #24414: [SPARK-22044][SQL] Add `cost` and `codegen` arguments to `explain`

URL: https://github.com/apache/spark/pull/24414
 
 
   ## What changes were proposed in this pull request?
   
   In SQL it's easy to see the inferred statistics (`EXPLAIN COST`) and
   the generated code (`EXPLAIN CODEGEN`), but doing so via the
   Dataset/DataFrame APIs was much more awkward. It was harder still
   from PySpark, and harder again from SparkR, because the work-around
   in each case required dropping down to call JVM functions directly.
   
   This patch exposes this information via an overload of `explain` that
   takes 3 boolean arguments (`extended`, `cost` and `codegen`). The old
   `explain` overloads remain, to keep backwards compatibility, and the
   new one uses booleans so that it is easy to call from PySpark and
   SparkR. The `explain` functions in PySpark and SparkR are extended
   to accept these extra arguments too.
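
For comparison with the SQL forms mentioned above, here is a minimal sketch of how three such booleans could map onto the existing SQL `EXPLAIN` keywords. The helper name `build_explain_sql` is hypothetical and is not part of this patch or of Spark. Rejecting combined flags is an assumption of the sketch, made because a SQL `EXPLAIN` statement takes a single keyword; it does not describe how the proposed overload treats combinations.

```python
def build_explain_sql(query, extended=False, cost=False, codegen=False):
    """Hypothetical helper: prefix `query` with the matching SQL EXPLAIN form."""
    # Map each boolean flag to the SQL keyword it corresponds to.
    flags = {"EXTENDED": extended, "COST": cost, "CODEGEN": codegen}
    chosen = [keyword for keyword, enabled in flags.items() if enabled]
    # Assumption of this sketch: allow at most one keyword per statement.
    if len(chosen) > 1:
        raise ValueError("SQL EXPLAIN takes a single keyword; got " + ", ".join(chosen))
    # With no flag set this yields a plain "EXPLAIN <query>".
    return " ".join(["EXPLAIN", *chosen, query])


if __name__ == "__main__":
    print(build_explain_sql("SELECT 1", cost=True))
```

Given an existing `SparkSession` named `spark`, the resulting string could be passed to `spark.sql(...)` and displayed with `.show(truncate=False)`, which goes through the SQL path and avoids calling JVM functions directly.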
   
   ## How was this patch tested?
   
   Added unit tests.

