Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/08/18 20:47:25 UTC

[GitHub] [arrow] alamb commented on a change in pull request #7993: ARROW-9760: [Rust] [DataFusion] Added DataFrame::explain

alamb commented on a change in pull request #7993:
URL: https://github.com/apache/arrow/pull/7993#discussion_r472482939



##########
File path: rust/datafusion/src/dataframe.rs
##########
@@ -174,4 +174,18 @@ pub trait DataFrame {
 
     /// Return the logical plan represented by this DataFrame.
     fn to_logical_plan(&self) -> LogicalPlan;
+
+    /// Return a DataFrame with the explanation of its plan so far.
+    ///
+    /// ```
+    /// # use datafusion::prelude::*;
+    /// # use datafusion::error::Result;
+    /// # fn main() -> Result<()> {
+    /// let mut ctx = ExecutionContext::new();
+    /// let df = ctx.read_csv("tests/example.csv", CsvReadOptions::new())?;
+    /// let batches = df.limit(100)?.explain(false)?.collect()?;
+    /// # Ok(())
+    /// # }
+    /// ```
+    fn explain(&self, verbose: bool) -> Result<Arc<dyn DataFrame>>;

Review comment:
       Or we could add something like `fn explain(&self, verbose: bool) -> String` if you wanted to keep it like Spark. It seems that to get a String for the explain plan we would need to run the various optimizers and the physical planner, but that is probably OK.
   
   I am not as familiar with Spark's DataFrame API, so I do not have a good feel for how people would want this API to behave.
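
   To make the alternative concrete, here is a minimal, self-contained sketch of the Spark-style String-returning variant. This is not DataFusion's actual API: `LogicalPlan`, `MockFrame`, and `explain_string` are stand-in names invented for illustration, and the real implementation would run the optimizers and physical planner rather than just formatting the logical plan.

   ```rust
   // Hypothetical sketch of a Spark-style `explain` that returns a String
   // instead of a DataFrame of record batches. All types here are mocks.

   /// Stand-in for DataFusion's LogicalPlan.
   #[derive(Debug)]
   struct LogicalPlan {
       description: String,
   }

   trait DataFrame {
       /// Return the logical plan represented by this DataFrame.
       fn to_logical_plan(&self) -> LogicalPlan;

       /// Spark-style alternative: render the plan as text. In a real
       /// implementation this would also run the optimizers and the
       /// physical planner before formatting.
       fn explain_string(&self, verbose: bool) -> String {
           let plan = self.to_logical_plan();
           if verbose {
               format!("== Detailed Plan ==\n{:#?}", plan)
           } else {
               format!("== Plan ==\n{}", plan.description)
           }
       }
   }

   /// Toy frame standing in for the result of `read_csv(...).limit(100)`.
   struct MockFrame;

   impl DataFrame for MockFrame {
       fn to_logical_plan(&self) -> LogicalPlan {
           LogicalPlan {
               description: "Limit: 100\n  Scan: tests/example.csv".to_string(),
           }
       }
   }

   fn main() {
       let df = MockFrame;
       // Caller gets a String directly, no collect() needed.
       println!("{}", df.explain_string(false));
   }
   ```

   The trade-off the comment points at: returning a `DataFrame` keeps `explain` lazy and composable with `collect()`, while returning a `String` is more convenient for interactive use but forces planning to happen eagerly inside `explain`.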




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org