Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/01 21:42:48 UTC

[GitHub] [arrow-datafusion] alamb opened a new pull request #1726: API to get Expr's type and nullability without a `DFSchema`

alamb opened a new pull request #1726:
URL: https://github.com/apache/arrow-datafusion/pull/1726


   # Which issue does this PR close?
   
   Closes https://github.com/apache/arrow-datafusion/issues/1725
   
   # Rationale for this change
   I am trying to create a low-cost (low-copy) way to simplify `Expr`s. More details are in https://github.com/apache/arrow-datafusion/issues/1725
   
   # What changes are included in this PR?
   1. Add a trait that provides the schema information `Expr` needs for schema queries
   2. Change the functions on `Expr` that take a `&DFSchema`, such as `Expr::nullable`, to be generic over this trait
   3. `impl` this new trait for `DFSchema` so that existing call sites work
   
   
   # Are there any user-facing changes?
   1. It is now possible to get an `Expr`'s type and nullability without constructing a `DFSchema`. Existing code should continue to work.
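
As a rough illustration of the idea (not the code in this PR), the sketch below shows how a trait of this shape lets a caller answer type and nullability questions from something other than a `DFSchema`. Everything here is a simplified stand-in so the example compiles on its own: `DataType`, `Column`, and `Expr` are tiny hypothetical versions of the DataFusion types, errors are plain `String`s, and `MapSchema` and `demo_schema` are invented names for this example only.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Boolean,
}

#[derive(Debug, Clone, PartialEq)]
struct Column {
    name: String,
}

// Assumption: errors are plain strings to keep the sketch self-contained.
type Result<T> = std::result::Result<T, String>;

/// Same shape as the trait described above: per-column schema lookups.
trait ExprSchema {
    fn nullable(&self, col: &Column) -> Result<bool>;
    fn data_type(&self, col: &Column) -> Result<&DataType>;
}

/// Hypothetical lightweight schema source: column name -> (type, nullable).
struct MapSchema {
    fields: HashMap<String, (DataType, bool)>,
}

impl MapSchema {
    fn lookup(&self, col: &Column) -> Result<&(DataType, bool)> {
        self.fields
            .get(&col.name)
            .ok_or_else(|| format!("no column named {}", col.name))
    }
}

impl ExprSchema for MapSchema {
    fn nullable(&self, col: &Column) -> Result<bool> {
        Ok(self.lookup(col)?.1)
    }

    fn data_type(&self, col: &Column) -> Result<&DataType> {
        Ok(&self.lookup(col)?.0)
    }
}

// Hypothetical three-variant expression tree.
enum Expr {
    Column(Column),
    IsNull(Box<Expr>),
    Not(Box<Expr>),
}

impl Expr {
    /// Generic over any schema source, not tied to one concrete schema type.
    fn data_type<S: ExprSchema>(&self, schema: &S) -> Result<DataType> {
        match self {
            Expr::Column(c) => Ok(schema.data_type(c)?.clone()),
            Expr::IsNull(_) | Expr::Not(_) => Ok(DataType::Boolean),
        }
    }

    fn nullable<S: ExprSchema>(&self, schema: &S) -> Result<bool> {
        match self {
            Expr::Column(c) => schema.nullable(c),
            // IS NULL always yields a non-null boolean.
            Expr::IsNull(_) => Ok(false),
            // NOT propagates the nullability of its input.
            Expr::Not(inner) => inner.nullable(schema),
        }
    }
}

/// Builds a one-column schema: `a` is a nullable Int64.
fn demo_schema() -> MapSchema {
    let mut fields = HashMap::new();
    fields.insert("a".to_string(), (DataType::Int64, true));
    MapSchema { fields }
}

fn main() {
    let schema = demo_schema();
    let a = Expr::Column(Column { name: "a".to_string() });
    println!("type: {:?}", a.data_type(&schema));
    println!("nullable: {:?}", a.nullable(&schema));
    let is_null = Expr::IsNull(Box::new(a));
    println!("IS NULL nullable: {:?}", is_null.nullable(&schema));
}
```

The point of the sketch is only that `Expr::data_type` and `Expr::nullable` depend on the two trait methods, so any type that can answer them per column can be passed in.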


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1726: API to get Expr's type and nullability without a `DFSchema`

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #1726:
URL: https://github.com/apache/arrow-datafusion/pull/1726#discussion_r798488745



##########
File path: datafusion/src/logical_plan/expr.rs
##########
@@ -392,20 +392,54 @@ impl PartialOrd for Expr {
     }
 }
 
+/// Provides schema information needed by [Expr] methods such as
+/// [Expr::nullable] and [Expr::data_type]
+///
+/// Note that this trait is implemented for &[DFSchema] which is
+/// widely used in the DataFusion codebase.
+pub trait ExprSchema {
+    /// Is this column reference nullable?
+    fn nullable(&self, col: &Column) -> Result<bool>;
+
+    /// What is the datatype of this column?
+    fn data_type(&self, col: &Column) -> Result<&DataType>;
+}
+
+// Implement for Arc<DFSchema>
+impl<P: AsRef<DFSchema>> ExprSchema for P {
+    fn nullable(&self, col: &Column) -> Result<bool> {
+        self.as_ref().nullable(col)
+    }
+
+    fn data_type(&self, col: &Column) -> Result<&DataType> {
+        self.as_ref().data_type(col)
+    }
+}
+
+impl ExprSchema for DFSchema {
+    fn nullable(&self, col: &Column) -> Result<bool> {
+        Ok(self.field_from_column(col)?.is_nullable())
+    }
+
+    fn data_type(&self, col: &Column) -> Result<&DataType> {
+        Ok(self.field_from_column(col)?.data_type())
+    }
+}
+
 impl Expr {
-    /// Returns the [arrow::datatypes::DataType] of the expression based on [arrow::datatypes::Schema].
+    /// Returns the [arrow::datatypes::DataType] of the expression based on [DFSchema].

Review comment:
       good point -- will fix







[GitHub] [arrow-datafusion] alamb commented on pull request #1726: API to get Expr's type and nullability without a `DFSchema`

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #1726:
URL: https://github.com/apache/arrow-datafusion/pull/1726#issuecomment-1028003973


   I am happy to improve / change this API -- if anyone has thoughts or suggestions please let me know





[GitHub] [arrow-datafusion] houqp commented on a change in pull request #1726: API to get Expr's type and nullability without a `DFSchema`

Posted by GitBox <gi...@apache.org>.
houqp commented on a change in pull request #1726:
URL: https://github.com/apache/arrow-datafusion/pull/1726#discussion_r798267335



##########
File path: datafusion/src/logical_plan/expr.rs
##########
@@ -392,20 +392,54 @@ impl PartialOrd for Expr {
     }
 }
 
+/// Provides schema information needed by [Expr] methods such as
+/// [Expr::nullable] and [Expr::data_type]
+///
+/// Note that this trait is implemented for &[DFSchema] which is
+/// widely used in the DataFusion codebase.
+pub trait ExprSchema {
+    /// Is this column reference nullable?
+    fn nullable(&self, col: &Column) -> Result<bool>;
+
+    /// What is the datatype of this column?
+    fn data_type(&self, col: &Column) -> Result<&DataType>;
+}
+
+// Implement for Arc<DFSchema>
+impl<P: AsRef<DFSchema>> ExprSchema for P {
+    fn nullable(&self, col: &Column) -> Result<bool> {
+        self.as_ref().nullable(col)
+    }
+
+    fn data_type(&self, col: &Column) -> Result<&DataType> {
+        self.as_ref().data_type(col)
+    }
+}
+
+impl ExprSchema for DFSchema {
+    fn nullable(&self, col: &Column) -> Result<bool> {
+        Ok(self.field_from_column(col)?.is_nullable())
+    }
+
+    fn data_type(&self, col: &Column) -> Result<&DataType> {
+        Ok(self.field_from_column(col)?.data_type())
+    }
+}
+
 impl Expr {
-    /// Returns the [arrow::datatypes::DataType] of the expression based on [arrow::datatypes::Schema].
+    /// Returns the [arrow::datatypes::DataType] of the expression based on [DFSchema].

Review comment:
       I don't think this comment is accurate anymore, since this method is no longer restricted to `DFSchema`.







[GitHub] [arrow-datafusion] alamb merged pull request #1726: API to get Expr's type and nullability without a `DFSchema`

Posted by GitBox <gi...@apache.org>.
alamb merged pull request #1726:
URL: https://github.com/apache/arrow-datafusion/pull/1726


   

