Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/04/17 10:16:17 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6015: infer column nullability on joins

alamb commented on code in PR #6015:
URL: https://github.com/apache/arrow-datafusion/pull/6015#discussion_r1168475782


##########
datafusion/expr/src/logical_plan/builder.rs:
##########
@@ -1041,11 +1041,18 @@ pub fn build_join_schema(
     right: &DFSchema,
     join_type: &JoinType,
 ) -> Result<DFSchema> {
+    fn nullify_fields(fields: &[DFField]) -> Vec<DFField> {

Review Comment:
   It seems like there are two `build_join_schema` functions: https://github.com/search?q=repo%3Aapache%2Farrow-datafusion+build_join_schema&type=code
   
   The one in the `datafusion` crate (`physical_plan/joins/utils.rs`) already seems to do the right thing with nulls: https://github.com/apache/arrow-datafusion/blob/5b9db17528910328d6ea00d5a6a81138ac6a680b/datafusion/core/src/physical_plan/joins/utils.rs#L349
   
   A nice follow-on PR might be to refactor the code so that only one copy of `build_join_schema` exists and is used in both places.
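
To make the nullability rule concrete, here is a minimal, self-contained sketch of how a single shared helper could infer it from the join type. The `Field` struct, the four-variant `JoinType` enum, and the `join_fields` function are hypothetical stand-ins for illustration only. They are not DataFusion's `DFField`, `JoinType`, or `build_join_schema`, and the real code may differ. The sketch assumes standard outer-join semantics: the side that may have no matching row becomes nullable.

```rust
// Hypothetical stand-ins for illustration; NOT DataFusion's DFField / JoinType.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub nullable: bool,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum JoinType {
    Inner,
    Left,
    Right,
    Full,
}

/// Return a copy of `fields` with every field marked nullable.
fn nullify_fields(fields: &[Field]) -> Vec<Field> {
    fields
        .iter()
        .map(|f| Field {
            name: f.name.clone(),
            nullable: true,
        })
        .collect()
}

/// Build the output fields of a join, left fields first.
/// The side that may be padded with NULLs for unmatched rows is made nullable.
pub fn join_fields(left: &[Field], right: &[Field], join_type: JoinType) -> Vec<Field> {
    let (l, r) = match join_type {
        // Every output row has a match on both sides: keep nullability as-is.
        JoinType::Inner => (left.to_vec(), right.to_vec()),
        // Unmatched left rows are padded with NULLs on the right.
        JoinType::Left => (left.to_vec(), nullify_fields(right)),
        // Unmatched right rows are padded with NULLs on the left.
        JoinType::Right => (nullify_fields(left), right.to_vec()),
        // Either side may be padded.
        JoinType::Full => (nullify_fields(left), nullify_fields(right)),
    };
    l.into_iter().chain(r).collect()
}
```

If both the logical and the physical planner called one helper of this shape, the nullability rules would live in a single place, which is the point of the suggested follow-on refactor.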
   


