You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/04 13:50:01 UTC

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #2364: Add proper support for `null` literal by introducing `ScalarValue::Null`

alamb commented on code in PR #2364:
URL: https://github.com/apache/arrow-datafusion/pull/2364#discussion_r864844502


##########
datafusion/core/tests/sql/expr.rs:
##########
@@ -121,25 +121,25 @@ async fn case_when_else_with_null_contant() -> Result<()> {
         FROM t1";
     let actual = execute_to_batches(&ctx, sql).await;
     let expected = vec![
-        "+----------------------------------------------------------------------------------------------+",
-        "| CASE WHEN #t1.c1 = Utf8(\"a\") THEN Int64(1) WHEN Utf8(NULL) THEN Int64(2) ELSE Int64(999) END |",
-        "+----------------------------------------------------------------------------------------------+",
-        "| 1                                                                                            |",
-        "| 999                                                                                          |",
-        "| 999                                                                                          |",
-        "| 999                                                                                          |",
-        "+----------------------------------------------------------------------------------------------+",
+        "+----------------------------------------------------------------------------------------+",
+        "| CASE WHEN #t1.c1 = Utf8(\"a\") THEN Int64(1) WHEN NULL THEN Int64(2) ELSE Int64(999) END |",

Review Comment:
   This is the key difference here. Very nice 👍 



##########
datafusion/physical-expr/src/expressions/binary.rs:
##########
@@ -1336,6 +1360,28 @@ fn is_distinct_from_utf8<OffsetSize: StringOffsetSizeTrait>(
         .collect())
 }
 
+fn is_distinct_from_null(left: &NullArray, _right: &NullArray) -> Result<BooleanArray> {
+    let length = left.len();
+    make_boolean_array(length, false)
+}
+
+fn is_not_distinct_from_null(
+    left: &NullArray,
+    _right: &NullArray,
+) -> Result<BooleanArray> {
+    let length = left.len();
+    make_boolean_array(length, true)
+}
+
+pub fn eq_null(left: &NullArray, _right: &NullArray) -> Result<BooleanArray> {
+    let length = left.len();
+    make_boolean_array(length, false)
+}

Review Comment:
   I don't think this is correct -- specifically I think the resulting boolean array should be full of nulls not false
   
   Perhaps something like: 
   
   ```rust
       std::iter::repeat(left.len(), None).collect()
   ```



##########
datafusion/core/tests/sql/joins.rs:
##########
@@ -829,7 +829,13 @@ async fn inner_join_nulls() {
     let sql = "SELECT * FROM (SELECT null AS id1) t1
             INNER JOIN (SELECT null AS id2) t2 ON id1 = id2";
 
-    let expected = vec!["++", "++"];
+    let expected = vec![

Review Comment:
   This answer is not correct -- there should be no rows that match.
   
   This is because the join should produce rows where `id1 = id2` evaluates to `true` 
   
   However, `null = null` evaluates to `null` 🤯 
   
   Here is the query in postgres:
   
   ```sql
   alamb=# SELECT * FROM (SELECT null AS id1) t1
               INNER JOIN (SELECT null AS id2) t2 ON id1 = id2
   alamb-# ;
    id1 | id2 
   -----+-----
   (0 rows)
   
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org