Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/27 19:03:22 UTC

[GitHub] [arrow-datafusion] isidentical commented on a diff in pull request #3903: Factorize common AND factors out of OR predicates to support filterPu…

isidentical commented on code in PR #3903:
URL: https://github.com/apache/arrow-datafusion/pull/3903#discussion_r1007258449


##########
benchmarks/expected-plans/q7.txt:
##########
@@ -3,7 +3,7 @@ Sort: shipping.supp_nation ASC NULLS LAST, shipping.cust_nation ASC NULLS LAST,
     Aggregate: groupBy=[[shipping.supp_nation, shipping.cust_nation, shipping.l_year]], aggr=[[SUM(shipping.volume)]]
       Projection: shipping.supp_nation, shipping.cust_nation, shipping.l_year, shipping.volume, alias=shipping
         Projection: n1.n_name AS supp_nation, n2.n_name AS cust_nation, datepart(Utf8("YEAR"), lineitem.l_shipdate) AS l_year, CAST(lineitem.l_extendedprice AS Decimal128(38, 4)) * CAST(Decimal128(Some(100),23,2) - CAST(lineitem.l_discount AS Decimal128(23, 2)) AS Decimal128(38, 4)) AS volume, alias=shipping
-          Filter: n1.n_name = Utf8("FRANCE") AND n2.n_name = Utf8("GERMANY") OR n1.n_name = Utf8("GERMANY") AND n2.n_name = Utf8("FRANCE")
+          Filter: (n1.n_name = Utf8("FRANCE") OR n2.n_name = Utf8("FRANCE")) AND (n2.n_name = Utf8("GERMANY") OR n1.n_name = Utf8("GERMANY"))

Review Comment:
   I also wanted to check TPC-H (this change shouldn't affect it, but I wanted to see whether there is an unexpected regression). It seems there isn't any regression (`873.27 ms` vs `878.54 ms`, which is only noise) 🚀 
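
   For readers following the q7 hunk above: the old filter has the shape `(a AND b) OR (c AND d)`. Distributing OR over AND turns that into an AND of four OR clauses, and the new `Filter` line shows the two clauses that mix `n1` and `n2`. The excerpt does not show where the other two clauses end up, so that part is left open here. Below is a minimal sketch that brute-forces the Boolean identity only; the names `dnf`, `cnf` and `shapes_agree` are made up for illustration and are not DataFusion code.

```rust
// Brute-force check of the Boolean identity behind the rewritten filter.
// Illustrative mapping (an assumption for this sketch):
//   a: n1.n_name = "FRANCE"    b: n2.n_name = "GERMANY"
//   c: n1.n_name = "GERMANY"   d: n2.n_name = "FRANCE"

/// Original shape: (a AND b) OR (c AND d).
fn dnf(a: bool, b: bool, c: bool, d: bool) -> bool {
    (a && b) || (c && d)
}

/// Fully distributed shape: one OR clause per pairing of a factor
/// from the left AND group with a factor from the right AND group.
fn cnf(a: bool, b: bool, c: bool, d: bool) -> bool {
    (a || c) && (a || d) && (b || c) && (b || d)
}

/// Returns true when both shapes agree on all 16 truth assignments.
fn shapes_agree() -> bool {
    (0u8..16).all(|bits| {
        let (a, b, c, d) = (bits & 1 != 0, bits & 2 != 0, bits & 4 != 0, bits & 8 != 0);
        dnf(a, b, c, d) == cnf(a, b, c, d)
    })
}

fn main() {
    assert!(shapes_agree());
    println!("(a AND b) OR (c AND d) matches its four-clause AND-of-ORs form");
}
```

   Each clause of the distributed form is a top-level AND factor, which is the kind of predicate a filter pushdown pass can consider on its own.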



##########
datafusion/optimizer/src/utils.rs:
##########
@@ -655,4 +828,135 @@ mod tests {
             "mismatch rewriting expr_from: {expr_from} to {rewrite_to}"
         )
     }
+
+    #[test]
+    fn test_permutations() {
+        assert_eq!(make_permutations(vec![]), vec![] as Vec<Vec<Expr>>)
+    }
+
+    #[test]
+    fn test_permutations_one() {
+        // [[a]] --> [[a]]
+        assert_eq!(
+            make_permutations(vec![vec![col("a")]]),
+            vec![vec![col("a")]]
+        )
+    }
+
+    #[test]
+    fn test_permutations_two() {
+        // [[a, b]] --> [[a], [b]]
+        assert_eq!(
+            make_permutations(vec![vec![col("a"), col("b")]]),
+            vec![vec![col("a")], vec![col("b")]]
+        )
+    }
+
+    #[test]
+    fn test_permutations_two_and_one() {
+        // [[a, b], [c]] --> [[a, c], [b, c]]
+        assert_eq!(
+            make_permutations(vec![vec![col("a"), col("b")], vec![col("c")]]),
+            vec![vec![col("a"), col("c")], vec![col("b"), col("c")]]
+        )
+    }
+
+    #[test]
+    fn test_permutations_two_and_one_and_two() {
+        // [[a, b], [c], [d, e]] --> [[a, c, d], [a, c, e], [b, c, d], [b, c, e]]

Review Comment:
   These are super useful, thanks 💯 
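
   Going by the comments in the quoted tests, `make_permutations` behaves like a Cartesian product: it picks one item from each inner list and returns every such combination, and an empty input gives an empty output. The PR's implementation is not part of this excerpt, so the following is only a sketch of that behavior under those assumptions. `cartesian_product` is a hypothetical name, and it works on plain strings rather than `Expr` to stay self-contained.

```rust
/// Hypothetical stand-in for the behavior the quoted tests describe;
/// this is not the code from the PR.
/// Picks one item from every inner list, in order, and returns all such picks.
fn cartesian_product<T: Clone>(lists: &[Vec<T>]) -> Vec<Vec<T>> {
    // The first quoted test expects an empty input to give an empty output,
    // so handle that case before seeding the accumulator.
    if lists.is_empty() {
        return Vec::new();
    }
    // Start with a single empty pick and extend it one list at a time.
    let mut acc: Vec<Vec<T>> = vec![Vec::new()];
    for list in lists {
        let mut next = Vec::with_capacity(acc.len() * list.len());
        for prefix in &acc {
            for item in list {
                let mut pick = prefix.clone();
                pick.push(item.clone());
                next.push(pick);
            }
        }
        acc = next;
    }
    acc
}

fn main() {
    // Mirrors the last quoted test comment: [[a, b], [c], [d, e]].
    let out = cartesian_product(&[vec!["a", "b"], vec!["c"], vec!["d", "e"]]);
    assert_eq!(
        out,
        vec![
            vec!["a", "c", "d"],
            vec!["a", "c", "e"],
            vec!["b", "c", "d"],
            vec!["b", "c", "e"],
        ]
    );
}
```

   The output has one entry per combination, so its length is the product of the inner list lengths; that growth is worth keeping in mind for predicates with many OR branches.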



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org