You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "aprimadi (via GitHub)" <gi...@apache.org> on 2023/06/02 06:23:36 UTC

[GitHub] [arrow-datafusion] aprimadi opened a new pull request, #6525: Refactor joins test to sqllogic

aprimadi opened a new pull request, #6525:
URL: https://github.com/apache/arrow-datafusion/pull/6525

   # Which issue does this PR close?
   
   Closes #6302 .
   
   # Rationale for this change
   
   N/A
   
   # What changes are included in this PR?
   
   N/A
   
   # Are these changes tested?
   
   Yes
   
   # Are there any user-facing changes?
   
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] aprimadi commented on a diff in pull request #6525: Refactor joins test to sqllogic

Posted by "aprimadi (via GitHub)" <gi...@apache.org>.
aprimadi commented on code in PR #6525:
URL: https://github.com/apache/arrow-datafusion/pull/6525#discussion_r1214267103


##########
datafusion/optimizer/src/decorrelate_predicate_subquery.rs:
##########
@@ -155,6 +152,17 @@ impl OptimizerRule for DecorrelatePredicateSubquery {
             _ => Ok(None),
         }
     }
+}
+
+impl OptimizerRule for DecorrelatePredicateSubquery {
+    fn try_optimize(

Review Comment:
   cc @jackwener @alamb
   
   Is this looking ok? The problem I'm encountering is that every queries optimization use the same shared AliasGenerator. Thus, the `alias ID` is ever increasing. Instead, I think it should be instantiated per query, which is what this code does.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] aprimadi commented on pull request #6525: Refactor joins test to sqllogic

Posted by "aprimadi (via GitHub)" <gi...@apache.org>.
aprimadi commented on PR #6525:
URL: https://github.com/apache/arrow-datafusion/pull/6525#issuecomment-1586131070

   I'll create another pull request to reset subquery alias (possibly my solution involves using thread-local variable) since this PR already quite large and I won't have the time to work on this until next weekend.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #6525: Refactor joins test to sqllogic

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #6525:
URL: https://github.com/apache/arrow-datafusion/pull/6525#issuecomment-1577361200

   FYI @mingmwang 
   
   @aprimadi  just let us know when this PR is ready for review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] aprimadi commented on a diff in pull request #6525: Refactor joins test to sqllogic

Posted by "aprimadi (via GitHub)" <gi...@apache.org>.
aprimadi commented on code in PR #6525:
URL: https://github.com/apache/arrow-datafusion/pull/6525#discussion_r1214290037


##########
datafusion/optimizer/src/decorrelate_predicate_subquery.rs:
##########
@@ -155,6 +152,17 @@ impl OptimizerRule for DecorrelatePredicateSubquery {
             _ => Ok(None),
         }
     }
+}
+
+impl OptimizerRule for DecorrelatePredicateSubquery {
+    fn try_optimize(

Review Comment:
   Nvm, apparently this is not ok since each optimization pass may be done multiple times.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] aprimadi commented on a diff in pull request #6525: Refactor joins test to sqllogic

Posted by "aprimadi (via GitHub)" <gi...@apache.org>.
aprimadi commented on code in PR #6525:
URL: https://github.com/apache/arrow-datafusion/pull/6525#discussion_r1214294602


##########
datafusion/optimizer/src/decorrelate_predicate_subquery.rs:
##########
@@ -155,6 +152,17 @@ impl OptimizerRule for DecorrelatePredicateSubquery {
             _ => Ok(None),
         }
     }
+}
+
+impl OptimizerRule for DecorrelatePredicateSubquery {
+    fn try_optimize(

Review Comment:
   What do you think storing a reference to AliasGenerator in Optimizer so it can be reset during each call to `optimize`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] aprimadi commented on a diff in pull request #6525: Refactor joins test to sqllogic

Posted by "aprimadi (via GitHub)" <gi...@apache.org>.
aprimadi commented on code in PR #6525:
URL: https://github.com/apache/arrow-datafusion/pull/6525#discussion_r1214290037


##########
datafusion/optimizer/src/decorrelate_predicate_subquery.rs:
##########
@@ -155,6 +152,17 @@ impl OptimizerRule for DecorrelatePredicateSubquery {
             _ => Ok(None),
         }
     }
+}
+
+impl OptimizerRule for DecorrelatePredicateSubquery {
+    fn try_optimize(

Review Comment:
   Nvm, apparently this is not ok since each `optimize` call may contain multiple optimization pass.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb merged pull request #6525: Refactor joins test to sqllogic

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #6525:
URL: https://github.com/apache/arrow-datafusion/pull/6525


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] aprimadi commented on a diff in pull request #6525: Refactor joins test to sqllogic

Posted by "aprimadi (via GitHub)" <gi...@apache.org>.
aprimadi commented on code in PR #6525:
URL: https://github.com/apache/arrow-datafusion/pull/6525#discussion_r1215271756


##########
datafusion/optimizer/src/decorrelate_predicate_subquery.rs:
##########
@@ -155,6 +152,17 @@ impl OptimizerRule for DecorrelatePredicateSubquery {
             _ => Ok(None),
         }
     }
+}
+
+impl OptimizerRule for DecorrelatePredicateSubquery {
+    fn try_optimize(

Review Comment:
   Nvm, I think it could cause a potential race condition.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6525: Refactor joins test to sqllogic

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6525:
URL: https://github.com/apache/arrow-datafusion/pull/6525#discussion_r1218505321


##########
datafusion/optimizer/src/decorrelate_predicate_subquery.rs:
##########
@@ -155,6 +152,17 @@ impl OptimizerRule for DecorrelatePredicateSubquery {
             _ => Ok(None),
         }
     }
+}
+
+impl OptimizerRule for DecorrelatePredicateSubquery {
+    fn try_optimize(

Review Comment:
   I think it would be great to reset the alias generator at the start of each query (if that is not already done)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #6525: Refactor joins test to sqllogic

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #6525:
URL: https://github.com/apache/arrow-datafusion/pull/6525#issuecomment-1586194521

   > I'll create another pull request to reset subquery alias (possibly my solution involves using thread-local variable) since this PR already quite large and I won't have the time to work on this until next weekend.
   
   
   
   Thank you @aprimadi 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6525: Refactor joins test to sqllogic

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6525:
URL: https://github.com/apache/arrow-datafusion/pull/6525#discussion_r1225835705


##########
datafusion/core/tests/sqllogictests/test_files/joins.slt:
##########
@@ -622,3 +685,1655 @@ DROP TABLE t1
 
 statement ok
 DROP TABLE t2
+
+# Join timestamp
+
+statement ok
+CREATE TABLE timestamp(time TIMESTAMP) AS VALUES
+    (131964190213133),
+    (131964190213134),
+    (131964190213135);
+
+query PP
+SELECT *
+FROM timestamp as a
+JOIN (SELECT * FROM timestamp) as b
+ON a.time = b.time
+ORDER BY a.time
+----
+1970-01-02T12:39:24.190213133 1970-01-02T12:39:24.190213133
+1970-01-02T12:39:24.190213134 1970-01-02T12:39:24.190213134
+1970-01-02T12:39:24.190213135 1970-01-02T12:39:24.190213135
+
+statement ok
+DROP TABLE timestamp;
+
+# Join float32
+
+statement ok
+CREATE TABLE population(city VARCHAR, population FLOAT) AS VALUES
+    ('a', 838.698),
+    ('b', 1778.934),
+    ('c', 626.443);
+
+query TRTR
+SELECT *
+FROM population as a
+JOIN (SELECT * FROM population) as b
+ON a.population = b.population
+ORDER BY a.population
+----
+c 626.443 c 626.443
+a 838.698 a 838.698
+b 1778.934 b 1778.934
+
+statement ok
+DROP TABLE population;
+
+# Join float64
+
+statement ok
+CREATE TABLE population(city VARCHAR, population DOUBLE) AS VALUES
+    ('a', 838.698),
+    ('b', 1778.934),
+    ('c', 626.443);
+
+query TRTR
+SELECT *
+FROM population as a
+JOIN (SELECT * FROM population) as b
+ON a.population = b.population
+ORDER BY a.population
+----
+c 626.443 c 626.443
+a 838.698 a 838.698
+b 1778.934 b 1778.934
+
+statement ok
+DROP TABLE population;
+
+# Inner join qualified names
+
+statement ok
+CREATE TABLE t1 (a INT, b INT, c INT) AS VALUES
+    (1, 10, 50),
+    (2, 20, 60),
+    (3, 30, 70),
+    (4, 40, 80);
+
+statement ok
+CREATE TABLE t2 (a INT, b INT, c INT) AS VALUES
+    (1, 100, 500),
+    (2, 200, 600),
+    (9, 300, 700),
+    (4, 400, 800);
+
+query IIIIII
+SELECT t1.a, t1.b, t1.c, t2.a, t2.b, t2.c
+FROM t1
+INNER JOIN t2 ON t1.a = t2.a
+ORDER BY t1.a
+----
+1 10 50 1 100 500
+2 20 60 2 200 600
+4 40 80 4 400 800
+
+query IIIIII
+SELECT t1.a, t1.b, t1.c, t2.a, t2.b, t2.c
+FROM t1
+INNER JOIN t2 ON t2.a = t1.a
+ORDER BY t1.a
+----
+1 10 50 1 100 500
+2 20 60 2 200 600
+4 40 80 4 400 800
+
+statement ok
+DROP TABLE t1;
+
+statement ok
+DROP TABLE t2;
+
+# TODO: nestedjoin_with_alias
+
+# Nested join without alias
+
+query IIII
+select * from (select 1 as a, 2 as b) c INNER JOIN (select 1 as a, 3 as d) e on c.a = e.a
+----
+1 2 1 3
+
+# Join tables with duplicated column name not in on constraint
+
+statement ok
+CREATE TABLE countries (id INT, country VARCHAR) AS VALUES
+    (1, 'Germany'),
+    (2, 'Sweden'),
+    (3, 'Japan');
+
+statement ok
+CREATE TABLE cities (id INT, country_id INT, city VARCHAR) AS VALUES
+    (1, 1, 'Hamburg'),
+    (2, 2, 'Stockholm'),
+    (3, 3, 'Osaka'),
+    (4, 1, 'Berlin'),
+    (5, 2, 'Göteborg'),
+    (6, 3, 'Tokyo'),
+    (7, 3, 'Kyoto');
+
+query IITT
+SELECT t1.id, t2.id, t1.city, t2.country FROM cities AS t1 JOIN countries AS t2 ON t1.country_id = t2.id ORDER BY t1.id
+----
+1 1 Hamburg Germany
+2 2 Stockholm Sweden
+3 3 Osaka Japan
+4 1 Berlin Germany
+5 2 Göteborg Sweden
+6 3 Tokyo Japan
+7 3 Kyoto Japan
+
+statement ok
+DROP TABLE countries;
+
+statement ok
+DROP TABLE cities;
+
+# TODO: join_timestamp
+
+# Left join and right join should not panic with empty side
+
+statement ok
+CREATE TABLE t1 (t1_id BIGINT, ti_value VARCHAR NOT NULL) AS VALUES
+    (5247, 'a'),
+    (3821, 'b'),
+    (6321, 'c'),
+    (8821, 'd'),
+    (7748, 'e');
+
+statement ok
+CREATE TABLE t2 (t2_id BIGINT, t2_value BOOLEAN) AS VALUES
+    (358, true),
+    (2820, false),
+    (3804, NULL),
+    (7748, NULL);
+
+query ITIB rowsort
+SELECT * FROM t1 LEFT JOIN t2 ON t1_id = t2_id
+----
+3821 b NULL NULL
+5247 a NULL NULL
+6321 c NULL NULL
+7748 e 7748 NULL
+8821 d NULL NULL
+
+query IBIT rowsort
+SELECT * FROM t2 RIGHT JOIN t1 ON t1_id = t2_id
+----
+7748 NULL 7748 e
+NULL NULL 3821 b
+NULL NULL 5247 a
+NULL NULL 6321 c
+NULL NULL 8821 d
+
+statement ok
+DROP TABLE t1;
+
+statement ok
+DROP TABLE t2;
+
+# TODO: left_join_using_2
+
+# TODO: left_join_using_join_key_projection
+
+# TODO: left_join_2
+
+# TODO: join_partitioned
+
+# TODO: hash_join_with_date32
+
+# TODO: hash_join_with_date64
+
+# TODO: hash_join_with_decimal
+
+# TODO: hash_join_with_dictionary
+
+
+###
+# Configuration setup
+###
+
+statement ok
+set datafusion.optimizer.repartition_joins = false;
+
+# Reduce left join 1 (to inner join)
+
+query TT
+EXPLAIN
+SELECT *
+FROM join_t1
+LEFT JOIN join_t2 on join_t1.t1_id = join_t2.t2_id
+WHERE join_t2.t2_id < 100
+----
+logical_plan
+Inner Join: join_t1.t1_id = join_t2.t2_id
+--Filter: join_t1.t1_id < UInt32(100)
+----TableScan: join_t1 projection=[t1_id, t1_name, t1_int]
+--Filter: join_t2.t2_id < UInt32(100)
+----TableScan: join_t2 projection=[t2_id, t2_name, t2_int]
+
+# Reduce left join 2 (to inner join)
+
+query TT
+EXPLAIN
+SELECT *
+FROM join_t1
+LEFT JOIN join_t2 on join_t1.t1_id = join_t2.t2_id
+WHERE join_t2.t2_int < 10 or (join_t1.t1_int > 2 and join_t2.t2_name != 'w')
+----
+logical_plan
+Inner Join: join_t1.t1_id = join_t2.t2_id Filter: join_t2.t2_int < UInt32(10) OR join_t1.t1_int > UInt32(2) AND join_t2.t2_name != Utf8("w")
+--TableScan: join_t1 projection=[t1_id, t1_name, t1_int]
+--Filter: join_t2.t2_int < UInt32(10) OR join_t2.t2_name != Utf8("w")
+----TableScan: join_t2 projection=[t2_id, t2_name, t2_int]
+
+# Reduce left join 3 (to inner join)
+
+query TT
+EXPLAIN
+SELECT *
+FROM (
+    SELECT join_t1.*
+    FROM join_t1
+    LEFT JOIN join_t2 ON join_t1.t1_id = join_t2.t2_id
+    WHERE join_t2.t2_int < 3
+) t3
+LEFT JOIN join_t2 on t3.t1_int = join_t2.t2_int
+WHERE t3.t1_id < 100
+----
+logical_plan
+Left Join: t3.t1_int = join_t2.t2_int
+--SubqueryAlias: t3
+----Projection: join_t1.t1_id, join_t1.t1_name, join_t1.t1_int
+------Inner Join: join_t1.t1_id = join_t2.t2_id
+--------Filter: join_t1.t1_id < UInt32(100)
+----------TableScan: join_t1 projection=[t1_id, t1_name, t1_int]
+--------Projection: join_t2.t2_id
+----------Filter: join_t2.t2_int < UInt32(3) AND join_t2.t2_id < UInt32(100)
+------------TableScan: join_t2 projection=[t2_id, t2_int]
+--TableScan: join_t2 projection=[t2_id, t2_name, t2_int]
+
+# Reduce right join 1 (to inner join)
+
+query TT
+EXPLAIN
+SELECT *
+FROM join_t1
+RIGHT JOIN join_t2 on join_t1.t1_id = join_t2.t2_id
+WHERE join_t1.t1_int IS NOT NULL
+----
+logical_plan
+Inner Join: join_t1.t1_id = join_t2.t2_id
+--Filter: join_t1.t1_int IS NOT NULL
+----TableScan: join_t1 projection=[t1_id, t1_name, t1_int]
+--TableScan: join_t2 projection=[t2_id, t2_name, t2_int]
+
+# Reduce right join 2 (to inner join)
+
+query TT
+EXPLAIN
+SELECT *
+FROM join_t1
+RIGHT JOIN join_t2 on join_t1.t1_id = join_t2.t2_id
+WHERE NOT (join_t1.t1_int = join_t2.t2_int)
+----
+logical_plan
+Inner Join: join_t1.t1_id = join_t2.t2_id Filter: join_t1.t1_int != join_t2.t2_int
+--TableScan: join_t1 projection=[t1_id, t1_name, t1_int]
+--TableScan: join_t2 projection=[t2_id, t2_name, t2_int]
+
+# Reduce full join to right join
+
+query TT
+EXPLAIN
+SELECT *
+FROM join_t1
+FULL JOIN join_t2 on join_t1.t1_id = join_t2.t2_id
+WHERE join_t2.t2_name IS NOT NULL
+----
+logical_plan
+Right Join: join_t1.t1_id = join_t2.t2_id
+--TableScan: join_t1 projection=[t1_id, t1_name, t1_int]
+--Filter: join_t2.t2_name IS NOT NULL
+----TableScan: join_t2 projection=[t2_id, t2_name, t2_int]
+
+# Reduce full join to left join
+
+query TT
+EXPLAIN
+SELECT *
+FROM join_t1
+FULL JOIN join_t2 ON join_t1.t1_id = join_t2.t2_id
+WHERE join_t1.t1_name != 'b'
+----
+logical_plan
+Left Join: join_t1.t1_id = join_t2.t2_id
+--Filter: join_t1.t1_name != Utf8("b")
+----TableScan: join_t1 projection=[t1_id, t1_name, t1_int]
+--TableScan: join_t2 projection=[t2_id, t2_name, t2_int]
+
+# Reduce full join to inner join
+
+query TT
+EXPLAIN
+SELECT *
+FROM join_t1
+FULL JOIN join_t2 on join_t1.t1_id = join_t2.t2_id
+WHERE join_t1.t1_name != 'b' and join_t2.t2_name = 'x'
+----
+logical_plan
+Inner Join: join_t1.t1_id = join_t2.t2_id
+--Filter: join_t1.t1_name != Utf8("b")
+----TableScan: join_t1 projection=[t1_id, t1_name, t1_int]
+--Filter: join_t2.t2_name = Utf8("x")
+----TableScan: join_t2 projection=[t2_id, t2_name, t2_int]
+
+###
+# Configuration teardown
+###
+
+statement ok
+set datafusion.optimizer.repartition_joins = true;
+
+
+# Sort merge equijoin
+
+query ITT
+SELECT t1_id, t1_name, t2_name
+FROM join_t1
+JOIN join_t2 ON t1_id = t2_id
+ORDER BY t1_id
+----
+11 a z
+22 b y
+44 d x
+
+query ITT
+SELECT t1_id, t1_name, t2_name
+FROM join_t1
+JOIN join_t2 ON t2_id = t1_id
+ORDER BY t1_id
+----
+11 a z
+22 b y
+44 d x
+
+# TODO: sort_merge_join_on_date32
+
+# TODO: sort_merge_join_on_decimal
+
+
+# TODO: Left semi join
+
+# Left semi join pushdown
+
+query TT
+EXPLAIN
+SELECT lsaj_t1.t1_id, lsaj_t1.t1_name
+FROM lsaj_t1
+LEFT SEMI JOIN lsaj_t2 ON (lsaj_t1.t1_id = lsaj_t2.t2_id and lsaj_t2.t2_int > 1)
+----
+logical_plan
+LeftSemi Join: lsaj_t1.t1_id = lsaj_t2.t2_id
+--TableScan: lsaj_t1 projection=[t1_id, t1_name]
+--Projection: lsaj_t2.t2_id
+----Filter: lsaj_t2.t2_int > UInt32(1)
+------TableScan: lsaj_t2 projection=[t2_id, t2_int]
+
+# Left anti join
+
+statement ok
+set datafusion.optimizer.repartition_joins = false;
+
+query IT
+SELECT t1_id, t1_name
+FROM lsaj_t1
+WHERE NOT EXISTS (SELECT 1 FROM lsaj_t2 WHERE t1_id = t2_id)
+ORDER BY t1_id
+----
+33 c
+NULL e
+
+query I
+SELECT t1_id
+FROM lsaj_t1
+EXCEPT SELECT t2_id FROM lsaj_t2
+ORDER BY t1_id
+----
+33
+
+query IT
+SELECT t1_id, t1_name
+FROM lsaj_t1
+LEFT ANTI JOIN lsaj_t2 ON (t1_id = t2_id)
+ORDER BY t1_id
+----
+33 c
+NULL e
+
+statement ok
+set datafusion.optimizer.repartition_joins = true;
+
+query IT
+SELECT t1_id, t1_name
+FROM lsaj_t1
+WHERE NOT EXISTS (SELECT 1 FROM lsaj_t2 WHERE t1_id = t2_id)
+ORDER BY t1_id
+----
+33 c
+NULL e
+
+query I
+SELECT t1_id
+FROM lsaj_t1
+EXCEPT SELECT t2_id FROM lsaj_t2
+ORDER BY t1_id
+----
+33
+
+query IT
+SELECT t1_id, t1_name
+FROM lsaj_t1
+LEFT ANTI JOIN lsaj_t2 ON (t1_id = t2_id)
+ORDER BY t1_id
+----
+33 c
+NULL e
+
+# Error left anti join
+# https://github.com/apache/arrow-datafusion/issues/4366

Review Comment:
   lol -- this issue was subsequently fixed so we can probably remove it from the comments
   
   But we can do so as a follow on PR. 



##########
datafusion/core/tests/sqllogictests/test_files/joins.slt:
##########
@@ -622,3 +685,1655 @@ DROP TABLE t1
 
 statement ok
 DROP TABLE t2
+
+# Join timestamp
+
+statement ok
+CREATE TABLE timestamp(time TIMESTAMP) AS VALUES
+    (131964190213133),
+    (131964190213134),
+    (131964190213135);
+
+query PP
+SELECT *
+FROM timestamp as a
+JOIN (SELECT * FROM timestamp) as b
+ON a.time = b.time
+ORDER BY a.time
+----
+1970-01-02T12:39:24.190213133 1970-01-02T12:39:24.190213133
+1970-01-02T12:39:24.190213134 1970-01-02T12:39:24.190213134
+1970-01-02T12:39:24.190213135 1970-01-02T12:39:24.190213135
+
+statement ok
+DROP TABLE timestamp;
+
+# Join float32
+
+statement ok
+CREATE TABLE population(city VARCHAR, population FLOAT) AS VALUES
+    ('a', 838.698),
+    ('b', 1778.934),
+    ('c', 626.443);
+
+query TRTR
+SELECT *
+FROM population as a
+JOIN (SELECT * FROM population) as b
+ON a.population = b.population
+ORDER BY a.population
+----
+c 626.443 c 626.443
+a 838.698 a 838.698
+b 1778.934 b 1778.934
+
+statement ok
+DROP TABLE population;
+
+# Join float64
+
+statement ok
+CREATE TABLE population(city VARCHAR, population DOUBLE) AS VALUES
+    ('a', 838.698),
+    ('b', 1778.934),
+    ('c', 626.443);
+
+query TRTR
+SELECT *
+FROM population as a
+JOIN (SELECT * FROM population) as b
+ON a.population = b.population
+ORDER BY a.population
+----
+c 626.443 c 626.443
+a 838.698 a 838.698
+b 1778.934 b 1778.934
+
+statement ok
+DROP TABLE population;
+
+# Inner join qualified names
+
+statement ok
+CREATE TABLE t1 (a INT, b INT, c INT) AS VALUES
+    (1, 10, 50),
+    (2, 20, 60),
+    (3, 30, 70),
+    (4, 40, 80);
+
+statement ok
+CREATE TABLE t2 (a INT, b INT, c INT) AS VALUES
+    (1, 100, 500),
+    (2, 200, 600),
+    (9, 300, 700),
+    (4, 400, 800);
+
+query IIIIII
+SELECT t1.a, t1.b, t1.c, t2.a, t2.b, t2.c
+FROM t1
+INNER JOIN t2 ON t1.a = t2.a
+ORDER BY t1.a
+----
+1 10 50 1 100 500
+2 20 60 2 200 600
+4 40 80 4 400 800
+
+query IIIIII
+SELECT t1.a, t1.b, t1.c, t2.a, t2.b, t2.c
+FROM t1
+INNER JOIN t2 ON t2.a = t1.a
+ORDER BY t1.a
+----
+1 10 50 1 100 500
+2 20 60 2 200 600
+4 40 80 4 400 800
+
+statement ok
+DROP TABLE t1;
+
+statement ok
+DROP TABLE t2;
+
+# TODO: nestedjoin_with_alias
+
+# Nested join without alias
+
+query IIII
+select * from (select 1 as a, 2 as b) c INNER JOIN (select 1 as a, 3 as d) e on c.a = e.a
+----
+1 2 1 3
+
+# Join tables with duplicated column name not in on constraint
+
+statement ok
+CREATE TABLE countries (id INT, country VARCHAR) AS VALUES
+    (1, 'Germany'),
+    (2, 'Sweden'),
+    (3, 'Japan');
+
+statement ok
+CREATE TABLE cities (id INT, country_id INT, city VARCHAR) AS VALUES
+    (1, 1, 'Hamburg'),
+    (2, 2, 'Stockholm'),
+    (3, 3, 'Osaka'),
+    (4, 1, 'Berlin'),
+    (5, 2, 'Göteborg'),
+    (6, 3, 'Tokyo'),
+    (7, 3, 'Kyoto');
+
+query IITT
+SELECT t1.id, t2.id, t1.city, t2.country FROM cities AS t1 JOIN countries AS t2 ON t1.country_id = t2.id ORDER BY t1.id
+----
+1 1 Hamburg Germany
+2 2 Stockholm Sweden
+3 3 Osaka Japan
+4 1 Berlin Germany
+5 2 Göteborg Sweden
+6 3 Tokyo Japan
+7 3 Kyoto Japan
+
+statement ok
+DROP TABLE countries;
+
+statement ok
+DROP TABLE cities;
+
+# TODO: join_timestamp
+
+# Left join and right join should not panic with empty side
+
+statement ok
+CREATE TABLE t1 (t1_id BIGINT, ti_value VARCHAR NOT NULL) AS VALUES
+    (5247, 'a'),
+    (3821, 'b'),
+    (6321, 'c'),
+    (8821, 'd'),
+    (7748, 'e');
+
+statement ok
+CREATE TABLE t2 (t2_id BIGINT, t2_value BOOLEAN) AS VALUES
+    (358, true),
+    (2820, false),
+    (3804, NULL),
+    (7748, NULL);
+
+query ITIB rowsort
+SELECT * FROM t1 LEFT JOIN t2 ON t1_id = t2_id
+----
+3821 b NULL NULL
+5247 a NULL NULL
+6321 c NULL NULL
+7748 e 7748 NULL
+8821 d NULL NULL
+
+query IBIT rowsort
+SELECT * FROM t2 RIGHT JOIN t1 ON t1_id = t2_id
+----
+7748 NULL 7748 e
+NULL NULL 3821 b
+NULL NULL 5247 a
+NULL NULL 6321 c
+NULL NULL 8821 d
+
+statement ok
+DROP TABLE t1;
+
+statement ok
+DROP TABLE t2;
+
+# TODO: left_join_using_2

Review Comment:
   I verified this test is still in joins.rs. 👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org