You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/02/04 09:12:06 UTC
[GitHub] [incubator-doris] EmmyMiao87 edited a comment on issue #2830:
Support subquery in disjunction
EmmyMiao87 edited a comment on issue #2830: Support subquery in disjunction
URL: https://github.com/apache/incubator-doris/issues/2830#issuecomment-581809110
Greenplum support the correlated subquery in disjunctive(OR) filter by GPORCA.
If the correlated subquery in conjunctive filter, the query will be rewritten to semi join.
If the correlated subquery in disjunctive filter, the query will be rewritten to left outer join in Greenplum.
For example:
select * from a where exist (select * from b where a.k1=b.k1) or exist (select * from c where a.k1=c.k1);
1. a left outer join b on a.k1=b.k1
2. the result of 1 left outer join c on a.k1=c.k1
the result of 2 like that
| a.k1 | b.k1 | c.k1 |
| :-----| :---- | :----|
| 1 | 1 | NULL |
| 2 | NULL | 2 |
| 3 | 3 | 3|
| 4 | NULL | NULL|
3. filter the incorrect line
The first, second and third line should be keep.
4. keep the outer table columns
keep the columns in table A
Here is a question. What does the following mean?
`((COALESCE((count((count()))), 0::bigint) > 0::bigint) OR (COALESCE((count((count()))), 0::bigint) > 0::bigint))`
For query
```
SELECT *
FROM customer c
WHERE EXISTS
(
SELECT *
FROM web_sales
WHERE c.c_customer_sk = ws_bill_customer_sk
)
OR EXISTS
(
SELECT *
FROM catalog_sales
WHERE c.c_customer_sk = cs_ship_customer_sk
);
```
The explain result like that
```
tpcds=# explain SELECT *
tpcds-# FROM customer c
tpcds-# WHERE EXISTS
tpcds-# (
tpcds(# SELECT *
tpcds(# FROM web_sales
tpcds(# WHERE c.c_customer_sk = ws_bill_customer_sk
tpcds(# )
tpcds-# OR EXISTS
tpcds-# (
tpcds(# SELECT *
tpcds(# FROM catalog_sales
tpcds(# WHERE c.c_customer_sk = cs_ship_customer_sk
tpcds(#
tpcds(# );
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather Motion 10:1 (slice3; segments: 10) (cost=0.00..60357.42 rows=13500000 width=113)
-> Result (cost=0.00..56302.64 rows=1350000 width=113)
Filter: ((COALESCE((count((count()))), 0::bigint) > 0::bigint) OR (COALESCE((count((count()))), 0::bigint) > 0::bigint))
-> Result (cost=0.00..56118.17 rows=2803591 width=120)
-> Hash Left Join (cost=0.00..55781.73 rows=2803591 width=129)
Hash Cond: (customer.c_customer_sk = catalog_sales.cs_ship_customer_sk)
-> Hash Left Join (cost=0.00..19740.78 rows=1613115 width=121)
Hash Cond: (customer.c_customer_sk = web_sales.ws_bill_customer_sk)
-> Seq Scan on customer (cost=0.00..514.82 rows=1200000 width=113)
-> Hash (cost=17560.85..17560.85 rows=1177761 width=12)
-> HashAggregate (cost=0.00..17560.85 rows=1177761 width=12)
Group Key: web_sales.ws_bill_customer_sk
-> Redistribute Motion 10:10 (slice1; segments: 10) (cost=0.00..17410.01 rows=1177761 width=12)
Hash Key: web_sales.ws_bill_customer_sk
-> Result (cost=0.00..17365.77 rows=1177761 width=12)
-> HashAggregate (cost=0.00..17365.77 rows=1177761 width=12)
Group Key: web_sales.ws_bill_customer_sk
-> Sequence (cost=0.00..7559.00 rows=72000038 width=4)
-> Partition Selector for web_sales (dynamic scan id: 1) (cost=10.00..100.00 rows=10 width=4)
Partitions selected: 56 (out of 56)
-> Dynamic Seq Scan on web_sales (dynamic scan id: 1) (cost=0.00..7559.00 rows=72000038 width=4)
-> Hash (cost=33936.95..33936.95 rows=1178712 width=12)
-> HashAggregate (cost=0.00..33936.95 rows=1178712 width=12)
Group Key: catalog_sales.cs_ship_customer_sk
-> Redistribute Motion 10:10 (slice2; segments: 10) (cost=0.00..33785.99 rows=1178712 width=12)
Hash Key: catalog_sales.cs_ship_customer_sk
-> Result (cost=0.00..33741.71 rows=1178712 width=12)
-> HashAggregate (cost=0.00..33741.71 rows=1178712 width=12)
Group Key: catalog_sales.cs_ship_customer_sk
-> Sequence (cost=0.00..14132.41 rows=143998042 width=4)
-> Partition Selector for catalog_sales (dynamic scan id: 2) (cost=10.00..100.00 rows=10 width=4)
Partitions selected: 80 (out of 80)
-> Dynamic Seq Scan on catalog_sales (dynamic scan id: 2) (cost=0.00..14132.41 rows=143998042 width=4)
Optimizer: Pivotal Optimizer (GPORCA) version 3.80.0
(34 rows)
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org