You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/02/04 09:05:32 UTC

[GitHub] [incubator-doris] EmmyMiao87 commented on issue #2830: Support subquery in disjunction

EmmyMiao87 commented on issue #2830: Support subquery in disjunction
URL: https://github.com/apache/incubator-doris/issues/2830#issuecomment-581809110
 
 
   Greenplum support the correlated subquery in disjunctive(OR) filter by GPORCA.
   
   If the correlated subquery in conjunctive filter, the query will be rewritten to semi join.
   If the correlated subquery in disjunctive filter, the query will be rewritten to left outer join in Greenplum.
   For example:
   select * from a where exist (select * from b where a.k1=b.k1) or exist (select * from c where a.k1=c.k1);
   1. a left outer join b on a.k1=b.k1
   2. the result of 1 left outer join c on a.k1=c.k1
   the result of 2 like that
   | a.k1 | b.k1 | c.k1 |
   | :-----| :---- | :----|
   | 1 | 1 | NULL |
   | 2 | NULL | 2 |
   | 3 | 3 | 3|
   | 4 | NULL | NULL|
   3. filter the incorrect line
   The first, second and third line should be keep.
   4. keep the outer table columns
   keep the columns in table A
   Here is a question. What does the following mean?
   `((COALESCE((count((count()))), 0::bigint) > 0::bigint) OR (COALESCE((count((count()))), 0::bigint) > 0::bigint))`
   
   The explain result like that 
   ```
   tpcds=# explain SELECT *
   tpcds-# FROM customer c
   tpcds-# WHERE   EXISTS
   tpcds-#         (
   tpcds(#               SELECT *
   tpcds(#               FROM web_sales
   tpcds(#               WHERE c.c_customer_sk = ws_bill_customer_sk
   tpcds(#         )
   tpcds-#         OR EXISTS
   tpcds-#         (
   tpcds(#               SELECT *
   tpcds(#               FROM catalog_sales
   tpcds(#               WHERE c.c_customer_sk = cs_ship_customer_sk
   tpcds(#
   tpcds(#         );
                                                                               QUERY PLAN
   ------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Gather Motion 10:1  (slice3; segments: 10)  (cost=0.00..60357.42 rows=13500000 width=113)
      ->  Result  (cost=0.00..56302.64 rows=1350000 width=113)
            Filter: ((COALESCE((count((count()))), 0::bigint) > 0::bigint) OR (COALESCE((count((count()))), 0::bigint) > 0::bigint))
            ->  Result  (cost=0.00..56118.17 rows=2803591 width=120)
                  ->  Hash Left Join  (cost=0.00..55781.73 rows=2803591 width=129)
                        Hash Cond: (customer.c_customer_sk = catalog_sales.cs_ship_customer_sk)
                        ->  Hash Left Join  (cost=0.00..19740.78 rows=1613115 width=121)
                              Hash Cond: (customer.c_customer_sk = web_sales.ws_bill_customer_sk)
                              ->  Seq Scan on customer  (cost=0.00..514.82 rows=1200000 width=113)
                              ->  Hash  (cost=17560.85..17560.85 rows=1177761 width=12)
                                    ->  HashAggregate  (cost=0.00..17560.85 rows=1177761 width=12)
                                          Group Key: web_sales.ws_bill_customer_sk
                                          ->  Redistribute Motion 10:10  (slice1; segments: 10)  (cost=0.00..17410.01 rows=1177761 width=12)
                                                Hash Key: web_sales.ws_bill_customer_sk
                                                ->  Result  (cost=0.00..17365.77 rows=1177761 width=12)
                                                      ->  HashAggregate  (cost=0.00..17365.77 rows=1177761 width=12)
                                                            Group Key: web_sales.ws_bill_customer_sk
                                                            ->  Sequence  (cost=0.00..7559.00 rows=72000038 width=4)
                                                                  ->  Partition Selector for web_sales (dynamic scan id: 1)  (cost=10.00..100.00 rows=10 width=4)
                                                                        Partitions selected: 56 (out of 56)
                                                                  ->  Dynamic Seq Scan on web_sales (dynamic scan id: 1)  (cost=0.00..7559.00 rows=72000038 width=4)
                        ->  Hash  (cost=33936.95..33936.95 rows=1178712 width=12)
                              ->  HashAggregate  (cost=0.00..33936.95 rows=1178712 width=12)
                                    Group Key: catalog_sales.cs_ship_customer_sk
                                    ->  Redistribute Motion 10:10  (slice2; segments: 10)  (cost=0.00..33785.99 rows=1178712 width=12)
                                          Hash Key: catalog_sales.cs_ship_customer_sk
                                          ->  Result  (cost=0.00..33741.71 rows=1178712 width=12)
                                                ->  HashAggregate  (cost=0.00..33741.71 rows=1178712 width=12)
                                                      Group Key: catalog_sales.cs_ship_customer_sk
                                                      ->  Sequence  (cost=0.00..14132.41 rows=143998042 width=4)
                                                            ->  Partition Selector for catalog_sales (dynamic scan id: 2)  (cost=10.00..100.00 rows=10 width=4)
                                                                  Partitions selected: 80 (out of 80)
                                                            ->  Dynamic Seq Scan on catalog_sales (dynamic scan id: 2)  (cost=0.00..14132.41 rows=143998042 width=4)
    Optimizer: Pivotal Optimizer (GPORCA) version 3.80.0
   (34 rows)
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org