Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2021/12/19 11:12:46 UTC

[GitHub] [incubator-doris] wangshuo128 opened a new issue #7433: [Enhancement] Improve partition prune.

wangshuo128 opened a new issue #7433:
URL: https://github.com/apache/incubator-doris/issues/7433


   ### Search before asking
   
   - [X] I have searched the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Description
   
   The current partition pruning code has two issues.
   1.  Disjunctive predicates are not handled.
   Let's say we have a table `t` partitioned by column `dt`:
   ```sql
   CREATE TABLE `t` (
     `dt` int(11) NULL COMMENT "",
     `k1` int(11) NULL COMMENT "",
     `k2` int(11) NULL COMMENT "",
     `k3` int(11) NULL COMMENT "",
     `k4` int(11) NULL COMMENT ""
   ) DUPLICATE KEY(`dt`, `k1`, `k2`, `k3`, `k4`)
   PARTITION BY RANGE(`dt`)
   (PARTITION p20211121 VALUES LESS THAN ("20211121"),
   PARTITION p20211122 VALUES [("20211121"), ("20211122")),
   PARTITION p20211123 VALUES [("20211122"), ("20211123")),
   PARTITION p20211124 VALUES [("20211123"), ("20211124")),
   PARTITION p20211125 VALUES [("20211124"), ("20211125")),
   PARTITION p20211126 VALUES [("20211125"), ("20211126")),
   PARTITION p20211127 VALUES [("20211126"), ("20211127")),
   PARTITION p20211128 VALUES [("20211127"), ("20211128")))
   DISTRIBUTED BY HASH(`k1`) BUCKETS 60
   PROPERTIES('replication_num' = '1');
   ```
   All partitions are scanned when the query has disjunctive predicates on the partition column, e.g., `SELECT * FROM t WHERE dt=20211123 OR dt=20211124`, although only `p20211124` and `p20211125` can contain matching rows.
   
   2.  Multi-column partitions are not pruned precisely.
   https://github.com/apache/incubator-doris/blob/e74e55d2a4b9aa233f61ab4c35fe5e29d3a33d89/fe/fe-core/src/test/java/org/apache/doris/analysis/ListPartitionPrunerTest.java#L140-L141
   This test case should return 1 partition instead of 2.
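
A minimal sketch of how pruning with disjunctive predicates could work on the range-partitioned table `t` above. It is written in Python for brevity and is not the Doris implementation: the names `PARTITIONS`, `overlaps`, `point`, and `prune` are hypothetical, each partition is modelled as a half-open integer interval, and each disjunct of the `OR` is assumed to have already been turned into an interval of its own.

```python
from typing import Dict, List, Optional, Tuple

# Half-open interval [low, high); None means unbounded on that side.
Range = Tuple[Optional[int], Optional[int]]

# Partition bounds copied from the CREATE TABLE statement of table `t`.
PARTITIONS: Dict[str, Range] = {
    "p20211121": (None, 20211121),
    "p20211122": (20211121, 20211122),
    "p20211123": (20211122, 20211123),
    "p20211124": (20211123, 20211124),
    "p20211125": (20211124, 20211125),
    "p20211126": (20211125, 20211126),
    "p20211127": (20211126, 20211127),
    "p20211128": (20211127, 20211128),
}


def overlaps(a: Range, b: Range) -> bool:
    """Return True if two half-open intervals share at least one value."""
    a_lo, a_hi = a
    b_lo, b_hi = b
    a_starts_before_b_ends = a_lo is None or b_hi is None or a_lo < b_hi
    b_starts_before_a_ends = b_lo is None or a_hi is None or b_lo < a_hi
    return a_starts_before_b_ends and b_starts_before_a_ends


def point(value: int) -> Range:
    """Model `col = value` on an integer column as [value, value + 1)."""
    return (value, value + 1)


def prune(partitions: Dict[str, Range], disjuncts: List[Range]) -> List[str]:
    """Keep every partition that overlaps at least one disjunct (OR semantics)."""
    return [
        name
        for name, bounds in partitions.items()
        if any(overlaps(bounds, d) for d in disjuncts)
    ]


# WHERE dt=20211123 OR dt=20211124
print(prune(PARTITIONS, [point(20211123), point(20211124)]))
# ['p20211124', 'p20211125']
```

Treating the predicate as a set of ranges rather than a single range is what lets an `OR` select two partitions without falling back to a full scan.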
   
   ### Solution
   
   I'd like to implement a V2 of the partition pruning algorithm.
   
   1. Support pruning partitions for disjunctive predicates.
   2. Optimize pruning for multi-column partitions.
   
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] EmmyMiao87 closed issue #7433: [Enhancement] Improve partition prune.

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 closed issue #7433:
URL: https://github.com/apache/incubator-doris/issues/7433


   




[GitHub] [incubator-doris] EmmyMiao87 commented on issue #7433: [Enhancement] Improve partition prune.

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 commented on issue #7433:
URL: https://github.com/apache/incubator-doris/issues/7433#issuecomment-999259695


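   1. Support pruning on a partition column that is constrained by multiple ranges in the where condition. For example, partitions can also be pruned for `where a=1 or a=2`.
   2. Support finer-grained pruning for list partitions with multiple partition columns. For example, with list partition (k1, k2), support pruning for `where k1=1 and k2=1`.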






[GitHub] [incubator-doris] EmmyMiao87 edited a comment on issue #7433: [Enhancement] Improve partition prune.

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 edited a comment on issue #7433:
URL: https://github.com/apache/incubator-doris/issues/7433#issuecomment-999259695


   Requirement
   
   1. Support pruning on a partition column that is constrained by multiple ranges in the where condition. For example, partitions can also be pruned for `where a=1 or a=2`.
   2. Support finer-grained pruning for list partitions with multiple partition columns. For example, with list partition (k1, k2), support multi-column pruning for `where k1>1 and k2>1`.
       a. The previous problem was that, with multi-column partitioning, the pruning results of the individual columns were combined as a union. For example, the partitions satisfying k1>1 and the partitions satisfying k2>1 were unioned. As a result, pruning was not precise enough: the `and` semantics were lost.
       b. Precise partition pruning is now supported: only partitions that satisfy k1>1 and k2>1 at the same time are considered.
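
The difference between the union behaviour described in (a) and the precise behaviour in (b) can be sketched as follows. This is an illustration only, written in Python: the partition layout and the names `PARTITIONS`, `prune_union`, and `prune_conjunctive` are hypothetical and are not taken from the Doris code or its tests.

```python
from typing import Callable, Dict, List, Tuple

Key = Tuple[int, int]  # one (k1, k2) value of a list partition
Pred = Callable[[int], bool]

# Hypothetical list partitions on (k1, k2).
PARTITIONS: Dict[str, List[Key]] = {
    "p1": [(1, 1)],
    "p2": [(1, 2)],
    "p3": [(2, 1)],
    "p4": [(2, 2)],
}


def prune_union(parts: Dict[str, List[Key]], k1_pred: Pred, k2_pred: Pred) -> List[str]:
    """Behaviour described in (a): match each column separately, then union."""
    by_k1 = {n for n, keys in parts.items() if any(k1_pred(k[0]) for k in keys)}
    by_k2 = {n for n, keys in parts.items() if any(k2_pred(k[1]) for k in keys)}
    return sorted(by_k1 | by_k2)


def prune_conjunctive(parts: Dict[str, List[Key]], k1_pred: Pred, k2_pred: Pred) -> List[str]:
    """Behaviour described in (b): keep a partition only if a single key
    satisfies both predicates at the same time."""
    return sorted(
        n
        for n, keys in parts.items()
        if any(k1_pred(k[0]) and k2_pred(k[1]) for k in keys)
    )


def greater_than_one(v: int) -> bool:
    return v > 1


# where k1>1 and k2>1
print(prune_union(PARTITIONS, greater_than_one, greater_than_one))
# ['p2', 'p3', 'p4']
print(prune_conjunctive(PARTITIONS, greater_than_one, greater_than_one))
# ['p4']
```

The sketch checks both predicates against the same key instead of intersecting the per-column results, because a partition holding the keys (2, 1) and (1, 2) would match each column separately even though neither key satisfies the whole condition.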




[GitHub] [incubator-doris] EmmyMiao87 edited a comment on issue #7433: [Enhancement] Improve partition prune.

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 edited a comment on issue #7433:
URL: https://github.com/apache/incubator-doris/issues/7433#issuecomment-999259695


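   1. Support pruning on a partition column that is constrained by multiple ranges in the where condition. For example, partitions can also be pruned for `where a=1 or a=2`.
   2. Support finer-grained pruning for list partitions with multiple partition columns. For example, with list partition (k1, k2), support multi-column pruning for `where k1>1 and k2>1`.
       a. The previous problem was that, with multi-column partitioning, the pruning results of the individual columns were combined as a union. For example, the partitions satisfying k1>1 and the partitions satisfying k2>1 were unioned. As a result, pruning was not precise enough: the `and` semantics were lost.
       b. Precise partition pruning is now supported: only partitions that satisfy k1>1 and k2>1 at the same time are considered.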

