Posted to dev@carbondata.apache.org by 马云 <si...@163.com> on 2017/03/28 05:24:34 UTC

[DISCUSSION] Order by + Limit Optimization

Hi Dev,


I have implemented an optimization for order by on a single dimension.
The performance test results are as below.


My optimization for order by on a single dimension is as below.
It mainly leverages the fact that the dimension's data is stored in sorted order within each blocklet.
step1. Change the logical plan to push the order by and limit information down to the carbon scan,
            and change the sort physical plan to TakeOrderedAndProject, since the data will be fetched and sorted in each partition.
step2. Apply the limit number and the blocklet's min_max index to filter blocklets.
          This reduces scan time whenever some blocklets are filtered out.


step3. In each partition, load the order by dimension data for all the filtered blocklets
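
The steps above can be sketched roughly as follows. This is only a minimal illustration in Python, not the actual CarbonData or Spark code. The names Blocklet, prune_blocklets, top_n_in_partition and order_by_limit are hypothetical, and the pruning rule (find a threshold from the blocklet max values, then drop every blocklet whose min is above it) is an assumption about how the limit number and the min_max index could be combined. It covers ascending order only and assumes every blocklet is non-empty.

```python
import heapq
from dataclasses import dataclass
from itertools import islice
from typing import List


@dataclass
class Blocklet:
    # Hypothetical stand-in for a blocklet: the order by dimension's values,
    # already stored in sorted order (assumed non-empty).
    values: List[int]

    @property
    def min_value(self) -> int:
        return self.values[0]

    @property
    def max_value(self) -> int:
        return self.values[-1]


def prune_blocklets(blocklets: List[Blocklet], limit: int) -> List[Blocklet]:
    """Assumed form of step2: drop blocklets that cannot reach the top `limit` rows."""
    if limit <= 0:
        return []
    rows_seen = 0
    threshold = None
    # Walk blocklets by ascending max until they hold at least `limit` rows.
    # At that point at least `limit` rows are <= threshold.
    for blocklet in sorted(blocklets, key=lambda b: b.max_value):
        rows_seen += len(blocklet.values)
        if rows_seen >= limit:
            threshold = blocklet.max_value
            break
    if threshold is None:
        # Fewer than `limit` rows in total: nothing can be pruned.
        return list(blocklets)
    # A blocklet whose min is above the threshold cannot contribute.
    return [b for b in blocklets if b.min_value <= threshold]


def top_n_in_partition(blocklets: List[Blocklet], limit: int) -> List[int]:
    """Assumed form of step3: read only the surviving blocklets and take the top rows."""
    kept = prune_blocklets(blocklets, limit)
    # Each blocklet is sorted, so only its first `limit` values can matter.
    merged = heapq.merge(*(b.values[:limit] for b in kept))
    return list(islice(merged, limit))


def order_by_limit(partitions: List[List[Blocklet]], limit: int) -> List[int]:
    """Merge the per-partition results, in the spirit of TakeOrderedAndProject."""
    partials = [top_n_in_partition(p, limit) for p in partitions]
    return list(islice(heapq.merge(*partials), limit))
```

Each partition returns at most `limit` already sorted rows, so the final merge only has to look at a small amount of data, whatever the table size.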




Re: [DISCUSSION] Order by + Limit Optimization

Posted by 马云 <si...@163.com>.

Please ignore the previous email.
My mistake, the mail was not finished.
I will send a new mail later.









At 2017-03-28 13:24:34, "马云" <si...@163.com> wrote:
>Hi Dev,
>
>
>I have implemented an optimization for order by on a single dimension.
>The performance test results are as below.
>
>
>My optimization for order by on a single dimension is as below.
>It mainly leverages the fact that the dimension's data is stored in sorted order within each blocklet.
>step1. Change the logical plan to push the order by and limit information down to the carbon scan,
>            and change the sort physical plan to TakeOrderedAndProject, since the data will be fetched and sorted in each partition.
>step2. Apply the limit number and the blocklet's min_max index to filter blocklets.
>          This reduces scan time whenever some blocklets are filtered out.
>
>
>step3. In each partition, load the order by dimension data for all the filtered blocklets
>
>
>