Posted to dev@doris.apache.org by Botong Huang <pk...@gmail.com> on 2022/09/21 07:59:17 UTC

Code for incremental plan of materialized view upon data change

Hi all,

Does anyone know where the code is for generating the incremental plan for
updating the materialized view when the base table receives new data? Many
thanks!

(I'd like to know where the code that generates the plan for incremental maintenance of materialized views is located. Thanks!)

Best,
Botong

Re: Re: Code for incremental plan of materialized view upon data change

Posted by Botong Huang <pk...@gmail.com>.
Cool, thx a lot!

Best,
Botong

On Fri, Sep 23, 2022 at 11:46 PM Mingyu Chen <mo...@163.com> wrote:

> Currently, Doris does not support multi-table materialized views,
> so an MV with a join is not supported yet.
>
>
> Doris only supports MVs on a single table, with certain aggregations such
> as sum, min, max, count, and count(distinct).
> For example, given a base table A with columns (k1, k2, v1, v2), you can
> create an MV B like:
> select k1, sum(v1) from A group by k1.
> Now you have a base table A and an MV B.
> When data is loaded into table A, a copy of the data is also generated for B.
> For table A, the loaded data has columns (k1, k2, v1, v2); for MV B,
> the loaded data is (k1, v1).
> The aggregation is done on the BE side: the BE receives the data (k1, v1) and
> performs the aggregation (k1, sum(v1)).
>
>
> See StreamLoadPlanner.java in FE for the query plan, and
> DeltaWriter.h/cpp in BE for the aggregation.

Re:Re: Code for incremental plan of materialized view upon data change

Posted by Mingyu Chen <mo...@163.com>.
Currently, Doris does not support multi-table materialized views,
so an MV with a join is not supported yet.


Doris only supports MVs on a single table, with certain aggregations such as sum, min, max, count, and count(distinct).
For example, given a base table A with columns (k1, k2, v1, v2), you can create an MV B like:
select k1, sum(v1) from A group by k1.
Now you have a base table A and an MV B.
When data is loaded into table A, a copy of the data is also generated for B.
For table A, the loaded data has columns (k1, k2, v1, v2); for MV B, the loaded data is (k1, v1).
The aggregation is done on the BE side: the BE receives the data (k1, v1) and performs the aggregation (k1, sum(v1)).


See StreamLoadPlanner.java in FE for the query plan, and DeltaWriter.h/cpp in BE for the aggregation.
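
The load path described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Doris code: the function name load_batch and the in-memory structures standing in for table A and MV B are hypothetical, and the real logic lives in the FE planner and BE writer named above.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for base table A and MV B,
# where B is "select k1, sum(v1) from A group by k1".
table_a = []             # rows of (k1, k2, v1, v2)
mv_b = defaultdict(int)  # k1 -> sum(v1)


def load_batch(rows):
    """Apply one load batch to the base table and to the MV."""
    for k1, k2, v1, v2 in rows:
        # Base table A receives the full row.
        table_a.append((k1, k2, v1, v2))
        # MV B receives only the projected columns (k1, v1),
        # which are folded into the running sum for that key.
        mv_b[k1] += v1


load_batch([("a", 1, 10, 0), ("b", 2, 5, 0)])
load_batch([("a", 3, 7, 0)])
```

Each batch touches only the newly loaded rows, so the MV stays consistent with the base table without rescanning A.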




--
Best Regards,
陈明雨 Mingyu Chen
Email: morningman@apache.org





At 2022-09-23 23:29:44, "Botong Huang" <pk...@gmail.com> wrote:
>Thanks for the reply!
>
>I do understand from the docs that the incremental update of an MV is triggered
>by changes to the base table.
>
>But I cannot find the actual code in the Doris code base that performs the update,
>specifically the actual incremental plan (the tree of operators generated by the
>optimizer) that updates the view result.
>
>For example, if the view is "Select * from A inner join B", then each
>new row in A needs to be joined with all rows in B to get the correct updates to
>the view.
>
>Or if the view is "Select avg(score) from A", then I assume the plan for
>the update would be something like "Select sum(score), count(score) from
>\delta A".
>
>Thanks,
>Botong

Re: Code for incremental plan of materialized view upon data change

Posted by Botong Huang <pk...@gmail.com>.
Thanks for the reply!

I do understand from the docs that the incremental update of an MV is triggered
by changes to the base table.

But I cannot find the actual code in the Doris code base that performs the update,
specifically the actual incremental plan (the tree of operators generated by the
optimizer) that updates the view result.

For example, if the view is "Select * from A inner join B", then each
new row in A needs to be joined with all rows in B to get the correct updates to
the view.
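
The join case can be made concrete with a small Python sketch. It is a generic illustration of delta propagation for inserts into A only, not something taken from Doris; the function names full_join and join_delta are hypothetical, and the join has no condition, as in the query above.

```python
from itertools import product


def full_join(table_a, table_b):
    """Recompute 'select * from A inner join B' from scratch."""
    return [a + b for a, b in product(table_a, table_b)]


def join_delta(delta_a, table_b):
    """Rows to append to the view when delta_a is inserted into A."""
    return [a + b for a, b in product(delta_a, table_b)]


table_a = [(1, "x")]
table_b = [(10,), (20,)]
view = full_join(table_a, table_b)

# New rows arrive in A: only the delta is joined against B.
delta_a = [(2, "y")]
view += join_delta(delta_a, table_b)
table_a += delta_a
```

The incremental result matches a full recomputation, but the work is proportional to the size of the delta times the size of B rather than to the whole of A.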

Or if the view is "Select avg(score) from A", then I assume the plan for
the update would be something like "Select sum(score), count(score) from
\delta A".
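
A minimal Python sketch of the avg case, assuming the view keeps sum and count as its state and that scores contain no NULLs. The class name AvgView is hypothetical and nothing here is taken from the Doris code base.

```python
class AvgView:
    """Hypothetical incrementally maintained 'select avg(score) from A'."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def apply_delta(self, delta_scores):
        # Equivalent of "select sum(score), count(score) from delta A",
        # merged into the stored state (assumes no NULL scores).
        self.total += sum(delta_scores)
        self.count += len(delta_scores)

    def value(self):
        # avg is derived from the stored sum and count on read.
        return self.total / self.count if self.count else None


view = AvgView()
view.apply_delta([80, 90])
view.apply_delta([100])
```

Storing sum and count instead of the average itself is what lets each delta be merged without rereading the existing rows of A.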

Thanks,
Botong

On Fri, Sep 23, 2022 at 9:46 PM Mingyu Chen <mo...@163.com> wrote:

> It is controlled by the load process.
> After an MV is built, every subsequent load task generates a copy of the
> imported data for the base table and for each of its materialized views.
> So the incremental update happens on every load task.

Re:Code for incremental plan of materialized view upon data change

Posted by Mingyu Chen <mo...@163.com>.
It is controlled by the load process.
After an MV is built, every subsequent load task generates a copy of the imported data for the base table and for each of its materialized views.
So the incremental update happens on every load task.




--
Best Regards,
陈明雨 Mingyu Chen
Email: morningman@apache.org





At 2022-09-21 15:59:17, "Botong Huang" <pk...@gmail.com> wrote:
>Hi all,
>
>Does anyone know where the code is for generating the incremental plan for
>updating the materialized view when the base table receives new data? Many
>thanks!
>
>(I'd like to know where the code that generates the plan for incremental maintenance of materialized views is located. Thanks!)
>
>Best,
>Botong