Posted to dev@calcite.apache.org by Haisheng Yuan <hy...@apache.org> on 2021/07/08 01:05:26 UTC

Re: Proposal to extend Calcite into a incremental query optimizer

Hi Botong,

We haven't heard from you for a while.
Feel free to reach out if you get stuck or need help rebasing the code.

Thanks,
Haisheng

On 2021/05/15 00:54:02, Botong Huang <pk...@gmail.com> wrote: 
> Hi all,
> 
> Thank you all for the interest, and thanks Julian for the update!
> 
> I am having problems uploading the pdf files into the jira CALCITE-4568
> <https://issues.apache.org/jira/browse/CALCITE-4568>, so I attached the
> slides in our code base:
> https://github.com/alibaba/cost-based-incremental-optimizer/blob/main/Tempura_Calcite_presentation.pdf
> 
> The slides contain a walk-through example of how Tempura expands its memo. The
> current version of the code also has two end-to-end unit tests,
> TvrOptimizationTest.java and TvrExecutionTest.java. Please feel free to
> start playing with them, and feel free to reach out and possibly schedule
> another meeting if needed.
> 
> As agreed in the meeting, we will rebase our code to a newer version of
> Calcite.
> 
> Best,
> Botong
> 
> On Thu, May 13, 2021 at 12:47 PM Julian Hyde <jh...@gmail.com> wrote:
> 
> > During the meeting we agreed to start progressing this contribution in the
> > usual Apache Way, with conversations on the dev list and in the
> > https://issues.apache.org/jira/browse/CALCITE-4568 JIRA case. So, it
> > should be easy for you to participate.
> >
> > Botong said he would share the slides. (He might be unwilling to make them
> > public, because they are his presentation for a conference that has not
> > happened yet. Reach out to him one-to-one.)
> >
> > Next step is for someone on the Alibaba side to create a PR that is
> > rebased on the latest Calcite master, and add a comment to the JIRA case.
> > Then we can discuss what needs to be done for that PR. Code quality, adding
> > comments, breaking up into smaller commits, additional tests, renaming
> > packages/classes, restructuring into plugins are all possibilities.
> >
> > Our side of the bargain, as committers, is that we should review in a
> > timely manner, and not move the goal posts — if the contributors make the
> > changes we request then we will land this code in master in a reasonable
> > amount of time.
> >
> > We also discussed incremental view maintenance (IVM). Tempura solves a
> > more general problem (finding the optimal K steps to maintain a
> > materialized view as data arrives in K points in time) but if we set K=2,
> > we can generate a plan for how to update a materialized view given a delta
> > table. The plan will be different based on cost - e.g. whether the delta
> > table is small or large. This is a problem that many of our users would
> > like to solve. It will exercise much of Tempura’s code base, and encourage
> > contributions.
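
As a rough illustration of the K=2 case described above, the choice between
applying a delta and recomputing the view can be framed as comparing two
estimated costs. This is only a sketch under stated assumptions: the class
name, the method, and both per-row cost parameters are hypothetical, and the
cost formulas are invented for the example rather than taken from Calcite or
Tempura.

```java
// Hypothetical sketch: names and cost formulas are illustrative assumptions,
// not part of Calcite or Tempura.
final class IvmChoiceSketch {
  enum Strategy { APPLY_DELTA, RECOMPUTE }

  /**
   * Compares two made-up cost estimates for refreshing a materialized view:
   * merging each delta row into the view vs. rescanning base plus delta.
   */
  static Strategy choose(long baseRows, long deltaRows,
      double scanCostPerRow, double mergeCostPerDeltaRow) {
    double recomputeCost = (baseRows + deltaRows) * scanCostPerRow;
    double applyDeltaCost = deltaRows * mergeCostPerDeltaRow;
    return applyDeltaCost <= recomputeCost ? Strategy.APPLY_DELTA : Strategy.RECOMPUTE;
  }

  public static void main(String[] args) {
    // Small delta: merging 1,000 rows is estimated cheaper than a full rescan.
    System.out.println(choose(1_000_000L, 1_000L, 1.0, 20.0));
    // Large delta: merging 500,000 rows is estimated costlier than a full rescan.
    System.out.println(choose(1_000_000L, 500_000L, 1.0, 20.0));
  }
}
```

A real optimizer would derive both estimates from plan alternatives and
statistics; the point here is only that the same view can get a different
refresh plan depending on the size of the delta.
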
> >
> > In my opinion, we should do IVM at launch. It should be the main example
> > we use in conference talks, blog posts, etc. When people understand that
> > case, we can explain how we generalize from K=2 to arbitrary K.
> >
> > Julian
> >
> >
> > > On May 13, 2021, at 9:51 AM, Rui Wang <am...@apache.org> wrote:
> > >
> > > > I apologize; I had the wrong impression of the meeting time (I thought
> > > > it was on Thursday, but it was on Wednesday). I can follow up on your
> > > > meeting records if you have any.
> > >
> > >
> > > -Rui
> > >
> > > On Tue, May 11, 2021 at 8:17 PM Botong Huang <pk...@gmail.com> wrote:
> > >
> > >> Hi all,
> > >>
> > > >> This is a reminder that we are going to have our second discussion meeting
> > > >> tomorrow at 10-11pm PST. Please find the link below; everyone is welcome to
> > > >> join!
> > >>
> > >> Join Zoom Meeting
> > >> https://uci.zoom.us/j/91986206610
> > >> <
> > >>
> > https://www.google.com/url?q=https%3A%2F%2Fuci.zoom.us%2Fj%2F91986206610&sa=D&source=calendar&usd=2&usg=AOvVaw24sxPtI6hbukCSo3nlQsbn
> > >>>
> > >>
> > >> Meeting ID: 919 8620 6610
> > >> One tap mobile
> > >> +16699006833 <(669)%20900-6833>,,91986206610# US (San Jose)
> > >> +12532158782 <(253)%20215-8782>,,91986206610# US (Tacoma)
> > >>
> > >> Dial by your location
> > >>        +1 669 900 6833 <(669)%20900-6833> US (San Jose)
> > >>        +1 253 215 8782 <(253)%20215-8782> US (Tacoma)
> > >>        +1 346 248 7799 <(346)%20248-7799> US (Houston)
> > >>        +1 301 715 8592 <(301)%20715-8592> US (Washington DC)
> > >>        +1 312 626 6799 <(312)%20626-6799> US (Chicago)
> > >>        +1 646 558 8656 <(646)%20558-8656> US (New York)
> > >> Meeting ID: 919 8620 6610
> > >> Find your local number: https://uci.zoom.us/u/acyXcc43Cd
> > >> <
> > >>
> > https://www.google.com/url?q=https%3A%2F%2Fuci.zoom.us%2Fu%2FacyXcc43Cd&sa=D&source=calendar&usd=2&usg=AOvVaw2W08kj_8hEx44dryeZlXb6
> > >>>
> > >>
> > >> Join by Skype for Business
> > >> https://uci.zoom.us/skype/91986206610
> > >> <
> > >>
> > https://www.google.com/url?q=https%3A%2F%2Fuci.zoom.us%2Fskype%2F91986206610&sa=D&source=calendar&usd=2&usg=AOvVaw3w0M0YYbcjPyBXzNpyyk0Z
> > >>>
> > >>
> > >> Thanks,
> > >> Botong
> > >>
> > >> On Wed, May 5, 2021 at 9:55 AM Botong Huang <pk...@gmail.com> wrote:
> > >>
> > >>> Hi Stamatis and all,
> > >>>
> > > >>> Thanks for the interest! Let's tentatively schedule the next meeting for
> > > >>> next Wednesday, May 12, 10pm-11pm PST, then. Please let us know if any
> > > >>> new needs come up.
> > >>>
> > >>> Best,
> > >>> Botong
> > >>>
> > >>> On Sun, May 2, 2021 at 2:59 PM Stamatis Zampetakis <za...@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> Hello,
> > >>>>
> > > >>>> I really regret missing the first meeting, sorry about that. I added my
> > > >>>> preferences in the document.
> > > >>>> I will make sure to attend the next one and help as much as I can.
> > > >>>>
> > > >>>> I didn't have the chance yet to go over the paper but will try to do it
> > > >>>> before the next meeting.
> > > >>>>
> > > >>>> For me the following dates are more convenient than others, so it would be
> > > >>>> nice if we could arrange it then.
> > >>>>
> > >>>> Thu, May 6, 10pm PST
> > >>>> Tue, May 12, 10pm PST
> > >>>>
> > >>>> Best,
> > >>>> Stamatis
> > >>>>
> > >>>> On Sat, May 1, 2021 at 9:42 PM Julian Hyde <jh...@apache.org> wrote:
> > >>>>
> > >>>>> I have added my time preferences to the doc [1]. I am generally
> > >>>>> available any evening Mon - Thu. How about we meet Monday 10th May?
> > >>>>>
> > >>>>> Stamatis, Jesus, Given the complexity of this work, I would very much
> > >>>>> appreciate your insight, as experts in optimizer theory. Could one of
> > >>>>> you join the next meeting? Of course we should choose a time that
> > >>>>> works for everyone's schedule.
> > >>>>>
> > >>>>> Julian
> > >>>>>
> > >>>>> [1]
> > >>>>>
> > >>>>
> > >>
> > https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > >>>>>
> > > >>>>> On Wed, Apr 28, 2021 at 9:32 AM Botong Huang <pk...@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>> We didn't record it; we will try to record the following meetings. Please
> > > >>>>>> add your time preference in the doc, so that we can find a meeting time
> > > >>>>>> that works for more people.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Botong
> > >>>>>>
> > > >>>>>> On Wed, Apr 28, 2021 at 12:23 AM Viliam Durina <viliam@hazelcast.com> wrote:
> > > >>>>>>
> > > >>>>>>> Is there a recording available?
> > > >>>>>>> Viliam
> > > >>>>>>>
> > > >>>>>>> On Wed, 28 Apr 2021 at 00:15, Botong Huang <pk...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> Hi all,
> > >>>>>>>>
> > >>>>>>>> The meeting yesterday was fun and productive. As discussed, this
> > >>>> is
> > >>>>> the
> > >>>>>>>> call to schedule our second meeting.
> > >>>>>>>>
> > >>>>>>>> We encourage everyone to add their time preferences during
> > >> 05/01 -
> > >>>>> 05/15
> > >>>>>>>> here:
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>
> > https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> Botong
> > >>>>>>>>
> > >>>>>>>> On Wed, Apr 21, 2021 at 5:19 PM Botong Huang <pk...@gmail.com>
> > >>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi all,
> > >>>>>>>>> We've created a zoom meeting below for our meeting next Monday
> > >>>>>>>>> (9pm-10:30pm PST on 04/26).
> > >>>>>>>>> Talk to you all soon!
> > >>>>>>>>>
> > >>>>>>>>> Join Zoom Meeting
> > >>>>>>>>> https://uci.zoom.us/j/91279732686
> > >>>>>>>>> <
> > >>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>
> > https://www.google.com/url?q=https%3A%2F%2Fuci.zoom.us%2Fj%2F91279732686&sa=D&source=calendar&usd=2&usg=AOvVaw2C5LoOmCaSLWSi-YvMmsOE
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Meeting ID: 912 7973 2686
> > >>>>>>>>> One tap mobile
> > >>>>>>>>> +16699006833 <(669)%20900-6833>,,91279732686# US (San Jose)
> > >>>>>>>>> +12532158782 <(253)%20215-8782>,,91279732686# US (Tacoma)
> > >>>>>>>>>
> > >>>>>>>>> Dial by your location
> > >>>>>>>>> +1 669 900 6833 <(669)%20900-6833> US (San Jose)
> > >>>>>>>>> +1 253 215 8782 <(253)%20215-8782> US (Tacoma)
> > >>>>>>>>> +1 346 248 7799 <(346)%20248-7799> US (Houston)
> > >>>>>>>>> +1 301 715 8592 <(301)%20715-8592> US (Washington DC)
> > >>>>>>>>> +1 312 626 6799 <(312)%20626-6799> US (Chicago)
> > >>>>>>>>> +1 646 558 8656 <(646)%20558-8656> US (New York)
> > >>>>>>>>> Meeting ID: 912 7973 2686
> > >>>>>>>>> Find your local number: https://uci.zoom.us/u/aykHTkJBh
> > >>>>>>>>> <
> > >>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>
> > https://www.google.com/url?q=https%3A%2F%2Fuci.zoom.us%2Fu%2FaykHTkJBh&sa=D&source=calendar&usd=2&usg=AOvVaw0y_V5CisCHRyt9wsXLa9UM
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Join by Skype for Business
> > >>>>>>>>> https://uci.zoom.us/skype/91279732686
> > >>>>>>>>> <
> > >>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>
> > https://www.google.com/url?q=https%3A%2F%2Fuci.zoom.us%2Fskype%2F91279732686&sa=D&source=calendar&usd=2&usg=AOvVaw3iQwsDViu3K7-Rb_Iy6Zsy
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>> Botong
> > >>>>>>>>>
> > >>>>>>>>> On Tue, Apr 13, 2021 at 10:16 PM Botong Huang <
> > >> pkuhbt@gmail.com
> > >>>>>
> > >>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi all,
> > >>>>>>>>>>
> > >>>>>>>>>> According to the preferences collected, we are tentatively
> > >>>>> scheduling
> > >>>>>>>> our
> > >>>>>>>>>> meeting at 9pm-10:30pm PST on 04/26 Monday.
> > >>>>>>>>>>
> > >>>>>>>>>> We will give a presentation about Tempura, followed by a free
> > >>>>>>>> discussion.
> > >>>>>>>>>>
> > >>>>>>>>>> Please let us know if there are new other requests. Few days
> > >>>>> before
> > >>>>>>>>>> the meeting, I will send out a zoom meeting link.
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks,
> > >>>>>>>>>> Botong
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Apr 7, 2021 at 2:46 PM Botong Huang <
> > >> pkuhbt@gmail.com>
> > >>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi Julian and all,
> > >>>>>>>>>>>
> > >>>>>>>>>>> We've posted the Tempura code base below. Feel free to take
> > >> a
> > >>>>> quick
> > >>>>>>>> peek
> > >>>>>>>>>>> at the last five commits.
> > >>>>>>>>>>>
> > >>>>>>>>
> > >>>>>
> > >>>>
> > >>
> > https://github.com/alibaba/cost-based-incremental-optimizer/commits/main
> > >>>>>>>>>>>
> > >>>>>>>>>>> I've also opened a Jira (CALCITE-4568
> > >>>>>>>>>>> <https://issues.apache.org/jira/browse/CALCITE-4568>),
> > >> which
> > >>>>> will
> > >>>>>>>> serve
> > >>>>>>>>>>> as the umbrella Jira for the feature.
> > >>>>>>>>>>>
> > >>>>>>>>>>> In the meantime, we encourage everyone to enter the time
> > >>>>> preferences
> > >>>>>>>> for
> > >>>>>>>>>>> our first meeting here:
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>
> > https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks,
> > >>>>>>>>>>> Botong
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Mon, Apr 5, 2021 at 3:59 PM Julian Hyde <
> > >>>>> jhyde.apache@gmail.com>
> > >>>>>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> I have added my time preferences to the doc.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Before we meet, could you publish a PR for us to review?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Initial discussions will need to be about architecture and
> > >>>>>>> high-level
> > >>>>>>>>>>>> design. So I would ask Calcite reviewers not to review the
> > >> PR
> > >>>>>>>> line-by-line
> > >>>>>>>>>>>> (or to leave comments in GitHub) but try to understand the
> > >>>>> design
> > >>>>>>>>>>>> holistically, and prepare questions/comments before the
> > >>>> meeting.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Botong, Can you please create a Calcite JIRA case for this
> > >>>> task?
> > >>>>>>> JIRA
> > >>>>>>>>>>>> how we track long-running tasks such as this.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Julian
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> On Apr 3, 2021, at 5:15 PM, Botong Huang <
> > >> pkuhbt@gmail.com
> > >>>>>
> > >>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Hi all,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Apology for the delay. It took us some time to clean up
> > >> our
> > >>>>> code
> > >>>>>>>> base
> > >>>>>>>>>>>> and
> > >>>>>>>>>>>>> publicly release it (which will be out soon) for a quick
> > >>>> peek.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> We are ready to present our work. Let's schedule a time
> > >>>> for a
> > >>>>> Zoom
> > >>>>>>>>>>>>> meeting and discuss how to integrate Tempura into
> > >> Calcite.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Since some of our team members are in China, we prefer
> > >> the
> > >>>>> time
> > >>>>>>> slot
> > >>>>>>>>>>>> of
> > >>>>>>>>>>>>> 7:00pm-11:30pm PST any day. I've added our time
> > >> preference
> > >>>> in
> > >>>>> the
> > >>>>>>>>>>>> shared
> > >>>>>>>>>>>>> doc below.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>
> > https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> We encourage everyone to add their time preferences
> > >> (during
> > >>>>>>>>>>>> 04/15-04/30) in
> > >>>>>>>>>>>>> this doc. In a week or so, we will try to settle a time
> > >>>> that
> > >>>>> works
> > >>>>>>>> for
> > >>>>>>>>>>>>> most.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>> Botong
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Sat, Jan 30, 2021 at 9:19 PM Botong Huang <
> > >>>>> pkuhbt@gmail.com>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Hi Julian and Rui,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Sounds good to us. Please give us some time to prepare
> > >>>> some
> > >>>>>>> slides
> > >>>>>>>>>>>> for the
> > >>>>>>>>>>>>>> meeting.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I've created a doc below for discussion. Please feel
> > >> free
> > >>>> to
> > >>>>> add
> > >>>>>>>>>>>> more in
> > >>>>>>>>>>>>>> here:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>
> > https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>> Botong
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Thu, Jan 28, 2021 at 11:18 AM Julian Hyde <
> > >>>>>>>> jhyde.apache@gmail.com
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> PS The “editable doc” that Rui refers to is also a good
> > >>>>> idea. I
> > >>>>>>>>>>>> think we
> > >>>>>>>>>>>>>>> should create it to continue discussion after the first
> > >>>>> meeting.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Julian
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On Jan 28, 2021, at 11:16 AM, Julian Hyde <
> > >>>>>>>> jhyde.apache@gmail.com>
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I think good next steps would be a PR and a meeting.
> > >>>> The
> > >>>>> PR
> > >>>>>>> will
> > >>>>>>>>>>>> allow
> > >>>>>>>>>>>>>>> us to read the code, but I think we should do the first
> > >>>>> round of
> > >>>>>>>>>>>> questions
> > >>>>>>>>>>>>>>> at the meeting.  The meeting could perhaps start with a
> > >>>>>>>>>>>> presentation of the
> > >>>>>>>>>>>>>>> paper (do you have some slides you are planning to
> > >>>> present
> > >>>>> at
> > >>>>>>>> VLDB,
> > >>>>>>>>>>>>>>> Botong?) and then move on to questions about the
> > >>>> concepts,
> > >>>>> which
> > >>>>>>>>>>>>>>> alternatives were considered, and how the concepts map
> > >>>> onto
> > >>>>>>> other
> > >>>>>>>>>>>> current
> > >>>>>>>>>>>>>>> and future concepts in calcite.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I don’t think we should start “reviewing” the PR
> > >>>>> line-by-line
> > >>>>>>> at
> > >>>>>>>>>>>> this
> > >>>>>>>>>>>>>>> point. We need to understand the high-level concepts
> > >> and
> > >>>>> design
> > >>>>>>>>>>>> choices. If
> > >>>>>>>>>>>>>>> we start reviewing the PR we will get lost in the
> > >>>> details.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I know that integrating a major change is hard; I
> > >> doubt
> > >>>>> that we
> > >>>>>>>>>>>> will be
> > >>>>>>>>>>>>>>> able to integrate everything, but we can build
> > >>>> understanding
> > >>>>>>> about
> > >>>>>>>>>>>> where
> > >>>>>>>>>>>>>>> calcite needs to go, and I hope integrate a good amount
> > >>>> of
> > >>>>> code
> > >>>>>>> to
> > >>>>>>>>>>>> help us
> > >>>>>>>>>>>>>>> get there.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> As I said before, after the integration I would like
> > >>>>> people to
> > >>>>>>> be
> > >>>>>>>>>>>> able
> > >>>>>>>>>>>>>>> to experiment with it and use it in their production
> > >>>>> systems.
> > >>>>>>>> That
> > >>>>>>>>>>>> way, it
> > >>>>>>>>>>>>>>> will not be an experiment that withers, but a feature
> > >> set
> > >>>>>>>>>>>> integrates with
> > >>>>>>>>>>>>>>> other calcite features and gets stronger over time.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Julian
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Jan 28, 2021, at 10:54 AM, Rui Wang <
> > >>>>> amaliujia@apache.org>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> For me to participate in the discussion for the
> > >> above
> > >>>>>>>> questions,
> > >>>>>>>>>>>> I
> > >>>>>>>>>>>>>>> will
> > >>>>>>>>>>>>>>>>> need to read a lot more to know relevant context and
> > >>>>> likely
> > >>>>>>> ask
> > >>>>>>>>>>>> lots of
> > >>>>>>>>>>>>>>>>> questions :-).  A editable doc is probably good for
> > >>>>> questions
> > >>>>>>>> and
> > >>>>>>>>>>>> back
> > >>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>> forward discussion.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> -Rui
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On Thu, Jan 28, 2021 at 10:50 AM Rui Wang <
> > >>>>>>>> amaliujia@apache.org
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> I am also happy to help push this work into Calcite
> > >>>>> (review
> > >>>>>>>> code
> > >>>>>>>>>>>> and
> > >>>>>>>>>>>>>>> doc,
> > >>>>>>>>>>>>>>>>>> etc.).
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> While you can share your code so people can have
> > >> more
> > >>>>> idea
> > >>>>>>> how
> > >>>>>>>>>>>> it is
> > >>>>>>>>>>>>>>>>>> implemented, I think it would be also nice to have a
> > >>>> doc
> > >>>>> to
> > >>>>>>>>>>>> discuss
> > >>>>>>>>>>>>>>> open
> > >>>>>>>>>>>>>>>>>> questions above. Some points that I copy those to
> > >>>> here:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> 1. Can this solution be compatible with existing
> > >>>>> solutions in
> > >>>>>>>>>>>> Calcite
> > >>>>>>>>>>>>>>>>>> Streaming, materialized view maintenance, and
> > >>>> multi-query
> > >>>>>>>>>>>> optimization
> > >>>>>>>>>>>>>>>>>> (Sigma and Delta relational operators, lattice, and
> > >>>> Spool
> > >>>>>>>>>>>> operator),
> > >>>>>>>>>>>>>>>>>> 2. Did you find that you needed two separate cost
> > >>>> models
> > >>>>> -
> > >>>>>>> one
> > >>>>>>>>>>>> for
> > >>>>>>>>>>>>>>> “view
> > >>>>>>>>>>>>>>>>>> maintenance” and another for “user queries” - since
> > >>>> the
> > >>>>>>>>>>>> objectives of
> > >>>>>>>>>>>>>>> each
> > >>>>>>>>>>>>>>>>>> activity are so different?
> > >>>>>>>>>>>>>>>>>> 3. whether this work will hasten the arrival of
> > >>>>>>> multi-objective
> > >>>>>>>>>>>>>>> parametric
> > >>>>>>>>>>>>>>>>>> query optimization [1] in Calcite.
> > >>>>>>>>>>>>>>>>>> 4. probably SQL shell support.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> [1]:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>
> > https://cacm.acm.org/magazines/2017/10/221322-multi-objective-parametric-query-optimization/fulltext
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> -Rui
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On Wed, Jan 27, 2021 at 6:52 PM Albert <
> > >>>>> zinking3@gmail.com>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> it would be very nice to see a POC of your work.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> On Thu, Jan 28, 2021 at 10:21 AM Botong Huang <
> > >>>>>>>>>>>> pkuhbt@gmail.com>
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Hi Julian,
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Just wondering if there are any updates? We are
> > >>>>> wondering
> > >>>>>>> if
> > >>>>>>>> it
> > >>>>>>>>>>>>>>> would
> > >>>>>>>>>>>>>>>>>>> help
> > >>>>>>>>>>>>>>>>>>>> to post our code for a quick preview.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>>>>> Botong
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> On Fri, Jan 1, 2021 at 11:04 AM Botong Huang <
> > >>>>>>>> pkuhbt@gmail.com
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Hi Julian,
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Thanks for your interest! Sure let's figure out a
> > >>>> plan
> > >>>>>>> that
> > >>>>>>>>>>>> best
> > >>>>>>>>>>>>>>>>>>> benefits
> > >>>>>>>>>>>>>>>>>>>>> the community. Here are some clarifications that
> > >>>>> hopefully
> > >>>>>>>>>>>> answer
> > >>>>>>>>>>>>>>> your
> > >>>>>>>>>>>>>>>>>>>>> questions.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> In our work (Tempura), users specify the set of
> > >>>> time
> > >>>>>>> points
> > >>>>>>>> to
> > >>>>>>>>>>>>>>>>>>> consider
> > >>>>>>>>>>>>>>>>>>>>> running and a cost function that expresses users'
> > >>>>>>> preference
> > >>>>>>>>>>>> over
> > >>>>>>>>>>>>>>>>>>> time,
> > >>>>>>>>>>>>>>>>>>>>> Tempura will generate the best incremental plan
> > >>>> that
> > >>>>>>>>>>>> minimizes the
> > >>>>>>>>>>>>>>>>>>>> overall
> > >>>>>>>>>>>>>>>>>>>>> cost function.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> In this incremental plan, the sub-plans at
> > >>>> different
> > >>>>> time
> > >>>>>>>>>>>> points
> > >>>>>>>>>>>>>>> can
> > >>>>>>>>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>>>>> different from each other, as opposed to
> > >> identical
> > >>>>> plans
> > >>>>>>> in
> > >>>>>>>>>>>> all
> > >>>>>>>>>>>>>>> delta
> > >>>>>>>>>>>>>>>>>>>> runs
> > >>>>>>>>>>>>>>>>>>>>> as in streaming or IVM. As mentioned in $2.1 of
> > >> the
> > >>>>>>> Tempura
> > >>>>>>>>>>>> paper,
> > >>>>>>>>>>>>>>> we
> > >>>>>>>>>>>>>>>>>>> can
> > >>>>>>>>>>>>>>>>>>>>> mimic the current streaming implementation by
> > >>>>> specifying
> > >>>>>>> two
> > >>>>>>>>>>>>>>> (logical)
> > >>>>>>>>>>>>>>>>>>>> time
> > >>>>>>>>>>>>>>>>>>>>> points in Tempura, representing the initial run
> > >> and
> > >>>>> later
> > >>>>>>>>>>>> delta
> > >>>>>>>>>>>>>>> runs
> > >>>>>>>>>>>>>>>>>>>>> respectively. In general, note that Tempura
> > >>>> supports
> > >>>>>>> various
> > >>>>>>>>>>>> form
> > >>>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>>>>>>> incremental computing, not only the small-delta
> > >>>>>>> append-only
> > >>>>>>>>>>>> data
> > >>>>>>>>>>>>>>>>>>> model in
> > >>>>>>>>>>>>>>>>>>>>> streaming systems. That's why we believe Tempura
> > >>>>> subsumes
> > >>>>>>>> the
> > >>>>>>>>>>>>>>> current
> > >>>>>>>>>>>>>>>>>>>>> streaming support, as well as any IVM
> > >>>> implementations.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> About the cost model, we did not come up with a
> > >>>>> seperate
> > >>>>>>>> cost
> > >>>>>>>>>>>>>>> model,
> > >>>>>>>>>>>>>>>>>>> but
> > >>>>>>>>>>>>>>>>>>>>> rather extended the existing one. Similar to
> > >>>>>>> multi-objective
> > >>>>>>>>>>>>>>>>>>>> optimization,
> > >>>>>>>>>>>>>>>>>>>>> costs incurred at different time points are
> > >>>> considered
> > >>>>>>>>>>>> different
> > >>>>>>>>>>>>>>>>>>>>> dimensions. Tempura lets users supply a function
> > >>>> that
> > >>>>>>>>>>>> converts this
> > >>>>>>>>>>>>>>>>>>> cost
> > >>>>>>>>>>>>>>>>>>>>> vector into a final cost. So under this function,
> > >>>> any
> > >>>>> two
> > >>>>>>>>>>>>>>> incremental
> > >>>>>>>>>>>>>>>>>>>> plans
> > >>>>>>>>>>>>>>>>>>>>> are still comparable and there is an overall
> > >>>> optimum.
> > >>>>> I
> > >>>>>>>> guess
> > >>>>>>>>>>>> we
> > >>>>>>>>>>>>>>> can
> > >>>>>>>>>>>>>>>>>>> go
> > >>>>>>>>>>>>>>>>>>>>> down the route of multi-objective parametric
> > >> query
> > >>>>>>>>>>>> optimization
> > >>>>>>>>>>>>>>>>>>> instead
> > >>>>>>>>>>>>>>>>>>>> if
> > >>>>>>>>>>>>>>>>>>>>> there is a need.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Next on materialized views and multi-query
> > >>>>> optimization,
> > >>>>>>>>>>>> since our
> > >>>>>>>>>>>>>>>>>>>>> multi-time-point plan naturally involves
> > >>>> materializing
> > >>>>>>>>>>>> intermediate
> > >>>>>>>>>>>>>>>>>>>> results
> > >>>>>>>>>>>>>>>>>>>>> for later time points, we need to solve the
> > >>>> problem of
> > >>>>>>>>>>>> choosing
> > >>>>>>>>>>>>>>>>>>>>> materializations and include the cost of saving
> > >> and
> > >>>>>>> reusing
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>> materializations when costing and comparing
> > >> plans.
> > >>>> We
> > >>>>>>>>>>>> borrowed the
> > >>>>>>>>>>>>>>>>>>>>> multi-query optimization techniques to solve this
> > >>>>> problem
> > >>>>>>>> even
> > >>>>>>>>>>>>>>> though
> > >>>>>>>>>>>>>>>>>>> we
> > >>>>>>>>>>>>>>>>>>>>> are looking at a single query. As a result, we
> > >>>> think
> > >>>>> our
> > >>>>>>>> work
> > >>>>>>>>>>>> is
> > >>>>>>>>>>>>>>>>>>>> orthogonal
> > >>>>>>>>>>>>>>>>>>>>> to Calcite's facilities around utilizing existing
> > >>>>> views,
> > >>>>>>>>>>>> lattice
> > >>>>>>>>>>>>>>> etc.
> > >>>>>>>>>>>>>>>>>>> We
> > >>>>>>>>>>>>>>>>>>>> do
> > >>>>>>>>>>>>>>>>>>>>> feel that the multi-query optimization component
> > >>>> can
> > >>>>> be
> > >>>>>>>>>>>> adopted to
> > >>>>>>>>>>>>>>>>>>> wider
> > >>>>>>>>>>>>>>>>>>>>> use, but probably need more suggestions from the
> > >>>>>>> community.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Lastly, our current implementation is set up in
> > >>>> java
> > >>>>> code,
> > >>>>>>>> it
> > >>>>>>>>>>>>>>> should
> > >>>>>>>>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>>>>> straightforward to hook it up with SQL shell.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>>>>>> Botong
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> On Mon, Dec 28, 2020 at 6:44 PM Julian Hyde <
> > >>>>>>>>>>>>>>> jhyde.apache@gmail.com>
> > >>>>>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Botong,
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> This is very exciting; congratulations on this
> > >>>>> research,
> > >>>>>>>> and
> > >>>>>>>>>>>> thank
> > >>>>>>>>>>>>>>>>>>> you
> > >>>>>>>>>>>>>>>>>>>>>> for contributing it back to Calcite.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> The research touches several areas in Calcite:
> > >>>>> streaming,
> > >>>>>>>>>>>>>>>>>>> materialized
> > >>>>>>>>>>>>>>>>>>>>>> view maintenance, and multi-query optimization.
> > >>>> As we
> > >>>>>>> have
> > >>>>>>>>>>>> already
> > >>>>>>>>>>>>>>>>>>> some
> > >>>>>>>>>>>>>>>>>>>>>> solutions in those areas (Sigma and Delta
> > >>>> relational
> > >>>>>>>>>>>> operators,
> > >>>>>>>>>>>>>>>>>>> lattice,
> > >>>>>>>>>>>>>>>>>>>>>> and Spool operator), it will be interesting to
> > >> see
> > >>>>>>> whether
> > >>>>>>>>>>>> we can
> > >>>>>>>>>>>>>>>>>>> make
> > >>>>>>>>>>>>>>>>>>>> them
> > >>>>>>>>>>>>>>>>>>>>>> compatible, or whether one concept can subsume
> > >>>>> others.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Your work differs from streaming queries in that
> > >>>> your
> > >>>>>>>>>>>> relations
> > >>>>>>>>>>>>>>> are
> > >>>>>>>>>>>>>>>>>>> used
> > >>>>>>>>>>>>>>>>>>>>>> by “external” user queries, whereas in pure
> > >>>> streaming
> > >>>>>>>>>>>> queries, the
> > >>>>>>>>>>>>>>>>>>> only
> > >>>>>>>>>>>>>>>>>>>>>> activity is the change propagation. Did you find
> > >>>>> that you
> > >>>>>>>>>>>> needed
> > >>>>>>>>>>>>>>> two
> > >>>>>>>>>>>>>>>>>>>>>> separate cost models - one for “view
> > >> maintenance”
> > >>>> and
> > >>>>>>>>>>>> another for
> > >>>>>>>>>>>>>>>>>>> “user
> > >>>>>>>>>>>>>>>>>>>>>> queries” - since the objectives of each activity
> > >>>> are
> > >>>>> so
> > >>>>>>>>>>>> different?
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> I wonder whether this work will hasten the
> > >>>> arrival of
> > >>>>>>>>>>>>>>> multi-objective
> > >>>>>>>>>>>>>>>>>>>>>> parametric query optimization [1] in Calcite.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> I will make time over the next few days to read
> > >>>> and
> > >>>>>>> digest
> > >>>>>>>>>>>> your
> > >>>>>>>>>>>>>>>>>>> paper.
> > >>>>>>>>>>>>>>>>>>>>>> Then I expect that we will have a back-and-forth
> > >>>>> process
> > >>>>>>> to
> > >>>>>>>>>>>> create
> > >>>>>>>>>>>>>>>>>>>>>> something that will be useful for the broader
> > >>>>> community.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> One thing will be particularly useful: making
> > >> this
> > >>>>>>>>>>>> functionality
> > >>>>>>>>>>>>>>>>>>>>>> available from a SQL shell, so that people can
> > >>>>> experiment
> > >>>>>>>>>>>> with
> > >>>>>>>>>>>>>>> this
> > >>>>>>>>>>>>>>>>>>>>>> functionality without writing Java code or
> > >>>> setting up
> > >>>>>>>> complex
> > >>>>>>>>>>>>>>>>>>> databases
> > >>>>>>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>>>>> metadata. I have in mind something like the
> > >> simple
> > >>>>> DDL
> > >>>>>>>>>>>> operations
> > >>>>>>>>>>>>>>>>>>> that
> > >>>>>>>>>>>>>>>>>>>> are
> > >>>>>>>>>>>>>>>>>>>>>> available in Calcite’s ’server’ module. I wonder
> > >>>>> whether
> > >>>>>>> we
> > >>>>>>>>>>>> could
> > >>>>>>>>>>>>>>>>>>> devise
> > >>>>>>>>>>>>>>>>>>>>>> some kind of SQL syntax for a “multi-query”.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Julian
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>
> > https://cacm.acm.org/magazines/2017/10/221322-multi-objective-parametric-query-optimization/fulltext
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> On Dec 23, 2020, at 8:55 PM, Botong Huang <
> > >>>>>>>> pkuhbt@gmail.com
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Thanks Aron for pointing this out. To see the
> > >>>>> figure,
> > >>>>>>>> please
> > >>>>>>>>>>>>>>> refer
> > >>>>>>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>>>>>>> Fig
> > >>>>>>>>>>>>>>>>>>>>>>> 3(a) in our paper:
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>> https://kai-zeng.github.io/papers/tempura-vldb2021.pdf
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>>>>>> Botong
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> On Wed, Dec 23, 2020 at 7:20 PM JiaTao Tao <taojiatao@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Seems interesting, the pic can not be seen in the mail, may
> > >>>>>>>>>>>>>>>>>>>>>>>> you open a JIRA for this, people who are interested in this
> > >>>>>>>>>>>>>>>>>>>>>>>> can subscribe to the JIRA?
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Regards!
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Aron Tao
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> On Thu, Dec 24, 2020 at 3:18 AM, Botong Huang <bo...@apache.org> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> This is a proposal to extend the Calcite optimizer into a
> > >>>>>>>>>>>>>>>>>>>>>>>>> general incremental query optimizer, based on our research
> > >>>>>>>>>>>>>>>>>>>>>>>>> paper published in VLDB 2021:
> > >>>>>>>>>>>>>>>>>>>>>>>>> Tempura: a general cost-based optimizer framework for
> > >>>>>>>>>>>>>>>>>>>>>>>>> incremental data processing
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> We also have a demo in SIGMOD 2020 illustrating how
> > >>>>>>>>>>>>>>>>>>>>>>>>> Alibaba’s data warehouse is planning to use this
> > >>>>>>>>>>>>>>>>>>>>>>>>> incremental query optimizer to alleviate cluster-wise
> > >>>>>>>>>>>>>>>>>>>>>>>>> resource skewness:
> > >>>>>>>>>>>>>>>>>>>>>>>>> Grosbeak: A Data Warehouse Supporting Resource-Aware
> > >>>>>>>>>>>>>>>>>>>>>>>>> Incremental Computing
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> To our best knowledge, this is the first general cost-based
> > >>>>>>>>>>>>>>>>>>>>>>>>> incremental optimizer that can find the best plan across
> > >>>>>>>>>>>>>>>>>>>>>>>>> multiple families of incremental computing methods,
> > >>>>>>>>>>>>>>>>>>>>>>>>> including IVM, Streaming, DBToaster, etc. Experiments (in
> > >>>>>>>>>>>>>>>>>>>>>>>>> the paper) show that the generated best plan is
> > >>>>>>>>>>>>>>>>>>>>>>>>> consistently much better than the plans from each
> > >>>>>>>>>>>>>>>>>>>>>>>>> individual method alone.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> In general, incremental query planning is central to
> > >>>>>>>>>>>>>>>>>>>>>>>>> database view maintenance and stream processing systems,
> > >>>>>>>>>>>>>>>>>>>>>>>>> and is being adopted in active databases, resumable query
> > >>>>>>>>>>>>>>>>>>>>>>>>> execution, approximate query processing, etc. We are
> > >>>>>>>>>>>>>>>>>>>>>>>>> hoping that this feature can help widen the spectrum of
> > >>>>>>>>>>>>>>>>>>>>>>>>> Calcite and solicit more use cases and adoption of Calcite.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Below is a brief description of the technical details.
> > >>>>>>>>>>>>>>>>>>>>>>>>> Please refer to the Tempura paper for more details. We are
> > >>>>>>>>>>>>>>>>>>>>>>>>> also working on a journal version of the paper with more
> > >>>>>>>>>>>>>>>>>>>>>>>>> implementation details.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Currently the query plan generated by Calcite is meant to
> > >>>>>>>>>>>>>>>>>>>>>>>>> be executed altogether at once. In the proposal, Calcite’s
> > >>>>>>>>>>>>>>>>>>>>>>>>> memo will be extended with temporal information so that it
> > >>>>>>>>>>>>>>>>>>>>>>>>> is capable of generating incremental plans that include
> > >>>>>>>>>>>>>>>>>>>>>>>>> multiple sub-plans to execute at different time points.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> The main idea is to view each table as one that changes
> > >>>>>>>>>>>>>>>>>>>>>>>>> over time (Time Varying Relations (TVR)). To achieve that
> > >>>>>>>>>>>>>>>>>>>>>>>>> we introduced TvrMetaSet into Calcite’s memo besides RelSet
> > >>>>>>>>>>>>>>>>>>>>>>>>> and RelSubset to track related RelSets of a changing table
> > >>>>>>>>>>>>>>>>>>>>>>>>> (e.g. snapshot of the table at certain time, delta of the
> > >>>>>>>>>>>>>>>>>>>>>>>>> table between two time points, etc.).
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> [image: image.png]
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> For example in the above figure, each vertical line is a
> > >>>>>>>>>>>>>>>>>>>>>>>>> TvrMetaSet representing a TVR (S, R, S left outer join R,
> > >>>>>>>>>>>>>>>>>>>>>>>>> etc.). Horizontal lines represent time. Each black dot in
> > >>>>>>>>>>>>>>>>>>>>>>>>> the grid is a RelSet. Users can write TVR Rewrite Rules to
> > >>>>>>>>>>>>>>>>>>>>>>>>> describe valid transformations between these dots. For
> > >>>>>>>>>>>>>>>>>>>>>>>>> example, the blue lines are inter-TVR rules that describe
> > >>>>>>>>>>>>>>>>>>>>>>>>> how to compute certain RelSet of a TVR from RelSets of
> > >>>>>>>>>>>>>>>>>>>>>>>>> other TVRs. The red lines are intra-TVR rules that describe
> > >>>>>>>>>>>>>>>>>>>>>>>>> transformations within a TVR. All TVR rewrite rules are
> > >>>>>>>>>>>>>>>>>>>>>>>>> logical rules. All existing Calcite rules still work in the
> > >>>>>>>>>>>>>>>>>>>>>>>>> new volcano system without modification.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> All changes in this feature will consist of four parts:
> > >>>>>>>>>>>>>>>>>>>>>>>>> 1. Memo extension with TvrMetaSet
> > >>>>>>>>>>>>>>>>>>>>>>>>> 2. Rule engine upgrade, capable of matching TvrMetaSet and
> > >>>>>>>>>>>>>>>>>>>>>>>>> RelNodes, as well as links in between the nodes.
> > >>>>>>>>>>>>>>>>>>>>>>>>> 3. A basic set of TvrRules, written using the upgraded rule
> > >>>>>>>>>>>>>>>>>>>>>>>>> engine API.
> > >>>>>>>>>>>>>>>>>>>>>>>>> 4. Multi-query optimization, used to find the best
> > >>>>>>>>>>>>>>>>>>>>>>>>> incremental plan involving multiple time points.
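The memo extension in item 1 can be pictured roughly as follows. This is only a minimal sketch under stated assumptions: the names TvrMemoSketch, Version and TvrMetaSetSketch are hypothetical and are not the Tempura or Calcite classes, and a RelSet is reduced to an integer id. It shows one TVR holding links from "snapshot at time t" and "delta between two times" to the RelSets that compute them.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Illustrative only: these are NOT the Tempura or Calcite classes. */
final class TvrMemoSketch {

  /** One dot on a TVR's timeline: a snapshot at a time, or a delta over an interval. */
  static final class Version {
    final long from; // Long.MIN_VALUE marks a snapshot (assumed convention)
    final long to;

    private Version(long from, long to) {
      this.from = from;
      this.to = to;
    }

    static Version snapshot(long time) {
      return new Version(Long.MIN_VALUE, time);
    }

    static Version delta(long from, long to) {
      if (from >= to) {
        throw new IllegalArgumentException("delta needs from < to");
      }
      return new Version(from, to);
    }

    @Override public boolean equals(Object o) {
      return o instanceof Version
          && ((Version) o).from == from
          && ((Version) o).to == to;
    }

    @Override public int hashCode() {
      return Objects.hash(from, to);
    }
  }

  /** One time-varying relation: links each version to the id of a RelSet. */
  static final class TvrMetaSetSketch {
    final String name;
    private final Map<Version, Integer> relSetIds = new LinkedHashMap<>();

    TvrMetaSetSketch(String name) {
      this.name = name;
    }

    void link(Version version, int relSetId) {
      relSetIds.put(version, relSetId);
    }

    /** Returns the linked RelSet id, or null if none is registered yet. */
    Integer relSetFor(Version version) {
      return relSetIds.get(version);
    }
  }

  public static void main(String[] args) {
    TvrMetaSetSketch s = new TvrMetaSetSketch("S");
    s.link(Version.snapshot(1), 10);  // S as of time 1
    s.link(Version.delta(1, 2), 11);  // changes to S between time 1 and 2
    System.out.println(s.relSetFor(Version.delta(1, 2)));
  }
}
```

In this picture, a rewrite rule would be something that, given the RelSets linked from one or more such objects, registers a new link; how the real rule engine matches and fires is described in the paper, not here.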
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Note that this feature is an extension in nature and thus
> > >>>>>>>>>>>>>>>>>>>>>>>>> when disabled, does not change any existing Calcite
> > >>>>>>>>>>>>>>>>>>>>>>>>> behavior.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Other than scenarios in the paper, we also applied this
> > >>>>>>>>>>>>>>>>>>>>>>>>> Calcite-extended incremental query optimizer to a type of
> > >>>>>>>>>>>>>>>>>>>>>>>>> periodic query called the ‘‘range query’’ in Alibaba’s data
> > >>>>>>>>>>>>>>>>>>>>>>>>> warehouse. It achieved cost savings of 80% on total CPU and
> > >>>>>>>>>>>>>>>>>>>>>>>>> memory consumption, and 60% on end-to-end execution time.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> All comments and suggestions are welcome. Thanks and happy
> > >>>>>>>>>>>>>>>>>>>>>>>>> holidays!
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>>>>>>>> Botong
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>> ~~~~~~~~~~~~~~~
> > >>>>>>>>>>>>>>>>>>> no mistakes
> > >>>>>>>>>>>>>>>>>>> ~~~~~~~~~~~~~~~~~~
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> Viliam Durina
> > >>>>>>> Jet Developer
> > >>>>>>>      hazelcast®
> > >>>>>>>
> > >>>>>>>  <https://www.hazelcast.com> 2 W 5th Ave, Ste 300 | San Mateo,
> > >> CA
> > >>>>> 94402 |
> > >>>>>>> USA
> > >>>>>>> +1 (650) 521-5453 <(650)%20521-5453> | hazelcast.com <
> > >> https://www.hazelcast.com>
> > >>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> >
> 

Re: Proposal to extend Calcite into a incremental query optimizer

Posted by Botong Huang <pk...@gmail.com>.
Hi Haisheng,

Thanks for the reminder. Yeah I've been occupied with several other
deadlines. We will try to come up with something by next week.

Best,
Botong

On Wed, Jul 7, 2021 at 6:05 PM Haisheng Yuan <hy...@apache.org> wrote:

> Hi Botong,
>
> We haven't heard from you for a while.
> Feel free to reach out if you get stuck or need help on rebasing code.
>
> Thanks,
> Haisheng
>
> On 2021/05/15 00:54:02, Botong Huang <pk...@gmail.com> wrote:
> > Hi all,
> >
> > Thank you all for the interest, and thanks Julian for the update!
> >
> > I am having problems uploading the pdf files into the jira CALCITE-4568
> > <https://issues.apache.org/jira/browse/CALCITE-4568>, so I attached the
> > slides in our code base:
> > https://github.com/alibaba/cost-based-incremental-optimizer/blob/main/Tempura_Calcite_presentation.pdf
> >
> > The slides contain a walking example of how Tempura expands its memo. The
> > current version of the code also has two e2e unit tests at
> > TvrOptimizationTest.java and TvrExecutionTest.java. Please feel free to
> > start playing with them, and feel free to reach out and possibly schedule
> > another meeting if needed.
> >
> > As agreed in the meeting, we will rebase our code to a newer version of
> > Calcite.
> >
> > Best,
> > Botong
> >
> > On Thu, May 13, 2021 at 12:47 PM Julian Hyde <jh...@gmail.com> wrote:
> >
> > > During the meeting we agreed to start progressing this contribution in
> > > the usual Apache Way, with conversations on the dev list and in the
> > > https://issues.apache.org/jira/browse/CALCITE-4568 <
> > > https://issues.apache.org/jira/browse/CALCITE-4568> JIRA case. So, it
> > > should be easy for you to participate.
> > >
> > > Botong said he would share the slides. (He might be unwilling to make
> > > them public, because they are his presentation for a conference that has
> > > not happened yet. Reach out to him one-to-one.)
> > >
> > > Next step is for someone on the Alibaba side to create a PR that is
> > > rebased on the latest Calcite master, and add a comment to the JIRA
> > > case. Then we can discuss what needs to be done for that PR. Code
> > > quality, adding comments, breaking up into smaller commits, additional
> > > tests, renaming packages/classes, restructuring into plugins are all
> > > possibilities.
> > >
> > > Our side of the bargain, as committers, is that we should review in a
> > > timely manner, and not move the goal posts: if the contributors make the
> > > changes we request then we will land this code in master in a reasonable
> > > amount of time.
> > >
> > > We also discussed incremental view maintenance (IVM). Tempura solves a
> > > more general problem (finding the optimal K steps to maintain a
> > > materialized view as data arrives in K points in time) but if we set
> > > K=2, we can generate a plan for how to update a materialized view given
> > > a delta table. The plan will be different based on cost - e.g. whether
> > > the delta table is small or large. This is a problem that many of our
> > > users would like to solve. It will exercise much of Tempura’s code base,
> > > and encourage contributions.
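The K=2 case above can be illustrated with a toy example. This is a minimal sketch, not Tempura's planner or cost model: the class IvmPlanChoice, its methods and the two cost constants are all assumptions made up for illustration. The view is a per-key sum, and the choice is between merging the delta into the existing view and recomputing the view from the full table.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a toy cost-based choice for maintaining a view at K=2. */
final class IvmPlanChoice {

  enum Plan { APPLY_DELTA, RECOMPUTE }

  // Assumed, made-up constants: a keyed merge into the view is taken to cost
  // more per row than a sequential scan of the base table.
  static final double MERGE_COST_PER_ROW = 4.0;
  static final double SCAN_COST_PER_ROW = 1.0;

  /** Picks the cheaper plan for the given base-table and delta-table sizes. */
  static Plan choose(long baseRows, long deltaRows) {
    double applyCost = MERGE_COST_PER_ROW * deltaRows;
    double recomputeCost = SCAN_COST_PER_ROW * (baseRows + deltaRows);
    return applyCost <= recomputeCost ? Plan.APPLY_DELTA : Plan.RECOMPUTE;
  }

  /** The incremental plan for a SUM-by-key view: add the delta's sums to the view. */
  static Map<String, Long> applyDelta(Map<String, Long> view,
                                      Map<String, Long> deltaSums) {
    Map<String, Long> updated = new HashMap<>(view);
    deltaSums.forEach((key, sum) -> updated.merge(key, sum, Long::sum));
    return updated;
  }

  public static void main(String[] args) {
    System.out.println(choose(1_000, 10));     // small delta
    System.out.println(choose(1_000, 1_000));  // delta as large as the base table
  }
}
```

With these assumed constants a small delta favors the incremental plan and a delta as large as the base table favors recomputation, which is the cost dependence the paragraph describes. A real optimizer would derive both costs from the plans in the memo rather than from fixed constants.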
> > >
> > > In my opinion, we should do IVM at launch. It should be the main example
> > > we use in conference talks, blog posts, etc. When people understand that
> > > case, we can explain how we generalize from K=2 to arbitrary K.
> > >
> > > Julian
> > >
> > >
> > > > On May 13, 2021, at 9:51 AM, Rui Wang <am...@apache.org> wrote:
> > > >
> > > > I apologize that I had a wrong impression on the meeting time (I
> > > > thought it should be on Thursday but it is Wednesday). I can follow up
> > > > your meeting records if you have any.
> > > >
> > > >
> > > > -Rui
> > > >
> > > > On Tue, May 11, 2021 at 8:17 PM Botong Huang <pk...@gmail.com> wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> This is a reminder that we are going to have our second discussion
> > > >> meeting tomorrow at 10-11pm PST. Please find the link below, everyone
> > > >> is welcome to join!
> > > >>
> > > >> Join Zoom Meeting
> > > >> https://uci.zoom.us/j/91986206610
> > > >>
> > > >> Meeting ID: 919 8620 6610
> > > >> One tap mobile
> > > >> +16699006833 <(669)%20900-6833>,,91986206610# US (San Jose)
> > > >> +12532158782 <(253)%20215-8782>,,91986206610# US (Tacoma)
> > > >>
> > > >> Dial by your location
> > > >>        +1 669 900 6833 <(669)%20900-6833> US (San Jose)
> > > >>        +1 253 215 8782 <(253)%20215-8782> US (Tacoma)
> > > >>        +1 346 248 7799 <(346)%20248-7799> US (Houston)
> > > >>        +1 301 715 8592 <(301)%20715-8592> US (Washington DC)
> > > >>        +1 312 626 6799 <(312)%20626-6799> US (Chicago)
> > > >>        +1 646 558 8656 <(646)%20558-8656> US (New York)
> > > >> Meeting ID: 919 8620 6610
> > > >> Find your local number: https://uci.zoom.us/u/acyXcc43Cd
> > > >>
> > > >> Join by Skype for Business
> > > >> https://uci.zoom.us/skype/91986206610
> > > >>
> > > >> Thanks,
> > > >> Botong
> > > >>
> > > >> On Wed, May 5, 2021 at 9:55 AM Botong Huang <pk...@gmail.com> wrote:
> > > >>
> > > >>> Hi Stamatis and all,
> > > >>>
> > > >>> Thanks for the interest! Let's tentatively schedule the next meeting
> > > >>> for next Wednesday, May 12, 10pm-11pm PST. Please let us know if any
> > > >>> new needs come up.
> > > >>>
> > > >>> Best,
> > > >>> Botong
> > > >>>
> > > >>> On Sun, May 2, 2021 at 2:59 PM Stamatis Zampetakis <zabetak@gmail.com> wrote:
> > > >>>
> > > >>>> Hello,
> > > >>>>
> > > >>>> I really regret missing the first meeting, sorry about that. I added
> > > >>>> my preferences in the document.
> > > >>>> I will make sure to attend the next one and help as much as I can.
> > > >>>>
> > > >>>> I didn't have the chance yet to go over the paper but will try to do
> > > >>>> it before the next meeting.
> > > >>>>
> > > >>>> For me the following dates are more convenient than others so it
> > > >>>> would be nice if we could arrange it then.
> > > >>>>
> > > >>>> Thu, May 6, 10pm PST
> > > >>>> Tue, May 12, 10pm PST
> > > >>>>
> > > >>>> Best,
> > > >>>> Stamatis
> > > >>>>
> > > >>>> On Sat, May 1, 2021 at 9:42 PM Julian Hyde <jh...@apache.org> wrote:
> > > >>>>
> > > >>>>> I have added my time preferences to the doc [1]. I am generally
> > > >>>>> available any evening Mon - Thu. How about we meet Monday 10th May?
> > > >>>>>
> > > >>>>> Stamatis, Jesus, Given the complexity of this work, I would very
> > > >>>>> much appreciate your insight, as experts in optimizer theory. Could
> > > >>>>> one of you join the next meeting? Of course we should choose a time
> > > >>>>> that works for everyone's schedule.
> > > >>>>>
> > > >>>>> Julian
> > > >>>>>
> > > >>>>> [1]
> > > >>>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > > >>>>>
> > > >>>>> On Wed, Apr 28, 2021 at 9:32 AM Botong Huang <pk...@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>> We didn't record it, we will try to record the following meetings.
> > > >>>>>> Please add your time preference in the docs, so that we can find a
> > > >>>>>> meeting time that works for more people.
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>> Botong
> > > >>>>>>
> > > >>>>>> On Wed, Apr 28, 2021 at 12:23 AM Viliam Durina <viliam@hazelcast.com> wrote:
> > > >>>>>>
> > > >>>>>>> Is there a recording available?
> > > >>>>>>> Viliam
> > > >>>>>>>
> > > >>>>>>> On Wed, 28 Apr 2021 at 00:15, Botong Huang <pk...@gmail.com> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi all,
> > > >>>>>>>>
> > > >>>>>>>> The meeting yesterday was fun and productive. As discussed, this
> > > >>>>>>>> is the call to schedule our second meeting.
> > > >>>>>>>>
> > > >>>>>>>> We encourage everyone to add their time preferences during
> > > >>>>>>>> 05/01 - 05/15 here:
> > > >>>>>>>>
> > > >>>>>>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > > >>>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>> Botong
> > > >>>>>>>>
> > > >>>>>>>> On Wed, Apr 21, 2021 at 5:19 PM Botong Huang <pkuhbt@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi all,
> > > >>>>>>>>> We've created a zoom meeting below for our meeting next Monday
> > > >>>>>>>>> (9pm-10:30pm PST on 04/26).
> > > >>>>>>>>> Talk to you all soon!
> > > >>>>>>>>>
> > > >>>>>>>>> Join Zoom Meeting
> > > >>>>>>>>> https://uci.zoom.us/j/91279732686
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> Meeting ID: 912 7973 2686
> > > >>>>>>>>> One tap mobile
> > > >>>>>>>>> +16699006833 <(669)%20900-6833>,,91279732686# US (San Jose)
> > > >>>>>>>>> +12532158782 <(253)%20215-8782>,,91279732686# US (Tacoma)
> > > >>>>>>>>>
> > > >>>>>>>>> Dial by your location
> > > >>>>>>>>> +1 669 900 6833 <(669)%20900-6833> US (San Jose)
> > > >>>>>>>>> +1 253 215 8782 <(253)%20215-8782> US (Tacoma)
> > > >>>>>>>>> +1 346 248 7799 <(346)%20248-7799> US (Houston)
> > > >>>>>>>>> +1 301 715 8592 <(301)%20715-8592> US (Washington DC)
> > > >>>>>>>>> +1 312 626 6799 <(312)%20626-6799> US (Chicago)
> > > >>>>>>>>> +1 646 558 8656 <(646)%20558-8656> US (New York)
> > > >>>>>>>>> Meeting ID: 912 7973 2686
> > > >>>>>>>>> Find your local number: https://uci.zoom.us/u/aykHTkJBh
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> Join by Skype for Business
> > > >>>>>>>>> https://uci.zoom.us/skype/91279732686
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks,
> > > >>>>>>>>> Botong
> > > >>>>>>>>>
> > > >>>>>>>>> On Tue, Apr 13, 2021 at 10:16 PM Botong Huang <pkuhbt@gmail.com> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Hi all,
> > > >>>>>>>>>>
> > > >>>>>>>>>> According to the preferences collected, we are tentatively
> > > >>>>>>>>>> scheduling our meeting at 9pm-10:30pm PST on 04/26 Monday.
> > > >>>>>>>>>>
> > > >>>>>>>>>> We will give a presentation about Tempura, followed by a free
> > > >>>>>>>>>> discussion.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Please let us know if there are any other new requests. A few
> > > >>>>>>>>>> days before the meeting, I will send out a zoom meeting link.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thanks,
> > > >>>>>>>>>> Botong
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Wed, Apr 7, 2021 at 2:46 PM Botong Huang <pkuhbt@gmail.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Hi Julian and all,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> We've posted the Tempura code base below. Feel free to take a
> > > >>>>>>>>>>> quick peek at the last five commits.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> https://github.com/alibaba/cost-based-incremental-optimizer/commits/main
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I've also opened a Jira (CALCITE-4568
> > > >>>>>>>>>>> <https://issues.apache.org/jira/browse/CALCITE-4568>), which
> > > >>>>>>>>>>> will serve as the umbrella Jira for the feature.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> In the meantime, we encourage everyone to enter the time
> > > >>>>>>>>>>> preferences for our first meeting here:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Thanks,
> > > >>>>>>>>>>> Botong
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Mon, Apr 5, 2021 at 3:59 PM Julian Hyde <jhyde.apache@gmail.com> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> I have added my time preferences to the doc.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Before we meet, could you publish a PR for us to review?
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Initial discussions will need to be about architecture and
> > > >>>>>>>>>>>> high-level design. So I would ask Calcite reviewers not to
> > > >>>>>>>>>>>> review the PR line-by-line (or to leave comments in GitHub)
> > > >>>>>>>>>>>> but try to understand the design holistically, and prepare
> > > >>>>>>>>>>>> questions/comments before the meeting.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Botong, Can you please create a Calcite JIRA case for this
> > > >>>>>>>>>>>> task? JIRA is how we track long-running tasks such as this.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Julian
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> On Apr 3, 2021, at 5:15 PM, Botong Huang <pkuhbt@gmail.com> wrote:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Hi all,
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Apologies for the delay. It took us some time to clean up
> > > >>>>>>>>>>>>> our code base and publicly release it (which will be out
> > > >>>>>>>>>>>>> soon) for a quick peek.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> We are ready to present our work. Let's schedule a time for
> > > >>>>>>>>>>>>> a Zoom meeting and discuss how to integrate Tempura into
> > > >>>>>>>>>>>>> Calcite.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Since some of our team members are in China, we prefer the
> > > >>>>>>>>>>>>> time slot of 7:00pm-11:30pm PST any day. I've added our
> > > >>>>>>>>>>>>> time preference in the shared doc below.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> We encourage everyone to add their time preferences (during
> > > >>>>>>>>>>>>> 04/15-04/30) in this doc. In a week or so, we will try to
> > > >>>>>>>>>>>>> settle a time that works for most.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>> Botong
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On Sat, Jan 30, 2021 at 9:19 PM Botong Huang <pkuhbt@gmail.com> wrote:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Hi Julian and Rui,
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Sounds good to us. Please give us some time to prepare
> > > >>>>>>>>>>>>>> some slides for the meeting.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> I've created a doc below for discussion. Please feel free
> > > >>>>>>>>>>>>>> to add more in here:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>> Botong
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On Thu, Jan 28, 2021 at 11:18 AM Julian Hyde <jhyde.apache@gmail.com> wrote:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> PS The “editable doc” that Rui refers to is also a good
> > > >>>>>>>>>>>>>>> idea. I think we should create it to continue discussion
> > > >>>>>>>>>>>>>>> after the first meeting.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Julian
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> On Jan 28, 2021, at 11:16 AM, Julian Hyde <jhyde.apache@gmail.com> wrote:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> I think good next steps would be a PR and a meeting. The
> > > >>>>>>>>>>>>>>>> PR will allow us to read the code, but I think we should
> > > >>>>>>>>>>>>>>>> do the first round of questions at the meeting. The
> > > >>>>>>>>>>>>>>>> meeting could perhaps start with a presentation of the
> > > >>>>>>>>>>>>>>>> paper (do you have some slides you are planning to
> > > >>>>>>>>>>>>>>>> present at VLDB, Botong?) and then move on to questions
> > > >>>>>>>>>>>>>>>> about the concepts, which alternatives were considered,
> > > >>>>>>>>>>>>>>>> and how the concepts map onto other current and future
> > > >>>>>>>>>>>>>>>> concepts in calcite.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> I don’t think we should start “reviewing” the PR
> > > >>>>>>>>>>>>>>>> line-by-line at this point. We need to understand the
> > > >>>>>>>>>>>>>>>> high-level concepts and design choices. If we start
> > > >>>>>>>>>>>>>>>> reviewing the PR we will get lost in the details.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> I know that integrating a major change is hard; I doubt
> > > >>>>>>>>>>>>>>>> that we will be able to integrate everything, but we can
> > > >>>>>>>>>>>>>>>> build understanding about where calcite needs to go, and
> > > >>>>>>>>>>>>>>>> I hope integrate a good amount of code to help us get
> > > >>>>>>>>>>>>>>>> there.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> As I said before, after the integration I would like
> > > >>>>>>>>>>>>>>>> people to be able to experiment with it and use it in
> > > >>>>>>>>>>>>>>>> their production systems. That way, it will not be an
> > > >>>>>>>>>>>>>>>> experiment that withers, but a feature set that
> > > >>>>>>>>>>>>>>>> integrates with other calcite features and gets stronger
> > > >>>>>>>>>>>>>>>> over time.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Julian
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> On Jan 28, 2021, at 10:54 AM, Rui Wang <amaliujia@apache.org> wrote:
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> For me to participate in the discussion for the above
> > > >>>>>>>>>>>>>>>>> questions, I will need to read a lot more to know
> > > >>>>>>>>>>>>>>>>> relevant context and likely ask lots of questions :-).
> > > >>>>>>>>>>>>>>>>> An editable doc is probably good for questions and
> > > >>>>>>>>>>>>>>>>> back-and-forth discussion.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> -Rui
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> On Thu, Jan 28, 2021 at 10:50 AM Rui Wang <amaliujia@apache.org> wrote:
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> I am also happy to help push this work into Calcite
> > > >>>>>>>>>>>>>>>>>> (review code and doc, etc.).
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> While you can share your code so people can have
> > > >> more
> > > >>>>> idea
> > > >>>>>>> how
> > > >>>>>>>>>>>> it is
> > > >>>>>>>>>>>>>>>>>> implemented, I think it would be also nice to have a
> > > >>>> doc
> > > >>>>> to
> > > >>>>>>>>>>>> discuss
> > > >>>>>>>>>>>>>>> open
> > > >>>>>>>>>>>>>>>>>> questions above. Some points that I am copying over
> > > >>>> here:
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> 1. Can this solution be compatible with existing
> > > >>>>> solutions in
> > > >>>>>>>>>>>> Calcite
> > > >>>>>>>>>>>>>>>>>> Streaming, materialized view maintenance, and
> > > >>>> multi-query
> > > >>>>>>>>>>>> optimization
> > > >>>>>>>>>>>>>>>>>> (Sigma and Delta relational operators, lattice, and
> > > >>>> Spool
> > > >>>>>>>>>>>> operator)?
> > > >>>>>>>>>>>>>>>>>> 2. Did you find that you needed two separate cost
> > > >>>> models
> > > >>>>> -
> > > >>>>>>> one
> > > >>>>>>>>>>>> for
> > > >>>>>>>>>>>>>>> “view
> > > >>>>>>>>>>>>>>>>>> maintenance” and another for “user queries” - since
> > > >>>> the
> > > >>>>>>>>>>>> objectives of
> > > >>>>>>>>>>>>>>> each
> > > >>>>>>>>>>>>>>>>>> activity are so different?
> > > >>>>>>>>>>>>>>>>>> 3. Whether this work will hasten the arrival of
> > > >>>>>>> multi-objective
> > > >>>>>>>>>>>>>>> parametric
> > > >>>>>>>>>>>>>>>>>> query optimization [1] in Calcite.
> > > >>>>>>>>>>>>>>>>>> 4. Probably SQL shell support.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> [1]:
> > > >>>>>>>>>>>>>>>>>> https://cacm.acm.org/magazines/2017/10/221322-multi-objective-parametric-query-optimization/fulltext
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> -Rui
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> On Wed, Jan 27, 2021 at 6:52 PM Albert <
> > > >>>>> zinking3@gmail.com>
> > > >>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> It would be very nice to see a POC of your work.
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> On Thu, Jan 28, 2021 at 10:21 AM Botong Huang <
> > > >>>>>>>>>>>> pkuhbt@gmail.com>
> > > >>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Hi Julian,
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Just wondering if there are any updates? We are
> > > >>>>> wondering
> > > >>>>>>> if
> > > >>>>>>>> it
> > > >>>>>>>>>>>>>>> would
> > > >>>>>>>>>>>>>>>>>>> help
> > > >>>>>>>>>>>>>>>>>>>> to post our code for a quick preview.
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>>>>>>>> Botong
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> On Fri, Jan 1, 2021 at 11:04 AM Botong Huang <
> > > >>>>>>>> pkuhbt@gmail.com
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> Hi Julian,
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> Thanks for your interest! Sure, let's figure out a
> > > >>>> plan
> > > >>>>>>> that
> > > >>>>>>>>>>>> best
> > > >>>>>>>>>>>>>>>>>>> benefits
> > > >>>>>>>>>>>>>>>>>>>>> the community. Here are some clarifications that
> > > >>>>> hopefully
> > > >>>>>>>>>>>> answer
> > > >>>>>>>>>>>>>>> your
> > > >>>>>>>>>>>>>>>>>>>>> questions.
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> In our work (Tempura), users specify the set of
> > > >>>> time
> > > >>>>>>> points
> > > >>>>>>>> to
> > > >>>>>>>>>>>>>>>>>>> consider
> > > >>>>>>>>>>>>>>>>>>>>> running and a cost function that expresses users'
> > > >>>>>>> preference
> > > >>>>>>>>>>>> over
> > > >>>>>>>>>>>>>>>>>>> time;
> > > >>>>>>>>>>>>>>>>>>>>> Tempura will generate the best incremental plan
> > > >>>> that
> > > >>>>>>>>>>>> minimizes the
> > > >>>>>>>>>>>>>>>>>>>> overall
> > > >>>>>>>>>>>>>>>>>>>>> cost function.
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> In this incremental plan, the sub-plans at
> > > >>>> different
> > > >>>>> time
> > > >>>>>>>>>>>> points
> > > >>>>>>>>>>>>>>> can
> > > >>>>>>>>>>>>>>>>>>> be
> > > >>>>>>>>>>>>>>>>>>>>> different from each other, as opposed to
> > > >> identical
> > > >>>>> plans
> > > >>>>>>> in
> > > >>>>>>>>>>>> all
> > > >>>>>>>>>>>>>>> delta
> > > >>>>>>>>>>>>>>>>>>>> runs
> > > >>>>>>>>>>>>>>>>>>>>> as in streaming or IVM. As mentioned in §2.1 of
> > > >> the
> > > >>>>>>> Tempura
> > > >>>>>>>>>>>> paper,
> > > >>>>>>>>>>>>>>> we
> > > >>>>>>>>>>>>>>>>>>> can
> > > >>>>>>>>>>>>>>>>>>>>> mimic the current streaming implementation by
> > > >>>>> specifying
> > > >>>>>>> two
> > > >>>>>>>>>>>>>>> (logical)
> > > >>>>>>>>>>>>>>>>>>>> time
> > > >>>>>>>>>>>>>>>>>>>>> points in Tempura, representing the initial run
> > > >> and
> > > >>>>> later
> > > >>>>>>>>>>>> delta
> > > >>>>>>>>>>>>>>> runs
> > > >>>>>>>>>>>>>>>>>>>>> respectively. In general, note that Tempura
> > > >>>> supports
> > > >>>>>>> various
> > > >>>>>>>>>>>> forms
> > > >>>>>>>>>>>>>>> of
> > > >>>>>>>>>>>>>>>>>>>>> incremental computing, not only the small-delta
> > > >>>>>>> append-only
> > > >>>>>>>>>>>> data
> > > >>>>>>>>>>>>>>>>>>> model in
> > > >>>>>>>>>>>>>>>>>>>>> streaming systems. That's why we believe Tempura
> > > >>>>> subsumes
> > > >>>>>>>> the
> > > >>>>>>>>>>>>>>> current
> > > >>>>>>>>>>>>>>>>>>>>> streaming support, as well as any IVM
> > > >>>> implementations.
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> About the cost model, we did not come up with a
> > > >>>>> separate
> > > >>>>>>>> cost
> > > >>>>>>>>>>>>>>> model,
> > > >>>>>>>>>>>>>>>>>>> but
> > > >>>>>>>>>>>>>>>>>>>>> rather extended the existing one. Similar to
> > > >>>>>>> multi-objective
> > > >>>>>>>>>>>>>>>>>>>> optimization,
> > > >>>>>>>>>>>>>>>>>>>>> costs incurred at different time points are
> > > >>>> considered
> > > >>>>>>>>>>>> different
> > > >>>>>>>>>>>>>>>>>>>>> dimensions. Tempura lets users supply a function
> > > >>>> that
> > > >>>>>>>>>>>> converts this
> > > >>>>>>>>>>>>>>>>>>> cost
> > > >>>>>>>>>>>>>>>>>>>>> vector into a final cost. So under this function,
> > > >>>> any
> > > >>>>> two
> > > >>>>>>>>>>>>>>> incremental
> > > >>>>>>>>>>>>>>>>>>>> plans
> > > >>>>>>>>>>>>>>>>>>>>> are still comparable and there is an overall
> > > >>>> optimum.
> > > >>>>> I
> > > >>>>>>>> guess
> > > >>>>>>>>>>>> we
> > > >>>>>>>>>>>>>>> can
> > > >>>>>>>>>>>>>>>>>>> go
> > > >>>>>>>>>>>>>>>>>>>>> down the route of multi-objective parametric
> > > >> query
> > > >>>>>>>>>>>> optimization
> > > >>>>>>>>>>>>>>>>>>> instead
> > > >>>>>>>>>>>>>>>>>>>> if
> > > >>>>>>>>>>>>>>>>>>>>> there is a need.
> > > >>>>>>>>>>>>>>>>>>>>>
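To make the cost-vector idea above concrete, here is a minimal sketch in Java. It is not taken from the Tempura code base: the class name IncrementalCostSketch, the methods weightedSum and cheapest, and the example weights are all hypothetical. It only illustrates how a user-supplied function can fold the costs incurred at each time point into one scalar, so that any two incremental plans stay comparable.

```java
import java.util.function.ToDoubleFunction;

/** Hypothetical illustration only; none of these names exist in Calcite or Tempura. */
final class IncrementalCostSketch {

    /** Builds a user cost function: a weighted sum over the per-time-point costs. */
    static ToDoubleFunction<double[]> weightedSum(double[] weights) {
        return costs -> {
            if (costs.length != weights.length) {
                throw new IllegalArgumentException("one cost per time point expected");
            }
            double total = 0.0;
            for (int i = 0; i < costs.length; i++) {
                total += weights[i] * costs[i];
            }
            return total;
        };
    }

    /** Returns the index of the plan whose cost vector has the lowest final cost. */
    static int cheapest(double[][] planCosts, ToDoubleFunction<double[]> costFunction) {
        int best = 0;
        for (int i = 1; i < planCosts.length; i++) {
            if (costFunction.applyAsDouble(planCosts[i])
                    < costFunction.applyAsDouble(planCosts[best])) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Two time points: work done early is assumed cheap (weight 0.5),
        // work done at the final time point is assumed expensive (weight 2.0).
        ToDoubleFunction<double[]> f = weightedSum(new double[] {0.5, 2.0});
        double[][] plans = {
            {0.0, 10.0}, // plan 0: do everything at the last time point
            {6.0, 3.0},  // plan 1: do most of the work early
        };
        System.out.println("cheapest plan index: " + cheapest(plans, f));
    }
}
```

With these assumed weights, plan 0 costs 20.0 and plan 1 costs 9.0, so the sketch picks plan 1. A different user function (for example, one that only counts the last time point) could rank the same two plans the other way.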
> > > >>>>>>>>>>>>>>>>>>>>> Next on materialized views and multi-query
> > > >>>>> optimization,
> > > >>>>>>>>>>>> since our
> > > >>>>>>>>>>>>>>>>>>>>> multi-time-point plan naturally involves
> > > >>>> materializing
> > > >>>>>>>>>>>> intermediate
> > > >>>>>>>>>>>>>>>>>>>> results
> > > >>>>>>>>>>>>>>>>>>>>> for later time points, we need to solve the
> > > >>>> problem of
> > > >>>>>>>>>>>> choosing
> > > >>>>>>>>>>>>>>>>>>>>> materializations and include the cost of saving
> > > >> and
> > > >>>>>>> reusing
> > > >>>>>>>>>>>> the
> > > >>>>>>>>>>>>>>>>>>>>> materializations when costing and comparing
> > > >> plans.
> > > >>>> We
> > > >>>>>>>>>>>> borrowed the
> > > >>>>>>>>>>>>>>>>>>>>> multi-query optimization techniques to solve this
> > > >>>>> problem
> > > >>>>>>>> even
> > > >>>>>>>>>>>>>>> though
> > > >>>>>>>>>>>>>>>>>>> we
> > > >>>>>>>>>>>>>>>>>>>>> are looking at a single query. As a result, we
> > > >>>> think
> > > >>>>> our
> > > >>>>>>>> work
> > > >>>>>>>>>>>> is
> > > >>>>>>>>>>>>>>>>>>>> orthogonal
> > > >>>>>>>>>>>>>>>>>>>>> to Calcite's facilities around utilizing existing
> > > >>>>> views,
> > > >>>>>>>>>>>> lattice
> > > >>>>>>>>>>>>>>> etc.
> > > >>>>>>>>>>>>>>>>>>> We
> > > >>>>>>>>>>>>>>>>>>>> do
> > > >>>>>>>>>>>>>>>>>>>>> feel that the multi-query optimization component
> > > >>>> can
> > > >>>>> be
> > > >>>>>>>>>>>> adopted to
> > > >>>>>>>>>>>>>>>>>>> wider
> > > >>>>>>>>>>>>>>>>>>>>> use, but probably need more suggestions from the
> > > >>>>>>> community.
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> Lastly, our current implementation is set up in
> > > >>>> Java
> > > >>>>> code;
> > > >>>>>>>> it
> > > >>>>>>>>>>>>>>> should
> > > >>>>>>>>>>>>>>>>>>> be
> > > >>>>>>>>>>>>>>>>>>>>> straightforward to hook it up with SQL shell.
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>>>>>>>>> Botong
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>> On Mon, Dec 28, 2020 at 6:44 PM Julian Hyde <
> > > >>>>>>>>>>>>>>> jhyde.apache@gmail.com>
> > > >>>>>>>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> Botong,
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> This is very exciting; congratulations on this
> > > >>>>> research,
> > > >>>>>>>> and
> > > >>>>>>>>>>>> thank
> > > >>>>>>>>>>>>>>>>>>> you
> > > >>>>>>>>>>>>>>>>>>>>>> for contributing it back to Calcite.
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> The research touches several areas in Calcite:
> > > >>>>> streaming,
> > > >>>>>>>>>>>>>>>>>>> materialized
> > > >>>>>>>>>>>>>>>>>>>>>> view maintenance, and multi-query optimization.
> > > >>>> As we
> > > >>>>>>> have
> > > >>>>>>>>>>>> already
> > > >>>>>>>>>>>>>>>>>>> some
> > > >>>>>>>>>>>>>>>>>>>>>> solutions in those areas (Sigma and Delta
> > > >>>> relational
> > > >>>>>>>>>>>> operators,
> > > >>>>>>>>>>>>>>>>>>> lattice,
> > > >>>>>>>>>>>>>>>>>>>>>> and Spool operator), it will be interesting to
> > > >> see
> > > >>>>>>> whether
> > > >>>>>>>>>>>> we can
> > > >>>>>>>>>>>>>>>>>>> make
> > > >>>>>>>>>>>>>>>>>>>> them
> > > >>>>>>>>>>>>>>>>>>>>>> compatible, or whether one concept can subsume
> > > >>>>> others.
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> Your work differs from streaming queries in that
> > > >>>> your
> > > >>>>>>>>>>>> relations
> > > >>>>>>>>>>>>>>> are
> > > >>>>>>>>>>>>>>>>>>> used
> > > >>>>>>>>>>>>>>>>>>>>>> by “external” user queries, whereas in pure
> > > >>>> streaming
> > > >>>>>>>>>>>> queries, the
> > > >>>>>>>>>>>>>>>>>>> only
> > > >>>>>>>>>>>>>>>>>>>>>> activity is the change propagation. Did you find
> > > >>>>> that you
> > > >>>>>>>>>>>> needed
> > > >>>>>>>>>>>>>>> two
> > > >>>>>>>>>>>>>>>>>>>>>> separate cost models - one for “view
> > > >> maintenance”
> > > >>>> and
> > > >>>>>>>>>>>> another for
> > > >>>>>>>>>>>>>>>>>>> “user
> > > >>>>>>>>>>>>>>>>>>>>>> queries” - since the objectives of each activity
> > > >>>> are
> > > >>>>> so
> > > >>>>>>>>>>>> different?
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> I wonder whether this work will hasten the
> > > >>>> arrival of
> > > >>>>>>>>>>>>>>> multi-objective
> > > >>>>>>>>>>>>>>>>>>>>>> parametric query optimization [1] in Calcite.
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> I will make time over the next few days to read
> > > >>>> and
> > > >>>>>>> digest
> > > >>>>>>>>>>>> your
> > > >>>>>>>>>>>>>>>>>>> paper.
> > > >>>>>>>>>>>>>>>>>>>>>> Then I expect that we will have a back-and-forth
> > > >>>>> process
> > > >>>>>>> to
> > > >>>>>>>>>>>> create
> > > >>>>>>>>>>>>>>>>>>>>>> something that will be useful for the broader
> > > >>>>> community.
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> One thing will be particularly useful: making
> > > >> this
> > > >>>>>>>>>>>> functionality
> > > >>>>>>>>>>>>>>>>>>>>>> available from a SQL shell, so that people can
> > > >>>>> experiment
> > > >>>>>>>>>>>> with
> > > >>>>>>>>>>>>>>> this
> > > >>>>>>>>>>>>>>>>>>>>>> functionality without writing Java code or
> > > >>>> setting up
> > > >>>>>>>> complex
> > > >>>>>>>>>>>>>>>>>>> databases
> > > >>>>>>>>>>>>>>>>>>>> and
> > > >>>>>>>>>>>>>>>>>>>>>> metadata. I have in mind something like the
> > > >> simple
> > > >>>>> DDL
> > > >>>>>>>>>>>> operations
> > > >>>>>>>>>>>>>>>>>>> that
> > > >>>>>>>>>>>>>>>>>>>> are
> > > >>>>>>>>>>>>>>>>>>>>>> available in Calcite’s ’server’ module. I wonder
> > > >>>>> whether
> > > >>>>>>> we
> > > >>>>>>>>>>>> could
> > > >>>>>>>>>>>>>>>>>>> devise
> > > >>>>>>>>>>>>>>>>>>>>>> some kind of SQL syntax for a “multi-query”.
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> Julian
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>> [1]
> > > >>>>>>>>>>>>>>>>>>>>>> https://cacm.acm.org/magazines/2017/10/221322-multi-objective-parametric-query-optimization/fulltext
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>> On Dec 23, 2020, at 8:55 PM, Botong Huang <
> > > >>>>>>>> pkuhbt@gmail.com
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>> Thanks Aron for pointing this out. To see the
> > > >>>>> figure,
> > > >>>>>>>> please
> > > >>>>>>>>>>>>>>> refer
> > > >>>>>>>>>>>>>>>>>>> to
> > > >>>>>>>>>>>>>>>>>>>>>> Fig
> > > >>>>>>>>>>>>>>>>>>>>>>> 3(a) in our paper:
> > > >>>>>>>>>>>>>>>>>>>>>>> https://kai-zeng.github.io/papers/tempura-vldb2021.pdf
> > > >>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>>>>>>>>>>>> Botong
> > > >>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>> On Wed, Dec 23, 2020 at 7:20 PM JiaTao Tao <
> > > >>>>>>>>>>>> taojiatao@gmail.com>
> > > >>>>>>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>> Seems interesting, the pic can not be seen in
> > > >>>> the
> > > >>>>> mail,
> > > >>>>>>>>>>>> may you
> > > >>>>>>>>>>>>>>>>>>> open
> > > >>>>>>>>>>>>>>>>>>>> a
> > > >>>>>>>>>>>>>>>>>>>>>> JIRA
> > > >>>>>>>>>>>>>>>>>>>>>>>> for this, people who are interested in this
> > > >> can
> > > >>>>>>> subscribe
> > > >>>>>>>>>>>> to the
> > > >>>>>>>>>>>>>>>>>>>> JIRA?
> > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>> Regards!
> > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>> Aron Tao
> > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>> Botong Huang <bo...@apache.org>
> > > >> wrote on Thu, Dec 24, 2020
> > > >>>>>>>> at 3:18 AM:
> > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> This is a proposal to extend the Calcite
> > > >>>> optimizer
> > > >>>>>>> into
> > > >>>>>>>> a
> > > >>>>>>>>>>>>>>> general
> > > >>>>>>>>>>>>>>>>>>>>>>>>> incremental query optimizer, based on our
> > > >>>> research
> > > >>>>>>> paper
> > > >>>>>>>>>>>>>>>>>>> published
> > > >>>>>>>>>>>>>>>>>>>> in
> > > >>>>>>>>>>>>>>>>>>>>>>>> VLDB
> > > >>>>>>>>>>>>>>>>>>>>>>>>> 2021:
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Tempura: a general cost-based optimizer
> > > >>>> framework
> > > >>>>> for
> > > >>>>>>>>>>>>>>> incremental
> > > >>>>>>>>>>>>>>>>>>>> data
> > > >>>>>>>>>>>>>>>>>>>>>>>>> processing
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> We also have a demo in SIGMOD 2020
> > > >> illustrating
> > > >>>>> how
> > > >>>>>>>>>>>> Alibaba’s
> > > >>>>>>>>>>>>>>>>>>> data
> > > >>>>>>>>>>>>>>>>>>>>>>>>> warehouse is planning to use this incremental
> > > >>>>> query
> > > >>>>>>>>>>>> optimizer
> > > >>>>>>>>>>>>>>> to
> > > >>>>>>>>>>>>>>>>>>>>>>>> alleviate
> > > >>>>>>>>>>>>>>>>>>>>>>>>> cluster-wise resource skewness:
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Grosbeak: A Data Warehouse Supporting
> > > >>>>> Resource-Aware
> > > >>>>>>>>>>>>>>> Incremental
> > > >>>>>>>>>>>>>>>>>>>>>>>> Computing
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> To the best of our knowledge, this is the first
> > > >>>> general
> > > >>>>>>>>>>>> cost-based
> > > >>>>>>>>>>>>>>>>>>>>>> incremental
> > > >>>>>>>>>>>>>>>>>>>>>>>>> optimizer that can find the best plan across
> > > >>>>> multiple
> > > >>>>>>>>>>>> families
> > > >>>>>>>>>>>>>>> of
> > > >>>>>>>>>>>>>>>>>>>>>>>>> incremental computing methods, including IVM,
> > > >>>>>>> Streaming,
> > > >>>>>>>>>>>>>>>>>>> DBToaster,
> > > >>>>>>>>>>>>>>>>>>>>>> etc.
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Experiments (in the paper) show that the
> > > >>>>> generated
> > > >>>>>>> best
> > > >>>>>>>>>>>> plan
> > > >>>>>>>>>>>>>>> is
> > > >>>>>>>>>>>>>>>>>>>>>>>>> consistently much better than the plans from
> > > >>>> each
> > > >>>>>>>>>>>> individual
> > > >>>>>>>>>>>>>>>>>>> method
> > > >>>>>>>>>>>>>>>>>>>>>>>> alone.
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> In general, incremental query planning is
> > > >>>> central
> > > >>>>> to
> > > >>>>>>>>>>>> database
> > > >>>>>>>>>>>>>>>>>>> view
> > > >>>>>>>>>>>>>>>>>>>>>>>>> maintenance and stream processing systems,
> > > >> and
> > > >>>> is
> > > >>>>>>> being
> > > >>>>>>>>>>>>>>> adopted
> > > >>>>>>>>>>>>>>>>>>> in
> > > >>>>>>>>>>>>>>>>>>>>>>>> active
> > > >>>>>>>>>>>>>>>>>>>>>>>>> databases, resumable query execution,
> > > >>>> approximate
> > > >>>>>>> query
> > > >>>>>>>>>>>>>>>>>>> processing,
> > > >>>>>>>>>>>>>>>>>>>>>> etc.
> > > >>>>>>>>>>>>>>>>>>>>>>>> We
> > > >>>>>>>>>>>>>>>>>>>>>>>>> are hoping that this feature can help
> > > >> widen
> > > >>>> the
> > > >>>>>>>>>>>> spectrum of
> > > >>>>>>>>>>>>>>>>>>>>>> Calcite,
> > > >>>>>>>>>>>>>>>>>>>>>>>>> solicit more use cases and adoption of
> > > >> Calcite.
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Below is a brief description of the technical
> > > >>>>> details.
> > > >>>>>>>>>>>> Please
> > > >>>>>>>>>>>>>>>>>>> refer
> > > >>>>>>>>>>>>>>>>>>>> to
> > > >>>>>>>>>>>>>>>>>>>>>>>> the
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Tempura paper for more details. We are also
> > > >>>>> working
> > > >>>>>>> on a
> > > >>>>>>>>>>>>>>> journal
> > > >>>>>>>>>>>>>>>>>>>>>> version
> > > >>>>>>>>>>>>>>>>>>>>>>>> of
> > > >>>>>>>>>>>>>>>>>>>>>>>>> the paper with more implementation details.
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Currently the query plan generated by Calcite
> > > >>>> is
> > > >>>>> meant
> > > >>>>>>>> to
> > > >>>>>>>>>>>> be
> > > >>>>>>>>>>>>>>>>>>>> executed
> > > >>>>>>>>>>>>>>>>>>>>>>>>> all at once. In the proposal,
> > > >> Calcite’s
> > > >>>>> memo
> > > >>>>>>> will
> > > >>>>>>>>>>>> be
> > > >>>>>>>>>>>>>>>>>>> extended
> > > >>>>>>>>>>>>>>>>>>>>>> with
> > > >>>>>>>>>>>>>>>>>>>>>>>>> temporal information so that it is capable of
> > > >>>>>>> generating
> > > >>>>>>>>>>>>>>>>>>> incremental
> > > >>>>>>>>>>>>>>>>>>>>>>>> plans
> > > >>>>>>>>>>>>>>>>>>>>>>>>> that include multiple sub-plans to execute at
> > > >>>>>>> different
> > > >>>>>>>>>>>> time
> > > >>>>>>>>>>>>>>>>>>> points.
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> The main idea is to view each table as one
> > > >> that
> > > >>>>>>> changes
> > > >>>>>>>>>>>> over
> > > >>>>>>>>>>>>>>> time
> > > >>>>>>>>>>>>>>>>>>>>>> (Time
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Varying Relations (TVR)). To achieve that we
> > > >>>>>>> introduced
> > > >>>>>>>>>>>>>>>>>>> TvrMetaSet
> > > >>>>>>>>>>>>>>>>>>>>>> into
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Calcite’s memo besides RelSet and RelSubset
> > > >> to
> > > >>>>> track
> > > >>>>>>>>>>>> related
> > > >>>>>>>>>>>>>>>>>>> RelSets
> > > >>>>>>>>>>>>>>>>>>>>>> of a
> > > >>>>>>>>>>>>>>>>>>>>>>>>> changing table (e.g. snapshot of the table at
> > > >>>>> certain
> > > >>>>>>>>>>>> time,
> > > >>>>>>>>>>>>>>>>>>> delta of
> > > >>>>>>>>>>>>>>>>>>>>>> the
> > > >>>>>>>>>>>>>>>>>>>>>>>>> table between two time points, etc.).
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
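As a rough illustration of the snapshot/delta vocabulary used above, here is a small Java sketch. It models only the data semantics of a time-varying relation under an append-only assumption; it is not the TvrMetaSet memo structure itself, and the class TvrSketch and its methods are hypothetical names, not part of Calcite or Tempura. It shows the identity that a later snapshot equals an earlier snapshot plus the delta between the two time points.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical append-only model of a time-varying relation (TVR). */
final class TvrSketch {
    // Rows that arrived at each logical time point.
    private final TreeMap<Integer, List<String>> arrivals = new TreeMap<>();

    /** Records that a row was appended at the given time point. */
    void append(int time, String row) {
        arrivals.computeIfAbsent(time, t -> new ArrayList<>()).add(row);
    }

    /** Snapshot of the table at a time point: every row appended up to and including it. */
    List<String> snapshot(int time) {
        List<String> rows = new ArrayList<>();
        for (List<String> batch : arrivals.headMap(time, true).values()) {
            rows.addAll(batch);
        }
        return rows;
    }

    /** Delta between two time points: rows appended after 'from', up to and including 'to'. */
    List<String> delta(int from, int to) {
        List<String> rows = new ArrayList<>();
        for (List<String> batch : arrivals.subMap(from, false, to, true).values()) {
            rows.addAll(batch);
        }
        return rows;
    }

    public static void main(String[] args) {
        TvrSketch s = new TvrSketch();
        s.append(1, "a");
        s.append(2, "b");
        s.append(2, "c");

        // snapshot(t1) combined with delta(t1, t2) reproduces snapshot(t2).
        List<String> rebuilt = s.snapshot(1);
        rebuilt.addAll(s.delta(1, 2));
        System.out.println(rebuilt.equals(s.snapshot(2)));
    }
}
```

In the proposal, the snapshot and delta of one TVR would be separate RelSets in the memo rather than materialized row lists; the sketch only shows why those sets are related to each other.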
> > > >>>>>>>>>>>>>>>>>>>>>>>>> [image: image.png]
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> For example, in the above figure, each
> > > >> vertical
> > > >>>>> line
> > > >>>>>>> is a
> > > >>>>>>>>>>>>>>>>>>> TvrMetaSet
> > > >>>>>>>>>>>>>>>>>>>>>>>>> representing a TVR (S, R, S left outer join
> > > >> R,
> > > >>>>> etc.).
> > > >>>>>>>>>>>>>>> Horizontal
> > > >>>>>>>>>>>>>>>>>>>> lines
> > > >>>>>>>>>>>>>>>>>>>>>>>>> represent time. Each black dot in the grid
> > > >> is a
> > > >>>>>>> RelSet.
> > > >>>>>>>>>>>> Users
> > > >>>>>>>>>>>>>>> can
> > > >>>>>>>>>>>>>>>>>>>>>> write
> > > >>>>>>>>>>>>>>>>>>>>>>>> TVR
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Rewrite Rules to describe valid
> > > >> transformations
> > > >>>>>>> between
> > > >>>>>>>>>>>> these
> > > >>>>>>>>>>>>>>>>>>> dots.
> > > >>>>>>>>>>>>>>>>>>>>>> For
> > > >>>>>>>>>>>>>>>>>>>>>>>>> example, the blues lines are inter-TVR rules
> > > >>>> that
> > > >>>>>>>>>>>> describe how
> > > >>>>>>>>>>>>>>> to
> > > >>>>>>>>>>>>>>>>>>>>>> compute
> > > >>>>>>>>>>>>>>>>>>>>>>>>> certain RelSet of a TVR from RelSets of other
> > > >>>>> TVRs.
> > > >>>>>>> The
> > > >>>>>>>>>>>> red
> > > >>>>>>>>>>>>>>> lines
> > > >>>>>>>>>>>>>>>>>>>> are
> > > >>>>>>>>>>>>>>>>>>>>>>>>> intra-TVR rules that describe transformations
> > > >>>>> within a
> > > >>>>>>>>>>>> TVR. All
> > > >>>>>>>>>>>>>>>>>>> TVR
> > > >>>>>>>>>>>>>>>>>>>>>>>> rewrite
> > > >>>>>>>>>>>>>>>>>>>>>>>>> rules are logical rules. All existing Calcite
> > > >>>>> rules
> > > >>>>>>>> still
> > > >>>>>>>>>>>> work
> > > >>>>>>>>>>>>>>> in
> > > >>>>>>>>>>>>>>>>>>>> the
> > > >>>>>>>>>>>>>>>>>>>>>> new
> > > >>>>>>>>>>>>>>>>>>>>>>>>> volcano system without modification.
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> All changes in this feature will consist of
> > > >>>> four
> > > >>>>>>> parts:
> > > >>>>>>>>>>>>>>>>>>>>>>>>> 1. Memo extension with TvrMetaSet
> > > >>>>>>>>>>>>>>>>>>>>>>>>> 2. Rule engine upgrade, capable of matching
> > > >>>>> TvrMetaSet
> > > >>>>>>>> and
> > > >>>>>>>>>>>>>>>>>>> RelNodes,
> > > >>>>>>>>>>>>>>>>>>>>>> as
> > > >>>>>>>>>>>>>>>>>>>>>>>>> well as links in between the nodes.
> > > >>>>>>>>>>>>>>>>>>>>>>>>> 3. A basic set of TvrRules, written using the
> > > >>>>> upgraded
> > > >>>>>>>>>>>> rule
> > > >>>>>>>>>>>>>>>>>>> engine
> > > >>>>>>>>>>>>>>>>>>>>>> API.
> > > >>>>>>>>>>>>>>>>>>>>>>>>> 4. Multi-query optimization, used to find the
> > > >>>> best
> > > >>>>>>>>>>>> incremental
> > > >>>>>>>>>>>>>>>>>>> plan
> > > >>>>>>>>>>>>>>>>>>>>>>>>> involving multiple time points.
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Note that this feature is an extension in
> > > >>>> nature
> > > >>>>> and
> > > >>>>>>>> thus
> > > >>>>>>>>>>>> when
> > > >>>>>>>>>>>>>>>>>>>>>> disabled,
> > > >>>>>>>>>>>>>>>>>>>>>>>>> does not change any existing Calcite
> > > >> behavior.
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Other than scenarios in the paper, we also
> > > >>>> applied
> > > >>>>>>> this
> > > >>>>>>>>>>>>>>>>>>>>>> Calcite-extended
> > > >>>>>>>>>>>>>>>>>>>>>>>>> incremental query optimizer to a type of
> > > >>>> periodic
> > > >>>>>>> query
> > > >>>>>>>>>>>> called
> > > >>>>>>>>>>>>>>>>>>> the
> > > >>>>>>>>>>>>>>>>>>>>>>>> ‘‘range
> > > >>>>>>>>>>>>>>>>>>>>>>>>> query’’ in Alibaba’s data warehouse. It
> > > >>>> achieved
> > > >>>>> cost
> > > >>>>>>>>>>>> savings
> > > >>>>>>>>>>>>>>> of
> > > >>>>>>>>>>>>>>>>>>> 80%
> > > >>>>>>>>>>>>>>>>>>>>>> on
> > > >>>>>>>>>>>>>>>>>>>>>>>>> total CPU and memory consumption, and 60% on
> > > >>>>>>> end-to-end
> > > >>>>>>>>>>>>>>> execution
> > > >>>>>>>>>>>>>>>>>>>>>> time.
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> All comments and suggestions are welcome.
> > > >>>> Thanks
> > > >>>>> and
> > > >>>>>>>> happy
> > > >>>>>>>>>>>>>>>>>>> holidays!
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>>>>>>>>>>>>>> Botong
> > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> --
> > > >>>>>>>>>>>>>>>>>>> ~~~~~~~~~~~~~~~
> > > >>>>>>>>>>>>>>>>>>> no mistakes
> > > >>>>>>>>>>>>>>>>>>> ~~~~~~~~~~~~~~~~~~
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> --
> > > >>>>>>> Viliam Durina
> > > >>>>>>> Jet Developer
> > > >>>>>>>      hazelcast®
> > > >>>>>>>
> > > >>>>>>>  <https://www.hazelcast.com> 2 W 5th Ave, Ste 300 | San Mateo,
> > > >> CA
> > > >>>>> 94402 |
> > > >>>>>>> USA
> > > >>>>>>> +1 (650) 521-5453 <(650)%20521-5453> | hazelcast.com <
> > > >> https://www.hazelcast.com>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> > >
> >
>