Posted to dev@calcite.apache.org by Vladimir Sitnikov <si...@gmail.com> on 2014/11/15 16:02:59 UTC

cost based decorrelation

Hi,

As far as I understand, Calcite always de-correlates.
Is there a reason why de-correlation is not a planning rule, but a
hard-coded action?

-- 
Regards,
Vladimir Sitnikov

Re: cost based decorrelation

Posted by Julian Hyde <ju...@hydromatic.net>.
For MPP analytic queries, correlated execution is a bad idea, so we don't really consider it. Also, decorrelation tends to be a "big" rewrite that affects an entire section of the RelNode tree.  (Field-trimming and materialized view substitution are other transformations in that category.)

But there's no reason in principle why decorrelation couldn't be made cost-based. I would like to have it in Calcite as an option. You could keep both the original (correlated) and the decorrelated queries in the plan, then use cost to choose between them. Your cost model should probably include the cost of a "restart" (when you set the correlating variables and restart a section of the dataflow graph).
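
To make the trade-off concrete, here is a minimal sketch of such a cost comparison. It is a toy model, not Calcite's API: the class, the method names, the cost formulas and all the numbers are assumptions made up for illustration. It charges the correlated plan one "restart" plus one run of the inner subplan per outer row, and charges the decorrelated plan a single full evaluation of the inner side plus a per-row join cost.

```java
/**
 * Toy cost comparison between a correlated and a decorrelated plan.
 * Hypothetical sketch: none of these names or formulas come from Calcite.
 */
final class DecorrelationCostSketch {

    enum Plan { CORRELATED, DECORRELATED }

    /**
     * Correlated execution: for every outer row, set the correlating
     * variables, restart the inner subplan (restartCost) and run it once.
     */
    static double correlatedCost(double outerRows, double restartCost,
                                 double innerCostPerRun) {
        return outerRows * (restartCost + innerCostPerRun);
    }

    /**
     * Decorrelated execution: evaluate the inner side once in full,
     * then join it to the outer rows.
     */
    static double decorrelatedCost(double outerRows, double innerFullCost,
                                   double joinCostPerOuterRow) {
        return innerFullCost + outerRows * joinCostPerOuterRow;
    }

    /** Pick the cheaper alternative; ties go to the decorrelated plan. */
    static Plan choose(double correlated, double decorrelated) {
        return correlated < decorrelated ? Plan.CORRELATED : Plan.DECORRELATED;
    }

    public static void main(String[] args) {
        // Few outer rows and a cheap probe: restarting 10 times is cheap.
        double c1 = correlatedCost(10, 5, 2);        // 10 * (5 + 2) = 70
        double d1 = decorrelatedCost(10, 1000, 1);   // 1000 + 10 * 1 = 1010
        System.out.println(choose(c1, d1));          // prints CORRELATED

        // Many outer rows: the per-row restart cost dominates.
        double c2 = correlatedCost(1_000_000, 5, 2);      // 7,000,000
        double d2 = decorrelatedCost(1_000_000, 1000, 1); // 1,001,000
        System.out.println(choose(c2, d2));               // prints DECORRELATED
    }
}
```

With these made-up numbers the correlated plan wins only when the outer side is small, which matches the point about MPP analytic queries: once the outer side is large, paying the restart cost per row is what makes correlated execution unattractive.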

Julian

