Posted to dev@calcite.apache.org by Konstantin Orlov <ko...@gridgain.com> on 2021/03/05 16:53:29 UTC

Derive is not being called for some newly created rels

Hi, folks!

I am facing a problem: derive is not being called for some newly created relations.

We have several join algorithms, so we implemented several physical nodes. During optimization, all of them are created from a logical join with the default distribution trait and are then converted to a single distribution by a passThrough call. But derive is invoked only for the first physical rel, because the other rels, after conversion from the logical rel, are added to an already optimized subset (subset.taskState == OPTIMIZED).
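
To make the symptom concrete, here is a toy model of the behaviour described above. It is a sketch only, not Calcite code: the Subset class, its register method, the State enum and the join names are all hypothetical stand-ins for the planner's subset, its task state and our physical nodes.

```java
import java.util.ArrayList;
import java.util.List;

public class DeriveSkipSketch {
    /** Hypothetical stand-in for the subset's task state. */
    enum State { NEW, OPTIMIZED }

    /** Hypothetical stand-in for a planner subset; not a Calcite class. */
    static class Subset {
        State state = State.NEW;
        final List<String> rels = new ArrayList<>();     // every rel added to the subset
        final List<String> derived = new ArrayList<>();  // rels for which "derive" ran

        /** Adds a physical rel; "derive" runs only while the subset is not yet optimized. */
        void register(String rel) {
            rels.add(rel);
            if (state != State.OPTIMIZED) {
                derived.add(rel);
                state = State.OPTIMIZED; // later rels arrive in an already optimized subset
            }
        }
    }

    public static void main(String[] args) {
        Subset subset = new Subset();
        // Hypothetical join algorithm names, each converted from the same logical join.
        subset.register("NestedLoopJoin");
        subset.register("MergeJoin");
        subset.register("HashJoin");
        System.out.println("rels:    " + subset.rels);
        System.out.println("derived: " + subset.derived);
    }
}
```

In this model all three rels end up in the subset, but only the first one appears in the derived list, which matches what we observe.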

A derive invocation is important for us because that is where we create the different distribution variants (such as a colocated join, sending one side to the other, or full rehashing).
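
As an illustration of the kind of choice made at that point, the sketch below enumerates join variants from the children's distributions. It is a simplified model under stated assumptions, not our actual implementation and not Calcite's API: the Strategy enum and the candidates method are hypothetical, and a distribution is reduced to the single column an input is hash-partitioned on.

```java
import java.util.ArrayList;
import java.util.List;

public class JoinDistributionSketch {
    /** Hypothetical names for the three cases mentioned above. */
    enum Strategy { COLOCATED, SEND_LEFT_TO_RIGHT, SEND_RIGHT_TO_LEFT, REHASH_BOTH }

    /**
     * Lists the join variants worth creating for the given child distributions.
     * Assumption: both inputs use the same hash function and partition count,
     * so an input partitioned on its join key needs no data movement.
     */
    static List<Strategy> candidates(String leftDistKey, String rightDistKey,
                                     String leftJoinKey, String rightJoinKey) {
        boolean leftReady = leftDistKey.equals(leftJoinKey);
        boolean rightReady = rightDistKey.equals(rightJoinKey);
        List<Strategy> out = new ArrayList<>();
        if (leftReady && rightReady) {
            out.add(Strategy.COLOCATED);            // no data movement needed
        } else if (rightReady) {
            out.add(Strategy.SEND_LEFT_TO_RIGHT);   // redistribute only the left input
        } else if (leftReady) {
            out.add(Strategy.SEND_RIGHT_TO_LEFT);   // redistribute only the right input
        }
        out.add(Strategy.REHASH_BOTH);              // always possible as a fallback
        return out;
    }

    public static void main(String[] args) {
        System.out.println(candidates("id", "id", "id", "id"));         // [COLOCATED, REHASH_BOTH]
        System.out.println(candidates("region", "id", "id", "id"));     // [SEND_LEFT_TO_RIGHT, REHASH_BOTH]
        System.out.println(candidates("region", "region", "id", "id")); // [REHASH_BOTH]
    }
}
```

If derive is never invoked for a rel, none of these variants are produced for it, and only the single-distribution version of that join remains in the search space.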

So the questions are:
1) Is this a bug, or is it by design?
2) If this was done on purpose, how can we best work around this limitation?

Thanks in advance!

-- 
Regards,
Konstantin Orlov



Re: Derive is not being called for some newly created rels

Posted by Konstantin Orlov <ko...@gridgain.com>.
Hi, Julian!

Thanks for your reply! I have filed a ticket for this issue [1].

[1] https://issues.apache.org/jira/browse/CALCITE-4542

-- 
Regards,
Konstantin Orlov




> On 15 Mar 2021, at 19:56, Julian Hyde <jh...@gmail.com> wrote:
> 
> Can you please log a bug? Describe symptoms, not what’s happening in the code.
> 
> (I don’t remember what “derive” is, even though I probably wrote it.)
> 
> Log the bug before you have a solution, and you’ll write a better bug.
> 


Re: Derive is not being called for some newly created rels

Posted by Julian Hyde <jh...@gmail.com>.
Can you please log a bug? Describe symptoms, not what’s happening in the code.

(I don’t remember what “derive” is, even though I probably wrote it.)

Log the bug before you have a solution, and you’ll write a better bug.

> On Mar 15, 2021, at 6:08 AM, Konstantin Orlov <ko...@gridgain.com> wrote:
> 
> Hi all!
> 
> I put together a reproducer for the case described, along with a possible fix (at least, it fixes this particular problem). [1]
> 
> Could someone please verify whether the reproducer and fix are valid?
> 
> [1] https://github.com/korlov42/calcite/commit/f4a5c2f01e0ec67156f7e91c6b5839dca1db6776
> 
> -- 
> Regards,
> Konstantin Orlov


Re: Derive is not being called for some newly created rels

Posted by Konstantin Orlov <ko...@gridgain.com>.
Hi all!

I put together a reproducer for the case described, along with a possible fix (at least, it fixes this particular problem). [1]

Could someone please verify whether the reproducer and fix are valid?

[1] https://github.com/korlov42/calcite/commit/f4a5c2f01e0ec67156f7e91c6b5839dca1db6776

-- 
Regards,
Konstantin Orlov



