Posted to dev@iceberg.apache.org by QH Yan <ma...@gmail.com> on 2021/06/03 21:33:19 UTC

Why is Expression serializable?

Curious about the use case where an Expression gets serialized and passed
around.

Thank you!
Qinhua
-- 
*Qinhua*

Re: Why is Expression serializable?

Posted by QH Yan <ma...@gmail.com>.
Got it! Thank you both!

On Thu, Jun 3, 2021 at 17:46 Ryan Blue <bl...@apache.org> wrote:

> Jack is right. We evaluate the residual for each task lazily, so it
> happens on the node that is reading data for the task. The residual is the
> part of the original filter expression that needs to be run in the task to
> produce the correct filtered rows. Since that is specific to the task, we
> don't want to build it during job planning.
>
> --
> Ryan Blue
>
-- 
*Qinhua*

Re: Why is Expression serializable?

Posted by Ryan Blue <bl...@apache.org>.
Jack is right. We evaluate the residual for each task lazily, so it happens
on the node that is reading data for the task. The residual is the part of
the original filter expression that needs to be run in the task to produce
the correct filtered rows. Since that is specific to the task, we don't
want to build it during job planning.
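
To illustrate the point about residuals, here is a minimal sketch in plain Java. It assumes a filter that is just an AND of "column == value" predicates, and the names ResidualSketch, Eq, and Task are hypothetical stand-ins, not Iceberg classes. The task carries the whole filter and only works out what is left to check when residual() is called, which would be on the node that reads the file.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: these types only imitate the idea, not Iceberg's API.
final class ResidualSketch {

  /** One "column == value" predicate; a filter is the AND of such predicates. */
  static final class Eq implements Serializable {
    private static final long serialVersionUID = 1L;
    final String column;
    final int value;

    Eq(String column, int value) {
      this.column = column;
      this.value = value;
    }
  }

  /** A task ships the full filter plus the partition values of its file. */
  static final class Task implements Serializable {
    private static final long serialVersionUID = 1L;
    final ArrayList<Eq> filter;
    final HashMap<String, Integer> partition;

    Task(List<Eq> filter, Map<String, Integer> partition) {
      this.filter = new ArrayList<>(filter);
      this.partition = new HashMap<>(partition);
    }

    /**
     * Computed on demand rather than at planning time. Returns the predicates
     * that still have to be checked row by row, or an empty Optional when the
     * partition values already rule out every row in the file.
     */
    Optional<List<Eq>> residual() {
      List<Eq> remaining = new ArrayList<>();
      for (Eq predicate : filter) {
        Integer known = partition.get(predicate.column);
        if (known == null) {
          remaining.add(predicate); // not a partition column: check per row
        } else if (known.intValue() != predicate.value) {
          return Optional.empty(); // no row in this file can match
        }
        // Otherwise the predicate holds for the whole file and is dropped.
      }
      return Optional.of(remaining);
    }
  }
}
```

For a filter of day == 3 AND id == 7 on a file whose partition has day = 3, the residual in this sketch is just id == 7. Because it depends on the file's partition values, it differs from task to task.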

On Thu, Jun 3, 2021 at 2:39 PM Jack Ye <ye...@gmail.com> wrote:

> Not sure if it is the only case, but the first thing that comes to mind is
> that BaseFileScanTask has a ResidualEvaluator, which has an Expression.
> Because ScanTask is serializable and passed between workers, expressions
> have to be serializable too.
> -Jack

-- 
Ryan Blue

Re: Why is Expression serializable?

Posted by Jack Ye <ye...@gmail.com>.
Not sure if it is the only case, but the first thing that comes to mind is
that BaseFileScanTask has a ResidualEvaluator, which has an Expression.
Because ScanTask is serializable and passed between workers, expressions
have to be serializable too.
-Jack
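
The dependency described above can be sketched with plain Java serialization. The types below (SerializableTaskSketch, Expr, GreaterThan, Task) are hypothetical stand-ins rather than Iceberg classes. The sketch only shows why a serializable task forces everything it references to be serializable as well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch: these types only imitate the idea, not Iceberg's API.
final class SerializableTaskSketch {

  /** Stand-in for an expression; it must be Serializable to travel with a task. */
  interface Expr extends Serializable {
    boolean test(int value);
  }

  static final class GreaterThan implements Expr {
    private static final long serialVersionUID = 1L;
    private final int bound;

    GreaterThan(int bound) {
      this.bound = bound;
    }

    @Override
    public boolean test(int value) {
      return value > bound;
    }
  }

  /** Stand-in for a scan task that carries its filter to a worker. */
  static final class Task implements Serializable {
    private static final long serialVersionUID = 1L;
    final String file;
    final Expr filter;

    Task(String file, Expr filter) {
      this.file = file;
      this.filter = filter;
    }
  }

  /** Simulates sending a task to another worker: write to bytes, read back. */
  static Task roundTrip(Task task) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(task);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (Task) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("task could not be serialized", e);
    }
  }
}
```

If Expr did not extend Serializable, writeObject would fail with a NotSerializableException as soon as it reached the filter field, which is the constraint the reply above points at.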

On Thu, Jun 3, 2021 at 2:33 PM QH Yan <ma...@gmail.com> wrote:

> Curious about the use case where an Expression gets serialized and passed
> around.
>
> Thank you!
> Qinhua
> --
> *Qinhua*
>
>