Posted to dev@pig.apache.org by Patryk Mastela <pa...@gmail.com> on 2011/11/14 22:36:25 UTC

How to change execution when extra information is available?

Hi,

If there is information available about the data (e.g. computed averages
over a certain column), where in the code should we start looking to get
this information and actually use it for the execution?

Thanks,

Patryk

Re: How to change execution when extra information is available?

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
There isn't an optimizer yet that knows how to use this information.
It would probably live at the Logical Optimizer level.
There's an interface for loaders to provide statistics about their
data. You can start by making sure your loaders implement it, and go
from there.
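
As a rough illustration of the idea (not Pig code), the sketch below shows
a loader-side hook that reports statistics and a toy planning rule that
consumes them. Every name in it (StatsSketch, DataStats, StatsProvider,
chooseJoin, the 100 MB threshold, the /data/users path) is hypothetical
and exists only for this example. In Pig itself the loader interface
meant here is most likely LoadMetadata, whose getStatistics method
returns a ResourceStatistics object; check the Javadoc for your Pig
version before relying on that.

```java
import java.util.Optional;

// Hypothetical stand-ins for illustration only; none of these are Pig classes.
public class StatsSketch {

    /** Statistics a loader might report about its input (fields are illustrative). */
    static final class DataStats {
        final long sizeInBytes;
        final long numRecords;

        DataStats(long sizeInBytes, long numRecords) {
            this.sizeInBytes = sizeInBytes;
            this.numRecords = numRecords;
        }
    }

    /** Hypothetical loader-side hook: returns statistics only if the loader knows them. */
    interface StatsProvider {
        Optional<DataStats> getStats(String location);
    }

    enum JoinStrategy { REPLICATED, DEFAULT }

    /**
     * Toy planning rule: choose a replicated (in-memory) join only when the
     * input is known to fit under the given size threshold. When no
     * statistics are available, fall back to the default strategy.
     */
    static JoinStrategy chooseJoin(StatsProvider loader, String location, long maxReplicatedBytes) {
        return loader.getStats(location)
                .filter(stats -> stats.sizeInBytes <= maxReplicatedBytes)
                .map(stats -> JoinStrategy.REPLICATED)
                .orElse(JoinStrategy.DEFAULT);
    }

    public static void main(String[] args) {
        long threshold = 100L * 1024 * 1024; // assumed 100 MB cutoff

        // A loader that knows its input is about 10 MB.
        StatsProvider small = location -> Optional.of(new DataStats(10L * 1024 * 1024, 50_000L));
        // A loader that does not implement statistics.
        StatsProvider unknown = location -> Optional.empty();

        System.out.println(chooseJoin(small, "/data/users", threshold));   // prints REPLICATED
        System.out.println(chooseJoin(unknown, "/data/users", threshold)); // prints DEFAULT
    }
}
```

The point of the Optional return is that a rule which uses statistics
must still produce a valid plan when a loader reports nothing.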

D
