Posted to dev@pig.apache.org by "Cheolsoo Park (JIRA)" <ji...@apache.org> on 2013/10/31 19:29:17 UTC

[jira] [Updated] (PIG-3555) Initial implementation of combiner optimization

     [ https://issues.apache.org/jira/browse/PIG-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-3555:
-------------------------------

    Summary: Initial implementation of combiner optimization  (was: Initial implement of combiner optimization)

> Initial implementation of combiner optimization
> -----------------------------------------------
>
>                 Key: PIG-3555
>                 URL: https://issues.apache.org/jira/browse/PIG-3555
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>    Affects Versions: tez-branch
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: tez-branch
>
>
> A combiner is required to support algebraic UDFs, among other things. To start with, I propose the following initial implementation:
> * In Tez, the combiner runs as part of ShuffledMergedInput on edges, so multiple combine plans (one per edge) need to be registered in a destination vertex. Each vertex maps to a TezOperator in the Tez plan, so an array of combine plans will be stored in the TezOperator that maps to the destination vertex.
> * To register combine plans in a TezOperator, we will run a CombinerOptimizer on the Tez plan after TezCompiler generates it but before TezDagBuilder converts it into a DAG.
> * Finally, TezDagBuilder will insert the combine plans into the payload of ShuffledMergedInput while constructing a destination vertex.
> This initial implementation will allow us to run algebraic UDFs. In the future, we can implement more optimizations for limit, order-by, etc. on top of this.
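
The three steps proposed above can be sketched roughly as follows. This is only an illustrative model, not code from Pig or Tez: the `CombinerSketch` class, its `Op` type, the `algebraic` flag, and the string "plans" and "payloads" are all hypothetical stand-ins for TezOperator, the physical combine plans, and the ShuffledMergedInput payload. It assumes a combine plan is registered for every incoming edge of a vertex that aggregates with an algebraic UDF.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: none of these types are real Pig or Tez classes.
class CombinerSketch {

    // Hypothetical stand-in for a TezOperator (one per vertex).
    static class Op {
        final String name;
        // Assumed flag: this vertex aggregates with an algebraic UDF.
        final boolean algebraic;
        // Source vertices; each one represents an incoming edge.
        final List<Op> sources = new ArrayList<>();
        // One combine plan per incoming edge, keyed by source vertex name.
        // A plain string stands in for the real combine plan.
        final Map<String, String> combinePlans = new LinkedHashMap<>();

        Op(String name, boolean algebraic) {
            this.name = name;
            this.algebraic = algebraic;
        }
    }

    // Stand-in for the CombinerOptimizer pass: runs on the compiled plan,
    // before any DAG is built, and registers combine plans on destinations.
    static void optimize(List<Op> plan) {
        for (Op dest : plan) {
            if (!dest.algebraic) {
                continue;
            }
            for (Op src : dest.sources) {
                dest.combinePlans.put(src.name,
                        "combine(" + src.name + "->" + dest.name + ")");
            }
        }
    }

    // Stand-in for the DAG-building step: while constructing a destination
    // vertex, attach each edge's combine plan to that input's payload.
    static Map<String, String> buildInputPayloads(Op dest) {
        Map<String, String> payloads = new LinkedHashMap<>();
        for (Op src : dest.sources) {
            String plan = dest.combinePlans.get(src.name);
            payloads.put(src.name, plan == null ? "no-combiner" : plan);
        }
        return payloads;
    }

    public static void main(String[] args) {
        Op scanA = new Op("scanA", false);
        Op scanB = new Op("scanB", false);
        Op groupAgg = new Op("groupAgg", true);
        groupAgg.sources.add(scanA);
        groupAgg.sources.add(scanB);

        optimize(Arrays.asList(scanA, scanB, groupAgg));
        System.out.println(buildInputPayloads(groupAgg));
    }
}
```

Keying the plans by source vertex mirrors the "one per edge" requirement: a destination with two inputs carries two independent combine plans, and an input without a registered plan is simply left without a combiner.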



--
This message was sent by Atlassian JIRA
(v6.1#6144)