Posted to dev@tez.apache.org by Hitesh Shah <hi...@apache.org> on 2013/08/30 01:58:02 UTC

TEZ-398: Changes to the underlying Task/Input/Output/Processor engine layer

Hi folks,

Siddharth, Bikas and I have been looking at how to refactor and clean up the tez-engine layer ( i.e. the Input/Output/Processor and Task constructs ) so that non-MR processors can be built and new inputs and outputs can be introduced easily.

I posted an initial design/overview draft on TEZ-398 ( https://issues.apache.org/jira/secure/attachment/12600691/TEZ-398-Engine-Design.pdf ). Please feel free to add your comments to the jira. If you are interested in working on any part of the changes, comment on the jira and/or the mailing list. If folks are interested, we can set up a meeting to discuss the proposed design.

Given that these changes are going to make the current implementation unstable for folks who are starting to use tez ( in hive/pig ), we plan to create a new branch ( likely to be named TEZ-398 ) to work on these engine changes.

thanks
-- Hitesh 




Re: TEZ-398: Changes to the underlying Task/Input/Output/Processor engine layer

Posted by Siddharth Seth <se...@gmail.com>.
Have created a branch for this: TEZ-398.

Just to be clear, the design is likely to evolve as it sees the light of
day over the next week or so. There are also aspects which are not fully
fleshed out. At the end of this exercise, it should be possible to write a
processor which takes multiple Inputs - e.g. one input which reads MapReduce
FileFormat-based data and another which just receives unsorted
key-values generated by a previous vertex.
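
To make the multi-input goal a little more concrete, here is a minimal Java
sketch. None of these types come from Tez or from the TEZ-398 design doc:
KVInput, Processor, ListInput and TaggingProcessor are hypothetical names made
up for illustration, and in-memory lists stand in for both the file-based input
and the output of a previous vertex.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types are Tez APIs.
class MultiInputSketch {

    /** Hypothetical input abstraction: a stream of key-value pairs. */
    interface KVInput {
        Iterator<Map.Entry<String, String>> read();
    }

    /** Hypothetical processor abstraction: consumes any number of named inputs. */
    interface Processor {
        List<String> run(Map<String, KVInput> inputs);
    }

    /** Stand-in input backed by an in-memory list; order is preserved, nothing is sorted. */
    static final class ListInput implements KVInput {
        private final List<Map.Entry<String, String>> pairs;

        ListInput(List<Map.Entry<String, String>> pairs) {
            this.pairs = pairs;
        }

        @Override
        public Iterator<Map.Entry<String, String>> read() {
            return pairs.iterator();
        }
    }

    /** Reads each input in turn and tags every record with the name of its input. */
    static final class TaggingProcessor implements Processor {
        @Override
        public List<String> run(Map<String, KVInput> inputs) {
            List<String> out = new ArrayList<>();
            for (Map.Entry<String, KVInput> in : inputs.entrySet()) {
                Iterator<Map.Entry<String, String>> it = in.getValue().read();
                while (it.hasNext()) {
                    Map.Entry<String, String> kv = it.next();
                    out.add(in.getKey() + ":" + kv.getKey() + "=" + kv.getValue());
                }
            }
            return out;
        }
    }

    /** Small helper to build a key-value pair. */
    static Map.Entry<String, String> kv(String k, String v) {
        return new AbstractMap.SimpleEntry<>(k, v);
    }

    /** Wires two differently sourced inputs into one processor. */
    static List<String> demo() {
        Map<String, KVInput> inputs = new LinkedHashMap<>();
        // Stand-in for records read from a file-based input.
        inputs.put("file", new ListInput(Arrays.asList(kv("a", "1"), kv("b", "2"))));
        // Stand-in for unsorted key-values produced by a previous vertex.
        inputs.put("vertex", new ListInput(Arrays.asList(kv("z", "9"), kv("c", "3"))));
        return new TaggingProcessor().run(inputs);
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is only that the processor sees a map of named inputs
and does not care how each one is produced; the actual interfaces are whatever
the design ends up with.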

