Posted to issues@tez.apache.org by "Hitesh Shah (JIRA)" <ji...@apache.org> on 2013/12/10 21:12:07 UTC

[jira] [Commented] (TEZ-668) Allow a Processor to trigger an Input as to when to fetch/prep data.

    [ https://issues.apache.org/jira/browse/TEZ-668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844614#comment-13844614 ] 

Hitesh Shah commented on TEZ-668:
---------------------------------

One approach could be to use an annotation on the Processor that declares whether it caches input data. Using this annotation, the framework can explicitly invoke the startFetch() functions for processors that do not cache data.
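
A rough sketch of how this might look, to make the idea concrete. Everything here is hypothetical and not part of the Tez API: the @CachesInputData annotation, the SketchInput and SketchProcessor interfaces, and FrameworkSketch.initTask() are stand-ins invented for illustration. Only the startFetch() name comes from the comment above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker annotation (not a Tez API): the processor may already
// hold its input data, so the framework should not start the fetch for it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface CachesInputData {}

// Hypothetical stand-in for an Input that exposes an explicit fetch trigger.
interface SketchInput {
    void startFetch();
}

// Hypothetical stand-in for a Processor.
interface SketchProcessor {
    void run(List<SketchInput> inputs);
}

// Test double that only counts how often a fetch was requested.
class RecordingInput implements SketchInput {
    int fetchCalls = 0;

    @Override
    public void startFetch() {
        fetchCalls++;
    }
}

// No annotation: the developer does nothing extra and gets the normal flow.
class PlainProcessor implements SketchProcessor {
    @Override
    public void run(List<SketchInput> inputs) {
        // Would read from inputs that the framework has already started.
    }
}

// Annotated: the processor decides for itself when (or whether) to fetch.
@CachesInputData
class CachingProcessor implements SketchProcessor {
    boolean cacheHit = true;

    @Override
    public void run(List<SketchInput> inputs) {
        if (!cacheHit) {
            for (SketchInput in : inputs) {
                in.startFetch();
            }
        }
    }
}

class FrameworkSketch {
    // Assumed framework hook: start fetching right away unless the processor
    // class carries the @CachesInputData annotation.
    static void initTask(SketchProcessor processor, List<SketchInput> inputs) {
        boolean caches = processor.getClass().isAnnotationPresent(CachesInputData.class);
        if (!caches) {
            for (SketchInput in : inputs) {
                in.startFetch();
            }
        }
    }
}

class LazyFetchDemo {
    public static void main(String[] args) {
        RecordingInput eager = new RecordingInput();
        List<SketchInput> eagerInputs = new ArrayList<>();
        eagerInputs.add(eager);
        FrameworkSketch.initTask(new PlainProcessor(), eagerInputs);

        RecordingInput deferred = new RecordingInput();
        List<SketchInput> deferredInputs = new ArrayList<>();
        deferredInputs.add(deferred);
        FrameworkSketch.initTask(new CachingProcessor(), deferredInputs);

        System.out.println("plain=" + eager.fetchCalls + " caching=" + deferred.fetchCalls);
    }
}
```

With this shape, a processor without the annotation keeps the current behaviour and needs no extra code, while an annotated processor takes over responsibility for calling startFetch() itself.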

> Allow a Processor to trigger an Input as to when to fetch/prep data.
> --------------------------------------------------------------------
>
>                 Key: TEZ-668
>                 URL: https://issues.apache.org/jira/browse/TEZ-668
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Hitesh Shah
>
> In cases of container re-use or other scenarios, a processor may not need to process any data from a particular Input. Currently, an Input always fetches/preps the data as soon as its init() function is invoked. 
> A Processor should have a way to trigger the Input to fetch data. This obviously has the overhead of delaying the fetch.
> The approach should place no additional burden on developers who write new processors and want the normal flow, in which their inputs always fetch on init().



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)