Posted to issues@tez.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2014/09/04 10:28:51 UTC

[jira] [Commented] (TEZ-1447) Provide notification mechanism for user code to know about interesting Vertex state changes

    [ https://issues.apache.org/jira/browse/TEZ-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121105#comment-14121105 ] 

Siddharth Seth commented on TEZ-1447:
-------------------------------------

bq. A separate jira is fine, though the feature seems incomplete without adding it to the VertexManagerPluginContext.
This jira is more about making the InputInitializer aware of state changes than about the entire system. For VertexManagerPlugins, this leads to a slightly messy API, considering there are already APIs like onVertexStarted.

bq. Option 1) register(VertexName) - the listener gets notifications about all state changes published by the vertex.
Option 2) register(ENUM, VertexName) - the listener registers for a specific change and gets notified when that happens.
One more reason I like the vertexName API better is that if Tez had the concept of control connections - to indicate the flow of control information between vertices and VMs / initializers - a register API would not be required at all. This, IMO, is a better solution than the current requirement for users to send the target vertex information as part of the event.
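
To make the two options concrete, here is a minimal sketch of what each registration style could look like. Every name in it (VertexState, VertexStateListener, StateChangeRegistry, the enum constants) is hypothetical and chosen only for illustration; none of it is the Tez API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical state names; the real set of "interesting" states is not decided here.
enum VertexState { CONFIGURED, RUNNING, SUCCEEDED }

// Hypothetical callback interface implemented by user code (e.g. an initializer).
interface VertexStateListener {
    void onStateChange(String vertexName, VertexState state);
}

// Hypothetical registry that the framework side would own.
class StateChangeRegistry {
    private static final class Registration {
        final Set<VertexState> states;
        final VertexStateListener listener;

        Registration(Set<VertexState> states, VertexStateListener listener) {
            this.states = states;
            this.listener = listener;
        }
    }

    private final Map<String, List<Registration>> byVertex = new HashMap<>();

    // Option 1: the listener hears about every state change the vertex publishes.
    void register(String vertexName, VertexStateListener listener) {
        register(EnumSet.allOf(VertexState.class), vertexName, listener);
    }

    // Option 2: the listener hears only about the one state it asked for.
    void register(VertexState state, String vertexName, VertexStateListener listener) {
        register(EnumSet.of(state), vertexName, listener);
    }

    private void register(Set<VertexState> states, String vertexName,
                          VertexStateListener listener) {
        byVertex.computeIfAbsent(vertexName, k -> new ArrayList<>())
                .add(new Registration(states, listener));
    }

    // Called when a vertex changes state; fans out to matching listeners only.
    void publish(String vertexName, VertexState state) {
        List<Registration> regs =
                byVertex.getOrDefault(vertexName, Collections.emptyList());
        for (Registration r : regs) {
            if (r.states.contains(state)) {
                r.listener.onStateChange(vertexName, state);
            }
        }
    }
}
```

In this sketch Option 1 is just Option 2 with every state selected, so the choice is mostly about how much filtering the listener has to do itself.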

> Provide notification mechanism for user code to know about interesting Vertex state changes
> -------------------------------------------------------------------------------------------
>
>                 Key: TEZ-1447
>                 URL: https://issues.apache.org/jira/browse/TEZ-1447
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.5.0
>            Reporter: Gunther Hagleitner
>            Assignee: Siddharth Seth
>            Priority: Blocker
>         Attachments: TEZ-1447.1.wip.txt
>
>
> I'm trying to do dynamic partition pruning through input initializer events in Hive. That means that the initializer of a table scan vertex has to receive events from all tasks in another vertex (which contain the pruning info) before generating tasks to run.
> The problem with the current API I ran into:
> getNumTasks: I'm currently using a busy loop to wait for the num tasks for a vertex to be decided (-1 -> x). There's no way around it, because it's the only way to find out what number of events to expect (0 is a valid number of tasks - so I can't wait for the first to complete).
> With auto-reducer parallelism I have to employ another busy loop, because I might initially be expecting 10 events, a number which later gets knocked down to 5. Since there's no event associated with this, I have to periodically check whether I have enough events.
> Versioning: Events have a version number, but I don't know which task they are coming from. Thus I can't de-dup events.
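
As an illustration of what the description above is asking for, here is a rough sketch of an event collector that needs no busy loop once two things are available: a notification when the source vertex's task count is set or changed, and a source task index on each event. Both are assumptions, as is the class name PruningEventCollector, the idea that task indices run from 0 to numTasks - 1, and the rule that a higher version supersedes a lower one for the same task. None of this is existing Tez API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical collector; illustrates the requested behaviour only.
class PruningEventCollector {
    // Highest event version seen per source task index (used for de-duplication).
    private final Map<Integer, Integer> latestVersionByTask = new HashMap<>();
    // -1 means the source vertex's parallelism has not been decided yet.
    private int expectedTasks = -1;

    // Assumed callback: invoked when the task count is first set or later reduced.
    synchronized void onNumTasksUpdated(int numTasks) {
        expectedTasks = numTasks;
        // Assumption: surviving tasks keep indices 0..numTasks-1, so drop the rest.
        latestVersionByTask.keySet().removeIf(idx -> idx >= numTasks);
        notifyAll();
    }

    // Returns true if the event was new, false if it was a duplicate or stale.
    synchronized boolean onEvent(int taskIndex, int version) {
        Integer seen = latestVersionByTask.get(taskIndex);
        if (seen != null && seen >= version) {
            return false;
        }
        latestVersionByTask.put(taskIndex, version);
        notifyAll();
        return true;
    }

    // Complete once the count is known and one event per task has arrived.
    // Zero tasks is a valid, immediately complete case.
    synchronized boolean isComplete() {
        return expectedTasks >= 0 && latestVersionByTask.size() >= expectedTasks;
    }

    // Blocks without polling; woken by onEvent and onNumTasksUpdated.
    synchronized void awaitComplete() throws InterruptedException {
        while (!isComplete()) {
            wait();
        }
    }
}
```

The initializer would call awaitComplete() once and be woken either by the last expected event or by a parallelism change that lowers the expected count.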


