Posted to issues@tez.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2014/03/15 00:06:42 UTC

[jira] [Created] (TEZ-937) Add a potential sync point for Edges to query Vertex numTasks

Siddharth Seth created TEZ-937:
----------------------------------

             Summary: Add a potential sync point for Edges to query Vertex numTasks
                 Key: TEZ-937
                 URL: https://issues.apache.org/jira/browse/TEZ-937
             Project: Apache Tez
          Issue Type: Improvement
            Reporter: Siddharth Seth
            Assignee: Siddharth Seth


Edges rely on getting properties (specifically numTasks in this case) from the source or destination vertex. The value may not be correct until some vertices have initialized, or others have had their final parallelism set.

We could change the Edge API to differentiate between initialParallelism and updatedNumTasks - this at least makes the API clearer (users should expect the two values to differ).
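
A rough sketch of what that split could look like, to make the idea concrete. All names here (VertexParallelismView, SimpleVertexView, getInitialParallelism, getUpdatedNumTasks) are hypothetical and are not part of the existing Tez API; the point is only that the second value is explicitly absent until parallelism has been finalized.

```java
import java.util.OptionalInt;

// Hypothetical sketch: none of these types exist in Tez.
interface VertexParallelismView {
    /** Parallelism the vertex was set up with, before any later update. */
    int getInitialParallelism();

    /** Updated task count; empty until parallelism has been finalized. */
    OptionalInt getUpdatedNumTasks();
}

final class SimpleVertexView implements VertexParallelismView {
    private final int initialParallelism;
    private OptionalInt updatedNumTasks = OptionalInt.empty();

    SimpleVertexView(int initialParallelism) {
        this.initialParallelism = initialParallelism;
    }

    /** Records the final parallelism; the initial value is left untouched. */
    void setUpdatedNumTasks(int numTasks) {
        this.updatedNumTasks = OptionalInt.of(numTasks);
    }

    @Override
    public int getInitialParallelism() {
        return initialParallelism;
    }

    @Override
    public OptionalInt getUpdatedNumTasks() {
        return updatedNumTasks;
    }
}
```

With something like this, an edge that reads getUpdatedNumTasks() too early gets an empty result instead of a stale number, so the caller has to decide what to do in that case.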

Another option is to let the VertexManager control whether these APIs are ready to go (based on whether initialization matters for the vertex, whether it sets parallelism, and whether it has already set parallelism). This can get fairly complicated though, in terms of blocking and queuing events, but would be useful for users.
[~vikram.dixit] could probably comment more - ran into a similar situation when writing a Bucketed Map Join edge plugin for Hive.
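
To illustrate the queuing side of this option, here is a minimal sketch of a sync point, assuming a callback style rather than blocking callers. NumTasksGate, whenReady and markReady are hypothetical names, not Tez APIs, and the real event flow in the AM would be more involved than this.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.IntConsumer;

// Hypothetical sketch: not part of the Tez API.
final class NumTasksGate {
    private final Queue<IntConsumer> pending = new ArrayDeque<>();
    private boolean ready = false;
    private int numTasks;

    /** Called by an edge: runs at once if numTasks is final, else queues. */
    synchronized void whenReady(IntConsumer callback) {
        if (ready) {
            callback.accept(numTasks);
        } else {
            pending.add(callback);
        }
    }

    /** Called once by the vertex manager when parallelism is final. */
    synchronized void markReady(int finalNumTasks) {
        if (ready) {
            throw new IllegalStateException("parallelism already final");
        }
        numTasks = finalNumTasks;
        ready = true;
        // Drain queued requests in arrival order. Callbacks run while the
        // lock is held, which keeps the sketch simple but is a trade-off.
        IntConsumer callback;
        while ((callback = pending.poll()) != null) {
            callback.accept(numTasks);
        }
    }
}
```

An edge would register with whenReady(n -> ...) and only see a value after the vertex manager has called markReady, which is roughly the ordering guarantee this issue is asking for.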

TEZ-933 has some related information.



--
This message was sent by Atlassian JIRA
(v6.2#6252)