Posted to issues@tez.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2014/03/15 00:06:42 UTC
[jira] [Created] (TEZ-937) Add a potential sync point for Edges to query Vertex numTasks
Siddharth Seth created TEZ-937:
----------------------------------
Summary: Add a potential sync point for Edges to query Vertex numTasks
Key: TEZ-937
URL: https://issues.apache.org/jira/browse/TEZ-937
Project: Apache Tez
Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Edges rely on properties of the source or destination vertex (specifically numTasks in this case). The correct value may not be available until some vertices have initialized, or until others have had their final parallelism set.
We could change the Edge API to differentiate between initialParallelism and updatedNumTasks. This at least makes the API clearer (users should expect that the two values can differ).
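To make the first option concrete, here is a minimal sketch of what keeping the two values apart could look like. The class and method names (VertexParallelismView, getInitialParallelism, getUpdatedNumTasks, setUpdatedNumTasks) are hypothetical and are not part of the existing Tez API; the sketch only illustrates the idea that an edge can tell "the value the vertex started with" from "the value set later, if any".

```java
import java.util.OptionalInt;

// Hypothetical sketch only: none of these names exist in the Tez API.
// It separates the parallelism a vertex was created with from a value
// that may be set later, so callers cannot confuse the two.
final class VertexParallelismView {
    private final int initialParallelism;
    private OptionalInt updatedNumTasks = OptionalInt.empty();

    VertexParallelismView(int initialParallelism) {
        this.initialParallelism = initialParallelism;
    }

    // The value known when the vertex was first configured.
    int getInitialParallelism() {
        return initialParallelism;
    }

    // Empty until the parallelism has been updated; never silently
    // falls back to the initial value.
    OptionalInt getUpdatedNumTasks() {
        return updatedNumTasks;
    }

    // Records an updated task count (who calls this is left open here).
    void setUpdatedNumTasks(int numTasks) {
        updatedNumTasks = OptionalInt.of(numTasks);
    }
}
```

With this shape, an edge that needs the final count has to handle the empty case explicitly instead of reading a value that might still change.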
Another option is to let the VertexManager control whether these APIs are ready to go (based on whether initialization is important for the vertex, whether it sets parallelism, and whether it has already set parallelism). This can get fairly complicated in terms of blocking and queuing events, but would be useful for users.
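A rough sketch of the second option, assuming the sync point is modelled as a gate that queues numTasks queries until some controller declares the value final. NumTasksGate, whenReady and markReady are hypothetical names chosen for illustration, not Tez classes or methods, and the sketch ignores how real events would be routed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Hypothetical sketch only: not part of the Tez API.
// Queries for numTasks are queued until the value is declared final,
// then answered in arrival order. Later queries are answered immediately.
final class NumTasksGate {
    private final List<IntConsumer> pending = new ArrayList<>();
    private boolean ready = false;
    private int numTasks;

    // Called by an edge that needs the task count.
    synchronized void whenReady(IntConsumer query) {
        if (ready) {
            query.accept(numTasks);
        } else {
            pending.add(query);
        }
    }

    // Called once by whichever component decides the value is final
    // (in the proposal above, that would be the VertexManager).
    synchronized void markReady(int finalNumTasks) {
        if (ready) {
            throw new IllegalStateException("numTasks was already finalized");
        }
        numTasks = finalNumTasks;
        ready = true;
        // Callbacks run while the lock is held to keep the sketch short;
        // a real implementation would likely dispatch them elsewhere.
        for (IntConsumer query : pending) {
            query.accept(numTasks);
        }
        pending.clear();
    }
}
```

The queue is where the complexity mentioned above shows up: every caller has to be written as a callback, and anything that depends on the answer is delayed until the gate opens.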
[~vikram.dixit] could probably comment more, having run into a similar situation when writing a Bucketed Map Join edge plugin for Hive.
TEZ-933 has some related information.