Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2013/10/04 19:41:46 UTC
[jira] [Commented] (TEZ-46) Add compute capability to Inputs / Outputs specified directly on a Vertex
[ https://issues.apache.org/jira/browse/TEZ-46?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786392#comment-13786392 ]
Bikas Saha commented on TEZ-46:
-------------------------------
LeafInputOuput likely uses incorrect terminology. In a DAG, the final vertices are leaves and the initial vertices are roots, so in graph terms this should be LeafOutput/RootInput; otherwise it will confuse people. Similarly, it should be TezRootInputInitializer/LeafInputDataInformationEvent.
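To make the graph terminology concrete, here is a minimal sketch that classifies vertices by edge direction. DagTerminology and its methods are hypothetical names used only for illustration, not Tez classes, and the sketch assumes every vertex appears as a key in the successor map.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not a Tez class): classifies DAG vertices by edge direction.
// Assumption: every vertex is a key in the map, mapped to its downstream vertices.
final class DagTerminology {
    private DagTerminology() {}

    /** Roots are vertices with no incoming edge, i.e. where external input enters. */
    static Set<String> roots(Map<String, List<String>> successors) {
        Set<String> result = new LinkedHashSet<>(successors.keySet());
        for (List<String> targets : successors.values()) {
            // Any vertex that is the target of an edge cannot be a root.
            result.removeAll(targets);
        }
        return result;
    }

    /** Leaves are vertices with no outgoing edge, i.e. where final output leaves. */
    static Set<String> leaves(Map<String, List<String>> successors) {
        Set<String> result = new LinkedHashSet<>();
        for (Map.Entry<String, List<String>> entry : successors.entrySet()) {
            if (entry.getValue().isEmpty()) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

Under this reading, an input attached directly to a vertex belongs on a root and an output attached directly to a vertex belongs on a leaf, which is why the RootInput/LeafOutput naming fits.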
Why would these threads need to be daemon threads?
{code}
+ this.eventHandler = eventHandler;
+ this.rawExecutor = Executors.newCachedThreadPool(new ThreadFactoryBuilder()
+ .setDaemon(true).setNameFormat("InputInitializer #%d").build());
+ this.executor = MoreExecutors.listeningDecorator(rawExecutor);
{code}
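For comparison, a rough sketch of the non-daemon alternative, using only java.util.concurrent rather than the Guava builder in the patch. InitializerThreads, namedFactory and runAndShutdown are hypothetical names, and the 10-second wait is an arbitrary assumption. The idea is that non-daemon threads are acceptable as long as the pool is shut down explicitly once the work is done.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch (not the patch's code): named pool threads with an explicit daemon flag.
final class InitializerThreads {
    private InitializerThreads() {}

    /** Builds a factory that names threads "<prefix> #<n>" and sets the daemon flag. */
    static ThreadFactory namedFactory(String prefix, boolean daemon) {
        AtomicInteger counter = new AtomicInteger();
        return runnable -> {
            Thread thread = new Thread(runnable, prefix + " #" + counter.getAndIncrement());
            thread.setDaemon(daemon);
            return thread;
        };
    }

    /**
     * Runs the work on a non-daemon cached pool, then shuts the pool down.
     * Returns true if the pool terminated within the (assumed) 10-second wait.
     */
    static boolean runAndShutdown(Runnable work) {
        ExecutorService pool =
            Executors.newCachedThreadPool(namedFactory("InputInitializer", false));
        try {
            pool.submit(work);
        } finally {
            // Without this call, idle non-daemon workers would delay JVM exit
            // until the cached pool's keep-alive timeout expires.
            pool.shutdown();
        }
        try {
            return pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With daemon threads the JVM can exit while an initializer is still running; with non-daemon threads plus an explicit shutdown, in-flight work finishes first. Which behaviour is wanted here is the open question.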
At this point the event could be routed by the vertex itself. What is the value of sending this list to a vertex manager so that it can send the events? Also, doesn't the vertex need to be initialized before the events are routed to the tasks?
{code}
+ vertex.vertexScheduler.onLeafVertexInitialized(
+ liInitEvent.getInputName(),
+ vertex.getAdditionalInputs().get(liInitEvent.getInputName())
+ .getDescriptor(), liInitEvent.getEvents());
+
+ vertex.numInitializedInputs++;
+ if (vertex.numInitializedInputs == vertex.inputsWithInitializers.size()) {
+ // All inputs initialized, shutdown the initializer.
+ vertex.leafInputInitializer.shutdown();
+
+ // If LeafInputs are determining parallelism, it should have been set by
+ // this point, so it's safe to checkTaskLimits and createTasks
+ VertexState vertexState = vertex.initializeVertex();
{code}
I couldn't figure out where numTasks is changed based on the split calculation.
> Add compute capability to Inputs / Outputs specified directly on a Vertex
> -------------------------------------------------------------------------
>
> Key: TEZ-46
> URL: https://issues.apache.org/jira/browse/TEZ-46
> Project: Apache Tez
> Issue Type: Task
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Attachments: TEZ-46.wip.1.txt
>
>
> With a longer term goal of getting rid of the VertexCommitter.
> Input calculation, Output commit etc would be handled by these tasks, which run user code. These either run in a separate JVM or inline in the AM.