Posted to issues@tez.apache.org by "Hitesh Shah (JIRA)" <ji...@apache.org> on 2015/04/25 01:16:38 UTC

[jira] [Commented] (TEZ-2368) Make the dag number available in Context classes

    [ https://issues.apache.org/jira/browse/TEZ-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511993#comment-14511993 ] 

Hitesh Shah commented on TEZ-2368:
----------------------------------

DAG names are meant to be unique within a session.

> Make the dag number available in Context classes
> ------------------------------------------------
>
>                 Key: TEZ-2368
>                 URL: https://issues.apache.org/jira/browse/TEZ-2368
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>
> Provide the DAG number, which is unique for each DAG running within an application, in TezInputContext, TezOutputContext, and TezProcessorContext.
> When containers are reused, or for external services, this number can be used to write intermediate data to a DAG-specific directory instead of an application-specific directory, where it is difficult to tell the data of different DAGs apart.
> The DAG name also identifies the DAG, but it is not suitable for use in a directory name. Hashing the name is an option, but can lead to collisions.
> Generating data into a DAG-specific directory will only become usable once we move away from the default MR handler, or enhance it to support an additional parameter.
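
To illustrate the directory layout described in the issue, here is a minimal sketch in Java. It assumes the DAG number is already available as an int. The class DagDirs and the method dagScopedDir are hypothetical names used only for illustration and are not part of the Tez API; the "dag_<n>" naming is likewise an assumption.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of the Tez API.
final class DagDirs {
    private DagDirs() {}

    // Returns "<appDir>/dag_<dagNumber>". The dagNumber is assumed to be the
    // per-application unique number that the issue proposes to expose.
    static Path dagScopedDir(Path appDir, int dagNumber) {
        if (dagNumber < 0) {
            throw new IllegalArgumentException("dagNumber must be non-negative: " + dagNumber);
        }
        return appDir.resolve("dag_" + dagNumber);
    }

    // Alternative that hashes the DAG name instead. Distinct names can share
    // a hash ("Aa" and "BB" have the same String.hashCode()), so two DAGs
    // could end up in the same directory.
    static Path hashedNameDir(Path appDir, String dagName) {
        return appDir.resolve("dag_" + Integer.toHexString(dagName.hashCode()));
    }
}
```

With this sketch, DagDirs.dagScopedDir(Paths.get("app_1"), 3) resolves to app_1/dag_3, while hashedNameDir maps the distinct names "Aa" and "BB" to the same directory, which is the collision risk mentioned in the description.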



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)