Posted to issues@tez.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2016/11/10 02:26:58 UTC

[jira] [Commented] (TEZ-3509) Make DAG Deletion path based

    [ https://issues.apache.org/jira/browse/TEZ-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15652777#comment-15652777 ] 

Siddharth Seth commented on TEZ-3509:
-------------------------------------

We may be better off relying on the ShuffleHandler accepting a parameter that distinguishes dag delete from vertex delete, instead of a path. I can't see a case where we would want to go more granular than that. Path-level deletion starts exposing a lot of information in the container launcher, which can otherwise be avoided.

e.g. a call to the shuffle handler saying
dagDelete?action=delete&job=jobId&dag=dagId
vertexDelete?action=delete&job=jobId&dag=dagId&vertexId=vertexId
The ShuffleHandler has all the relevant information and context available to delete the necessary files.
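
As a rough illustration of the two request shapes above, here is a minimal Java sketch that builds the query strings. The class and method names (DeleteRequestUrls, dagDelete, vertexDelete) are made up for this example and are not part of the patch or of Tez; only the parameter names (action, job, dag, vertexId) come from the lines above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the request strings proposed above.
// Nothing here is existing Tez API; it only mirrors the example calls.
final class DeleteRequestUrls {
    private DeleteRequestUrls() {}

    // dagDelete?action=delete&job=<jobId>&dag=<dagId>
    static String dagDelete(String jobId, String dagId) {
        return "dagDelete?action=delete&job=" + enc(jobId) + "&dag=" + enc(dagId);
    }

    // vertexDelete?action=delete&job=<jobId>&dag=<dagId>&vertexId=<vertexId>
    static String vertexDelete(String jobId, String dagId, String vertexId) {
        return "vertexDelete?action=delete&job=" + enc(jobId)
                + "&dag=" + enc(dagId) + "&vertexId=" + enc(vertexId);
    }

    // Form-encode a single query parameter value.
    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

The caller only passes identifiers; no file system layout leaves the ShuffleHandler, which is the point of preferring this over a path.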

If going with the path route, we have to make sure that no absolute paths are sent in. The paths should in fact be confined to a localized directory for the app (i.e. I should not be able to delete local resources used by the app using this invocation).
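
A minimal sketch of the kind of check this would need, assuming the handler knows a per-app base directory. DeletePathValidator, resolveWithinAppDir and appLocalDir are hypothetical names for illustration, not anything in the patch. The check is purely lexical and does not follow symlinks, so a real implementation would likely need more than this.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical validator: accepts only relative paths that stay inside
// the given app directory. Lexical check only; symlinks are not resolved.
final class DeletePathValidator {
    private DeletePathValidator() {}

    static Path resolveWithinAppDir(Path appLocalDir, String requested) {
        Path relative = Paths.get(requested);
        if (relative.isAbsolute()) {
            throw new IllegalArgumentException("absolute path not allowed: " + requested);
        }
        Path base = appLocalDir.toAbsolutePath().normalize();
        // normalize() collapses "." and ".." so escapes such as "../x" become visible.
        Path resolved = base.resolve(relative).normalize();
        if (!resolved.startsWith(base)) {
            throw new IllegalArgumentException("path escapes app directory: " + requested);
        }
        return resolved;
    }
}
```

For example, with a base of /tmp/app_1, "dag_1/output" resolves to /tmp/app_1/dag_1/output, while "/etc/passwd" and "../other_app" are both rejected.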

In terms of the patch - mostly looks good. There's some access to private members of the Tracker class. That should be changed to methods (node registration).

I think there was mention of pluggability of this piece at some point. The ContainerLauncher can be shared by entities which use different ShuffleHandlers. e.g. If the MR ShuffleHandler were to be enhanced to support deletes, deletion would not work since it is tied to the new Tez ShuffleHandler. Making the entire DeletionTracker pluggable - with config coming in via the ContainerLauncher payload - would help with this. tez-dag ideally should not be referencing the tez-runtime-library module. That reference is in place today because of some stuff in the ShuffleHandler, and can be broken by moving the VertexManager implementations out of tez-dag. Referencing ShuffleHandler directly brings in one more dependency from tez-dag to tez-runtime-library.
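
To make the pluggability idea concrete, here is a rough sketch of loading a tracker implementation by class name from configuration. Everything in it is assumed for illustration: the PluggableDeletionTracker interface, its dagComplete method, the "deletion.tracker.class" key, and the use of a plain Map in place of the ContainerLauncher payload. It is not the DeletionTracker from the patch.

```java
import java.util.Map;

// Hypothetical interface; the real tracker's methods may look different.
interface PluggableDeletionTracker {
    void dagComplete(String dagId);
}

// Default used when nothing is configured: deletes nothing.
final class NoOpDeletionTracker implements PluggableDeletionTracker {
    NoOpDeletionTracker() {}

    @Override
    public void dagComplete(String dagId) {
        // Intentionally empty.
    }
}

final class DeletionTrackers {
    // Assumed config key, standing in for a value read from the launcher payload.
    static final String TRACKER_CLASS_KEY = "deletion.tracker.class";

    private DeletionTrackers() {}

    static PluggableDeletionTracker create(Map<String, String> conf) {
        String className =
                conf.getOrDefault(TRACKER_CLASS_KEY, NoOpDeletionTracker.class.getName());
        try {
            // Requires the implementation to have a no-arg constructor.
            return Class.forName(className)
                    .asSubclass(PluggableDeletionTracker.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot create tracker " + className, e);
        }
    }
}
```

With something along these lines, the launcher only depends on the interface, and a tracker tied to a particular ShuffleHandler could live in whichever module owns that handler.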

Unrelated to this specific patch.
The model used for dagComplete notifications is to only pass in the dagId. Dag info is supposed to be picked up from the getDagInfo method in {TaskCommunicator|ContainerLauncher}Context. At the moment, the entire DAG is exposed.

> Make DAG Deletion path based
> ----------------------------
>
>                 Key: TEZ-3509
>                 URL: https://issues.apache.org/jira/browse/TEZ-3509
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>         Attachments: TEZ-3509.001.patch
>
>
> The idea here is to have the API take a path to delete, be it DAG path today or a vertex level path later on. The current implementation takes a flag which is specific to DAG deletion. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)