Posted to dev@oozie.apache.org by "Mohammad Kamrul Islam (JIRA)" <ji...@apache.org> on 2012/09/04 22:56:07 UTC

[jira] [Created] (OOZIE-983) [Design] Automatic Oozie application deployment using WebHDFS

Mohammad Kamrul Islam created OOZIE-983:
-------------------------------------------

             Summary: [Design] Automatic Oozie application deployment using WebHDFS
                 Key: OOZIE-983
                 URL: https://issues.apache.org/jira/browse/OOZIE-983
             Project: Oozie
          Issue Type: Bug
            Reporter: Mohammad Kamrul Islam


Problem:
1. A user can't upload an Oozie application from his dev box. The user needs access to a specialized box (such as a gateway) to run the required hadoop commands. This is inconvenient: it involves multiple steps and comes with restrictions.

2. There is no automatic Oozie application versioning. If a user wants to deploy a new version of an Oozie application, he needs to run multiple commands. In addition, there is no standard for this.

Proposal:
1. Oozie will provide a tool that automatically deploys the application and maintains a rigid versioning mechanism.

2. It could be a new script (e.g. oozie-deploy), or it could extend the existing oozie command (e.g. "oozie -deploy ..."). TBD.

3. The new script will get the information needed to launch WebHDFS commands from the user and upload the necessary files. This information includes: the WebHDFS endpoint, a security token (for secured clusters), the local application directory and the remote application base path.

4. Using the appropriate WebHDFS REST API, the tool will deploy the application. The user can choose whether to overwrite an existing application path.

5. The user can ask to upload a new version of the application. The new version could be provided by the user or created automatically by the script. For automatic version selection, the oozie tool will check the existing application path for entries matching the pattern "v?" and then select the new version number.

6. To upload a new application version, the oozie tool will first upload the application and then kill the old job (how do we get the old job id?). Finally, it will submit the new application.
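
The automatic version selection in item 5 could look roughly like the sketch below. It assumes that the "v?" pattern means version directories named "v" followed by a number (v1, v2, ...) and that the tool has already listed the entries under the remote application base path. The function name next_version is hypothetical and not part of Oozie.

```python
import re

# Assumption: version directories are named "v" followed by digits (v1, v2, ...).
VERSION_DIR = re.compile(r"^v(\d+)$")


def next_version(existing_names):
    """Pick the next version directory name from the names already deployed.

    existing_names: entry names found under the remote application base path.
    Names that do not look like a version directory are ignored.
    """
    numbers = []
    for name in existing_names:
        match = VERSION_DIR.match(name)
        if match:
            numbers.append(int(match.group(1)))
    # Start at v1 when nothing has been deployed yet.
    return "v%d" % (max(numbers) + 1 if numbers else 1)
```

Comparing the numeric part rather than the raw string keeps v10 ahead of v9. A user-provided version would simply bypass this function.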


Open questions:
1. How should the Kerberos token be passed, especially from a dev box?
2. Who will determine the new version: the user or the tool?


Other key points:
1. Supported only on Hadoop 1.0.2+.
2. We need to use or develop wrapper tools that hide most of the WebHDFS details. Two such tools already exist: a) for Python: https://github.com/drelu/webhdfs-py and b) for Ruby: https://github.com/zenja/webhdfs-ruby . At this point the options are:
  * Write a new Java wrapper class.
  * Write a new wrapper tool using pure shell commands.
  * Reuse the Python or Ruby libraries.
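
To give a feel for how small such a wrapper could be, here is a minimal Python sketch of the file upload step using only the standard library. It relies on the documented two-step WebHDFS CREATE: a PUT to the NameNode returns a 307 redirect, and the file content is then PUT to the DataNode URL in the Location header. The helper names (webhdfs_url, upload_file), the endpoint in the usage note and the auth_params argument are illustrative assumptions, not an existing Oozie or Hadoop client API. The sketch covers plain HTTP only and leaves Kerberos handling open.

```python
import http.client
from urllib.parse import quote, urlencode, urlsplit


def webhdfs_url(endpoint, hdfs_path, op, **params):
    """Build a WebHDFS REST URL: <endpoint>/webhdfs/v1<path>?op=<OP>&..."""
    query = urlencode([("op", op)] + sorted(params.items()))
    return "%s/webhdfs/v1%s?%s" % (endpoint.rstrip("/"), quote(hdfs_path), query)


def _put(url, body=None):
    """Send one PUT without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port)
    try:
        conn.request("PUT", parts.path + "?" + parts.query, body=body)
        resp = conn.getresponse()
        resp.read()  # drain the response before closing the connection
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def upload_file(endpoint, local_path, hdfs_path, overwrite=False, auth_params=None):
    """Upload one local file with the two-step WebHDFS CREATE operation.

    auth_params is a hypothetical hook for extra query parameters such as
    {"user.name": "joe"} on an unsecured cluster.
    """
    params = dict(auth_params or {})
    params["overwrite"] = "true" if overwrite else "false"
    # Step 1: ask the NameNode where to write; no file data is sent yet.
    status, location = _put(webhdfs_url(endpoint, hdfs_path, "CREATE", **params))
    if status != 307 or not location:
        raise RuntimeError("expected a 307 redirect, got HTTP %s" % status)
    # Step 2: send the file content to the DataNode URL from the redirect.
    with open(local_path, "rb") as f:
        status, _ = _put(location, body=f.read())
    if status != 201:
        raise RuntimeError("upload failed with HTTP %s" % status)
```

A deployment tool would call upload_file once per file in the local application directory, for example upload_file("http://namenode.example.com:50070", "workflow.xml", "/user/joe/apps/wf/v2/workflow.xml"), where the host, port and paths are placeholders.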




Overall, we need to get this right from the beginning. Comments from others are highly appreciated.



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Assigned] (OOZIE-983) [Design] Automatic Oozie application deployment using WebHDFS

Posted by "Mona Chitnis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OOZIE-983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mona Chitnis reassigned OOZIE-983:
----------------------------------

    Assignee: Mona Chitnis
    
> [Design] Automatic Oozie application deployment using WebHDFS
> -------------------------------------------------------------
>
>                 Key: OOZIE-983
>                 URL: https://issues.apache.org/jira/browse/OOZIE-983
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Mohammad Kamrul Islam
>            Assignee: Mona Chitnis
