Posted to common-issues@hadoop.apache.org by "Matt Foley (JIRA)" <ji...@apache.org> on 2012/11/21 20:23:59 UTC

[jira] [Commented] (HADOOP-9082) Select and document a platform-independent scripting language for use in Hadoop environment

    [ https://issues.apache.org/jira/browse/HADOOP-9082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502221#comment-13502221 ] 

Matt Foley commented on HADOOP-9082:
------------------------------------

This discussion started in HADOOP-8924, where it was proposed to replace the build-time utility "saveVersion.sh" with a Python script.  That would make Python a build-time dependency.  Here's the background:

Those of us working on the branch-1-win port of Hadoop to Windows, which does not use Cygwin, have run into the frequent use of shell scripts throughout the system, both at build time (e.g., the utility "saveVersion.sh") and at run time (config files like "hadoop-env.sh" and the start/stop scripts in "bin/*").  Similar usage exists throughout the Hadoop stack, in all projects.

The vast majority of these shell scripts do not do anything platform-specific; they can be expressed in a POSIX-conforming way.  Therefore, it seems to us that it makes sense to start using a cross-platform scripting language, such as Python, in place of shell for these purposes.  For those rare occasions where platform-specific functionality really is needed, Python itself supports quite a lot of it on both Linux and Windows; where that is inadequate, one could still conditionally invoke a platform-specific module written in shell (for Linux/*nix) or in PowerShell or bat (for Windows).
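
To illustrate the conditional invocation described above, here is a minimal sketch, not a proposed implementation.  The function names and the "bin/native-helper" path are hypothetical, and it assumes the platform-specific helpers sit side by side as ".sh" and ".cmd" files:

```python
import platform
import subprocess


def platform_command(script_base, system=None):
    """Build the command line for a platform-specific helper script.

    script_base is a hypothetical path without extension,
    e.g. "bin/native-helper" (assumed layout, not an existing Hadoop file).
    """
    system = system or platform.system()
    if system == "Windows":
        # Delegate to a batch file through the Windows command interpreter.
        return ["cmd", "/c", script_base + ".cmd"]
    # Linux and other *nix systems: delegate to a POSIX shell script.
    return ["sh", script_base + ".sh"]


def run_platform_helper(script_base):
    """Run the helper for the current platform and return its exit code."""
    return subprocess.call(platform_command(script_base))
```

Everything that is not platform-specific would stay in the Python script itself; only the rare native piece goes through something like run_platform_helper("bin/native-helper").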

The primary motive for moving to a cross-platform scripting language is maintainability.  The alternative would be to maintain two complete suites of scripts, one for Linux and one for Windows (and perhaps others in the future).  We want to avoid having to update parallel modules in two different languages whenever functionality changes, especially given that many Linux developers are not familiar with PowerShell or bat, and many Windows developers are not familiar with shell or bash.
                
> Select and document a platform-independent scripting language for use in Hadoop environment
> -------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9082
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9082
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Matt Foley
>
> This issue is going to be discussed at length in the common-dev@ mailing list, under topic "[PROPOSAL] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack".
