Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2006/02/23 23:32:37 UTC

[jira] Created: (HADOOP-59) support generic command-line options

support generic command-line options
------------------------------------

         Key: HADOOP-59
         URL: http://issues.apache.org/jira/browse/HADOOP-59
     Project: Hadoop
        Type: Improvement
  Components: conf  
    Versions: 0.1    
    Reporter: Doug Cutting
     Fix For: 0.1


Hadoop commands should all support some common options.  For example, it should be possible to specify the namenode, datanode, and, for that matter, any config option, in a generic way.

This could be implemented with code like:

public interface Tool extends Configurable {
  void run(String[] args) throws Exception;
}

public class ToolBase extends Configured implements Tool {
  public final void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // ... parse config options from args into conf ...
    this.configure(conf);
    this.run(args);  // only the remaining commandOptions should be passed on
  }
}

public class MyTool extends ToolBase {
  public static void main(String[] args) throws Exception {
    new MyTool().main(args);
  }
}

The general command line syntax could be:

bin/hadoop [generalOptions] command [commandOptions]

Where generalOptions are things that ToolBase handles, and only the commandOptions are passed to Tool.run().  The most important generalOption would be '-D', which would define name/value pairs that are set in the configuration.  This alone would permit folks to set the namenode, datanode, etc.
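
As a rough illustration of the split described above, the following sketch separates leading '-D' options from the command and its options. It is only a sketch under stated assumptions: a plain Map stands in for Hadoop's Configuration, and the class name GenericArgsSketch and its fields are hypothetical, not part of the proposal. Any other leading option would be treated as the command here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: a Map stands in for Configuration.
final class GenericArgsSketch {
  final Map<String, String> conf = new LinkedHashMap<>();
  String command;                 // first argument that is not a -D option, or null
  String[] commandOptions = {};   // everything after the command

  // Parses: [generalOptions] command [commandOptions]
  // Only '-D name=value' and '-Dname=value' are treated as general options.
  static GenericArgsSketch parse(String[] args) {
    GenericArgsSketch result = new GenericArgsSketch();
    int i = 0;
    while (i < args.length && args[i].startsWith("-D")) {
      String pair = args[i].substring(2);
      if (pair.isEmpty()) {       // separated form: -D name=value
        if (++i >= args.length) {
          throw new IllegalArgumentException("-D requires name=value");
        }
        pair = args[i];
      }
      int eq = pair.indexOf('=');
      if (eq <= 0) {
        throw new IllegalArgumentException("expected name=value, got: " + pair);
      }
      result.conf.put(pair.substring(0, eq), pair.substring(eq + 1));
      i++;
    }
    if (i < args.length) {
      result.command = args[i++];
      result.commandOptions = Arrays.copyOfRange(args, i, args.length);
    }
    return result;
  }
}
```

With this sketch, only the commandOptions array would be handed to Tool.run(), while the name/value pairs end up in the configuration stand-in.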

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (HADOOP-59) support generic command-line options

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-59?page=comments#action_12368188 ] 

Doug Cutting commented on HADOOP-59:
------------------------------------

Michel,

I think that -D should define Hadoop configuration properties, not JVM system properties.  And I don't think Hadoop configurations should by default include all of the JVM's system properties.  Finally, please include good user-level javadoc on all public & protected items, attach Apache's license at the top of each file, try to bundle things into a single patch file, etc.  These kinds of things make it much easier for me to commit a patch.  Ideally all that I need to do is read the patch to see that it looks reasonable, apply it with 'patch -p 0 < patchFile', run unit tests and commit.  Including new unit tests is also a good idea.  If I need to clean up  javadoc, licenses, formatting, etc. before I can commit then I will be less inclined to process a contribution promptly.

Doug


> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.1
>     Reporter: Doug Cutting
>     Assignee: Michel Tourn
>      Fix For: 0.1
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, toolbase.patch
>


[jira] Commented: (HADOOP-59) support generic command-line options

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-59?page=comments#action_12425315 ] 
            
Doug Cutting commented on HADOOP-59:
------------------------------------

Owen, what you say makes good sense.  If you feel strongly about it, please submit a new bug.

> support generic command-line options
> ------------------------------------
>
>                 Key: HADOOP-59
>                 URL: http://issues.apache.org/jira/browse/HADOOP-59
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.2.0
>            Reporter: Doug Cutting
>         Assigned To: Hairong Kuang
>            Priority: Minor
>             Fix For: 0.4.0
>
>         Attachments: bashfile.patch, commons-cli-2.0-SNAPSHOT.jar, genericCommand.patch, Tool.java, ToolBase.java, toolbase.patch
>
>

[jira] Updated: (HADOOP-59) support generic command-line options

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-59?page=all ]

Hairong Kuang updated HADOOP-59:
--------------------------------

    Attachment: commons-cli-2.0-SNAPSHOT.jar

In the patch submitted above, general options are processed using Commons CLI with a fix. The fix has not been committed yet, so I have attached the jar here.

> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.2.0
>     Reporter: Doug Cutting
>     Assignee: Hairong Kuang
>     Priority: Minor
>      Fix For: 0.4.0
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, commons-cli-2.0-SNAPSHOT.jar, genericCommand.patch, toolbase.patch
>


[jira] Commented: (HADOOP-59) support generic command-line options

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-59?page=comments#action_12417877 ] 

Doug Cutting commented on HADOOP-59:
------------------------------------

The analogy is with the cvs and svn command line programs, which support both general options, determining which servers to talk to, etc., then command-specific options.  The documentation then can be organized similarly, describing general options, then each command and its options.  The set of commands is extensible (any class with a main() can be specified to bin/hadoop), and each command need only provide documentation for its options.

> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.2.0
>     Reporter: Doug Cutting
>     Assignee: Hairong Kuang
>     Priority: Minor
>      Fix For: 0.4.0
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, commons-cli-2.0-SNAPSHOT.jar, genericCommand.patch, toolbase.patch
>


[jira] Updated: (HADOOP-59) support generic command-line options

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-59?page=all ]

Hairong Kuang updated HADOOP-59:
--------------------------------

    Attachment: genericCommand.patch

This patch provides a general framework to support general command options. Tool provides an interface and ToolBase processes general command options.

General command options supported are:
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a name node
-jt <local|jobtracker:port>    specify a job tracker

The syntax to run bin/hadoop becomes
bin/hadoop command [general options] [command options and arguments]

The patch also changes four tools (DFSShell, DFSck, JobClient, and CopyFiles) to inherit from ToolBase.

Examples using general options are
bin/hadoop dfs -fs darwin:8020 -ls /data
bin/hadoop dfs -Dfs.default.name=darwin:8020 -ls /data
bin/hadoop dfs -conf hadoop-site.xml -ls /data
bin/hadoop job -jt darwin:50020 -status job_0001
bin/hadoop distcp -Dfs.default.name=darwin:8020 -Dmapred.job.tracker=darwin:50020 srcurl dsturl
bin/hadoop fsck -fs darwin:8020 /data

Finally, the patch includes a JUnit test for general options.
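
For readers without the patch at hand, the option handling listed above might look roughly like the sketch below. This is not the code in genericCommand.patch: the class GeneralOptionsSketch is hypothetical, a Map stands in for Configuration, files named by -conf are only recorded rather than loaded, and the mapping of -fs to fs.default.name and -jt to mapred.job.tracker is an assumption inferred from the examples above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the code in genericCommand.patch.
final class GeneralOptionsSketch {
  final Map<String, String> conf = new LinkedHashMap<>();  // stands in for Configuration
  final List<String> confFiles = new ArrayList<>();        // named by -conf, not loaded here

  // Consumes leading general options and returns the remaining
  // command options and arguments.
  String[] process(String[] argv) {
    int i = 0;
    while (i < argv.length) {
      String arg = argv[i];
      if (arg.equals("-conf") || arg.equals("-D")
          || arg.equals("-fs") || arg.equals("-jt")) {
        if (i + 1 >= argv.length) {
          throw new IllegalArgumentException(arg + " requires a value");
        }
        apply(arg, argv[i + 1]);
        i += 2;
      } else if (arg.startsWith("-D")) {   // attached form: -Dproperty=value
        apply("-D", arg.substring(2));
        i++;
      } else {
        break;                             // first command option or argument
      }
    }
    return Arrays.copyOfRange(argv, i, argv.length);
  }

  private void apply(String option, String value) {
    switch (option) {
      case "-conf": confFiles.add(value); break;
      case "-fs":   conf.put("fs.default.name", value); break;     // assumed mapping
      case "-jt":   conf.put("mapred.job.tracker", value); break;  // assumed mapping
      default:      // -D property=value
        int eq = value.indexOf('=');
        if (eq <= 0) {
          throw new IllegalArgumentException("expected property=value, got: " + value);
        }
        conf.put(value.substring(0, eq), value.substring(eq + 1));
    }
  }
}
```

Because parsing stops at the first argument that is not a general option, the general options have to come before the command options in this sketch, which matches the syntax line above.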

> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.2.0
>     Reporter: Doug Cutting
>     Assignee: Hairong Kuang
>     Priority: Minor
>      Fix For: 0.4.0
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, genericCommand.patch, toolbase.patch
>


[jira] Updated: (HADOOP-59) support generic command-line options

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-59?page=all ]

Sameer Paranjpye updated HADOOP-59:
-----------------------------------

    Fix Version: 0.4
                     (was: 0.3)
       Priority: Minor  (was: Major)

> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.2
>     Reporter: Doug Cutting
>     Assignee: Michel Tourn
>     Priority: Minor
>      Fix For: 0.4
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, toolbase.patch
>


[jira] Commented: (HADOOP-59) support generic command-line options

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12461879 ] 

Raghu Angadi commented on HADOOP-59:
------------------------------------


> This does not address other issues oven brings up.

Sorry Owen!

> support generic command-line options
> ------------------------------------
>
>                 Key: HADOOP-59
>                 URL: https://issues.apache.org/jira/browse/HADOOP-59
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.2.0
>            Reporter: Doug Cutting
>         Assigned To: Hairong Kuang
>            Priority: Minor
>             Fix For: 0.4.0
>
>         Attachments: bashfile.patch, commons-cli-2.0-SNAPSHOT.jar, genericCommand.patch, Tool.java, ToolBase.java, toolbase.patch
>
>

[jira] Commented: (HADOOP-59) support generic command-line options

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-59?page=comments#action_12413874 ] 

Sameer Paranjpye commented on HADOOP-59:
----------------------------------------

This looks like it has been addressed by the refactoring done in HADOOP-59. Does this need to stay open?

> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.2
>     Reporter: Doug Cutting
>     Assignee: Michel Tourn
>      Fix For: 0.4
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, toolbase.patch
>


[jira] Resolved: (HADOOP-59) support generic command-line options

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-59?page=all ]
     
Doug Cutting resolved HADOOP-59:
--------------------------------

    Resolution: Fixed

I just committed this.  Thanks, Hairong!

> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.2.0
>     Reporter: Doug Cutting
>     Assignee: Hairong Kuang
>     Priority: Minor
>      Fix For: 0.4.0
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, commons-cli-2.0-SNAPSHOT.jar, genericCommand.patch, toolbase.patch
>


[jira] Commented: (HADOOP-59) support generic command-line options

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12461878 ] 

Raghu Angadi commented on HADOOP-59:
------------------------------------


I know this is a very old bug.

Owen wrote:

> I haven't used the Commons CLI library, but I would have preferred something less inheritance-based, more like:

I think it does not need to be inheritance-based. All we need is to make static member parseGenericOptions() public. This does not address other issues oven brings up.  Then it can be used like:

   String[] commandArgs = ToolBase.processGenericOptions(conf, argv); // In that case it need not be called ToolBase :)
   // rest as before.

> Parser cliParser = new GenericOptionsParser();
> cliParser.addOption("i", false, "ignore read errors");
> cliParser.parse(args, conf);
> boolean ignoreErrors = cliParser.hasOption("i");

The above can also be supported without changing much, I guess, with another static function.


> support generic command-line options
> ------------------------------------
>
>                 Key: HADOOP-59
>                 URL: https://issues.apache.org/jira/browse/HADOOP-59
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.2.0
>            Reporter: Doug Cutting
>         Assigned To: Hairong Kuang
>            Priority: Minor
>             Fix For: 0.4.0
>
>         Attachments: bashfile.patch, commons-cli-2.0-SNAPSHOT.jar, genericCommand.patch, Tool.java, ToolBase.java, toolbase.patch
>
>

[jira] Commented: (HADOOP-59) support generic command-line options

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-59?page=comments#action_12417706 ] 

eric baldeschwieler commented on HADOOP-59:
-------------------------------------------

Any reason we need to segregate general options from command options?  
That seems a little confusing.


> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.2.0
>     Reporter: Doug Cutting
>     Assignee: Hairong Kuang
>     Priority: Minor
>      Fix For: 0.4.0
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, commons-cli-2.0-SNAPSHOT.jar, genericCommand.patch, toolbase.patch
>


[jira] Assigned: (HADOOP-59) support generic command-line options

Posted by "Michel Tourn (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-59?page=all ]

Michel Tourn reassigned HADOOP-59:
----------------------------------

    Assign To: Michel Tourn

> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.1
>     Reporter: Doug Cutting
>     Assignee: Michel Tourn
>      Fix For: 0.1
>


[jira] Commented: (HADOOP-59) support generic command-line options

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-59?page=comments#action_12417960 ] 

Owen O'Malley commented on HADOOP-59:
-------------------------------------

I'm sorry that I didn't pay attention to this issue earlier.

I think that it is confusing to have the CLI options segregated. For example, I have a patch for distcp (aka CopyFiles) that adds a "-i" option to ignore read errors. With this setup, the user needs to specify -i *last*, after all of the generic options. That means the user needs to remember which options are generic and which are command options. (OK, raise your hand if you can tell me the difference between "cvs -d foo co bar" and "cvs co -d foo bar". The sad thing is that I do know, and I'm sure a few of you do too. But it certainly does confuse non-experts.)

Furthermore, each application needs to handle --help itself (and, as a side effect, adding a new generic option means updating the usage string in each application).

I haven't used the commons cli library, but I would have preferred something less inheritance-based, more like:

Parser cliParser = new GenericOptionsParser();
cliParser.addOption("i", false, "ignore read errors");
cliParser.parse(args, conf);
boolean ignoreErrors = cliParser.hasOption("i");
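
A minimal sketch of how such a parser might behave, so that generic and command options can appear in any order and the usage text is produced in one place. Everything here is an assumption made for illustration: the class name OptionsParserSketch, its method names, and the use of a plain Map in place of Hadoop's Configuration. It is not the commons-cli API and not the attached patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the commons-cli API and not the attached patch.
final class OptionsParserSketch {
  // Option name -> whether it takes a value; insertion order is kept for usage().
  private final Map<String, Boolean> takesValue = new LinkedHashMap<>();
  private final Map<String, String> descriptions = new HashMap<>();
  private final Map<String, String> values = new HashMap<>();
  private final Set<String> present = new HashSet<>();
  private final List<String> remaining = new ArrayList<>();

  void addOption(String name, boolean hasArg, String description) {
    takesValue.put(name, hasArg);
    descriptions.put(name, description);
  }

  // Accepts generic (-Dname=value) and registered options in any position.
  // A plain Map stands in for Hadoop's Configuration.
  void parse(String[] args, Map<String, String> conf) {
    for (int i = 0; i < args.length; i++) {
      String arg = args[i];
      int eq = arg.indexOf('=');
      if (arg.startsWith("-D") && eq > 2) {
        conf.put(arg.substring(2, eq), arg.substring(eq + 1));
      } else if (arg.startsWith("-") && takesValue.containsKey(arg.substring(1))) {
        String name = arg.substring(1);
        present.add(name);
        if (takesValue.get(name)) {
          if (i + 1 >= args.length) {
            throw new IllegalArgumentException("Missing value for " + arg);
          }
          values.put(name, args[++i]);
        }
      } else {
        remaining.add(arg); // positional argument or unregistered option
      }
    }
  }

  boolean hasOption(String name) { return present.contains(name); }
  String getOptionValue(String name) { return values.get(name); }
  List<String> getRemainingArgs() { return remaining; }

  // One usage string covers generic and registered options alike.
  String usage() {
    StringBuilder sb = new StringBuilder("  -D<name>=<value>  set a configuration property\n");
    for (String name : takesValue.keySet()) {
      sb.append("  -").append(name).append(takesValue.get(name) ? " <arg>" : "")
        .append("  ").append(descriptions.get(name)).append('\n');
    }
    return sb.toString();
  }
}
```

With this shape, "distcp src -i -Dname=value dst" and "distcp -Dname=value -i src dst" would parse to the same result, and adding a generic option would only require touching the parser.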


> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.2.0
>     Reporter: Doug Cutting
>     Assignee: Hairong Kuang
>     Priority: Minor
>      Fix For: 0.4.0
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, commons-cli-2.0-SNAPSHOT.jar, genericCommand.patch, toolbase.patch



[jira] Updated: (HADOOP-59) support generic command-line options

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-59?page=all ]

Sameer Paranjpye updated HADOOP-59:
-----------------------------------

    Fix Version: 0.2
                     (was: 0.1)
        Version: 0.2
                     (was: 0.1)

> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.2
>     Reporter: Doug Cutting
>     Assignee: Michel Tourn
>      Fix For: 0.2
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, toolbase.patch



[jira] Assigned: (HADOOP-59) support generic command-line options

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-59?page=all ]

Hairong Kuang reassigned HADOOP-59:
-----------------------------------

    Assign To: Hairong Kuang  (was: Michel Tourn)

> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.2.0
>     Reporter: Doug Cutting
>     Assignee: Hairong Kuang
>     Priority: Minor
>      Fix For: 0.4.0
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, toolbase.patch



[jira] Updated: (HADOOP-59) support generic command-line options

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-59?page=all ]

Doug Cutting updated HADOOP-59:
-------------------------------

    Fix Version: 0.3
                     (was: 0.2)

> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.2
>     Reporter: Doug Cutting
>     Assignee: Michel Tourn
>      Fix For: 0.3
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, toolbase.patch



[jira] Commented: (HADOOP-59) support generic command-line options

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-59?page=comments#action_12416263 ] 

Hairong Kuang commented on HADOOP-59:
-------------------------------------

I am thinking of designing ToolBase as follows:

public abstract class ToolBase extends Configured implements Tool {
  public final void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Strip the general options from args and apply them to conf.
    String[] commandOptions = parseGeneralOptions(conf, args);
    this.configure(conf);
    // Only the command-specific options reach run().
    this.run(commandOptions);
  }
}

parseGeneralOptions parses the arguments, looks for general options, and sets the configuration accordingly. It returns the command-specific options.

I plan to support four general options:

-fs <local | namenode:port>      specify the file system
-jt <local | jobtracker:port>    specify the job tracker
-config <configuration file>     specify a file that contains application-specific configuration
-Dname=value                     set a property to the given value
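
A minimal sketch of what parseGeneralOptions could look like for the four options listed above. It is an illustration under stated assumptions, not the attached implementation: a plain Map stands in for Configuration, the property names fs.default.name and mapred.job.tracker are assumed targets for -fs and -jt, and sketch.config.file is a placeholder key for -config (a real implementation would more likely load the file as a configuration resource).

```java
import java.util.Arrays;
import java.util.Map;

// Hypothetical sketch: a plain Map stands in for Hadoop's Configuration.
final class GeneralOptionsSketch {

  // Consumes the leading general options, records them in conf, and
  // returns the remaining (command-specific) arguments unchanged.
  static String[] parseGeneralOptions(Map<String, String> conf, String[] args) {
    int i = 0;
    while (i < args.length) {
      String arg = args[i];
      int eq = arg.indexOf('=');
      if (arg.equals("-fs") || arg.equals("-jt") || arg.equals("-config")) {
        if (i + 1 >= args.length) {
          throw new IllegalArgumentException("Missing value for " + arg);
        }
        conf.put(keyFor(arg), args[i + 1]);
        i += 2;
      } else if (arg.startsWith("-D") && eq > 2) {
        // -Dname=value: everything between "-D" and "=" is the property name.
        conf.put(arg.substring(2, eq), arg.substring(eq + 1));
        i += 1;
      } else {
        break; // first non-general argument: the rest belong to the command
      }
    }
    return Arrays.copyOfRange(args, i, args.length);
  }

  // Assumed mapping from option to property name; the -config key is a placeholder.
  private static String keyFor(String option) {
    switch (option) {
      case "-fs": return "fs.default.name";
      case "-jt": return "mapred.job.tracker";
      default: return "sketch.config.file";
    }
  }
}
```

Because parsing stops at the first argument that is not a general option, a command option such as -i has to come after every general option in this sketch.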

> support generic command-line options
> ------------------------------------
>
>          Key: HADOOP-59
>          URL: http://issues.apache.org/jira/browse/HADOOP-59
>      Project: Hadoop
>         Type: Improvement
>   Components: conf
>     Versions: 0.2.0
>     Reporter: Doug Cutting
>     Assignee: Hairong Kuang
>     Priority: Minor
>      Fix For: 0.4.0
>  Attachments: Tool.java, ToolBase.java, bashfile.patch, toolbase.patch
