
Posted to common-dev@hadoop.apache.org by "Benjamin Reed (JIRA)" <ji...@apache.org> on 2007/05/02 23:26:15 UTC

[jira] Resolved: (HADOOP-435) Encapsulating startup scripts and jars in a single Jar file.

     [ https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed resolved HADOOP-435.
----------------------------------

    Resolution: Won't Fix

The encapsulating Jar aspect doesn't seem to be an issue to most people. This should probably be reopened as an issue to redo the hadoop script since that seems to be the direction the issue is heading.

> Encapsulating startup scripts and jars in a single Jar file.
> ------------------------------------------------------------
>
>                 Key: HADOOP-435
>                 URL: https://issues.apache.org/jira/browse/HADOOP-435
>             Project: Hadoop
>          Issue Type: New Feature
>    Affects Versions: 0.12.1
>            Reporter: Benjamin Reed
>             Fix For: 0.13.0
>
>         Attachments: hadoop-exe.patch, hadoop-exe.patch, hadoop-exe.patch, hadoop-exe.patch, hadoopit.patch, hadoopit.patch, hadoopit.patch, start.sh, stop.sh
>
>
> Currently, Hadoop is a set of scripts, configurations, and jar files. This makes it a pain to install on compute nodes and datanodes. It also makes it a pain to set up clients so that they can use Hadoop. Every time things are updated, the pain begins again.
> I suggest that we should be able to build a single jar file that has a Main-Class defined, with the configuration built in, so that we can distribute that one file to nodes and clients on updates. One nice addition that I haven't done yet would be to make the jar file downloadable from the JobTracker web page so that clients can easily submit jobs.
> I currently use such a setup on my small cluster. To start the job tracker I use "java -jar hadoop.jar -l /tmp/log jobtracker"; to submit a job I use "java -jar hadoop.jar jar wordcount.jar". I have used the client on my Linux and Mac OS X machines, and all I need installed is Java and the hadoop.jar file.
> hadoop.jar helps with log files and configurations. By default the config files are pulled from the jar file, but this can be overridden by specifying a config directory, so that you can easily have machine-specific configs and still have the same hadoop.jar on all machines.
> Here are the available commands from hadoop.jar:
> USAGE: hadoop [-l logdir] command
>   User commands:
>     dfs          run a DFS admin client
>     jar          run a JAR file
>     job          manipulate MapReduce jobs
>     fsck         run a DFS filesystem check utility
>   Runtime startup commands:
>     datanode     run a DFS datanode
>     jobtracker   run the MapReduce job Tracker node
>     namenode     run the DFS namenode (namenode -format formats the FS)
>     tasktracker  run a MapReduce task Tracker node
>   HadoopLoader commands:
>     buildJar     builds the HadoopLoader jar file
>     conf         dump hadoop configuration
> Note that I don't have the classes for Hadoop streaming built into this jar file, but if I did, streaming would also appear as a command (the loader checks for the needed classes before displaying each command). This makes it very easy for users who just write scripts to use Hadoop straight from their machines.
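
A minimal sketch of the "check for needed classes before displaying an option" idea, written in present-day Java. The class name CommandAvailability, its methods, and the example class names are hypothetical illustrations and are not taken from the attached patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CommandAvailability {
    // Returns true if the named class can be located on the current classpath.
    public static boolean isAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, CommandAvailability.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Keeps only the commands whose implementing class is present, in insertion order.
    public static List<String> availableCommands(Map<String, String> commandToClass) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : commandToClass.entrySet()) {
            if (isAvailable(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> commands = new LinkedHashMap<>();
        // Stand-in entries: a class that always exists and one that is assumed to be missing.
        commands.put("always", "java.lang.String");
        commands.put("streaming", "org.example.hypothetical.StreamingMain");
        System.out.println(availableCommands(commands));
    }
}
```

With a check like this, a usage listing can be built from whatever happens to be bundled in the jar, so the same loader code works whether or not optional components are included.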
> I'm also attaching the start.sh and stop.sh scripts that I use. These are the only scripts I use to start up the daemons. They are very simple, and the start.sh script uses the config file to figure out whether or not to start the jobtracker and the namenode.
> The attached patch adds the HadoopIt patch, modifies the Configuration class to find the config files correctly, and modifies the build to make a fully contained hadoop.jar. To update the configuration in a hadoop.jar, you simply run "zip hadoop.jar hadoop-site.xml".
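
The configuration lookup described in the issue (use a config directory when one is specified, otherwise fall back to the copy bundled in the jar) could look roughly like the sketch below. ConfigLocator and its open method are hypothetical names for illustration; this is not the change the patch makes to the Configuration class.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;

public class ConfigLocator {
    /**
     * Opens the named config file. If configDir is non-null and contains the
     * file, that copy wins; otherwise the classpath is searched, which for a
     * self-contained jar means the copy bundled inside the jar itself.
     * Returns null if the file is found in neither place.
     */
    public static InputStream open(String name, File configDir) throws IOException {
        if (configDir != null) {
            File candidate = new File(configDir, name);
            if (candidate.isFile()) {
                return Files.newInputStream(candidate.toPath());
            }
        }
        return ConfigLocator.class.getClassLoader().getResourceAsStream(name);
    }
}
```

Under this scheme every machine can run an identical jar, and only the machines that need different settings carry a local directory with their own hadoop-site.xml.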

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Re: [jira] Resolved: (HADOOP-435) Encapsulating startup scripts and jars in a single Jar file.

Posted by Doug Cutting <cu...@apache.org>.

Nigel Daley wrote:
> Just expressing my disappointment that this patch wasn't applied as is 
> and then another Jira opened to rework the scripts.  Should have spoken 
> up earlier.

Re-open it if you like.  I don't see why we can't re-work the script as 
a part of this issue.  Otherwise we end up with duplicated logic with no 
guarantee that it will ever be removed.  The script change should be easy.

Doug

Re: [jira] Resolved: (HADOOP-435) Encapsulating startup scripts and jars in a single Jar file.

Posted by Nigel Daley <nd...@yahoo-inc.com>.

Just expressing my disappointment that this patch wasn't applied as  
is and then another Jira opened to rework the scripts.  Should have  
spoken up earlier.
