Posted to common-user@hadoop.apache.org by Jun Young Kim <ju...@gmail.com> on 2011/02/24 07:55:11 UTC

is there a smarter way to run a hadoop job?

Hi,
I currently launch my job by calling the hadoop shell command directly from Java:

import java.io.BufferedReader;
import java.io.InputStreamReader;

String runInCommand = "/opt/hadoop-0.21.0/bin/hadoop jar testCluster.jar example";

Process proc = Runtime.getRuntime().exec(runInCommand);

// read stderr before waitFor(): waiting first can deadlock once the
// child's output buffer fills up
BufferedReader in = new BufferedReader(new InputStreamReader(proc.getErrorStream()));
for (String str; (str = in.readLine()) != null;)
     System.out.println(str);

proc.waitFor();
System.exit(0);

but internally, the hadoop script just invokes the RunJar class to run
testCluster.jar, doesn't it?

is there a smarter way to run a hadoop job?

thanks.

-- 
Junyoung Kim (juneng603@gmail.com)


Re: is there a smarter way to run a hadoop job?

Posted by Jun Young Kim <ju...@gmail.com>.
hello, harsh.

do you mean I need to read the XML files and parse them myself to set the
values in my app?


Junyoung Kim (juneng603@gmail.com)


On 02/25/2011 03:32 PM, Harsh J wrote:
> It is best if your application gets
> the right configuration files on its classpath itself, so that the
> right values are read (how else would it know your values!).

Re: is there a smarter way to run a hadoop job?

Posted by JunYoung Kim <ju...@gmail.com>.
hi,

if possible, could you give me some examples of how to load the configuration info?

I've already tried putting the hadoop and hadoop/conf paths in my $CLASSPATH --> that was not a solution for me.

how can I load the cluster configuration?

thanks.

On Feb 25, 2011, at 3:32 PM, Harsh J wrote:

> Hello again,
> 
> Finals won't help with logic that runs in the
> front-end/Driver code. If you're using fs.default.name inside a Task
> somehow, final will help there. It is best if your application gets
> the right configuration files on its classpath itself, so that the
> right values are read (how else would it know your values!).
> 
> Alternatively, you can use GenericOptionsParser to parse -fs and -jt
> arguments when the Driver is launched from commandline.
> 
> On Fri, Feb 25, 2011 at 11:46 AM, Jun Young Kim <ju...@gmail.com> wrote:
>> Hi, Harsh.
>> 
>> I've already tried using the <final> tag to make the value unmodifiable,
>> but the result is no different.
>> 
>> *core-site.xml:*
>> <configuration>
>> <property>
>> <name>fs.default.name</name>
>> <value>hdfs://localhost</value>
>> <final>true</final>
>> </property>
>> </configuration>
>> 
>> the other *-site.xml files are modified the same way.
>> 
>> thanks.
>> 
>> Junyoung Kim (juneng603@gmail.com)
>> 
>> 
>> On 02/25/2011 02:50 PM, Harsh J wrote:
>>> 
>>> Hi,
>>> 
>>> On Fri, Feb 25, 2011 at 10:17 AM, Jun Young Kim<ju...@gmail.com>
>>>  wrote:
>>>> 
>>>> hi,
>>>> 
>>>> I found the cause of my problem.
>>>>
>>>> when I submit the job via the shell,
>>>>
>>>> conf.get("fs.default.name") is "hdfs://localhost"
>>>>
>>>> when I submit the job directly from a Java application,
>>>>
>>>> conf.get("fs.default.name") is "file://localhost"
>>>> so I couldn't read any files from HDFS.
>>>>
>>>> I think my Java app isn't reading the *-site.xml
>>>> configurations properly.
>>> 
>>> Have a look at this Q:
>>> 
>>> http://wiki.apache.org/hadoop/FAQ#How_do_I_get_my_MapReduce_Java_Program_to_read_the_Cluster.27s_set_configuration_and_not_just_defaults.3F
>>> 
>> 
> 
> 
> 
> -- 
> Harsh J
> www.harshj.com


Re: is there a smarter way to run a hadoop job?

Posted by Harsh J <qw...@gmail.com>.
Hello again,

Finals won't help with logic that runs in the
front-end/Driver code. If you're using fs.default.name inside a Task
somehow, final will help there. It is best if your application gets
the right configuration files on its classpath itself, so that the
right values are read (how else would it know your values!).

Alternatively, you can use GenericOptionsParser to parse -fs and -jt
arguments when the Driver is launched from commandline.
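
For instance, a minimal Driver sketch (class name, job name, and the
launch command below are illustrative, untested):

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ExampleDriver extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    // getConf() already reflects any -fs/-jt/-conf generic options,
    // because ToolRunner passes them through GenericOptionsParser
    Job job = new Job(getConf(), "example");
    job.setJarByClass(ExampleDriver.class);
    // ... set mapper/reducer and key/value classes here ...
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    // ToolRunner strips the generic options before calling run()
    System.exit(ToolRunner.run(new ExampleDriver(), args));
  }
}

Launched as, e.g.:

hadoop jar testCluster.jar ExampleDriver -fs hdfs://localhost -jt localhost:9001 in out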

On Fri, Feb 25, 2011 at 11:46 AM, Jun Young Kim <ju...@gmail.com> wrote:
> Hi, Harsh.
>
> I've already tried using the <final> tag to make the value unmodifiable,
> but the result is no different.
>
> *core-site.xml:*
> <configuration>
> <property>
> <name>fs.default.name</name>
> <value>hdfs://localhost</value>
> <final>true</final>
> </property>
> </configuration>
>
> the other *-site.xml files are modified the same way.
>
> thanks.
>
> Junyoung Kim (juneng603@gmail.com)
>
>
> On 02/25/2011 02:50 PM, Harsh J wrote:
>>
>> Hi,
>>
>> On Fri, Feb 25, 2011 at 10:17 AM, Jun Young Kim<ju...@gmail.com>
>>  wrote:
>>>
>>> hi,
>>>
>>> I found the cause of my problem.
>>>
>>> when I submit the job via the shell,
>>>
>>> conf.get("fs.default.name") is "hdfs://localhost"
>>>
>>> when I submit the job directly from a Java application,
>>>
>>> conf.get("fs.default.name") is "file://localhost"
>>> so I couldn't read any files from HDFS.
>>>
>>> I think my Java app isn't reading the *-site.xml
>>> configurations properly.
>>
>> Have a look at this Q:
>>
>> http://wiki.apache.org/hadoop/FAQ#How_do_I_get_my_MapReduce_Java_Program_to_read_the_Cluster.27s_set_configuration_and_not_just_defaults.3F
>>
>



-- 
Harsh J
www.harshj.com

Re: is there a smarter way to run a hadoop job?

Posted by Jun Young Kim <ju...@gmail.com>.
Hi, Harsh.

I've already tried using the <final> tag to make the value unmodifiable,
but the result is no different.

*core-site.xml:*
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost</value>
<final>true</final>
</property>
</configuration>

the other *-site.xml files are modified the same way.

thanks.

Junyoung Kim (juneng603@gmail.com)


On 02/25/2011 02:50 PM, Harsh J wrote:
> Hi,
>
> On Fri, Feb 25, 2011 at 10:17 AM, Jun Young Kim<ju...@gmail.com>  wrote:
>> hi,
>>
>> I found the cause of my problem.
>>
>> when I submit the job via the shell,
>>
>> conf.get("fs.default.name") is "hdfs://localhost"
>>
>> when I submit the job directly from a Java application,
>>
>> conf.get("fs.default.name") is "file://localhost"
>> so I couldn't read any files from HDFS.
>>
>> I think my Java app isn't reading the *-site.xml configurations
>> properly.
>
> Have a look at this Q:
> http://wiki.apache.org/hadoop/FAQ#How_do_I_get_my_MapReduce_Java_Program_to_read_the_Cluster.27s_set_configuration_and_not_just_defaults.3F
>

Re: is there a smarter way to run a hadoop job?

Posted by Harsh J <qw...@gmail.com>.
Hi,

On Fri, Feb 25, 2011 at 10:17 AM, Jun Young Kim <ju...@gmail.com> wrote:
> hi,
>
> I found the cause of my problem.
>
> when I submit the job via the shell,
>
> conf.get("fs.default.name") is "hdfs://localhost"
>
> when I submit the job directly from a Java application,
>
> conf.get("fs.default.name") is "file://localhost"
> so I couldn't read any files from HDFS.
>
> I think my Java app isn't reading the *-site.xml configurations
> properly.


Have a look at this Q:
http://wiki.apache.org/hadoop/FAQ#How_do_I_get_my_MapReduce_Java_Program_to_read_the_Cluster.27s_set_configuration_and_not_just_defaults.3F
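
In short: either put the cluster's conf directory on the client
classpath, or load the site files explicitly. A minimal sketch of the
latter (the install path is illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
conf.addResource(new Path("/opt/hadoop-0.21.0/conf/core-site.xml"));
conf.addResource(new Path("/opt/hadoop-0.21.0/conf/hdfs-site.xml"));
// should now print hdfs://localhost instead of the local-FS default
System.out.println(conf.get("fs.default.name"));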

-- 
Harsh J
www.harshj.com

Re: is there a smarter way to run a hadoop job?

Posted by Jun Young Kim <ju...@gmail.com>.
hi,

I found the cause of my problem.

when I submit the job via the shell,

conf.get("fs.default.name") is "hdfs://localhost"

when I submit the job directly from a Java application,

conf.get("fs.default.name") is "file://localhost"
so I couldn't read any files from HDFS.

I think my Java app isn't reading the *-site.xml
configurations properly.
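
One quick sanity check (sketch): confirm the conf directory is really on
the application's classpath, since that is where Configuration looks for
core-site.xml.

// e.g. when launching the app by hand (paths illustrative):
// java -cp myapp.jar:/opt/hadoop-0.21.0/conf:... ExampleDriver
ClassLoader cl = Thread.currentThread().getContextClassLoader();
// null means core-site.xml is not visible, so only defaults are used
System.out.println(cl.getResource("core-site.xml"));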

Junyoung Kim (juneng603@gmail.com)


On 02/24/2011 06:41 PM, Harsh J wrote:
> Hey,
>
> On Thu, Feb 24, 2011 at 2:36 PM, Jun Young Kim<ju...@gmail.com>  wrote:
>> How should I do that?
> In the new API, the 'Job' class also has Job.submit() and
> Job.waitForCompletion(boolean) methods. Please see the API here:
> http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/Job.html
>

Re: is there a smarter way to run a hadoop job?

Posted by Jun Young Kim <ju...@gmail.com>.
Now I am using the Job.waitForCompletion(boolean) method to submit my job,
but my jar cannot open HDFS files. Also, after submitting the job, I can't
see its history on the admin pages (jobtracker.jsp) even when the job
succeeds.

For example, I set the input path to "hdfs:/user/juneng/1.input",
but I get this error:

Wrong FS: hdfs:/user/juneng/1.input, expected: file:///
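
The same mismatch shows up in isolation (sketch):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

Configuration conf = new Configuration();   // whatever config the app sees
FileSystem fs = FileSystem.get(conf);       // here: LocalFileSystem, not HDFS
System.out.println(fs.getUri());            // prints file:/// for me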

Junyoung Kim (juneng603@gmail.com)


On 02/24/2011 06:41 PM, Harsh J wrote:
>
> In the new API, the 'Job' class also has Job.submit() and
> Job.waitForCompletion(boolean) methods. Please see the API here:
> http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/Job.html

Re: is there a smarter way to run a hadoop job?

Posted by Harsh J <qw...@gmail.com>.
Hey,

On Thu, Feb 24, 2011 at 2:36 PM, Jun Young Kim <ju...@gmail.com> wrote:
> How should I do that?

In the new API, the 'Job' class also has Job.submit() and
Job.waitForCompletion(boolean) methods. Please see the API here:
http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/Job.html
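
Submission then looks roughly like this (job name and the driver class
are illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = new Configuration();
Job job = new Job(conf, "example");
job.setJarByClass(ExampleDriver.class);   // hypothetical driver class
// ... set mapper/reducer, input/output formats and paths ...

// submit() returns immediately; waitForCompletion(true) blocks and
// prints progress/counters to the console
boolean ok = job.waitForCompletion(true);
System.exit(ok ? 0 : 1);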

-- 
Harsh J
www.harshj.com

Re: is there a smarter way to run a hadoop job?

Posted by Jun Young Kim <ju...@gmail.com>.
hello, harsh.

to use the MultipleOutputs class, I need to pass a Job instance as the
first argument when configuring my job's outputs:

addNamedOutput(Job job, String namedOutput,
               Class<? extends OutputFormat> outputFormatClass,
               Class<?> keyClass, Class<?> valueClass)
           Adds a named output for the job.
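
A typical call looks like this (the output name and classes are
illustrative):

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// register a named output called "text" on the job being configured
MultipleOutputs.addNamedOutput(job, "text", TextOutputFormat.class,
    LongWritable.class, Text.class);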

As you know, the Job class is deprecated in 0.21.0.

I want to submit my job to the cluster the way runJob() does.

How should I do that?

Junyoung Kim (juneng603@gmail.com)


On 02/24/2011 04:12 PM, Harsh J wrote:
> Hello,
>
> On Thu, Feb 24, 2011 at 12:25 PM, Jun Young Kim<ju...@gmail.com>  wrote:
>> Hi,
>> I currently launch my job by calling the hadoop shell command directly
>> from Java.
> What are you doing within your testCluster.jar? If you are simply
> submitting a job, you can use a Driver method and get rid of all these
> hassles. The JobClient and Job classes both support submitting jobs from
> the Java API itself.
>
> Please read the tutorial on submitting application code via code
> itself: http://developer.yahoo.com/hadoop/tutorial/module4.html#driver
> Notice the last line in the code presented there, which submits a job
> itself. Using runJob() also prints your progress/counters etc.
>
> The way you've implemented this looks unnecessary when your Jar itself
> can be made runnable with a Driver!
>

Re: is there a smarter way to run a hadoop job?

Posted by Harsh J <qw...@gmail.com>.
Hello,

On Thu, Feb 24, 2011 at 12:25 PM, Jun Young Kim <ju...@gmail.com> wrote:
> Hi,
> I currently launch my job by calling the hadoop shell command directly
> from Java.

What are you doing within your testCluster.jar? If you are simply
submitting a job, you can use a Driver method and get rid of all these
hassles. The JobClient and Job classes both support submitting jobs from
the Java API itself.

Please read the tutorial on submitting application code via code
itself: http://developer.yahoo.com/hadoop/tutorial/module4.html#driver
Notice the last line in the code presented there, which submits a job
itself. Using runJob() also prints your progress/counters etc.
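
A minimal old-API Driver along those lines (class name and paths are
illustrative, untested):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class ExampleDriver {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(ExampleDriver.class);
    conf.setJobName("example");
    // ... set mapper/reducer and key/value classes here ...
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    // submits the job and prints progress/counters until it finishes
    JobClient.runJob(conf);
  }
}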

The way you've implemented this looks unnecessary when your Jar itself
can be made runnable with a Driver!

-- 
Harsh J
www.harshj.com