Posted to user@hbase.apache.org by bharath vissapragada <bh...@students.iiit.ac.in> on 2009/07/23 08:46:10 UTC

HBase and Hadoop Config to run in Standalone mode

Hi all,

I want to run HBase in standalone mode to test my HBase MapReduce programs. I
have downloaded a built version of hbase-0.20 and I have hadoop 0.19.3.

I have set JAVA_HOME in both of them. I then started HBase and inserted
some tables using the Java API. Now I have written some MR programs on HBase,
and when I run them the job completes without any errors and all
the map/reduce statistics are displayed correctly, but I get no output.

I have one doubt: how does HBase recognize Hadoop in standalone mode (I
haven't even started Hadoop)? Even simple print statements do not work;
no output is displayed on the screen. I suspect my config.

Do I need to add some config to run them? Please reply.
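As the post above notes, standalone HBase needs little more than JAVA_HOME set
in each project's env script. A minimal sketch of that setting (the JDK path
below is only an example; adjust it to your install):

```shell
# conf/hbase-env.sh (and conf/hadoop-env.sh) -- example JDK path, not a real default
export JAVA_HOME=/usr/lib/jvm/java-6-sun
```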

Re: HBase and Hadoop Config to run in Standalone mode

Posted by Jean-Daniel Cryans <jd...@apache.org>.
If you run HBase in standalone mode, there is no need for Hadoop
configuration. What you are trying to do is very common; I think you
are just missing some concepts.

I can also understand that it is urgent for you, but take into
consideration that the people answering on this mailing list have no
contractual agreement with you and when we answer we do it on our own
time. Thank you.

J-D

On Thu, Jul 23, 2009 at 11:34 AM, bharath
vissapragada<bh...@gmail.com> wrote:
> I suspect there is some problem in my HBase or Hadoop conf. Can you point me to
> any link or explanation of MapReduce on HBase in standalone mode? Please, it's
> kinda urgent!
>

Re: HBase and Hadoop Config to run in Standalone mode

Posted by bharath vissapragada <bh...@gmail.com>.
I suspect there is some problem in my HBase or Hadoop conf. Can you point me to
any link or explanation of MapReduce on HBase in standalone mode? Please, it's
kinda urgent!

On Thu, Jul 23, 2009 at 8:43 PM, bharath vissapragada <
bharathvissapragada1990@gmail.com> wrote:

> Thanks for your reply, J-D. I'm running it from the command line, and I'm
> pasting part of the code here:
>
>  public void mapp(ImmutableBytesWritable row, RowResult value,
> OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
>                 System.out.println(row);
> }
>
> public JobConf createSubmittableJob(String[] args) throws IOException {
>                 JobConf c = new JobConf(getConf(), MR_DS_Scan_Case1.class);
>                 c.set("col.name", args[1]);
>                 c.set("operator.name", args[2]);
>                 c.set("val.name", args[3]);
>                 IdentityTableMap.initJob(args[0], args[1], this.getClass(),
> c);
>                 c.setOutputFormat(NullOutputFormat.class);
>                 return c;
> }
>
> As you can see, I'm just printing the value of the row in the map, but I can't
> see it in the terminal.
> I only want the map phase, so I didn't write any reduce phase. Is my
> JobConf correct?
>
> Also, as I have already asked: how do I check the job logs and a web interface
> like "localhost:<port>/jobTracker.jsp", since I'm running in local mode?
>
>
> On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
>
>> What output do you need exactly? I see that you have 8 output records
>> in your reduce task, so if you take a look in your output folder or
>> table (I don't know which sink you used) you should see them.
>>
>> Also, did you run your MR inside Eclipse or from the command line?
>>
>> Thx,
>>
>> J-D
>>
>> On Thu, Jul 23, 2009 at 8:30 AM, bharath
>> vissapragada<bh...@students.iiit.ac.in> wrote:
>> > This is the output I got; everything seems fine, but no output!
>> >
>> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>> > processName=JobTracker, sessionId=
>> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.  User
>> classes
>> > may not be found. See JobConf(Class) or JobConf#setJar(String).
>> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
>> > 0->localhost.localdomain:,
>> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job: job_local_0001
>> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
>> > 0->localhost.localdomain:,
>> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
>> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
>> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer = 79691776/99614720
>> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer = 262144/327680
>> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
>> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
>> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
>> Task:attempt_local_0001_m_000000_0
>> > is done. And is in the process of commiting
>> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>> > 'attempt_local_0001_m_000000_0' done.
>> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
>> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass, with
>> 1
>> > segments left of total size: 333 bytes
>> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
>> Task:attempt_local_0001_r_000000_0
>> > is done. And is in the process of commiting
>> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
>> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>> > 'attempt_local_0001_r_000000_0' done.
>> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete: job_local_0001
>> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
>> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes written=78378
>> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output records=0
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output records=8
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input records=0
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8
>> >
>> >
>> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
>> > bharat_v@students.iiit.ac.in> wrote:
>> >
>> >> Since I haven't started the cluster, I can't even see the details in
>> >> "localhost:<port>/jobTracker.jsp". I didn't even add anything to
>> >> hadoop/conf/hadoop-site.xml.
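One plausible reading of the code above, not confirmed anywhere in the thread:
IdentityTableMap.initJob() is given this.getClass() as the mapper, but the
posted method is named mapp, not map. In Java, a method with a different name
never overrides the parent's map(), so the inherited identity mapper would run
and the println would never be reached. A minimal, HBase-free sketch of that
overriding rule (all class names below are hypothetical stand-ins, not HBase
classes):

```java
// Sketch of the Java overriding rule only; BaseMapper stands in for a
// framework mapper such as IdentityTableMap and is NOT an HBase class.
class BaseMapper {
    // The framework calls map() through the base type.
    public String map(String row) {
        return "identity:" + row;          // inherited identity behaviour
    }
}

class MisnamedMapper extends BaseMapper {
    // Named "mapp", so this does NOT override map(); it is an
    // unrelated method the framework never calls.
    public String mapp(String row) {
        System.out.println(row);           // never reached via map()
        return "printed:" + row;
    }
}

public class OverrideDemo {
    public static void main(String[] args) {
        BaseMapper m = new MisnamedMapper();
        System.out.println(m.map("row1")); // prints "identity:row1"
    }
}
```

If the thread's mapp was meant to be the mapper body, renaming it to map with
the exact parameter list expected by the HBase 0.20 TableMap interface would be
the first thing to try.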

Re: HBase and Hadoop Config to run in Standalone mode

Posted by stack <st...@duboce.net>.
On Thu, Jul 23, 2009 at 8:54 AM, bharath vissapragada <
bharathvissapragada1990@gmail.com> wrote:

> I am really thankful to you, J-D, for replying in spite of your busy
> schedule.
> I am still at a learning stage and there are no good guides on HBase other
> than its own one, so please bear with me; I really appreciate your help.
>
> Now I get your point that there is no need for Hadoop while running HBase MR
> programs in standalone mode, but I am confused about the config. I have only
> set JAVA_HOME in "hbase-env.sh" and didn't do anything else, so I wonder if
> my conf is wrong or there is some error in that simple code, because stdout
> worked for me while writing MapReduce programs before.
>


You seem to be asking basic questions about how Hadoop works.

If you want a digestible overview of Hadoop and all of its pieces, and you
are not getting satisfaction from the Hadoop wiki or its documentation,
you might do better with one of the Hadoop books:
http://wiki.apache.org/hadoop/Books.

Just a suggestion,
St.Ack




>
> Thanks once again!
>
> On Thu, Jul 23, 2009 at 9:14 PM, Jean-Daniel Cryans <jdcryans@apache.org
> >wrote:
>
> > The code itself is very simple; I was referring to your own
> > description of your situation. You say you use standalone HBase, yet
> > you talk about Hadoop configuration. You also talk about the
> > JobTracker web UI, which is of no use since you run local jobs directly
> > on HBase.
> >
> > J-D
> >
> > On Thu, Jul 23, 2009 at 11:41 AM, bharath
> > vissapragada<bh...@gmail.com> wrote:
> > > I used stdout for debugging while writing code in Hadoop MR programs and
> > > it worked fine.
> > > Can you please tell me which part of the code you found confusing, so
> > > that I can explain it a bit more clearly?
> > >
> > >
> > > On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans <
> jdcryans@apache.org
> > >wrote:
> > >
> > >> What you wrote is a bit confusing to me, sorry.
> > >>
> > >> The usual way to debug MR jobs is to define a logger and log at
> > >> either info or debug level, not sysout like you did. I'm not even sure
> > >> where the standard output is logged when using a local job. Also, since
> > >> this is local, you won't see anything in your host:50030 web UI. So use
> > >> Apache commons-logging and you should see your output.
> > >>
> > >> J-D
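J-D's suggestion above is to debug with a logger rather than System.out, so
the output lands wherever the framework's logging is configured. A minimal
stand-in sketch using java.util.logging (the thread itself recommends Apache
commons-logging; the class and message below are hypothetical):

```java
import java.util.logging.Logger;

// Sketch: route debug output through a logger instead of System.out.
// java.util.logging stands in here for commons-logging, which the
// thread recommends; the HBase job itself is omitted.
public class MapLoggingDemo {
    private static final Logger LOG =
        Logger.getLogger(MapLoggingDemo.class.getName());

    // Hypothetical stand-in for a map() body.
    static String processRow(String row) {
        // Appears in the configured log, not necessarily the terminal.
        LOG.info("map saw row: " + row);
        return row;
    }

    public static void main(String[] args) {
        processRow("row1");
    }
}
```

With commons-logging the equivalent pattern is a
`private static final Log LOG = LogFactory.getLog(MyClass.class);` field and
`LOG.info(...)` calls in the mapper.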

Re: HBase and Hadoop Config to run in Standalone mode

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Well you should always set an output directory, but in your case I see
that the job still ran.

J-D

On Thu, Jul 23, 2009 at 12:52 PM, bharath
vissapragada<bh...@gmail.com> wrote:
> I have set "c.setOutputFormat(NullOutputFormat.class);" because otherwise it
> shows the error
> "Output directory not set in JobConf."
>
> I think this is causing trouble. Any idea?
>
>
>
> On Thu, Jul 23, 2009 at 10:12 PM, bharath vissapragada <
> bharathvissapragada1990@gmail.com> wrote:
>
>> I have tried Apache commons-logging.
>>
>> Instead of printing the row, I wrote log.error(row), and
>> even then I got the same output, as follows:
>>
>> 09/07/24 03:41:38 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>> processName=JobTracker, sessionId=
>> 09/07/24 03:41:38 WARN mapred.JobClient: No job jar file set.  User classes
>> may not be found. See JobConf(Class) or JobConf#setJar(String).
>> 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
>> 0->localhost.localdomain:,
>> 09/07/24 03:41:39 INFO mapred.JobClient: Running job: job_local_0001
>> 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
>> 0->localhost.localdomain:,
>> 09/07/24 03:41:40 INFO mapred.MapTask: numReduceTasks: 1
>> 09/07/24 03:41:40 INFO mapred.MapTask: io.sort.mb = 100
>> 09/07/24 03:41:40 INFO mapred.MapTask: data buffer = 79691776/99614720
>> 09/07/24 03:41:40 INFO mapred.MapTask: record buffer = 262144/327680
>> 09/07/24 03:41:40 INFO mapred.MapTask: Starting flush of map output
>> 09/07/24 03:41:40 INFO mapred.MapTask: Finished spill 0
>> 09/07/24 03:41:40 INFO mapred.TaskRunner:
>> Task:attempt_local_0001_m_000000_0 is done. And is in the process of
>> commiting
>> 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
>> 'attempt_local_0001_m_000000_0' done.
>> 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> 09/07/24 03:41:40 INFO mapred.Merger: Merging 1 sorted segments
>> 09/07/24 03:41:40 INFO mapred.Merger: Down to the last merge-pass, with 1
>> segments left of total size: 333 bytes
>> 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> 09/07/24 03:41:40 INFO mapred.TaskRunner:
>> Task:attempt_local_0001_r_000000_0 is done. And is in the process of
>> commiting
>> 09/07/24 03:41:40 INFO mapred.LocalJobRunner: reduce > reduce
>> 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
>> 'attempt_local_0001_r_000000_0' done.
>> 09/07/24 03:41:40 INFO mapred.JobClient: Job complete: job_local_0001
>> 09/07/24 03:41:40 INFO mapred.JobClient: Counters: 11
>> 09/07/24 03:41:40 INFO mapred.JobClient:   File Systems
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes read=38933
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes written=78346
>> 09/07/24 03:41:40 INFO mapred.JobClient:   Map-Reduce Framework
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input groups=8
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Combine output records=0
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Map input records=8
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce output records=8
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Map output bytes=315
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Map input bytes=0
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Combine input records=0
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Map output records=8
>> 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input records=8
>>
>>
>>
>> On Thu, Jul 23, 2009 at 9:32 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
>>
>>> And you don't need any more config to run local MR jobs on HBase. But
>>> you do need Hadoop when running MR jobs on HBase on a cluster.
>>>
>>> Also, your code is running fine, as you could see; the real question is
>>> where the stdout goes in local mode. When you ran your other
>>> MR jobs, it was on a working Hadoop setup, right? So you were looking
>>> at the logs in the web UI? One simple thing is to do your
>>> debugging with a logger, so you are sure to see your output, as I
>>> already proposed. Another simple thing is to get a pseudo-distributed
>>> setup and run your HBase MR jobs with Hadoop and get your logs like I'm
>>> sure you did before.
>>>
>>> J-D
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>>> >> >> >> > 'attempt_local_0001_m_000000_0' done.
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last
>>> merge-pass,
>>> >> >> with 1
>>> >> >> >> > segments left of total size: 333 bytes
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
>>> >> >> >> Task:attempt_local_0001_r_000000_0
>>> >> >> >> > is done. And is in the process of commiting
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
>>> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>>> >> >> >> > 'attempt_local_0001_r_000000_0' done.
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete:
>>> >> job_local_0001
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes
>>> read=38949
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes
>>> >> written=78378
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input
>>> groups=8
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output
>>> >> records=0
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input
>>> records=8
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output
>>> >> records=8
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output
>>> bytes=315
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input
>>> >> records=0
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output
>>> records=8
>>> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input
>>> records=8
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
>>> >> >> >> > bharat_v@students.iiit.ac.in> wrote:
>>> >> >> >> >
>>> >> >> >> >> since i haven;t started the cluster .. i can even see the
>>> details
>>> >> in
>>> >> >> >> >> "localhost:<port>/jobTracker.jsp" ..  i didn't even add
>>> anything
>>> >> to
>>> >> >> >> >> hadoop/conf/hadoop-site.xml
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
>>> >> >> >> >> bharat_v@students.iiit.ac.in> wrote:
>>> >> >> >> >>
>>> >> >> >> >>> Hi all ,
>>> >> >> >> >>>
>>> >> >> >> >>> I wanted to run HBase in standalone mode to check my Hbase MR
>>> >> >> programs
>>> >> >> >> ...
>>> >> >> >> >>> I have dl a built version of hbase-0.20. and i have hadoop
>>> 0.19.3
>>> >> >> >> >>>
>>> >> >> >> >>> "I have set JAVA_HOME in both of them" .. then i started
>>> hbase
>>> >> and
>>> >> >> >> >>> inserted some tables using JAVA API .. Now i have written
>>> some MR
>>> >> >> >> programs
>>> >> >> >> >>> onHBase and when i run them on Hbase it runs perfectly
>>> without
>>> >> any
>>> >> >> >> errors
>>> >> >> >> >>> and all the Map -reduce statistics are displayed correctly
>>> but  i
>>> >> >> get
>>> >> >> >> no
>>> >> >> >> >>> output .
>>> >> >> >> >>>
>>> >> >> >> >>> I have one doubt now .. how do HBase recognize hadoop in
>>> stand
>>> >> alone
>>> >> >> >> >>> mode(i haven;t started my hadoop even) .. Even simple print
>>> >> >> statements
>>> >> >> >> donot
>>> >> >> >> >>> work .. no output is displayed on the screen ... I doubt my
>>> >> config
>>> >> >> ....
>>> >> >> >> >>>
>>> >> >> >> >>> Do i need to add some config to run them ... Please reply ...
>>> >> >> >> >>>
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >
>>> >> >> >>
>>> >> >> >
>>> >> >>
>>> >> >
>>> >>
>>> >
>>>
>>
>>
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by bharath vissapragada <bh...@gmail.com>.
I have set "c.setOutputFormat(NullOutputFormat.class);" because otherwise it
shows the error
"Output directory not set in JobConf."

I think this is causing trouble ... any idea?
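
For reference, NullOutputFormat is a legitimate sink for a map-only job: the
"Output directory not set in JobConf" error goes away because NullOutputFormat
discards output instead of writing to a directory, so it is not the cause of
the missing output. A hedged sketch of the map-only variant of the
createSubmittableJob method pasted earlier in this thread (it reuses the
thread's own MR_DS_Scan_Case1 class and IdentityTableMap.initJob call;
setNumReduceTasks(0) is the standard Hadoop way to skip the reduce phase):

```java
public JobConf createSubmittableJob(String[] args) throws IOException {
    JobConf c = new JobConf(getConf(), MR_DS_Scan_Case1.class);
    // Table name and column list come from the command line, as in the
    // original paste.
    IdentityTableMap.initJob(args[0], args[1], MR_DS_Scan_Case1.class, c);
    c.setNumReduceTasks(0);                    // map-only: no reduce phase at all
    c.setOutputFormat(NullOutputFormat.class); // discard output, no directory needed
    return c;
}
```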



On Thu, Jul 23, 2009 at 10:12 PM, bharath vissapragada <
bharathvissapragada1990@gmail.com> wrote:

> I have tried apache -commons logging ...
>
> instead of printing the row ... i have written log.error(row) ...
> even then i got the same output as follows ...
>
> 09/07/24 03:41:38 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> processName=JobTracker, sessionId=
> 09/07/24 03:41:38 WARN mapred.JobClient: No job jar file set.  User classes
> may not be found. See JobConf(Class) or JobConf#setJar(String).
> 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
> 0->localhost.localdomain:,
> 09/07/24 03:41:39 INFO mapred.JobClient: Running job: job_local_0001
> 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
> 0->localhost.localdomain:,
> 09/07/24 03:41:40 INFO mapred.MapTask: numReduceTasks: 1
> 09/07/24 03:41:40 INFO mapred.MapTask: io.sort.mb = 100
> 09/07/24 03:41:40 INFO mapred.MapTask: data buffer = 79691776/99614720
> 09/07/24 03:41:40 INFO mapred.MapTask: record buffer = 262144/327680
> 09/07/24 03:41:40 INFO mapred.MapTask: Starting flush of map output
> 09/07/24 03:41:40 INFO mapred.MapTask: Finished spill 0
> 09/07/24 03:41:40 INFO mapred.TaskRunner:
> Task:attempt_local_0001_m_000000_0 is done. And is in the process of
> commiting
> 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
> 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
> 'attempt_local_0001_m_000000_0' done.
> 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
> 09/07/24 03:41:40 INFO mapred.Merger: Merging 1 sorted segments
> 09/07/24 03:41:40 INFO mapred.Merger: Down to the last merge-pass, with 1
> segments left of total size: 333 bytes
> 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
> 09/07/24 03:41:40 INFO mapred.TaskRunner:
> Task:attempt_local_0001_r_000000_0 is done. And is in the process of
> commiting
> 09/07/24 03:41:40 INFO mapred.LocalJobRunner: reduce > reduce
> 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
> 'attempt_local_0001_r_000000_0' done.
> 09/07/24 03:41:40 INFO mapred.JobClient: Job complete: job_local_0001
> 09/07/24 03:41:40 INFO mapred.JobClient: Counters: 11
> 09/07/24 03:41:40 INFO mapred.JobClient:   File Systems
> 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes read=38933
> 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes written=78346
> 09/07/24 03:41:40 INFO mapred.JobClient:   Map-Reduce Framework
> 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input groups=8
> 09/07/24 03:41:40 INFO mapred.JobClient:     Combine output records=0
> 09/07/24 03:41:40 INFO mapred.JobClient:     Map input records=8
> 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce output records=8
> 09/07/24 03:41:40 INFO mapred.JobClient:     Map output bytes=315
> 09/07/24 03:41:40 INFO mapred.JobClient:     Map input bytes=0
> 09/07/24 03:41:40 INFO mapred.JobClient:     Combine input records=0
> 09/07/24 03:41:40 INFO mapred.JobClient:     Map output records=8
> 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input records=8
>
>
>
> On Thu, Jul 23, 2009 at 9:32 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> And you don't need any more config to run local MR jobs on HBase. But
>> you do need Hadoop when running MR jobs on HBase on a cluster.
>>
>> Also your code is running fine as you could see, the real question is
>> where is the stdout going when in local mode. When you ran your other
>> MR jobs, it was on a working Hadoop setup right? So you were looking
>> at the logs in the web UI? One simple thing to do is to do your
>> debugging with a logger so you are sure to see your output as I
>> already proposed. Another simple thing is to get a pseudo-distributed
>> setup and run you HBase MR jobs with Hadoop and get your logs like I'm
>> sure you did before.
>>
>> J-D
>>
>> On Thu, Jul 23, 2009 at 11:54 AM, bharath
>> vissapragada<bh...@gmail.com> wrote:
>> > I am really thankful to you J-D for replying me inspite of ur busy
>> schedule.
>> > I am still in a learning stage and there are no good guides on HBase
>> other
>> > than Its own one .. So please spare me and I really appreciate ur help .
>> >
>> > Now i got ur point that there is no need of hadoop while running Hbase
>> MR
>> > programs .... But iam confused abt the config . I have only set the
>> > JAVA_HOME in the "hbase-env.sh" and other than that i didn't do anything
>> ..
>> > so i wonder if my conf was wrong or some error in that simple code ...
>> > because stdout worked for me while writing mapreduce programs ...
>> >
>> > Thanks once again!
>> >
>> > On Thu, Jul 23, 2009 at 9:14 PM, Jean-Daniel Cryans <
>> jdcryans@apache.org>wrote:
>> >
>> >> The code itself is very simple, I was referring to your own
>> >> description of your situation. You say you use standalone HBase yet
>> >> you talk about Hadoop configuration. You also talk about the
>> >> JobTracker web UI which is in no use since you run local jobs directly
>> >> on HBase.
>> >>
>> >> J-D
>> >>
>> >> On Thu, Jul 23, 2009 at 11:41 AM, bharath
>> >> vissapragada<bh...@gmail.com> wrote:
>> >> > I used stdout for debugging while writing codes in hadoop MR programs
>> and
>> >> it
>> >> > worked fine ...
>> >> > Can you please tell me wch part of the code u found confusing so that
>> i
>> >> can
>> >> > explain it a bit clearly ...
>> >> >
>> >> >
>> >> > On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans <
>> jdcryans@apache.org
>> >> >wrote:
>> >> >
>> >> >> What you wrote is a bit confusing to me, sorry.
>> >> >>
>> >> >> The usual way to debug MR jobs is to define a logger and post with
>> >> >> either info or debug level, not sysout like you did. I'm not even
>> sure
>> >> >> where the standard output is logged when using a local job. Also
>> since
>> >> >> this is local you won't see anything in your host:50030 web UI. So
>> use
>> >> >> apache common logging and you should see your output.
>> >> >>
>> >> >> J-D
>> >> >>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Good you figured it out!

Sorry if I may have seemed harsh, but what I wanted to say is definitely not
that you are wasting my time. It was the "urgency" of your situation that
bugged me. What I mean is that it's ok to ask questions (almost any) and I
like answering, but when a user starts asking things as if he were my boss
or something, then there's something wrong.

Happy hbase'ing!

J-D
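
The bug this thread converged on (mapp instead of map) is easy to reproduce
without HBase: a method whose name or parameter types do not exactly match
the parent's declares a new method rather than an override, so the framework
keeps calling the parent's identity implementation and the subclass body is
never run. Annotating with @Override turns the typo into a compile-time
error. A minimal sketch with hypothetical class names (no HBase dependency):

```java
// Parent plays the role of IdentityTableMap: it defines the map() hook
// that the framework calls.
class ParentMap {
    String map(String row) { return "parent:" + row; }
}

class TypoMap extends ParentMap {
    // Typo: 'mapp' is a brand-new method, NOT an override. Adding @Override
    // here would be a compile error, catching the bug immediately.
    String mapp(String row) { return "child:" + row; }
}

class FixedMap extends ParentMap {
    @Override  // signature matches exactly, so this really overrides map()
    String map(String row) { return "child:" + row; }
}

public class OverrideDemo {
    public static void main(String[] args) {
        // The framework only ever calls map(...), never mapp(...):
        System.out.println(new TypoMap().map("r1"));   // parent:r1  (the bug)
        System.out.println(new FixedMap().map("r1"));  // child:r1   (the fix)
    }
}
```

The related "same erasure" compile error from later in the thread comes from
the other direction: two map(...) methods whose parameter lists differ only
in generic arguments (OutputCollector<Text, Text> vs.
OutputCollector<ImmutableBytesWritable, RowResult>) collide after type
erasure, so the subclass must use the parent's exact parameter types.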

On Thu, Jul 23, 2009 at 1:44 PM, bharath
vissapragada<bh...@gmail.com> wrote:
> I figured out my error and it is in the OutputCollector ... I really thank
> you J-D for replying me constantly .. and i am very very sorry for wasting
> your time so much ...
>
> Thanks for ur help
>
> On Thu, Jul 23, 2009 at 10:58 PM, bharath vissapragada <
> bharathvissapragada1990@gmail.com> wrote:
>
>> but now it is strangely saying that
>>
>>  method does not override or implement a method from a supertype
>>     @Override
>>     ^
>> previously it had pointed that  "org.apache.hadoop.hbase.mapred.IdentityTableMap have the same erasure"
>>  :(
>>
>>
>> On Thu, Jul 23, 2009 at 10:41 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>>
>>> Ok that explains. The problem you have is that you extended
>>> IdentityTableMap but tried to override it with the wrong method name
>>> so it was never called. Instead it was the parent's map that was
>>> called.
>>>
>>> The error it's now giving you is pretty much self-explanatory and is
>>> not related to Hadoop or HBase, you must override the map method and
>>> this is done with @override.
>>>
>>> You should also take at look at this doc to learn how to build your
>>> jobs
>>> http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapred/package-summary.html
>>>
>>> J-D
>>>
>>> On Thu, Jul 23, 2009 at 1:04 PM, bharath
>>> vissapragada<bh...@gmail.com> wrote:
>>> > I think this is the problem .. but when i changed it .. it gave me a
>>> weird
>>> > error
>>> >
>>> >  name clash:
>>> >
>>> map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>,org.apache.hadoop.mapred.Reporter)
>>> > in MR_DS_Scan_Case1 and
>>> >
>>> map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult>,org.apache.hadoop.mapred.Reporter)
>>> > in org.apache.hadoop.hbase.mapred.IdentityTableMap have the same
>>> erasure,
>>> > yet neither overrides the other
>>> >
>>> > I must override the map function in the IdentityTableMap ... but other
>>> > libraries also seem to have map function ..
>>> > so what must i do ..
>>> >
>>> > On Thu, Jul 23, 2009 at 10:26 PM, Jean-Daniel Cryans <
>>> jdcryans@apache.org>wrote:
>>> >
>>> >> Think I found your problem, is this a typo?
>>> >>
>>> >>  public void mapp(ImmutableBytesWritable row, RowResult value,
>>> >> OutputCollector<Text, Text> output, Reporter reporter) throws
>>> IOException {
>>> >>
>>> >> It should read map, not mapp
>>> >>
>>> >> J-D
>>> >>
>>> >> On Thu, Jul 23, 2009 at 12:42 PM, bharath
>>> >> vissapragada<bh...@gmail.com> wrote:
>>> >> > I have tried apache -commons logging ...
>>> >> >
>>> >> > instead of printing the row ... i have written log.error(row) ...
>>> >> > even then i got the same output as follows ...
>>> >> >
>>> >> > 09/07/24 03:41:38 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>>> >> > processName=JobTracker, sessionId=
>>> >> > 09/07/24 03:41:38 WARN mapred.JobClient: No job jar file set.  User
>>> >> classes
>>> >> > may not be found. See JobConf(Class) or JobConf#setJar(String).
>>> >> > 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
>>> >> > 0->localhost.localdomain:,
>>> >> > 09/07/24 03:41:39 INFO mapred.JobClient: Running job: job_local_0001
>>> >> > 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
>>> >> > 0->localhost.localdomain:,
>>> >> > 09/07/24 03:41:40 INFO mapred.MapTask: numReduceTasks: 1
>>> >> > 09/07/24 03:41:40 INFO mapred.MapTask: io.sort.mb = 100
>>> >> > 09/07/24 03:41:40 INFO mapred.MapTask: data buffer =
>>> 79691776/99614720
>>> >> > 09/07/24 03:41:40 INFO mapred.MapTask: record buffer = 262144/327680
>>> >> > 09/07/24 03:41:40 INFO mapred.MapTask: Starting flush of map output
>>> >> > 09/07/24 03:41:40 INFO mapred.MapTask: Finished spill 0
>>> >> > 09/07/24 03:41:40 INFO mapred.TaskRunner:
>>> >> Task:attempt_local_0001_m_000000_0
>>> >> > is done. And is in the process of commiting
>>> >> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>>> >> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
>>> >> > 'attempt_local_0001_m_000000_0' done.
>>> >> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>>> >> > 09/07/24 03:41:40 INFO mapred.Merger: Merging 1 sorted segments
>>> >> > 09/07/24 03:41:40 INFO mapred.Merger: Down to the last merge-pass,
>>> with 1
>>> >> > segments left of total size: 333 bytes
>>> >> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>>> >> > 09/07/24 03:41:40 INFO mapred.TaskRunner:
>>> >> Task:attempt_local_0001_r_000000_0
>>> >> > is done. And is in the process of commiting
>>> >> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner: reduce > reduce
>>> >> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
>>> >> > 'attempt_local_0001_r_000000_0' done.
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Job complete: job_local_0001
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient: Counters: 11
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:   File Systems
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes read=38933
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes
>>> written=78346
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:   Map-Reduce Framework
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input groups=8
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:     Combine output records=0
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map input records=8
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce output records=8
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map output bytes=315
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map input bytes=0
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:     Combine input records=0
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map output records=8
>>> >> > 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input records=8
>>> >> >
>>> >> >
>>> >> > On Thu, Jul 23, 2009 at 9:32 PM, Jean-Daniel Cryans <
>>> jdcryans@apache.org
>>> >> >wrote:
>>> >> >
>>> >> >> And you don't need any more config to run local MR jobs on HBase.
>>> But
>>> >> >> you do need Hadoop when running MR jobs on HBase on a cluster.
>>> >> >>
>>> >> >> Also your code is running fine as you could see, the real question
>>> is
>>> >> >> where is the stdout going when in local mode. When you ran your
>>> other
>>> >> >> MR jobs, it was on a working Hadoop setup right? So you were looking
>>> >> >> at the logs in the web UI? One simple thing to do is to do your
>>> >> >> debugging with a logger so you are sure to see your output as I
>>> >> >> already proposed. Another simple thing is to get a
>>> pseudo-distributed
>>> >> >> setup and run you HBase MR jobs with Hadoop and get your logs like
>>> I'm
>>> >> >> sure you did before.
>>> >> >>
>>> >> >> J-D
>>> >> >>
>>> >> >> On Thu, Jul 23, 2009 at 11:54 AM, bharath
>>> >> >> vissapragada<bh...@gmail.com> wrote:
>>> >> >> > I am really thankful to you J-D for replying me inspite of ur busy
>>> >> >> schedule.
>>> >> >> > I am still in a learning stage and there are no good guides on
>>> HBase
>>> >> >> other
>>> >> >> > than Its own one .. So please spare me and I really appreciate ur
>>> help
>>> >> .
>>> >> >> >
>>> >> >> > Now i got ur point that there is no need of hadoop while running
>>> Hbase
>>> >> MR
>>> >> >> > programs .... But iam confused abt the config . I have only set
>>> the
>>> >> >> > JAVA_HOME in the "hbase-env.sh" and other than that i didn't do
>>> >> anything
>>> >> >> ..
>>> >> >> > so i wonder if my conf was wrong or some error in that simple code
>>> ...
>>> >> >> > because stdout worked for me while writing mapreduce programs ...
>>> >> >> >
>>> >> >> > Thanks once again!
>>> >> >> >
>>> >> >> > On Thu, Jul 23, 2009 at 9:14 PM, Jean-Daniel Cryans <
>>> >> jdcryans@apache.org
>>> >> >> >wrote:
>>> >> >> >
>>> >> >> >> The code itself is very simple, I was referring to your own
>>> >> >> >> description of your situation. You say you use standalone HBase
>>> yet
>>> >> >> >> you talk about Hadoop configuration. You also talk about the
>>> >> >> >> JobTracker web UI which is in no use since you run local jobs
>>> >> directly
>>> >> >> >> on HBase.
>>> >> >> >>
>>> >> >> >> J-D
>>> >> >> >>
>>> >> >> >> On Thu, Jul 23, 2009 at 11:41 AM, bharath
>>> >> >> >> vissapragada<bh...@gmail.com> wrote:
>>> >> >> >> > I used stdout for debugging while writing codes in hadoop MR
>>> >> programs
>>> >> >> and
>>> >> >> >> it
>>> >> >> >> > worked fine ...
>>> >> >> >> > Can you please tell me wch part of the code u found confusing
>>> so
>>> >> that
>>> >> >> i
>>> >> >> >> can
>>> >> >> >> > explain it a bit clearly ...
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> > On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans <
>>> >> >> jdcryans@apache.org
>>> >> >> >> >wrote:
>>> >> >> >> >
>>> >> >> >> >> What you wrote is a bit confusing to me, sorry.
>>> >> >> >> >>
>>> >> >> >> >> The usual way to debug MR jobs is to define a logger and post
>>> with
>>> >> >> >> >> either info or debug level, not sysout like you did. I'm not
>>> even
>>> >> >> sure
>>> >> >> >> >> where the standard output is logged when using a local job.
>>> Also
>>> >> >> since
>>> >> >> >> >> this is local you won't see anything in your host:50030 web
>>> UI. So
>>> >> >> use
>>> >> >> >> >> apache common logging and you should see your output.
>>> >> >> >> >>
>>> >> >> >> >> J-D
>>> >> >> >> >>
>>> >> >> >> >> On Thu, Jul 23, 2009 at 11:13 AM, bharath
>>> >> >> >> >> vissapragada<bh...@gmail.com> wrote:
>>> >> >> >> >> > Thanks for ur reply J-D ... Im pasting some part of the code
>>> ...
>>> >> >> >> >> >
>>> >> >> >> >> > Im doing it frm the command line .. Iam pasting some part of
>>> the
>>> >> >> code
>>> >> >> >> >> here
>>> >> >> >> >> > ....
>>> >> >> >> >> >
>>> >> >> >> >> >  public void mapp(ImmutableBytesWritable row, RowResult
>>> value,
>>> >> >> >> >> > OutputCollector<Text, Text> output, Reporter reporter)
>>> throws
>>> >> >> >> IOException
>>> >> >> >> >> {
>>> >> >> >> >> >                System.out.println(row);
>>> >> >> >> >> > }
>>> >> >> >> >> >
>>> >> >> >> >> > public JobConf createSubmittableJob(String[] args) throws
>>> >> >> IOException
>>> >> >> >> {
>>> >> >> >> >> >                JobConf c = new JobConf(getConf(),
>>> >> >> >> >> MR_DS_Scan_Case1.class);
>>> >> >> >> >> >                c.set("col.name", args[1]);
>>> >> >> >> >> >                c.set("operator.name",args[2]);
>>> >> >> >> >> >                c.set("val.name",args[3]);
>>> >> >> >> >> >                IdentityTableMap.initJob(args[0], args[1],
>>> >> >> >> >> this.getClass(),
>>> >> >> >> >> > c);
>>> >> >> >> >> >                c.setOutputFormat(NullOutputFormat.class);
>>> >> >> >> >> >                 return c
>>> >> >> >> >> > }
>>> >> >> >> >> >
>>> >> >> >> >> > As u can see ... im just printing the value of row in the
>>> map ..
>>> >> i
>>> >> >> >> can't
>>> >> >> >> >> see
>>> >> >> >> >> > in the terminal .....
>>> >> >> >> >> > I only wan't the map phase ... so i didn't write any reduce
>>> >> phase
>>> >> >> ..
>>> >> >> >> is
>>> >> >> >> >> my
>>> >> >> >> >> > jobConf correct??
>>> >> >> >> >> >
>>> >> >> >> >> > Also as i have already asked how to check the job logs and
>>> web
>>> >> >> >> interface
>>> >> >> >> >> > like "localhost:<port>/jobTracker.jsp"... since im running
>>> in
>>> >> local
>>> >> >> >> mode
>>> >> >> >> >> ...
>>> >> >> >> >> >

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by bharath vissapragada <bh...@gmail.com>.
I figured out my error: it was in the OutputCollector (its type parameters did not
match the parent method's). Thank you, J-D, for replying so patiently, and I am
sorry for taking up so much of your time.

Thanks for your help
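For readers landing on this thread with the same problem, here is a minimal, self-contained Java sketch of why the fix works. The `Collector`, `BaseMap`, and `CustomMap` classes below are hypothetical stand-ins, not the real Hadoop/HBase API; the point is that an `@Override` only takes effect when every parameter type, including the collector's generic type arguments, matches the parent method exactly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Hadoop's OutputCollector (NOT the real API).
class Collector<K, V> {
    final List<String> out = new ArrayList<>();
    void collect(K k, V v) { out.add(k + "=" + v); }
}

// Stand-in for IdentityTableMap: forwards the pair unchanged.
class BaseMap {
    void map(String row, String value, Collector<String, String> output) {
        output.collect(row, value);
    }
}

class CustomMap extends BaseMap {
    // This override only takes effect because every parameter type,
    // including the Collector's type arguments, matches the parent
    // exactly. Declaring Collector<Integer, Integer> here would not
    // override -- it would trigger a same-erasure name clash instead.
    @Override
    void map(String row, String value, Collector<String, String> output) {
        output.collect(row, "custom:" + value);
    }
}

public class OverrideDemo {
    public static void main(String[] args) {
        Collector<String, String> c = new Collector<>();
        BaseMap m = new CustomMap();   // call through the parent type
        m.map("row1", "v1", c);
        System.out.println(c.out.get(0)); // prints row1=custom:v1
    }
}
```

With a wrong method name (`mapp`) or mismatched collector generics, the call through `BaseMap` would dispatch to the parent's identity behavior, which is exactly why the job "ran fine" but printed nothing custom.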

On Thu, Jul 23, 2009 at 10:58 PM, bharath vissapragada <
bharathvissapragada1990@gmail.com> wrote:

> but now it is strangely saying that
>
>  method does not override or implement a method from a supertype
>     @Override
>     ^
> previously it had pointed that  "org.apache.hadoop.hbase.mapred.IdentityTableMap have the same erasure"
>  :(
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by bharath vissapragada <bh...@gmail.com>.
but now it is strangely saying:

 method does not override or implement a method from a supertype
    @Override
    ^

Previously it had complained that my map and the one in
org.apache.hadoop.hbase.mapred.IdentityTableMap "have the same erasure"
 :(
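For readers puzzled by that message: "have the same erasure" refers to Java type erasure. Generic type arguments exist only at compile time, so two map methods that differ only in the OutputCollector's type parameters collapse to the same raw signature, and the compiler rejects them as a name clash rather than treating one as an override. A tiny self-contained demo of erasure itself (plain collections, no HBase classes):

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        // At runtime both lists share one class: the type arguments were
        // erased during compilation. Likewise, map(..., OutputCollector<Text,
        // Text>, ...) and map(..., OutputCollector<ImmutableBytesWritable,
        // RowResult>, ...) erase to identical raw signatures -- hence
        // "have the same erasure, yet neither overrides the other".
        System.out.println(strings.getClass() == ints.getClass()); // prints true
    }
}
```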


On Thu, Jul 23, 2009 at 10:41 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> Ok that explains. The problem you have is that you extended
> IdentityTableMap but tried to override it with the wrong method name
> so it was never called. Instead it was the parent's map that was
> called.
>
> The error it's now giving you is pretty much self-explanatory and is
> not related to Hadoop or HBase, you must override the map method and
> this is done with @Override.
>
> You should also take a look at this doc to learn how to build your
> jobs
> http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapred/package-summary.html
>
> J-D
>
> On Thu, Jul 23, 2009 at 1:04 PM, bharath
> vissapragada<bh...@gmail.com> wrote:
> > I think this is the problem .. but when i changed it .. it gave me a weird
> > error
> >
> >  name clash:
> > map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>,org.apache.hadoop.mapred.Reporter)
> > in MR_DS_Scan_Case1 and
> > map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult>,org.apache.hadoop.mapred.Reporter)
> > in org.apache.hadoop.hbase.mapred.IdentityTableMap have the same erasure,
> > yet neither overrides the other
> >
> > I must override the map function in the IdentityTableMap ... but other
> > libraries also seem to have map function ..
> > so what must i do ..
> >
> > On Thu, Jul 23, 2009 at 10:26 PM, Jean-Daniel Cryans <
> jdcryans@apache.org>wrote:
> >
> >> Think I found your problem, is this a typo?
> >>
> >>  public void mapp(ImmutableBytesWritable row, RowResult value,
> >> OutputCollector<Text, Text> output, Reporter reporter) throws
> IOException {
> >>
> >> It should read map, not mapp
> >>
> >> J-D
> So
> >> >> use
> >> >> >> >> apache common logging and you should see your output.
> >> >> >> >>
> >> >> >> >> J-D
> >> >> >> >>
> >> >> >> >> On Thu, Jul 23, 2009 at 11:13 AM, bharath
> >> >> >> >> vissapragada<bh...@gmail.com> wrote:
> >> >> >> >> > Thanks for ur reply J-D ... Im pasting some part of the code
> ...
> >> >> >> >> >
> >> >> >> >> > Im doing it frm the command line .. Iam pasting some part of
> the
> >> >> code
> >> >> >> >> here
> >> >> >> >> > ....
> >> >> >> >> >
> >> >> >> >> >  public void mapp(ImmutableBytesWritable row, RowResult
> value,
> >> >> >> >> > OutputCollector<Text, Text> output, Reporter reporter) throws
> >> >> >> IOException
> >> >> >> >> {
> >> >> >> >> >                System.out.println(row);
> >> >> >> >> > }
> >> >> >> >> >
> >> >> >> >> > public JobConf createSubmittableJob(String[] args) throws
> >> >> IOException
> >> >> >> {
> >> >> >> >> >                JobConf c = new JobConf(getConf(),
> >> >> >> >> MR_DS_Scan_Case1.class);
> >> >> >> >> >                c.set("col.name", args[1]);
> >> >> >> >> >                c.set("operator.name",args[2]);
> >> >> >> >> >                c.set("val.name",args[3]);
> >> >> >> >> >                IdentityTableMap.initJob(args[0], args[1],
> >> >> >> >> this.getClass(),
> >> >> >> >> > c);
> >> >> >> >> >                c.setOutputFormat(NullOutputFormat.class);
> >> >> >> >> >                 return c;
> >> >> >> >> > }
> >> >> >> >> >
> >> >> >> >> > As u can see ... im just printing the value of row in the map
> ..
> >> i
> >> >> >> can't
> >> >> >> >> see
> >> >> >> >> > in the terminal .....
> >> >> >> >> > I only wan't the map phase ... so i didn't write any reduce
> >> phase
> >> >> ..
> >> >> >> is
> >> >> >> >> my
> >> >> >> >> > jobConf correct??
> >> >> >> >> >
> >> >> >> >> > Also as i have already asked how to check the job logs and
> web
> >> >> >> interface
> >> >> >> >> > like "localhost:<port>/jobTracker.jsp"... since im running in
> >> local
> >> >> >> mode
> >> >> >> >> ...
> >> >> >> >> >
> >> >> >> >> > On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <
> >> >> >> jdcryans@apache.org
> >> >> >> >> >wrote:
> >> >> >> >> >
> >> >> >> >> >> What output do you need exactly? I see that you have 8
> output
> >> >> records
> >> >> >> >> >> in your reduce task so if you take a look in your output
> folder
> >> or
> >> >> >> >> >> table (I don't know which sink you used) you should see
> them.
> >> >> >> >> >>
> >> >> >> >> >> Also did you run your MR inside Eclipse or in command line?
> >> >> >> >> >>
> >> >> >> >> >> Thx,
> >> >> >> >> >>
> >> >> >> >> >> J-D
> >> >> >> >> >>
> >> >> >> >> >> On Thu, Jul 23, 2009 at 8:30 AM, bharath
> >> >> >> >> >> vissapragada<bh...@students.iiit.ac.in> wrote:
> >> >> >> >> >> > This is the output i go t.. seems everything is fine ..but
> no
> >> >> >> output!!
> >> >> >> >> >> >
> >> >> >> >> >> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM
> >> Metrics
> >> >> >> with
> >> >> >> >> >> > processName=JobTracker, sessionId=
> >> >> >> >> >> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file
> set.
> >> >>  User
> >> >> >> >> >> classes
> >> >> >> >> >> > may not be found. See JobConf(Class) or
> >> JobConf#setJar(String).
> >> >> >> >> >> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
> >> >> >> >> >> > 0->localhost.localdomain:,
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job:
> >> >> >> job_local_0001
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
> >> >> >> >> >> > 0->localhost.localdomain:,
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer =
> >> >> >> 79691776/99614720
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer =
> >> >> >> 262144/327680
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of
> map
> >> >> output
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
> >> >> >> >> >> Task:attempt_local_0001_m_000000_0
> >> >> >> >> >> > is done. And is in the process of commiting
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> >> >> >> >> >> > 'attempt_local_0001_m_000000_0' done.
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted
> >> segments
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last
> >> >> merge-pass,
> >> >> >> >> with 1
> >> >> >> >> >> > segments left of total size: 333 bytes
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
> >> >> >> >> >> Task:attempt_local_0001_r_000000_0
> >> >> >> >> >> > is done. And is in the process of commiting
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce >
> reduce
> >> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> >> >> >> >> >> > 'attempt_local_0001_r_000000_0' done.
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete:
> >> >> >> job_local_0001
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes
> >> >> read=38949
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes
> >> >> >> written=78378
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce
> >> Framework
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input
> >> >> groups=8
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine
> output
> >> >> >> records=0
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input
> >> records=8
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output
> >> >> >> records=8
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output
> >> >> bytes=315
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input
> >> bytes=0
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input
> >> >> >> records=0
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output
> >> >> records=8
> >> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input
> >> >> records=8
> >> >> >> >> >> >
> >> >> >> >> >> >
> >> >> >> >> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
> >> >> >> >> >> > bharat_v@students.iiit.ac.in> wrote:
> >> >> >> >> >> >
> >> >> >> >> >> >> since i haven;t started the cluster .. i can even see the
> >> >> details
> >> >> >> in
> >> >> >> >> >> >> "localhost:<port>/jobTracker.jsp" ..  i didn't even add
> >> >> anything
> >> >> >> to
> >> >> >> >> >> >> hadoop/conf/hadoop-site.xml
> >> >> >> >> >> >>
> >> >> >> >> >> >>
> >> >> >> >> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
> >> >> >> >> >> >> bharat_v@students.iiit.ac.in> wrote:
> >> >> >> >> >> >>
> >> >> >> >> >> >>> Hi all ,
> >> >> >> >> >> >>>
> >> >> >> >> >> >>> I wanted to run HBase in standalone mode to check my
> Hbase
> >> MR
> >> >> >> >> programs
> >> >> >> >> >> ...
> >> >> >> >> >> >>> I have dl a built version of hbase-0.20. and i have
> hadoop
> >> >> 0.19.3
> >> >> >> >> >> >>>
> >> >> >> >> >> >>> "I have set JAVA_HOME in both of them" .. then i started
> >> hbase
> >> >> >> and
> >> >> >> >> >> >>> inserted some tables using JAVA API .. Now i have
> written
> >> some
> >> >> MR
> >> >> >> >> >> programs
> >> >> >> >> >> >>> onHBase and when i run them on Hbase it runs perfectly
> >> without
> >> >> >> any
> >> >> >> >> >> errors
> >> >> >> >> >> >>> and all the Map -reduce statistics are displayed
> correctly
> >> but
> >> >>  i
> >> >> >> >> get
> >> >> >> >> >> no
> >> >> >> >> >> >>> output .
> >> >> >> >> >> >>>
> >> >> >> >> >> >>> I have one doubt now .. how do HBase recognize hadoop in
> >> stand
> >> >> >> alone
> >> >> >> >> >> >>> mode(i haven;t started my hadoop even) .. Even simple
> print
> >> >> >> >> statements
> >> >> >> >> >> donot
> >> >> >> >> >> >>> work .. no output is displayed on the screen ... I doubt
> my
> >> >> >> config
> >> >> >> >> ....
> >> >> >> >> >> >>>
> >> >> >> >> >> >>> Do i need to add some config to run them ... Please
> reply
> >> ...
> >> >> >> >> >> >>>
> >> >> >> >> >> >>
> >> >> >> >> >> >>
> >> >> >> >> >> >
> >> >> >> >> >>
> >> >> >> >> >
> >> >> >> >>
> >> >> >> >
> >> >> >>
> >> >> >
> >> >>
> >> >
> >>
> >
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Ok, that explains it. The problem is that you extended
IdentityTableMap but tried to override its map method under the wrong
name, so your method was never called; the parent's map ran instead.

The error it's now giving you is pretty much self-explanatory and is
not related to Hadoop or HBase: you must override the map method, and
marking it with the @Override annotation lets the compiler check that
you actually did.

You should also take a look at this doc to learn how to build your
jobs: http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapred/package-summary.html
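To make the pitfall concrete, here is a minimal, HBase-free Java sketch (class names invented for illustration) of why a misspelled method name silently falls through to the parent, and how @Override catches it at compile time:

```java
// Plain-Java sketch of the bug: "mapp" declares a NEW method instead of
// overriding the parent's "map", so the parent implementation still runs.
class ParentMap {
    public String map(String row) { return "parent:" + row; }
}

class BuggyMap extends ParentMap {
    // Typo: this never overrides ParentMap#map.
    public String mapp(String row) { return "buggy:" + row; }
}

class FixedMap extends ParentMap {
    @Override  // the compiler rejects this annotation if the signature is wrong
    public String map(String row) { return "fixed:" + row; }
}

public class OverrideDemo {
    public static void main(String[] args) {
        ParentMap buggy = new BuggyMap();
        ParentMap fixed = new FixedMap();
        System.out.println(buggy.map("r1")); // parent:r1 -- the typo bug
        System.out.println(fixed.map("r1")); // fixed:r1
    }
}
```

The same pattern applies to the code in this thread: only a method whose name and signature exactly match the parent's map will be invoked by the framework.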

J-D

On Thu, Jul 23, 2009 at 1:04 PM, bharath
vissapragada<bh...@gmail.com> wrote:
> I think this is the problem .. but when i changed it .. it gave me a weird
> error
>
>  name clash:
> map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>,org.apache.hadoop.mapred.Reporter)
> in MR_DS_Scan_Case1 and
> map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult>,org.apache.hadoop.mapred.Reporter)
> in org.apache.hadoop.hbase.mapred.IdentityTableMap have the same erasure,
> yet neither overrides the other
>
> I must override the map function in the IdentityTableMap ... but other
> libraries also seem to have map function ..
> so what must i do ..
>
> On Thu, Jul 23, 2009 at 10:26 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> Think I found your problem, is this a typo?
>>
>>  public void mapp(ImmutableBytesWritable row, RowResult value,
>> OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
>>
>> It should read map, not mapp
>>
>> J-D
>>
>> On Thu, Jul 23, 2009 at 12:42 PM, bharath
>> vissapragada<bh...@gmail.com> wrote:
>> > I have tried apache -commons logging ...
>> >
>> > instead of printing the row ... i have written log.error(row) ...
>> > even then i got the same output as follows ...
>> >
>> > 09/07/24 03:41:38 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>> > processName=JobTracker, sessionId=
>> > 09/07/24 03:41:38 WARN mapred.JobClient: No job jar file set.  User
>> classes
>> > may not be found. See JobConf(Class) or JobConf#setJar(String).
>> > 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
>> > 0->localhost.localdomain:,
>> > 09/07/24 03:41:39 INFO mapred.JobClient: Running job: job_local_0001
>> > 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
>> > 0->localhost.localdomain:,
>> > 09/07/24 03:41:40 INFO mapred.MapTask: numReduceTasks: 1
>> > 09/07/24 03:41:40 INFO mapred.MapTask: io.sort.mb = 100
>> > 09/07/24 03:41:40 INFO mapred.MapTask: data buffer = 79691776/99614720
>> > 09/07/24 03:41:40 INFO mapred.MapTask: record buffer = 262144/327680
>> > 09/07/24 03:41:40 INFO mapred.MapTask: Starting flush of map output
>> > 09/07/24 03:41:40 INFO mapred.MapTask: Finished spill 0
>> > 09/07/24 03:41:40 INFO mapred.TaskRunner:
>> Task:attempt_local_0001_m_000000_0
>> > is done. And is in the process of commiting
>> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
>> > 'attempt_local_0001_m_000000_0' done.
>> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> > 09/07/24 03:41:40 INFO mapred.Merger: Merging 1 sorted segments
>> > 09/07/24 03:41:40 INFO mapred.Merger: Down to the last merge-pass, with 1
>> > segments left of total size: 333 bytes
>> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
>> > 09/07/24 03:41:40 INFO mapred.TaskRunner:
>> Task:attempt_local_0001_r_000000_0
>> > is done. And is in the process of commiting
>> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner: reduce > reduce
>> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
>> > 'attempt_local_0001_r_000000_0' done.
>> > 09/07/24 03:41:40 INFO mapred.JobClient: Job complete: job_local_0001
>> > 09/07/24 03:41:40 INFO mapred.JobClient: Counters: 11
>> > 09/07/24 03:41:40 INFO mapred.JobClient:   File Systems
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes read=38933
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes written=78346
>> > 09/07/24 03:41:40 INFO mapred.JobClient:   Map-Reduce Framework
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input groups=8
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Combine output records=0
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map input records=8
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce output records=8
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map output bytes=315
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map input bytes=0
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Combine input records=0
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Map output records=8
>> > 09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input records=8
>> >
>> >
>> > On Thu, Jul 23, 2009 at 9:32 PM, Jean-Daniel Cryans <jdcryans@apache.org
>> >wrote:
>> >
>> >> And you don't need any more config to run local MR jobs on HBase. But
>> >> you do need Hadoop when running MR jobs on HBase on a cluster.
>> >>
>> >> Also your code is running fine as you could see, the real question is
>> >> where is the stdout going when in local mode. When you ran your other
>> >> MR jobs, it was on a working Hadoop setup right? So you were looking
>> >> at the logs in the web UI? One simple thing to do is to do your
>> >> debugging with a logger so you are sure to see your output as I
>> >> already proposed. Another simple thing is to get a pseudo-distributed
>> >> setup and run you HBase MR jobs with Hadoop and get your logs like I'm
>> >> sure you did before.
>> >>
>> >> J-D
>> >>
>> >> On Thu, Jul 23, 2009 at 11:54 AM, bharath
>> >> vissapragada<bh...@gmail.com> wrote:
>> >> > I am really thankful to you J-D for replying me inspite of ur busy
>> >> schedule.
>> >> > I am still in a learning stage and there are no good guides on HBase
>> >> other
>> >> > than Its own one .. So please spare me and I really appreciate ur help
>> .
>> >> >
>> >> > Now i got ur point that there is no need of hadoop while running Hbase
>> MR
>> >> > programs .... But iam confused abt the config . I have only set the
>> >> > JAVA_HOME in the "hbase-env.sh" and other than that i didn't do
>> anything
>> >> ..
>> >> > so i wonder if my conf was wrong or some error in that simple code ...
>> >> > because stdout worked for me while writing mapreduce programs ...
>> >> >
>> >> > Thanks once again!
>> >> >
>> >> > On Thu, Jul 23, 2009 at 9:14 PM, Jean-Daniel Cryans <
>> jdcryans@apache.org
>> >> >wrote:
>> >> >
>> >> >> The code itself is very simple, I was referring to your own
>> >> >> description of your situation. You say you use standalone HBase yet
>> >> >> you talk about Hadoop configuration. You also talk about the
>> >> >> JobTracker web UI which is in no use since you run local jobs
>> directly
>> >> >> on HBase.
>> >> >>
>> >> >> J-D
>> >> >>
>> >> >> On Thu, Jul 23, 2009 at 11:41 AM, bharath
>> >> >> vissapragada<bh...@gmail.com> wrote:
>> >> >> > I used stdout for debugging while writing codes in hadoop MR
>> programs
>> >> and
>> >> >> it
>> >> >> > worked fine ...
>> >> >> > Can you please tell me wch part of the code u found confusing so
>> that
>> >> i
>> >> >> can
>> >> >> > explain it a bit clearly ...
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans <
>> >> jdcryans@apache.org
>> >> >> >wrote:
>> >> >> >
>> >> >> >> What you wrote is a bit confusing to me, sorry.
>> >> >> >>
>> >> >> >> The usual way to debug MR jobs is to define a logger and post with
>> >> >> >> either info or debug level, not sysout like you did. I'm not even
>> >> sure
>> >> >> >> where the standard output is logged when using a local job. Also
>> >> since
>> >> >> >> this is local you won't see anything in your host:50030 web UI. So
>> >> use
>> >> >> >> apache common logging and you should see your output.
>> >> >> >>
>> >> >> >> J-D
>> >> >> >>
>> >> >> >> On Thu, Jul 23, 2009 at 11:13 AM, bharath
>> >> >> >> vissapragada<bh...@gmail.com> wrote:
>> >> >> >> > Thanks for ur reply J-D ... Im pasting some part of the code ...
>> >> >> >> >
>> >> >> >> > Im doing it frm the command line .. Iam pasting some part of the
>> >> code
>> >> >> >> here
>> >> >> >> > ....
>> >> >> >> >
>> >> >> >> >  public void mapp(ImmutableBytesWritable row, RowResult value,
>> >> >> >> > OutputCollector<Text, Text> output, Reporter reporter) throws
>> >> >> IOException
>> >> >> >> {
>> >> >> >> >                System.out.println(row);
>> >> >> >> > }
>> >> >> >> >
>> >> >> >> > public JobConf createSubmittableJob(String[] args) throws
>> >> IOException
>> >> >> {
>> >> >> >> >                JobConf c = new JobConf(getConf(),
>> >> >> >> MR_DS_Scan_Case1.class);
>> >> >> >> >                c.set("col.name", args[1]);
>> >> >> >> >                c.set("operator.name",args[2]);
>> >> >> >> >                c.set("val.name",args[3]);
>> >> >> >> >                IdentityTableMap.initJob(args[0], args[1],
>> >> >> >> this.getClass(),
>> >> >> >> > c);
>> >> >> >> >                c.setOutputFormat(NullOutputFormat.class);
>> >> >> >> >                 return c;
>> >> >> >> > }
>> >> >> >> >
>> >> >> >> > As u can see ... im just printing the value of row in the map ..
>> i
>> >> >> can't
>> >> >> >> see
>> >> >> >> > in the terminal .....
>> >> >> >> > I only wan't the map phase ... so i didn't write any reduce
>> phase
>> >> ..
>> >> >> is
>> >> >> >> my
>> >> >> >> > jobConf correct??
>> >> >> >> >
>> >> >> >> > Also as i have already asked how to check the job logs and web
>> >> >> interface
>> >> >> >> > like "localhost:<port>/jobTracker.jsp"... since im running in
>> local
>> >> >> mode
>> >> >> >> ...
>> >> >> >> >
>> >> >> >> > On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <
>> >> >> jdcryans@apache.org
>> >> >> >> >wrote:
>> >> >> >> >
>> >> >> >> >> What output do you need exactly? I see that you have 8 output
>> >> records
>> >> >> >> >> in your reduce task so if you take a look in your output folder
>> or
>> >> >> >> >> table (I don't know which sink you used) you should see them.
>> >> >> >> >>
>> >> >> >> >> Also did you run your MR inside Eclipse or in command line?
>> >> >> >> >>
>> >> >> >> >> Thx,
>> >> >> >> >>
>> >> >> >> >> J-D
>> >> >> >> >>
>> >> >> >> >> On Thu, Jul 23, 2009 at 8:30 AM, bharath
>> >> >> >> >> vissapragada<bh...@students.iiit.ac.in> wrote:
>> >> >> >> >> > This is the output i go t.. seems everything is fine ..but no
>> >> >> output!!
>> >> >> >> >> >
>> >> >> >> >> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM
>> Metrics
>> >> >> with
>> >> >> >> >> > processName=JobTracker, sessionId=
>> >> >> >> >> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.
>> >>  User
>> >> >> >> >> classes
>> >> >> >> >> > may not be found. See JobConf(Class) or
>> JobConf#setJar(String).
>> >> >> >> >> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
>> >> >> >> >> > 0->localhost.localdomain:,
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job:
>> >> >> job_local_0001
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
>> >> >> >> >> > 0->localhost.localdomain:,
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer =
>> >> >> 79691776/99614720
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer =
>> >> >> 262144/327680
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map
>> >> output
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
>> >> >> >> >> Task:attempt_local_0001_m_000000_0
>> >> >> >> >> > is done. And is in the process of commiting
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>> >> >> >> >> > 'attempt_local_0001_m_000000_0' done.
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted
>> segments
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last
>> >> merge-pass,
>> >> >> >> with 1
>> >> >> >> >> > segments left of total size: 333 bytes
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
>> >> >> >> >> Task:attempt_local_0001_r_000000_0
>> >> >> >> >> > is done. And is in the process of commiting
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
>> >> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>> >> >> >> >> > 'attempt_local_0001_r_000000_0' done.
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete:
>> >> >> job_local_0001
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes
>> >> read=38949
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes
>> >> >> written=78378
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce
>> Framework
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input
>> >> groups=8
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output
>> >> >> records=0
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input
>> records=8
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output
>> >> >> records=8
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output
>> >> bytes=315
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input
>> bytes=0
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input
>> >> >> records=0
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output
>> >> records=8
>> >> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input
>> >> records=8
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
>> >> >> >> >> > bharat_v@students.iiit.ac.in> wrote:
>> >> >> >> >> >
>> >> >> >> >> >> since i haven;t started the cluster .. i can even see the
>> >> details
>> >> >> in
>> >> >> >> >> >> "localhost:<port>/jobTracker.jsp" ..  i didn't even add
>> >> anything
>> >> >> to
>> >> >> >> >> >> hadoop/conf/hadoop-site.xml
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
>> >> >> >> >> >> bharat_v@students.iiit.ac.in> wrote:
>> >> >> >> >> >>
>> >> >> >> >> >>> Hi all ,
>> >> >> >> >> >>>
>> >> >> >> >> >>> I wanted to run HBase in standalone mode to check my Hbase
>> MR
>> >> >> >> programs
>> >> >> >> >> ...
>> >> >> >> >> >>> I have dl a built version of hbase-0.20. and i have hadoop
>> >> 0.19.3
>> >> >> >> >> >>>
>> >> >> >> >> >>> "I have set JAVA_HOME in both of them" .. then i started
>> hbase
>> >> >> and
>> >> >> >> >> >>> inserted some tables using JAVA API .. Now i have written
>> some
>> >> MR
>> >> >> >> >> programs
>> >> >> >> >> >>> onHBase and when i run them on Hbase it runs perfectly
>> without
>> >> >> any
>> >> >> >> >> errors
>> >> >> >> >> >>> and all the Map -reduce statistics are displayed correctly
>> but
>> >>  i
>> >> >> >> get
>> >> >> >> >> no
>> >> >> >> >> >>> output .
>> >> >> >> >> >>>
>> >> >> >> >> >>> I have one doubt now .. how do HBase recognize hadoop in
>> stand
>> >> >> alone
>> >> >> >> >> >>> mode(i haven;t started my hadoop even) .. Even simple print
>> >> >> >> statements
>> >> >> >> >> donot
>> >> >> >> >> >>> work .. no output is displayed on the screen ... I doubt my
>> >> >> config
>> >> >> >> ....
>> >> >> >> >> >>>
>> >> >> >> >> >>> Do i need to add some config to run them ... Please reply
>> ...
>> >> >> >> >> >>>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >
>> >> >> >> >>
>> >> >> >> >
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by bharath vissapragada <bh...@gmail.com>.
I think this is the problem, but when I changed it, it gave me a weird
error:

 name clash:
map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>,org.apache.hadoop.mapred.Reporter)
in MR_DS_Scan_Case1 and
map(org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult,org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.hbase.io.ImmutableBytesWritable,org.apache.hadoop.hbase.io.RowResult>,org.apache.hadoop.mapred.Reporter)
in org.apache.hadoop.hbase.mapred.IdentityTableMap have the same erasure,
yet neither overrides the other

I must override the map function in IdentityTableMap, but the parent
class also defines a map function with the same erasure.
So what must I do?
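A hedged, HBase-free sketch of what the compiler is complaining about (names invented for illustration): after type erasure, two map variants that differ only in their generic parameters collapse to one raw signature, so Java forbids the pair. The subclass must declare the same generic output types as its parent, or extend a base parameterized with the types it actually wants to emit:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a generic mapper base class (like TableMap<K, V>).
class MapBase<K> {
    public void map(String row, List<K> out) { /* default: emit nothing */ }
}

// Stand-in for IdentityTableMap: the base fixed to one output type.
class IdentityMap extends MapBase<Integer> { }

// Extending IdentityMap while declaring map(String, List<String>) would
// NOT compile: after erasure both methods are map(String, List), yet
// neither overrides the other -- the "same erasure" error above. The fix
// is to extend the generic base with the output type you need:
class TextMap extends MapBase<String> {
    @Override
    public void map(String row, List<String> out) {
        out.add("mapped:" + row);
    }
}

public class ErasureDemo {
    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        new TextMap().map("r1", out);
        System.out.println(out); // [mapped:r1]
    }
}
```

Applied to the thread's code, that suggests not subclassing IdentityTableMap when the mapper must emit Text/Text, since the quoted error shows its map is fixed to OutputCollector&lt;ImmutableBytesWritable, RowResult&gt;; instead, write a mapper typed with the output classes you need.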

On Thu, Jul 23, 2009 at 10:26 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> Think I found your problem, is this a typo?
>
>  public void mapp(ImmutableBytesWritable row, RowResult value,
> OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
>
> I should read map not mapp
>
> J-D
>
> On Thu, Jul 23, 2009 at 12:42 PM, bharath
> vissapragada<bh...@gmail.com> wrote:
> > I have tried apache -commons logging ...
> >
> > instead of printing the row ... i have written log.error(row) ...
> > even then i got the same output as follows ...
> >
> > 09/07/24 03:41:38 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> > processName=JobTracker, sessionId=
> > 09/07/24 03:41:38 WARN mapred.JobClient: No job jar file set.  User
> classes
> > may not be found. See JobConf(Class) or JobConf#setJar(String).
> > 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
> > 0->localhost.localdomain:,
> > 09/07/24 03:41:39 INFO mapred.JobClient: Running job: job_local_0001
> > 09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
> > 0->localhost.localdomain:,
> > 09/07/24 03:41:40 INFO mapred.MapTask: numReduceTasks: 1
> > 09/07/24 03:41:40 INFO mapred.MapTask: io.sort.mb = 100
> > 09/07/24 03:41:40 INFO mapred.MapTask: data buffer = 79691776/99614720
> > 09/07/24 03:41:40 INFO mapred.MapTask: record buffer = 262144/327680
> > 09/07/24 03:41:40 INFO mapred.MapTask: Starting flush of map output
> > 09/07/24 03:41:40 INFO mapred.MapTask: Finished spill 0
> > 09/07/24 03:41:40 INFO mapred.TaskRunner:
> Task:attempt_local_0001_m_000000_0
> > is done. And is in the process of commiting
> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:
> > 09/07/24 03:41:40 INFO mapred.TaskRunner: Task
> > 'attempt_local_0001_m_000000_0' done.
> > 09/07/24 03:41:40 INFO mapred.LocalJobRunner:

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by Jean-Daniel Cryans <jd...@apache.org>.
I think I found your problem: is this a typo?

 public void mapp(ImmutableBytesWritable row, RowResult value,
OutputCollector<Text, Text> output, Reporter reporter) throws IOException {

It should read map, not mapp.

J-D
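
The typo matters because Java overriding goes by name and signature: a method called mapp never overrides map, so the framework keeps calling the inherited identity map and the println is never reached. A self-contained sketch of the pitfall (the class names below are illustrative stand-ins, not the real HBase API):

```java
// A stand-in for a framework base class with a default (identity) map method.
class IdentityMapper {
    String map(String row) {
        return row; // default behaviour: pass the row through, print nothing
    }
}

// Typo: "mapp" is a brand-new method, not an override, so a caller that
// invokes map() through the base type never reaches it.
class BrokenMapper extends IdentityMapper {
    String mapp(String row) {
        return "printed:" + row;
    }
}

// Correct: same name and signature as the base method. The @Override
// annotation makes the compiler reject accidental typos like "mapp".
class FixedMapper extends IdentityMapper {
    @Override
    String map(String row) {
        return "printed:" + row;
    }
}

public class OverrideDemo {
    public static void main(String[] args) {
        System.out.println(new BrokenMapper().map("row1")); // row1
        System.out.println(new FixedMapper().map("row1"));  // printed:row1
    }
}
```

Annotating the overriding method with @Override would have caught this at compile time.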


Re: Hbase and Hadoop Config to run in Standalone mode

Posted by bharath vissapragada <bh...@gmail.com>.
I have tried Apache Commons Logging ...

Instead of printing the row, I have written log.error(row) ...
Even then I got the same output, as follows ...

09/07/24 03:41:38 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId=
09/07/24 03:41:38 WARN mapred.JobClient: No job jar file set.  User classes
may not be found. See JobConf(Class) or JobConf#setJar(String).
09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
0->localhost.localdomain:,
09/07/24 03:41:39 INFO mapred.JobClient: Running job: job_local_0001
09/07/24 03:41:39 INFO mapred.TableInputFormatBase: split:
0->localhost.localdomain:,
09/07/24 03:41:40 INFO mapred.MapTask: numReduceTasks: 1
09/07/24 03:41:40 INFO mapred.MapTask: io.sort.mb = 100
09/07/24 03:41:40 INFO mapred.MapTask: data buffer = 79691776/99614720
09/07/24 03:41:40 INFO mapred.MapTask: record buffer = 262144/327680
09/07/24 03:41:40 INFO mapred.MapTask: Starting flush of map output
09/07/24 03:41:40 INFO mapred.MapTask: Finished spill 0
09/07/24 03:41:40 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0
is done. And is in the process of commiting
09/07/24 03:41:40 INFO mapred.LocalJobRunner:
09/07/24 03:41:40 INFO mapred.TaskRunner: Task
'attempt_local_0001_m_000000_0' done.
09/07/24 03:41:40 INFO mapred.LocalJobRunner:
09/07/24 03:41:40 INFO mapred.Merger: Merging 1 sorted segments
09/07/24 03:41:40 INFO mapred.Merger: Down to the last merge-pass, with 1
segments left of total size: 333 bytes
09/07/24 03:41:40 INFO mapred.LocalJobRunner:
09/07/24 03:41:40 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0
is done. And is in the process of commiting
09/07/24 03:41:40 INFO mapred.LocalJobRunner: reduce > reduce
09/07/24 03:41:40 INFO mapred.TaskRunner: Task
'attempt_local_0001_r_000000_0' done.
09/07/24 03:41:40 INFO mapred.JobClient: Job complete: job_local_0001
09/07/24 03:41:40 INFO mapred.JobClient: Counters: 11
09/07/24 03:41:40 INFO mapred.JobClient:   File Systems
09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes read=38933
09/07/24 03:41:40 INFO mapred.JobClient:     Local bytes written=78346
09/07/24 03:41:40 INFO mapred.JobClient:   Map-Reduce Framework
09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input groups=8
09/07/24 03:41:40 INFO mapred.JobClient:     Combine output records=0
09/07/24 03:41:40 INFO mapred.JobClient:     Map input records=8
09/07/24 03:41:40 INFO mapred.JobClient:     Reduce output records=8
09/07/24 03:41:40 INFO mapred.JobClient:     Map output bytes=315
09/07/24 03:41:40 INFO mapred.JobClient:     Map input bytes=0
09/07/24 03:41:40 INFO mapred.JobClient:     Combine input records=0
09/07/24 03:41:40 INFO mapred.JobClient:     Map output records=8
09/07/24 03:41:40 INFO mapred.JobClient:     Reduce input records=8
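
For reference, wiring a logger into the map method looks roughly like the sketch below. It uses the JDK's built-in java.util.logging rather than Commons Logging so that it runs with no extra jars, and processRow is a made-up stand-in for the real map body:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogDemo {
    // One static logger per class is the usual pattern.
    private static final Logger LOG = Logger.getLogger(LogDemo.class.getName());

    // Stand-in for a map method body: log the incoming row, then transform it.
    static String processRow(String row) {
        // INFO and above go to the console (stderr) by default.
        LOG.log(Level.INFO, "processing row {0}", row);
        return row.toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(processRow("row1")); // ROW1
    }
}
```

Note that no amount of logging helps if the method is never invoked at all, which is exactly what the mapp/map typo discussed in this thread causes.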


On Thu, Jul 23, 2009 at 9:32 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> And you don't need any more config to run local MR jobs on HBase. But
> you do need Hadoop when running MR jobs on HBase on a cluster.
>
> Also your code is running fine as you could see, the real question is
> where is the stdout going when in local mode. When you ran your other
> MR jobs, it was on a working Hadoop setup right? So you were looking
> at the logs in the web UI? One simple thing to do is to do your
> debugging with a logger so you are sure to see your output as I
> already proposed. Another simple thing is to get a pseudo-distributed
> setup and run you HBase MR jobs with Hadoop and get your logs like I'm
> sure you did before.
>
> J-D
>
> On Thu, Jul 23, 2009 at 11:54 AM, bharath
> vissapragada<bh...@gmail.com> wrote:
> > I am really thankful to you J-D for replying me inspite of ur busy
> schedule.
> > I am still in a learning stage and there are no good guides on HBase
> other
> > than Its own one .. So please spare me and I really appreciate ur help .
> >
> > Now i got ur point that there is no need of hadoop while running Hbase MR
> > programs .... But iam confused abt the config . I have only set the
> > JAVA_HOME in the "hbase-env.sh" and other than that i didn't do anything
> ..
> > so i wonder if my conf was wrong or some error in that simple code ...
> > because stdout worked for me while writing mapreduce programs ...
> >
> > Thanks once again!
> >
> > On Thu, Jul 23, 2009 at 9:14 PM, Jean-Daniel Cryans <jdcryans@apache.org
> >wrote:
> >
> >> The code itself is very simple, I was referring to your own
> >> description of your situation. You say you use standalone HBase yet
> >> you talk about Hadoop configuration. You also talk about the
> >> JobTracker web UI which is in no use since you run local jobs directly
> >> on HBase.
> >>
> >> J-D
> >>
> >> On Thu, Jul 23, 2009 at 11:41 AM, bharath
> >> vissapragada<bh...@gmail.com> wrote:
> >> > I used stdout for debugging while writing codes in hadoop MR programs
> and
> >> it
> >> > worked fine ...
> >> > Can you please tell me wch part of the code u found confusing so that
> i
> >> can
> >> > explain it a bit clearly ...
> >> >
> >> >
> >> > On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans <
> jdcryans@apache.org
> >> >wrote:
> >> >
> >> >> What you wrote is a bit confusing to me, sorry.
> >> >>
> >> >> The usual way to debug MR jobs is to define a logger and post with
> >> >> either info or debug level, not sysout like you did. I'm not even
> sure
> >> >> where the standard output is logged when using a local job. Also
> since
> >> >> this is local you won't see anything in your host:50030 web UI. So
> use
> >> >> apache common logging and you should see your output.
> >> >>
> >> >> J-D
> >> >>
> >> >> On Thu, Jul 23, 2009 at 11:13 AM, bharath
> >> >> vissapragada<bh...@gmail.com> wrote:
> >> >> > Thanks for ur reply J-D ... Im pasting some part of the code ...
> >> >> >
> >> >> > Im doing it frm the command line .. Iam pasting some part of the
> code
> >> >> here
> >> >> > ....
> >> >> >
> >> >> >  public void mapp(ImmutableBytesWritable row, RowResult value,
> >> >> > OutputCollector<Text, Text> output, Reporter reporter) throws
> >> IOException
> >> >> {
> >> >> >                System.out.println(row);
> >> >> > }
> >> >> >
> >> >> > public JobConf createSubmittableJob(String[] args) throws
> IOException
> >> {
> >> >> >                JobConf c = new JobConf(getConf(),
> >> >> MR_DS_Scan_Case1.class);
> >> >> >                c.set("col.name", args[1]);
> >> >> >                c.set("operator.name",args[2]);
> >> >> >                c.set("val.name",args[3]);
> >> >> >                IdentityTableMap.initJob(args[0], args[1],
> >> >> this.getClass(),
> >> >> > c);
> >> >> >                c.setOutputFormat(NullOutputFormat.class);
> >> >> >                 return c
> >> >> > }
> >> >> >
> >> >> > As u can see ... im just printing the value of row in the map .. i
> >> can't
> >> >> see
> >> >> > in the terminal .....
> >> >> > I only wan't the map phase ... so i didn't write any reduce phase
> ..
> >> is
> >> >> my
> >> >> > jobConf correct??
> >> >> >
> >> >> > Also as i have already asked how to check the job logs and web
> >> interface
> >> >> > like "localhost:<port>/jobTracker.jsp"... since im running in local
> >> mode
> >> >> ...
> >> >> >
> >> >> > On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <
> >> jdcryans@apache.org
> >> >> >wrote:
> >> >> >
> >> >> >> What output do you need exactly? I see that you have 8 output
> records
> >> >> >> in your reduce task so if you take a look in your output folder or
> >> >> >> table (I don't know which sink you used) you should see them.
> >> >> >>
> >> >> >> Also did you run your MR inside Eclipse or in command line?
> >> >> >>
> >> >> >> Thx,
> >> >> >>
> >> >> >> J-D
> >> >> >>
> >> >> >> On Thu, Jul 23, 2009 at 8:30 AM, bharath
> >> >> >> vissapragada<bh...@students.iiit.ac.in> wrote:
> >> >> >> > This is the output i go t.. seems everything is fine ..but no
> >> output!!
> >> >> >> >
> >> >> >> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics
> >> with
> >> >> >> > processName=JobTracker, sessionId=
> >> >> >> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.
>  User
> >> >> >> classes
> >> >> >> > may not be found. See JobConf(Class) or JobConf#setJar(String).
> >> >> >> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
> >> >> >> > 0->localhost.localdomain:,
> >> >> >> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job:
> >> job_local_0001
> >> >> >> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
> >> >> >> > 0->localhost.localdomain:,
> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer =
> >> 79691776/99614720
> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer =
> >> 262144/327680
> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map
> output
> >> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
> >> >> >> Task:attempt_local_0001_m_000000_0
> >> >> >> > is done. And is in the process of commiting
> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> >> >> >> > 'attempt_local_0001_m_000000_0' done.
> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
> >> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last
> merge-pass,
> >> >> with 1
> >> >> >> > segments left of total size: 333 bytes
> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
> >> >> >> Task:attempt_local_0001_r_000000_0
> >> >> >> > is done. And is in the process of commiting
> >> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
> >> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> >> >> >> > 'attempt_local_0001_r_000000_0' done.
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete:
> >> job_local_0001
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes
> read=38949
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes
> >> written=78378
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input
> groups=8
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output
> >> records=0
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output
> >> records=8
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output
> bytes=315
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input
> >> records=0
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output
> records=8
> >> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input
> records=8
> >> >> >> >
> >> >> >> >
> >> >> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
> >> >> >> > bharat_v@students.iiit.ac.in> wrote:
> >> >> >> >
> >> >> >> >> since i haven;t started the cluster .. i can even see the
> details
> >> in
> >> >> >> >> "localhost:<port>/jobTracker.jsp" ..  i didn't even add
> anything
> >> to
> >> >> >> >> hadoop/conf/hadoop-site.xml
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
> >> >> >> >> bharat_v@students.iiit.ac.in> wrote:
> >> >> >> >>
> >> >> >> >>> Hi all ,
> >> >> >> >>>
> >> >> >> >>> I wanted to run HBase in standalone mode to check my Hbase MR
> >> >> programs
> >> >> >> ...
> >> >> >> >>> I have dl a built version of hbase-0.20. and i have hadoop
> 0.19.3
> >> >> >> >>>
> >> >> >> >>> "I have set JAVA_HOME in both of them" .. then i started hbase
> >> and
> >> >> >> >>> inserted some tables using JAVA API .. Now i have written some
> MR
> >> >> >> programs
> >> >> >> >>> onHBase and when i run them on Hbase it runs perfectly without
> >> any
> >> >> >> errors
> >> >> >> >>> and all the Map -reduce statistics are displayed correctly but
>  i
> >> >> get
> >> >> >> no
> >> >> >> >>> output .
> >> >> >> >>>
> >> >> >> >>> I have one doubt now .. how do HBase recognize hadoop in stand
> >> alone
> >> >> >> >>> mode(i haven;t started my hadoop even) .. Even simple print
> >> >> statements
> >> >> >> donot
> >> >> >> >>> work .. no output is displayed on the screen ... I doubt my
> >> config
> >> >> ....
> >> >> >> >>>
> >> >> >> >>> Do i need to add some config to run them ... Please reply ...
> >> >> >> >>>
> >> >> >> >>
> >> >> >> >>
> >> >> >> >
> >> >> >>
> >> >> >
> >> >>
> >> >
> >>
> >
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by Jean-Daniel Cryans <jd...@apache.org>.
And you don't need any more config to run local MR jobs on HBase. But
you do need Hadoop when running MR jobs on HBase on a cluster.
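To make that concrete, a fully standalone setup only needs JAVA_HOME set in conf/hbase-env.sh; the sketch below is illustrative (the JDK path is an example, not a required value):

```shell
# conf/hbase-env.sh -- the one setting standalone HBase requires.
# Point this at your own JDK install; the path below is just an example.
export JAVA_HOME=/usr/lib/jvm/java-6-sun

# With that in place, start the single standalone process and try it out:
bin/start-hbase.sh
bin/hbase shell
```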

Also, your code is running fine, as you could see; the real question is
where stdout goes in local mode. When you ran your other MR jobs, it
was on a working Hadoop setup, right? So you were looking at the logs
in the web UI. One simple thing to do is to debug with a logger, as I
already proposed, so you are sure to see your output. Another is to get
a pseudo-distributed setup and run your HBase MR jobs with Hadoop,
then get your logs like I'm sure you did before.
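As a rough sketch of the logger approach (untested, and assuming the Hadoop 0.19 / HBase 0.20 classes already used in this thread; the class name and log message are made up). Note also that the framework only calls a mapper method named map, matching the Mapper interface, so a method named mapp would never be invoked:

```java
import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.io.RowResult;
import org.apache.hadoop.hbase.mapred.TableMap;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical mapper that logs each row key instead of printing to stdout.
public class LoggingScanMap extends MapReduceBase
    implements TableMap<Text, Text> {

  // Commons Logging routes through the configured log4j appenders, so
  // the output lands in the log files (or the console) even in local mode.
  private static final Log LOG = LogFactory.getLog(LoggingScanMap.class);

  public void map(ImmutableBytesWritable row, RowResult value,
      OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    LOG.info("map saw row: " + row);
  }
}
```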

J-D

On Thu, Jul 23, 2009 at 11:54 AM, bharath
vissapragada<bh...@gmail.com> wrote:
> I am really thankful to you J-D for replying me inspite of ur busy schedule.
> I am still in a learning stage and there are no good guides on HBase other
> than Its own one .. So please spare me and I really appreciate ur help .
>
> Now i got ur point that there is no need of hadoop while running Hbase MR
> programs .... But iam confused abt the config . I have only set the
> JAVA_HOME in the "hbase-env.sh" and other than that i didn't do anything ..
> so i wonder if my conf was wrong or some error in that simple code ...
> because stdout worked for me while writing mapreduce programs ...
>
> Thanks once again!
>
> On Thu, Jul 23, 2009 at 9:14 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> The code itself is very simple, I was referring to your own
>> description of your situation. You say you use standalone HBase yet
>> you talk about Hadoop configuration. You also talk about the
>> JobTracker web UI which is in no use since you run local jobs directly
>> on HBase.
>>
>> J-D
>>
>> On Thu, Jul 23, 2009 at 11:41 AM, bharath
>> vissapragada<bh...@gmail.com> wrote:
>> > I used stdout for debugging while writing codes in hadoop MR programs and
>> it
>> > worked fine ...
>> > Can you please tell me wch part of the code u found confusing so that i
>> can
>> > explain it a bit clearly ...
>> >
>> >
>> > On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans <jdcryans@apache.org
>> >wrote:
>> >
>> >> What you wrote is a bit confusing to me, sorry.
>> >>
>> >> The usual way to debug MR jobs is to define a logger and post with
>> >> either info or debug level, not sysout like you did. I'm not even sure
>> >> where the standard output is logged when using a local job. Also since
>> >> this is local you won't see anything in your host:50030 web UI. So use
>> >> apache common logging and you should see your output.
>> >>
>> >> J-D
>> >>
>> >> On Thu, Jul 23, 2009 at 11:13 AM, bharath
>> >> vissapragada<bh...@gmail.com> wrote:
>> >> > Thanks for ur reply J-D ... Im pasting some part of the code ...
>> >> >
>> >> > Im doing it frm the command line .. Iam pasting some part of the code
>> >> here
>> >> > ....
>> >> >
>> >> >  public void mapp(ImmutableBytesWritable row, RowResult value,
>> >> > OutputCollector<Text, Text> output, Reporter reporter) throws
>> IOException
>> >> {
>> >> >                System.out.println(row);
>> >> > }
>> >> >
>> >> > public JobConf createSubmittableJob(String[] args) throws IOException
>> {
>> >> >                JobConf c = new JobConf(getConf(),
>> >> MR_DS_Scan_Case1.class);
>> >> >                c.set("col.name", args[1]);
>> >> >                c.set("operator.name",args[2]);
>> >> >                c.set("val.name",args[3]);
>> >> >                IdentityTableMap.initJob(args[0], args[1],
>> >> this.getClass(),
>> >> > c);
>> >> >                c.setOutputFormat(NullOutputFormat.class);
>> >> >                 return c;
>> >> > }
>> >> >
>> >> > As u can see ... im just printing the value of row in the map .. i
>> can't
>> >> see
>> >> > in the terminal .....
>> >> > I only wan't the map phase ... so i didn't write any reduce phase ..
>> is
>> >> my
>> >> > jobConf correct??
>> >> >
>> >> > Also as i have already asked how to check the job logs and web
>> interface
>> >> > like "localhost:<port>/jobTracker.jsp"... since im running in local
>> mode
>> >> ...
>> >> >
>> >> > On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <
>> jdcryans@apache.org
>> >> >wrote:
>> >> >
>> >> >> What output do you need exactly? I see that you have 8 output records
>> >> >> in your reduce task so if you take a look in your output folder or
>> >> >> table (I don't know which sink you used) you should see them.
>> >> >>
>> >> >> Also did you run your MR inside Eclipse or in command line?
>> >> >>
>> >> >> Thx,
>> >> >>
>> >> >> J-D
>> >> >>
>> >> >> On Thu, Jul 23, 2009 at 8:30 AM, bharath
>> >> >> vissapragada<bh...@students.iiit.ac.in> wrote:
>> >> >> > This is the output i go t.. seems everything is fine ..but no
>> output!!
>> >> >> >
>> >> >> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics
>> with
>> >> >> > processName=JobTracker, sessionId=
>> >> >> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.  User
>> >> >> classes
>> >> >> > may not be found. See JobConf(Class) or JobConf#setJar(String).
>> >> >> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
>> >> >> > 0->localhost.localdomain:,
>> >> >> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job:
>> job_local_0001
>> >> >> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
>> >> >> > 0->localhost.localdomain:,
>> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
>> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
>> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer =
>> 79691776/99614720
>> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer =
>> 262144/327680
>> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
>> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
>> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
>> >> >> Task:attempt_local_0001_m_000000_0
>> >> >> > is done. And is in the process of commiting
>> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>> >> >> > 'attempt_local_0001_m_000000_0' done.
>> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
>> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass,
>> >> with 1
>> >> >> > segments left of total size: 333 bytes
>> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
>> >> >> Task:attempt_local_0001_r_000000_0
>> >> >> > is done. And is in the process of commiting
>> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
>> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>> >> >> > 'attempt_local_0001_r_000000_0' done.
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete:
>> job_local_0001
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes
>> written=78378
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output
>> records=0
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output
>> records=8
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input
>> records=0
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
>> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
>> >> >> > bharat_v@students.iiit.ac.in> wrote:
>> >> >> >
>> >> >> >> since i haven;t started the cluster .. i can even see the details
>> in
>> >> >> >> "localhost:<port>/jobTracker.jsp" ..  i didn't even add anything
>> to
>> >> >> >> hadoop/conf/hadoop-site.xml
>> >> >> >>
>> >> >> >>
>> >> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
>> >> >> >> bharat_v@students.iiit.ac.in> wrote:
>> >> >> >>
>> >> >> >>> Hi all ,
>> >> >> >>>
>> >> >> >>> I wanted to run HBase in standalone mode to check my Hbase MR
>> >> programs
>> >> >> ...
>> >> >> >>> I have dl a built version of hbase-0.20. and i have hadoop 0.19.3
>> >> >> >>>
>> >> >> >>> "I have set JAVA_HOME in both of them" .. then i started hbase
>> and
>> >> >> >>> inserted some tables using JAVA API .. Now i have written some MR
>> >> >> programs
>> >> >> >>> onHBase and when i run them on Hbase it runs perfectly without
>> any
>> >> >> errors
>> >> >> >>> and all the Map -reduce statistics are displayed correctly but  i
>> >> get
>> >> >> no
>> >> >> >>> output .
>> >> >> >>>
>> >> >> >>> I have one doubt now .. how do HBase recognize hadoop in stand
>> alone
>> >> >> >>> mode(i haven;t started my hadoop even) .. Even simple print
>> >> statements
>> >> >> donot
>> >> >> >>> work .. no output is displayed on the screen ... I doubt my
>> config
>> >> ....
>> >> >> >>>
>> >> >> >>> Do i need to add some config to run them ... Please reply ...
>> >> >> >>>
>> >> >> >>
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by bharath vissapragada <bh...@gmail.com>.
I am really thankful to you, J-D, for replying in spite of your busy
schedule. I am still at a learning stage and there are no good guides on
HBase other than its own, so please bear with me; I really appreciate
your help.

Now I get your point that there is no need for Hadoop while running
HBase MR programs. But I am confused about the config: I have only set
JAVA_HOME in "hbase-env.sh" and did nothing else, so I wonder whether my
conf is wrong or there is some error in that simple code, because stdout
worked for me when writing MapReduce programs before.

Thanks once again!

On Thu, Jul 23, 2009 at 9:14 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> The code itself is very simple, I was referring to your own
> description of your situation. You say you use standalone HBase yet
> you talk about Hadoop configuration. You also talk about the
> JobTracker web UI which is in no use since you run local jobs directly
> on HBase.
>
> J-D
>
> On Thu, Jul 23, 2009 at 11:41 AM, bharath
> vissapragada<bh...@gmail.com> wrote:
> > I used stdout for debugging while writing codes in hadoop MR programs and
> it
> > worked fine ...
> > Can you please tell me wch part of the code u found confusing so that i
> can
> > explain it a bit clearly ...
> >
> >
> > On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans <jdcryans@apache.org
> >wrote:
> >
> >> What you wrote is a bit confusing to me, sorry.
> >>
> >> The usual way to debug MR jobs is to define a logger and post with
> >> either info or debug level, not sysout like you did. I'm not even sure
> >> where the standard output is logged when using a local job. Also since
> >> this is local you won't see anything in your host:50030 web UI. So use
> >> apache common logging and you should see your output.
> >>
> >> J-D
> >>
> >> On Thu, Jul 23, 2009 at 11:13 AM, bharath
> >> vissapragada<bh...@gmail.com> wrote:
> >> > Thanks for ur reply J-D ... Im pasting some part of the code ...
> >> >
> >> > Im doing it frm the command line .. Iam pasting some part of the code
> >> here
> >> > ....
> >> >
> >> >  public void mapp(ImmutableBytesWritable row, RowResult value,
> >> > OutputCollector<Text, Text> output, Reporter reporter) throws
> IOException
> >> {
> >> >                System.out.println(row);
> >> > }
> >> >
> >> > public JobConf createSubmittableJob(String[] args) throws IOException
> {
> >> >                JobConf c = new JobConf(getConf(),
> >> MR_DS_Scan_Case1.class);
> >> >                c.set("col.name", args[1]);
> >> >                c.set("operator.name",args[2]);
> >> >                c.set("val.name",args[3]);
> >> >                IdentityTableMap.initJob(args[0], args[1],
> >> this.getClass(),
> >> > c);
> >> >                c.setOutputFormat(NullOutputFormat.class);
> >> >                 return c;
> >> > }
> >> >
> >> > As u can see ... im just printing the value of row in the map .. i
> can't
> >> see
> >> > in the terminal .....
> >> > I only wan't the map phase ... so i didn't write any reduce phase ..
> is
> >> my
> >> > jobConf correct??
> >> >
> >> > Also as i have already asked how to check the job logs and web
> interface
> >> > like "localhost:<port>/jobTracker.jsp"... since im running in local
> mode
> >> ...
> >> >
> >> > On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <
> jdcryans@apache.org
> >> >wrote:
> >> >
> >> >> What output do you need exactly? I see that you have 8 output records
> >> >> in your reduce task so if you take a look in your output folder or
> >> >> table (I don't know which sink you used) you should see them.
> >> >>
> >> >> Also did you run your MR inside Eclipse or in command line?
> >> >>
> >> >> Thx,
> >> >>
> >> >> J-D
> >> >>
> >> >> On Thu, Jul 23, 2009 at 8:30 AM, bharath
> >> >> vissapragada<bh...@students.iiit.ac.in> wrote:
> >> >> > This is the output i go t.. seems everything is fine ..but no
> output!!
> >> >> >
> >> >> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics
> with
> >> >> > processName=JobTracker, sessionId=
> >> >> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.  User
> >> >> classes
> >> >> > may not be found. See JobConf(Class) or JobConf#setJar(String).
> >> >> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
> >> >> > 0->localhost.localdomain:,
> >> >> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job:
> job_local_0001
> >> >> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
> >> >> > 0->localhost.localdomain:,
> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer =
> 79691776/99614720
> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer =
> 262144/327680
> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
> >> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
> >> >> Task:attempt_local_0001_m_000000_0
> >> >> > is done. And is in the process of commiting
> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> >> >> > 'attempt_local_0001_m_000000_0' done.
> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
> >> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass,
> >> with 1
> >> >> > segments left of total size: 333 bytes
> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
> >> >> Task:attempt_local_0001_r_000000_0
> >> >> > is done. And is in the process of commiting
> >> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
> >> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> >> >> > 'attempt_local_0001_r_000000_0' done.
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete:
> job_local_0001
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes
> written=78378
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output
> records=0
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output
> records=8
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input
> records=0
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
> >> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8
> >> >> >
> >> >> >
> >> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
> >> >> > bharat_v@students.iiit.ac.in> wrote:
> >> >> >
> >> >> >> since i haven;t started the cluster .. i can even see the details
> in
> >> >> >> "localhost:<port>/jobTracker.jsp" ..  i didn't even add anything
> to
> >> >> >> hadoop/conf/hadoop-site.xml
> >> >> >>
> >> >> >>
> >> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
> >> >> >> bharat_v@students.iiit.ac.in> wrote:
> >> >> >>
> >> >> >>> Hi all ,
> >> >> >>>
> >> >> >>> I wanted to run HBase in standalone mode to check my Hbase MR
> >> programs
> >> >> ...
> >> >> >>> I have dl a built version of hbase-0.20. and i have hadoop 0.19.3
> >> >> >>>
> >> >> >>> "I have set JAVA_HOME in both of them" .. then i started hbase
> and
> >> >> >>> inserted some tables using JAVA API .. Now i have written some MR
> >> >> programs
> >> >> >>> onHBase and when i run them on Hbase it runs perfectly without
> any
> >> >> errors
> >> >> >>> and all the Map -reduce statistics are displayed correctly but  i
> >> get
> >> >> no
> >> >> >>> output .
> >> >> >>>
> >> >> >>> I have one doubt now .. how do HBase recognize hadoop in stand
> alone
> >> >> >>> mode(i haven;t started my hadoop even) .. Even simple print
> >> statements
> >> >> donot
> >> >> >>> work .. no output is displayed on the screen ... I doubt my
> config
> >> ....
> >> >> >>>
> >> >> >>> Do i need to add some config to run them ... Please reply ...
> >> >> >>>
> >> >> >>
> >> >> >>
> >> >> >
> >> >>
> >> >
> >>
> >
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by Jean-Daniel Cryans <jd...@apache.org>.
The code itself is very simple; I was referring to your own
description of your situation. You say you use standalone HBase, yet
you talk about Hadoop configuration. You also talk about the
JobTracker web UI, which is of no use here since you run local jobs
directly on HBase.

J-D

On Thu, Jul 23, 2009 at 11:41 AM, bharath
vissapragada<bh...@gmail.com> wrote:
> I used stdout for debugging while writing codes in hadoop MR programs and it
> worked fine ...
> Can you please tell me wch part of the code u found confusing so that i can
> explain it a bit clearly ...
>
>
> On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> What you wrote is a bit confusing to me, sorry.
>>
>> The usual way to debug MR jobs is to define a logger and post with
>> either info or debug level, not sysout like you did. I'm not even sure
>> where the standard output is logged when using a local job. Also since
>> this is local you won't see anything in your host:50030 web UI. So use
>> apache common logging and you should see your output.
>>
>> J-D
>>
>> On Thu, Jul 23, 2009 at 11:13 AM, bharath
>> vissapragada<bh...@gmail.com> wrote:
>> > Thanks for ur reply J-D ... Im pasting some part of the code ...
>> >
>> > Im doing it frm the command line .. Iam pasting some part of the code
>> here
>> > ....
>> >
>> >  public void mapp(ImmutableBytesWritable row, RowResult value,
>> > OutputCollector<Text, Text> output, Reporter reporter) throws IOException
>> {
>> >                System.out.println(row);
>> > }
>> >
>> > public JobConf createSubmittableJob(String[] args) throws IOException {
>> >                JobConf c = new JobConf(getConf(),
>> MR_DS_Scan_Case1.class);
>> >                c.set("col.name", args[1]);
>> >                c.set("operator.name",args[2]);
>> >                c.set("val.name",args[3]);
>> >                IdentityTableMap.initJob(args[0], args[1],
>> this.getClass(),
>> > c);
>> >                c.setOutputFormat(NullOutputFormat.class);
>> >                 return c;
>> > }
>> >
>> > As u can see ... im just printing the value of row in the map .. i can't
>> see
>> > in the terminal .....
>> > I only wan't the map phase ... so i didn't write any reduce phase .. is
>> my
>> > jobConf correct??
>> >
>> > Also as i have already asked how to check the job logs and web interface
>> > like "localhost:<port>/jobTracker.jsp"... since im running in local mode
>> ...
>> >
>> > On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <jdcryans@apache.org
>> >wrote:
>> >
>> >> What output do you need exactly? I see that you have 8 output records
>> >> in your reduce task so if you take a look in your output folder or
>> >> table (I don't know which sink you used) you should see them.
>> >>
>> >> Also did you run your MR inside Eclipse or in command line?
>> >>
>> >> Thx,
>> >>
>> >> J-D
>> >>
>> >> On Thu, Jul 23, 2009 at 8:30 AM, bharath
>> >> vissapragada<bh...@students.iiit.ac.in> wrote:
>> >> > This is the output i go t.. seems everything is fine ..but no output!!
>> >> >
>> >> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>> >> > processName=JobTracker, sessionId=
>> >> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.  User
>> >> classes
>> >> > may not be found. See JobConf(Class) or JobConf#setJar(String).
>> >> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
>> >> > 0->localhost.localdomain:,
>> >> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job: job_local_0001
>> >> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
>> >> > 0->localhost.localdomain:,
>> >> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
>> >> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
>> >> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer = 79691776/99614720
>> >> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer = 262144/327680
>> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
>> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
>> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
>> >> Task:attempt_local_0001_m_000000_0
>> >> > is done. And is in the process of commiting
>> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>> >> > 'attempt_local_0001_m_000000_0' done.
>> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
>> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass,
>> with 1
>> >> > segments left of total size: 333 bytes
>> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
>> >> Task:attempt_local_0001_r_000000_0
>> >> > is done. And is in the process of commiting
>> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
>> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>> >> > 'attempt_local_0001_r_000000_0' done.
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete: job_local_0001
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes written=78378
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output records=0
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output records=8
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input records=0
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
>> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8
>> >> >
>> >> >
>> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
>> >> > bharat_v@students.iiit.ac.in> wrote:
>> >> >
>> >> >> since i haven;t started the cluster .. i can even see the details in
>> >> >> "localhost:<port>/jobTracker.jsp" ..  i didn't even add anything to
>> >> >> hadoop/conf/hadoop-site.xml
>> >> >>
>> >> >>
>> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
>> >> >> bharat_v@students.iiit.ac.in> wrote:
>> >> >>
>> >> >>> Hi all ,
>> >> >>>
>> >> >>> I wanted to run HBase in standalone mode to check my Hbase MR
>> programs
>> >> ...
>> >> >>> I have dl a built version of hbase-0.20. and i have hadoop 0.19.3
>> >> >>>
>> >> >>> "I have set JAVA_HOME in both of them" .. then i started hbase and
>> >> >>> inserted some tables using JAVA API .. Now i have written some MR
>> >> programs
>> >> >>> onHBase and when i run them on Hbase it runs perfectly without any
>> >> errors
>> >> >>> and all the Map -reduce statistics are displayed correctly but  i
>> get
>> >> no
>> >> >>> output .
>> >> >>>
>> >> >>> I have one doubt now .. how do HBase recognize hadoop in stand alone
>> >> >>> mode(i haven;t started my hadoop even) .. Even simple print
>> statements
>> >> donot
>> >> >>> work .. no output is displayed on the screen ... I doubt my config
>> ....
>> >> >>>
>> >> >>> Do i need to add some config to run them ... Please reply ...
>> >> >>>
>> >> >>
>> >> >>
>> >> >
>> >>
>> >
>>
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by bharath vissapragada <bh...@gmail.com>.
I used stdout for debugging while writing Hadoop MR programs and it
worked fine.
Can you please tell me which part of the code you found confusing, so
that I can explain it more clearly?


On Thu, Jul 23, 2009 at 9:06 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> What you wrote is a bit confusing to me, sorry.
>
> The usual way to debug MR jobs is to define a logger and post with
> either info or debug level, not sysout like you did. I'm not even sure
> where the standard output is logged when using a local job. Also since
> this is local you won't see anything in your host:50030 web UI. So use
> apache common logging and you should see your output.
>
> J-D
>
> On Thu, Jul 23, 2009 at 11:13 AM, bharath
> vissapragada<bh...@gmail.com> wrote:
> > Thanks for ur reply J-D ... Im pasting some part of the code ...
> >
> > Im doing it frm the command line .. Iam pasting some part of the code
> here
> > ....
> >
> >  public void mapp(ImmutableBytesWritable row, RowResult value,
> > OutputCollector<Text, Text> output, Reporter reporter) throws IOException
> {
> >                System.out.println(row);
> > }
> >
> > public JobConf createSubmittableJob(String[] args) throws IOException {
> >                JobConf c = new JobConf(getConf(),
> MR_DS_Scan_Case1.class);
> >                c.set("col.name", args[1]);
> >                c.set("operator.name",args[2]);
> >                c.set("val.name",args[3]);
> >                IdentityTableMap.initJob(args[0], args[1],
> this.getClass(),
> > c);
> >                c.setOutputFormat(NullOutputFormat.class);
> >                 return c
> > }
> >
> > As u can see ... im just printing the value of row in the map .. i can't
> see
> > in the terminal .....
> > I only wan't the map phase ... so i didn't write any reduce phase .. is
> my
> > jobConf correct??
> >
> > Also as i have already asked how to check the job logs and web interface
> > like "localhost:<port>/jobTracker.jsp"... since im running in local mode
> ...
> >
> > On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <jdcryans@apache.org
> >wrote:
> >
> >> What output do you need exactly? I see that you have 8 output records
> >> in your reduce task so if you take a look in your output folder or
> >> table (I don't know which sink you used) you should see them.
> >>
> >> Also did you run your MR inside Eclipse or in command line?
> >>
> >> Thx,
> >>
> >> J-D
> >>
> >> On Thu, Jul 23, 2009 at 8:30 AM, bharath
> >> vissapragada<bh...@students.iiit.ac.in> wrote:
> >> > This is the output i go t.. seems everything is fine ..but no output!!
> >> >
> >> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> >> > processName=JobTracker, sessionId=
> >> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.  User
> >> classes
> >> > may not be found. See JobConf(Class) or JobConf#setJar(String).
> >> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
> >> > 0->localhost.localdomain:,
> >> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job: job_local_0001
> >> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
> >> > 0->localhost.localdomain:,
> >> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
> >> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
> >> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer = 79691776/99614720
> >> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer = 262144/327680
> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
> >> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
> >> Task:attempt_local_0001_m_000000_0
> >> > is done. And is in the process of commiting
> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> >> > 'attempt_local_0001_m_000000_0' done.
> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
> >> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass,
> with 1
> >> > segments left of total size: 333 bytes
> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
> >> Task:attempt_local_0001_r_000000_0
> >> > is done. And is in the process of commiting
> >> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
> >> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> >> > 'attempt_local_0001_r_000000_0' done.
> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete: job_local_0001
> >> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes written=78378
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output records=0
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output records=8
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input records=0
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
> >> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8
> >> >
> >> >
> >> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
> >> > bharat_v@students.iiit.ac.in> wrote:
> >> >
> >> >> since i haven;t started the cluster .. i can even see the details in
> >> >> "localhost:<port>/jobTracker.jsp" ..  i didn't even add anything to
> >> >> hadoop/conf/hadoop-site.xml
> >> >>
> >> >>
> >> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
> >> >> bharat_v@students.iiit.ac.in> wrote:
> >> >>
> >> >>> Hi all ,
> >> >>>
> >> >>> I wanted to run HBase in standalone mode to check my Hbase MR
> programs
> >> ...
> >> >>> I have dl a built version of hbase-0.20. and i have hadoop 0.19.3
> >> >>>
> >> >>> "I have set JAVA_HOME in both of them" .. then i started hbase and
> >> >>> inserted some tables using JAVA API .. Now i have written some MR
> >> programs
> >> >>> onHBase and when i run them on Hbase it runs perfectly without any
> >> errors
> >> >>> and all the Map -reduce statistics are displayed correctly but  i
> get
> >> no
> >> >>> output .
> >> >>>
> >> >>> I have one doubt now .. how do HBase recognize hadoop in stand alone
> >> >>> mode(i haven;t started my hadoop even) .. Even simple print
> statements
> >> donot
> >> >>> work .. no output is displayed on the screen ... I doubt my config
> ....
> >> >>>
> >> >>> Do i need to add some config to run them ... Please reply ...
> >> >>>
> >> >>
> >> >>
> >> >
> >>
> >
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by Jean-Daniel Cryans <jd...@apache.org>.
What you wrote is a bit confusing to me, sorry.

The usual way to debug MR jobs is to define a logger and log at
either info or debug level, not sysout like you did. I'm not even sure
where the standard output goes when running a local job. Also, since
this is local, you won't see anything in your host:50030 web UI. So use
Apache commons-logging and you should see your output.
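A minimal sketch of that pattern (shown here with the JDK's built-in java.util.logging purely so the snippet stays dependency-free; with commons-logging you would obtain the logger via LogFactory.getLog(...) instead -- the class and method names below are illustrative, not from the original post):

```java
import java.util.logging.Logger;

// Sketch of logging from a mapper instead of using System.out.println().
// Uses java.util.logging as a dependency-free stand-in for commons-logging.
public class MapperLoggingSketch {

    private static final Logger LOG =
            Logger.getLogger(MapperLoggingSketch.class.getName());

    // Stand-in for a map() body: log the row key at INFO level so the
    // message lands in the task log rather than an unknown stdout stream.
    static String logRow(String rowKey) {
        String msg = "map saw row: " + rowKey;
        LOG.info(msg);
        return msg;
    }

    public static void main(String[] args) {
        logRow("row-0001");
    }
}
```

With a real cluster the INFO lines show up in the per-task logs; in local mode they are printed to the console by the default handler.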

J-D

On Thu, Jul 23, 2009 at 11:13 AM, bharath
vissapragada<bh...@gmail.com> wrote:
> Thanks for ur reply J-D ... Im pasting some part of the code ...
>
> Im doing it frm the command line .. Iam pasting some part of the code here
> ....
>
>  public void mapp(ImmutableBytesWritable row, RowResult value,
> OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
>                System.out.println(row);
> }
>
> public JobConf createSubmittableJob(String[] args) throws IOException {
>                JobConf c = new JobConf(getConf(), MR_DS_Scan_Case1.class);
>                c.set("col.name", args[1]);
>                c.set("operator.name",args[2]);
>                c.set("val.name",args[3]);
>                IdentityTableMap.initJob(args[0], args[1], this.getClass(),
> c);
>                c.setOutputFormat(NullOutputFormat.class);
>                 return c
> }
>
> As u can see ... im just printing the value of row in the map .. i can't see
> in the terminal .....
> I only wan't the map phase ... so i didn't write any reduce phase .. is my
> jobConf correct??
>
> Also as i have already asked how to check the job logs and web interface
> like "localhost:<port>/jobTracker.jsp"... since im running in local mode ...
>
> On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> What output do you need exactly? I see that you have 8 output records
>> in your reduce task so if you take a look in your output folder or
>> table (I don't know which sink you used) you should see them.
>>
>> Also did you run your MR inside Eclipse or in command line?
>>
>> Thx,
>>
>> J-D
>>
>> On Thu, Jul 23, 2009 at 8:30 AM, bharath
>> vissapragada<bh...@students.iiit.ac.in> wrote:
>> > This is the output i go t.. seems everything is fine ..but no output!!
>> >
>> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>> > processName=JobTracker, sessionId=
>> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.  User
>> classes
>> > may not be found. See JobConf(Class) or JobConf#setJar(String).
>> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
>> > 0->localhost.localdomain:,
>> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job: job_local_0001
>> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
>> > 0->localhost.localdomain:,
>> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
>> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
>> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer = 79691776/99614720
>> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer = 262144/327680
>> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
>> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
>> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
>> Task:attempt_local_0001_m_000000_0
>> > is done. And is in the process of commiting
>> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>> > 'attempt_local_0001_m_000000_0' done.
>> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
>> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass, with 1
>> > segments left of total size: 333 bytes
>> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
>> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
>> Task:attempt_local_0001_r_000000_0
>> > is done. And is in the process of commiting
>> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
>> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
>> > 'attempt_local_0001_r_000000_0' done.
>> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete: job_local_0001
>> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
>> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes written=78378
>> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output records=0
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output records=8
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input records=0
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
>> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8
>> >
>> >
>> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
>> > bharat_v@students.iiit.ac.in> wrote:
>> >
>> >> since i haven;t started the cluster .. i can even see the details in
>> >> "localhost:<port>/jobTracker.jsp" ..  i didn't even add anything to
>> >> hadoop/conf/hadoop-site.xml
>> >>
>> >>
>> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
>> >> bharat_v@students.iiit.ac.in> wrote:
>> >>
>> >>> Hi all ,
>> >>>
>> >>> I wanted to run HBase in standalone mode to check my Hbase MR programs
>> ...
>> >>> I have dl a built version of hbase-0.20. and i have hadoop 0.19.3
>> >>>
>> >>> "I have set JAVA_HOME in both of them" .. then i started hbase and
>> >>> inserted some tables using JAVA API .. Now i have written some MR
>> programs
>> >>> onHBase and when i run them on Hbase it runs perfectly without any
>> errors
>> >>> and all the Map -reduce statistics are displayed correctly but  i get
>> no
>> >>> output .
>> >>>
>> >>> I have one doubt now .. how do HBase recognize hadoop in stand alone
>> >>> mode(i haven;t started my hadoop even) .. Even simple print statements
>> donot
>> >>> work .. no output is displayed on the screen ... I doubt my config ....
>> >>>
>> >>> Do i need to add some config to run them ... Please reply ...
>> >>>
>> >>
>> >>
>> >
>>
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by bharath vissapragada <bh...@gmail.com>.
Thanks for your reply J-D ... I'm running it from the command line. I'm pasting
some part of the code here ....

 public void mapp(ImmutableBytesWritable row, RowResult value,
OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
                System.out.println(row);
}

public JobConf createSubmittableJob(String[] args) throws IOException {
                JobConf c = new JobConf(getConf(), MR_DS_Scan_Case1.class);
                c.set("col.name", args[1]);
                c.set("operator.name",args[2]);
                c.set("val.name",args[3]);
                IdentityTableMap.initJob(args[0], args[1], this.getClass(),
c);
                c.setOutputFormat(NullOutputFormat.class);
                 return c;
}

As you can see ... I'm just printing the value of row in the map, but I can't
see it in the terminal .....
I only want the map phase ... so I didn't write any reduce phase .. is my
JobConf correct??

Also, as I have already asked, how do I check the job logs and the web interface
like "localhost:<port>/jobTracker.jsp" ... since I'm running in local mode ...

On Thu, Jul 23, 2009 at 6:32 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> What output do you need exactly? I see that you have 8 output records
> in your reduce task so if you take a look in your output folder or
> table (I don't know which sink you used) you should see them.
>
> Also did you run your MR inside Eclipse or in command line?
>
> Thx,
>
> J-D
>
> On Thu, Jul 23, 2009 at 8:30 AM, bharath
> vissapragada<bh...@students.iiit.ac.in> wrote:
> > This is the output i go t.. seems everything is fine ..but no output!!
> >
> > 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> > processName=JobTracker, sessionId=
> > 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.  User
> classes
> > may not be found. See JobConf(Class) or JobConf#setJar(String).
> > 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
> > 0->localhost.localdomain:,
> > 09/07/23 23:25:37 INFO mapred.JobClient: Running job: job_local_0001
> > 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
> > 0->localhost.localdomain:,
> > 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
> > 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
> > 09/07/23 23:25:37 INFO mapred.MapTask: data buffer = 79691776/99614720
> > 09/07/23 23:25:37 INFO mapred.MapTask: record buffer = 262144/327680
> > 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
> > 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
> Task:attempt_local_0001_m_000000_0
> > is done. And is in the process of commiting
> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> > 'attempt_local_0001_m_000000_0' done.
> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> > 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
> > 09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass, with 1
> > segments left of total size: 333 bytes
> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> > 09/07/23 23:25:37 INFO mapred.TaskRunner:
> Task:attempt_local_0001_r_000000_0
> > is done. And is in the process of commiting
> > 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
> > 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> > 'attempt_local_0001_r_000000_0' done.
> > 09/07/23 23:25:38 INFO mapred.JobClient: Job complete: job_local_0001
> > 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
> > 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
> > 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes written=78378
> > 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output records=0
> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output records=8
> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
> > 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input records=0
> > 09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
> > 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8
> >
> >
> > On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
> > bharat_v@students.iiit.ac.in> wrote:
> >
> >> since i haven;t started the cluster .. i can even see the details in
> >> "localhost:<port>/jobTracker.jsp" ..  i didn't even add anything to
> >> hadoop/conf/hadoop-site.xml
> >>
> >>
> >> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
> >> bharat_v@students.iiit.ac.in> wrote:
> >>
> >>> Hi all ,
> >>>
> >>> I wanted to run HBase in standalone mode to check my Hbase MR programs
> ...
> >>> I have dl a built version of hbase-0.20. and i have hadoop 0.19.3
> >>>
> >>> "I have set JAVA_HOME in both of them" .. then i started hbase and
> >>> inserted some tables using JAVA API .. Now i have written some MR
> programs
> >>> onHBase and when i run them on Hbase it runs perfectly without any
> errors
> >>> and all the Map -reduce statistics are displayed correctly but  i get
> no
> >>> output .
> >>>
> >>> I have one doubt now .. how do HBase recognize hadoop in stand alone
> >>> mode(i haven;t started my hadoop even) .. Even simple print statements
> donot
> >>> work .. no output is displayed on the screen ... I doubt my config ....
> >>>
> >>> Do i need to add some config to run them ... Please reply ...
> >>>
> >>
> >>
> >
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by Jean-Daniel Cryans <jd...@apache.org>.
What output do you need exactly? I see that you have 8 output records
in your reduce task so if you take a look in your output folder or
table (I don't know which sink you used) you should see them.

Also, did you run your MR inside Eclipse or on the command line?

Thx,

J-D

On Thu, Jul 23, 2009 at 8:30 AM, bharath
vissapragada<bh...@students.iiit.ac.in> wrote:
> This is the output i go t.. seems everything is fine ..but no output!!
>
> 09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> processName=JobTracker, sessionId=
> 09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.  User classes
> may not be found. See JobConf(Class) or JobConf#setJar(String).
> 09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
> 0->localhost.localdomain:,
> 09/07/23 23:25:37 INFO mapred.JobClient: Running job: job_local_0001
> 09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
> 0->localhost.localdomain:,
> 09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
> 09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
> 09/07/23 23:25:37 INFO mapred.MapTask: data buffer = 79691776/99614720
> 09/07/23 23:25:37 INFO mapred.MapTask: record buffer = 262144/327680
> 09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
> 09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
> 09/07/23 23:25:37 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0
> is done. And is in the process of commiting
> 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> 'attempt_local_0001_m_000000_0' done.
> 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> 09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
> 09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass, with 1
> segments left of total size: 333 bytes
> 09/07/23 23:25:37 INFO mapred.LocalJobRunner:
> 09/07/23 23:25:37 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0
> is done. And is in the process of commiting
> 09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
> 09/07/23 23:25:37 INFO mapred.TaskRunner: Task
> 'attempt_local_0001_r_000000_0' done.
> 09/07/23 23:25:38 INFO mapred.JobClient: Job complete: job_local_0001
> 09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
> 09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
> 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
> 09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes written=78378
> 09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
> 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
> 09/07/23 23:25:38 INFO mapred.JobClient:     Combine output records=0
> 09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
> 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output records=8
> 09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
> 09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
> 09/07/23 23:25:38 INFO mapred.JobClient:     Combine input records=0
> 09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
> 09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8
>
>
> On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
> bharat_v@students.iiit.ac.in> wrote:
>
>> since i haven;t started the cluster .. i can even see the details in
>> "localhost:<port>/jobTracker.jsp" ..  i didn't even add anything to
>> hadoop/conf/hadoop-site.xml
>>
>>
>> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
>> bharat_v@students.iiit.ac.in> wrote:
>>
>>> Hi all ,
>>>
>>> I wanted to run HBase in standalone mode to check my Hbase MR programs ...
>>> I have dl a built version of hbase-0.20. and i have hadoop 0.19.3
>>>
>>> "I have set JAVA_HOME in both of them" .. then i started hbase and
>>> inserted some tables using JAVA API .. Now i have written some MR programs
>>> onHBase and when i run them on Hbase it runs perfectly without any errors
>>> and all the Map -reduce statistics are displayed correctly but  i get no
>>> output .
>>>
>>> I have one doubt now .. how do HBase recognize hadoop in stand alone
>>> mode(i haven;t started my hadoop even) .. Even simple print statements donot
>>> work .. no output is displayed on the screen ... I doubt my config ....
>>>
>>> Do i need to add some config to run them ... Please reply ...
>>>
>>
>>
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by bharath vissapragada <bh...@students.iiit.ac.in>.
This is the output I got .. everything seems fine, but no output!!

09/07/23 23:25:36 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId=
09/07/23 23:25:36 WARN mapred.JobClient: No job jar file set.  User classes
may not be found. See JobConf(Class) or JobConf#setJar(String).
09/07/23 23:25:36 INFO mapred.TableInputFormatBase: split:
0->localhost.localdomain:,
09/07/23 23:25:37 INFO mapred.JobClient: Running job: job_local_0001
09/07/23 23:25:37 INFO mapred.TableInputFormatBase: split:
0->localhost.localdomain:,
09/07/23 23:25:37 INFO mapred.MapTask: numReduceTasks: 1
09/07/23 23:25:37 INFO mapred.MapTask: io.sort.mb = 100
09/07/23 23:25:37 INFO mapred.MapTask: data buffer = 79691776/99614720
09/07/23 23:25:37 INFO mapred.MapTask: record buffer = 262144/327680
09/07/23 23:25:37 INFO mapred.MapTask: Starting flush of map output
09/07/23 23:25:37 INFO mapred.MapTask: Finished spill 0
09/07/23 23:25:37 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0
is done. And is in the process of commiting
09/07/23 23:25:37 INFO mapred.LocalJobRunner:
09/07/23 23:25:37 INFO mapred.TaskRunner: Task
'attempt_local_0001_m_000000_0' done.
09/07/23 23:25:37 INFO mapred.LocalJobRunner:
09/07/23 23:25:37 INFO mapred.Merger: Merging 1 sorted segments
09/07/23 23:25:37 INFO mapred.Merger: Down to the last merge-pass, with 1
segments left of total size: 333 bytes
09/07/23 23:25:37 INFO mapred.LocalJobRunner:
09/07/23 23:25:37 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0
is done. And is in the process of commiting
09/07/23 23:25:37 INFO mapred.LocalJobRunner: reduce > reduce
09/07/23 23:25:37 INFO mapred.TaskRunner: Task
'attempt_local_0001_r_000000_0' done.
09/07/23 23:25:38 INFO mapred.JobClient: Job complete: job_local_0001
09/07/23 23:25:38 INFO mapred.JobClient: Counters: 11
09/07/23 23:25:38 INFO mapred.JobClient:   File Systems
09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes read=38949
09/07/23 23:25:38 INFO mapred.JobClient:     Local bytes written=78378
09/07/23 23:25:38 INFO mapred.JobClient:   Map-Reduce Framework
09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input groups=8
09/07/23 23:25:38 INFO mapred.JobClient:     Combine output records=0
09/07/23 23:25:38 INFO mapred.JobClient:     Map input records=8
09/07/23 23:25:38 INFO mapred.JobClient:     Reduce output records=8
09/07/23 23:25:38 INFO mapred.JobClient:     Map output bytes=315
09/07/23 23:25:38 INFO mapred.JobClient:     Map input bytes=0
09/07/23 23:25:38 INFO mapred.JobClient:     Combine input records=0
09/07/23 23:25:38 INFO mapred.JobClient:     Map output records=8
09/07/23 23:25:38 INFO mapred.JobClient:     Reduce input records=8


On Thu, Jul 23, 2009 at 12:17 PM, bharath vissapragada <
bharat_v@students.iiit.ac.in> wrote:

> since i haven;t started the cluster .. i can even see the details in
> "localhost:<port>/jobTracker.jsp" ..  i didn't even add anything to
> hadoop/conf/hadoop-site.xml
>
>
> On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
> bharat_v@students.iiit.ac.in> wrote:
>
>> Hi all ,
>>
>> I wanted to run HBase in standalone mode to check my Hbase MR programs ...
>> I have dl a built version of hbase-0.20. and i have hadoop 0.19.3
>>
>> "I have set JAVA_HOME in both of them" .. then i started hbase and
>> inserted some tables using JAVA API .. Now i have written some MR programs
>> onHBase and when i run them on Hbase it runs perfectly without any errors
>> and all the Map -reduce statistics are displayed correctly but  i get no
>> output .
>>
>> I have one doubt now .. how do HBase recognize hadoop in stand alone
>> mode(i haven;t started my hadoop even) .. Even simple print statements donot
>> work .. no output is displayed on the screen ... I doubt my config ....
>>
>> Do i need to add some config to run them ... Please reply ...
>>
>
>

Re: Hbase and Hadoop Config to run in Standalone mode

Posted by bharath vissapragada <bh...@students.iiit.ac.in>.
Since I haven't started the cluster, I can't even see the details in
"localhost:<port>/jobTracker.jsp" ..  I didn't even add anything to
hadoop/conf/hadoop-site.xml

On Thu, Jul 23, 2009 at 12:16 PM, bharath vissapragada <
bharat_v@students.iiit.ac.in> wrote:

> Hi all ,
>
> I wanted to run HBase in standalone mode to check my Hbase MR programs ...
> I have dl a built version of hbase-0.20. and i have hadoop 0.19.3
>
> "I have set JAVA_HOME in both of them" .. then i started hbase and inserted
> some tables using JAVA API .. Now i have written some MR programs onHBase
> and when i run them on Hbase it runs perfectly without any errors and all
> the Map -reduce statistics are displayed correctly but  i get no output .
>
> I have one doubt now .. how do HBase recognize hadoop in stand alone mode(i
> haven;t started my hadoop even) .. Even simple print statements donot work
> .. no output is displayed on the screen ... I doubt my config ....
>
> Do i need to add some config to run them ... Please reply ...
>