Posted to mapreduce-user@hadoop.apache.org by Ondrej Holecek <on...@holecek.eu> on 2011/02/18 15:05:54 UTC

mapreduce streaming with hbase as a source

Hello,

I'm testing Hadoop and HBase. I can run MapReduce streaming or pipes jobs against text files on
Hadoop, but I have a problem when I try to run the same job against an HBase table.

The table looks like this:
hbase(main):015:0> scan 'table1'
ROW                                                COLUMN+CELL
 row1                                              column=family1:a, timestamp=1298037737154, value=1
 row1                                              column=family1:b, timestamp=1298037744658, value=2
 row1                                              column=family1:c, timestamp=1298037748020, value=3
 row2                                              column=family1:a, timestamp=1298037755440, value=11
 row2                                              column=family1:b, timestamp=1298037758241, value=22
 row2                                              column=family1:c, timestamp=1298037761198, value=33
 row3                                              column=family1:a, timestamp=1298037767127, value=111
 row3                                              column=family1:b, timestamp=1298037770111, value=222
 row3                                              column=family1:c, timestamp=1298037774954, value=333
3 row(s) in 0.0240 seconds


Here is the command I use, along with the exception I get:

# hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-0.20.2+737.jar \
    -D hbase.mapred.tablecolumn=family1: \
    -input table1 -output /mtestout45 \
    -mapper test-map -numReduceTasks 1 -reducer test-reduce \
    -inputformat org.apache.hadoop.hbase.mapred.TableInputFormat

packageJobJar: [/var/lib/hadoop/cache/root/hadoop-unjar8960137205806573426/] []
/tmp/streamjob8218197708173702571.jar tmpDir=null
11/02/18 14:45:48 INFO mapred.JobClient: Cleaning up the staging area
hdfs://oho-nnm.dev.chservices.cz/var/lib/hadoop/cache/mapred/mapred/staging/root/.staging/job_201102151449_0035
Exception in thread "main" java.lang.RuntimeException: Error in configuring object
	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
	at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:597)
	at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:926)
	at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:918)
	at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:834)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:793)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:793)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:767)
	at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:922)
	at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:123)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
	... 23 more
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hbase.mapred.TableInputFormat.configure(TableInputFormat.java:51)
	... 28 more


Can anyone tell me what I am doing wrong?

Regards,
Ondrej

Re: mapreduce streaming with hbase as a source

Posted by Jean-Daniel Cryans <jd...@apache.org>.
(moving to the hbase user ML)

I think streaming used to work correctly in HBase 0.19, since the
RowResult class exposed the value (which you had to parse out). But
now that Result is made of KeyValues, and KeyValue doesn't include the
value in its toString(), I don't see how TableInputFormat can be used
directly. You could write your own InputFormat that wraps around TIF
and returns a specific format for each cell, though.
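
To give the idea, here is a rough, untested sketch against the 0.20-era
org.apache.hadoop.hbase.mapred API (the class name TextTableInputFormat
and the family:qualifier=value layout are made up for illustration):

import java.io.IOException;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapred.TableInputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.InputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobConfigurable;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

/** Wraps the old-API TableInputFormat and renders each row as text, values included. */
public class TextTableInputFormat implements InputFormat<Text, Text>, JobConfigurable {

  private final TableInputFormat delegate = new TableInputFormat();

  public void configure(JobConf job) {
    delegate.configure(job);  // reads hbase.mapred.tablecolumns etc.
  }

  public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException {
    return delegate.getSplits(job, numSplits);
  }

  public RecordReader<Text, Text> getRecordReader(InputSplit split, JobConf job,
      Reporter reporter) throws IOException {
    final RecordReader<ImmutableBytesWritable, Result> inner =
        delegate.getRecordReader(split, job, reporter);
    return new RecordReader<Text, Text>() {
      public boolean next(Text key, Text value) throws IOException {
        ImmutableBytesWritable row = inner.createKey();
        Result result = inner.createValue();
        if (!inner.next(row, result)) {
          return false;
        }
        key.set(Bytes.toString(row.get()));
        // Render every cell as family:qualifier=value, space separated,
        // so the streaming script finally sees the cell values.
        StringBuilder sb = new StringBuilder();
        for (KeyValue kv : result.raw()) {
          if (sb.length() > 0) sb.append(' ');
          sb.append(Bytes.toString(kv.getColumn()))
            .append('=')
            .append(Bytes.toString(kv.getValue()));
        }
        value.set(sb.toString());
        return true;
      }
      public Text createKey() { return new Text(); }
      public Text createValue() { return new Text(); }
      public long getPos() throws IOException { return inner.getPos(); }
      public float getProgress() throws IOException { return inner.getProgress(); }
      public void close() throws IOException { inner.close(); }
    };
  }
}

You would ship that class in a jar, add it with -libjars, and pass
-inputformat TextTableInputFormat instead of the stock one.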

Hope that somehow helps,

J-D

2011/2/19 Ondrej Holecek <on...@holecek.eu>:
> I don't think you understood me correctly.
>
> I get this line:
>
> 72 6f 77 31     keyvalues={row1/family1:a/1298037737154/Put/vlen=1,
> row1/family1:b/1298037744658/Put/vlen=1, row1/family1:c/1298037748020/Put/vlen=1}
>
> I know "72 6f 77 31" is the key and the rest is value, let's call it
> mapreduce-value. In this mapreduce-value there is
> "row1/family1:a/1298037737154/Put/vlen=1" that is hbase-row name, hbase-column
> name and hbase-timestamp.  But I expect also hbase-value.
>
> So my question is what to do to make TableInputFormat to send also this hbase-value.
>
>
> Ondrej
>
> [rest of quoted thread snipped]

Re: mapreduce streaming with hbase as a source

Posted by Ondrej Holecek <on...@holecek.eu>.
I don't think you understood me correctly.

I get this line:

72 6f 77 31     keyvalues={row1/family1:a/1298037737154/Put/vlen=1,
row1/family1:b/1298037744658/Put/vlen=1, row1/family1:c/1298037748020/Put/vlen=1}

I know "72 6f 77 31" is the key and the rest is value, let's call it
mapreduce-value. In this mapreduce-value there is
"row1/family1:a/1298037737154/Put/vlen=1" that is hbase-row name, hbase-column
name and hbase-timestamp.  But I expect also hbase-value.

So my question is what to do to make TableInputFormat to send also this hbase-value.


Ondrej


On 02/19/11 16:41, ShengChang Gu wrote:
> By default, the prefix of a line up to the first tab character is the key,
> and the rest of the line (excluding the tab character) is the value. If
> there is no tab character in the line, the entire line is treated as the
> key and the value is null. However, this can be customized; use:
>  
> -D stream.map.output.field.separator=.
> -D stream.num.map.output.key.fields=4
>
> [rest of quoted thread snipped]


Re: mapreduce streaming with hbase as a source

Posted by ShengChang Gu <gu...@gmail.com>.
By default, the prefix of a line up to the first tab character is the key,
and the rest of the line (excluding the tab character) is the value. If
there is no tab character in the line, the entire line is treated as the
key and the value is null. However, this can be customized; use:

-D stream.map.output.field.separator=.
-D stream.num.map.output.key.fields=4
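
For example, a minimal illustrative run (using the identity mapper and
reducer from the Hadoop tree; the input/output paths are placeholders):

hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
    -D stream.map.output.field.separator=. \
    -D stream.num.map.output.key.fields=4 \
    -input myInput -output myOutput \
    -mapper org.apache.hadoop.mapred.lib.IdentityMapper \
    -reducer org.apache.hadoop.mapred.lib.IdentityReducer

With those settings, a map output line like a.b.c.d.e is split at the dots:
a.b.c.d becomes the key and e the value.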

2011/2/19 Ondrej Holecek <on...@holecek.eu>

> Thank you, I've spent a lot of time debugging but didn't notice this typo :(
>
> Now it works, but I don't understand one thing: On stdin I get this:
>
> 72 6f 77 31     keyvalues={row1/family1:a/1298037737154/Put/vlen=1,
> row1/family1:b/1298037744658/Put/vlen=1,
> row1/family1:c/1298037748020/Put/vlen=1}
> 72 6f 77 32     keyvalues={row2/family1:a/1298037755440/Put/vlen=2,
> row2/family1:b/1298037758241/Put/vlen=2,
> row2/family1:c/1298037761198/Put/vlen=2}
> 72 6f 77 33     keyvalues={row3/family1:a/1298037767127/Put/vlen=3,
> row3/family1:b/1298037770111/Put/vlen=3,
> row3/family1:c/1298037774954/Put/vlen=3}
>
> I see everything there except the value. What should I do to get the value on stdin too?
>
> Ondrej
>
> On 02/18/11 20:01, Jean-Daniel Cryans wrote:
> > You have a typo, it's hbase.mapred.tablecolumns not hbase.mapred.tablecolumn
> >
> > J-D
> > [original message snipped]


-- 
阿昌

Re: mapreduce streaming with hbase as a source

Posted by Ondrej Holecek <on...@holecek.eu>.
Thank you, I've spent a lot of time debugging but didn't notice this typo :(

Now it works, but I don't understand one thing: On stdin I get this:

72 6f 77 31     keyvalues={row1/family1:a/1298037737154/Put/vlen=1,
row1/family1:b/1298037744658/Put/vlen=1, row1/family1:c/1298037748020/Put/vlen=1}
72 6f 77 32     keyvalues={row2/family1:a/1298037755440/Put/vlen=2,
row2/family1:b/1298037758241/Put/vlen=2, row2/family1:c/1298037761198/Put/vlen=2}
72 6f 77 33     keyvalues={row3/family1:a/1298037767127/Put/vlen=3,
row3/family1:b/1298037770111/Put/vlen=3, row3/family1:c/1298037774954/Put/vlen=3}

I see everything there except the value. What should I do to get the value on stdin too?

Ondrej

On 02/18/11 20:01, Jean-Daniel Cryans wrote:
> You have a typo, it's hbase.mapred.tablecolumns not hbase.mapred.tablecolumn
> 
> J-D
> 
> On Fri, Feb 18, 2011 at 6:05 AM, Ondrej Holecek <on...@holecek.eu> wrote:
>> [original message snipped]


Re: mapreduce streaming with hbase as a source

Posted by Jean-Daniel Cryans <jd...@apache.org>.
You have a typo: it's hbase.mapred.tablecolumns, not hbase.mapred.tablecolumn.
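
With the property name fixed, the command from the original mail becomes:

hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-0.20.2+737.jar \
    -D hbase.mapred.tablecolumns=family1: \
    -input table1 -output /mtestout45 \
    -mapper test-map -numReduceTasks 1 -reducer test-reduce \
    -inputformat org.apache.hadoop.hbase.mapred.TableInputFormat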

J-D

On Fri, Feb 18, 2011 at 6:05 AM, Ondrej Holecek <on...@holecek.eu> wrote:
> [original message snipped]