Posted to user@hbase.apache.org by Lucas Nazário dos Santos <na...@gmail.com> on 2009/08/03 23:30:09 UTC

Problem with TableInputFormat - HBase 0.20

Hi,

I'm migrating from HBase 0.19 to version 0.20 and facing an error regarding
the TableInputFormat class. Below is how I'm setting up the job, along with
the error message I'm getting.

Does anybody have a clue as to what may be happening? It used to work on
HBase 0.19.

Lucas


this.configuration.set(TableInputFormat.INPUT_TABLE, args[0]);
this.configuration.set(TableInputFormat.SCAN, "date");
this.configuration.set("index.name", args[1]);
this.configuration.set("hbase.master", args[2]);
this.configuration.set("index.replication.level", args[3]);

final Job jobConf = new Job(this.configuration);
jobConf.setJarByClass(Indexer.class);
jobConf.setJobName("NInvestNewsIndexer");

FileInputFormat.setInputPaths(jobConf, new Path(args[0]));

jobConf.setInputFormatClass(TableInputFormat.class);
jobConf.setOutputFormatClass(NullOutputFormat.class);

jobConf.setOutputKeyClass(Text.class);
jobConf.setOutputValueClass(Text.class);

jobConf.setMapperClass(MapChangedTableRowsIntoUrls.class);
jobConf.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);




09/08/03 18:19:19 ERROR mapreduce.TableInputFormat: An error occurred.
java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:135)
        at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:493)
        at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertStringToScan(TableMapReduceUtil.java:94)
        at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:79)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
        at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Exception in thread "main" java.lang.NullPointerException
        at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:280)
        at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
        at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Re: Problem with TableInputFormat - HBase 0.20

Posted by stack <st...@duboce.net>.
Thanks for posting the below, Lucas.
St.Ack

On Sat, Aug 8, 2009 at 3:45 PM, Lucas Nazário dos Santos <nazario.lucas@gmail.com> wrote:

> After a couple of tries, the following code worked. I didn't know about the
> TableMapReduceUtil class, and the solution may be of interest to others.
>
> final Job job = new Job(this.configuration);
> job.setJarByClass(Indexer.class);
> job.setJobName("NInvestNewsIndexer");
>
> final Scan scan = new Scan();
> scan.addColumn("date".getBytes(), "crawled".getBytes());
> scan.addColumn("date".getBytes(), "indexed".getBytes());
> TableMapReduceUtil.initTableMapperJob(args[0], scan,
> MapChangedTableRowsIntoUrls.class, Text.class, Text.class, job);
>
> job.setOutputFormatClass(NullOutputFormat.class);
> job.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);

Re: Problem with TableInputFormat - HBase 0.20

Posted by Lucas Nazário dos Santos <na...@gmail.com>.
After a couple of tries, the following code worked. I didn't know about the
TableMapReduceUtil class, and the solution may be of interest to others.

final Job job = new Job(this.configuration);
job.setJarByClass(Indexer.class);
job.setJobName("NInvestNewsIndexer");

final Scan scan = new Scan();
scan.addColumn("date".getBytes(), "crawled".getBytes());
scan.addColumn("date".getBytes(), "indexed".getBytes());
TableMapReduceUtil.initTableMapperJob(args[0], scan,
MapChangedTableRowsIntoUrls.class, Text.class, Text.class, job);

job.setOutputFormatClass(NullOutputFormat.class);
job.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);
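
For what it's worth, judging by the stack trace (convertStringToScan calling
Scan.readFields), initTableMapperJob seems to work by serializing the Scan
into the job configuration. That would explain the EOFException: I was
setting TableInputFormat.SCAN to the plain string "date", which can't be
deserialized back into a Scan. If you need to wire TableInputFormat up by
hand, a sketch like the one below should be roughly equivalent. I haven't
verified this exact code; the Base64 class is the one HBase bundles in
org.apache.hadoop.hbase.util.

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.util.Base64;

// Serialize the Scan the way TableInputFormat expects to find it in the
// configuration: the Base64 form of the Scan's Writable serialization,
// not a column or family name.
private static String scanToString(final Scan scan) throws IOException {
    final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    final DataOutputStream out = new DataOutputStream(buffer);
    scan.write(out);  // Scan implements Writable in HBase 0.20
    out.close();
    return Base64.encodeBytes(buffer.toByteArray());
}

// Then, instead of configuration.set(TableInputFormat.SCAN, "date"):
configuration.set(TableInputFormat.INPUT_TABLE, args[0]);
configuration.set(TableInputFormat.SCAN, scanToString(scan));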



On Mon, Aug 3, 2009 at 8:31 PM, Amandeep Khurana <am...@gmail.com> wrote:

> The implementation in the new package is different from the old one. So, if
> you want to use it in the same way as you used to use the old one, you'll
> have to stick to the mapred package till the time you upgrade the code
> according to the new implementation.

Re: Problem with TableInputFormat - HBase 0.20

Posted by Amandeep Khurana <am...@gmail.com>.
The implementation in the new package is different from the old one. So, if
you want to use it the same way you used to, you'll have to stick with the
mapred package until you've upgraded your code to the new implementation.
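
For example, a job that sticks with the old API would look roughly like this.
This is an untested sketch: OldStyleMapper is a placeholder for a mapper
rewritten against the old TableMap interface, and the column list is guessed
from your "date" family.

import org.apache.hadoop.hbase.mapred.TableMapReduceUtil;  // mapred, not mapreduce
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;

final JobConf jobConf = new JobConf(Indexer.class);
jobConf.setJobName("NInvestNewsIndexer");

// The old API takes a space-separated "family:qualifier" column list
// instead of a serialized Scan ("date:" means the whole family).
TableMapReduceUtil.initTableMapJob(args[0], "date:",
    OldStyleMapper.class, Text.class, Text.class, jobConf);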


On Mon, Aug 3, 2009 at 3:45 PM, Lucas Nazário dos Santos <nazario.lucas@gmail.com> wrote:

> Thanks, but I don't get it. Why should I stick with the old mapred package
> if I'm moving everything to Hadoop and HBase 0.20? Everything in the old
> mapred package is deprecated.

Re: Problem with TableInputFormat - HBase 0.20

Posted by Lucas Nazário dos Santos <na...@gmail.com>.
Thanks, but I don't get it. Why should I stick with the old mapred package
if I'm moving everything to Hadoop and HBase 0.20? Everything in the old
mapred package is deprecated.



On Mon, Aug 3, 2009 at 7:31 PM, stack <st...@duboce.net> wrote:

> Looks like crossed lines.
>
> In Hadoop 0.20.0, there is the mapred package and the mapreduce package.
> The latter has the new lump-sum context to which you go for all things.
> HBase has something similar: the new mapreduce package in HBase 0.20.0 is
> the old mapred redone to fit the new Hadoop APIs. In your stack trace I see
> use of the new HBase mapreduce stuff, though you would hew to the old
> interface. Try using the stuff in the mapred package?
>
> St.Ack

Re: Problem with TableInputFormat - HBase 0.20

Posted by stack <st...@duboce.net>.
Looks like crossed lines.

In Hadoop 0.20.0, there is the mapred package and the mapreduce package.
The latter has the new lump-sum context to which you go for all things.
HBase has something similar: the new mapreduce package in HBase 0.20.0 is
the old mapred redone to fit the new Hadoop APIs. In your stack trace I see
use of the new HBase mapreduce stuff, though you would hew to the old
interface. Try using the stuff in the mapred package?
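
For reference, this is roughly what the new interface expects a mapper to
look like -- everything goes through the one context object. A rough sketch
only; the class name is made up and your output types will differ:

import java.io.IOException;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;

public class MyMapper extends TableMapper<Text, Text> {
  @Override
  protected void map(ImmutableBytesWritable row, Result value, Context context)
      throws IOException, InterruptedException {
    // Writes, counters, and status reporting all go through the one context.
    context.write(new Text(Bytes.toString(row.get())), new Text("..."));
  }
}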

St.Ack

