Posted to dev@hbase.apache.org by Matthew Tovbin <ma...@tovbin.com> on 2011/10/18 11:00:33 UTC

Increase number of reducers for bulk data load to empty HBase table

Hello, Guys,

I want to bulk load data from HDFS folders into HBase. For this purpose I
used the configureIncrementalLoad method of HFileOutputFormat, which
configures the job, as follows:

org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.configureIncrementalLoad(job, myTable)

The problem is that the destination table in HBase is empty, meaning it has
only one region hosted by one region server, so the resulting number of
reducers is 1, which makes the job run almost forever.

How can I increase the number of reducers? Can the number of reducers be set
to more than the number of region servers?

Thanks in advance,
     Matthew Tovbin.

Re: Increase number of reducers for bulk data load to empty HBase table

Posted by Matthew Tovbin <ma...@tovbin.com>.

Worked like a charm. Thanks! ;)

Best regards,
   Matthew Tovbin =)



On Tue, Oct 18, 2011 at 19:30, Jean-Daniel Cryans <jd...@apache.org> wrote:

> (putting dev@ in bcc, please don't cross-post)
>
> You need to pre-split that table, see
> http://hbase.apache.org/book.html#precreate.regions
>
> J-D
>
> On Tue, Oct 18, 2011 at 2:00 AM, Matthew Tovbin <ma...@tovbin.com> wrote:
>
> > Hello, Guys,
> >
> > I'm willing to bulk load data from hdfs folders into HBase, for this
> > purpose I used configureIncrementalLoad method from HFileOutputFormat
> > that configures the job, as follows:
> >
> > org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.configureIncrementalLoad(job, myTable)
> >
> > The problem is that destination table in HBase is empty, meaning it's
> > only hosted by one region server, so the resulted number of reducers is
> > 1, which makes the job to run almost forever.
> >
> > How can I increase the number of reducers? Can the number of reducers be
> > set to more than a number of region servers?
> >
> > Thanks in advance,
> >      Matthew Tovbin.
> >
>

Re: Increase number of reducers for bulk data load to empty HBase table

Posted by Jean-Daniel Cryans <jd...@apache.org>.

(putting dev@ in bcc, please don't cross-post)

You need to pre-split that table, see
http://hbase.apache.org/book.html#precreate.regions

J-D

On Tue, Oct 18, 2011 at 2:00 AM, Matthew Tovbin <ma...@tovbin.com> wrote:

> Hello, Guys,
>
> I'm willing to bulk load data from hdfs folders into HBase, for this
> purpose I used configureIncrementalLoad method from HFileOutputFormat
> that configures the job, as follows:
>
> org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.configureIncrementalLoad(job, myTable)
>
> The problem is that destination table in HBase is empty, meaning it's only
> hosted by one region server, so the resulted number of reducers is 1, which
> makes the job to run almost forever.
>
> How can I increase the number of reducers? Can the number of reducers be
> set to more than a number of region servers?
>
> Thanks in advance,
>      Matthew Tovbin.
>


Re: Increase number of reducers for bulk data load to empty HBase table

Posted by Karthik Ranganathan <kr...@fb.com>.

Hey Matthew,

The only way to increase the number of reducers is to have more regions.
Each reducer writes the output for one region, so the number of reducers ==
number of regions.

Thanks
Karthik


On 10/18/11 2:00 AM, "Matthew Tovbin" <ma...@tovbin.com> wrote:

>Hello, Guys,
>
>I'm willing to bulk load data from hdfs folders into HBase, for this
>purpose I used configureIncrementalLoad method from HFileOutputFormat
>that configures the job, as follows:
>
>org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.configureIncrementalLoad(job, myTable)
>
>The problem is that destination table in HBase is empty, meaning it's only
>hosted by one region server, so the resulted number of reducers is 1,
>which makes the job to run almost forever.
>
>How can I increase the number of reducers? Can the number of reducers be
>set to more than a number of region servers?
>
>Thanks in advance,
>     Matthew Tovbin.
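
As a rough illustration of the pre-splitting advice in this thread, the
sketch below computes evenly spaced split keys using only the JDK. It
assumes row keys are fixed-width lowercase hex strings spread uniformly over
the key range, which may not match your data. The class and method names
(SplitKeySketch, hexSplits) are made up for this example. The resulting
byte[][] is the kind of argument taken by the
HBaseAdmin.createTable(HTableDescriptor, byte[][] splitKeys) overload; that
call is only mentioned in a comment here so the sketch runs without HBase on
the classpath.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase: computes split keys for a table
// whose row keys are assumed to be fixed-width lowercase hex strings.
final class SplitKeySketch {

    // Returns numRegions - 1 split keys evenly spaced between startKey and
    // endKey (both hex strings). A table created with these split keys has
    // numRegions regions.
    static byte[][] hexSplits(String startKey, String endKey, int numRegions) {
        if (numRegions < 2) {
            throw new IllegalArgumentException("need at least 2 regions to split");
        }
        BigInteger lowest = new BigInteger(startKey, 16);
        BigInteger highest = new BigInteger(endKey, 16);
        if (lowest.compareTo(highest) >= 0) {
            throw new IllegalArgumentException("startKey must be below endKey");
        }
        BigInteger step = highest.subtract(lowest).divide(BigInteger.valueOf(numRegions));
        if (step.signum() == 0) {
            throw new IllegalArgumentException("key range too small for numRegions");
        }
        // Pad every key to the width of the longer boundary key.
        int width = Math.max(startKey.length(), endKey.length());

        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            BigInteger key = lowest.add(step.multiply(BigInteger.valueOf(i)));
            String hex = String.format("%0" + width + "x", key);
            splits[i - 1] = hex.getBytes(StandardCharsets.US_ASCII);
        }
        // With HBase available, the keys would be handed to something like:
        //   admin.createTable(tableDescriptor, splits);
        return splits;
    }
}
```

Calling SplitKeySketch.hexSplits("00000000", "ffffffff", 4) yields three
split keys and therefore four regions, so a job configured with
configureIncrementalLoad against that table would get four reducers, per the
replies above. If the real keys are skewed, evenly spaced splits will leave
some reducers with far more data than others, so splits derived from a
sample of the input are likely a better fit.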
