Posted to user@hbase.apache.org by T Vinod Gupta <tv...@readypulse.com> on 2012/01/12 19:35:46 UTC

does increasing region filesize followed by major compactions supposed to reduce number of regions?

This is probably a rookie question, but my understanding is that if I
increase the region max file size and then initiate a major compaction
manually, the number of regions should ideally go down by the factor by
which I increased the region max file size. Isn't that true? I'm not seeing
that happen. Of course, my compaction is still underway, so I don't know
if I'll see the region count decrease at the very end of it. Major compaction
does do region merging, right?

thanks

Re: efficient export w/o HDFS/copying

Posted by Stack <st...@duboce.net>.
On Wed, Mar 28, 2012 at 12:59 AM, Michel Segel
<mi...@hotmail.com> wrote:
> Wouldn't that mean having the NAS attached to all of the nodes in the cluster?
>

Yes.  That was the presumption.
St.Ack

Re: efficient export w/o HDFS/copying

Posted by Michel Segel <mi...@hotmail.com>.
Wouldn't that mean having the NAS attached to all of the nodes in the cluster?


Mike Segel

On Mar 26, 2012, at 11:07 PM, Stack <st...@duboce.net> wrote:

> On Mon, Mar 26, 2012 at 4:31 PM, Ted Tuttle <te...@mentacapital.com> wrote:
>> Is there a method of exporting that skips the HDFS step?  We would
>> ideally like to export from HBase directly to an external filesystem
>> (e.g. our big slow NAS) skipping the HDFS step.
>> 
> 
> Do an OutputFormat that just writes files to your NAS and hook it up
> to the export tool in place of SequenceFileOutputFormat.  Set your new
> NASOutputFormat instead of SequenceFileOutputFormat here:
> http://hbase.apache.org/xref/org/apache/hadoop/hbase/mapreduce/Export.html#99
> (You'll probably have to override Exporter to do your customization
> copying bulk of createSubmittableJob into subclass)
> 
> St.Ack
> 

Re: efficient export w/o HDFS/copying

Posted by Stack <st...@duboce.net>.
On Mon, Mar 26, 2012 at 4:31 PM, Ted Tuttle <te...@mentacapital.com> wrote:
> Is there a method of exporting that skips the HDFS step?  We would
> ideally like to export from HBase directly to an external filesystem
> (e.g. our big slow NAS) skipping the HDFS step.
>

Do an OutputFormat that just writes files to your NAS and hook it up
to the export tool in place of SequenceFileOutputFormat.  Set your new
NASOutputFormat instead of SequenceFileOutputFormat here:
http://hbase.apache.org/xref/org/apache/hadoop/hbase/mapreduce/Export.html#99
(You'll probably have to override Exporter to do your customization
copying bulk of createSubmittableJob into subclass)

St.Ack
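
For reference, a simpler variant than the custom OutputFormat described above (a swapped-in technique, not what Stack proposes): if the NAS is mounted at the same path on every node, the stock export and import jobs may accept a file:// URI as their output or input directory. This is a minimal sketch under that assumption. The table name, the /mnt/nas/hbase-export mount point, and the DRY_RUN switch are hypothetical, and the approach has not been verified against 0.92.

```shell
#!/bin/sh
# Sketch only: point the stock export/import jobs at a NAS path via a file:// URI.
# Assumptions (not from the thread): the NAS is mounted at the same path on
# every node; TABLE and NAS_DIR are hypothetical placeholders.
HBASE_JAR="${HBASE_JAR:-/usr/lib/hbase/hbase-0.92.0.jar}"
TABLE="${TABLE:-mytable}"
NAS_DIR="${NAS_DIR:-/mnt/nas/hbase-export}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Export the table directly onto the NAS mount (no intermediate HDFS dump).
export_to_nas() {
    run hadoop jar "$HBASE_JAR" export "$TABLE" "file://$NAS_DIR/$TABLE"
}

# Restore the table from the same NAS directory.
import_from_nas() {
    run hadoop jar "$HBASE_JAR" import "$TABLE" "file://$NAS_DIR/$TABLE"
}

export_to_nas
```

With DRY_RUN=1 (the default here) the script only prints the command it would run; set DRY_RUN=0 to execute it. Every map task writes to the mount from its own node, so a slow NAS is likely to become the bottleneck.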

efficient export w/o HDFS/copying

Posted by Ted Tuttle <te...@mentacapital.com>.
Hello All-

We've been experimenting w/ exporting and restoring our cluster data
from those exports.  Our current methodology has the following steps:

* Dump the table from HBase to HDFS:
    hadoop jar /usr/lib/hbase/hbase-0.92.0.jar export <tablename> <HDFS location>
* Copy the HDFS dump to the local filesystem:
    hadoop dfs -copyToLocal <HDFS location> <Linux path>
* Import the Linux dump into HDFS with:
    hadoop dfs -copyFromLocal <Linux path, inc dumpdir> <HDFS location>
* Import the HDFS data into HBase:
    hadoop jar /usr/lib/hbase/hbase-0.92.0.jar import <tablename> <HDFS location>

Is there a method of exporting that skips the HDFS step?  We would
ideally like to export from HBase directly to an external filesystem
(e.g. our big slow NAS) skipping the HDFS step.

Any thoughts or links would be appreciated.
 
-Ted 

ext3 vs. ext4

Posted by Ted Tuttle <te...@mentacapital.com>.
Hello All-

I've searched this list and re-read the section in the George book on this
topic.  From the book I get the impression ext3 is used more widely but
ext4 is gaining popularity.  From this list I see quite a few ext4
recommendations but very little in the way of justification.

Is there anyone out there that has benchmarked an ext3-based cluster vs.
an ext4-based cluster?

-Ted 

Re: does increasing region filesize followed by major compactions supposed to reduce number of regions?

Posted by Doug Meil <do...@explorysmedical.com>.
Thanks Ian!  :-)



On 1/12/12 2:03 PM, "Ian Varley" <iv...@salesforce.com> wrote:

>Vinod,
>
>The answers to your questions (and so many more!) are easily found in the
>HBase Reference Guide:
>
>http://hbase.apache.org/book.html#schema.versions
>
>"Excess versions are removed during major compactions."
>
> - Ian "Doug Meil" Varley ;)

Re: does increasing region filesize followed by major compactions supposed to reduce number of regions?

Posted by Ian Varley <iv...@salesforce.com>.
Vinod,

The answers to your questions (and so many more!) are easily found in the HBase Reference Guide:

http://hbase.apache.org/book.html#schema.versions

"Excess versions are removed during major compactions."

 - Ian "Doug Meil" Varley ;)
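
For reference, a minimal sketch of the sequence being discussed: lower VERSIONS on a column family, then request a major compaction so that excess versions are dropped. The table name mytable and family name cf are hypothetical. The sketch assumes a 0.92-era setup where the table has to be disabled before an alter. It only prints HBase shell commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: emit HBase shell commands that lower VERSIONS on one column
# family and then request a major compaction.
# usage: versions_script TABLE FAMILY VERSIONS   (all names are placeholders)
versions_script() {
    table=$1
    family=$2
    versions=$3
    cat <<EOF
disable '$table'
alter '$table', {NAME => '$family', VERSIONS => $versions}
enable '$table'
major_compact '$table'
EOF
}

# Review the output, then pipe it into the shell, for example:
#   versions_script mytable cf 1 | hbase shell
versions_script mytable cf 1
```

major_compact only queues the compaction and returns immediately, so the older versions disappear from disk once the compaction has actually finished.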

On Jan 12, 2012, at 10:59 AM, T Vinod Gupta wrote:

> Thanks, I'll take a look. Meanwhile, I just decreased the versions for my
> column families from 3 to 1 and triggered another compaction. Does this
> make HBase delete the previous versions and keep only the latest one?
>
> thanks

Re: does increasing region filesize followed by major compactions supposed to reduce number of regions?

Posted by T Vinod Gupta <tv...@readypulse.com>.
Thanks, I'll take a look. Meanwhile, I just decreased the versions for my
column families from 3 to 1 and triggered another compaction. Does this
make HBase delete the previous versions and keep only the latest one?

thanks

On Thu, Jan 12, 2012 at 10:57 AM, kisalay <ki...@gmail.com> wrote:

> Vinod,
>
> You will have to merge the regions after the major compaction to decrease
> the number of regions that you have. You can do either an online or an
> offline merge.
>
> You can pick up the online merge JRuby script from the JIRA
> https://issues.apache.org/jira/browse/HBASE-1621
>
> ~Kisalay

Re: does increasing region filesize followed by major compactions supposed to reduce number of regions?

Posted by kisalay <ki...@gmail.com>.
Vinod,

You will have to merge the regions after the major compaction to decrease
the number of regions that you have. You can do either an online or an
offline merge.

You can pick up the online merge JRuby script from the JIRA
https://issues.apache.org/jira/browse/HBASE-1621

~Kisalay
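
For reference, a minimal sketch of the offline route using HBase's bundled Merge utility (org.apache.hadoop.hbase.util.Merge). It assumes HBase itself is shut down while HDFS stays up, which the utility is documented to require. The table name, the region-name placeholders, and the DRY_RUN switch are hypothetical. Copy the full region names from the master UI or .META. before running anything.

```shell
#!/bin/sh
# Sketch only: offline merge of two adjacent regions with HBase's Merge utility.
# Assumption: HBase is stopped (HDFS still running) when this is executed.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command instead of running it

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: merge_regions TABLE REGION_1 REGION_2
merge_regions() {
    if [ "$#" -ne 3 ]; then
        echo "usage: merge_regions <table> <region-1> <region-2>" >&2
        return 1
    fi
    run hbase org.apache.hadoop.hbase.util.Merge "$1" "$2" "$3"
}

# Placeholder arguments; substitute real, adjacent region names.
merge_regions mytable '<region-1-name>' '<region-2-name>'
```

Each call merges one pair of regions, so bringing the count down by a large factor means repeating it for every adjacent pair.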

On Fri, Jan 13, 2012 at 12:13 AM, T Vinod Gupta <tv...@readypulse.com> wrote:

> so there is no way to make the regions merge since they are much below
> region max size?

Re: does increasing region filesize followed by major compactions supposed to reduce number of regions?

Posted by T Vinod Gupta <tv...@readypulse.com>.
so there is no way to make the regions merge since they are much below
region max size?

On Thu, Jan 12, 2012 at 10:40 AM, Doug Meil
<do...@explorysmedical.com> wrote:

>
> Major compactions don't change the number of regions.

Re: does increasing region filesize followed by major compactions supposed to reduce number of regions?

Posted by Doug Meil <do...@explorysmedical.com>.
Major compactions don't change the number of regions.





On 1/12/12 1:35 PM, "T Vinod Gupta" <tv...@readypulse.com> wrote:

>This is probably a rookie question, but my understanding is that if I
>increase the region max file size and then initiate a major compaction
>manually, the number of regions should ideally go down by the factor by
>which I increased the region max file size. Isn't that true? I'm not seeing
>that happen. Of course, my compaction is still underway, so I don't know
>if I'll see the region count decrease at the very end of it. Major compaction
>does do region merging, right?
>
>thanks