Posted to user@hbase.apache.org by Shuai Lin <li...@gmail.com> on 2015/01/24 18:25:06 UTC

Delete a region from hbase

Hi all,

We're using hbase 0.94-15 from the CDH4 repo, and we're planning to delete
several regions that contain data we no longer need.

Basically we plan to use HRegion.deleteRegion
<http://archive.cloudera.com/cdh4/cdh/4/hbase-0.94.2-cdh4.2.0/apidocs/org/apache/hadoop/hbase/regionserver/HRegion.html#deleteRegion%28org.apache.hadoop.fs.FileSystem,%20org.apache.hadoop.fs.Path,%20org.apache.hadoop.hbase.HRegionInfo%29>
as described in this article.
<http://prafull-blog.blogspot.jp/2012/06/how-to-delete-hbase-region-including.html>

We can guarantee that no requests will go to these regions during the
deletion. Here are my questions:

-- Are there any caveats to deleting regions this way, especially ones that
could cause downtime? Because we'll be deleting the regions in our
production cluster, we need to be really careful about any possible
consequences.

-- After deleting the region, do we really need to re-create it? If we do
not recreate these regions, there would be "holes" in the rowkey space. Can
we use some tool like hbck to fix this? Another way is to just recreate the
regions, and later merge these empty regions with their neighbors. Which
one is better?

Thanks!

Regards,
Shuai

Re: Delete a region from hbase

Posted by Shuai Lin <li...@gmail.com>.
Hi Stack,

After grepping the HBase code, I see that the HRegion.deleteRegion method
no longer exists in HBase 0.96, so I'd better not use it, since we plan to
upgrade to CDH5 soon.

I tried the steps you described on a local HBase server: first call
close_region from the shell, then delete the region folder from HDFS, and
finally assign the region again from the shell. It works fine.

Thanks for your help!

On Sun, Jan 25, 2015 at 11:03 AM, Stack <st...@duboce.net> wrote:

> The blog is using an API that is @InterfaceAudience.Private. This means you
> are taking a risk and all bets are off.
>
> Better to avoid holes in your table.
>
> It's probably less work just doing the delete yourself, as in:
>
> 1. Close the region from the shell (read up on how this works using shell
> help -- don't do unassign)
> 2. Then just delete the content of the region in HDFS once the region is
> closed (the region dir name in HDFS is the same as the region encoded name,
> the last portion of a region name -- check refguide).
> 3. After the delete in HDFS, call assign region.
>
> Practice in a non-critical setup first.
> St.Ack
>

Re: Delete a region from hbase

Posted by Stack <st...@duboce.net>.
On Sat, Jan 24, 2015 at 9:25 AM, Shuai Lin <li...@gmail.com> wrote:

> Hi all,
>
> We're using hbase 0.94-15 from the CDH4 repo, and we're planning to delete
> several regions that contain data we no longer need.
>
> Basically we plan to use HRegion.deleteRegion
> <
> http://archive.cloudera.com/cdh4/cdh/4/hbase-0.94.2-cdh4.2.0/apidocs/org/apache/hadoop/hbase/regionserver/HRegion.html#deleteRegion%28org.apache.hadoop.fs.FileSystem,%20org.apache.hadoop.fs.Path,%20org.apache.hadoop.hbase.HRegionInfo%29
> >
> as described in this article.
> <
> http://prafull-blog.blogspot.jp/2012/06/how-to-delete-hbase-region-including.html
> >
>
> We can guarantee that no requests will go to these regions during the
> deletion. Here are my questions:
>
> -- Are there any caveats to deleting regions this way, especially ones that
> could cause downtime? Because we'll be deleting the regions in our
> production cluster, we need to be really careful about any possible
> consequences.
>
>
The blog is using an API that is @InterfaceAudience.Private. This means you
are taking a risk and all bets are off.



> -- After deleting the region, do we really need to re-create it? If we do
> not recreate these regions, there would be "holes" in the rowkey space. Can
> we use some tool like hbck to fix this? Another way is to just recreate the
> regions, and later merge these empty regions with their neighbors. Which
> one is better?
>

Better to avoid holes in your table.
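
To make "holes" concrete: a table's regions are expected to form an unbroken
chain, where each region's end key is the next region's start key, the first
start key is empty, and the last end key is empty. The following is only a
minimal sketch of that check, not how hbck implements it. RegionChainCheck,
KeyRange and findHoles are hypothetical names, and keys are modelled as
plain strings here, whereas real row keys are byte arrays compared as
unsigned bytes.

```java
import java.util.ArrayList;
import java.util.List;

public class RegionChainCheck {

    /** A region's key range. An empty start or end key means "unbounded". */
    static final class KeyRange {
        final String startKey;
        final String endKey;

        KeyRange(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        @Override
        public String toString() {
            return "[" + startKey + ", " + endKey + ")";
        }
    }

    /**
     * Returns the key ranges not covered by any region.
     * Assumes the input list is already sorted by start key.
     */
    static List<KeyRange> findHoles(List<KeyRange> sortedRegions) {
        List<KeyRange> holes = new ArrayList<>();
        String covered = "";        // keys below this point are covered so far
        boolean reachedEnd = false; // true once a region with an empty end key is seen
        for (KeyRange region : sortedRegions) {
            if (region.startKey.compareTo(covered) > 0) {
                // Nothing covers [covered, region.startKey): that is a hole.
                holes.add(new KeyRange(covered, region.startKey));
            }
            if (region.endKey.isEmpty()) {
                reachedEnd = true;
                break;
            }
            if (region.endKey.compareTo(covered) > 0) {
                covered = region.endKey;
            }
        }
        if (!reachedEnd) {
            // No region extends to the end of the key space.
            holes.add(new KeyRange(covered, ""));
        }
        return holes;
    }

    public static void main(String[] args) {
        List<KeyRange> regions = new ArrayList<>();
        regions.add(new KeyRange("", "b"));
        regions.add(new KeyRange("b", "d"));
        // The region covering [d, f) was deleted and never recreated.
        regions.add(new KeyRange("f", ""));
        System.out.println(findHoles(regions));
    }
}
```

If the deleted region's directory is emptied but the region itself is
reassigned, as in the steps that follow, the chain stays intact and a check
like this finds nothing.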

It's probably less work just doing the delete yourself, as in:

1. Close the region from the shell (read up on how this works using shell
help -- don't do unassign)
2. Then just delete the content of the region in HDFS once the region is
closed (the region dir name in HDFS is the same as the region encoded name,
the last portion of a region name -- check refguide).
3. After the delete in HDFS, call assign region.
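
For step 2, the region directory can be worked out from the full region
name. The sketch below is illustrative only. It assumes the region name has
the form <table>,<startKey>,<regionId>.<encodedName>. with a 32-character
hex encoded name and a trailing dot, and it assumes the 0.94-style layout
<hbase.rootdir>/<table>/<encodedName> (releases from 0.96 on place tables
under a data/<namespace>/ subdirectory instead). RegionDirs and its methods
are hypothetical helpers, and "/hbase" is just an example root directory.

```java
public class RegionDirs {

    /** Returns the encoded name: the segment between the last two dots of the region name. */
    static String encodedName(String regionName) {
        if (!regionName.endsWith(".")) {
            throw new IllegalArgumentException("expected a trailing '.': " + regionName);
        }
        String withoutTrailingDot = regionName.substring(0, regionName.length() - 1);
        // The start key may itself contain dots, so only the last dot is meaningful.
        String encoded = withoutTrailingDot.substring(withoutTrailingDot.lastIndexOf('.') + 1);
        if (!encoded.matches("[0-9a-f]{32}")) {
            throw new IllegalArgumentException("no 32-char hex encoded name in: " + regionName);
        }
        return encoded;
    }

    /** Returns the table name: everything before the first comma. */
    static String tableName(String regionName) {
        int firstComma = regionName.indexOf(',');
        if (firstComma < 0) {
            throw new IllegalArgumentException("no ',' in region name: " + regionName);
        }
        return regionName.substring(0, firstComma);
    }

    /** Builds the region directory path, assuming the 0.94-style layout. */
    static String regionDir(String rootDir, String regionName) {
        String root = rootDir.endsWith("/")
                ? rootDir.substring(0, rootDir.length() - 1)
                : rootDir;
        return root + "/" + tableName(regionName) + "/" + encodedName(regionName);
    }

    public static void main(String[] args) {
        String name = "TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396.";
        System.out.println(regionDir("/hbase", name));
    }
}
```

Check the resulting path against what is actually in HDFS before deleting
anything, since the layout depends on the HBase version and on the
configured hbase.rootdir.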

Practice in a non-critical setup first.
St.Ack

Re: Delete a region from hbase

Posted by Ted Yu <yu...@gmail.com>.
The benefit of online merge is that the table stays online for the duration
of the region merge.

You can use the HBase shell or the Java API to offline a region.
In HBaseAdmin, we have:

  public void offline(final byte [] regionName)

Is this operation a one-time thing? If you need to perform such cleaning /
merging often, I suggest you consider upgrading to the 0.98 release.

Cheers

On Sat, Jan 24, 2015 at 5:37 PM, Shuai Lin <li...@gmail.com> wrote:

> Hi Ted,
>
> About disabling the table: Thanks for reminding me of this. Disabling the
> table may not be acceptable. Can we close the regions instead, or force
> them offline?
>
> Regards,
> Shuai

Re: Delete a region from hbase

Posted by Shuai Lin <li...@gmail.com>.
Hi Ted,

About disabling the table: Thanks for reminding me of this. Disabling the
table may not be acceptable. Can we close the regions instead, or force
them offline?

Regards,
Shuai

On Sun, Jan 25, 2015 at 1:33 AM, Ted Yu <yu...@gmail.com> wrote:

> The referenced blog starts with:
> 1. Disable the table
>
> Is disabling the table acceptable to you?
>
> In 0.98+, you can use the online merge feature to deal with empty regions.
> But online merge is not in 0.94 - see HBASE-8217
>
> Cheers

Re: Delete a region from hbase

Posted by Ted Yu <yu...@gmail.com>.
The referenced blog starts with:
1. Disable the table

Is disabling the table acceptable to you?

In 0.98+, you can use the online merge feature to deal with empty regions.
But online merge is not in 0.94 - see HBASE-8217

Cheers
