Posted to user@hbase.apache.org by Bing Jiang <ji...@gmail.com> on 2013/07/10 13:25:26 UTC

【region compaction check key-bound or not】

Hi all,

When a region splits, each child region gets reference files that point
into the parent's store files. A later compaction in the child region
rewrites those references into real store files. My question is how the
scanner behaves during that compaction.
Compaction uses a scanner too, so does it set the child region's startKey
and endKey as bounds on the scan, to guarantee that the store files written
for the child region contain no keys outside its range? I cannot find the
code that does this. BTW, we use 0.94.3.

Thanks.

-- 
Bing Jiang
weibo: http://weibo.com/jiangbinglover
BLOG: http://blog.sina.com.cn/jiangbinglover
National Research Center for Intelligent Computing Systems
Institute of Computing technology
Graduate University of Chinese Academy of Science

Re: 【region compaction check key-bound or not】

Posted by Bing Jiang <ji...@gmail.com>.
Thanks, Ted and Sergey.
I just want to build a tool that moves some regions from one table to
another. I have considered bulk load and snapshots, but both cost more in
our case.
So could I change the region's compaction so that it adds the region's own
key bounds to the scan?
That would also help in some failure cases, for example a split that fails
partway and cannot be rolled back.

Store.java:
  StoreFile.Writer compactStore(final Collection<StoreFile> filesToCompact,
                                final boolean majorCompaction, final long maxId)
  {
    ....
    if (scanner == null) {
      Scan scan = new Scan();
      // Proposed addition: bound the compaction scan by the region's key range.
      // Scan exposes setStartRow/setStopRow; the stop row is exclusive.
      scan.setStartRow(getHRegion().getStartKey());
      scan.setStopRow(getHRegion().getEndKey());
      scan.setMaxVersions(getFamily().getMaxVersions());
      /* Include deletes, unless we are doing a major compaction */
      scanner = new StoreScanner(this, getScanInfo(), scan, scanners,
          majorCompaction ? ScanType.MAJOR_COMPACT : ScanType.MINOR_COMPACT,
          smallestReadPoint, earliestPutTs);
    }
    ...
  }


Please correct me if I am mistaken.
Thanks.


2013/7/11 Ted Yu <yu...@gmail.com>

> bq. upgrade to 0.94.6
>
> Nit: upgrade to 0.94.6.1 (or 0.94.9 which was just released)
>
> Cheers
>
> On Wed, Jul 10, 2013 at 6:04 PM, Sergey Shelukhin <sergey@hortonworks.com
> >wrote:
>
> > Yeah, this is not going to work... moving storefiles manually between
> > regions is generally not a good idea.
> > Do you want to move entire table into new table? Then the best thing is
> > probably to upgrade to 0.94.6 (rolling restart is supported from 0.94.3)
> > and use HBase snapshots. You can use export table w/o upgrading.
> > If you want to merge into existing table I'm not really sure what is the
> > best way to do that. You might have to rewrite data with target region
> > boundaries, or manually split and merge regions accordingly and run
> > compactions. Then you can disable the table, copy the files and bulk load
> > them into target cluster. See
> > http://hbase.apache.org/book/arch.bulk.load.html
> > Note that bulk load requires boundaries that fit within one region
> > (matching or sub-range). There may be a tool that solves this but I'm not
> > aware of it...
> >
> >
> > On Wed, Jul 10, 2013 at 5:40 PM, Bing Jiang <jiangbinglover@gmail.com
> > >wrote:
> >
> > > Thanks,Sergey.
> > > These days, I want to move a table from one hbase cluster to another
> > hbase
> > > cluster, and there are the same table's schema.
> > > So I want to move a region's storefile to another table's region
> > > corresponding directory, and the key bound of regions are overlapped.
> > > For example:
> > > Cluster   |  Table | Region's key bound
> > > cluster1  |  dat    |  [1ffff,2ffff]
> > > cluster2  |  dat    |  [2bfff,31fff]
> > >
> > > In my opinion , if region files in cluster2 are move into cluster1's
> > > regions, and make compaction(Minor && Major) upon the first region, it
> > will
> > > prune the improper key in (2ffff,31fff].
> > > However I found hbase compaction cannot support that.
> > >
> > > Any idea would be thankful.
> > >
> > >
> > > 2013/7/11 Sergey Shelukhin <se...@hortonworks.com>
> > >
> > > > You should not have to manually take care of region bounds in normal
> > > > circumstances (unless you are reading the file from coprocessor in
> some
> > > > special way, or something like that). Please tell us if you are
> seeing
> > > any
> > > > strange behavior :)
> > > > See HalfStoreFileReader for the code that is used to read the
> > referenced
> > > > file and constrains the keys.
> > > >
> > > > On Wed, Jul 10, 2013 at 4:25 AM, Bing Jiang <
> jiangbinglover@gmail.com
> > > > >wrote:
> > > >
> > > > > Hi,all
> > > > >
> > > > > If Region process splits, it will make a reference.After executing
> > > child
> > > > > region makes a compaction that absorbs all the reference. And I
> have
> > a
> > > > > question that how to make differences when executes scanner.
> > > > > As we know that Compaction uses the scanner as well, so whether to
> > set
> > > > the
> > > > > startKey and endKey of child region's bound, in order to guarantee
> > that
> > > > the
> > > > > storefile in child region will not contain outlier keys? I cannot
> > find
> > > > code
> > > > > to prove it, BTW, we use 0.94.3.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > --
> > > > > Bing Jiang
> > > > > weibo: http://weibo.com/jiangbinglover
> > > > > BLOG: http://blog.sina.com.cn/jiangbinglover
> > > > > National Research Center for Intelligent Computing Systems
> > > > > Institute of Computing technology
> > > > > Graduate University of Chinese Academy of Science
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Bing Jiang
> > > Tel:(86)134-2619-1361
> > > weibo: http://weibo.com/jiangbinglover
> > > BLOG: http://blog.sina.com.cn/jiangbinglover
> > > National Research Center for Intelligent Computing Systems
> > > Institute of Computing technology
> > > Graduate University of Chinese Academy of Science
> > >
> >
>



-- 
Bing Jiang
Tel:(86)134-2619-1361
weibo: http://weibo.com/jiangbinglover
BLOG: http://blog.sina.com.cn/jiangbinglover
National Research Center for Intelligent Computing Systems
Institute of Computing technology
Graduate University of Chinese Academy of Science

Re: 【region compaction check key-bound or not】

Posted by Ted Yu <yu...@gmail.com>.
bq. upgrade to 0.94.6

Nit: upgrade to 0.94.6.1 (or 0.94.9 which was just released)

Cheers

On Wed, Jul 10, 2013 at 6:04 PM, Sergey Shelukhin <se...@hortonworks.com>wrote:

> Yeah, this is not going to work... moving storefiles manually between
> regions is generally not a good idea.
> Do you want to move entire table into new table? Then the best thing is
> probably to upgrade to 0.94.6 (rolling restart is supported from 0.94.3)
> and use HBase snapshots. You can use export table w/o upgrading.
> If you want to merge into existing table I'm not really sure what is the
> best way to do that. You might have to rewrite data with target region
> boundaries, or manually split and merge regions accordingly and run
> compactions. Then you can disable the table, copy the files and bulk load
> them into target cluster. See
> http://hbase.apache.org/book/arch.bulk.load.html
> Note that bulk load requires boundaries that fit within one region
> (matching or sub-range). There may be a tool that solves this but I'm not
> aware of it...
>
>
> On Wed, Jul 10, 2013 at 5:40 PM, Bing Jiang <jiangbinglover@gmail.com
> >wrote:
>
> > Thanks,Sergey.
> > These days, I want to move a table from one hbase cluster to another
> hbase
> > cluster, and there are the same table's schema.
> > So I want to move a region's storefile to another table's region
> > corresponding directory, and the key bound of regions are overlapped.
> > For example:
> > Cluster   |  Table | Region's key bound
> > cluster1  |  dat    |  [1ffff,2ffff]
> > cluster2  |  dat    |  [2bfff,31fff]
> >
> > In my opinion , if region files in cluster2 are move into cluster1's
> > regions, and make compaction(Minor && Major) upon the first region, it
> will
> > prune the improper key in (2ffff,31fff].
> > However I found hbase compaction cannot support that.
> >
> > Any idea would be thankful.
> >
> >
> > 2013/7/11 Sergey Shelukhin <se...@hortonworks.com>
> >
> > > You should not have to manually take care of region bounds in normal
> > > circumstances (unless you are reading the file from coprocessor in some
> > > special way, or something like that). Please tell us if you are seeing
> > any
> > > strange behavior :)
> > > See HalfStoreFileReader for the code that is used to read the
> referenced
> > > file and constrains the keys.
> > >
> > > On Wed, Jul 10, 2013 at 4:25 AM, Bing Jiang <jiangbinglover@gmail.com
> > > >wrote:
> > >
> > > > Hi,all
> > > >
> > > > If Region process splits, it will make a reference.After executing
> > child
> > > > region makes a compaction that absorbs all the reference. And I have
> a
> > > > question that how to make differences when executes scanner.
> > > > As we know that Compaction uses the scanner as well, so whether to
> set
> > > the
> > > > startKey and endKey of child region's bound, in order to guarantee
> that
> > > the
> > > > storefile in child region will not contain outlier keys? I cannot
> find
> > > code
> > > > to prove it, BTW, we use 0.94.3.
> > > >
> > > > Thanks.
> > > >
> > > > --
> > > > Bing Jiang
> > > > weibo: http://weibo.com/jiangbinglover
> > > > BLOG: http://blog.sina.com.cn/jiangbinglover
> > > > National Research Center for Intelligent Computing Systems
> > > > Institute of Computing technology
> > > > Graduate University of Chinese Academy of Science
> > > >
> > >
> >
> >
> >
> > --
> > Bing Jiang
> > Tel:(86)134-2619-1361
> > weibo: http://weibo.com/jiangbinglover
> > BLOG: http://blog.sina.com.cn/jiangbinglover
> > National Research Center for Intelligent Computing Systems
> > Institute of Computing technology
> > Graduate University of Chinese Academy of Science
> >
>

Re: 【region compaction check key-bound or not】

Posted by Sergey Shelukhin <se...@hortonworks.com>.
Yeah, this is not going to work... moving storefiles manually between
regions is generally not a good idea.
Do you want to move the entire table into a new table? Then the best thing
is probably to upgrade to 0.94.6 (rolling restart is supported from 0.94.3)
and use HBase snapshots. You can use export table without upgrading.
If you want to merge into an existing table, I'm not really sure what the
best way is. You might have to rewrite the data with the target region
boundaries, or manually split and merge regions accordingly and run
compactions. Then you can disable the table, copy the files and bulk load
them into the target cluster. See
http://hbase.apache.org/book/arch.bulk.load.html
Note that bulk load requires boundaries that fit within one region
(matching or sub-range). There may be a tool that solves this but I'm not
aware of it...
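
As a rough illustration of the "fits within one region" condition above,
here is a minimal Java sketch. It is not HBase code: the class and method
names are hypothetical, the three-region layout is made up for the example,
and row keys are assumed to be ASCII strings so that String ordering matches
HBase's byte ordering.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Illustrative only; none of these names exist in HBase. */
final class BulkLoadFitSketch {

    /** Start key of the region that would hold rowKey: the greatest start key <= rowKey. */
    static String owningRegionStart(TreeSet<String> regionStartKeys, String rowKey) {
        return regionStartKeys.floor(rowKey);
    }

    /** A file fits if its first and last row keys land in the same region. */
    static boolean fitsInOneRegion(TreeSet<String> regionStartKeys,
                                   String firstKey, String lastKey) {
        String first = owningRegionStart(regionStartKeys, firstKey);
        String last = owningRegionStart(regionStartKeys, lastKey);
        return first != null && first.equals(last);
    }

    public static void main(String[] args) {
        // Hypothetical target table with regions starting at "", "1ffff" and "2ffff".
        TreeSet<String> starts = new TreeSet<>(Arrays.asList("", "1ffff", "2ffff"));
        System.out.println(fitsInOneRegion(starts, "2bfff", "2e000")); // true
        System.out.println(fitsInOneRegion(starts, "2bfff", "31fff")); // false
    }
}
```

With this layout, a file covering [2bfff,31fff] straddles the boundary at
2ffff, which is the case from the example in this thread.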


On Wed, Jul 10, 2013 at 5:40 PM, Bing Jiang <ji...@gmail.com>wrote:

> Thanks,Sergey.
> These days, I want to move a table from one hbase cluster to another hbase
> cluster, and there are the same table's schema.
> So I want to move a region's storefile to another table's region
> corresponding directory, and the key bound of regions are overlapped.
> For example:
> Cluster   |  Table | Region's key bound
> cluster1  |  dat    |  [1ffff,2ffff]
> cluster2  |  dat    |  [2bfff,31fff]
>
> In my opinion , if region files in cluster2 are move into cluster1's
> regions, and make compaction(Minor && Major) upon the first region, it will
> prune the improper key in (2ffff,31fff].
> However I found hbase compaction cannot support that.
>
> Any idea would be thankful.
>
>
> 2013/7/11 Sergey Shelukhin <se...@hortonworks.com>
>
> > You should not have to manually take care of region bounds in normal
> > circumstances (unless you are reading the file from coprocessor in some
> > special way, or something like that). Please tell us if you are seeing
> any
> > strange behavior :)
> > See HalfStoreFileReader for the code that is used to read the referenced
> > file and constrains the keys.
> >
> > On Wed, Jul 10, 2013 at 4:25 AM, Bing Jiang <jiangbinglover@gmail.com
> > >wrote:
> >
> > > Hi,all
> > >
> > > If Region process splits, it will make a reference.After executing
> child
> > > region makes a compaction that absorbs all the reference. And I have a
> > > question that how to make differences when executes scanner.
> > > As we know that Compaction uses the scanner as well, so whether to set
> > the
> > > startKey and endKey of child region's bound, in order to guarantee that
> > the
> > > storefile in child region will not contain outlier keys? I cannot find
> > code
> > > to prove it, BTW, we use 0.94.3.
> > >
> > > Thanks.
> > >
> > > --
> > > Bing Jiang
> > > weibo: http://weibo.com/jiangbinglover
> > > BLOG: http://blog.sina.com.cn/jiangbinglover
> > > National Research Center for Intelligent Computing Systems
> > > Institute of Computing technology
> > > Graduate University of Chinese Academy of Science
> > >
> >
>
>
>
> --
> Bing Jiang
> Tel:(86)134-2619-1361
> weibo: http://weibo.com/jiangbinglover
> BLOG: http://blog.sina.com.cn/jiangbinglover
> National Research Center for Intelligent Computing Systems
> Institute of Computing technology
> Graduate University of Chinese Academy of Science
>

Re: 【region compaction check key-bound or not】

Posted by Bing Jiang <ji...@gmail.com>.
Thanks, Sergey.
I want to move a table from one HBase cluster to another; both clusters
have a table with the same schema.
My plan is to move a region's store files into the directory of the
corresponding region in the other table, where the key ranges of the two
regions overlap.
For example:
Cluster  | Table | Region's key bounds
cluster1 | dat   | [1ffff,2ffff]
cluster2 | dat   | [2bfff,31fff]

I expected that if the region files from cluster2 are moved into cluster1's
region and a compaction (minor or major) is run on that region, it would
prune the out-of-range keys in (2ffff,31fff].
However, I found that HBase compaction does not do that.
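
The pruning described above amounts to a range check on each row key. A
minimal sketch of that check follows, assuming the usual HBase convention
that a region covers [startKey, endKey) and that an empty key means
"unbounded" on that side. The class and method names are hypothetical and
this is not how compaction is implemented; it only shows which of the example
keys would survive.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only; none of these names exist in HBase. */
final class RegionBoundsSketch {

    /** Unsigned lexicographic byte comparison, the ordering used for row keys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** True if row lies in [startKey, endKey); an empty key is unbounded. */
    static boolean inRegion(byte[] row, byte[] startKey, byte[] endKey) {
        boolean afterStart = startKey.length == 0 || compare(row, startKey) >= 0;
        boolean beforeEnd = endKey.length == 0 || compare(row, endKey) < 0;
        return afterStart && beforeEnd;
    }

    /** Keeps only the rows that fall inside the region's key range. */
    static List<String> keepInRegion(List<String> rows, String startKey, String endKey) {
        byte[] start = startKey.getBytes(StandardCharsets.UTF_8);
        byte[] end = endKey.getBytes(StandardCharsets.UTF_8);
        List<String> kept = new ArrayList<>();
        for (String row : rows) {
            if (inRegion(row.getBytes(StandardCharsets.UTF_8), start, end)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up rows from the cluster2 region, checked against cluster1's bounds.
        List<String> rows = Arrays.asList("2bfff", "2d000", "2ffff", "31fff");
        System.out.println(keepInRegion(rows, "1ffff", "2ffff")); // [2bfff, 2d000]
    }
}
```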

Any ideas would be appreciated.


2013/7/11 Sergey Shelukhin <se...@hortonworks.com>

> You should not have to manually take care of region bounds in normal
> circumstances (unless you are reading the file from coprocessor in some
> special way, or something like that). Please tell us if you are seeing any
> strange behavior :)
> See HalfStoreFileReader for the code that is used to read the referenced
> file and constrains the keys.
>
> On Wed, Jul 10, 2013 at 4:25 AM, Bing Jiang <jiangbinglover@gmail.com
> >wrote:
>
> > Hi,all
> >
> > If Region process splits, it will make a reference.After executing child
> > region makes a compaction that absorbs all the reference. And I have a
> > question that how to make differences when executes scanner.
> > As we know that Compaction uses the scanner as well, so whether to set
> the
> > startKey and endKey of child region's bound, in order to guarantee that
> the
> > storefile in child region will not contain outlier keys? I cannot find
> code
> > to prove it, BTW, we use 0.94.3.
> >
> > Thanks.
> >
> > --
> > Bing Jiang
> > weibo: http://weibo.com/jiangbinglover
> > BLOG: http://blog.sina.com.cn/jiangbinglover
> > National Research Center for Intelligent Computing Systems
> > Institute of Computing technology
> > Graduate University of Chinese Academy of Science
> >
>



-- 
Bing Jiang
Tel:(86)134-2619-1361
weibo: http://weibo.com/jiangbinglover
BLOG: http://blog.sina.com.cn/jiangbinglover
National Research Center for Intelligent Computing Systems
Institute of Computing technology
Graduate University of Chinese Academy of Science

Re: 【region compaction check key-bound or not】

Posted by Sergey Shelukhin <se...@hortonworks.com>.
You should not have to manually take care of region bounds in normal
circumstances (unless you are reading the file from coprocessor in some
special way, or something like that). Please tell us if you are seeing any
strange behavior :)
See HalfStoreFileReader for the code that is used to read the referenced
file and constrains the keys.
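
To make the idea concrete, here is a minimal sketch of what "constrains the
keys" means for a reference file after a split. It is not the actual
HalfStoreFileReader; the names are hypothetical. It assumes the bottom half
holds keys below the split key and the top half holds keys at or above it,
so each child region only ever sees its own half of the parent file.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only; this mimics the idea, not HBase's HalfStoreFileReader. */
final class HalfReaderSketch {

    /** Unsigned lexicographic byte comparison, the ordering used for row keys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /**
     * Returns the rows of the parent file visible through one reference:
     * top == true gives rows >= splitKey, top == false gives rows < splitKey.
     */
    static List<String> readHalf(List<String> parentRows, String splitKey, boolean top) {
        byte[] split = splitKey.getBytes(StandardCharsets.UTF_8);
        List<String> visible = new ArrayList<>();
        for (String row : parentRows) {
            boolean atOrAfterSplit = compare(row.getBytes(StandardCharsets.UTF_8), split) >= 0;
            if (atOrAfterSplit == top) {
                visible.add(row);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // Made-up rows in a parent store file, split at "2bfff".
        List<String> parent = Arrays.asList("1ffff", "25000", "2bfff", "2ffff");
        System.out.println(readHalf(parent, "2bfff", false)); // [1ffff, 25000]
        System.out.println(readHalf(parent, "2bfff", true));  // [2bfff, 2ffff]
    }
}
```

Because the filtering happens in the reader, a compaction that scans through
such a reference would write only in-range keys without setting any bounds on
the scan itself.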

On Wed, Jul 10, 2013 at 4:25 AM, Bing Jiang <ji...@gmail.com>wrote:

> Hi,all
>
> If Region process splits, it will make a reference.After executing child
> region makes a compaction that absorbs all the reference. And I have a
> question that how to make differences when executes scanner.
> As we know that Compaction uses the scanner as well, so whether to set the
> startKey and endKey of child region's bound, in order to guarantee that the
> storefile in child region will not contain outlier keys? I cannot find code
> to prove it, BTW, we use 0.94.3.
>
> Thanks.
>
> --
> Bing Jiang
> weibo: http://weibo.com/jiangbinglover
> BLOG: http://blog.sina.com.cn/jiangbinglover
> National Research Center for Intelligent Computing Systems
> Institute of Computing technology
> Graduate University of Chinese Academy of Science
>