You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by Andra Adams <An...@sun.com> on 2008/07/22 02:10:12 UTC
Re: Clarification: Merging and getRowKeyAtOrBefore

Thanks Bryan and Stack for the answers to my questions!

About Question 1, I fully understand the need for a Merge tool for 
non-adjacent regions in broken tables.  Thanks for the extra clarification.

About Question 2, I understand the need for getRowKeyAtOrBefore and its 
overall goal, and ditto for getFull.  What I don't understand is the 
process of finding the best candidate keys in getRowKeyAtOrBefore.

In my previous post, I was comparing getRowKeyAtOrBefore to getFull 
since they are slightly similar operations.  They both must look through 
the entire Memcache (mc and snapshot) as well as every single HStoreFile 
on disk (as Bryan confirmed).  And they both must handle deletes, which 
is where I get lost with getRowKeyAtOrBefore.

getFull keeps a set of deleted items and since it is looking in reverse 
chronological order over the data set, these deleted items are always 
more recent than the current item being analyzed.  Thus later 
discoveries of items that match entries in te deleted set will be ignored.

getRowKeyAtOrBefore takes the opposite approach, collecting candidate 
keys from the sources in no chronological order, and removing items from 
this candidate set if they are later found to be deleted.  What 
guarantees that there will always remain at least one key in the set of 
candidate keys when the Memcache is finally searched?  What if the 
Memcache contains only deletes, such that every candidate key that was 
chosen from the HStoreFiles is now discovered to have been deleted?  
Isn't this a possible scenario, and couldn't it be avoided by searching 
the data in reverse chronological order like getFull (keeping a list of 
deletes and moving backwards in time, rather than keeping a list of 
candidate keys and moving forwards)?

Thanks,
Andra


stack wrote:

An attempt at answering first question is inlined below.

Bryan Duxbury wrote:
> My replies to the second question inline. Feel free to ask follow ups.
>
> -Bryan
>
> On Jun 26, 2008, at 5:24 PM, Andra Adams wrote:
>
>> Hi,
>>
>> I've been looking through the HBase code and I was wondering if I 
>> could get some clarification on two points.
>>
>>
>> 1. Why doesn't HRegion's static merge method check that the two 
>> regions specified are adjacent?

Originally, merge would only allow merging of adjacent regions but a 
couple of months back, we had bugs that could manufacture regions with 
overlapping keys.  To fix damaged clusters, the merge tool was amended 
to remove the adjacency check and refactored so it could merge overlaps 
as well as adjacents (See unit tests for merge tool.  IIRC, it includes 
tests that merge adjacent and overlapping regions).

Yes, if an operator tries to merge non-adjacents, they'll do damage.  We 
should add back some smarts that guard against this (If you don't file 
an issue, I will).

Thanks for reminding us of this hole,
St.Ack


>>
>> As far as I can tell, HRegion's merge method is called from the Merge 
>> tool which gets its region names from command line arguments.  As far 
>> as I can see, merging non-adjacent regions would break many of the 
>> assertions that HBase depends on, yet all calls to HRegion's merge 
>> method result in a merged region.  So how come the caller of the 
>> Merge tool is being trusted to ensure the adjacency of the regions it 
>> is specifying on the command line?  ( Although admittedly, the 
>> adjacency check could be quite computationally-expensive since it 
>> would involve a complete scan of all regions in the "parent" META 
>> table (either .META. or -ROOT-) to ensure that there are no regions 
>> in the "daughter" (either a user table or .META.) table that have a 
>> start key between the end key and start key of the regions being 
>> asked to merge).
>>
>>
>> 2.  Can I get an overview of the algorithm used to determine the best 
>> candidate key in HStore's getRowKeyAtOrBefore (including Memcache's 
>> internalGetRowKeyAtOrBefore, and HStore's rowAtOrBeforeFromMapFile)?
>>
>> I'm having trouble figuring out why HStore's getFull method looks 
>> through the mc, snapshot and storefiles in reverse chronological 
>> order (i.e. mc, then snapshot, then store files), while the 
>> getRowKeyAtOrBefore looks through the storefiles, then the mc, then 
>> the snapshot (in apparently no chronological order...?).  Why does 
>> getFull create a map of deletes (and older entries check this map 
>> before inserting their values in the results map), while 
>> getRowAtOrBefore opts to remove entries from the results map if a 
>> delete is found at a later time?
>>
>> Aside from the difference in style between getFull and 
>> getRowAtOrBefore, I'm also wondering why the discovery of a deleted 
>> value sometimes removes that key from the candidateKeys map, and 
>> other times is simply ignored.  (It could be that I'm missing some of 
>> the concepts behind the algorithm).
>>
>
> The idea of getRowKeyAtOrBefore is to discover the row that comes 
> immediately before or right upon the search row. This is used 
> exclusively when trying to locate which region a key resides in. The 
> reasoning behind this is a little tricky. Regions in HBase are keyed 
> on their start row, which is inclusive. The end row is implied by the 
> presence of the next region. So, when you have an arbitrary key you'd 
> like to perform some operation on, you need to find the region which 
> contains it, which you can only know by scanning past it.
>
> getRowKeyAtOrBefore is a specific, internal-only RPC method that does 
> this operation. In order to actually do the work, at the HStore level, 
> we have to decide amongst the possible keys that presented by the 
> memcache (including the snapshot) and all of the store files. The 
> order here is unimportant, because ultimately, we're going to have to 
> look at every one of those things unless we encounter a precise match. 
> Moreover, there could be deletes in any one of them, so we have to 
> carry the candidates along with us and apply the deletes where they 
> are required. The reasoning here is that if a row is completely 
> deleted, that is, all cells are suppressed by deletes, even if it 
> matches precisely, we don't want to return it as a candidate key. 
> Deletes are ignored when the don't apply to the data we've already 
> found, usually because there's a newer piece of data than there is a 
> delete (this is simply a memory optimization).
>
> Likewise, getFull tries to find a whole row of information about a key 
> at a time. We need to follow deletes around here for the same reason 
> that we do it in regular get: we don't want to return deleted data. We 
> go in reverse chronological order here because that allows the most 
> recent data to easily take precedence.
>
>>
>> Thanks,
>> Andra
>>
>> andra.adams@sun.com
>