Posted to user@hbase.apache.org by Amit Sela <am...@infolinks.com> on 2013/11/19 14:40:18 UTC
Bulk load fails to identify pre-split regions
Hi all,
I'm using HBase 0.94.2 (and Hadoop 1.0.4).
I've been using bulk load on a daily basis for over a year with no problems.
I recently moved to an OSGi client, and that required some changes.
One of the changes I made is a fix to what seems like a bug that I
described in https://issues.apache.org/jira/browse/HBASE-9682
While running some tests I executed bulk load (with pre-splitting) a few
times, and on one of those runs it seems that bulk load didn't identify the
pre-split regions and loaded the HFiles into 2 new regions (instead of the 19
pre-split ones). What's even worse is that it made a mess of the
lexicographical order of the start/end keys in those regions.
For example, if the pre-split regions' start/end keys were:
Start    End
(empty)  1
1        2
2        3
3        (empty)
They turned into:
Start End
new1
1 2
new1
2 3
3
As a result, even scanning over those regions is impossible.
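The property that got broken here can be stated as a check: sorted by start key, the regions of a table should form a single chain in which the first start key and the last end key are empty and every end key equals the next region's start key. The following is only a minimal sketch of that check in plain Java, with no HBase dependency. The class name `RegionKeyCheck`, its methods, and the "broken" layout in `main` are hypothetical and chosen for illustration; the broken layout is not the exact layout from the report above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names are hypothetical and this is not HBase code.
final class RegionKeyCheck {

    // Unsigned lexicographic comparison of byte arrays (the ordering used for row keys).
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Builds a {startKey, endKey} pair; an empty string stands for an open boundary.
    static byte[][] region(String start, String end) {
        return new byte[][] {
            start.getBytes(StandardCharsets.UTF_8),
            end.getBytes(StandardCharsets.UTF_8)
        };
    }

    // True when the regions, sorted by start key, form one chain with no gaps or overlaps.
    static boolean isContiguous(List<byte[][]> regions) {
        if (regions.isEmpty()) {
            return false;
        }
        List<byte[][]> sorted = new ArrayList<>(regions);
        sorted.sort((x, y) -> compareKeys(x[0], y[0]));
        if (sorted.get(0)[0].length != 0) {
            return false; // the first region must have an empty start key
        }
        for (int i = 0; i + 1 < sorted.size(); i++) {
            // each end key must equal the next region's start key
            if (compareKeys(sorted.get(i)[1], sorted.get(i + 1)[0]) != 0) {
                return false;
            }
        }
        // the last region must have an empty end key
        return sorted.get(sorted.size() - 1)[1].length == 0;
    }

    public static void main(String[] args) {
        // The pre-split layout from the example: split points 1, 2, 3.
        List<byte[][]> preSplit = Arrays.asList(
            region("", "1"), region("1", "2"), region("2", "3"), region("3", ""));
        // A hypothetical broken layout: an extra region overlapping the first two.
        List<byte[][]> broken = Arrays.asList(
            region("", "1"), region("1", "2"), region("", "2"),
            region("2", "3"), region("3", ""));
        System.out.println(isContiguous(preSplit)); // prints true
        System.out.println(isContiguous(broken));   // prints false
    }
}
```

On a real cluster the start/end key pairs would come from the table's region list rather than being hard-coded; running a check like this before and after a bulk load is one way to catch the problem early.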
I'm having a hard time reproducing this behavior, so I'm not sure whether it
is caused by the fix I made (also described in the Jira comments).
Any ideas?
Thanks,
Amit
Re: Bulk load fails to identify pre-split regions
Posted by Amit Sela <am...@infolinks.com>.
So far no issues after running in production for over a week now.
I didn't attach any patches to the JIRA I opened because I have more
changes in Algorithm in my environment (the other change I have is another
bug fix that is related to GZ configuration and was fixed in later versions
- I've been running with this change for over a year in production with no
issues).
So if I published a patch based on "svn diff", it would include more changes
- however, I did describe my changes in the JIRA as a possible fix.
Since I didn't encounter this issue again, the only guess I have is that it
might be related to the fact that I executed bulk load after bulk load (5
or 6 times)...
Thanks.
On Mon, Nov 25, 2013 at 11:42 PM, Ted Yu <yu...@gmail.com> wrote:
> Amit:
> bq. One of the changes I made is a fix to what seems like a bug
>
> I don't see an attachment to HBASE-9682 so cannot tell whether the change
> was related to what you described.
>
> Have you encountered the problem since last week ?
>
> Cheers
Re: Bulk load fails to identify pre-split regions
Posted by Ted Yu <yu...@gmail.com>.
Amit:
bq. One of the changes I made is a fix to what seems like a bug
I don't see an attachment to HBASE-9682, so I cannot tell whether the change
was related to what you described.
Have you encountered the problem since last week?
Cheers