Posted to user@hbase.apache.org by Menno Luiten <ml...@artifix.net> on 2010/01/21 18:48:31 UTC

"Added a key not lexically larger than previous key" when using KeyValueSortReducer

Hi,

I'm trying to bulk load my data into HBase via the new
"HFileOutputFormat". A Mapper reads from a file and constructs a KeyValue
for each line. I then use KeyValueSortReducer as the Reducer, and limit
myself to a single reducer to avoid writing a custom Partitioner.

However, when running the MapReduce job, I get the following error during
the reduce phase:

java.io.IOException: Added a key not lexically larger than previous
key=im2.jpgaspectcolor/blue1---some-color/p.995�������,
lastkey=im2.jpgaspectcolor/cyan---some-color/p.995�������

I thought the KeyValueSortReducer would sort the output to prevent these
errors from occurring. Am I doing something wrong, or should I write my
own comparator that compares actual row,column pairs?

Using HBase 0.20.2 and Hadoop 0.20.0

Regards,
Menno Luiten


Re: "Added a key not lexically larger than previous key" when using KeyValueSortReducer

Posted by stack <st...@duboce.net>.
Thanks for writing back to the list...  I was wondering.
St.Ack

On Thu, Jan 21, 2010 at 1:29 PM, Menno Luiten <ml...@artifix.net> wrote:
> Turns out to be a major screw-up on my behalf :)
>
> I passed the line number as the key instead of the actual row key, which
> of course resulted in reducing and ordering 1 line at a time. Whoops..
>
> It's working fine now :) thanks for your time..
> Menno

Re: "Added a key not lexically larger than previous key" when using KeyValueSortReducer

Posted by Menno Luiten <ml...@artifix.net>.
Turns out to be a major screw-up on my behalf :)

I passed the line number as the key instead of the actual row key, which
of course resulted in reducing and ordering 1 line at a time. Whoops..

It's working fine now :) thanks for your time..
Menno
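
For anyone hitting the same error, the mix-up described above can be reproduced without a cluster. The following is only a rough simulation in plain Java, not HBase or Hadoop code: the class ShuffleKeySketch, the Cell type, and the sample rows and columns are all made up for illustration, and the "shuffle" is just a TreeMap. It assumes the reducer sorts only the values that arrive under one key, which is why keying by line number leaves the overall output in file order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class ShuffleKeySketch {
    /** Hypothetical stand-in for a KeyValue: ordered by row, then by column. */
    static final class Cell implements Comparable<Cell> {
        final String row;
        final String column;
        Cell(String row, String column) { this.row = row; this.column = column; }
        @Override public int compareTo(Cell o) {
            int c = row.compareTo(o.row);
            return c != 0 ? c : column.compareTo(o.column);
        }
        @Override public String toString() { return row + "/" + column; }
    }

    /** Simulates map -> shuffle -> sort-within-group reduce with one reducer. */
    static List<Cell> runJob(List<Cell> input, boolean keyByLineNumber) {
        // Shuffle: group map outputs by key; the TreeMap iterates keys in sorted order.
        Map<String, List<Cell>> groups = new TreeMap<String, List<Cell>>();
        for (int line = 0; line < input.size(); line++) {
            Cell cell = input.get(line);
            // Zero-padded so that string order matches numeric line order.
            String key = keyByLineNumber ? String.format("%010d", line) : cell.row;
            List<Cell> group = groups.get(key);
            if (group == null) {
                group = new ArrayList<Cell>();
                groups.put(key, group);
            }
            group.add(cell);
        }
        // Reduce: sort the cells of each group, then emit them group by group.
        List<Cell> out = new ArrayList<Cell>();
        for (List<Cell> group : groups.values()) {
            out.addAll(new TreeSet<Cell>(group));
        }
        return out;
    }

    /** True if every cell is strictly larger than the one before it. */
    static boolean strictlyIncreasing(List<Cell> cells) {
        for (int i = 1; i < cells.size(); i++) {
            if (cells.get(i).compareTo(cells.get(i - 1)) <= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Cell> input = Arrays.asList(
            new Cell("im2.jpg", "color/cyan"),
            new Cell("im2.jpg", "color/blue1"),
            new Cell("im1.jpg", "color/red"));
        System.out.println(strictlyIncreasing(runJob(input, true)));   // prints false
        System.out.println(strictlyIncreasing(runJob(input, false)));  // prints true
    }
}
```

With the line number as key, every group holds a single cell, so the per-group sort has nothing to reorder. With the row as key, all cells of a row meet in one group and the groups themselves arrive in row order.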

On Thu, 2010-01-21 at 20:35 +0100, Menno Luiten wrote:
> Manually added the "kv.clone()"-patch and it did not solve the issue,
> unfortunately.
> 
> Will most certainly dig deeper as to why this is happening and will post
> the results here or in JIRA.
> 
> Thanks



Re: "Added a key not lexically larger than previous key" when using KeyValueSortReducer

Posted by Menno Luiten <ml...@artifix.net>.
Manually added the "kv.clone()"-patch and it did not solve the issue,
unfortunately.

Will most certainly dig deeper as to why this is happening and will post
the results here or in JIRA.

Thanks

On Thu, 2010-01-21 at 10:47 -0800, stack wrote:
> On Thu, Jan 21, 2010 at 9:48 AM, Menno Luiten <ml...@artifix.net> wrote:
> 
> > ...Then I use the KeyValueSortReducer as Reducer, and
> > limit myself to 1 reducer only, to prevent writing a custom Partitioner.
> >
> >
> This all sounds like you are setting things up properly.
> 
> 
> 
> > java.io.IOException: Added a key not lexically larger than previous
> > key=im2.jpgaspectcolor/blue1---some-color/p.995�������,
> > lastkey=im2.jpgaspectcolor/cyan---some-color/p.995�������
> >
> This is a complaint that comes up out of hfile if keys are not in strictly
> increasing lexicographical order.
> 
> 
> 
> > I thought the KeyValueSortReducer would sort the output to prevent these
> > errors from occurring.
> 
> 
> Yes.  Me too.  There is a bug in hbase-0.20.2 KVSR -- hbase-2101 -- but it
> would not explain what you are seeing, as far as I can tell.
> 
> 
> > Am I doing something wrong or should I write my
> > own comparator which compares actual row,column pairs?
> >
> You shouldn't have to.  See the comparator passed to the TreeMap used
> internally in KVSR.  It should do the compare properly.
> 
> I'm not sure what's going wrong.  You seem to have it set up right, yet keys
> are going into hfile out of order.  Any chance of your digging in to figure
> out what is going awry?
> 
> Thanks,
> St.Ack



Re: "Added a key not lexically larger than previous key" when using KeyValueSortReducer

Posted by stack <st...@duboce.net>.
On Thu, Jan 21, 2010 at 9:48 AM, Menno Luiten <ml...@artifix.net> wrote:

> ...Then I use the KeyValueSortReducer as Reducer, and
> limit myself to 1 reducer only, to prevent writing a custom Partitioner.
>
>
This all sounds like you are setting things up properly.



> java.io.IOException: Added a key not lexically larger than previous
> key=im2.jpgaspectcolor/blue1---some-color/p.995�������,
> lastkey=im2.jpgaspectcolor/cyan---some-color/p.995�������
>
This is a complaint that comes up out of hfile if keys are not in strictly
increasing lexicographical order.



> I thought the KeyValueSortReducer would sort the output to prevent these
> errors from occurring.


Yes.  Me too.  There is a bug in hbase-0.20.2 KVSR -- hbase-2101 -- but it
would not explain what you are seeing, as far as I can tell.


> Am I doing something wrong or should I write my
> own comparator which compares actual row,column pairs?
>
You shouldn't have to.  See the comparator passed to the TreeMap used
internally in KVSR.  It should do the compare properly.

I'm not sure what's going wrong.  You seem to have it set up right, yet keys
are going into hfile out of order.  Any chance of your digging in to figure
out what is going awry?

Thanks,
St.Ack
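
The ordering rule described above (each appended key must be strictly larger than the one before it) can be sketched in a few lines of plain Java. This is not the HFile writer's code: OrderedAppendSketch and its methods are hypothetical names, and the sketch assumes a plain unsigned byte-by-byte comparison of the key text as printed in the error, whereas HBase's own KeyValue comparator also takes timestamp and type into account.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OrderedAppendSketch {
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Lexicographic comparison of two byte arrays, treating bytes as unsigned. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Hypothetical writer that rejects any key not strictly larger than the last. */
    static final class OrderedWriter {
        private String lastKey;

        void append(String key) throws IOException {
            if (lastKey != null && compareBytes(bytes(key), bytes(lastKey)) <= 0) {
                throw new IOException("Added a key not lexically larger than previous key="
                    + key + ", lastkey=" + lastKey);
            }
            lastKey = key;
        }
    }

    /** True if the writer accepts every key in the given order. */
    static boolean accepts(List<String> keys) {
        OrderedWriter writer = new OrderedWriter();
        try {
            for (String key : keys) {
                writer.append(key);
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Returns a copy of the keys sorted with the same byte-wise comparison. */
    static List<String> sortedCopy(List<String> keys) {
        List<String> sorted = new ArrayList<String>(keys);
        sorted.sort((x, y) -> compareBytes(bytes(x), bytes(y)));
        return sorted;
    }

    public static void main(String[] args) {
        // Key text taken from the error message, without the trailing binary bytes.
        List<String> asWritten = Arrays.asList(
            "im2.jpgaspectcolor/cyan---some-color/p.995",
            "im2.jpgaspectcolor/blue1---some-color/p.995");
        System.out.println(accepts(asWritten));              // prints false
        System.out.println(accepts(sortedCopy(asWritten)));  // prints true
    }
}
```

The "blue1" key sorts before the "cyan" key, so appending it second trips the check. Sorting all keys that reach the writer, rather than one record at a time, avoids the error.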