Posted to dev@accumulo.apache.org by z11373 <z1...@outlook.com> on 2015/09/01 00:13:36 UTC

exception thrown during minor compaction

Hi,
I attached a summing combiner to a newly created table. The code is
something like:

			EnumSet<IteratorScope> iteratorScopes = EnumSet.allOf(IteratorScope.class);

			// first, remove versioning iterator since it will not work with combiner
			conn.tableOperations().removeIterator(tableName,
				VERS_ITERATOR_NAME,
				iteratorScopes);

			// create the combiner setting, in this case it is SummingCombiner, which will
			// sum value of all rows with same key (different timestamp is considered same)
			// and result in single row with that key and aggregate value
			IteratorSetting setting = new IteratorSetting(
				COMBINERS_PRIORITY,
				SUM_COMBINERS_NAME,
				SummingCombiner.class);

			// set the combiner to apply to all columns
			SummingCombiner.setCombineAllColumns(setting, true);

			// need to set encoding type, otherwise exception will be thrown during scan
			SummingCombiner.setEncodingType(setting, LongLexicoder.class);

			// attach the combiner to the table
			conn.tableOperations().attachIterator(tableName, setting, iteratorScopes);


As you can see from the code above, I use the LongLexicoder class as the
encoding type.
Each mutation I add to that table has a unique row id, a string for the column
family, an empty column qualifier, and the value "new
LongLexicoder().encode(1L)", so basically the value is 1.

It ran fine up to a point (and I could see rows being inserted into the
table), but then it hung.
Looking at the tablet server logs I found:

2015-08-31 17:59:42,371 [tserver.MinorCompactor] WARN : MinC failed (0) to create hdfs://<machine>:<port>/accumulo/tables/l/default_tablet/F00009gp.rf_tmp retrying ...
java.lang.ArrayIndexOutOfBoundsException: 0
        at org.apache.accumulo.core.client.lexicoder.ULongLexicoder.decode(ULongLexicoder.java:60)
        at org.apache.accumulo.core.client.lexicoder.LongLexicoder.decode(LongLexicoder.java:33)
        at org.apache.accumulo.core.client.lexicoder.LongLexicoder.decode(LongLexicoder.java:25)
        at org.apache.accumulo.core.iterators.TypedValueCombiner$VIterator.hasNext(TypedValueCombiner.java:82)
        at org.apache.accumulo.core.iterators.user.SummingCombiner.typedReduce(SummingCombiner.java:31)
        at org.apache.accumulo.core.iterators.user.SummingCombiner.typedReduce(SummingCombiner.java:27)
        at org.apache.accumulo.core.iterators.TypedValueCombiner.reduce(TypedValueCombiner.java:182)
        at org.apache.accumulo.core.iterators.Combiner.findTop(Combiner.java:166)
        at org.apache.accumulo.core.iterators.Combiner.next(Combiner.java:147)
        at org.apache.accumulo.tserver.Compactor.compactLocalityGroup(Compactor.java:505)
        at org.apache.accumulo.tserver.Compactor.call(Compactor.java:362)
        at org.apache.accumulo.tserver.MinorCompactor.call(MinorCompactor.java:96)
        at org.apache.accumulo.tserver.Tablet.minorCompact(Tablet.java:2072)
        at org.apache.accumulo.tserver.Tablet.access$4400(Tablet.java:172)
        at org.apache.accumulo.tserver.Tablet$MinorCompactionTask.run(Tablet.java:2159)
        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.lang.Thread.run(Thread.java:722)


Looking at Accumulo's source code in ULongLexicoder.java, it looks like the
array is empty, hence the exception, which keeps being thrown even
after I killed my app. I am thinking of stopping Accumulo.
Do you have any idea why the array is empty? I am thinking of experimenting
with the String encoder instead of LongLexicoder.


  public Long decode(byte[] data) {

    long l = 0;
    int shift = 0;

    if (data[0] < 0 || data[0] > 16)
      throw new IllegalArgumentException("Unexpected length " + (0xff & data[0]));

    ...
  }
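
To make the failure mode concrete, here is a minimal sketch of why a zero-length value produces ArrayIndexOutOfBoundsException: 0 rather than the IllegalArgumentException in the excerpt. The class and method names are hypothetical stand-ins that copy only the first check shown above; this is not the real ULongLexicoder.

```java
// Simplified stand-in for the first check in the decode() excerpt above.
// It is NOT the real ULongLexicoder; all names here are hypothetical.
final class EmptyValueDecodeDemo {

    // Reads data[0] without checking the array length first, as the excerpt does.
    static int unguardedLengthByte(byte[] data) {
        if (data[0] < 0 || data[0] > 16) {
            throw new IllegalArgumentException("Unexpected length " + (0xff & data[0]));
        }
        return data[0];
    }

    // Hypothetical guard a caller could apply before handing a value to a decoder.
    static boolean isDecodable(byte[] data) {
        return data != null && data.length > 0;
    }

    public static void main(String[] args) {
        byte[] empty = new byte[0];
        try {
            unguardedLengthByte(empty);
        } catch (ArrayIndexOutOfBoundsException e) {
            // An empty value fails on the data[0] access itself.
            System.out.println("empty value: " + e.getClass().getSimpleName());
        }
        System.out.println("decodable: " + isDecodable(empty));
    }
}
```

So an entry with an empty value anywhere in the data being compacted would be enough to trigger the trace above, however the other values were encoded.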


Thanks,
zainal



--
View this message in context: http://apache-accumulo.1065345.n5.nabble.com/exception-thrown-during-minor-compaction-tp15010.html
Sent from the Developers mailing list archive at Nabble.com.

Re: exception thrown during minor compaction

Posted by Christopher <ct...@apache.org>.
For that, you can have a simple scan-time filter which only returns
values greater than 1.
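
A minimal sketch of the predicate such a filter would apply, written against plain byte arrays so it runs without Accumulo. The fixed 8-byte big-endian encoding and every name below are assumptions for illustration; in Accumulo the same test would presumably sit in the accept(Key, Value) method of a Filter subclass and decode with whatever encoder the combiner uses.

```java
import java.nio.ByteBuffer;

// Stand-alone sketch of a "value greater than threshold" predicate.
// The encoding (8-byte big-endian long) is an assumption, not LongLexicoder.
final class ThresholdFilterSketch {

    static byte[] encode(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    static long decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    // Accept only well-formed values strictly greater than the threshold.
    static boolean accept(byte[] value, long threshold) {
        return value.length == Long.BYTES && decode(value) > threshold;
    }

    public static void main(String[] args) {
        System.out.println(accept(encode(1L), 1L)); // false: a count of 1 is hidden
        System.out.println(accept(encode(3L), 1L)); // true: a count of 3 is returned
    }
}
```

The filter has to run after the combiner and only at scan scope: a compaction that does not cover all files sees partial sums, so dropping a 1 there could discard part of a larger total.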


Re: exception thrown during minor compaction

Posted by z11373 <z1...@outlook.com>.
I guess what I was looking for is a final result (after the summing combiner)
that only has rows with a value greater than 1 (or x, if that can be configured).

Thanks,
Z




Re: exception thrown during minor compaction

Posted by Christopher <ct...@apache.org>.
I think Eric was suggesting that your Constraint can enforce preconditions
on the data, before it reaches your Combiner. The Combiner still has to do
the job of summation. We actually have an example of that exact combiner:
org.apache.accumulo.core.iterators.user.SummingCombiner, which I believe
already skips over delete markers.


Re: exception thrown during minor compaction

Posted by z11373 <z1...@outlook.com>.
Thanks Chris for the info!
I am still puzzled about how I can use either Constraints or Filters for my
scenario.
Let's walk through my sample scenario: we have 4 inserts coming in at different
times

T1: ['foo', 1] 
T2: ['foo', 1] 
T3: ['bar', 1] 
T4: ['foo', 1]

With Combiners, I could get what I want:
['bar', 1] 
['foo', 3] 

I wonder whether Eric earlier meant a Constraint that is applied during
compaction, which in this case could reject ['bar', 1] and leave ['foo',
3] in the table, which is what I want.

I am not sure we can do that, though. Sorry if I missed the point you
guys made earlier.


Thanks,
Z






Re: exception thrown during minor compaction

Posted by Christopher <ct...@apache.org>.
Constraints are only applied to Mutations during live ingest (BatchWriter).
They're applied on write, well before any compactions. So, there's no issue of
distinguishing between "before" and "after" processing (compacting).


Re: exception thrown during minor compaction

Posted by z11373 <z1...@outlook.com>.
Thanks Eric!
After I posted my question, I realized it may not make sense, even with the
Filter I mentioned earlier.
A Constraint will likely not work, because how can it distinguish a '1' being
added (while processing is not done yet) from a '1' as the final result?

One solution I can think of is to use a temp table (during processing)
and a final table which only contains rows with value > 1.

Another option is to just leave it as is (so the rows with '1' are kept and
treated as noise). Unfortunately, the smaller dataset I have for
testing has 15,785,030 rows vs. 848,601 (excluding the ones with value =
1), so the difference is too big to ignore.


Thanks,
Z




Re: exception thrown during minor compaction

Posted by Eric Newton <er...@gmail.com>.
Unrelated to the error you are experiencing:

You can add a Constraint that will reject inserts with values other than 1
or -1.
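
A minimal sketch of that rule, assuming the values have already been decoded to longs. It mimics the general shape of a constraint check, which reports a list of violation codes for a mutation; the class name, the violation code, and the long[] input are hypothetical, and a real implementation would go through Accumulo's Constraint interface and decode each column update itself.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch of a "+1 or -1 only" rule; names and codes are hypothetical.
final class UnitDeltaCheckSketch {

    static final short NOT_UNIT_DELTA = 1; // hypothetical violation code

    // Returns one violation code per offending value; an empty list means "accept".
    static List<Short> check(long[] decodedValues) {
        List<Short> violations = new ArrayList<>();
        for (long v : decodedValues) {
            if (v != 1L && v != -1L) {
                violations.add(NOT_UNIT_DELTA);
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println(check(new long[] {1L, -1L}).isEmpty()); // accepted
        System.out.println(check(new long[] {1L, 0L, 5L}).size()); // two violations
    }
}
```

A check of this kind runs on the raw bytes at write time, so it could also reject a zero-length value before it ever reaches the combiner.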

-Eric




Re: exception thrown during minor compaction

Posted by z11373 <z1...@outlook.com>.
Hi Josh,
I figured out that it's actually a human bug :-)
There is existing code somewhere which I overlooked earlier; it wrote to the
table with an empty value, so it makes sense that it gave that error.

I'll try to enforce the semantics of our use case: deleting a row shouldn't
happen; instead, it should insert -1 as the value, so I should be good
here. I also tested deleting a single row, without re-inserting it, and that row
was not returned by the scan command, which is correct. I didn't try
re-adding that row, which I guess would hit the bug you guys mentioned.

I have one more question: is it possible to keep only the rows with a value
greater than 1? For example, let's say I insert:
['foo', 1]
['foo', 1]
['bar', 1]
['foo', 1]

The scan currently returns:
['bar', 1]
['foo', 3]

I want it to return only:
['foo', 3]

So basically, compaction would not write entries with value = 1 to the
new files, which would also keep the stats table small, since it has so
many rows with value = 1 (which we don't care about). Hopefully the solution
can also be used for thresholds other than 1 (e.g. keep rows with value > 10).
It looks like I may be able to use the built-in RegEx filter for that, but
perhaps there is a better way?

Oh, another thing that I observed during my testing:

root@dev > setiter -t combiner2 -p 15 -scan -minc -majc -n sumcombiners -class org.apache.accumulo.core.iterators.user.SummingCombiner
SummingCombiner interprets Values as Longs and adds them together.  A variety of encodings (variable length, fixed length, or string) are available
----------> set SummingCombiner parameter all, set to true to apply Combiner to every column, otherwise leave blank. if true, columns option will be ignored.: true
----------> set SummingCombiner parameter columns, <col fam>[:<col qual>]{,<col fam>[:<col qual>]} escape non-alphanum chars using %<hex>.:
----------> set SummingCombiner parameter lossy, if true, failed decodes are ignored. Otherwise combiner will error on failed decodes (default false): <TRUE|FALSE>:
----------> set SummingCombiner parameter type, <VARLEN|FIXEDLEN|STRING|fullClassName>: org.apache.accumulo.core.client.lexicoder.LongLexicoder
2015-09-01 15:22:46,587 [shell.Shell] ERROR: java.lang.IllegalArgumentException: bad encoder option

Do you know why I got that error?
'org.apache.accumulo.core.client.lexicoder.LongLexicoder' should be the
correct full class name.


Thanks a lot,
Z




Re: exception thrown during minor compaction

Posted by Josh Elser <jo...@gmail.com>.
Keith Turner wrote:
>> >  Thanks Eric and Josh.
>> >
>> >  There shouldn't be delete marker because my code doesn't perform any delete
>> >  operation, right?
>> >
>> >  Josh: if that out-of-the-box SummingCombiner cannot handle delete marker,
>> >  then I'd think that's bug:-)
>> >
>
>   https://issues.apache.org/jira/browse/ACCUMULO-2232
>

Aside: it cracks me up that I have no recollection of this conversation 
anymore, much less running into (what might be) the same bug :P

Re: exception thrown during minor compaction

Posted by Keith Turner <ke...@deenlo.com>.
On Tue, Sep 1, 2015 at 9:13 AM, z11373 <z1...@outlook.com> wrote:

> Thanks Eric and Josh.
>
> There shouldn't be delete marker because my code doesn't perform any delete
> operation, right?
>
> Josh: if that out-of-the-box SummingCombiner cannot handle delete marker,
> then I'd think that's bug :-)
>

 https://issues.apache.org/jira/browse/ACCUMULO-2232


Re: exception thrown during minor compaction

Posted by Adam Fuchs <af...@apache.org>.
In your case, there shouldn't be a delete marker unless you're explicitly
writing one.

The tricky thing about deletes in a summing combiner is that sums and
deletes together are not commutative, and combiners require associativity
and commutativity. If I have three operations: add 1 to x, delete x, and
then add 1 to x, I might reasonably expect the result of performing these
operations in order to be x = 1. However, if I reorder the first add and
the delete operations I could alternatively get x = 2. When using a
combiner this could happen when the first and last entries are included in
two files that go through a non-full major compaction, and the second entry
is in a third file that is not included. For this reason, we shouldn't have
general support for deletes in a SummingCombiner (but maybe we should have
better documentation).
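
The three-operation example above can be played through in a toy model. Everything here is an assumption made for illustration: entries are reduced to a timestamp, a delete flag and a value, a delete masks every older entry, and a combined entry carries the newest timestamp of its inputs. It is not Accumulo code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of "add 1, delete, add 1" for a single key; not Accumulo code.
final class CombinerDeleteOrderDemo {

    // One entry for key x: either an "add value" put or a delete marker.
    static final class Entry {
        final long timestamp;
        final boolean isDelete;
        final long value;

        Entry(long timestamp, boolean isDelete, long value) {
            this.timestamp = timestamp;
            this.isDelete = isDelete;
            this.value = value;
        }
    }

    // What a reader sees: the sum of all puts newer than the newest delete marker.
    static long visibleSum(List<Entry> entries) {
        long newestDelete = Long.MIN_VALUE;
        for (Entry e : entries) {
            if (e.isDelete) {
                newestDelete = Math.max(newestDelete, e.timestamp);
            }
        }
        long sum = 0;
        for (Entry e : entries) {
            if (!e.isDelete && e.timestamp > newestDelete) {
                sum += e.value;
            }
        }
        return sum;
    }

    // A combiner run over some puts: one entry, newest timestamp, summed value.
    static Entry combine(List<Entry> puts) {
        long newest = Long.MIN_VALUE;
        long sum = 0;
        for (Entry e : puts) {
            newest = Math.max(newest, e.timestamp);
            sum += e.value;
        }
        return new Entry(newest, false, sum);
    }

    // All three entries are read together.
    static long resultWhenAllFilesAreRead() {
        List<Entry> all = new ArrayList<>();
        all.add(new Entry(1, false, 1)); // add 1 to x
        all.add(new Entry(2, true, 0));  // delete x
        all.add(new Entry(3, false, 1)); // add 1 to x
        return visibleSum(all);
    }

    // The two puts are compacted first, without the file that holds the delete.
    static long resultAfterPartialCompaction() {
        List<Entry> compactedFiles = new ArrayList<>();
        compactedFiles.add(new Entry(1, false, 1));
        compactedFiles.add(new Entry(3, false, 1));
        List<Entry> afterwards = new ArrayList<>();
        afterwards.add(combine(compactedFiles)); // value 2 at timestamp 3
        afterwards.add(new Entry(2, true, 0));   // older than the combined entry
        return visibleSum(afterwards);
    }

    public static void main(String[] args) {
        System.out.println(resultWhenAllFilesAreRead());    // 1
        System.out.println(resultAfterPartialCompaction()); // 2
    }
}
```

The delete at timestamp 2 can no longer mask the first add once that add has been folded into an entry stamped 3, which is where the extra 1 comes from.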

There are a couple of alternative implementations to get delete
functionality:
1. Use a read-write loop to negate the current value of a key. Read the
current value and write back the same key with negative that value. Make
sure to batch this for performance.
2. Write a different iterator that supports deletes, but only operates on
minor compaction and full major compaction scopes.
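
Option 1 can be sketched on a plain list of deltas. This is only a model: the list stands in for the entries of one key, and in a real table the read would be a scan and the write a Mutation sent through a BatchWriter. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Model of "delete by negation" for one key under a summing combiner.
final class NegateToResetSketch {

    // What the combiner would report for this key.
    static long sum(List<Long> deltas) {
        long total = 0;
        for (long d : deltas) {
            total += d;
        }
        return total;
    }

    // Read the current value, then write back its negation as one more delta.
    static void resetByNegation(List<Long> deltas) {
        deltas.add(-sum(deltas));
    }

    public static void main(String[] args) {
        List<Long> deltas = new ArrayList<>();
        deltas.add(1L);
        deltas.add(1L);
        deltas.add(1L);
        resetByNegation(deltas);         // appends -3
        System.out.println(sum(deltas)); // 0
        deltas.add(1L);                  // a later add starts again from zero
        System.out.println(sum(deltas)); // 1
    }
}
```

Because the reset is itself just another addition, it stays correct no matter which files a later compaction happens to combine.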

There may also be a project that the Accumulo dev community would be
interested in, which would be to add a compaction strategy that makes sure
compactions always include a contiguous range of timestamps. I think this
would remove the requirement for commutativity in iterator operations and
wouldn't introduce performance problems in most cases.

Cheers,
Adam



Re: exception thrown during minor compaction

Posted by Josh Elser <jo...@gmail.com>.
z11373 wrote:
> Thanks Eric and Josh.
>
> There shouldn't be delete marker because my code doesn't perform any delete
> operation, right?

Correct.

> Josh: if that out-of-the-box SummingCombiner cannot handle delete marker,
> then I'd think that's bug :-)

Yes, definitely. Sounds like that is not the case however. Will wait to 
hear back with what you find.

>
> Thanks,
> Z

Re: exception thrown during minor compaction

Posted by z11373 <z1...@outlook.com>.
Thanks Eric and Josh.

There shouldn't be delete marker because my code doesn't perform any delete
operation, right?

Josh: if that out-of-the-box SummingCombiner cannot handle delete marker,
then I'd think that's bug :-)


Thanks,
Z




Re: exception thrown during minor compaction

Posted by Josh Elser <jo...@gmail.com>.
Ah, ok. Thanks for clarifying.

If that is indeed the cause here, it's a bug in our provided Combiners. 
Z is just using the "user" iterators that we provide and invoking the 
API. Based on provided usage earlier, he's doing the right thing.

If you can verify, Z, that'd be really helpful!


Re: exception thrown during minor compaction

Posted by Eric Newton <er...@gmail.com>.
Sure, but not during a compaction that does not involve all the underlying
files. The delete keys must be propagated.

I'm not completely familiar with the underlying libraries that help you
write iterators, I just know it's a common mistake.



Re: exception thrown during minor compaction

Posted by Josh Elser <jo...@gmail.com>.
Shouldn't the delete be masked at a lower layer (DeletingIterator)? Or 
am I forgetting that Combiners see that value somehow (and maybe 
SummingCombiner is broken)?


Re: exception thrown during minor compaction

Posted by Eric Newton <er...@gmail.com>.
You may be seeing a delete marker.  In a compaction, you will see delete
markers, which have an empty value. You will have to check the delete flag
on the key before grabbing the Value.

-Eric


Re: exception thrown during minor compaction

Posted by z11373 <z1...@outlook.com>.
Thanks Josh! I am going to do more experiments, because this is really weird.
I'll post an update if I find anything interesting later.

Thanks,
Z




Re: exception thrown during minor compaction

Posted by Josh Elser <jo...@gmail.com>.
Well that's curious. Certainly looks like it's doing the right thing. I 
wonder if there's an edge case with the Lexicoder that's causing it to 
write bad data...

Pretty much the only thing I can think of that wouldn't require code is 
the `grep` shell command. I believe the implementation will also inspect 
the value. Everything else, like you point out, is operating on the keys.

The code-writing option is to write your own Filter implementation that 
only accepts empty values, but that may be more work than just dumping 
the contents of the table and using some `grep` magic in the shell. I'll 
let you decide which is more work :)
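
For the code-writing option, the test itself is a single length check. A minimal sketch, using a plain map in place of a table so that it runs without Accumulo; the class and method names are hypothetical. In a real Filter subclass the same check would presumably go into accept(Key, Value), looking at the length of the array returned by Value.get().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-alone sketch: find the rows whose value is empty.
final class EmptyValueFinderSketch {

    // The predicate an "only empty values" filter would apply.
    static boolean isEmptyValue(byte[] value) {
        return value.length == 0;
    }

    // Walks every entry, since values cannot be looked up directly.
    static List<String> rowsWithEmptyValues(Map<String, byte[]> table) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, byte[]> entry : table.entrySet()) {
            if (isEmptyValue(entry.getValue())) {
                rows.add(entry.getKey());
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, byte[]> table = new LinkedHashMap<>();
        table.put("row1", new byte[] {1, 1});
        table.put("row2", new byte[0]);
        System.out.println(rowsWithEmptyValues(table)); // [row2]
    }
}
```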


Re: exception thrown during minor compaction

Posted by z11373 <z1...@outlook.com>.
Thanks Josh for the quick reply!
Yes, my code is in one place, which always inserts 1L as the value.

LongEncoder encoder = new LongEncoder();
Value countValue = new Value(encoder.encode(1L));
Mutation m = new Mutation(key);
m.put(name, new Text(), countValue);

It works fine up to a certain point in the ingestion (there are millions of
entries).
Looking at the code, I can't think of what would cause it to insert an empty
or null value, since it's explicitly hardcoded to 1L.
Is there a way to scan that table and find the keys whose value is empty?
I know that we can scan by key but not by value, so I guess this is not
possible unless I go through all of the rows.

Thanks,
Z




Re: exception thrown during minor compaction

Posted by Josh Elser <jo...@gmail.com>.
An empty value seems to imply that you wrote some unexpected data to 
your table. The following code does work correctly (using 1.7.0).

     byte[] bytes = new LongLexicoder().encode(1L);
     System.out.println(Arrays.toString(bytes));
     System.out.println(new LongLexicoder().decode(bytes));

Can you check the code you're using to create Mutations and verify that 
you're passing in the bytes from the LongLexicoder encode() method as 
the Value?

You can try removing the SummingCombiner from your table which should 
let you scan the table to verify the records (and also let your 
compactions happen). Because you removed the VersioningIterator, it 
should preserve all of the entries you have in the table.
