Posted to oak-dev@jackrabbit.apache.org by Julian Sedding <js...@gmail.com> on 2015/06/22 10:54:14 UTC

Error handling during AsyncIndexUpdate

Hi all

On a freshly migrated Oak setup (AEM 6.1), I recently observed that
async indexing was running all the time. At first I did not worry,
because there were ~14 million nodes to be indexed, but eventually I got
the impression that there was an endless loop.

Here's my take on what's happening, and please feel free to correct
any wrong assumptions I make:

- after a migration there is no checkpoint for async indexing to start
at, so it indexes everything
- a migration is a single commit, so async indexing is all or nothing
(not sure the single commit is relevant, anyone?)
- due to an oddity in the metadata of a PDF file, async indexing
failed with an exception
- async indexing retries on each subsequent run, and the error persists
- rinse and repeat
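
The loop described in the list above can be modelled with a small toy example. This is only a sketch of the interpretation given here, not Oak's AsyncIndexUpdate; the class name, the integer "changes" and the `indexable` predicate are all hypothetical. It assumes the checkpoint only advances when the whole range of changes was indexed without error.

```java
import java.util.function.IntPredicate;

// Toy model of the interpretation above; NOT Oak's AsyncIndexUpdate.
// All names here are hypothetical and exist only for illustration.
class AllOrNothingRun {

    /**
     * Tries up to maxRuns times to index the changes in (checkpoint, head].
     * The checkpoint advances only if every change in the range was indexed.
     * Returns the checkpoint reached after the last attempt.
     */
    static int runRepeatedly(int checkpoint, int head, int maxRuns, IntPredicate indexable) {
        for (int run = 0; run < maxRuns && checkpoint < head; run++) {
            boolean ok = true;
            for (int change = checkpoint + 1; change <= head; change++) {
                if (!indexable.test(change)) {
                    ok = false; // one failing change aborts the whole run
                    break;
                }
            }
            if (ok) {
                checkpoint = head; // all or nothing
            }
        }
        return checkpoint;
    }

    public static void main(String[] args) {
        // Change 7 always fails, so no run ever advances the checkpoint.
        System.out.println(runRepeatedly(0, 10, 5, c -> c != 7)); // prints 0
        // Without a failing change, the first run reaches the head.
        System.out.println(runRepeatedly(0, 10, 5, c -> true));   // prints 10
    }
}
```

In this model a single permanently failing change keeps the checkpoint where it started, no matter how many runs are attempted.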

If my interpretation is correct, I would suggest reviewing the error handling.

If an error is not recoverable, the current behaviour basically
prevents any documents from being indexed, and the AsyncIndexUpdate
stops making any progress.

It may be a better trade-off to report the paths of failing documents
and continue despite the failure.
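
To make the "report and continue" idea concrete, here is a minimal sketch. It is not Oak code: the class, the `Result` type and the `indexer` callback are hypothetical, and documents are reduced to a path-to-body map purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch, NOT Oak code: names and types are illustrative only.
class ReportAndContinueIndexer {

    /** Outcome of one run: number of indexed documents and the paths that failed. */
    static final class Result {
        final int indexed;
        final List<String> failedPaths;

        Result(int indexed, List<String> failedPaths) {
            this.indexed = indexed;
            this.failedPaths = failedPaths;
        }
    }

    /** Indexes every document; a failure on one path is recorded and does not abort the run. */
    static Result indexAll(Map<String, String> documents, Consumer<String> indexer) {
        int indexed = 0;
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, String> entry : documents.entrySet()) {
            try {
                indexer.accept(entry.getValue());
                indexed++;
            } catch (RuntimeException ex) {
                // Report the path and carry on with the next document.
                failed.add(entry.getKey());
            }
        }
        return new Result(indexed, failed);
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("/content/a", "ok");
        docs.put("/content/b", "bad");
        docs.put("/content/c", "ok");
        Result r = indexAll(docs, body -> {
            if ("bad".equals(body)) {
                throw new IllegalArgumentException("cannot index");
            }
        });
        // prints: indexed=2 failed=[/content/b]
        System.out.println("indexed=" + r.indexed + " failed=" + r.failedPaths);
    }
}
```

The sketch deliberately catches only RuntimeException, so an Error such as OutOfMemoryError would still abort the run.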

What do others think?

Regards
Julian

Re: Error handling during AsyncIndexUpdate

Posted by Stefan Egli <st...@apache.org>.
+1 to report and continue.

There was a similar issue earlier where async indexing would fail with
an OOME (OutOfMemoryError). In that case the 'rinse and repeat' made it
even worse, because with each run more data to be indexed had accumulated,
so the likelihood of another OOME only increased.

Cheers,
Stefan




Re: Error handling during AsyncIndexUpdate

Posted by Julian Sedding <js...@gmail.com>.
Hi Davide

I have looked into the issue some more and found that it is not
actually related to the document.

The issue occurs when a property is indexed in a Lucene index and is
marked as ordered. Now if a property with a matching name is created
with multiple values, the exception occurs.

I have attached a simple test case to the Jira issue that allows
reproducing the issue.

Would it be appropriate to ignore a property during indexing if it
does not correspond to the expectation in the index definition?
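
As a rough illustration of what such a check could look like, here is a minimal sketch. It is not Oak or Lucene code: the class and method names are hypothetical, and property values are simplified to a list of longs. The only assumption taken from the thread is that an ordered field accepts a single value per document.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, NOT Oak or Lucene code: names are illustrative only.
class OrderedPropertyGuard {

    /**
     * Returns the single value to use for an ordered (sortable) field, or empty
     * if the property does not have exactly one value and should be skipped.
     */
    static Optional<Long> sortableValue(String propertyName, List<Long> values) {
        if (values.size() != 1) {
            // Report the mismatch instead of failing the whole indexing run.
            System.err.println("Skipping ordered field for " + propertyName
                    + ": expected 1 value, found " + values.size());
            return Optional.empty();
        }
        return Optional.of(values.get(0));
    }

    public static void main(String[] args) {
        String name = "jcr:content/metadata/prism:expirationDate";
        // Multi-valued property: skipped rather than added twice.
        System.out.println(sortableValue(name, Arrays.asList(1L, 2L)).isPresent());            // prints false
        // Single-valued property: indexed as usual.
        System.out.println(sortableValue(name, Collections.singletonList(5L)).isPresent());    // prints true
    }
}
```

Whether skipping the property silently, logging it, or falling back to the first value is the right behaviour is exactly the open question above; the sketch only shows the skip-and-report variant.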

Regards
Julian


Re: Error handling during AsyncIndexUpdate

Posted by Davide Giannella <da...@apache.org>.
On 22/06/2015 18:06, Julian Sedding wrote:
> Hi Chetan
>
> I created OAK-3020 and attached the full stack trace there. The root
> cause is an IllegalArgumentException in Lucene, which in turn causes a
> CommitFailedException (OakLucene003):
>
> Caused by: java.lang.IllegalArgumentException: DocValuesField
> ":dvjcr:content/metadata/prism:expirationDate" appears more than once
> in this document (only one value is allowed per field)
>         at org.apache.lucene.index.NumericDocValuesWriter.addValue(NumericDocValuesWriter.java:54)
>         at org.apache.lucene.index.DocValuesProcessor.addNumericField(DocValuesProcessor.java:153)
>
Thanks Julian,

would it be possible to have the actual document that caused the error?
It would be extremely helpful if we could reproduce it on the pure Oak
side with a unit test.

Cheers
Davide



Re: Error handling during AsyncIndexUpdate

Posted by Julian Sedding <js...@gmail.com>.
Hi Chetan

I created OAK-3020 and attached the full stack trace there. The root
cause is an IllegalArgumentException in Lucene, which in turn causes a
CommitFailedException (OakLucene003):

Caused by: java.lang.IllegalArgumentException: DocValuesField
":dvjcr:content/metadata/prism:expirationDate" appears more than once
in this document (only one value is allowed per field)
        at org.apache.lucene.index.NumericDocValuesWriter.addValue(NumericDocValuesWriter.java:54)
        at org.apache.lucene.index.DocValuesProcessor.addNumericField(DocValuesProcessor.java:153)

Regards
Julian




Re: Error handling during AsyncIndexUpdate

Posted by Chetan Mehrotra <ch...@gmail.com>.
Hi Julian,

On Mon, Jun 22, 2015 at 2:24 PM, Julian Sedding <js...@gmail.com> wrote:
> due to an oddity in the metadata of a PDF file, async indexing
> failed with an exception

What was the error you got? LuceneIndexEditor does take care of
exceptions and moves on [1]. So it should not cause the indexing to
break.

Chetan Mehrotra
[1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndexEditor.java#L822