Posted to oak-issues@jackrabbit.apache.org by "Amit Jain (JIRA)" <ji...@apache.org> on 2015/12/22 04:24:46 UTC

[jira] [Commented] (OAK-3813) Exception in datastore causes async index to stop indexing new content

    [ https://issues.apache.org/jira/browse/OAK-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067488#comment-15067488 ] 

Amit Jain commented on OAK-3813:
--------------------------------

It might be that the missing Lucene blobs are due to OAK-3443 being hit, which was fixed in Oak 1.2.7.

> Exception in datastore causes async index to stop indexing new content
> ----------------------------------------------------------------------
>
>                 Key: OAK-3813
>                 URL: https://issues.apache.org/jira/browse/OAK-3813
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene
>    Affects Versions: 1.2.2
>            Reporter: Alexander Klimetschek
>            Priority: Critical
>
> We are using an S3-based datastore which (for unrelated reasons) sometimes fails to find certain blobs and throws an exception, see below. Unfortunately, this seems to block the indexing of any new content, as the indexer will try again and again to index the missing binary and fail at the same point.
> It would be great if the indexing process could be more resilient against errors like this; a sketch of what that could look like follows the stack trace below. (I think the datastore implementation should probably not propagate that exception to the outside but just log it, but that's a separate issue.)
> This is seen with Oak 1.2.2. I had a look at the [latest version on trunk|https://github.com/apache/jackrabbit-oak/blob/d5da738aa6b43424f84063322987b765aead7813/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java#L427-L431], but the behavior does not seem to have changed since then.
> {noformat}
> 17.12.2015 20:50:26.418 -0500 *ERROR* [pool-7-thread-5] org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception during job execution of org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@5cc5e2f6 : Error occurred while obtaining InputStream for blobId [2832539c16b1a2e5745370ee89e41ab562436c5f#109419]
> java.lang.RuntimeException: Error occurred while obtaining InputStream for blobId [2832539c16b1a2e5745370ee89e41ab562436c5f#109419]
> 	at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:49)
> 	at org.apache.jackrabbit.oak.plugins.segment.SegmentBlob.getNewStream(SegmentBlob.java:84)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:216)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:264)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:350)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:356)
> 	at org.apache.lucene.store.DataInput.readInt(DataInput.java:84)
> 	at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:126)
> 	at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.<init>(Lucene41PostingsReader.java:75)
> 	at org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:430)
> 	at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:195)
> 	at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:244)
> 	at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:116)
> 	at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96)
> 	at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:141)
> 	at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:279)
> 	at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3191)
> 	at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3182)
> 	at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3155)
> 	at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3123)
> 	at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:988)
> 	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:932)
> 	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:894)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext.closeWriter(LuceneIndexEditorContext.java:169)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:190)
> 	at org.apache.jackrabbit.oak.plugins.index.IndexUpdate.leave(IndexUpdate.java:221)
> 	at org.apache.jackrabbit.oak.spi.commit.VisibleEditor.leave(VisibleEditor.java:63)
> 	at org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:56)
> 	at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:367)
> 	at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:312)
> 	at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:105)
> 	at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: org.apache.jackrabbit.core.data.DataStoreException: Could not length of dataIdentifier 2832539c16b1a2e5745370ee89e41ab562436c5f
> 	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:465)
> 	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getInputStream(DataStoreBlobStore.java:297)
> 	at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:47)
> 	... 34 common frames omitted
> Caused by: org.apache.jackrabbit.core.data.DataStoreException: Could not length of dataIdentifier 2832539c16b1a2e5745370ee89e41ab562436c5f
> 	at org.apache.jackrabbit.aws.ext.ds.S3Backend.getLength(S3Backend.java:474)
> 	at org.apache.jackrabbit.core.data.CachingDataStore.getLength(CachingDataStore.java:669)
> 	at org.apache.jackrabbit.core.data.CachingDataStore.getRecord(CachingDataStore.java:467)
> 	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getDataRecord(DataStoreBlobStore.java:474)
> 	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:463)
> 	... 36 common frames omitted
> Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: E29ADB7F4BE7E12F)
> 	at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1078)
> 	at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:726)
> 	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:461)
> 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:296)
> 	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3736)
> 	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
> 	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1005)
> 	at org.apache.jackrabbit.aws.ext.ds.S3Backend.getLength(S3Backend.java:467)
> 	... 40 common frames omitted
> {noformat}
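>
> The resilience suggested above could look roughly like the sketch below. The Document and Indexer interfaces are hypothetical stand-ins, not Oak's actual APIs; the point is only to show the shape of the fix: open each binary inside its own try block, log a missing blob, skip that document, and keep indexing the rest of the batch.
> {noformat}
> import java.io.IOException;
> import java.io.InputStream;
> import java.util.List;
> import java.util.logging.Level;
> import java.util.logging.Logger;
>
> /**
>  * Hypothetical sketch: index each document independently so that one
>  * unreadable blob is logged and skipped instead of aborting the run.
>  */
> public class ResilientIndexPass {
>
>     private static final Logger LOG =
>             Logger.getLogger(ResilientIndexPass.class.getName());
>
>     /** Hypothetical stand-in for a node with a binary property. */
>     interface Document {
>         String path();
>         InputStream openBinary() throws IOException; // may fail if the blob is gone from S3
>     }
>
>     /** Hypothetical stand-in for the text-extraction/indexing step. */
>     interface Indexer {
>         void index(String path, InputStream binary) throws IOException;
>     }
>
>     /** Returns the number of documents that had to be skipped. */
>     static int indexAll(List<Document> batch, Indexer indexer) {
>         int skipped = 0;
>         for (Document doc : batch) {
>             try (InputStream in = doc.openBinary()) {
>                 indexer.index(doc.path(), in);
>             } catch (IOException | RuntimeException e) {
>                 // DataStoreBlobStore wraps backend failures in a RuntimeException
>                 // (see the trace above), so both types are caught here.
>                 LOG.log(Level.WARNING,
>                         "Skipping " + doc.path() + ": binary unavailable", e);
>                 skipped++;
>             }
>         }
>         return skipped;
>     }
> }
> {noformat}
> With per-document error handling like this, a single 404 from S3 would cost one document its full-text entry instead of stalling the async index indefinitely.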


