Posted to oak-issues@jackrabbit.apache.org by "Michael Dürig (JIRA)" <ji...@apache.org> on 2016/11/01 13:58:58 UTC

[jira] [Commented] (OAK-4965) Cold standby logs SNFE ERROR

    [ https://issues.apache.org/jira/browse/OAK-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15625502#comment-15625502 ] 

Michael Dürig commented on OAK-4965:
------------------------------------

[~dulceanu], what I'm missing from the patch is the information from {{SegmentId.gcInfo()}}, which was previously logged along with the error message when an {{SNFE}} occurred. This information is absolutely crucial for determining the root cause of a {{SNFE}}. I think it is fine to just add that method back and make it package-private. 

Regarding naming: {{SegmentNotFoundExceptionHandler}} should probably be renamed to {{SegmentNotFoundExceptionListener}} as it is not really a handler. That is, it cannot react to and handle that exception e.g. by finding the missing segment elsewhere. 

Please also add some Javadoc to the handler/listener and the default implementation. 
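
A minimal sketch of the shape suggested above: a listener that is only notified about a missing segment and receives the GC info along with it, plus documented default implementations. All names and signatures here ({{onSegmentNotFound}}, {{IGNORE}}, {{RecordingSnfeListener}}, the {{String}} parameters) are assumptions for illustration and are not taken from the patch or from the Oak API.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical listener that is notified when a segment cannot be found.
 * It only observes the event; it cannot recover the missing segment.
 */
interface SegmentNotFoundExceptionListener {

    /** Default implementation that silently ignores every notification. */
    SegmentNotFoundExceptionListener IGNORE = (segmentId, gcInfo, e) -> { };

    /**
     * Called when a segment was not found.
     *
     * @param segmentId identifier of the missing segment
     * @param gcInfo    garbage collection information for that segment id
     *                  (assumed to be what SegmentId.gcInfo() returns)
     * @param e         the exception raised for the missing segment
     */
    void onSegmentNotFound(String segmentId, String gcInfo, RuntimeException e);
}

/**
 * Illustrative implementation that records one message per notification,
 * keeping the GC info next to the segment id as in the old log line.
 */
final class RecordingSnfeListener implements SegmentNotFoundExceptionListener {

    private final List<String> messages = new ArrayList<>();

    @Override
    public void onSegmentNotFound(String segmentId, String gcInfo, RuntimeException e) {
        messages.add("Segment not found: " + segmentId + ". " + gcInfo);
    }

    /** Returns the messages recorded so far, in notification order. */
    List<String> messages() {
        return messages;
    }
}
```

With a split like this, the cold standby client could pass the ignoring listener to avoid the false-positive errors, while other callers keep a listener that logs the GC info.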

> Cold standby logs SNFE ERROR
> ----------------------------
>
>                 Key: OAK-4965
>                 URL: https://issues.apache.org/jira/browse/OAK-4965
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>    Affects Versions: Segment Tar 0.0.14
>            Reporter: Andrei Dulceanu
>            Assignee: Andrei Dulceanu
>             Fix For: 1.6, 1.5.13
>
>         Attachments: OAK-4965-01.patch
>
>
> On cold standby, there are a lot of occurrences of SNFE:
> {code}
> 200766 04.10.2016 14:29:52.657 *ERROR* [sling-default-16-Registered Service.577] org.apache.jackrabbit.oak.segment.SegmentId Segment not found: 19d493e3-8bad-4124-a962-5388d91f560e. SegmentId age=0ms
> 200767 org.apache.jackrabbit.oak.segment.SegmentNotFoundException: Segment 19d493e3-8bad-4124-a962-5388d91f560e not found
> 200768         at org.apache.jackrabbit.oak.segment.file.FileStore$14.call(FileStore.java:1345)
> 200769         at org.apache.jackrabbit.oak.segment.file.FileStore$14.call(FileStore.java:1285)
> 200770         at org.apache.jackrabbit.oak.cache.CacheLIRS$Segment.load(CacheLIRS.java:1013)
> 200771         at org.apache.jackrabbit.oak.cache.CacheLIRS$Segment.get(CacheLIRS.java:974)
> 200772         at org.apache.jackrabbit.oak.cache.CacheLIRS.get(CacheLIRS.java:285)
> 200773         at org.apache.jackrabbit.oak.segment.SegmentCache.getSegment(SegmentCache.java:92)
> 200774         at org.apache.jackrabbit.oak.segment.file.FileStore.readSegment(FileStore.java:1285)
> 200775         at org.apache.jackrabbit.oak.segment.SegmentId.getSegment(SegmentId.java:123)
> 200776         at org.apache.jackrabbit.oak.segment.Record.getSegment(Record.java:70)
> 200777         at org.apache.jackrabbit.oak.segment.SegmentNodeState.getStableIdBytes(SegmentNodeState.java:139)
> 200778         at org.apache.jackrabbit.oak.segment.SegmentNodeState.getStableId(SegmentNodeState.java:122)
> 200779         at org.apache.jackrabbit.oak.segment.SegmentNodeState.fastEquals(SegmentNodeState.java:633)
> 200780         at org.apache.jackrabbit.oak.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:459)
> 200781         at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSyncExecution.compareAgainstBaseState(StandbyClientSyncExecution.java:100)
> 200782         at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSyncExecution.execute(StandbyClientSyncExecution.java:80)
> 200783         at org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSync.run(StandbyClientSync.java:143)
> 200784         at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:118)
> 200785         at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
> 200786         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 200787         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 200788         at java.lang.Thread.run(Thread.java:745)
> 200789 04.10.2016 14:29:52.657 *INFO* [sling-default-16-Registered Service.577] org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSyncExecution Found missing segment 19d493e3-8bad-4124-a962-5388d91f560e
> 200790 04.10.2016 14:29:52.657 *INFO* [sling-default-16-Registered Service.577] org.apache.jackrabbit.oak.segment.standby.client.StandbyClientSyncExecution Loading segment 19d493e3-8bad-4124-a962-5388d91f560e
> 200791 04.10.2016 14:29:52.657 *DEBUG* [nioEventLoopGroup-208-1] org.apache.jackrabbit.oak.segment.standby.codec.GetSegmentRequestEncoder Sending request from client qastandby1 for segment 19d493e3-8bad-4124-a962-5388d91f560e
> {code}
> While these are false positives (the segment is found later), we need to find a way to avoid logging the errors. 


