Posted to dev@jackrabbit.apache.org by "Alex Parvulescu (Commented) (JIRA)" <ji...@apache.org> on 2011/11/30 17:09:39 UTC

[jira] [Commented] (JCR-3162) Index update overhead on cluster slave due to JCR-905

    [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160113#comment-13160113 ] 

Alex Parvulescu commented on JCR-3162:
--------------------------------------

Studying this problem revealed that the issue occurs during a cluster sync operation involving an instance that has been running for a really long time.

At this point I'm not sure what "really long time" means exactly, but it would appear that after a while the journal revision resets to 0.
This causes the cluster slave to sync from a lower revision number and fetch the journal records again, which in turn causes the repository to index them again.
If the current index already corresponds to a higher revision number, indexing those records again creates duplicates in the index.
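
A minimal sketch of how the duplicates come about. This is a toy model, not Jackrabbit code: the class DuplicateIndexDemo, the JournalRecord type and the sync method are all hypothetical, and the "index" is just an append-only list of node ids.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not Jackrabbit code: the "index" is an append-only list of node ids.
class DuplicateIndexDemo {

    // A journal record: the revision it was written at and the id of the changed node.
    static final class JournalRecord {
        final long revision;
        final String nodeId;

        JournalRecord(long revision, String nodeId) {
            this.revision = revision;
            this.nodeId = nodeId;
        }
    }

    // Indexes every record newer than localRevision and returns the new local revision.
    static long sync(List<JournalRecord> journal, long localRevision, List<String> index) {
        long latest = localRevision;
        for (JournalRecord record : journal) {
            if (record.revision > localRevision) {
                index.add(record.nodeId);
                latest = Math.max(latest, record.revision);
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        List<JournalRecord> journal = new ArrayList<>();
        journal.add(new JournalRecord(1, "node-a"));
        journal.add(new JournalRecord(2, "node-b"));

        List<String> index = new ArrayList<>();
        long revision = sync(journal, 0, index); // first sync: the index holds two entries
        revision = 0;                            // the revision is lost, the index is kept
        revision = sync(journal, revision, index);

        System.out.println(index); // each node id now appears twice
    }
}
```

Syncing a third time from the correct revision (2) would add nothing, so the duplicates only show up when the local revision is lower than the one the index was built from.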

JCR-905 tried to address that by first deleting all the records that come from an external source (the cluster sync) before adding them.
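
Roughly, the delete-before-add idea looks like the sketch below. It is an illustration only, not the actual JCR-905 patch: DeleteBeforeAddDemo and applyExternal are made-up names and the "index" is a plain list of node ids.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not the JCR-905 patch itself: the "index" is a list of node ids.
class DeleteBeforeAddDemo {

    // Applies an externally received change: remove any existing entries, then add one.
    // Returns the number of entries that had to be deleted first.
    static int applyExternal(List<String> index, String nodeId) {
        int before = index.size();
        index.removeIf(nodeId::equals); // runs for every external record, match or not
        int deleted = before - index.size();
        index.add(nodeId);
        return deleted;
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>();
        applyExternal(index, "node-a");
        applyExternal(index, "node-a"); // replayed record: still a single entry
        System.out.println(index.size());
    }
}
```

The delete step is attempted for every external record, even when there is nothing to remove, which is where the overhead on the slave comes from.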

The proposed solution tries to determine on repository startup whether the index is stale and, if it is, forces a full reindex by deleting it.
Index staleness is currently determined by checking whether the journal revision is 0 and index files are already present in the repository.
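
As a rough sketch of that check (not the actual patch): the class StaleIndexCheck, its method names, and the idea of passing the index location as a java.nio.file.Path are assumptions made for illustration only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper, not part of Jackrabbit.
class StaleIndexCheck {

    // True if indexDir exists as a directory and contains at least one entry.
    static boolean hasIndexFiles(Path indexDir) throws IOException {
        if (!Files.isDirectory(indexDir)) {
            return false;
        }
        try (Stream<Path> entries = Files.list(indexDir)) {
            return entries.findAny().isPresent();
        }
    }

    // The index is considered stale when the journal starts at 0 but an index already exists.
    static boolean isStale(long journalRevision, Path indexDir) throws IOException {
        return journalRevision == 0 && hasIndexFiles(indexDir);
    }

    public static void main(String[] args) throws IOException {
        Path indexDir = Files.createTempDirectory("index");
        Files.createFile(indexDir.resolve("segments")); // pretend an index was left behind
        System.out.println(isStale(0L, indexDir));      // revision 0 with index files present
    }
}
```

A caller would delete the index directory when isStale returns true, so that the next startup rebuilds the index from scratch.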

Interestingly, this happens a lot during tests, where the index is preserved from one restart to the next but the journal implementation is memory-based, so it gets reset every time.

The solution has some issues because of the asynchronous initialization of SearchIndex for workspaces other than "default": by the time the SearchIndex gets initialized, the cluster node has already synced to a revision higher than 0, even if it was 0 at the moment the repository was starting up.
This doesn't apply to the default workspace.

                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira