You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jackrabbit.apache.org by "Alex Parvulescu (Created) (JIRA)" <ji...@apache.org> on 2011/11/30 16:49:40 UTC

[jira] [Created] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Index update overhead on cluster slave due to JCR-905
-----------------------------------------------------

                 Key: JCR-3162
                 URL: https://issues.apache.org/jira/browse/JCR-3162
             Project: Jackrabbit Content Repository
          Issue Type: Improvement
          Components: clustering
            Reporter: Alex Parvulescu
            Priority: Minor


JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Alex Parvulescu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated JCR-3162:
---------------------------------

    Attachment: JCR-3162-v3.patch

V3 comes with a complete redesign of the patch.

After further analysis we've decided to go with inspecting the incoming journal changes in the case of an initial index re-build.

I'll try to clarify. The scope of JCR-905 fix should *only* be for an initial index build. 
The initial indexing operation can cause doubles to appear, as some nodes can be seen by a slave before the ADD event has reached it. This happens because of shared storage between cluster nodes.
So, when a slave starts to re-index the repository content, it will include *everything* (potentially also nodes that is hasn't received a ADD event for yet). 
When the index finishes, the repository will continue its startup. A bit later, the cluster component will also initialize and consequently sync. This will pull in the ADD events that were pending in a newer revision, on the master.

The V3 tries to poll the changes before the cluster.sync call, and preemptively generate DELETE events for all the ADD events that it finds on the current workspace.
(this is similar to the JCR-905 patch, but with a much smaller scope).

 
Another feature introduced in the patch is to force flush the index after the initial index has been created.
This was artificially done in the original test case (no unit test though) by:
> However, when I debug clusternode 2 and have a breakpoint (i.e., a pause of a few seconds at line 306 of RepositoryImpl.java - just before the clusternode is started), then the resultset contains two results, both with the same UUID.

So forcing the index flush will correctly reproduce the original problem. And I think should be the correct behaviour of the original index creation.
On the other hand, not flushing the index will hide the problem because the indexing queue is smart enough to remove doubles.

But, flushing the index basically invalidates JCR-905, which is a bit unexpected (see attached patch, by switching the feature flags off).

On the code itself: I guess the AbstractJournal could use a bit of refactoring on the event polling side.


                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Marcel Reutegger (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170329#comment-13170329 ] 

Marcel Reutegger commented on JCR-3162:
---------------------------------------

I can confirm that the test runs successful when the path in the query statement is quoted.

I think now that we know what's going on, we can apply the patch again. Sorry, for being overly cautious....
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>             Fix For: 2.3.6
>
>         Attachments: DbClusterTestJCR3162.patch, JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162-v4.patch, JCR-3162-v5.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Alex Parvulescu (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu resolved JCR-3162.
----------------------------------

       Resolution: Fixed
    Fix Version/s: 2.3.6

fixed in revision 1214329
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>             Fix For: 2.3.6
>
>         Attachments: JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162-v4.patch, JCR-3162-v5.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Marcel Reutegger (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcel Reutegger updated JCR-3162:
----------------------------------

    Attachment: DbClusterTestJCR3162.patch
    
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>             Fix For: 2.3.6
>
>         Attachments: DbClusterTestJCR3162.patch, JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162-v4.patch, JCR-3162-v5.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Alex Parvulescu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169428#comment-13169428 ] 

Alex Parvulescu commented on JCR-3162:
--------------------------------------

thanks for taking the time to rewrite the patch!

It looks good to me.
I'll remove the feature flags and the ol' dirty fix and commit the code.
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162-v4.patch, JCR-3162-v5.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Alex Parvulescu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168354#comment-13168354 ] 

Alex Parvulescu commented on JCR-3162:
--------------------------------------

hi, thanks for the feedback.

> I see that you moved the initial index creation from the SearchIndex to the MultiIndex. This is how it was before JCR-1064, where we had to move it out of the MultiIndex. I'm not sure if it is now safe to move the initial index creation trigger back to the SearchIndex... 

I wasn't aware of that move. I can put it back, it just felt a bit off to have code that is potentially internal to MultiIndex outside controlled by SearchIndex.

> The class ClusterNode imports the MultiIndex, though this is probably just minor because it is not really required to compile the class, but only to resolve the javadoc reference
True. We can remove that import.

> More troublesome is the import of DbClusterTestJCR3162 in SearchManager. That doesn't work because DbClusterTestJCR3162 is a test class. 
That is just the feature flag. It will be removed once we agree on what the code should look like.

>Wouldn't it be sufficient to use the existing journal and cluster node interface to get changes from a given revision?
I'm not aware of any such interface. You have the local revision number available, but not the actual changes, that is why I've added the 'poll' operation.

Now I'm investigating a lock on Cluster.sync introduced by this patch, so I'll upload another one soon that will take this feedback into consideration.
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Alex Parvulescu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169535#comment-13169535 ] 

Alex Parvulescu commented on JCR-3162:
--------------------------------------

yes I did. do you see any tests failing?

is there any reason for reverting the change?
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>             Fix For: 2.3.6
>
>         Attachments: JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162-v4.patch, JCR-3162-v5.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Alex Parvulescu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated JCR-3162:
---------------------------------

    Attachment: JCR-3162-v4.patch

> I think Journal.getRecords(long startRevision) could be used for that purpose.

yes,that makes perfect sense! (and it gets rid of my lock problem :)

Attaching v4:
 - using Journal.getRecords
 - cleanup of imports
 - moved feature flags from 'test' code to main code (I'll remove them entirely once the code is good)


                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162-v4.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Alex Parvulescu (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu resolved JCR-3162.
----------------------------------

    Resolution: Fixed

Fixed in rev 1214842.

False alarm, the errors came from an issue with the SQL2 Parsed (JCR-3182).
As it turns out, we have a workaround for it: quoting the path helps the parser work properly.
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>             Fix For: 2.3.6
>
>         Attachments: DbClusterTestJCR3162.patch, JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162-v4.patch, JCR-3162-v5.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Marcel Reutegger (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169417#comment-13169417 ] 

Marcel Reutegger commented on JCR-3162:
---------------------------------------

Looks better now to me.

Some more comments:

- I would rather put getting the change log records into the SearchIndex, to not expand the api of the cluster node unnecessarily.
- getChangeLogRecords() should check the producer id of the record
- processing of ClusterRecords can be done without instanceof check
- Record.getWorkspace() may return null for operations on the version storage (RepositoryImpl.getSearchManager() should pass null as workspace name?)
- instance revision is also available from ClusterNode.getRevision()

I'll attach a patch.
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162-v4.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Marcel Reutegger (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcel Reutegger updated JCR-3162:
----------------------------------

    Attachment: JCR-3162-v5.patch
    
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162-v4.patch, JCR-3162-v5.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Alex Parvulescu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated JCR-3162:
---------------------------------

    Attachment: JCR-3162-v2.patch

Attaching v2 of the patch.

It is basically v1 + a test + some feature flags.
I managed to find a way to test this problem within JR Core, which should allow us to have some meaningful progress.

The test DbClusterTestJCR3162 contains 2 feature flags (which will be removed once the patch is good):
 - JCR3162_ENABLE_CLUSTER_DELETE this is the JCR-905 fix which can be switched on/off with the help of this flag
 - JCR3162_ENABLE_REINDEX this controls the new code (reindexing if the index is considered stale)

So basically: 
 - both flags off: you can see the initial behaviour (pre JCR-905) in action. The test fails (as it should).

 - just JCR3162_ENABLE_CLUSTER_DELETE represents the current behaviour of the repository (the test passes)

 - just JCR3162_ENABLE_REINDEX represents the patch in action (the test passes!)

The original problem still stands: how do we know if an index is stale or not. Default workspace will be covered, the other workspaces are going to be initializes asynchronously, so there are no guarantees.

                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3162-v2.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Marcel Reutegger (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168372#comment-13168372 ] 

Marcel Reutegger commented on JCR-3162:
---------------------------------------

I think Journal.getRecords(long startRevision) could be used for that purpose.
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Reopened] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Marcel Reutegger (Reopened) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcel Reutegger reopened JCR-3162:
-----------------------------------


To me it looks like this is not yet fixed.
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>             Fix For: 2.3.6
>
>         Attachments: DbClusterTestJCR3162.patch, JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162-v4.patch, JCR-3162-v5.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Alex Parvulescu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160113#comment-13160113 ] 

Alex Parvulescu commented on JCR-3162:
--------------------------------------

Studying this problem revealed that this issue happens whenever we are dealing with a cluster sync operation involving an instance that has been running for a really long time.

At this point I'm not sure what really long time means exactly, but it would appear that after a while the journal revision resets to 0.
This causes the cluster slave to sync using a lower revision number, thus fetching the journal records again, which would determine the repository to index them again.
If the current index corresponds to a bigger revision number, re-indexing again means that there will be duplicates in the index.

JCR-905 tried to address that by first deleting all the records that come from an external source (the cluster sync) before adding them.

The proposed solution tries to determine on repository startup if the index is stale and tries to force a full reindex by deleting it.
Index staleness is currently determined by checking if journal revision is 0 and if there are already index files present in the repository.

Interestingly this happens a lot during tests when the index is conserved from one restart to the other, but the journal impl is memory based so it gets reset every time.

The solution has some issues because of the asynchronous initialization of SearchIndex for workspaces other than "default". Meaning that by the time the SearchIndex gets initialized, the cluster node has already sync'ed to a bigger revision than 0, even if it was 0 at the moment when the repo was starting up.
But this doesn't apply to the default workspace.

                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Alex Parvulescu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated JCR-3162:
---------------------------------

    Attachment: JCR-3162.patch

attaching patch.

Warning: this is a work in progress, not really tested. 
It includes changes from JCR-3160 (as some files overlap).

Feedback needed.
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Marcel Reutegger (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170132#comment-13170132 ] 

Marcel Reutegger commented on JCR-3162:
---------------------------------------

I also noticed that the test is not included in the TestAll suite of the respective package. This means the new test is not run with every build.

The new test uses annotations to mark the test, which makes it difficult to use in the TestAll. Is there a specific reason why you used annotations instead of inheriting from JUnitTest.

I quickly extended the test to make the test nodes versionable and check consistency on the complete workspace (including the version storage). Running the test now results in random failure for me.

See attached patch.
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>             Fix For: 2.3.6
>
>         Attachments: DbClusterTestJCR3162.patch, JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162-v4.patch, JCR-3162-v5.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Marcel Reutegger (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168350#comment-13168350 ] 

Marcel Reutegger commented on JCR-3162:
---------------------------------------

Thanks for the patch.

I see that you moved the initial index creation from the SearchIndex to the MultiIndex. This is how it was before JCR-1064, where we had to move it out of the MultiIndex. I'm not sure if it is now safe to move the initial index creation trigger back to the SearchIndex...

The are some imports which look odd. The class ClusterNode imports the MultiIndex, though this is probably just minor because it is not really required to compile the class, but only to resolve the javadoc reference. More troublesome is the import of DbClusterTestJCR3162 in SearchManager. That doesn't work because DbClusterTestJCR3162 is a test class.

I'm also wordering if it is possible to fix this issue without introducing too much changes to the journal/cluster code. Wouldn't it be sufficient to use the existing journal and cluster node interface to get changes from a given revision?
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3162) Index update overhead on cluster slave due to JCR-905

Posted by "Marcel Reutegger (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169529#comment-13169529 ] 

Marcel Reutegger commented on JCR-3162:
---------------------------------------

> I'll remove the feature flags and the ol' dirty fix and commit the code.

my patch was just a quick rewrite of the patch to illustrate my comments. I didn't run any tests with the patch. Did you run the tests? Otherwise I suggest we revert the commit for now...
                
> Index update overhead on cluster slave due to JCR-905
> -----------------------------------------------------
>
>                 Key: JCR-3162
>                 URL: https://issues.apache.org/jira/browse/JCR-3162
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering
>            Reporter: Alex Parvulescu
>            Priority: Minor
>             Fix For: 2.3.6
>
>         Attachments: JCR-3162-v2.patch, JCR-3162-v3.patch, JCR-3162-v4.patch, JCR-3162-v5.patch, JCR-3162.patch
>
>
> JCR-905 is a quick and dirty fix and causes overhead on a cluster slave node when it processes revisions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira