Posted to dev@jackrabbit.apache.org by "Martijn Hendriks (JIRA)" <ji...@apache.org> on 2007/08/24 12:43:30 UTC

[jira] Created: (JCR-1087) Maintain the cluster revision table

Maintain the cluster revision table
-----------------------------------

                 Key: JCR-1087
                 URL: https://issues.apache.org/jira/browse/JCR-1087
             Project: Jackrabbit
          Issue Type: Improvement
          Components: clustering
    Affects Versions: 1.3
         Environment: A clustered Jackrabbit
            Reporter: Martijn Hendriks
            Priority: Minor


The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (JCR-1087) Maintain the cluster revision table

Posted by "Martijn Hendriks (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn Hendriks reassigned JCR-1087:
-------------------------------------

    Assignee: Martijn Hendriks

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: clustering
>    Affects Versions: 1.3
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Assignee: Martijn Hendriks
>            Priority: Minor
>         Attachments: cluster-trace.txt, JCR-1087-v2.patch, JCR-1087.patch
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Updated: (JCR-1087) Maintain the cluster revision table

Posted by "Christian Schröder (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Schröder updated JCR-1087:
------------------------------------

    Comment: was deleted

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: clustering, jackrabbit-core
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Assignee: Martijn Hendriks
>            Priority: Minor
>             Fix For: 1.5.0
>
>         Attachments: cluster-trace.txt, JCR-1087-v2.patch, JCR-1087.patch
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Updated: (JCR-1087) Maintain the cluster revision table

Posted by "Martijn Hendriks (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn Hendriks updated JCR-1087:
----------------------------------

    Attachment: JCR-1087.patch

Attached is a patch for this issue. When a DatabaseJournal is used, the local revisions are also stored in the database instead of on the local file system. This information can then be used for periodic clean-ups of the JOURNAL table, which may become very large. Note that this only works if all Jackrabbit information except for the search index is stored in the database. The clean-up thread is disabled by default.

Please comment. Thanks!

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: clustering
>    Affects Versions: 1.3
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Priority: Minor
>         Attachments: cluster-trace.txt, JCR-1087.patch
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Updated: (JCR-1087) Maintain the cluster revision table

Posted by "Martijn Hendriks (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn Hendriks updated JCR-1087:
----------------------------------

    Attachment: JCR-1087-v2.patch

Hi all,

Unfortunately I've been inactive for a while, but now I have more time to work on Jackrabbit, which is good :). I created a second patch for this issue, which also addresses the upgrade scenario that Dominique mentioned:
- Added the LOCAL_REVISIONS table to the create scripts (*.ddl)
- Added InstanceRevision interface
- The InstanceRevision is now retrieved through the Journal instance
- Added logic to the DatabaseJournal to (i) migrate to a db based InstanceRevision,
  and (ii) start a janitor thread for cleaning up old cluster revision entries
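
As a rough illustration of the migration step in (i), here is a minimal sketch. It assumes the LOCAL_REVISIONS table can be modelled as an in-memory map from cluster node id to revision. The names InstanceRevisionSketch, MapBackedRevision and RevisionMigrationSketch are hypothetical and chosen only for this example; the InstanceRevision interface in the patch may look different.

```java
import java.util.Map;

/** Hypothetical shape of a revision holder; not the interface from the patch. */
interface InstanceRevisionSketch {
    long get();
    void set(long value);
}

/** Revision stored in a map that stands in for the LOCAL_REVISIONS table (assumption). */
final class MapBackedRevision implements InstanceRevisionSketch {
    private final Map<String, Long> table;
    private final String nodeId;

    MapBackedRevision(Map<String, Long> table, String nodeId) {
        this.table = table;
        this.nodeId = nodeId;
    }

    @Override
    public long get() {
        return table.get(nodeId);
    }

    @Override
    public void set(long value) {
        table.put(nodeId, value);
    }
}

final class RevisionMigrationSketch {
    /**
     * Copies the file-based revision into the table only if the node has no
     * entry yet, so that a second start-up does not overwrite a newer value.
     */
    static InstanceRevisionSketch migrate(Map<String, Long> table, String nodeId, long fileRevision) {
        table.putIfAbsent(nodeId, fileRevision);
        return new MapBackedRevision(table, nodeId);
    }
}
```

Calling migrate a second time for the same node leaves the stored revision untouched, which is the behaviour one would want when a node that has already been migrated is restarted.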

I've tested the patch only on MSSQL, MySQL and Oracle, because I don't have access to the other databases.

I don't really like the solution for the upgrade scenario (a .ddl file is scanned for the line that creates the LOCAL_REVISIONS table), but I like the alternative of having twice as many .ddl files even less. But maybe there's a third way?
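
The scanning approach mentioned above could look roughly like the following sketch. It assumes the .ddl file has already been read into a list of lines with one statement per line. DdlScanSketch and findCreateTable are hypothetical names for this example and are not taken from the patch.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper for this example; not part of Jackrabbit. */
final class DdlScanSketch {
    /**
     * Returns the first line that is a CREATE TABLE statement mentioning the
     * given table name, compared case-insensitively. This is a plain substring
     * match, so a name that is contained in another table name also matches.
     */
    static Optional<String> findCreateTable(List<String> ddlLines, String tableName) {
        String needle = tableName.toUpperCase(Locale.ROOT);
        for (String raw : ddlLines) {
            String line = raw.trim();
            String upper = line.toUpperCase(Locale.ROOT);
            if (upper.startsWith("CREATE TABLE") && upper.contains(needle)) {
                return Optional.of(line);
            }
        }
        return Optional.empty();
    }
}
```

The returned line could then be executed on its own against an existing schema in which only that one table is missing.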

Best regards, Martijn

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: clustering
>    Affects Versions: 1.3
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Priority: Minor
>         Attachments: cluster-trace.txt, JCR-1087-v2.patch, JCR-1087.patch
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Assigned: (JCR-1087) Maintain the cluster revision table

Posted by "Martijn Hendriks (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn Hendriks reassigned JCR-1087:
-------------------------------------

    Assignee:     (was: Martijn Hendriks)

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering, jackrabbit-core
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Priority: Minor
>             Fix For: 1.5.0
>
>         Attachments: cluster-trace.txt, JCR-1087-v2.patch, JCR-1087.patch
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Commented: (JCR-1087) Maintain the cluster revision table

Posted by "Christian Schröder (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661283#action_12661283 ] 

Christian Schröder commented on JCR-1087:
-----------------------------------------

Preparing the statements already uses the LOCAL_REVISIONS table. Because of this, checkLocalRevisionSchema() is called too late and startup fails.

checkLocalRevisionSchema() should be called in checkScheme().

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: clustering, jackrabbit-core
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Assignee: Martijn Hendriks
>            Priority: Minor
>             Fix For: 1.5.0
>
>         Attachments: cluster-trace.txt, JCR-1087-v2.patch, JCR-1087.patch
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Updated: (JCR-1087) Maintain the cluster revision table

Posted by "Martijn Hendriks (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn Hendriks updated JCR-1087:
----------------------------------

    Attachment: cluster-trace.txt

When the cluster revision table becomes too large, a cluster node without a search index and a local revision number cannot be started due to memory problems (see the attached stack trace).

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: clustering
>    Affects Versions: 1.3
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Priority: Minor
>         Attachments: cluster-trace.txt
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Commented: (JCR-1087) Maintain the cluster revision table

Posted by "Martijn Hendriks (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770827#action_12770827 ] 

Martijn Hendriks commented on JCR-1087:
---------------------------------------

I'm afraid there's no documentation yet. I'll try to add it soon.

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering, jackrabbit-core
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Assignee: Martijn Hendriks
>            Priority: Minor
>             Fix For: 1.5.0
>
>         Attachments: cluster-trace.txt, JCR-1087-v2.patch, JCR-1087.patch
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Updated: (JCR-1087) Maintain the cluster revision table

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated JCR-1087:
-------------------------------

          Component/s: jackrabbit-core
    Affects Version/s:     (was: 1.3)
        Fix Version/s: 1.5

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: clustering, jackrabbit-core
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Assignee: Martijn Hendriks
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: cluster-trace.txt, JCR-1087-v2.patch, JCR-1087.patch
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Commented: (JCR-1087) Maintain the cluster revision table

Posted by "Dominique Pfister (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540178 ] 

Dominique Pfister commented on JCR-1087:
----------------------------------------

Hi Martijn,

Your patch looks good to me, so please go ahead and submit it. One nice addition that people who already have a database journal might need: is there a way to easily detect whether the LOCAL_REVISIONS table is missing and to tell the user to upgrade their schema?

Cheers
Dominique

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: clustering
>    Affects Versions: 1.3
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Priority: Minor
>         Attachments: cluster-trace.txt, JCR-1087.patch
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Commented: (JCR-1087) Maintain the cluster revision table

Posted by "Martijn Hendriks (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540994 ] 

Martijn Hendriks commented on JCR-1087:
---------------------------------------

Hi Dominique,

Good point! When this patch is applied to a Jackrabbit installation that already uses the clustering feature, it will break if the LOCAL_REVISIONS table is not added manually. I'll look into this.

Martijn

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: clustering
>    Affects Versions: 1.3
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Priority: Minor
>         Attachments: cluster-trace.txt, JCR-1087.patch
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Commented: (JCR-1087) Maintain the cluster revision table

Posted by "Martijn Hendriks (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536504 ] 

Martijn Hendriks commented on JCR-1087:
---------------------------------------

The resolution of JCR-905 allows us to remove all unnecessary revision data. That is, the minimum of all local revisions of the cluster nodes gives an upper bound on the revisions that can safely be removed from the database.

A solution for this issue would be to add a periodic task that removes all unnecessary revisions:
- All cluster nodes should add their local revision to the database.
- Add a configuration option in the repository.xml to let one of the cluster nodes execute the cleanup task (e.g., a period and offset such as "every night at 00:00 hours").
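
A minimal sketch of the bound described above, assuming the journal and the per-node local revisions are held in in-memory maps rather than database tables. The names RevisionCleanupSketch, minLocalRevision and purge are hypothetical and not part of Jackrabbit; a real implementation would issue an equivalent DELETE against the journal table.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.OptionalLong;

/** Hypothetical helper for this example; not part of Jackrabbit. */
final class RevisionCleanupSketch {
    /** Smallest local revision over all cluster nodes, or empty if none has reported. */
    static OptionalLong minLocalRevision(Map<String, Long> localRevisions) {
        return localRevisions.values().stream().mapToLong(Long::longValue).min();
    }

    /**
     * Removes every journal entry whose revision is strictly lower than the
     * minimum local revision and returns the number of removed entries.
     * Nothing is removed if no node has reported a revision.
     */
    static int purge(NavigableMap<Long, String> journal, Map<String, Long> localRevisions) {
        OptionalLong min = minLocalRevision(localRevisions);
        if (!min.isPresent()) {
            return 0;
        }
        // headMap returns a view, so clearing it removes the entries from the journal.
        NavigableMap<Long, String> obsolete = journal.headMap(min.getAsLong(), false);
        int removed = obsolete.size();
        obsolete.clear();
        return removed;
    }
}
```

With a journal holding revisions 1 to 10 and nodes at revisions 7, 4 and 9, the bound is 4, so revisions 1 to 3 are removed and revision 4 onwards is kept.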

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: clustering
>    Affects Versions: 1.3
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Priority: Minor
>         Attachments: cluster-trace.txt
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Resolved: (JCR-1087) Maintain the cluster revision table

Posted by "Martijn Hendriks (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn Hendriks resolved JCR-1087.
-----------------------------------

    Resolution: Fixed

Committed in revision 628697.

The instance revision on the local file system is automatically migrated to the database (to the LOCAL_REVISIONS table). The clean-up thread is not started by default.

Known caveats of the current solution:
- The user must make sure that all cluster nodes have written their local revision to the database before the clean-up thread runs for the first time. Otherwise, cluster nodes might miss updates that have already been purged, and their local caches and search indexes would get out of sync.
- If a cluster node is removed permanently from the cluster, then its entry in the LOCAL_REVISIONS table should be removed manually. Otherwise, the clean-up thread will not be effective.
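
Both caveats can be illustrated with a small sketch. It assumes an operator-supplied set of active node ids, which the committed solution does not have; the guard is therefore purely hypothetical, as are the names CleanupGuardSketch and purgeBound.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.Set;

/** Hypothetical guard for this example; not part of the committed change. */
final class CleanupGuardSketch {
    /**
     * Returns the revision below which entries may be purged, or empty if
     * some active node has not yet written its local revision.
     * Entries of nodes that are no longer active still count towards the
     * minimum, which is why stale entries hold the bound back.
     */
    static OptionalLong purgeBound(Set<String> activeNodes, Map<String, Long> localRevisions) {
        if (!localRevisions.keySet().containsAll(activeNodes)) {
            return OptionalLong.empty();
        }
        return localRevisions.values().stream().mapToLong(Long::longValue).min();
    }
}
```

If a stale entry for a removed node is left at a low revision, the bound stays at that revision no matter how far the active nodes advance. Deleting the entry lets the bound move up to the minimum of the active nodes.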


> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: clustering
>    Affects Versions: 1.3
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Assignee: Martijn Hendriks
>            Priority: Minor
>         Attachments: cluster-trace.txt, JCR-1087-v2.patch, JCR-1087.patch
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.



[jira] Commented: (JCR-1087) Maintain the cluster revision table

Posted by "Thomas Mueller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770556#action_12770556 ] 

Thomas Mueller commented on JCR-1087:
-------------------------------------

I couldn't find any documentation for this feature at: http://wiki.apache.org/jackrabbit/Clustering

Is there any documentation? So far I have only added a link to this issue (Removing Old Revisions).

> Maintain the cluster revision table
> -----------------------------------
>
>                 Key: JCR-1087
>                 URL: https://issues.apache.org/jira/browse/JCR-1087
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: clustering, jackrabbit-core
>         Environment: A clustered Jackrabbit
>            Reporter: Martijn Hendriks
>            Assignee: Martijn Hendriks
>            Priority: Minor
>             Fix For: 1.5.0
>
>         Attachments: cluster-trace.txt, JCR-1087-v2.patch, JCR-1087.patch
>
>
> The revision table in which cluster nodes write their changes can potentially become very large. If all cluster nodes are up to date to a certain revision number, then it seems unnecessary to keep the revisions with a lower number.
