Posted to dev@jackrabbit.apache.org by "Cédric Chantepie (JIRA)" <ji...@apache.org> on 2010/02/12 12:06:28 UTC

[jira] Created: (JCR-2492) Garbage Collector remove data for active node

Garbage Collector remove data for active node
---------------------------------------------

                 Key: JCR-2492
                 URL: https://issues.apache.org/jira/browse/JCR-2492
             Project: Jackrabbit Content Repository
          Issue Type: Bug
    Affects Versions: core 1.4.5
         Environment: Linux 2.6.x (gentoo or fedora), JDK 1.5 (sun or jrockit), JBoss 4.2.3.GA, Derby (10.4.1.3), PostgreSQL (8.1.11 or 8.0.3)
* FileSystem = LocalFileSystem
* custom AccessManager
* PersistenceManager = PostgreSQLPersistenceManager
* SearchIndex, textFilterClasses = ""
* DataStore = FileDataStore (minLogRecord = 100)
            Reporter: Cédric Chantepie
            Priority: Critical


When we run GarbageCollector on a 42 GB datastore, it erases all data.
Looking at the nodes afterwards, none of them has data any longer: jcr:data was removed because the data in the datastore no longer exists.
On some smaller test repositories, this problem does not occur.

We will try to upgrade Jackrabbit, but it would at least be good to know what the actual problem with GC in Jackrabbit 1.4.5 is, so that we can be sure that upgrading will really fix it.

Thanks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (JCR-2492) Garbage Collector remove data for active node

Posted by "Cédric Chantepie (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834234#action_12834234 ] 

Cédric Chantepie commented on JCR-2492:
---------------------------------------

I think the main cause of this problem is here: http://svn.apache.org/viewvc/jackrabbit/branches/1.4/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/persistence/bundle/BundleDbPersistenceManager.java?p2=%2Fjackrabbit%2Fbranches%2F1.4%2Fjackrabbit-core%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fjackrabbit%2Fcore%2Fpersistence%2Fbundle%2FBundleDbPersistenceManager.java&p1=%2Fjackrabbit%2Fbranches%2F1.4%2Fjackrabbit-core%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fjackrabbit%2Fcore%2Fpersistence%2Fbundle%2FBundleDbPersistenceManager.java&r1=633844&r2=633843&view=diff&pathrev=633844




[jira] Resolved: (JCR-2492) Garbage Collector remove data for active node

Posted by "Thomas Mueller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller resolved JCR-2492.
---------------------------------

    Resolution: Fixed

There are other problems with version 1.4.x; see also JCR-1414 and especially JCR-2063, which was not backported to 1.4.x. See also the comment there for a workaround.

Please re-open the bug if you can still reproduce it.




[jira] Commented: (JCR-2492) Garbage Collector remove data for active node

Posted by "Thomas Mueller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834242#action_12834242 ] 

Thomas Mueller commented on JCR-2492:
-------------------------------------

Hi,

I think you are right. I have added a comment in JCR-1414 about this.
So I guess this makes it a duplicate of JCR-1414.

A workaround is to disable the PersistenceManager scan using 
GarbageCollector.setPersistenceManagerScan(false), 
however this will not solve the other problems of JCR-1414 and JCR-2063.




[jira] Updated: (JCR-2492) Garbage Collector remove data for active node

Posted by "Cédric Chantepie (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cédric Chantepie updated JCR-2492:
----------------------------------

    Affects Version/s:     (was: core 1.4.5)
                       1.4



[jira] Resolved: (JCR-2492) Garbage Collector remove data for active node

Posted by "Thomas Mueller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller resolved JCR-2492.
---------------------------------

    Resolution: Cannot Reproduce

I cannot reproduce this problem. Please see:
http://wiki.apache.org/jackrabbit/QuestionsAndAnswers#Reporting_Problems

What we need is a simple, standalone test case that reproduces the problem.



[jira] Reopened: (JCR-2492) Garbage Collector remove data for active node

Posted by "Cédric Chantepie (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cédric Chantepie reopened JCR-2492:
-----------------------------------


This can be reproduced by the reporter (me). I am trying to build a test case that can be uploaded here. We really need to figure out whether the cause was fixed in a newer Jackrabbit revision, as this problem makes the datastore remove active data.



[jira] Commented: (JCR-2492) Garbage Collector remove data for active node

Posted by "Cédric Chantepie (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834247#action_12834247 ] 

Cédric Chantepie commented on JCR-2492:
---------------------------------------

I will try using Jackrabbit 2.0.0 rather than the workaround for 1.4.
Thanks, now it's clear.



[jira] Commented: (JCR-2492) Garbage Collector remove data for active node

Posted by "Cédric Chantepie (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834205#action_12834205 ] 

Cédric Chantepie commented on JCR-2492:
---------------------------------------

I'm still able to reproduce this problem with the 42 GB datastore.
I was also able to reproduce it once with a smaller datastore; I will try to figure out what exactly causes it.

It seems that the jackrabbit-core used by my RAR is 1.4 (not 1.4.5), even though the other libs are 1.4.5.

After getting jackrabbit-1.4 from SVN, I have some doubts about this line in org.apache.jackrabbit.core.persistence.bundle.BundleDbPersistenceManager::getAllNodeIds:
--> Statement stmt = connectionManager.executeStmt(sql, keys, false, maxCount + 10);
With "+ 10", an unlimited maxCount (0) is turned into 10, so as far as I understand, getAllNodeIds asks its connectionManager for all nodes, but with a query whose result is limited to 10 rows.

If I'm right, GarbageCollector, which calls getAllNodeIds on the given IterablePersistenceManager (scanPersistenceManagers), doesn't really get all nodes (because of the row limit), so only some nodes are marked (date updated). Nodes that are not marked (not included in the retrieved rows) are then considered removable by the deleteUnused method of GarbageCollector.
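
The suspected effect can be illustrated with a small standalone simulation. This is only a sketch of the reasoning above, not Jackrabbit code: the class GcRowLimitDemo and all of its methods are hypothetical stand-ins, node ids are plain integers, and it assumes (as described above) that a row limit of 0 means "no limit". The "unlimited-aware" variant is one possible way to keep 0 as unlimited and is not claimed to be the fix applied in Jackrabbit.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GcRowLimitDemo {

    /** Hypothetical stand-in for a row-limited query; a limit of 0 is assumed to mean "no limit". */
    public static List<Integer> queryNodeIds(List<Integer> allIds, int maxRows) {
        if (maxRows == 0 || maxRows >= allIds.size()) {
            return new ArrayList<>(allIds);
        }
        return new ArrayList<>(allIds.subList(0, maxRows));
    }

    /** Mimics the suspected call: an unlimited maxCount (0) becomes a limit of 10 rows. */
    public static List<Integer> getAllNodeIdsSuspected(List<Integer> allIds, int maxCount) {
        return queryNodeIds(allIds, maxCount + 10);
    }

    /** Hypothetical variant that keeps 0 as "unlimited" instead of adding 10 to it. */
    public static List<Integer> getAllNodeIdsUnlimitedAware(List<Integer> allIds, int maxCount) {
        int limit = (maxCount == 0) ? 0 : maxCount + 10;
        return queryNodeIds(allIds, limit);
    }

    /** Mark and sweep: every stored record whose id was not scanned is dropped; returns the survivors. */
    public static Set<Integer> markAndSweep(List<Integer> storedRecords, List<Integer> scannedNodeIds) {
        Set<Integer> marked = new HashSet<>(scannedNodeIds);
        Set<Integer> survivors = new HashSet<>(storedRecords);
        survivors.retainAll(marked);
        return survivors;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            ids.add(i);
        }
        // Suspected behaviour: only the first 10 nodes are marked, the other 15 are swept.
        System.out.println(markAndSweep(ids, getAllNodeIdsSuspected(ids, 0)).size());      // prints 10
        // Unlimited-aware variant: all 25 nodes are marked and survive.
        System.out.println(markAndSweep(ids, getAllNodeIdsUnlimitedAware(ids, 0)).size()); // prints 25
    }
}
```

In this sketch a repository with 10 nodes or fewer loses nothing, which would fit the observation that small test repositories are not affected.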
