Posted to dev@jackrabbit.apache.org by "Przemo Pakulski (JIRA)" <ji...@apache.org> on 2007/05/30 22:38:16 UTC

[jira] Created: (JCR-954) Allow to disable referential integrity checking for workspace

Allow to disable referential integrity checking for workspace
-------------------------------------------------------------

                 Key: JCR-954
                 URL: https://issues.apache.org/jira/browse/JCR-954
             Project: Jackrabbit
          Issue Type: New Feature
          Components: core
            Reporter: Przemo Pakulski
             Fix For: 1.4


Some operations, such as clone and remove, require a lot of memory when they operate on a huge subtree of nodes. To copy, clone, or remove a subtree, all of its nodes are loaded into the transient space. This allows such operations to be transactional, but it also requires a lot of heap, and the amount of memory depends directly on the size of the subtree (number of nodes). As a result, in some cases it is impossible to perform such an operation in one step. In our environment, 1 GB of Java heap is sometimes not enough to successfully clone a subtree from one workspace to another.

You can always clone (copy, remove) a tree in chunks, but if there are references between subtrees, this approach fails. The possibility of temporarily disabling referential integrity checking would then be very useful for experienced JCR users.

Another use case is cloning selected subtrees of the whole structure between workspaces. In our application we need to clone only some selected subtrees from one workspace to another, but we cannot do that because of existing references. We need to clone the whole structure first and then remove all unwanted nodes, which is very expensive in both time and memory.
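
To make the chunking problem concrete, here is a minimal toy model in Java. It is not Jackrabbit code: ToyWorkspace, removeChunk and setCheckReferences are hypothetical names invented for this sketch, and the "workspace" is just a map from a node path to the paths it references. It only illustrates why removing a tree in chunks is rejected when two chunks reference each other, and what switching the check off changes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy stand-in for a workspace; NOT Jackrabbit code. All names are hypothetical. */
class ToyWorkspace {
    // node path -> paths of the nodes it references
    private final Map<String, Set<String>> references = new HashMap<>();
    private boolean checkReferences = true;

    void addNode(String path, String... targets) {
        references.put(path, new HashSet<>(Arrays.asList(targets)));
    }

    void setCheckReferences(boolean enabled) {
        checkReferences = enabled;
    }

    /** Removes one chunk; rejected if a node outside the chunk still references a node inside it. */
    void removeChunk(Set<String> chunk) {
        if (checkReferences) {
            for (Map.Entry<String, Set<String>> entry : references.entrySet()) {
                if (chunk.contains(entry.getKey())) {
                    continue; // the referrer is removed in the same chunk
                }
                for (String target : entry.getValue()) {
                    if (chunk.contains(target)) {
                        throw new IllegalStateException(
                                entry.getKey() + " still references " + target);
                    }
                }
            }
        }
        references.keySet().removeAll(chunk);
    }

    int size() {
        return references.size();
    }
}

class ChunkedRemoveDemo {
    public static void main(String[] args) {
        ToyWorkspace ws = new ToyWorkspace();
        ws.addNode("/a/x", "/b/y"); // /a/x references /b/y
        ws.addNode("/b/y", "/a/x"); // and /b/y references /a/x
        Set<String> chunkA = new HashSet<>(Arrays.asList("/a/x"));
        Set<String> chunkB = new HashSet<>(Arrays.asList("/b/y"));

        try {
            ws.removeChunk(chunkA); // rejected: /b/y still points into chunk A
        } catch (IllegalStateException e) {
            System.out.println("Chunked removal rejected: " + e.getMessage());
        }

        ws.setCheckReferences(false);
        ws.removeChunk(chunkA); // /b/y now holds a dangling reference
        ws.removeChunk(chunkB); // consistent again once the last chunk is gone
        System.out.println("Nodes left: " + ws.size());
    }
}
```

With mutual references no ordering of the chunks passes the check, which is the situation described above. Between the two unchecked removals the toy workspace is inconsistent, so the sketch also shows the risk that comes with the switch.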


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573738#action_12573738 ] 

Stefan Guggisberg commented on JCR-954:
---------------------------------------

> Would it be OK to have a protected method for this in RepositoryImpl? That would still require a subclass, but would minimize the amount of coupling that such a subclass needs with Jackrabbit internals.

+1, that would be fine with me. thanks!

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: jackrabbit-core
>            Reporter: Przemo Pakulski
>            Priority: Minor
>             Fix For: 1.3.4, 1.5
>
>         Attachments: JCR-954-patch.txt, JCR-954-simple.diff


[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Przemo Pakulski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500355 ] 

Przemo Pakulski commented on JCR-954:
-------------------------------------

I agree that such a feature is rather a workaround, but this solution is quick and really works. As I wrote, using it would not be recommended; it is only for experienced JCR users who are aware of the possible drawbacks.

From another point of view, with any relational database you can also temporarily disable constraint checking to speed up some bulk operations, then enable it again. In some cases that is very helpful.

>references are a core feature of jsr-170 which imo must not be compromised through public api methods.

I don't need it exposed through the public API. What I need is some method that I can call on some component (it could be RepositoryImpl or WorkspaceImpl).

For now we have implemented this by extending the SISM class and overriding some methods. But then our code depends on Jackrabbit internals and could stop working with newer versions.

>Consider a big subtree of items (1 mio items, eg.), which you might want to delete. Just switching off integrity checks does not help here

I think it helps, because you can remove the tree in steps without worrying about references.

>we should address the real issue instead.

I fully agree with you, but changing this could mean redesigning the Jackrabbit core, and it does not look like that will happen in the near future.

Stefan, Felix, could you recommend any other feasible solution for my use cases?

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: core
>            Reporter: Przemo Pakulski
>             Fix For: 1.4
>
>         Attachments: JCR-954-patch.txt


[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Przemo Pakulski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573752#action_12573752 ] 

Przemo Pakulski commented on JCR-954:
-------------------------------------

+1, that will ensure that all internal classes (including SISM) and methods will be accessible in future releases

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: jackrabbit-core
>            Reporter: Przemo Pakulski
>            Priority: Minor
>             Fix For: 1.3.4, 1.5
>
>         Attachments: JCR-954-patch.txt, JCR-954-simple.diff


[jira] Updated: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated JCR-954:
------------------------------

    Fix Version/s:     (was: 1.4)

Dropping from 1.4. I think we definitely should support Przemo's use case, but with two -1s on the table we either need more discussion or an alternative proposal for implementing this.

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: jackrabbit-core
>            Reporter: Przemo Pakulski
>         Attachments: JCR-954-patch.txt


[jira] Updated: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Przemo Pakulski (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Przemo Pakulski updated JCR-954:
--------------------------------

    Attachment: JCR-954-simple.diff

Simpler version of the patch attached:
- no public API method,
- no changes in functionality (integrity checking is enabled by default),
- just a single flag with a setter in the SharedItemStateManager class, which allows experienced developers to control this behaviour programmatically.

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: jackrabbit-core
>    Affects Versions: 1.3.3, 1.4, 1.4.1, 1.5
>            Reporter: Przemo Pakulski
>            Priority: Minor
>             Fix For: 1.3.4, 1.4.1, 1.5
>
>         Attachments: JCR-954-patch.txt, JCR-954-simple.diff


[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573231#action_12573231 ] 

Jukka Zitting commented on JCR-954:
-----------------------------------

How about including the method in a custom RepositoryImpl subclass included in o.a.j.core? This way nobody would be affected by default, but if you really needed this feature you could instantiate and use that subclass instead of the normal RepositoryImpl.

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: jackrabbit-core
>            Reporter: Przemo Pakulski
>            Priority: Minor
>             Fix For: 1.3.4, 1.5
>
>         Attachments: JCR-954-patch.txt, JCR-954-simple.diff


[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Felix Meschberger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500341 ] 

Felix Meschberger commented on JCR-954:
---------------------------------------

-1 for this change.

I second the opinion by Stefan, that this would be a very bad idea.

In fact, the real issue is the transient item space, which grows because of the "big transaction". This is a major issue in the internal implementation of the item managers and cannot be solved by just switching off integrity checking. Consider a big subtree of items (1 million items, e.g.) which you might want to delete. Just switching off integrity checks does not help here.

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: core
>            Reporter: Przemo Pakulski
>             Fix For: 1.4
>
>         Attachments: JCR-954-patch.txt


[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Przemo Pakulski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12572697#action_12572697 ] 

Przemo Pakulski commented on JCR-954:
-------------------------------------

Without such an option it is not possible to clone, import, or remove relatively big subtrees of nodes at all.
I really need this functionality, but nobody has even tried to address the real issue for 9 months now.

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: jackrabbit-core
>    Affects Versions: 1.3.3, 1.4, 1.4.1, 1.5
>            Reporter: Przemo Pakulski
>            Priority: Minor
>             Fix For: 1.3.4, 1.4.1, 1.5
>
>         Attachments: JCR-954-patch.txt


[jira] Updated: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Przemo Pakulski (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Przemo Pakulski updated JCR-954:
--------------------------------

        Fix Version/s: 1.5
                       1.4.1
                       1.3.4
             Priority: Minor  (was: Major)
    Affects Version/s: 1.5
                       1.4.1
                       1.3.3
                       1.4

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: jackrabbit-core
>    Affects Versions: 1.3.3, 1.4, 1.4.1, 1.5
>            Reporter: Przemo Pakulski
>            Priority: Minor
>             Fix For: 1.3.4, 1.4.1, 1.5
>
>         Attachments: JCR-954-patch.txt


[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573735#action_12573735 ] 

Jukka Zitting commented on JCR-954:
-----------------------------------

Would it be OK to have a protected method for this in RepositoryImpl? That would still require a subclass, but would minimize the amount of coupling that such a subclass needs with Jackrabbit internals.
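
As a rough sketch of what "protected method, subclass required" means in practice, consider the toy classes below. They are hypothetical stand-ins and not the real RepositoryImpl: the class names, the method names and the boolean field are all assumptions made for illustration only.

```java
/** Toy stand-in for a repository class; NOT the real RepositoryImpl. Names are hypothetical. */
class BaseRepository {
    private boolean integrityChecking = true; // checking stays on by default

    /**
     * Protected: not part of the public surface. Code in another package can
     * reach it only by writing its own subclass (Java also allows access from
     * the same package, which application code normally is not in).
     */
    protected void setIntegrityChecking(boolean enabled) {
        this.integrityChecking = enabled;
    }

    public boolean isIntegrityChecking() {
        return integrityChecking;
    }
}

/** Application-side subclass that deliberately opts in to the risky switch. */
class BulkOperationRepository extends BaseRepository {

    public void disableIntegrityCheckingForBulkWork() {
        setIntegrityChecking(false);
    }

    public void restoreIntegrityChecking() {
        setIntegrityChecking(true);
    }
}
```

The point of the design is that nothing changes for users of the stock class, while an application that really needs the switch has to write and own a small subclass instead of overriding internal classes.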

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: jackrabbit-core
>            Reporter: Przemo Pakulski
>            Priority: Minor
>             Fix For: 1.3.4, 1.5
>
>         Attachments: JCR-954-patch.txt, JCR-954-simple.diff


[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573682#action_12573682 ] 

Stefan Guggisberg commented on JCR-954:
---------------------------------------

>  How about including the method in a custom RepositoryImpl subclass included in o.a.j.core? This way nobody would be affected by default, but if you really needed this feature you could instantiate and use that subclass instead of the normal RepositoryImpl.

-0

IMO that would be still too easy and tempting to use. people might start using this 'feature' because they expect better performance. 
however, unless they know exactly what they're doing, they risk corrupting the repository. personally i'd prefer to expose this 
functionality to subclasses but people would have to write their own in order to enable it.


> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: jackrabbit-core
>            Reporter: Przemo Pakulski
>            Priority: Minor
>             Fix For: 1.3.4, 1.5
>
>         Attachments: JCR-954-patch.txt, JCR-954-simple.diff


[jira] Resolved: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved JCR-954.
-------------------------------

    Resolution: Fixed
      Assignee: Jukka Zitting

Committed the SISM changes and the proposed protected RepositoryImpl method to trunk in revision 632738. Merged the changes to the 1.3 branch in revision 632739.

I'm reluctant to push this to the 1.4 branch at the moment as we're still making "pure" patch releases from there. Perhaps once 1.5 is out we can do a more relaxed 1.4.x release like we are currently doing with 1.3.4.

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: jackrabbit-core
>            Reporter: Przemo Pakulski
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 1.3.4, 1.5
>
>         Attachments: JCR-954-patch.txt, JCR-954-simple.diff


[jira] Updated: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Przemo Pakulski (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Przemo Pakulski updated JCR-954:
--------------------------------

    Attachment: JCR-954-patch.txt

Attached a patch containing a simple solution in the SharedItemStateManager class.
You can disable or enable referential integrity checking by simply setting a flag on the workspace:

((JackrabbitWorkspace) workspace).setReferentialIntegrityChecking(false);

((JackrabbitWorkspace) workspace).setReferentialIntegrityChecking(true);
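
To illustrate why such a flag helps when a tree is copied in chunks, here is a minimal, self-contained sketch. It does not use the Jackrabbit API: ToyWorkspace, saveChunk and danglingReferences are hypothetical names invented for this example, and the real check lives in SharedItemStateManager. The sketch only assumes that a save is rejected when a reference points at a node that does not exist yet.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy stand-in for a workspace (hypothetical, NOT the Jackrabbit API). */
class ToyWorkspace {
    // node id -> ids of the nodes it references
    private final Map<String, Set<String>> nodes = new HashMap<>();
    private boolean referentialIntegrityChecking = true;

    void setReferentialIntegrityChecking(boolean enabled) {
        referentialIntegrityChecking = enabled;
    }

    /** Saves one chunk of nodes; rejects dangling references while checking is on. */
    void saveChunk(Map<String, Set<String>> chunk) {
        if (referentialIntegrityChecking) {
            for (Map.Entry<String, Set<String>> e : chunk.entrySet()) {
                for (String target : e.getValue()) {
                    // A target is valid if it is already stored or arrives in this chunk.
                    if (!nodes.containsKey(target) && !chunk.containsKey(target)) {
                        throw new IllegalStateException(
                                e.getKey() + " references missing node " + target);
                    }
                }
            }
        }
        for (Map.Entry<String, Set<String>> e : chunk.entrySet()) {
            nodes.put(e.getKey(), new HashSet<>(e.getValue()));
        }
    }

    /** Returns every reference target that does not exist in the workspace. */
    Set<String> danglingReferences() {
        Set<String> missing = new HashSet<>();
        for (Set<String> targets : nodes.values()) {
            for (String target : targets) {
                if (!nodes.containsKey(target)) {
                    missing.add(target);
                }
            }
        }
        return missing;
    }
}
```

With two chunks that reference each other, saving the first chunk alone fails while checking is on. With checking switched off both chunks can be saved one after the other, and a final danglingReferences() call lets the caller confirm that the workspace is consistent again before re-enabling the check.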

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: core
>            Reporter: Przemo Pakulski
>             Fix For: 1.4
>
>         Attachments: JCR-954-patch.txt
>
>


[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500338 ] 

Jukka Zitting commented on JCR-954:
-----------------------------------

Disabling the checks seems to be the only way for now to really achieve the use cases in question, so I wouldn't just deny this change. However, could we end up with some real internal issues, for example if the NodeReferences structures inside a persistence store become incorrect?

I wouldn't worry too much about inconsistencies visible to the client application(s) if they were knowingly injected by a client, but it is a problem if such inconsistencies would destabilize the repository itself.




[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Przemo Pakulski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573760#action_12573760 ] 

Przemo Pakulski commented on JCR-954:
-------------------------------------

Could we also include the patch in the 1.4 maintenance branch?

> Allow to disable referential integrity checking for workspace
> -------------------------------------------------------------
>
>                 Key: JCR-954
>                 URL: https://issues.apache.org/jira/browse/JCR-954
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: jackrabbit-core
>            Reporter: Przemo Pakulski
>            Priority: Minor
>             Fix For: 1.3.4, 1.5
>
>         Attachments: JCR-954-patch.txt, JCR-954-simple.diff
>
>


[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500343 ] 

Jukka Zitting commented on JCR-954:
-----------------------------------

> we should address the real issue instead.

Do we have a plan or even a vague idea of how and when we are going to solve that? In fact I do have some ideas in NGP for solving that, but they are way off in the future.

I'm all for going for the root cause, but who will do it?



[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Przemo Pakulski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12572888#action_12572888 ] 

Przemo Pakulski commented on JCR-954:
-------------------------------------

In the first proposed patch I added the following method to the RepositoryImpl class:

    /**
     * Enables/disables the referential integrity check for a workspace.
     * 
     * @param workspaceName name of the workspace
     * @param checkIntegrityEnabled true to enable the check, false to disable it
     * @throws RepositoryException
     */
    public void setCheckIntegrityEnabled(String workspaceName, boolean checkIntegrityEnabled) throws RepositoryException;

To keep the patch simple, I created my own wrapper for the RepositoryImpl class and moved this method to a custom RepositoryImpl implementation, so I'm able to enable/disable referential integrity checking for any workspace programmatically. 
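
The wrapper approach can be sketched roughly as follows. This is a toy model under stated assumptions, not Jackrabbit code: ToyRepository stands in for RepositoryImpl, the protected method name setReferentialIntegrityChecking is assumed for illustration, and MaintenanceRepository is a hypothetical application-side subclass. Only the public method name setCheckIntegrityEnabled comes from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for the repository (hypothetical, NOT RepositoryImpl). */
class ToyRepository {
    // workspace name -> whether referential integrity checking is enabled
    private final Map<String, Boolean> checkEnabled = new HashMap<>();

    /**
     * Protected: visible only to subclasses and same-package code,
     * so it is not part of the public client-facing surface.
     */
    protected void setReferentialIntegrityChecking(String workspaceName, boolean enabled) {
        checkEnabled.put(workspaceName, enabled);
    }

    /** Checking is on unless it was explicitly switched off for the workspace. */
    boolean isReferentialIntegrityChecking(String workspaceName) {
        return checkEnabled.getOrDefault(workspaceName, true);
    }
}

/** Application-specific subclass that deliberately exposes the switch. */
class MaintenanceRepository extends ToyRepository {
    public void setCheckIntegrityEnabled(String workspaceName, boolean enabled) {
        setReferentialIntegrityChecking(workspaceName, enabled);
    }
}
```

Keeping the switch protected in the base class means that only code which knowingly subclasses the repository can turn the check off, and the setting applies per workspace rather than globally.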



[jira] Updated: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated JCR-954:
------------------------------

    Affects Version/s:     (was: 1.5)
                           (was: 1.4.1)
                           (was: 1.3.3)
                           (was: 1.4)
        Fix Version/s:     (was: 1.4.1)

Agreed with Przemo. There's a real itch here and the proposed fix won't harm anyone, so unless someone comes up with another way to fix this we should not stand in the way.

About the fix, how would you enable the no-referential-checks mode in practice? Should we add some route up to the RepositoryImpl or WorkspaceImpl level, or should the checkReferences flag perhaps be static?



[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573225#action_12573225 ] 

Stefan Guggisberg commented on JCR-954:
---------------------------------------

i'd prefer to not publicly expose this method. doing so would IMO mean compromising a core feature of JSR-170.



[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500331 ] 

Stefan Guggisberg commented on JCR-954:
---------------------------------------

-1 for the suggested feature since it might lead to inconsistent data.

references are a core feature of jsr-170 which imo must not be compromised through public api methods.



[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573220#action_12573220 ] 

Jukka Zitting commented on JCR-954:
-----------------------------------

I'd be OK to include the setCheckIntegrityEnabled method in RepositoryImpl. It's one less customization needed on your side and other people might also find it useful (I know I would every now and then).



[jira] Commented: (JCR-954) Allow to disable referential integrity checking for workspace

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500342 ] 

Stefan Guggisberg commented on JCR-954:
---------------------------------------

the suggested solution is imo a hack to enable a workaround for the real issue at hand, i.e. in-memory changelog & transient changes

we should address the real issue instead. 
