Posted to users@jackrabbit.apache.org by Raffaele Gambelli <R....@hitachi-systems-cbt.com> on 2018/12/04 13:46:10 UTC

Jackrabbit v. 2.14.0 VERSION_BUNDLE

Hi all,

I'm using Jackrabbit 2.14.0 and I'm experiencing the same problem that is
well described here:
https://stackoverflow.com/questions/10979322/tuning-jackrabbit-data-model-version-bundle-table
However, we have only a few versionable entities, and profiling our
application shows that the query "select BUNDLE_DATA from
VERSION_BUNDLE where NODE_ID = :1" is also triggered when we persist a
non-versionable entity, even one that has no related versionable
entities.

So I would like to know whether you are aware of any bugs in that version
related to the VERSION_BUNDLE table and how it is accessed, or of anything
else that could help me understand why that table is queried so many times,
more often than DEFAULT_BUNDLE, in a use case where I'm sure that no
versionable entities are involved.

Thanks in advance, best regards


Raffaele Gambelli
WebRainbow® Software Analyst & Developer

r.gambelli@hitachi-systems-cbt.com | Phone +39 051 8550 576
Meeting Room:https://hitachi-systems-cbt.webex.com/meet/r.gambelli
Via Ettore Cristoni, 84 | 40033 Casalecchio Di Reno
www.hitachi-systems-cbt.com 



Re: Jackrabbit v. 2.14.0 VERSION_BUNDLE

Posted by Julian Reschke <ju...@gmx.de>.
On 2018-12-06 14:58, Luca Tagliani wrote:
> Hi Julian,
>    i'm Raffaele's colleague and perhaps I can better specify the strange
> behaviour that we are facing.
> 
> The scenario is the following:
> 1) create a new node (NOT versionable)
> 2) set some properties
> 3) save the session
> 4) set a previously not existing property on the node itself
> 5) saving the session
> 
> At stage 5) we see a strange behaviour: the Jackrabbit session, using its
> SharedItemStateManager (SISM), tries to retrieve the bundle for the
> property from the cache (method getItemState at line 258 in trunk), but
> doesn't find it internally (the call to getNonVirtualItemState() throws an
> exception), so execution proceeds to the for loop at line 283.
> In this for loop the SISM checks its VirtualItemStateProviders, which in
> fact include a VersionItemStateProvider and a VirtualNodeTypeStateProvider.
> Obviously, the bundle is not in the version cache, so the persistenceManager
> bound to the VersionItemStateProvider tries to retrieve it from the DB,
> thus executing the query "select BUNDLE_DATA from VERSION_BUNDLE where
> NODE_ID = :1".
> 
> To me this is not correct in two ways:
> a) I think that, after the save in stage 3 the bundle should be in cache
> b) even if the bundle is not in cache, it doesn't have to be retrieved from
> the versioning layer if the corresponding nodeType is not versionable
> 
> If this is right, we should consider a patch that avoids the problem by
> either putting the bundle in the cache after a save or skipping the search
> in versions.
> 
> I'd like your advice on my thoughts: first of all, whether they seem
> correct to you or whether I am misunderstanding the behaviour.
> 
> Thanks in advance
> 
> BR
> 
> Luca Tagliani

Unfortunately I'm not familiar with this part of the code, so you likely 
already know more about it than I do.

Proposal:

1) add TRACE or DEBUG level logging to the relevant parts
2) write a unit test that exercises the code, and inspect the log
3) patch the code
4) test and check logs again


Best regards, Julian

Re: Jackrabbit v. 2.14.0 VERSION_BUNDLE

Posted by Woonsan Ko <wo...@apache.org>.
On Thu, Dec 6, 2018 at 10:58 PM Luca Tagliani <l....@cbt.it> wrote:
>
> Hi Julian,
>   i'm Raffaele's colleague and perhaps I can better specify the strange
> behaviour that we are facing.
>
> The scenario is the following:
> 1) create a new node (NOT versionable)
> 2) set some properties
> 3) save the session
> 4) set a previously not existing property on the node itself
> 5) saving the session
>
> At stage 5) we see a strange behaviour: the Jackrabbit session, using its
> SharedItemStateManager (SISM), tries to retrieve the bundle for the
> property from the cache (method getItemState at line 258 in trunk), but
> doesn't find it internally (the call to getNonVirtualItemState() throws an
> exception), so execution proceeds to the for loop at line 283.
> In this for loop the SISM checks its VirtualItemStateProviders, which in
> fact include a VersionItemStateProvider and a VirtualNodeTypeStateProvider.
> Obviously, the bundle is not in the version cache, so the persistenceManager
> bound to the VersionItemStateProvider tries to retrieve it from the DB,
> thus executing the query "select BUNDLE_DATA from VERSION_BUNDLE where
> NODE_ID = :1".

Yes, I was able to reproduce it. Thanks for the detailed steps and
investigation!
At step 4, it needs to know whether the new property already exists,
that is, whether it needs to create a new property. So it needs to
retrieve the node state.
And, as you pointed out, the VirtualItemStateProvider, one of the
SharedItemStateManager#virtualProviders, is asked by ItemId whether it
has that node item.
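
As a rough illustration of the lookup order described above, here is a toy
Java model. It is not Jackrabbit code: the class LookupOrderSketch, the
Provider interface and the string ids are all hypothetical stand-ins, and the
counter only stands in for the VERSION_BUNDLE query. It shows why an id that
is not found in the non-virtual layer ends up being asked of every virtual
provider.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Toy model of the lookup order; none of these types are Jackrabbit classes. */
final class LookupOrderSketch {

    /** Hypothetical stand-in for a virtual item state provider. */
    interface Provider {
        Optional<String> load(String itemId);
    }

    /** Provider backed by a map; counts how often it is queried. */
    static final class CountingProvider implements Provider {
        final Map<String, String> store = new HashMap<>();
        int queries = 0;

        @Override
        public Optional<String> load(String itemId) {
            queries++; // stands in for one "select BUNDLE_DATA from VERSION_BUNDLE ..." call
            return Optional.ofNullable(store.get(itemId));
        }
    }

    /** Items known to the non-virtual (workspace) layer. */
    final Map<String, String> nonVirtual = new HashMap<>();
    final List<Provider> virtualProviders;

    LookupOrderSketch(List<Provider> virtualProviders) {
        this.virtualProviders = virtualProviders;
    }

    /** Try the non-virtual layer first, then each virtual provider in turn. */
    Optional<String> getItemState(String itemId) {
        String local = nonVirtual.get(itemId);
        if (local != null) {
            return Optional.of(local);
        }
        for (Provider provider : virtualProviders) {
            Optional<String> hit = provider.load(itemId);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        CountingProvider versionStore = new CountingProvider();
        LookupOrderSketch sism = new LookupOrderSketch(List.of(versionStore));
        sism.nonVirtual.put("node-1", "node state");

        sism.getItemState("node-1");         // found locally, version store untouched
        sism.getItemState("node-1/newProp"); // unknown id, falls through to the providers
        System.out.println("version store queries: " + versionStore.queries);
    }
}
```

In this model the existence check for a property that does not exist yet is
exactly the case that reaches the version store, which matches what was
observed at step 4.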

>
> To me this is not correct in two ways:
> a) I think that, after the save in stage 3 the bundle should be in cache

It seems debatable to me. The current design seems to have assumed
that an item should be cached only on retrieval, which makes sense to
me. It's simpler and batch additions don't need to cache, for example.

> b) even if the bundle is not in cache, it doesn't have to be retrieved from
> the versioning layer if the corresponding nodeType is not versionable

If a non-versionable node is a child of a versionable node, then it
should be retrieved from the Versioning's PersistenceManager. So, it
seems more complex than that.
In addition, the current API such as
o.a.j.core.state.ItemStateManager#getItemState(ItemId) doesn't give
any information about node type and node hierarchy.
Therefore, I assume it is really difficult to fix that at the core
level, considering that (a) o.a.j.core.state.ItemStateManager is widely
used from many different packages, and (b) Jackrabbit 2 is in
maintenance mode.

>
> If this is right, we should consider a patch that avoids the problem by
> either putting the bundle in the cache after a save or skipping the search
> in versions.
>
> I'd like your advice on my thoughts: first of all, whether they seem
> correct to you or whether I am misunderstanding the behaviour.

Another approach is to override the VersionItemStateProvider in your
application. Basically a repository implementation such as
o.a.j.core.RepositoryImpl registers those two
VirtualItemStateProviders through
SharedItemStateManager#addVirtualItemStateProvider(VirtualItemStateProvider).
If you can override this behavior through a more optimized
VersionItemStateProvider somehow--possibly including some dirty code
to access node types or node hierarchy--you can possibly achieve the
goal.
You might also want to suggest some simple refactoring, in
RepositoryImpl for example, to make the overriding easier, which seems
less risky for everyone.
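
To make the idea of a "more optimized" provider a bit more concrete, here is
a minimal sketch of a filtering wrapper in plain Java. Everything in it is
hypothetical: the Provider interface, FilteringProvider and the predicate are
illustrative stand-ins, not Jackrabbit's VirtualItemStateProvider API, and
how an application would reliably decide that an id cannot be in the version
store is the hard part that this sketch simply assumes away.

```java
import java.util.Set;
import java.util.function.Predicate;

/** Toy sketch of a filtering wrapper; none of these types are Jackrabbit classes. */
final class FilteringProviderSketch {

    /** Hypothetical stand-in for a virtual item state provider. */
    interface Provider {
        boolean hasItemState(String itemId);
    }

    /** Stands in for the version store; counts how often it is queried. */
    static final class CountingVersionStore implements Provider {
        int queries = 0;

        @Override
        public boolean hasItemState(String itemId) {
            queries++; // one VERSION_BUNDLE lookup in the real system
            return false;
        }
    }

    /** Asks the delegate only for ids that the filter cannot rule out. */
    static final class FilteringProvider implements Provider {
        private final Provider delegate;
        private final Predicate<String> mayBeInVersionStore;

        FilteringProvider(Provider delegate, Predicate<String> mayBeInVersionStore) {
            this.delegate = delegate;
            this.mayBeInVersionStore = mayBeInVersionStore;
        }

        @Override
        public boolean hasItemState(String itemId) {
            // Short-circuit: the delegate is not called when the filter rejects the id.
            return mayBeInVersionStore.test(itemId) && delegate.hasItemState(itemId);
        }
    }

    public static void main(String[] args) {
        CountingVersionStore store = new CountingVersionStore();
        // Assumption: the application somehow knows these ids are never versioned.
        Set<String> knownPlainIds = Set.of("plain-node", "plain-node/prop");
        Provider filtered = new FilteringProvider(store, id -> !knownPlainIds.contains(id));

        filtered.hasItemState("plain-node/prop"); // ruled out by the filter
        filtered.hasItemState("other-node");      // forwarded to the version store
        System.out.println("version store queries: " + store.queries);
    }
}
```

Note that the filter must stay conservative: as mentioned above, a
non-versionable node under a versionable parent still has to be served by the
versioning persistence manager, so a predicate based on node type alone would
not be safe.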

Just my two cents,

Woonsan

>
> Thanks in advance
>
> BR
>
> Luca Tagliani
>
>
>
> --
> Sent from: http://jackrabbit.510166.n4.nabble.com/Jackrabbit-Users-f510167.html

Re: Jackrabbit v. 2.14.0 VERSION_BUNDLE

Posted by Luca Tagliani <l....@cbt.it>.
Hi Julian,
  I'm Raffaele's colleague and perhaps I can describe more precisely the
strange behaviour that we are facing.

The scenario is the following:
1) create a new node (NOT versionable)
2) set some properties
3) save the session
4) set a previously not existing property on the node itself
5) saving the session

At stage 5) we see a strange behaviour: the Jackrabbit session, using its
SharedItemStateManager (SISM), tries to retrieve the bundle for the
property from the cache (method getItemState at line 258 in trunk), but
doesn't find it internally (the call to getNonVirtualItemState() throws an
exception), so execution proceeds to the for loop at line 283.
In this for loop the SISM checks its VirtualItemStateProviders, which in
fact include a VersionItemStateProvider and a VirtualNodeTypeStateProvider.
Obviously, the bundle is not in the version cache, so the persistenceManager
bound to the VersionItemStateProvider tries to retrieve it from the DB,
thus executing the query "select BUNDLE_DATA from VERSION_BUNDLE where
NODE_ID = :1".

To me this is not correct in two ways: 
a) I think that, after the save in stage 3 the bundle should be in cache
b) even if the bundle is not in cache, it doesn't have to be retrieved from
the versioning layer if the corresponding nodeType is not versionable

If this is right, we should consider a patch that avoids the problem by
either putting the bundle in the cache after a save or skipping the search
in versions.

I'd like your advice on my thoughts: first of all, whether they seem
correct to you or whether I am misunderstanding the behaviour.

Thanks in advance

BR

Luca Tagliani



--
Sent from: http://jackrabbit.510166.n4.nabble.com/Jackrabbit-Users-f510167.html

Re: Jackrabbit v. 2.14.0 VERSION_BUNDLE

Posted by Julian Reschke <ju...@gmx.de>.
On 2018-12-04 14:46, Raffaele Gambelli wrote:
> Hi all,
> 
> I'm using Jackrabbit 2.14.0 and I'm experiencing the same problem that is
> well described here:
> https://stackoverflow.com/questions/10979322/tuning-jackrabbit-data-model-version-bundle-table
> 
> However, we have only a few versionable entities, and profiling our
> application shows that the query "select BUNDLE_DATA from
> VERSION_BUNDLE where NODE_ID = :1" is also triggered when we persist a
> non-versionable entity, even one that has no related versionable
> entities.
> 
> So I would like to know whether you are aware of any bugs in that version
> related to the VERSION_BUNDLE table and how it is accessed, or of anything
> else that could help me understand why that table is queried so many times,
> more often than DEFAULT_BUNDLE, in a use case where I'm sure that no
> versionable entities are involved.
> 
> Thanks in advance, best regards
> ..

I'm not aware of any open issues in this area.

I guess the first step would be to find out what the version store 
contains. Maybe you are versioning more than what you thought.

Alternatively, you could add debug logging in the persistence manager 
which does these queries. Maybe a call stack would tell you what's 
causing the requests.
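
A minimal sketch of what such call-stack logging could look like, using only
java.util.logging from the JDK. The class CallStackLoggingSketch, the logger
name "bundle.load" and the loadBundle method are hypothetical placeholders,
not the actual persistence manager code, and real Jackrabbit code would more
likely go through its own SLF4J loggers.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch: log the caller's stack whenever a bundle load is requested. */
final class CallStackLoggingSketch {

    // Hypothetical logger name, chosen for this example only.
    private static final Logger LOG = Logger.getLogger("bundle.load");

    /** Renders the current call stack as text without throwing anything. */
    static String callStack() {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        new Throwable("bundle load requested").printStackTrace(out);
        out.flush();
        return buffer.toString();
    }

    /** Placeholder for the method that would issue the VERSION_BUNDLE query. */
    static String loadBundle(String nodeId) {
        // Guard the call so the stack is only captured when FINE logging is on.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("loadBundle(" + nodeId + ") called from:\n" + callStack());
        }
        return "bundle-for-" + nodeId; // stands in for the real database read
    }

    public static void main(String[] args) {
        // Enable FINE output for this logger only.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE);
        LOG.addHandler(handler);
        LOG.setUseParentHandlers(false);
        LOG.setLevel(Level.FINE);

        loadBundle("example-node-id");
    }
}
```

Capturing a stack trace is relatively expensive, so the level check keeps the
cost out of normal operation; the logged frames should show which caller
triggers the lookups.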

Best regards, Julian