You are viewing a plain text version of this content. The canonical link for it is here.
Posted to oak-dev@jackrabbit.apache.org by Joel Richard <jo...@adobe.com> on 2015/04/07 09:00:33 UTC

Unnecessary itemExists overhead in JcrResourceProvider.createResource

Hi,

In JcrResourceProvider.createResource (Sling) it uses fist 
Session.itemExists to check whether an item exists and then 
Session.getItem to retreive it. Even though the item data is cached for 
the second call, this adds 8% overhead to the page rendering. 14% of the 
whole page rendering time is spent in itemExists.

Sling could just only use getItem and catch the exception if the item does 
not exist. Unfortunately, this will not perform well if the resource 
resolver is often used to read items which do not exist because exceptions 
are slow.

Technically, the simplest solution would be to export the method 
getItemOrNull. This method could be Oak-specfic and only be used by Sling 
when running on Oak.

Another approach might be to have the last item cached in the session and 
return it immediately if getItem is called right after itemExists. No 
doubt, this is ugly and would solve only this specific problem, but since 
it is a very common pattern, it could make sense.

Maybe it is also possible to improve the performance of itemExists itself, 
but that's a question which somebody else has to answer.

- Joel

Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource

Posted by Angela Schreiber <an...@adobe.com>.
hi felix

>Turns out that checking for item existence before accessing the item is
>JCR API best practice. So Sling follows best practice.

by using deprecated API?

>
>If this generates a performance problem, I would suggest this to fix in
>the implementation of the API such that API usage best practice
>guidelines can be follows.

not sure about the best practices...

introducing an new method get*OrNull to Jackrabbit API that is already used
everywhere (e.g. JackrabbitSession because it allows you to get
the UserManager) that basically comes for free, is IMO preferable
over trying to improve the sequence of itemExists/getItem which
are prone for regressions and won't work properly for long-living
sessions... that's not on a huge effort but one that takes a lot
of time with huge regression risk.

not sure this is really desirable
angela

>
>Regarding the performance impact: If this createResource method has such
>a dramatic impact on request processing performance in Sling, it might be
>that Session.itemExists and/or Session.getItem methods are really
>expensive. Which would be yet another reason why we should concentrate on
>improving the implementation rather than introducing new API.
>
>Regards
>Felix
>
>> Am 07.04.2015 um 09:00 schrieb Joel Richard <jo...@adobe.com>:
>> 
>> Hi,
>> 
>> In JcrResourceProvider.createResource (Sling) it uses fist
>> Session.itemExists to check whether an item exists and then
>> Session.getItem to retreive it. Even though the item data is cached for
>> the second call, this adds 8% overhead to the page rendering. 14% of
>>the 
>> whole page rendering time is spent in itemExists.
>> 
>> Sling could just only use getItem and catch the exception if the item
>>does 
>> not exist. Unfortunately, this will not perform well if the resource
>> resolver is often used to read items which do not exist because
>>exceptions 
>> are slow.
>> 
>> Technically, the simplest solution would be to export the method
>> getItemOrNull. This method could be Oak-specfic and only be used by
>>Sling 
>> when running on Oak.
>> 
>> Another approach might be to have the last item cached in the session
>>and 
>> return it immediately if getItem is called right after itemExists. No
>> doubt, this is ugly and would solve only this specific problem, but
>>since 
>> it is a very common pattern, it could make sense.
>> 
>> Maybe it is also possible to improve the performance of itemExists
>>itself, 
>> but that's a question which somebody else has to answer.
>> 
>> - Joel
>


Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource

Posted by christian <ch...@adobe.com>.
On 08.04.2015 16:32, Felix Meschberger wrote:
> Hi
>
>> Am 08.04.2015 um 11:12 schrieb Michael Dürig <md...@apache.org>:
>>
>>
>>
>> On 8.4.15 10:57 , Julian Reschke wrote:
>>>> Turns out that checking for item existence before accessing the item
>>>> is JCR API best practice. So Sling follows best practice.
>>> Why would it be best practice?
> Hmm, yes, looks like I misread something. So while it may not be JCR API best practices, using exceptions for flow control is not really good and actually expensive also considering:
>
>>> You need to handle an exception anyway
>>> (due to possible race conditions).
>> With Oak you generally don't have such races thanks to MVCC. But in general you are right, checking existence before access is prone to races.
> Indeed, I would not know of a case where such a race condition really created an issue in reality …
Unfortunatly I know on CRX from customer logs.

> Regards
> Felix


Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource

Posted by Felix Meschberger <fm...@adobe.com>.
Hi

> Am 08.04.2015 um 11:12 schrieb Michael Dürig <md...@apache.org>:
> 
> 
> 
> On 8.4.15 10:57 , Julian Reschke wrote:
>>> Turns out that checking for item existence before accessing the item
>>> is JCR API best practice. So Sling follows best practice.
>> 
>> Why would it be best practice?

Hmm, yes, looks like I misread something. So while it may not be JCR API best practices, using exceptions for flow control is not really good and actually expensive also considering:

>> You need to handle an exception anyway
>> (due to possible race conditions).
> 
> With Oak you generally don't have such races thanks to MVCC. But in general you are right, checking existence before access is prone to races.

Indeed, I would not know of a case where such a race condition really created an issue in reality …

Regards
Felix

Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource

Posted by Michael Dürig <md...@apache.org>.

On 8.4.15 10:57 , Julian Reschke wrote:
>> Turns out that checking for item existence before accessing the item
>> is JCR API best practice. So Sling follows best practice.
>
> Why would it be best practice? You need to handle an exception anyway
> (due to possible race conditions).

With Oak you generally don't have such races thanks to MVCC. But in 
general you are right, checking existence before access is prone to races.

Michael

Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource

Posted by Julian Reschke <ju...@greenbytes.de>.
On 2015-04-08 10:10, Felix Meschberger wrote:
> Hi Joel
>
> Thanks for pointing out. Yet, this has been discussed before.
>
> Turns out that checking for item existence before accessing the item is JCR API best practice. So Sling follows best practice.

Why would it be best practice? You need to handle an exception anyway 
(due to possible race conditions).

> If this generates a performance problem, I would suggest this to fix in the implementation of the API such that API usage best practice guidelines can be follows.
>
> Regarding the performance impact: If this createResource method has such a dramatic impact on request processing performance in Sling, it might be that Session.itemExists and/or Session.getItem methods are really expensive. Which would be yet another reason why we should concentrate on improving the implementation rather than introducing new API.
>
> Regards
> Felix
> ...


Best regards, Julian

Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource

Posted by Felix Meschberger <fm...@adobe.com>.
Hi Joel

Thanks for pointing out. Yet, this has been discussed before.

Turns out that checking for item existence before accessing the item is JCR API best practice. So Sling follows best practice.

If this generates a performance problem, I would suggest this to fix in the implementation of the API such that API usage best practice guidelines can be follows.

Regarding the performance impact: If this createResource method has such a dramatic impact on request processing performance in Sling, it might be that Session.itemExists and/or Session.getItem methods are really expensive. Which would be yet another reason why we should concentrate on improving the implementation rather than introducing new API.

Regards
Felix

> Am 07.04.2015 um 09:00 schrieb Joel Richard <jo...@adobe.com>:
> 
> Hi,
> 
> In JcrResourceProvider.createResource (Sling) it uses fist 
> Session.itemExists to check whether an item exists and then 
> Session.getItem to retreive it. Even though the item data is cached for 
> the second call, this adds 8% overhead to the page rendering. 14% of the 
> whole page rendering time is spent in itemExists.
> 
> Sling could just only use getItem and catch the exception if the item does 
> not exist. Unfortunately, this will not perform well if the resource 
> resolver is often used to read items which do not exist because exceptions 
> are slow.
> 
> Technically, the simplest solution would be to export the method 
> getItemOrNull. This method could be Oak-specfic and only be used by Sling 
> when running on Oak.
> 
> Another approach might be to have the last item cached in the session and 
> return it immediately if getItem is called right after itemExists. No 
> doubt, this is ugly and would solve only this specific problem, but since 
> it is a very common pattern, it could make sense.
> 
> Maybe it is also possible to improve the performance of itemExists itself, 
> but that's a question which somebody else has to answer.
> 
> - Joel


Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource

Posted by Joel Richard <jo...@adobe.com>.
Hi,

I have created patches for Jackrabbit and Oak including test coverage. They are attached to:
https://issues.apache.org/jira/browse/JCR-3870
https://issues.apache.org/jira/browse/OAK-3166

Could you please review and apply them?

- Joel

On 07 Apr 2015, at 15:13, Joel Richard <jo...@adobe.com>> wrote:

Thank you for your suggestions. I have opened the following two issues to
fix this problem:

https://issues.apache.org/jira/browse/OAK-2724

https://issues.apache.org/jira/browse/SLING-4585


- Joel


On 07/04/15 07:58, "Angela Schreiber" <an...@adobe.com> wrote:

Hi Michael

I would opt for the Jackrabbit API variant. Changing the semantics
of the JCR API call is asking for severe troubles and I don't see
any additional effort for another version of the spec (in particular
since JSR 333 was never implemented in Jackrabbit or Oak).

Extending the JackrabbitSession interface however, should be quite
painless and equally serving the purpose.

Kind regards
Angela

On 07/04/15 09:54, "Michael Dürig" <md...@apache.org> wrote:


On 7.4.15 9:00 , Joel Richard wrote:
Hi,

In JcrResourceProvider.createResource (Sling) it uses fist
Session.itemExists to check whether an item exists and then
Session.getItem to retreive it. Even though the item data is cached for
the second call, this adds 8% overhead to the page rendering. 14% of
the
whole page rendering time is spent in itemExists.

Sling could just only use getItem and catch the exception if the item
does
not exist. Unfortunately, this will not perform well if the resource
resolver is often used to read items which do not exist because
exceptions
are slow.

This also bothers me for a long time now. I think it is an unfortunate
design decision of JCR to either force clients to use Exceptions for
flow control or to have to go through 2 API calls.


Technically, the simplest solution would be to export the method
getItemOrNull. This method could be Oak-specfic and only be used by
Sling
when running on Oak.


t would at least be simple to extend the Jackrabbit API in such ways.

Another approach would be to introduce sentinels representing non
existing items. Clients could then do something along the lines of

Node n = session.getNode("/foo/bar");
if (n == JackrabbitNode.NULL) {
 // no node at /foo/bar
} else {
 // default case
}

We are using this pattern quite a bit within Oak reducing a lot of
boilerplate.

However retrofitting it to JCR comes with compatibility issues, which
we'd need to address.

Michael



Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource

Posted by Angela Schreiber <an...@adobe.com>.
Hi Joel

Thanks for reporting this improvement request.

I moved it to the Jackrabbit project as the API extensions must
first go into the Jackrabbit API and then be implemented in
the following Session implementations:

- jackrabbit-core
- jackrabbit-jcr2spi
- oak-jcr

The latter must wait until we have the Jackrabbit version in Oak
parent changed to either 2.10.1-SNAPSHOT or a new Jackrabbit release.

Kind regards
Angela

On 07/04/15 15:13, "Joel Richard" <jo...@adobe.com> wrote:

>Thank you for your suggestions. I have opened the following two issues to
>fix this problem:
>
>https://issues.apache.org/jira/browse/OAK-2724
>
>https://issues.apache.org/jira/browse/SLING-4585
>
>
>- Joel
>
>
>On 07/04/15 07:58, "Angela Schreiber" <an...@adobe.com> wrote:
>
>>Hi Michael
>>
>>I would opt for the Jackrabbit API variant. Changing the semantics
>>of the JCR API call is asking for severe troubles and I don't see
>>any additional effort for another version of the spec (in particular
>>since JSR 333 was never implemented in Jackrabbit or Oak).
>>
>>Extending the JackrabbitSession interface however, should be quite
>>painless and equally serving the purpose.
>>
>>Kind regards
>>Angela
>>
>>On 07/04/15 09:54, "Michael Dürig" <md...@apache.org> wrote:
>>
>>>
>>>On 7.4.15 9:00 , Joel Richard wrote:
>>>> Hi,
>>>>
>>>> In JcrResourceProvider.createResource (Sling) it uses fist
>>>> Session.itemExists to check whether an item exists and then
>>>> Session.getItem to retreive it. Even though the item data is cached
>>>>for
>>>> the second call, this adds 8% overhead to the page rendering. 14% of
>>>>the
>>>> whole page rendering time is spent in itemExists.
>>>>
>>>> Sling could just only use getItem and catch the exception if the item
>>>>does
>>>> not exist. Unfortunately, this will not perform well if the resource
>>>> resolver is often used to read items which do not exist because
>>>>exceptions
>>>> are slow.
>>>
>>>This also bothers me for a long time now. I think it is an unfortunate
>>>design decision of JCR to either force clients to use Exceptions for
>>>flow control or to have to go through 2 API calls.
>>>
>>>>
>>>> Technically, the simplest solution would be to export the method
>>>> getItemOrNull. This method could be Oak-specfic and only be used by
>>>>Sling
>>>> when running on Oak.
>>>>
>>>
>>>t would at least be simple to extend the Jackrabbit API in such ways.
>>>
>>>Another approach would be to introduce sentinels representing non
>>>existing items. Clients could then do something along the lines of
>>>
>>>Node n = session.getNode("/foo/bar");
>>>if (n == JackrabbitNode.NULL) {
>>>   // no node at /foo/bar
>>>} else {
>>>   // default case
>>>}
>>>
>>>We are using this pattern quite a bit within Oak reducing a lot of
>>>boilerplate.
>>>
>>>However retrofitting it to JCR comes with compatibility issues, which
>>>we'd need to address.
>>>
>>>Michael
>>


Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource

Posted by Joel Richard <jo...@adobe.com>.
Thank you for your suggestions. I have opened the following two issues to 
fix this problem:

https://issues.apache.org/jira/browse/OAK-2724

https://issues.apache.org/jira/browse/SLING-4585


- Joel


On 07/04/15 07:58, "Angela Schreiber" <an...@adobe.com> wrote:

>Hi Michael
>
>I would opt for the Jackrabbit API variant. Changing the semantics
>of the JCR API call is asking for severe troubles and I don't see
>any additional effort for another version of the spec (in particular
>since JSR 333 was never implemented in Jackrabbit or Oak).
>
>Extending the JackrabbitSession interface however, should be quite
>painless and equally serving the purpose.
>
>Kind regards
>Angela
>
>On 07/04/15 09:54, "Michael Dürig" <md...@apache.org> wrote:
>
>>
>>On 7.4.15 9:00 , Joel Richard wrote:
>>> Hi,
>>>
>>> In JcrResourceProvider.createResource (Sling) it uses fist
>>> Session.itemExists to check whether an item exists and then
>>> Session.getItem to retreive it. Even though the item data is cached for
>>> the second call, this adds 8% overhead to the page rendering. 14% of 
>>>the
>>> whole page rendering time is spent in itemExists.
>>>
>>> Sling could just only use getItem and catch the exception if the item
>>>does
>>> not exist. Unfortunately, this will not perform well if the resource
>>> resolver is often used to read items which do not exist because
>>>exceptions
>>> are slow.
>>
>>This also bothers me for a long time now. I think it is an unfortunate
>>design decision of JCR to either force clients to use Exceptions for
>>flow control or to have to go through 2 API calls.
>>
>>>
>>> Technically, the simplest solution would be to export the method
>>> getItemOrNull. This method could be Oak-specfic and only be used by
>>>Sling
>>> when running on Oak.
>>>
>>
>>t would at least be simple to extend the Jackrabbit API in such ways.
>>
>>Another approach would be to introduce sentinels representing non
>>existing items. Clients could then do something along the lines of
>>
>>Node n = session.getNode("/foo/bar");
>>if (n == JackrabbitNode.NULL) {
>>   // no node at /foo/bar
>>} else {
>>   // default case
>>}
>>
>>We are using this pattern quite a bit within Oak reducing a lot of
>>boilerplate.
>>
>>However retrofitting it to JCR comes with compatibility issues, which
>>we'd need to address.
>>
>>Michael
>

Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource

Posted by Angela Schreiber <an...@adobe.com>.
Hi Michael

I would opt for the Jackrabbit API variant. Changing the semantics
of the JCR API call is asking for severe troubles and I don't see
any additional effort for another version of the spec (in particular
since JSR 333 was never implemented in Jackrabbit or Oak).

Extending the JackrabbitSession interface however, should be quite
painless and equally serving the purpose.

Kind regards
Angela

On 07/04/15 09:54, "Michael Dürig" <md...@apache.org> wrote:

>
>On 7.4.15 9:00 , Joel Richard wrote:
>> Hi,
>>
>> In JcrResourceProvider.createResource (Sling) it uses fist
>> Session.itemExists to check whether an item exists and then
>> Session.getItem to retreive it. Even though the item data is cached for
>> the second call, this adds 8% overhead to the page rendering. 14% of the
>> whole page rendering time is spent in itemExists.
>>
>> Sling could just only use getItem and catch the exception if the item
>>does
>> not exist. Unfortunately, this will not perform well if the resource
>> resolver is often used to read items which do not exist because
>>exceptions
>> are slow.
>
>This also bothers me for a long time now. I think it is an unfortunate
>design decision of JCR to either force clients to use Exceptions for
>flow control or to have to go through 2 API calls.
>
>>
>> Technically, the simplest solution would be to export the method
>> getItemOrNull. This method could be Oak-specfic and only be used by
>>Sling
>> when running on Oak.
>>
>
>t would at least be simple to extend the Jackrabbit API in such ways.
>
>Another approach would be to introduce sentinels representing non
>existing items. Clients could then do something along the lines of
>
>Node n = session.getNode("/foo/bar");
>if (n == JackrabbitNode.NULL) {
>   // no node at /foo/bar
>} else {
>   // default case
>}
>
>We are using this pattern quite a bit within Oak reducing a lot of
>boilerplate.
>
>However retrofitting it to JCR comes with compatibility issues, which
>we'd need to address.
>
>Michael


Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource

Posted by Michael Dürig <md...@apache.org>.
On 7.4.15 9:00 , Joel Richard wrote:
> Hi,
>
> In JcrResourceProvider.createResource (Sling) it uses fist
> Session.itemExists to check whether an item exists and then
> Session.getItem to retreive it. Even though the item data is cached for
> the second call, this adds 8% overhead to the page rendering. 14% of the
> whole page rendering time is spent in itemExists.
>
> Sling could just only use getItem and catch the exception if the item does
> not exist. Unfortunately, this will not perform well if the resource
> resolver is often used to read items which do not exist because exceptions
> are slow.

This also bothers me for a long time now. I think it is an unfortunate 
design decision of JCR to either force clients to use Exceptions for 
flow control or to have to go through 2 API calls.

>
> Technically, the simplest solution would be to export the method
> getItemOrNull. This method could be Oak-specfic and only be used by Sling
> when running on Oak.
>

t would at least be simple to extend the Jackrabbit API in such ways.

Another approach would be to introduce sentinels representing non 
existing items. Clients could then do something along the lines of

Node n = session.getNode("/foo/bar");
if (n == JackrabbitNode.NULL) {
   // no node at /foo/bar
} else {
   // default case
}

We are using this pattern quite a bit within Oak reducing a lot of 
boilerplate.

However retrofitting it to JCR comes with compatibility issues, which 
we'd need to address.

Michael