Posted to users@jackrabbit.apache.org by Lukas Kahwe Smith <ml...@pooteeweet.org> on 2012/02/16 11:49:29 UTC

cpu load issues with 2.3.6

Hi,

just wanted to check if anyone else is seeing issues with 2.3.6 (or later).
we are using Jackrabbit standalone in a 2 node cluster.
we are frequently seeing one node in the cluster using 100% CPU.
restarts do not seem to solve the issue.

regards,
Lukas Kahwe Smith
mls@pooteeweet.org




Re: cpu load issues with 2.3.6

Posted by Christian Stocker <ch...@liip.ch>.
Hi

On 20.02.12 20:42, Jukka Zitting wrote:
> Hi,
> 
> On Mon, Feb 20, 2012 at 6:51 PM, Christian Stocker
> <ch...@liip.ch> wrote:
>> After looking at it further, it's mainly the
>>
>> LOCK
>> PUT
>> REPORT
>> ...
>>
>> which is the culprit. The PUT request changes the list of referenced
>> nodes in one node, and the REPORT request tries to resolve one of the
>> referenced UUIDs to an absolute path. It's a bug in the PHP Library
>> Jackalope that it makes that REPORT call at all (which we will get rid of
>> soon), but I guess Jackrabbit shouldn't end up using 100% CPU if
>> someone does that.
> 
> Yep. From the thread dump posted by Lukas it looks like the culprit
> here is the appliesToResource() lock computation done by the
> TxLockManagerImpl class in jackrabbit-webdav. The method makes a wrong
> assumption about the return value of the Text.getRelativeParent()
> utility method, and ends up in an infinite loop. Can you file a bug
> report for this? It should be fairly easy to fix in time for 2.4.1.

Thanks for the update, will file an issue in jira (maybe only tomorrow).

Thanks to that bug in jackrabbit, I found a performance issue in
Jackalope, which I now got rid of (that REPORT wasn't needed after all),
so the issue is not an emergency for us right now. But of course we
are still glad if it's fixed soon :)

chregu


> 
> BR,
> 
> Jukka Zitting

Re: cpu load issues with 2.3.6

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Mon, Feb 20, 2012 at 6:51 PM, Christian Stocker
<ch...@liip.ch> wrote:
> After looking at it further, it's mainly the
>
> LOCK
> PUT
> REPORT
> ...
>
> which is the culprit. The PUT request changes the list of referenced
> nodes in one node, and the REPORT request tries to resolve one of the
> referenced UUIDs to an absolute path. It's a bug in the PHP Library
> Jackalope that it makes that REPORT call at all (which we will get rid of
> soon), but I guess Jackrabbit shouldn't end up using 100% CPU if
> someone does that.

Yep. From the thread dump posted by Lukas it looks like the culprit
here is the appliesToResource() lock computation done by the
TxLockManagerImpl class in jackrabbit-webdav. The method makes a wrong
assumption about the return value of the Text.getRelativeParent()
utility method, and ends up in an infinite loop. Can you file a bug
report for this? It should be fairly easy to fix in time for 2.4.1.
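
To illustrate how a parent-path walk can end up spinning, here is a minimal sketch in Java. It is not the TxLockManagerImpl code, and I am not claiming this is the exact assumption that method gets wrong. parentOf() is a hypothetical stand-in written for this example that returns "/" for children of the root and "" once no slash is left; the class and method names are made up too. A walk that only stops at "/" never terminates for a path that bottoms out at "", while a walk that also stops when the parent is empty or unchanged always does.

```java
import java.util.Collections;
import java.util.Set;

final class LockWalkSketch {

    /**
     * Hypothetical parent-path helper (an assumption for this sketch):
     * "/a/b" -> "/a", "/a" -> "/", "/" -> "/", "a" -> "", "" -> "".
     */
    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        if (idx < 0) {
            return "";
        }
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    /**
     * Naive walk that assumes every path eventually reaches "/".
     * Returns the number of steps taken, or -1 if the safety cap was hit,
     * which is where an uncapped loop would have kept running forever.
     */
    static int naiveSteps(String path, int cap) {
        int steps = 0;
        String p = path;
        while (!"/".equals(p)) {
            if (steps++ >= cap) {
                return -1;
            }
            p = parentOf(p);
        }
        return steps;
    }

    /**
     * Safer walk: checks the path and each ancestor against a set of locked
     * paths, and stops as soon as the parent is empty or no longer changes.
     */
    static boolean coveredByLock(String path, Set<String> lockedPaths) {
        String p = path;
        while (true) {
            if (lockedPaths.contains(p)) {
                return true;
            }
            String parent = parentOf(p);
            if (parent.isEmpty() || parent.equals(p)) {
                return false;
            }
            p = parent;
        }
    }

    public static void main(String[] args) {
        Set<String> locked = Collections.singleton("/a");
        System.out.println("naive, absolute path: " + naiveSteps("/a/b", 100));
        System.out.println("naive, relative path: " + naiveSteps("a/b", 100));
        System.out.println("safe, absolute path:  " + coveredByLock("/a/b/c", locked));
        System.out.println("safe, relative path:  " + coveredByLock("a/b", locked));
    }
}
```

The cap in naiveSteps() exists only so the demonstration returns; without it the relative-path case is the kind of loop that pins a CPU core.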

BR,

Jukka Zitting

Re: cpu load issues with 2.3.6

Posted by Christian Stocker <ch...@liip.ch>.
Hi

After looking at it further, it's mainly the

LOCK
PUT
REPORT
...

which is the culprit. The PUT request changes the list of referenced
nodes in one node, and the REPORT request tries to resolve one of the
referenced UUIDs to an absolute path. It's a bug in the PHP Library
Jackalope that it makes that REPORT call at all (which we will get rid of
soon), but I guess Jackrabbit shouldn't end up using 100% CPU if
someone does that.

Greetings

chregu

On 20.02.12 16:22, Christian Stocker wrote:
> Hi
> 
> I discovered the same issue (on the same project as lukas, but on a
> totally different task). It has nothing to do with clustering, but with
> adding and removing references to a Weak Reference Multi-Value property
> within the same session.
> 
> In PHPCR/Jackalope, I do the following:
> ***
> // remove the weakreference property if one was there
> $m->setProperty("refs",null);
> $session->save();
> 
> // add some references (10 is the type constant for WEAKREFERENCE)
> $m->setProperty("refs",$refs,10);
> $session->save();
> 
> //remove one and set the others again
> array_pop($refs);
> $m->setProperty("refs",$refs,10);
> $session->save();
> ***
> 
> If I do the last setProperty() with a new session, then it works.
> 
> On the HTTP level it looks like this:
> 
> https://gist.github.com/1869626
> 
> The last REPORT (where we try to resolve the UUID to a path) hangs.
> 
> (I actually don't know, why there's a REPORT there, I have to dig into
> that, but the fact that it stops is nevertheless a bad sign)
> 
> Any help is really appreciated
> 
> chregu
> 
> 
> 
> 
> On 16.02.12 15:27, Lukas Kahwe Smith wrote:
>>
>> On Feb 16, 2012, at 12:02 , Jukka Zitting wrote:
>>
>>> Hi,
>>>
>>> On Thu, Feb 16, 2012 at 11:49 AM, Lukas Kahwe Smith <ml...@pooteeweet.org> wrote:
>>>> just wanted to check if anyone else is seeing issues with 2.3.6 (or later).
>>>> we are using Jackrabbit standalone in a 2 node cluster.
>>>> we are frequently seeing one node in the cluster using 100% CPU.
>>>
>>> I don't recall such issues. Getting a few thread dumps of a process in
>>> such a state should help identify what's keeping it busy.
>>
>>
>> here it is:
>> http://pastie.org/private/jyorgp7qyhchckiyfzja
>>
>> just FYI, we have a very similar configuration in production using jackrabbit 2.3.1 with our old getNodes() patch, where we do not have this issue. well, we do see the CPU load increasing there as well, but at a much slower pace, to the point where it's not a concern for us.
>>
>> so it seems to be caused by something that was done between 2.3.1 and 2.3.6
>>
>> regards,
>> Lukas Kahwe Smith
>> mls@pooteeweet.org
>>
>>
> 

Re: cpu load issues with 2.3.6

Posted by Christian Stocker <ch...@liip.ch>.
Hi

I discovered the same issue (on the same project as lukas, but on a
totally different task). It has nothing to do with clustering, but with
adding and removing references to a Weak Reference Multi-Value property
within the same session.

In PHPCR/Jackalope, I do the following:
***
// remove the weakreference property if one was there
$m->setProperty("refs",null);
$session->save();

// add some references (10 is the type constant for WEAKREFERENCE)
$m->setProperty("refs",$refs,10);
$session->save();

//remove one and set the others again
array_pop($refs);
$m->setProperty("refs",$refs,10);
$session->save();
***

If I do the last setProperty() with a new session, then it works.

On the HTTP level it looks like this:

https://gist.github.com/1869626

The last REPORT (where we try to resolve the UUID to a path) hangs.

(I actually don't know, why there's a REPORT there, I have to dig into
that, but the fact that it stops is nevertheless a bad sign)

Any help is really appreciated

chregu




On 16.02.12 15:27, Lukas Kahwe Smith wrote:
> 
> On Feb 16, 2012, at 12:02 , Jukka Zitting wrote:
> 
>> Hi,
>>
>> On Thu, Feb 16, 2012 at 11:49 AM, Lukas Kahwe Smith <ml...@pooteeweet.org> wrote:
>>> just wanted to check if anyone else is seeing issues with 2.3.6 (or later).
>>> we are using Jackrabbit standalone in a 2 node cluster.
>>> we are frequently seeing one node in the cluster using 100% CPU.
>>
>> I don't recall such issues. Getting a few thread dumps of a process in
>> such a state should help identify what's keeping it busy.
> 
> 
> here it is:
> http://pastie.org/private/jyorgp7qyhchckiyfzja
> 
> just FYI, we have a very similar configuration in production using jackrabbit 2.3.1 with our old getNodes() patch, where we do not have this issue. well, we do see the CPU load increasing there as well, but at a much slower pace, to the point where it's not a concern for us.
> 
> so it seems to be caused by something that was done between 2.3.1 and 2.3.6
> 
> regards,
> Lukas Kahwe Smith
> mls@pooteeweet.org
> 
> 

-- 
Liip AG  //  Feldstrasse 133 //  CH-8004 Zurich
Tel +41 43 500 39 81 // Mobile +41 76 561 88 60
www.liip.ch // blog.liip.ch // GnuPG 0x0748D5FE


Re: cpu load issues with 2.3.6

Posted by Lukas Kahwe Smith <ml...@pooteeweet.org>.
On Feb 16, 2012, at 12:02 , Jukka Zitting wrote:

> Hi,
> 
> On Thu, Feb 16, 2012 at 11:49 AM, Lukas Kahwe Smith <ml...@pooteeweet.org> wrote:
>> just wanted to check if anyone else is seeing issues with 2.3.6 (or later).
>> we are using Jackrabbit standalone in a 2 node cluster.
>> we are frequently seeing one node in the cluster using 100% CPU.
> 
> I don't recall such issues. Getting a few thread dumps of a process in
> such a state should help identify what's keeping it busy.


here it is:
http://pastie.org/private/jyorgp7qyhchckiyfzja

just FYI, we have a very similar configuration in production using jackrabbit 2.3.1 with our old getNodes() patch, where we do not have this issue. well, we do see the CPU load increasing there as well, but at a much slower pace, to the point where it's not a concern for us.

so it seems to be caused by something that was done between 2.3.1 and 2.3.6

regards,
Lukas Kahwe Smith
mls@pooteeweet.org




Re: cpu load issues with 2.3.6

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Feb 16, 2012 at 11:49 AM, Lukas Kahwe Smith <ml...@pooteeweet.org> wrote:
> just wanted to check if anyone else is seeing issues with 2.3.6 (or later).
> we are using Jackrabbit standalone in a 2 node cluster.
> we are frequently seeing one node in the cluster using 100% CPU.

I don't recall such issues. Getting a few thread dumps of a process in
such a state should help identify what's keeping it busy.
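
From outside the JVM, jstack <pid> is the usual way to get a thread dump. As a rough in-process sketch, assuming you can run code inside the JVM in question, the standard java.lang.management API can list the threads that have used the most CPU along with their top stack frames. The class and method names below are made up for the example, and CPU times show as -1 when the JVM does not support or has not enabled thread CPU timing.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BusyThreadSketch {

    /** Lists up to maxThreads threads, busiest first, with up to maxFrames frames each. */
    static List<String> report(int maxThreads, int maxFrames) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean cpuOk = mx.isThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled();
        ThreadInfo[] infos = mx.dumpAllThreads(false, false);

        // Read each CPU time once so the sort below compares stable values.
        long[] cpuNanos = new long[infos.length];
        Integer[] order = new Integer[infos.length];
        for (int i = 0; i < infos.length; i++) {
            order[i] = i;
            cpuNanos[i] = (cpuOk && infos[i] != null)
                    ? mx.getThreadCpuTime(infos[i].getThreadId())
                    : -1L;
        }
        Arrays.sort(order, (a, b) -> Long.compare(cpuNanos[b], cpuNanos[a]));

        List<String> lines = new ArrayList<>();
        int shown = 0;
        for (int idx : order) {
            ThreadInfo info = infos[idx];
            if (info == null) {
                continue; // thread ended while the dump was being taken
            }
            if (shown++ >= maxThreads) {
                break;
            }
            long ms = cpuNanos[idx] < 0 ? -1 : cpuNanos[idx] / 1_000_000;
            lines.add(info.getThreadName() + " [" + info.getThreadState() + "] cpu=" + ms + "ms");
            StackTraceElement[] stack = info.getStackTrace();
            for (int f = 0; f < Math.min(maxFrames, stack.length); f++) {
                lines.add("    at " + stack[f]);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        report(5, 8).forEach(System.out::println);
    }
}
```

Taking the report a few times in a row and looking for a thread whose CPU time keeps climbing while its top frames stay the same is one way to spot a loop.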

BR,

Jukka Zitting