Posted to dev@jackrabbit.apache.org by "Przemo Pakulski (JIRA)" <ji...@apache.org> on 2007/09/04 18:32:47 UTC

[jira] Created: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

CacheManager interval between recalculation of cache sizes should be configurable
---------------------------------------------------------------------------------

                 Key: JCR-1112
                 URL: https://issues.apache.org/jira/browse/JCR-1112
             Project: Jackrabbit
          Issue Type: New Feature
          Components: core
    Affects Versions: 1.4
            Reporter: Przemo Pakulski
            Priority: Minor


Currently the interval between recalculations of the cache sizes is hard-coded to 1000 ms. Resizing/recalculating the cache sizes is quite expensive (getMemoryUsed on MLRUItemStateCache is especially time-consuming).

Depending on the configuration, we found that under some load up to 10-15% of CPU time (profiler metrics) could be spent doing such recalculations. Resizing the caches every second does not seem necessary. Ideally, this interval should be configurable in the external configuration file, alongside the other cache settings (such as memory sizes).
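
For illustration, here is a minimal Java sketch of what a configurable interval might look like. The class name ResizeIntervalConfig and the property key cache.minResizeIntervalMs are hypothetical and are not taken from Jackrabbit; only the 1000 ms default comes from the description above.

```java
import java.util.Properties;

/** Hypothetical sketch; this is not Jackrabbit's CacheManager. */
class ResizeIntervalConfig {
    /** The currently hard-coded interval, used as the fallback. */
    static final long DEFAULT_INTERVAL_MS = 1000;

    /** Hypothetical property key, chosen for illustration only. */
    static final String INTERVAL_KEY = "cache.minResizeIntervalMs";

    private final long intervalMs;
    private long nextResizeMs;

    ResizeIntervalConfig(Properties config) {
        long parsed = DEFAULT_INTERVAL_MS;
        String raw = config.getProperty(INTERVAL_KEY);
        if (raw != null) {
            try {
                parsed = Long.parseLong(raw.trim());
            } catch (NumberFormatException e) {
                parsed = DEFAULT_INTERVAL_MS; // unparsable value: keep the default
            }
        }
        // Reject zero or negative intervals rather than resizing on every access.
        this.intervalMs = parsed > 0 ? parsed : DEFAULT_INTERVAL_MS;
    }

    long getIntervalMs() {
        return intervalMs;
    }

    /** Returns true if at least intervalMs has passed since the last resize. */
    synchronized boolean resizeDue(long nowMs) {
        if (nowMs < nextResizeMs) {
            return false;
        }
        nextResizeMs = nowMs + intervalMs;
        return true;
    }
}
```

A caller would check resizeDue(System.currentTimeMillis()) on cache access and only recalculate the sizes when it returns true.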

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Thomas Mueller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530078 ] 

Thomas Mueller commented on JCR-1112:
-------------------------------------

Hi,

The method getMemoryUsed could be improved by keeping the current value and only adding / subtracting when there are changes. A full recalculation (like now) would still be required from time to time, because the size of the objects in the cache could change. If only one in 20 calls triggers a full recalculation, it would result in a speed-up of about 10 times.
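
A rough Java sketch of this idea, under stated assumptions: the class TrackedSizeCache, its methods, and the sizeOf function are hypothetical stand-ins and not the MLRUItemStateCache API. A running total is adjusted on put and remove, and every 20th call to getMemoryUsed does a full pass to correct drift from objects whose size has changed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

/** Hypothetical sketch; names are illustrative, not Jackrabbit's. */
class TrackedSizeCache<K, V> {
    /** One in this many getMemoryUsed calls does a full recalculation. */
    static final int FULL_RECALC_EVERY = 20;

    private final Map<K, V> entries = new HashMap<>();
    private final ToLongFunction<V> sizeOf;
    private long runningTotal;
    private int callsSinceRecalc;

    TrackedSizeCache(ToLongFunction<V> sizeOf) {
        this.sizeOf = sizeOf;
    }

    synchronized void put(K key, V value) {
        V old = entries.put(key, value);
        if (old != null) {
            runningTotal -= sizeOf.applyAsLong(old);
        }
        runningTotal += sizeOf.applyAsLong(value);
    }

    synchronized void remove(K key) {
        V old = entries.remove(key);
        if (old != null) {
            runningTotal -= sizeOf.applyAsLong(old);
        }
    }

    /** Cheap on most calls; linear in the cache size on every 20th call. */
    synchronized long getMemoryUsed() {
        if (++callsSinceRecalc >= FULL_RECALC_EVERY) {
            callsSinceRecalc = 0;
            long total = 0;
            for (V value : entries.values()) {
                total += sizeOf.applyAsLong(value);
            }
            runningTotal = total; // corrects drift from entries that changed size
        }
        return runningTotal;
    }
}
```

Between full recalculations the returned value is only approximate, which is the trade-off described above.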

For me, the CacheManager is a workaround; if possible the number of caches should be reduced to one.

Thomas


[jira] Reopened: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting reopened JCR-1112:
--------------------------------


[jira] Commented: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Thomas Mueller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525013 ] 

Thomas Mueller commented on JCR-1112:
-------------------------------------

For certain use cases, a 10-second delay may result in an OutOfMemoryError. I'm not against applying this patch, but we need to be sure it doesn't render the CacheManager ineffective.

> under some load 

Is there a way to reproduce this problem using a simple test application? If not, I would like to write one. To do that, I need some more information: How many sessions are there? How much memory is available / used (java -Xmx...)? What is the algorithm? Are XA or versioning used? What do the nodes and the data look like? Which virtual machine and operating system?

> up to 10-15% percent of CPU time (profiler metrics) 

How was this measured?

Thanks,
Thomas


[jira] Commented: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Thomas Mueller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525588 ] 

Thomas Mueller commented on JCR-1112:
-------------------------------------

> and all I see in the logs is resizeAll() messages

Don't blame the messenger. Seeing the messages doesn't mean this is the problem. The messages are disabled in the trunk.

Do you have some profiling data?


[jira] Commented: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12524813 ] 

Jukka Zitting commented on JCR-1112:
------------------------------------

BTW, please use spaces instead of tabs for indentation.


[jira] Commented: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Martijn Hendriks (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12528718 ] 

Martijn Hendriks commented on JCR-1112:
---------------------------------------

Hi,

I think that there are some issues with the current CacheManager that could be improved:

- The MLRUItemStateCache.touch method triggers the CacheManager.cacheAccessed method, which may call resizeAll. When the system is heavily loaded, many threads may unnecessarily be blocked by the synchronized block in CacheManager.cacheAccessed. The chances of this increase as SLEEP decreases and as the time needed for resizeAll increases. This could easily be improved.

- The resizeAll method is expensive (for MLRUItemStateCaches, which are used everywhere) because it calls MLRUItemStateCache.getMemoryUsed, which recalculates the size of each entry in the cache (linear complexity in the size of the cache...). Since the NodeState/PropertyState.calculateMemoryFootprint seem to give approximate values anyway, wouldn't it be an idea to keep track of the approximate cache size in the MLRUItemStateCache itself? Furthermore, getMemoryUsed even blocks read-access to the cache. A large shared cache such as the one of the SharedItemStateManager suffers significantly from this, I think.

The minimum time between rebalancing seems small but, as Thomas noted, there are certain use cases where this is needed. Isn't there another way to detect such extreme cache blowups? When, for instance, an MLRUItemStateCache keeps track of its own approximate size, the time derivative of this size could be used to prevent blowups.
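
As one possible reading of the first point, here is a minimal Java sketch in which threads never wait on a lock in the access path. The class NonBlockingResizeTrigger and its parameters are hypothetical; the real CacheManager.cacheAccessed uses a synchronized block, as described above. A thread that finds the resize not yet due, or already running on another thread, simply returns.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch; not the actual CacheManager implementation. */
class NonBlockingResizeTrigger {
    private final long intervalMs;
    private final Runnable resizeAll;
    private final AtomicBoolean running = new AtomicBoolean(false);
    private volatile long nextResizeMs;

    NonBlockingResizeTrigger(long intervalMs, Runnable resizeAll) {
        this.intervalMs = intervalMs;
        this.resizeAll = resizeAll;
    }

    /** Called on every cache access; returns true if this call ran the resize. */
    boolean cacheAccessed(long nowMs) {
        if (nowMs < nextResizeMs) {
            return false; // fast path: a single volatile read, no locking
        }
        if (!running.compareAndSet(false, true)) {
            return false; // another thread is already resizing; do not wait
        }
        try {
            if (nowMs < nextResizeMs) {
                return false; // a resize finished between the two checks
            }
            nextResizeMs = nowMs + intervalMs;
            resizeAll.run();
            return true;
        } finally {
            running.set(false);
        }
    }
}
```

This only removes the contention in the access path; the cost of resizeAll itself, and the read blocking inside getMemoryUsed, would still have to be addressed separately.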

Best wishes,

Martijn


[jira] Commented: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Raffaele Sena (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525529 ] 

Raffaele Sena commented on JCR-1112:
------------------------------------

I can reproduce this problem, or something related to it, by simply importing an XML file with a few thousand nodes. The more nodes I have in the repository, the more time the system spends rebalancing the cache, at pretty much every access to the repository.

I was experimenting with a system with a simple hierarchical structure like:

/users
  user1
  user2
  user3

For each user I had a large data structure stored as XML (like an imported XML file).

With such a system, even with only a few users, accessing the node /users/<userX> takes seconds (and the more users or the more complex the structure, the longer it takes), and all I see in the logs is resizeAll() messages.



[jira] Updated: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated JCR-1112:
-------------------------------

    Fix Version/s: 1.3.4

Merged to the 1.3 branch in revision 631606.


[jira] Updated: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated JCR-1112:
-------------------------------

    Affects Version/s:     (was: 1.4)

+1 I wouldn't mind pushing the default interval up to a minute or even higher. I don't think there's much benefit in overly aggressive cache balancing.


[jira] Resolved: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved JCR-1112.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 1.4

Resolving as fixed for 1.4 based on Przemo's changes in revision 592950.


[jira] Commented: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530079 ] 

Jukka Zitting commented on JCR-1112:
------------------------------------

> For me, the CacheManager is a workaround; if possible the number of caches should be reduced to one.

+1!


[jira] Resolved: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Thomas Mueller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller resolved JCR-1112.
---------------------------------

    Resolution: Cannot Reproduce

In my view, the root cause (why does recalculation take so long, are there hundreds of caches?) should be understood and fixed. Just changing the interval would hide another problem and may cause an out-of-memory error.

To solve this, the root cause needs to be reproducible (though the solution may then turn out to be different).


[jira] Updated: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Thomas Mueller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated JCR-1112:
--------------------------------

    Assignee: Thomas Mueller


[jira] Updated: (JCR-1112) CacheManager interval between recalculation of cache sizes should be configurable

Posted by "Przemo Pakulski (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Przemo Pakulski updated JCR-1112:
---------------------------------

    Attachment: JCR-1112.txt

Attached a simple patch that allows the interval to be set programmatically and changes the default interval to 10 seconds.
