You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jackrabbit.apache.org by "Bart van der Schans (Created) (JIRA)" <ji...@apache.org> on 2011/10/07 12:58:29 UTC

[jira] [Created] (JCR-3098) Add hit miss statistics and logging to caches

Add hit miss statistics and logging to caches
---------------------------------------------

                 Key: JCR-3098
                 URL: https://issues.apache.org/jira/browse/JCR-3098
             Project: Jackrabbit Content Repository
          Issue Type: Improvement
          Components: jackrabbit-core
            Reporter: Bart van der Schans
             Fix For: 2.2.10, 2.3.1
         Attachments: cache-stats.patch

The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 

The log lines look like:

07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637

This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Alex Parvulescu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122744#comment-13122744 ] 

Alex Parvulescu commented on JCR-3098:
--------------------------------------

you beat me to it :)
I was still testing out some cache changes.

my 2 cents:
 - Cache change: getHitCount / getMissCount . I'd drop hitCount as the cache already logs accessCount (hitCount = accessCount - hitCount, you don't need to store this info on the cache)
                         putCount : it *should* follow a missCount. is is really necessary? 
What I'm adding to the stats is the time is takes to recover from a cache miss, but that is component based. Let's say a PersistenceManager impl has a cache miss, and it loads the info from an external storage: you have access +1, miss +1 at the cache level (I don't need put / hit), also I'll compute the cache miss time locally on the PersistenceManager impl.

Also, an AbstractCache can have only one CacheAccessListener. I'm looking at the AbstractBundlePersistenceManager which is now also a CacheAccessListener, that doesn't feel 100% right to me. What do you say about using the Stats object as a dedicated PM cache listener, and there you could also log this info.

As I said the code I'm working on is not committed yet, as I'm still testing, but we can definitely unite our efforts. You can follow up with the changes on the dedicated JMX JR branch [0], if that is something you would be interested in. And we'll merge back to trunk soon enough. 

Overall I agree with the need to have more info on the caches, but this is a really sensitive area, any change must be tested carefully.

ps. what is also missing from the cache info now is a cache miss reason: cold cache - the entry was not there which would be no real reason to worry, or the entry has been evicted previously which would probably indicate a low memory setting must be adjusted.

[0] http://svn.apache.org/repos/asf/jackrabbit/branches/JCR-2936/
                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10, 2.3.1
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Alex Parvulescu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126554#comment-13126554 ] 

Alex Parvulescu commented on JCR-3098:
--------------------------------------

Hi Bart,

 - cache manager set to info, check

 - AbstractBundlePersistenceManager, check
You are right, this makes it easier to backport to 2.2 right now, way before the stats are stable.

 - minor things, check

I think we are all good now :)
                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Bart van der Schans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122835#comment-13122835 ] 

Bart van der Schans commented on JCR-3098:
------------------------------------------

Hi Alex,

> You are right. The CacheManager does call reset on it. It is weird that this behaviour is tied to a listener implementation. For example, on the patch, the AbstractBundlePersistenceManager never calls reset. 

Yep. The reset call is only needed for correctly calculating the new cache sizes that are managed by the CacheManager. The BundleCaches are not managed by the CacheManager and "just" use the ConcurrentCache.

> What we could do is to only update the hitCount when the cache listeners are called, this way we'll increase once every CacheAccessListener#ACCESS_INTERVAL (127) calls. The stats can be a bit behind, that should not be a problem. 

I would only go for such "optimizations" if there is a need for it. Afaik the AtomicLong.incrementAndGet() should be "fast enough". If not, we can always have a look what the exact bottleneck is and optimize for that. For now it feels like it would only make things more complex. 

> About the element count: I feel we could leverage AbstractCache#recordSizeChange(), it appears that when it is called with a pozitive value it's an add, otherwise it's a remove. 

Ah, I missed that one. It would fit the "pattern" of the other stats. Otoh, the number of segments is coupled to the number of CPUs (did somebody actually benchmark if that's optiomal?) and looping over the segments should be pretty cheap.

> I wasn't talking about a dedicated CacheStats object either. I'm working on a dedicated PersistenceManagerStat that also contains some cache info. I agree about being able to tune the logging for a dedicated cache. 
(I'm also looking to add average cache miss duration on the PM. This is also something to watch out for.) 

I don't think the cache miss duration is of much interest, but the time it takes to load a bundle when the cache is missed certainly is ;-) (it's however not a statistic of the cache in the pure sense)

> Given that we agree on the direction of the patch, we can apply it on the jmx branch, and we can continue there the dev. This will also allow others to chime in with some feedback before we merge to trunk. 
> I don't think I have time today, but I have to push some code related to it soon, and I could merge your changes beforehand. What do you think? 

That would be great! I hope to have some cycles to look at the branch in the next few days and see if we can expose the stats through JMX.

Bart






                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10, 2.3.1
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Bart van der Schans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126487#comment-13126487 ] 

Bart van der Schans commented on JCR-3098:
------------------------------------------

Hi Alex,

Thanks for committing!

> - no put count - I can't see a situation where #puts != #misses.
You can put an object that already exists in cache. Not sure if that's happening anywhere in the code though. It's not really a very important stat.

>  - cache manager logging to debug instead on info 
The reason why I logged it on info is to make a difference between the resize log messages (which happen every second and are quite verbose) and the stats logging which only happens once a minute by default. The "old" JR 1.x bundlecache logging was also set on INFO.

> The AbstractBundlePersistenceManager stuff will come at a later time, along with a dedicated stats object. 
That was the reason I actually created the patch ;-) The bundleCache tuning is critical for getting the optimal performance in our web based (mostly read) environments. Registering the PM as listener itself isn't the most beautiful solution I agree. For trunk we should find a better way and avoid duplicating the "log only once so often" code. Probably with some kind of CacheStats(Manager). For 2.2 it might the easiest route to just register the listener, but maybe not the best. I guess that will depend on the impact of the dedicated stats object on 2.2?

Anyway, I would love to see this in 2.2 as well!

Some very minor things: 
- shouldn't we move the  totalAccessCount.incrementAndGet(); in the recordCacheAccess() to the beginning of the method?
- for really easy log parsing it could make sense to remove the spaces in the log output, e.g. "Cache name" -> "cachename", "used memory" -> "usedmemorykb", "max memory" -> "maxmemorykb" 


                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Bart van der Schans (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bart van der Schans updated JCR-3098:
-------------------------------------

    Attachment: cache-stats.patch

The attached patch applies to trunk and the 2.2 branch.
                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10, 2.3.1
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Alex Parvulescu (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu resolved JCR-3098.
----------------------------------

    Resolution: Fixed

Fixed 
 on trunk in revision 1182713 and 1182835.
 on the 2.2 branch in revision 1182853.
                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10, 2.3.2
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Alex Parvulescu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated JCR-3098:
---------------------------------

    Fix Version/s: 2.3.2
    
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10, 2.3.2
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Bart van der Schans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126600#comment-13126600 ] 

Bart van der Schans commented on JCR-3098:
------------------------------------------

Great, thanks!

I'll try to find some cycles to help out with creating the stats manager :-)

                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10, 2.3.2
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Jukka Zitting (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated JCR-3098:
-------------------------------

    Fix Version/s:     (was: 2.3.1)
    
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Bart van der Schans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122792#comment-13122792 ] 

Bart van der Schans commented on JCR-3098:
------------------------------------------

Hi Alex,

Thanks for the feedback.

The following statement is not correct for the current implementation: "hitCount = accessCount - hitCount". The access count is used for resizing the caches (in resizeAll) and reset every second or so. I didn't want to mess with the resizing allogrithm which need this reset. What is true is that the "total accessCount" is the hitCount plus the missCount ;-)

The AbstractBundlePersistenceManager creates a new ConcurrentCache so this can have it's own CacheListener. I'm not sure if we would need more than one listener for each cache. If there is such a need we should either allow for that or indeed create a CacheStats object that can contain information about all the caches. It could be more "elegant" to defer all stats to a separate CacheStats object. On advantage of the current implementation is that you can "choose" which cache stats you want to have logged by setting the appropriate classes to info logging.

I don't see an easy way without creating some overhead to get the cache miss reason. That would require some "shadow" index which contains removed ids that have not yet been re-cached. It would at least involve an extra check while fetching an item from the cache. And you have to figure out what to do with deleted items so it doesn't definitively grows. 

I'm *very* interested in the JMX effort that is going on (we already use JMX to get all kind of other info in our monitoring system). I consider the logging and collecting of the stats as a first step before exposing the stats over JMX.  I'll have a look at the branch and try to also make that next step ;-)
 
Bart



                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10, 2.3.1
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Bart van der Schans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122798#comment-13122798 ] 

Bart van der Schans commented on JCR-3098:
------------------------------------------

The element count can be useful to get a feel for the cache item sizes. The average size is the used memory divided by the element count. It could also be useful to know if you have a 100 elements with a 1000 hits or 10.000 elements with a 1000 hits. 

Bart

                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10, 2.3.1
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Alex Parvulescu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122823#comment-13122823 ] 

Alex Parvulescu commented on JCR-3098:
--------------------------------------

Hi Bart,


> The following statement is not correct for the current implementation: "hitCount = accessCount - hitCount". The access count is used for resizing the caches (in resizeAll) and reset every second or so. I didn't want to mess with the resizing allogrithm which need this reset. What is true is that the "total accessCount" is the hitCount plus the missCount ;-) 

You are right. The CacheManager does call reset on it. It is weird that this behaviour is tied to a listener implementation. For example, on the patch, the AbstractBundlePersistenceManager never calls reset.

What we could do is to only update the hitCount when the cache listeners are called, this way we'll increase once every CacheAccessListener#ACCESS_INTERVAL (127) calls. 
The stats can be a bit behind, that should not be a problem.

About the element count: I feel we could leverage AbstractCache#recordSizeChange(), it appears that when it is called with a pozitive value it's an add, otherwise it's a remove. 


> If there is such a need we should either allow for that or indeed create a CacheStats object that can contain information about all the caches. It could be more "elegant" to defer all stats to a separate CacheStats object

I wasn't talking about a dedicated CacheStats object either. I'm working on a dedicated PersistenceManagerStat that also contains some cache info. I agree about being able to tune the logging for a dedicated cache.
(I'm also looking to add average cache miss duration on the PM. This is also something to watch out for.) 


>I don't see an easy way without creating some overhead to get the cache miss reason. 

I agree :( That is just a "nice to have" right now.

I'm also interested in having the 2 separated. In the beginning it wasn't very clear to me, also a lack of community feedback didn't make the choice easier ;)
Now Stats and JMX are 2 different things, but the effort is combined on the branch I mentioned.

Given that we agree on the direction of the patch, we can apply it on the jmx branch, and we can continue there the dev. 
This will also allow others to chime in with some feedback before we merge to trunk.

I don't think I have time today, but I have to push some code related to it soon, and I could merge your changes beforehand.
What do you think?

                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10, 2.3.1
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Alex Parvulescu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126475#comment-13126475 ] 

Alex Parvulescu commented on JCR-3098:
--------------------------------------

> Fix Version/s: 2.2.10
Do you want this backported to the 2.2 branch as well?
                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Alex Parvulescu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122752#comment-13122752 ] 

Alex Parvulescu commented on JCR-3098:
--------------------------------------

oh, I forgot something.

About the element count: the cache already has memory used and max memory size which you can use to get an idea abut how filled up the cache is. so maybe we don't need to also log the actual element count.
                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10, 2.3.1
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (JCR-3098) Add hit miss statistics and logging to caches

Posted by "Alex Parvulescu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126431#comment-13126431 ] 

Alex Parvulescu commented on JCR-3098:
--------------------------------------

I've committed the patch with some tiny changes:
 - no hit count on the cache - I added a total access count which will be used by the other stats as well. the mechanism was already in place
 - no put count - I can't see a situation where #puts != #misses.
 - cache manager logging to debug instead on info
 - I moved the logic to build cache info in the AbstractCache class

The AbstractBundlePersistenceManager stuff will come at a later time, along with a dedicated stats object.
Did I forget anything?

ps: we are back in trunk with the changes, so feel free to dive in ;)
                
> Add hit miss statistics and logging to caches
> ---------------------------------------------
>
>                 Key: JCR-3098
>                 URL: https://issues.apache.org/jira/browse/JCR-3098
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Bart van der Schans
>             Fix For: 2.2.10
>
>         Attachments: cache-stats.patch
>
>
> The current caches (ConcurrentCache) doesn't maintain hit and miss statistics. This makes it very hard to know if you need to increase the caches in a deployment. This functionality does exist in the 1.5 and 1.6 branches, but is missing from the 2.x branches. The patch adds these statistics and adds logging on info level. The frequency of the logging is by default configured to maximal once a minute but can be configured with the system property "org.apache.jackrabbit.cacheLogStatsInterval" (in ms). 
> The log lines look like:
> 07.10.2011 09:00:39 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=defaultBundleCache[ConcurrentCache@54fb02fd] num=21074 mem=34504k max=65536k hits=93352 miss=21074 puts=21135
> 07.10.2011 09:00:40 INFO  [org.apache.jackrabbit.core.persistence.bundle.AbstractBundlePersistenceManager.logCacheStats():737] name=versionBundleCache[ConcurrentCache@47b1de1a] num=10637 mem=250k max=8192k hits=36352 miss=10637 puts=10637
> This patch will also make possible to later on expose these statistics over JMX when the initial JMX work has been settled.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira